Tutorial for PARK data fitting - PowerPoint PPT Presentation

1
Tutorial for PARK data fitting
  • Paul KIENZLE, Wenwu CHEN and Ziwen FU
  • Reflectometry Group

2
Objective: Distributed Computing Environment
[Diagram: multiple users (clients) connect to a service server on the master node, which manages the working nodes of a cluster.]
3
Prerequisite
  • Python: version >= 2.4
  • Windows: cygwin
  • Client: wxPython (version >= 2.6), matplotlib
  • Most services may need numpy
4
Setup of park
  • Download the source code
  • Source code: svn co svn://svn@danse.us/park
  • Package for unix/linux: park-0.2.0.tar.gz or
    park-0.2.0.tar.bz2
  • Package for windows: park-0.2.0.zip
  • Edit the cluster config file
  • park/config/hosts
  • Start the service server
  • park/servers/mapServer.py
  • Start the client
  • park/client/AppJob.py
  • Provide services
  • park/services

5
Setup of park in Unix/Linux
  • Download park-0.2.0.tar.gz or park-0.2.0.tar.bz2
    from http://danse.us
  • Unzip the file
  • tar xvzf park-0.2.0.tar.gz
  • Make the installation
  • cd park-0.2.0
  • make install
  • or
  • setup.py install --install-purelib=home_directory_of_park
  • The command make install is equivalent to
    setup.py install --install-purelib. It will
    install park in directory /park.

6
Setup of park in Windows
  • Download park-0.2.0.zip or park-0.2.0.tar.bz2
    from http://danse.us
  • Unzip the file
  • unzip park-0.2.0.zip
  • Make the installation in an MS-DOS window
  • cd park-0.2.0
  • setup.py install
  • It will install park in directory
    /Lib/site-packages/park.

7
Edit the config file
  • The server uses park/config/hosts to configure
    the working nodes.
  • Example of park/config/hosts:

      # hosts configure file for park
      # example for compufans.ncnr.nist.gov cluster
      # 4 nodes, each node with 2 cpus
      # the format is similar to that of /etc/hosts
      # ip_address full_name alias_name:port:number_of_cpus
      127.0.0.1 localhost.localdomain localhost:5300:2
      172.16.255.251 n4.ncnr.nist.gov n4:6500:2
      172.16.255.252 n3.ncnr.nist.gov n3:6300:2
      172.16.255.253 n2.ncnr.nist.gov n2:6200:2
      172.16.255.254 n1.ncnr.nist.gov n1:6100:2
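
The hosts file format above can be read with a few lines of Python. This is a minimal sketch, not PARK's own loader: it assumes that the last column is colon-separated (alias_name:port:number_of_cpus) and that lines starting with # are comments. The names Host and parse_hosts are hypothetical.

```python
from typing import List, NamedTuple


class Host(NamedTuple):
    """One working node entry (hypothetical structure, not part of PARK)."""
    ip_address: str
    full_name: str
    alias: str
    port: int
    cpus: int


def parse_hosts(text: str) -> List[Host]:
    """Parse lines of the assumed form 'ip full_name alias:port:cpus'."""
    hosts = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        ip_address, full_name, last = line.split()
        alias, port, cpus = last.split(":")
        hosts.append(Host(ip_address, full_name, alias, int(port), int(cpus)))
    return hosts


SAMPLE = """\
# 4 nodes, each node with 2 cpus
127.0.0.1 localhost.localdomain localhost:5300:2
172.16.255.251 n4.ncnr.nist.gov n4:6500:2
"""

hosts = parse_hosts(SAMPLE)
total_cpus = sum(h.cpus for h in hosts)
```

The total CPU count gives the number of work units the cluster could run at once, which is the kind of information a master node needs before mapping jobs.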

8
Start the server
  • The server is park/servers/mapServer.py
  • cd park/servers
  • python mapServer.py
  • Or, in cygwin on Windows:
  • cd Lib/site-packages/park/servers
  • python mapServer.py
  • The full command is
  • python mapServer.py port port host host_name
    log log_file_name.

9
Start the server
  • Make sure that python and its environment are
    set up correctly.
  • Make sure that RSH, defined in
    park/servers/environ.py, is set to the remote
    shell command for a cluster with multiple working
    nodes.
  • Make sure that this remote shell command can
    start the remote command without a password.
  • Make sure that the services are executable files.
  • Common errors
  • Errno 2 No such file or directory
    '/park/config/hosts': the hosts config file is
    missing.
  • ERROR (111, 'Connection refused'):
    the working server did not start.
  • make sure that the port is not in use
  • ERROR (xxx, 'port is used')
  • Wait a while before restarting the server
  • make sure that the port is not in use
10
Stop the server
  • Shut down the service server with Ctrl-C or the
    kill command.
  • Use kill without the -9 option, so that the
    working server programs are stopped as well.
    Otherwise the working servers will keep running
    even after the service server is killed.

11
Start the client
  • Enter /park/client
  • Run the client application
  • python AppJob.py
  • Connect to the server
  • server > server port (default port is 5400)
  • click the connect button to connect to the server.
  • Prepare and submit the service request
  • shell > load: load the XML service request, which
    will be shown in the upper text field
  • click the submit button to submit the service
    request
  • messages related to the service request are shown
    in the lower text field.
  • View the service results
  • view: view the results.
  • There are 3 types of data to be viewed:
    experimental data (with error bars), simulation
    data, and chi square. The experimental and
    simulation data show only the best results, and
    chi square shows the improvement of chisq during
    data fitting. Under the panel is a toolbar, which
    can be used to zoom in/out, save the figure, and
    change the properties of the figure (property
    button).
  • Shut down the client
  • server > disconnect, then close the window
  • or close the window directly.

12
Map-reduce parallel pattern
  • Map: the master node assigns working unit i to
    working node j
  • map(fn, input_i) = output_i, computed on working
    node j
  • Reduce: the master node collects the messages
    from each working node,
  • performs the reduce function, and sends the
    result to the user
  • reduce(gn, output_0, ..., output_n) => sent to
    the user client

[Diagram: the service server maps work to the working nodes and reduces their results.]
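
The map-reduce pattern can be sketched in a few lines of Python. This is an illustration only: a thread pool stands in for the working nodes, and fn and gn are made-up map and reduce functions (a squared residual and a running minimum), not PARK services.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def fn(x: float) -> float:
    """Map function: one unit of work (here, a squared residual)."""
    return (x - 3.0) ** 2


def gn(best: float, output: float) -> float:
    """Reduce function: keep the smaller of two values."""
    return min(best, output)


def map_reduce(inputs, workers=4):
    # Map: each input is handed to one of `workers` threads,
    # which stand in for the working nodes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        outputs = list(pool.map(fn, inputs))
    # Reduce: the master combines all outputs into one result for the client.
    return reduce(gn, outputs, math.inf)


result = map_reduce([0.0, 1.0, 2.5, 4.0])
```

Here the reduce step returns the smallest squared residual, which mirrors how a fitting service would report the best chi square found so far.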
13
Service request
      <?xml version='1.0' encoding='UTF-8'?>
      <session version='2.0.1' type='7' user='wwchen'
               email='wwchen@nist.gov' priority='0' >
        <group name='group1'>
          <dataSet>
          </dataSet>
          <reduce classname='Chisq'/>
          <task cmd='longwinstr.py' >
            <bufsize value='3000'/>
            <home value='/home/wwchen/dansesrc/park/services/tester'/>
            <cwd value='/home/wwchen/dansesrc/park/servers/tester'/>
          </task>
          <joblist name='job1' priority='4' cnt='4' >
            <input count='24'>
            </input>
          </joblist>
        </group>
      </session>

[Callouts: the reduce element names the reduce function, the task element names the map function, and the input elements carry the inputs.]
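
A service request of this shape can be inspected with the standard library. The sketch below uses a shortened copy of the request and a hypothetical helper, summarize, to pull out the reduce class, the map command, and the input counts; it is not how PARK itself parses requests.

```python
import xml.etree.ElementTree as ET

REQUEST = """<?xml version='1.0' encoding='UTF-8'?>
<session version='2.0.1' type='7' user='wwchen' priority='0'>
  <group name='group1'>
    <reduce classname='Chisq'/>
    <task cmd='longwinstr.py'>
      <bufsize value='3000'/>
    </task>
    <joblist name='job1' priority='4' cnt='4'>
      <input count='24'></input>
    </joblist>
  </group>
</session>"""


def summarize(request_xml: str) -> dict:
    """Collect the main parts of a request (hypothetical helper)."""
    # Parse from bytes so the encoding declaration is honoured.
    root = ET.fromstring(request_xml.encode("utf-8"))
    group = root.find("group")
    joblist = group.find("joblist")
    return {
        "user": root.get("user"),
        "reduce": group.find("reduce").get("classname"),
        "map": group.find("task").get("cmd"),
        "inputs": [int(node.get("count")) for node in joblist.findall("input")],
    }


summary = summarize(REQUEST)
```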
14
Software Infrastructure of PARK for data fitting
[Diagram components: Data presentation, Data reduction, User Interface, Service Server, Data View, Working Nodes]
15
Reduce function
  • The class inherits from
    park/services/reduce/reduce.Reduce.

      class Reduce:
          """ A base class as the reduce function. """
          def __init__(self):
              """ constructor. """
              self.archive = None
              self.msgqueue = None
          def setArchive(self, archive):
              """ set the archive to store data """
              self.archive = archive
          def setMsgQueue(self, msgqueue):
              """ set the message queue. """
              self.msgqueue = msgqueue
          def __call__(self, msg):
              """ called by the PARK to process the reply
                  from the working node. """
              pass

16
An example of Reduce function
  • park/services/reduce/Chisq.Chisq

      class Chisq(Reduce):
          """ A class to handle the chisq for data fitting. """
          def __init__(self):
              """ constructor. """
              Reduce.__init__(self)
              self.chisq = None
          def __call__(self, reply):
              # archive every reply under its group and job ids
              keys = {}
              keys['gid'] = reply.gid
              keys['jid'] = reply.id
              self.archive.put(keys, str(reply))
              if hasattr(reply, 'chisq'):
                  # value reported by the working node
                  chisqval = reply.chisq
                  # keep the smallest chisq seen so far
                  if self.chisq is None:
                      self.chisq = chisqval
                  elif chisqval < self.chisq:
                      self.chisq = chisqval
                  self.msgqueue.putMsg(reply.gid,
                      '%s<reply gid="%s" update="%s" chisq=%s/>' % \
                      (XML_HEADER, str(reply.gid), str(reply.id),
                       str(chisqval)))
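
The behaviour of such a reduce callable can be tried outside PARK with stand-in objects. In this sketch ListArchive, ListQueue, Reply, and BestChisq are all hypothetical; only the put and putMsg method names follow the slide. As a simplification, a message is posted only when chisq improves.

```python
class ListArchive:
    """Hypothetical archive stub: remembers every stored record."""
    def __init__(self):
        self.records = []

    def put(self, keys, value):
        self.records.append((dict(keys), value))


class ListQueue:
    """Hypothetical message-queue stub: remembers every posted message."""
    def __init__(self):
        self.messages = []

    def putMsg(self, gid, text):
        self.messages.append((gid, text))


class Reply:
    """Hypothetical reply from a working node."""
    def __init__(self, gid, jid, chisq):
        self.gid, self.id, self.chisq = gid, jid, chisq


class BestChisq:
    """Reduce callable: archives each reply and tracks the smallest chisq."""
    def __init__(self, archive, msgqueue):
        self.archive, self.msgqueue, self.chisq = archive, msgqueue, None

    def __call__(self, reply):
        self.archive.put({"gid": reply.gid, "jid": reply.id}, str(reply.chisq))
        if self.chisq is None or reply.chisq < self.chisq:
            self.chisq = reply.chisq
            self.msgqueue.putMsg(reply.gid, "chisq improved to %s" % reply.chisq)


reducer = BestChisq(ListArchive(), ListQueue())
for jid, value in enumerate([5.0, 7.5, 2.25]):
    reducer(Reply("group1", jid, value))
```

After the three replies, reducer.chisq holds the smallest value, every reply is in the archive, and the queue holds one message per improvement.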

17
map function
  • The pure python function
  • - Runs as a thread in PARK.
  • Poor scalability on SMP (due to the python
    multithreading implementation)
  • Only works for pure python functions.
  • Format:
  • output_string = function_name(input_string)
  • The executable program
  • - Runs as a separate process in PARK.
  • Excellent scalability on SMP
  • Works for any executable program
  • Needs more memory and has a long start-up time
  • Reads input from standard input and writes the
    results to standard output.
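
The two kinds of map function share the same contract: a string goes in and a string comes out. The sketch below shows both side by side. It is an illustration, not PARK's dispatch code; WORKER_SOURCE, run_executable_map, and pure_python_map are hypothetical names, and the worker simply upper-cases its input.

```python
import subprocess
import sys

# Stand-in for an executable service: reads standard input and
# writes the upper-cased text to standard output.
WORKER_SOURCE = "import sys; sys.stdout.write(sys.stdin.read().upper())"


def run_executable_map(input_string: str) -> str:
    """Run the worker as a separate process and return what it printed."""
    completed = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE],
        input=input_string,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


def pure_python_map(input_string: str) -> str:
    """The same work as an in-process function: string in, string out."""
    return input_string.upper()


output = run_executable_map("<input count='24'/>")
```

The process version pays the interpreter start-up cost on every call, which matches the trade-off listed above: better SMP scaling, but more memory and a longer start-up time.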

18
An example of map function
  • park/services/tester/longwinstr.py

      if __name__ == '__main__':
          try:
              longwin()
          except:
              sys.stderr.write('Exception: %s' %
                               (sys.exc_info()[1]))

19
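An example of map function

      import math
      import sys
      from xml.dom import minidom

      def longwin():
          print 'call longwin'
          # the input element arrives on standard input
          s0 = sys.stdin.read()
          node = minidom.parseString(s0).childNodes[0]
          t = int(node.getAttribute('count'))
          if t > 25:
              count = t
          else:
              # operator lost in extraction; 2 ** t is assumed
              count = 2 ** t
          print ' Start work with iteration number ', t
          # busy loop that stands in for real work
          cnt = 0
          while (cnt < count):
              a = math.sqrt(2.0)
              cnt += 1
          print ' finish work cnt', cnt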

20
Fully Distributed Services?
User
Client
Services
Service Register
Message Queue
Job Queue
Cluster Management
Task Management
Service Management
Data Fetching
Archive
Logging
Shared Files
21
Pull or put?
1. The job server sends the job to the working server, and the working server sends the results to the message server.
2. The job server sends the job to the working server, and the message server retrieves the results from the working server.
3. The working server retrieves the job from the job server and sends the results to the message server.
4. The working server retrieves the job from the job server, and the message server retrieves the results from the working server.
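
One of these combinations, in which the working server pulls jobs and pushes results, can be sketched with in-process queues. This is an illustration under stated assumptions: two queue.Queue objects stand in for the job server and the message server, threads stand in for working servers, and the job itself (squaring a number) is made up.

```python
import queue
import threading


def working_server(jobs: queue.Queue, messages: queue.Queue) -> None:
    """Pull jobs from the job server, push results to the message server."""
    while True:
        job = jobs.get()                # pull: blocks until a job is available
        if job is None:                 # sentinel: no more work
            break
        messages.put((job, job * job))  # push the result


jobs, messages = queue.Queue(), queue.Queue()
for n in range(5):
    jobs.put(n)

workers = [
    threading.Thread(target=working_server, args=(jobs, messages))
    for _ in range(2)
]
for w in workers:
    jobs.put(None)  # one sentinel per worker, queued after the real jobs
    w.start()
for w in workers:
    w.join()

results = dict(messages.get() for _ in range(5))
```

With pulling, an idle working server takes the next job as soon as it is free, so the job server does not need to track which node is busy.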
22
Security: authentication and authorization
[Diagram components: Job Server, Security Server, Message Server, Working Server]
23
Data Transfer
  • Provide a data center server for the cluster,
    which retrieves data from the remote data server
    and stores it for access by the local working
    nodes. Necessary for diskless nodes in the
    cluster.
  • Provide a reference to the remote data (similar
    to a url), and let each working node access the
    data individually.

24
UI/Visualization
MVC model, Traits-UI, 2D/3D
25
Multi-tier of PARK
[Diagram components: Client, Service Server, Reduce Server, Working Server, Data Server]
[Diagram legend: explicit direct connection, implicit direct connection, possible connection]
All act as both server and client.