Title: Tutorial for PARK data fitting
1Tutorial for PARK data fitting
- Paul KIENZLE, Wenwu CHEN and Ziwen FU
- Reflectometry Group
2Objective Distributed Computing Environment
[Diagram: several users/clients connect to the service server (master node), which manages the working servers on the working nodes of the cluster.]
3Prerequisite
- Python: version > 2.4
- Windows: cygwin
- Client: wxPython version > 2.6, matplotlib
- Most services may need numpy
4Setup of park
- Download the source code
  - Source code: svn co svn://svn@danse.us/park
  - Package for Unix/Linux: park-0.2.0.tar.gz or park-0.2.0.tar.bz2
  - Package for Windows: park-0.2.0.zip
- Edit the cluster config file
  - park/config/hosts
- Start the service server
  - park/servers/mapServer.py
- Start the client
  - park/client/AppJob.py
- Provide services
  - park/services
5Setup of park in Unix/Linux
- Download park-0.2.0.tar.gz or park-0.2.0.tar.bz2 from http://danse.us
- Unzip the file
  - tar xvzf park-0.2.0.tar.gz
- Make the installation
  - cd park-0.2.0
  - make install
  - or
  - setup.py install --install-purelib=home_directory_of_park
- The command make install is equivalent to setup.py install with the --install-purelib option. It will install park in directory /park.
6Setup of park in Windows
- Download park-0.2.0.zip or park-0.2.0.tar.bz2 from http://danse.us
- Unzip the file
  - unzip park-0.2.0.zip
- Make the installation in an MS-DOS window
  - cd park-0.2.0
  - setup.py install
- It will install park in directory /Lib/site-packages/park.
7Edit the config file
- The server uses park/config/hosts to configure the working nodes.
- Example of park/config/hosts:

    #
    # hosts configuration file for park
    # example for the compufans.ncnr.nist.gov cluster
    # 4 nodes, each node with 2 cpus
    #
    # the format is similar to that of /etc/hosts
    # ip_address full_name alias_name:port:number_of_cpus
    #
    127.0.0.1 localhost.localdomain localhost:5300:2
    172.16.255.251 n4.ncnr.nist.gov n4:6500:2
    172.16.255.252 n3.ncnr.nist.gov n3:6300:2
    172.16.255.253 n2.ncnr.nist.gov n2:6200:2
    172.16.255.254 n1.ncnr.nist.gov n1:6100:2
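A hosts file of this kind can be read with a few lines of Python. The sketch below is only an illustration, not PARK's own parser: it assumes that comment lines start with `#` and that the last field has the form `alias:port:cpus` (the `:` separator is an assumption), and the names `Host` and `parse_hosts` are hypothetical.

```python
from collections import namedtuple

# Hypothetical record type for one line of the hosts file.
Host = namedtuple("Host", "ip full_name alias port cpus")


def parse_hosts(text):
    """Parse hosts-file text into a list of Host records.

    Assumed line format (not confirmed by PARK itself):
        ip_address full_name alias:port:number_of_cpus
    Anything after '#' is treated as a comment.
    """
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        ip, full_name, last = line.split()
        alias, port, cpus = last.split(":")
        hosts.append(Host(ip, full_name, alias, int(port), int(cpus)))
    return hosts


if __name__ == "__main__":
    sample = """
    # example cluster
    127.0.0.1 localhost.localdomain localhost:5300:2
    172.16.255.251 n4.ncnr.nist.gov n4:6500:2
    """
    for host in parse_hosts(sample):
        print(host)
```

Summing `host.cpus` over the parsed records gives the number of working slots the cluster could offer under this reading of the file.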
8Start the server
- The server is park/servers/mapServer.py
- cd park/servers
- python mapServer.py
- Or in cygwin in Windows
- cd Lib/site-packages/park/servers
- python mapServer.py
- The full command is
- python mapServer.py --port port --host host_name --log log_file_name
9Start the server
- Make sure that python and its environment are set correctly.
- Make sure that RSH, defined in park/servers/environ.py, is set to the remote shell command for a cluster with multiple working nodes.
- Make sure that this remote shell command can start the remote command without a password.
- Make sure that the services are executable files.
- Common errors
  - [Errno 2] No such file or directory: '/park/config/hosts'
    - the hosts configuration file is missing.
  - ERROR (111, 'Connection refused')
    - the working server did not start.
    - make sure that the port is not used
  - ERROR (xxx, 'port is used')
    - wait a while before restarting the server
    - make sure that the port is not used
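Whether a port is already in use can be checked before starting the server. The following is a minimal sketch using only the standard `socket` module. The helper name `port_is_free` is hypothetical and is not part of PARK.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently be bound to (host, port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return True
    except OSError:
        # Typically "address already in use": another process holds the port.
        return False
    finally:
        sock.close()


if __name__ == "__main__":
    print("port 5400 free:", port_is_free(5400))
```

A port that was just released may stay unavailable for a short time, which is one reason to wait a while before restarting the server.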
10Stop the server
- Shut down the service server with Ctrl-C or the kill command.
- Use kill without the -9 option, which also stops the working server programs. Otherwise the working servers will continue to run even after the service server is killed.
11Start the client
- Enter /park/client
- Run the client application
  - python AppJob.py
- Connect to the server
  - server > server port (default port is 5400)
  - click the connect button to connect to the server.
- Prepare and submit the service request
  - shell > load: load the xml service request, which will be shown in the upper text field
  - click the submit button to submit the service request
  - messages related to the service request are shown in the lower text field.
- View the service results
  - view: view the results.
  - There are 3 types of data to be viewed: experimental data (with error bars), simulation data, and chi square. The experimental and simulation data show only the best results, and chi square shows the improvement of chisq during data fitting. Under the panel is a toolbar, which can be used to zoom in/out, save the figure, and change the properties of the figure (property button).
- Shut down the client
  - server > disconnect, then close the window
  - or close the window directly.
12Map-reduce parallel pattern
- Map: the master node assigns working unit i to working node j
  - map(fn, input_i) -> output_i on working node j
- Reduce: the master node collects the message from each working node, performs the reduce function, and sends the result to the user
  - reduce(gn, output_0, ..., output_n) -> send to the user client
[Diagram: the service server maps work onto the working nodes, then reduces their outputs.]
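The map-reduce pattern can be sketched in plain Python. This is an illustration only, not PARK code: a thread pool stands in for the working nodes, `functools.reduce` applies `gn` pairwise rather than to all outputs at once, and the names `map_reduce` and `n_workers` are hypothetical.

```python
from functools import reduce
from multiprocessing.pool import ThreadPool


def map_reduce(fn, gn, inputs, n_workers=4):
    """Apply fn to every input in parallel, then fold the outputs with gn."""
    pool = ThreadPool(n_workers)  # stand-in for the working nodes
    try:
        # Map step: each input_i is handed to some worker, giving output_i.
        outputs = pool.map(fn, inputs)
    finally:
        pool.close()
        pool.join()
    # Reduce step: the "master" combines the outputs into one result.
    return reduce(gn, outputs)


if __name__ == "__main__":
    # Toy fit: find the smallest squared residual among trial values.
    trials = [0.0, 1.0, 2.0, 3.0, 4.0]
    best = map_reduce(lambda x: (x - 3.0) ** 2, min, trials)
    print("best chisq:", best)
```

Using `min` as the reduce function mirrors the data-fitting case, where only the best chi square among the working nodes is of interest.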
13Service request
    <?xml version='1.0' encoding='UTF-8'?>
    <session version='2.0.1' type='7' user='wwchen' email='wwchen@nist.gov' priority='0'>
      <group name='group1'>
        <dataSet>
        </dataSet>
        <reduce classname='Chisq'/>
        <task cmd='longwinstr.py'>
          <bufsize value='3000'/>
          <home value='/home/wwchen/dansesrc/park/services/tester'/>
          <cwd value='/home/wwchen/dansesrc/park/servers/tester'/>
        </task>
        <joblist name='job1' priority='4' cnt='4'>
          <input count='24'>
          </input>
        </joblist>
      </group>
    </session>

[Slide callouts: reduce function (the reduce element), map function (the task element), inputs (the input elements).]
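To see which parts of a request name the reduce function, the map function, and the inputs, the request can be read with the standard `xml.dom.minidom` module. The sketch below works on a trimmed copy of the request. The function `summarize_request` and the dictionary keys are hypothetical and do not reflect how PARK itself processes requests.

```python
from xml.dom import minidom

# Trimmed copy of the service request, for illustration only.
REQUEST = """<session version='2.0.1' type='7' user='wwchen' priority='0'>
  <group name='group1'>
    <reduce classname='Chisq'/>
    <task cmd='longwinstr.py'>
      <bufsize value='3000'/>
    </task>
    <joblist name='job1' priority='4' cnt='4'>
      <input count='24'></input>
    </joblist>
  </group>
</session>"""


def summarize_request(xml_text):
    """Return one summary dict per <group> in the request."""
    doc = minidom.parseString(xml_text)
    summary = []
    for group in doc.getElementsByTagName("group"):
        reduce_node = group.getElementsByTagName("reduce")[0]
        task_node = group.getElementsByTagName("task")[0]
        counts = [int(node.getAttribute("count"))
                  for node in group.getElementsByTagName("input")]
        summary.append({
            "group": group.getAttribute("name"),
            "reduce": reduce_node.getAttribute("classname"),  # reduce function
            "map": task_node.getAttribute("cmd"),             # map function
            "inputs": counts,                                 # one per input
        })
    return summary


if __name__ == "__main__":
    print(summarize_request(REQUEST))
```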
14Software Infrastructure of PARK for data fitting
[Diagram labels: user interface, service server, working nodes; data view, data reduction, data presentation.]
15Reduce function
- A reduce class inherits from park/services/reduce/reduce.Reduce.

    class Reduce:
        """ A base class as the reduce function. """
        def __init__(self):
            """ constructor. """
            self.archive = None
            self.msgqueue = None
        def setArchive(self, archive):
            """ set the archive to store data """
            self.archive = archive
        def setMsgQueue(self, msgqueue):
            """ set the message queue. """
            self.msgqueue = msgqueue
        def __call__(self, msg):
            """ called by the PARK to process the reply from the working node. """
            pass
16An example of a Reduce function
- park/services/reduce/Chisq.Chisq

    class Chisq(Reduce):
        """ A class to handle the chisq for data fitting. """
        def __init__(self):
            """ constructor. """
            Reduce.__init__(self)
            self.chisq = None
        def __call__(self, reply):
            # archive every reply under its group id and job id
            keys = {}
            keys['gid'] = reply.gid
            keys['jid'] = reply.id
            self.archive.put(keys, str(reply))
            if hasattr(reply, 'chisq'):
                chisqval = reply.chisq
                # keep the smallest chisq seen so far
                if self.chisq is None:
                    self.chisq = chisqval
                elif chisqval < self.chisq:
                    self.chisq = chisqval
                self.msgqueue.putMsg(reply.gid, '%s<reply gid="%s" update="%s" chisq="%s"/>' \
                    % (XML_HEADER, str(reply.gid), str(reply.id), str(chisqval)))
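How such a reduce object behaves can be tried outside PARK with small stand-ins. Everything in the sketch below is hypothetical except the method names taken from the slides (`setArchive`, `setMsgQueue`, `put`, `putMsg`): `ListArchive`, `ListQueue`, `Reply`, and `MinChisq` are illustrative classes, and the message text is simplified.

```python
class Reduce(object):
    """Stand-in for the base class shown on the previous slide."""
    def __init__(self):
        self.archive = None
        self.msgqueue = None

    def setArchive(self, archive):
        self.archive = archive

    def setMsgQueue(self, msgqueue):
        self.msgqueue = msgqueue


class MinChisq(Reduce):
    """Keep the smallest chisq among the replies (simplified Chisq)."""
    def __init__(self):
        Reduce.__init__(self)
        self.chisq = None

    def __call__(self, reply):
        self.archive.put({'gid': reply.gid, 'jid': reply.id}, str(reply))
        if hasattr(reply, 'chisq'):
            if self.chisq is None or reply.chisq < self.chisq:
                self.chisq = reply.chisq
            self.msgqueue.putMsg(reply.gid, 'chisq=%s' % reply.chisq)


class ListArchive(object):
    """Hypothetical archive that just remembers what was stored."""
    def __init__(self):
        self.items = []

    def put(self, keys, value):
        self.items.append((keys, value))


class ListQueue(object):
    """Hypothetical message queue that just remembers the messages."""
    def __init__(self):
        self.messages = []

    def putMsg(self, gid, msg):
        self.messages.append((gid, msg))


class Reply(object):
    """Hypothetical reply from a working node."""
    def __init__(self, gid, jid, chisq):
        self.gid, self.id, self.chisq = gid, jid, chisq


def demo():
    reducer = MinChisq()
    reducer.setArchive(ListArchive())
    reducer.setMsgQueue(ListQueue())
    for jid, chisq in enumerate([5.0, 2.0, 3.0]):
        reducer(Reply('group1', jid, chisq))
    return reducer


if __name__ == "__main__":
    print("best chisq:", demo().chisq)
```

Because the object is called once per reply, it can keep running state (here the best chisq) between replies.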
17map function
- A pure python function
  - Runs as a thread in PARK.
  - Bad scalability on SMP (due to the python multithreading implementation)
  - Only works for pure python functions.
  - Format: output_string = function_name(input_string)
- An executable program
  - Runs as a separate process in PARK.
  - Excellent scalability on SMP
  - Works for any executable program
  - Needs more memory and a longer start-up time
  - Reads its input from standard input and writes the results to standard output.
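The two kinds of map function differ mainly in how they are invoked. The sketch below illustrates both calling conventions with the standard library. It does not show how PARK launches services, and the names `upper_map`, `run_function`, and `run_executable` are hypothetical.

```python
import subprocess
import sys


def upper_map(input_string):
    """A pure python map function: string in, string out."""
    return input_string.upper()


def run_function(fn, input_string):
    """Call a pure python map function in the current process."""
    return fn(input_string)


def run_executable(argv, input_string):
    """Run an executable map function as a separate process.

    The input is written to the program's standard input and the
    result is read back from its standard output.
    """
    proc = subprocess.run(argv, input=input_string, capture_output=True,
                          text=True, check=True)
    return proc.stdout


if __name__ == "__main__":
    print(run_function(upper_map, "abc"))
    # Any executable works; here the python interpreter runs a one-liner.
    one_liner = "import sys; sys.stdout.write(sys.stdin.read().upper())"
    print(run_executable([sys.executable, "-c", one_liner], "abc"))
```

The process-based call pays the start-up cost of a new interpreter or program each time, which is the trade-off listed above.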
18An example of a map function
- park/services/tester/longwinstr.py

    if __name__ == '__main__':
        try:
            longwin()
        except Exception:
            # report the failure on standard error, not in the result
            sys.stderr.write('Exceptions: %s\n' % (sys.exc_info()[1],))
19An example of a map function
    import math
    import sys
    from xml.dom import minidom

    def longwin():
        print('call longwin')
        # the input element arrives on standard input
        s0 = sys.stdin.read()
        node = minidom.parseString(s0).childNodes[0]
        t = int(node.getAttribute('count'))
        if t > 25:
            count = t
        else:
            count = 2 ** t
        print(' Start work with iteration number %d' % t)
        cnt = 0
        while cnt < count:
            a = math.sqrt(2.0)
            cnt += 1

        print(' finish work cnt=%d' % cnt)
20Fully Distributed Services ?
[Diagram: user and client, with the candidate distributed services: services, service register, message queue, job queue, cluster management, task management, service management, data fetching, archive, logging, shared files.]
21Pull or put ?
1. The job server sends jobs to the working server, and the working server sends results to the message server.
2. The job server sends jobs to the working server, and the message server retrieves results from the working server.
3. The working server retrieves jobs from the job server and sends results to the message server.
4. The working server retrieves jobs from the job server, and the message server retrieves results from the working server.
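Option 3 (the working server pulls jobs and pushes results) can be sketched with in-process queues. This is only an illustration: threads and `queue.Queue` objects stand in for the job server, message server, and working servers, and the names `working_server` and `run_pull_model` are hypothetical.

```python
import queue
import threading


def working_server(job_queue, message_queue, fn):
    """Pull jobs until a None sentinel arrives; push each result."""
    while True:
        job = job_queue.get()  # the worker retrieves the job itself
        if job is None:
            break
        job_id, payload = job
        message_queue.put((job_id, fn(payload)))  # the worker sends the result


def run_pull_model(fn, payloads, n_workers=2):
    """Run fn over payloads with pulling workers; results keep input order."""
    payloads = list(payloads)
    job_queue = queue.Queue()      # stand-in for the job server
    message_queue = queue.Queue()  # stand-in for the message server
    for job in enumerate(payloads):
        job_queue.put(job)
    for _ in range(n_workers):
        job_queue.put(None)  # one stop sentinel per worker
    workers = [threading.Thread(target=working_server,
                                args=(job_queue, message_queue, fn))
               for _ in range(n_workers)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    results = {}
    while not message_queue.empty():
        job_id, output = message_queue.get()
        results[job_id] = output
    return [results[i] for i in range(len(payloads))]


if __name__ == "__main__":
    print(run_pull_model(lambda x: x * 2, [1, 2, 3]))
```

With pulling workers, a fast node simply takes more jobs, so the job server does not need to track the load on each node.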
22Security authentication and authorization
[Diagram: job server, security server, message server, and working server.]
23Data Transfer
- Provide a data center server for the cluster, which retrieves data from the remote data server and stores it for access by the local working nodes. This is necessary for diskless nodes in the cluster.
- Provide a reference to the remote data (similar to a URL), and each working node accesses the data individually.
24UI/Visualization
- MVC model
- Traits-UI
- 2D/3D
25Multi-tier of PARK
[Diagram: the client and the service, reduce, working, and data servers, joined by three kinds of links: explicit direct connection, implicit direct connection, and possible connection.]
All are working as both the server and the client