VOLDEMORT

Slides: 28
Provided by: ValuedSon
Learn more at: http://home.fnal.gov

1
VOLDEMORT
  • VOLatile Distribution of Electronic Media Over
    Rsync Transport
  • May 22, 2003
  • LSCCW
  • Steven Timm

2
Introduction
  • Disclaimer: any resemblance to characters in the
    Harry Potter books of J.K. Rowling is pure
    coincidence.
  • Rsync is an open-source package that keeps
    directories on remote machines synchronized with
    each other
  • Common method at many installations of
    distributing volatile local files on machines
    that are already installed.
  • Needs a set of supporting scripts to make it a
    useful tool.

3
VOLDEMORT overview
  • Currently deployed on over 700 machines at
    Fermilab
  • Works on RH Linux 6, 7, 9, Advanced Server, Sun,
    SGI
  • Written in shell scripts and perl
  • Two major uses:
      • Keeping production computing farm installations
        up to date without reinstalling
      • Partitioning US/CMS computing dynamically

4
Major Goals of Voldemort
  • Replace NIS with a system that puts passwd files
    locally on each node
  • Have a unified structure to push new files out to
    existing nodes and install them on new nodes
  • Have a single place where each volatile file is
    modified.
  • Keep current capability to have special files for
    a single farm, subcluster, or node.
  • Production Farms plus US/CMS have at least 13
    different hardware configurations

5
Components of Voldemort 0.6
  • Voldemort-push, includes rsync_push binary and
    scripts to clone slave servers. Installed on all
    servers.
  • Voldemort, installed on all clients, includes
    pullrsync and a number of auxiliary scripts that
    are called by pusher and puller.
  • Tree file structure, set up in
    VOLDEMORT_DIR/clusters
  • Databases to describe the file structure
  • Available as RPM or in Fermi ups/upd format.

6
Features
  • Pullrsync included in 7.3 Farms workgroup
    post-install of Fermi Linux
  • Scripts tested on all flavors of Linux
  • Includes option to sync out changes in Fermi
    Linux comps file
  • But not tied to Fermi Linux: it can and does work
    with other installation systems such as Rocks or
    System Imager

7
Why replace NIS?
  • NIS was stable on the CDF farms (169 nodes, no
    timeouts for months), BUT
  • We had to have at least 64 NIS slave servers to
    accomplish this
  • Pushing to all those slave servers is a network
    load in itself
  • yppush doesn't handle a down node gracefully
  • Initial configuration with ypinit -s is error-prone
    and can't be automated during install.

8
Why replace NIS? (cont'd)
  • Malformed map on one slave server can mess up
    several nodes
  • NIS generates only a small amount of network
    traffic but is very sensitive to larger network
    flows and is disrupted by them.
  • On our farms, we don't store any real passwords
    in NIS, and accounts change rarely. This is an
    ideal situation for distributing files instead.

9
Installer vs on-line changes
  • Whenever we made a change to the farm, we had to
    make it in two places: on the nodes and in the
    installer.
  • Often this has been forgotten
  • Method of making installer changes is not
    straightforward
  • Need to make a system where any file that goes on
    the system is only changed in ONE place.

10
Down nodes problem
  • Right now, if we put extra files on the system,
    we have to go back later and manually fix the
    nodes that were down.
  • Need a system that will remember which nodes were
    down, and keep retrying until it gets them all.
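
A minimal Python sketch of the "remember and retry" idea, not the actual Voldemort code (which is shell and perl). The function name `push_until_done`, the `push` callback, and the `max_rounds` limit are hypothetical.

```python
from typing import Callable, Iterable


def push_until_done(nodes: Iterable[str],
                    push: Callable[[str], bool],
                    max_rounds: int = 5) -> list[str]:
    """Push to every node, retrying only the ones that failed.

    `push` is a hypothetical callback returning True on success.
    Returns the nodes still failing after `max_rounds` rounds.
    """
    pending = list(nodes)
    for _ in range(max_rounds):
        if not pending:
            break
        # Keep only the nodes whose push failed; they are retried next round.
        pending = [node for node in pending if not push(node)]
    return pending
```

A real tool would presumably write the returned list to disk so that a later run can pick up the nodes that were still down.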

11
Design goals of Voldemort
  • Do not put any node-specific info into the Fermi
    Linux workgroup: we don't want our whole structure
    (or our account names, groups, or NFS servers)
    available to the world via anonymous FTP.
  • Replace /export/linux/Workgroups/Farms/nodes with
    a new structure that is used both by online
    activities and the installer.
  • Keep our current capability to have node-specific,
    farm-specific, and subcluster-specific files

12
VOLDEMORT_DIR/clusters/common/db/nodes.conf
  • nodes.conf: database of nodes.
  • Read by both rsync_push and pullrsync
  • Example record:
    fncdf75cdffarm1Linux2.4.18i-acd38400N2518 Reader
  • Fields:
      • Node name
      • Cluster name
      • Flavor
      • Disk arrangement
      • Baud rate
      • APIC used in install
      • Node-specific
      • Subclusters
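
As an illustration only, one node record could be modeled as below in Python. The slide does not show the real separator or field order, so the colon separator, the comma-separated subcluster list, the field order, and the names `NodeRecord` and `parse_node_line` are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeRecord:
    name: str
    cluster: str
    flavor: str
    disk_arrangement: str
    baud_rate: int
    apic: str
    node_specific: bool            # True when the field is "Y"
    subclusters: tuple[str, ...]


def parse_node_line(line: str, sep: str = ":") -> NodeRecord:
    """Parse one record. Separator and field order are assumed, not documented."""
    fields = [field.strip() for field in line.strip().split(sep)]
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}: {line!r}")
    name, cluster, flavor, disk, baud, apic, specific, subs = fields
    return NodeRecord(
        name=name,
        cluster=cluster,
        flavor=flavor,
        disk_arrangement=disk,
        baud_rate=int(baud),
        apic=apic,
        node_specific=(specific == "Y"),
        subclusters=tuple(s for s in subs.split(",") if s),
    )
```

For example, the made-up line `node01:farmA:Linux2.4.18:ide:38400:N:Y:boardX,reader` would parse into a record whose `subclusters` are `("boardX", "reader")`.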

13
VOLDEMORT_DIR/clusters/common/db/files.conf
  • files.conf
  • Not fully populated yet.
  • Three fields:
      • Full pathname to the file
        (example: fnsfo/files/Linux2.4.18/etc/passwd)
      • Files it depends on, for this example
        common/templates/Linux2.4.18/etc/passwd and
        fnsfo/templates/NULL/yppasswd
      • Command used to make it
        (for this example, cat the above two files together)
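
The three fields read like a make-style rule: a target, its dependencies, and a build command. A rough Python sketch for the concatenation case follows. The rebuild-when-older check and the function names are assumptions, since the slide does not say how Voldemort decides when to regenerate a file.

```python
from pathlib import Path


def needs_rebuild(target: Path, deps: list[Path]) -> bool:
    """True if the target is missing or older than any dependency."""
    if not target.exists():
        return True
    target_mtime = target.stat().st_mtime
    return any(dep.stat().st_mtime > target_mtime for dep in deps)


def build_by_concatenation(target: Path, deps: list[Path]) -> bool:
    """Rebuild target as the concatenation of deps, in order.

    Returns True if the target was rewritten, False if it was up to date.
    """
    if not needs_rebuild(target, deps):
        return False
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("wb") as out:
        for dep in deps:
            out.write(dep.read_bytes())
    return True
```

With the slide's example, the target would be the cluster's etc/passwd and the dependencies the common passwd template followed by the cluster's yppasswd file.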

14
VOLDEMORT_DIR/clusters/fnpce/
  • Prescripts: scripts that have to be executed
    before an RPM or file can be installed
  • RPMS
  • Files: single files that are pushed out to worker
    nodes
  • Scripts: usually run only by the installer
  • Tarballs: mainly for pushing out the /local/ups
    directory to worker nodes

15
VOLDEMORT_DIR/clusters/fnpce/files
  • Under each category, there is space for more than
    one flavor. Right now:
      • Linux2.4.18 (731)
      • Linux2.4 (711)
      • Linux2.2 (612)
      • IRIX6.5
  • Can also define an arbitrary flavor foo as long as
    the database matches.

16
VOLDEMORT_DIR/clusters/fnpce/files/Linux2.4.18
  • Each subdirectory of the files directory gets
    pushed out independently, governed by .pushdir
    files
  • Four subdirectories (typically): /etc, /root,
    /usr/local, /var/adm
  • Three types of files:
      • passwd, group, netgroup, auto., .k5login
      • Non-standard config files for RPMS in the Red Hat
        base
      • Hardware-specific or farm-specific files

17
VOLDEMORT_DIR/clusters/fnpce/tarballs/Linux2.4.18
  • Currently only one tarball
  • Structure same as files (.pushdir governs)
  • /local/ups/localups.tar
  • Tarball should be created so that it can be
    untarred in the directory it's pushed into.
  • Had to add this option because pushing a ups/upd
    tree of 19K files (180 MB) was too slow.

18
VOLDEMORT_DIR/clusters/fnpce/RPMS/Linux2.4.18
  • RPMS that go here are either farm-specific or
    hardware-specific.
  • Anything for whole farm should go into farms
    workgroup.

19
VOLDEMORT_DIR/clusters/fnpce/prescripts/Linux2.4.18
  • Scripts and prescripts are mainly executed during
    the install
  • Installer calls /sbin/pullrsync -I, which forces
    running of all scripts
  • Scripts should be smart enough to detect if the
    action has already been done
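
A sketch of what "smart enough to detect if the action has already been done" can look like, written in Python for illustration (the real scripts are shell or perl). The helper `ensure_line` is hypothetical: it adds a line to a file only when that line is not already present, so running it again changes nothing.

```python
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Append `line` to `path` unless it is already there.

    Returns True if the file was changed, False if nothing needed doing.
    """
    text = path.read_text() if path.exists() else ""
    if line in text.splitlines():
        return False                      # already done, safe to re-run
    # Start on a fresh line if the file lacks a trailing newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with path.open("a") as handle:
        handle.write(f"{prefix}{line}\n")
    return True
```

Because the check comes before the change, a forced re-run of every script (as in install mode) leaves an already configured node untouched.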

20
Subclusters
  • Subclusters can exist in any of five categories:
    files, tarballs, RPMS, scripts, prescripts
  • Subcluster membership determined by the database
  • Convention: all hardware-specific files
    (ethernet, lm_sensors) go into a subcluster named
    after the motherboard type
  • Node can be in more than one subcluster
  • For files and tarballs, a .pushdir at the top
    level.

21
Node-specific files
  • Can also have files specific to a single node
  • Enabled by setting the node-specific field in the
    database to Y instead of the default N
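
One way to picture how cluster, subcluster, and node-specific files could combine is a layered merge, sketched below in Python. The slides do not state which layer wins when the same file appears in several, so the order used here (cluster, then subclusters, then node) is an assumption, and `files_for_node` and the `FileMap` alias are hypothetical.

```python
from typing import Mapping, Sequence

# Destination path on the node -> source path on the server (illustrative).
FileMap = Mapping[str, str]


def files_for_node(cluster: FileMap,
                   subclusters: Sequence[FileMap],
                   node: FileMap,
                   node_specific: bool) -> dict[str, str]:
    """Merge file layers; later (more specific) layers override earlier ones."""
    merged: dict[str, str] = dict(cluster)
    for sub in subclusters:
        merged.update(sub)
    if node_specific:              # database field is Y rather than N
        merged.update(node)
    return merged
```

With the flag left at N, the node layer is never consulted, which matches the slide's description of node-specific files as opt-in.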

22
Rsync_push
  • Rsync_push reads through the database and pushes
    to every node that matches the command-line
    options it was called with.
  • IMPORTANT: the default is to push to everything!
    There is an are-you-sure option now that warns
    you what you are pushing.
  • rsync_push -r allows you to retry nodes that
    didn't push successfully the first time.
  • Default transport is kerberized rsh. Others can
    be used as well.
  • To push to a node, the host principal of the server
    must be in /root/.k5login of the client node.
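
For illustration, a Python sketch of how a single push could be expressed as an rsync command line with a selectable transport. The rsync flags `-a` and `-e` are standard, but which flags rsync_push actually passes is not shown in the slides. The function name and arguments are hypothetical, and the default remote shell is the kerberized rsh path named in the options slide.

```python
def build_rsync_command(source_dir: str, node: str, dest_dir: str,
                        remote_shell: str = "/usr/krb5/bin/rsh") -> list[str]:
    """Build (but do not run) an rsync command that pushes a directory to a node."""
    return [
        "rsync",
        "-a",                          # archive mode: recurse, keep perms and times
        "-e", remote_shell,            # remote shell used as the transport
        source_dir.rstrip("/") + "/",  # trailing slash: copy contents, not the dir
        f"{node}:{dest_dir}",
    ]
```

Swapping the transport then only means passing a different `remote_shell`, for example `ssh`.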

23
Rsync_push options 1
  • -c Push for a given cluster
  • -f Push for a given flavor
  • -b Push for a list of nodes
  • -B Push for a range of nodes
  • -l push for all the nodes in

  • If more than one is specified, we take the AND
  • Example:
    rsync_push -c cdffarm -f Linux2.4.18 -B fncdf
    171 173-176
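
The AND rule can be sketched as a filter over the node list. In this Python illustration, the argument to -B is read as a name prefix followed by single numbers or number ranges, which is inferred from the example rather than documented. `expand_range` and `select_nodes` are hypothetical names.

```python
from typing import Iterable, Optional


def expand_range(prefix: str, specs: Iterable[str]) -> set[str]:
    """Expand specs such as "171" or "173-176" into full node names."""
    names: set[str] = set()
    for spec in specs:
        first, _, last = spec.partition("-")
        start = int(first)
        end = int(last) if last else start
        names.update(f"{prefix}{number}" for number in range(start, end + 1))
    return names


def select_nodes(nodes: Iterable[tuple[str, str, str]],
                 cluster: Optional[str] = None,
                 flavor: Optional[str] = None,
                 names: Optional[set[str]] = None) -> list[str]:
    """Return names of nodes matching every filter given (logical AND).

    `nodes` holds (name, cluster, flavor) tuples; a filter left as None
    matches everything, so no filters selects all nodes.
    """
    return [
        name for name, node_cluster, node_flavor in nodes
        if (cluster is None or node_cluster == cluster)
        and (flavor is None or node_flavor == flavor)
        and (names is None or name in names)
    ]
```

Under that reading, the example on the slide would select the cdffarm nodes of flavor Linux2.4.18 whose names are fncdf171 or fncdf173 through fncdf176.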

24
Rsync_push options 2
  • -R don't push the RPMS
  • -F don't push the files
  • -S don't push the scripts
  • -P don't push the prescripts
  • -T don't push the tarballs
  • -L don't push the Linux /etc/workgroup
  • Default is to push everything

25
Rsync_push options 3
  • -w specify the workgroup you are pushing (default
    is Farms)
  • -e use an alternative rsh command besides
    /usr/krb5/bin/rsh
  • -q quiet: minimum or no output
  • -v verbose: the more v's, the more verbose
  • -i Install mode: run new scripts and prescripts
    when they are pushed out
  • -I Install mode: run all scripts and prescripts
    when they are pushed out
  • -C Clear out the RPMS, scripts, and prescripts
    directory on the worker nodes.

26
pullrsync
  • Determines node ID and type either from local
    config file or from database read
  • Runs only if the machine wasn't shut down cleanly,
    and during the install
  • Options: -h (help), -H, -c, -f, -M, -t, -R, -S, -P,
    -F, -T, -L, -w, -I, -i, -q, -v (as in rsync_push)

27
Future plans
  • Version v0_6 current, no known bugs right now.
  • Next version needs better and faster database
  • Also need ability to automatically distribute the
    push across slave servers
  • Big task: integrating more closely with ROCKS and
    RH 9.0.
  • http://www-oss.fnal.gov/scs/public/farms/doc/voldemort.html