Title: TCK
1 TCK Configuration
- Or:
- How to configure the HLT (i.e. a Gaudi job) in a deterministic, repeatable way, and provide accountability and introspection (from C++), while allowing reconfigurations during running without introducing more deadtime than required
- Talk is based on an actual working, proof-of-principle implementation
  - What is described here actually works in practice, but the implementation is not necessarily suitable for serious use, and needs to be cleaned up (e.g. it might not work on Windows as-is, uses an ad-hoc file format to describe configurations, ...)
  - But at least it does work, and provides a starting point for a proper design
2 The General Idea
- There is a need for a trigger which can change configuration on the fly
  - E.g. due to decreasing instantaneous luminosity
  - Configurations might change during a run, as a full start-stop cycle takes far too long
- Need to know for a given event which configuration was used (reliably!)
  - Otherwise determining trigger efficiencies becomes a lottery, and analysis virtually impossible
- So there are different Trigger Configurations
  - To distinguish between them, we label each of these with a key: the Trigger Configuration Key (TCK)
- Given a TCK (and a TCK only!) one should be able to determine which configuration was used
  - i.e. which algorithms (and tools?) were used, in which order, and how they were configured (i.e. which cuts did they apply?)
  - And be able to do this from C++ code
- In order to ensure accountability for each event, the TCK will be part of the event data (to be precise: part of the ODIN bank)
- Given a TCK, the trigger (both L0 and HLT) will have to make sure it uses the matching configuration when processing the event
- Before processing an event, the trigger has to check whether the current configuration still matches the TCK in the event which it is about to process
  - If yes, continue
  - If not, first reconfigure accordingly, then continue
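
The check-then-reconfigure step at the end of this slide can be sketched in a few lines of Python. This is only a toy model, not Gaudi code: the `Trigger` class, the event dictionaries, and the `min_pt` cut are all hypothetical names chosen for illustration.

```python
class Trigger:
    """Toy stand-in for the trigger; not Gaudi code."""

    def __init__(self, configurations, initial_tck):
        # configurations: TCK -> {property name: value} (hypothetical layout)
        self.configurations = configurations
        self.current_tck = None
        self.active = {}
        self.reconfigure_count = 0
        self.reconfigure(initial_tck)

    def reconfigure(self, tck):
        # An unknown TCK raises KeyError: never process with a guessed configuration.
        self.active = dict(self.configurations[tck])
        self.current_tck = tck
        self.reconfigure_count += 1

    def process(self, event):
        # Before processing, compare the TCK carried by the event with the
        # configuration currently in use; reconfigure only on a mismatch.
        if event["tck"] != self.current_tck:
            self.reconfigure(event["tck"])
        return event["pt"] > self.active["min_pt"]
```

Because the decision depends only on the TCK stored with the event, replaying the same events always gives the same decisions, whatever configuration was active beforehand.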
3 Configurations: back to basics
- A Gaudi program is configured by specifying Properties
  - Either by their default values, or by joboptions, or through python (e.g. Configurables, but not restricted to them), or by explicit C++ calls, or ...
- The sequencing of Algorithms
  - Which algorithms run, and the order in which they run, is specified by the TopAlg property of the AppMgr
  - And, in case one of those Algorithms is a Sequencer, by the Members property of the Sequencer (recursively)
  - Or, better, by their subalgorithms (provided they are created by calling Algorithm::createSubAlgorithm instead of talking directly to the AppMgr; this is not the case for the currently released GaudiSequencer)
  - Note: the Data-On-Demand service is ignored here on purpose; the trigger should avoid using it
- The configuration of Algorithms
  - Specified by their Properties
  - Note: ignoring tools and services for now
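
As an illustration of the recursive sequencing rule above, here is a minimal Python sketch that flattens a TopAlg list and per-sequencer Members lists into a static visiting order. The dictionary layout and the alley names are assumptions made for the example; a real sequencer may also stop early when a member rejects the event, which this sketch ignores.

```python
def execution_order(top_alg, members):
    """Return the depth-first order in which algorithms would be visited.

    top_alg: list of algorithm names, standing in for the TopAlg property
    members: dict mapping a sequencer name to its Members list (hypothetical
             stand-in; algorithms without an entry have no subalgorithms)
    """
    order = []

    def visit(name):
        order.append(name)
        for child in members.get(name, []):
            visit(child)

    for name in top_alg:
        visit(name)
    return order
```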
4 Idea (inspired by the Gaudi HistorySvc)
- Given an (instance of an) Algorithm, provide an object which captures its configuration
  - Name, type, version, properties
- The HistorySvc uses the AlgorithmHistory class
  - It contains not just the configuration, but also references to other instances, to track changes of configuration
  - Not needed here, and it mixes two functionalities: capturing state, and tracing it as well; it would have been better to split the two
- Use a similar but simpler class (i.e. could improve integration here) which specifies a single configuration only
  - Algorithm type, name, version, and list of properties, represented by strings
  - Note: properties are quite nice, as one can easily convert all of them to and from a text representation!
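
A minimal Python sketch of such a "single configuration only" object, with a text round trip, might look as follows. The line-based text layout is an assumption made for this example and is not the ad-hoc format used by the implementation; it also assumes property values contain no newlines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlgorithmConfig:
    name: str
    type: str
    version: str
    properties: tuple  # of (key, value) pairs, all values already strings

    def to_text(self):
        # Header lines first, then one "key=value" line per property.
        lines = [
            f"Name: {self.name}",
            f"Type: {self.type}",
            f"Version: {self.version}",
            "Properties:",
        ]
        lines += [f"{key}={value}" for key, value in self.properties]
        return "\n".join(lines) + "\n"

    @classmethod
    def from_text(cls, text):
        lines = text.splitlines()
        header = dict(line.split(": ", 1) for line in lines[:3])
        # Split on the first "=" only, so values may themselves contain "=".
        props = tuple(tuple(line.split("=", 1)) for line in lines[4:] if line)
        return cls(header["Name"], header["Type"], header["Version"], props)
```

Since every property is held as a string, writing the object out and reading it back gives an equal object, which is what makes a text file a usable storage format.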
5 Example AlgorithmConfiguration
- Can create it given an Algorithm
  - const Algorithm& myAlgo
  - AlgorithmConfig cfg(myAlgo)
- Can stream it to and from an std::ostream
Name HadSingleTFRZVelo
Type HltTrkFilter
Version unknown
Properties
'OutputLevel'2
'Enable'True
'ErrorMax'1
'ErrorCount'0
'AuditAlgorithms'False
'AuditInitialize'False
'AuditReinitialize'False
'AuditExecute'False
'AuditFinalize'False
'AuditBeginRun'False
'AuditEndRun'False
'MonitorService'MonitorSvc
'ErrorsPrint'True
'PropertiesPrint'False
'StatPrint'True
'TypePrint'True
'Context'
'RootInTES'
'RootOnTES'
'GlobalTimeOffset'0
'StatTableHeader' Counter sum mean/eff rms/e..
'RegularRowFormat' -15.15s17t10d 11.7g .
'EfficiencyRowFormat'-15.15s17t10d 11.5g .
'UseEfficiencyRowFormat'True
'ContextService'AlgContextSvc
'RegisterForContextService'True
'HistoProduce'True
'HistoPrint'False
'HistoCheckForNaN'True
'HistoSplitDir'False
'HistoOffSet'0
'HistoTopDir'
'HistoDir'HadSingleTFRZVelo
'FullDetail'False
'MonitorHistograms'True
'FormatFor1DHistoTable' 2-30.30s 37d 811.5g 10-11.5g1211.5g 1411.5g
'ShortFormatFor1DHistoTable' 1-15.15s 2
'HeaderFor1DHistoTable' Title Mean RMS Skewness Kurtosis
'PassPeriod'0
'HistogramUpdatePeriod'1
'ConditionsName'
'HistoDescriptor' 'rIP,400,-1.,3.' , 'rIPBest,400,-1.,3.' , 'Calo2DChi2,100.,0.,20.' , 'Calo2DChi2Best,100.,0.,20.'
'DataSummaryLocation'Hlt/Summary
'SelectionName'HadSingleTFRZVelo
'IsTrigger'False
'PatInputTracksName'
'PatInputTracks2Name'
'PatInputVerticesName'
'InputTracksName'RZVelo
'InputTracks2Name'L0TriggerHadron
'InputVerticesName'
'PrimaryVerticesName'PV2D
'OutputTracksName'HadSingleTFRZVelo
'OutputVerticesName'
'MinCandidates'1
'Filters' 'rIP < 50' , 'Calo2DChi2 < 4'
'Lines' 'rIP bindAbsMin(TrVRIP,_HltVertices)' , "HltRZVeloTCaloMatch gaudi.toolsvc().create('HltRZVeloTCaloMatch' , interface gbl.ITrackMatch)" , 'Calo2DChi2 bindAbsMin(TTrMATCH(HltRZVeloTCaloMatch),_HltTracks)'
No subalgorithms.
6 Overall Flow
- A complete configuration is just a set of AlgorithmConfigurations
  - Ignoring tools and services for now
  - More on creating/managing this set later
- To tell Gaudi about them is almost trivial
  - Wrote a service (without public interface; nobody talks to it)
- On initialize
  - Register it with the IncidentSvc for the beginEvent incident
  - Cache a set of (sets of algorithm configurations) corresponding to a specified few TCKs (to avoid I/O later on caused by reading configuration information)
  - Update the JobOptionsSvc with the set of properties for a specified TCK
  - At this point, nothing more is needed until a reconfigure is required
  - Note: at this point, no calls to Algorithm::initialize have been made yet; services are initialized first
  - Next, algorithms get initialized and pull their properties from the JOS; things proceed as usual
- When receiving a beginEvent incident
  - Check if the TCK used to configure is still valid (right now, just increase the TCK by one every 100 calls)
  - Note: I am assuming I will be able to access the ODIN bank for the upcoming event at this stage, from a service
  - If yes, just return
  - If no, update the JobOptionsSvc with the set of properties corresponding to the new TCK
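
The flow on this slide can be mimicked outside Gaudi with a small Python sketch. Everything here is a stand-in: `TckConfigService`, the `load_config` reader, and the plain dictionary playing the role of the job options store are hypothetical, and how algorithms pick up the changed properties is left out.

```python
class TckConfigService:
    """Toy model of the configuration service; not the Gaudi implementation."""

    def __init__(self, load_config, tcks_to_cache, initial_tck, job_options):
        # load_config: callable TCK -> {algorithm name: {property: value}}
        # (hypothetical reader). All I/O happens here, once, at initialize.
        self.cache = {tck: load_config(tck) for tck in tcks_to_cache}
        self.job_options = job_options  # dict standing in for the job options store
        self.current_tck = None
        self._apply(initial_tck)

    def _apply(self, tck):
        # A TCK that was not cached raises KeyError rather than triggering I/O.
        for algorithm, properties in self.cache[tck].items():
            self.job_options.setdefault(algorithm, {}).update(properties)
        self.current_tck = tck

    def handle_begin_event(self, event_tck):
        """Return True if the job options had to be updated for this event."""
        if event_tck == self.current_tck:
            return False
        self._apply(event_tck)
        return True
```

Caching the few expected TCKs up front keeps the per-event path to a single comparison in the common case where nothing changed.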
7 A few words on sysReinitialize
- Most time-consuming part to implement, mainly because people have made shortcuts in their code, assuming this does not happen
  - Just have to make sure those algorithms actually running in the HLT do the right thing
- A lot of (most?) algorithms don't implement reinitialize (properly)
  - If you only use properties directly (i.e. do not derive something from them), you're OK
- GaudiSequencer and HltSequencer don't propagate sysReinitialize to their subalgorithms
  - Trivial fix: create subalgorithms by calling Algorithm::createSubAlgorithm instead of talking explicitly to the AppMgr (probably the right thing to do regardless)
  - Yes, HltSequencer should be discontinued asap
- For now: cheat, and only fix the few HLT algorithms that I really need right now, and only support some changes (i.e. different cuts) and explicitly disallow others (i.e. changes in flow), as that would require more invasive code changes
  - Register callbacks for some properties that check the above rules in one type of HLT algorithm, and (artificially) limit differences between TCKs to the supported ones
- Supporting sysReinitialize properly in all algorithms (and tools) which the HLT uses seems needed by the requirements, regardless of any implementation
  - It's work, not insurmountable; it just takes time to get it right
  - Can be verified by first configuring a dummy TCK at initialize, then immediately switching to the correct one before starting the first event, and comparing the results to configuring the right TCK already at initialize
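
To make the "derived from properties" pitfall and the dummy-TCK verification concrete, here is a small Python sketch. The two filter classes and the `max_ip` property are invented for the example; they only imitate the pattern, not any actual HLT algorithm.

```python
class CachingFilter:
    """Derives a value from a property at initialize and never refreshes it."""

    def __init__(self, max_ip):
        self.max_ip = max_ip  # the "property"
        self.initialize()

    def initialize(self):
        self._max_ip_squared = self.max_ip ** 2  # derived from the property

    def reinitialize(self):
        pass  # the shortcut: the derived value is left stale

    def accept(self, x, y):
        return x * x + y * y < self._max_ip_squared


class CorrectFilter(CachingFilter):
    def reinitialize(self):
        self.initialize()  # recompute everything derived from properties


def switched_matches_direct(filter_type, dummy_cut, real_cut, points):
    """Configure a dummy cut, switch to the real one, compare with a direct setup."""
    direct = filter_type(real_cut)
    switched = filter_type(dummy_cut)
    switched.max_ip = real_cut  # property update, as on a TCK change
    switched.reinitialize()
    return all(direct.accept(*p) == switched.accept(*p) for p in points)
```

The comparison only reveals a stale value if the test points include some that the dummy and real cuts treat differently, so the dummy configuration should differ visibly from the real one.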
8 Current ad-hoc on-disk representation of configurations
- Disclaimer: not even intended as a final solution
  - Needed something that made it easy to implement the idea and to explore the possibilities
  - Already have some ideas for improvements in structure
  - Will use whatever is acceptable to the online system, provided it matches the requirements of analysis use as well
9 Current ad-hoc on-disk representation of configurations
- Remember, a TCK (a complete configuration) is just a collection of individual algorithm (and tool, svc) configurations
  - Create one file for each algorithm (tool, svc) configuration
  - A TCK is then just a directory (aka collection of files)
  - Basically, a directory, labeled with the value of the TCK, with as contents a set of Algorithm Configurations
- But since different TCKs might share subsets of common configurations, implement this as a lookup table
  - Each TCK is a lookup table from Algorithm Name to the file containing its configuration
  - Can be implemented as a directory full of links
10 Current ad-hoc on-disk representation of configurations
- So we have one file per unique algorithm configuration
  - Expect many (most?) algorithms not to change when updating to a new TCK; avoid duplication of information
- Separate generating unique (algorithm) configurations from composing complete TCK configurations
  - There is a different level of paranoia between these steps: just adding a new configuration for some algorithm is not quite as drastic as adding a reference to an algorithm configuration from some TCK
  - Having pre-cooked algorithm configurations available allows easier re-use of them when composing a TCK from building blocks
- As a result, would like to refer to these individual files from a TCK; thus need a way of creating a reference (think POOL ref, but with slightly different requirements)
11 How to recognize/refer/find an AlgorithmConfiguration
- Do as POOL does, but slightly different
- Must be able to do this at any time, anywhere, but given the same configuration, should give the same unique reference
  - Compute a cryptographic digest of the human-readable output
  - Basically, run md5sum (or sha1sum, or ...) on the stream of characters
  - Choice depends on your level of paranoia, and the amount of space you're willing to spend on a reference
- The reference to the configuration previously shown, using md5, happens to be
  - 11d24cb1250f2ca664a15cff3f09c164
- If at another time the same configuration for the same algorithm is made (not unlikely), this can be automatically recognized
  - Name and type are part of the content, and thus of the MD5 checksum
  - Also recognizes the case of a single algorithm instance appearing more than once in a job
- Can easily answer questions like "which TCKs use, for my favorite algorithm, these settings?", without inspecting the contents of the files
  - ls -l tck/*/ | grep 11d24cb1250f2ca664a15cff3f09c164
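
The digest-as-reference idea can be tried out in memory with Python's standard `hashlib`. The `ConfigStore` class and its dictionary layout are assumptions for this sketch; the implementation described in the talk uses files and links instead.

```python
import hashlib


def config_ref(text):
    """Reference to a configuration: the MD5 digest of its text representation."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


class ConfigStore:
    """Hypothetical in-memory stand-in for the on-disk layout."""

    def __init__(self):
        self.by_ref = {}  # digest -> configuration text, one entry per unique text
        self.tcks = {}    # TCK -> {algorithm name: digest}, the lookup table

    def add(self, text):
        ref = config_ref(text)
        self.by_ref.setdefault(ref, text)  # identical text is stored only once
        return ref

    def define_tck(self, tck, texts_by_algorithm):
        self.tcks[tck] = {
            name: self.add(text) for name, text in texts_by_algorithm.items()
        }

    def tcks_using(self, ref):
        # Same question as grepping the TCK directories for a digest.
        return sorted(
            tck for tck, table in self.tcks.items() if ref in table.values()
        )
```

Two TCKs that configure an algorithm identically end up pointing at the same digest, so the lookup needs no comparison of file contents.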
12 Idea (not yet tried)
- In order to make managing TCKs even easier, would like to turn them into hash trees
  - i.e. explicitly put the MD5 sums of the dependents of a given configuration in its configuration file
- Basic idea: if I want to know whether the configuration of a sequencer changed, I need to check (recursively!) whether the configurations of its dependents have changed, so why not put an explicit reference to the MD5 sums of the configurations of the (direct) descendants into the configuration
  - No impact on the interface to the Gaudi side: it just gets a set of (algorithm) configurations, and doesn't care how this set was obtained
- Management is now easier: only need to know the MD5 sum of the top algorithm(s), and can then uniformly navigate and collect the configurations of all dependents
  - No need to interpret the Members property as being special
  - Answering questions such as "which TCKs used the same hadron alley configuration?" is now even easier: just get the (single!) MD5 sum of the top of the hadron alley, and verify which TCKs (eventually) refer to it
  - Implicitly checks all dependents
- Can improve "divide and conquer": ask each alley for a (small) set of configurations, and the answer will be a (small) set of MD5 values (and the corresponding configuration files!). Now just need to compose them appropriately
  - Changes at a lower level can get automatically propagated upwards
- A TCK now just refers to (a vector of) MD5 values of the TopAlg(s) instead of being a directory
- Variation on this theme: ditto, but keep the tree structure separate from the actual configuration data, i.e. don't put the MD5 of dependents in the configuration file directly, just keep a parallel tree structure which refers (by reference, i.e. MD5 sum) to the configurations
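
Since this idea was not yet tried, the following Python sketch is only one possible reading of it: each node's digest covers its own configuration text plus the digests of its direct dependents. The `Dependent:` line format and the algorithm names are assumptions, and the dependency graph is assumed to contain no cycles.

```python
import hashlib


def tree_refs(top, own_text, members):
    """Compute a hash-tree digest for `top` and everything below it.

    own_text: algorithm name -> its own configuration text
    members:  algorithm name -> list of direct dependents (may be absent)
    Returns a dict: algorithm name -> MD5 digest including its dependents.
    """
    refs = {}

    def visit(name):
        if name not in refs:  # an algorithm shared by two sequences is hashed once
            child_refs = [visit(child) for child in members.get(name, [])]
            payload = own_text[name] + "".join(
                f"Dependent: {ref}\n" for ref in child_refs
            )
            refs[name] = hashlib.md5(payload.encode("utf-8")).hexdigest()
        return refs[name]

    visit(top)
    return refs
```

With this layout a changed cut deep inside one alley changes the digest of that alley and of the top algorithm, while the digests of untouched alleys stay the same.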
13 Creating and managing sets of configurations
- In order to generate the individual configurations, the TCK, and their mapping, wrote an Algorithm
  - When its execute is called for the first time, it queries the AppMgr for a specified algorithm (e.g. Hlt) and dumps its configuration, and then, if it has a property called Members, recursively dumps their configurations
  - Skips the dump if the configuration is already present in the file system (algorithms can be in more than one sequence)
  - It also creates the directory corresponding to a specified TCK, and makes links from the dumps in the AlgorithmConfig directory to the TCK directory
  - Now that I've fixed GaudiSequencer (and its HLT twin HltSequencer) to use createSubAlgorithm, can drop the special treatment of the Members property, and just call subAlgorithms to obtain a vector of dependencies
- Cloning (and modifying) an existing TCK is (in this implementation) a matter of copying an existing TCK directory to a new one
  - Can copy certain configurations to new ones (note: never modify anything in the AlgorithmConfig directory!), modify them, compute their new MD5 sum, put them in the AlgorithmConfig directory, and link them into the new TCK directory
  - Note: can always check the integrity of the AlgorithmConfig directory by manually computing the MD5 sums of the files; those should match their filenames
- Once settled on the real implementation, can make some utilities to automate this type of operation
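
The kind of utilities mentioned in the last bullet could look roughly like the Python sketch below. The function names and directory arguments are hypothetical, the sketch relies on symbolic links being available (as the talk notes, this may not work on Windows as-is), and it is not the tooling of the actual implementation.

```python
import hashlib
import os
import shutil


def dump_config(config_dir, text):
    """Store a configuration under its MD5 digest; skip if already present."""
    os.makedirs(config_dir, exist_ok=True)
    ref = hashlib.md5(text.encode("utf-8")).hexdigest()
    path = os.path.join(config_dir, ref)
    if not os.path.exists(path):
        with open(path, "w", encoding="utf-8", newline="") as handle:
            handle.write(text)
    return ref


def link_into_tck(config_dir, tck_dir, algorithm, ref):
    """Make tck_dir/<algorithm> a link to the stored configuration."""
    os.makedirs(tck_dir, exist_ok=True)
    link = os.path.join(tck_dir, algorithm)
    if os.path.lexists(link):
        os.remove(link)  # replaces only the link, never the stored file
    os.symlink(os.path.abspath(os.path.join(config_dir, ref)), link)


def clone_tck(old_tck_dir, new_tck_dir):
    """Copy a TCK directory, keeping its entries as links."""
    shutil.copytree(old_tck_dir, new_tck_dir, symlinks=True)


def corrupted_files(config_dir):
    """Return the stored files whose name no longer matches their MD5 digest."""
    bad = []
    for filename in sorted(os.listdir(config_dir)):
        with open(os.path.join(config_dir, filename), "rb") as handle:
            if hashlib.md5(handle.read()).hexdigest() != filename:
                bad.append(filename)
    return bad
```

Modifying a cloned TCK then means dumping the modified text (which yields a new digest) and re-linking that one algorithm; the stored files themselves are never edited.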
14 Known Issues / Discussion
- Right now, all properties are part of a configuration
  - Easy to do, but maybe not what is wanted
  - Not all properties actually affect the behavior of algorithms
  - Adding or (re)moving properties (even if they don't matter, even in a base class) will result in a new configuration
- Completely ignored the configuration of Tools and Services for now
  - But again, it seems just a matter of recording their properties, updating them, and calling reinitialize
- Configurations are now created by interrogating the AppMgr during execution, but I suspect that one can do the same given the Configurables for a job
  - But it requires a migration to Configurables first
  - Cannot go through the JobOptionsSvc, as it does not know the default values
  - Which is a problem when switching from a non-default setting back to the default, as one needs to specify the new setting to the JOS, so defaults need to be part of the configuration
- Size doesn't seem to be a problem
  - A typical configuration file as plain text is about 1 to 2 KB
  - Assume 10 different configurations each day, 100 days per year -> 1000 configurations / year
  - Assume 100 algorithms per configuration
  - If every algorithm would change (really, really pessimistic!), this would be 100 to 200 MB / year
- Setup could be used outside of the HLT
  - Grid, ...
  - In general, I'd prefer a non-HLT specific solution that can be / is re-used by others :-)
- No thought yet on a production / online setup: would first like to know if the concept is acceptable, and find out what constraints need to be satisfied
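
The size estimate in the list above is simple arithmetic, spelled out here as a short Python sketch. The function name is made up, the default values are the assumptions quoted on the slide, and 1 MB is taken as 1000 KB.

```python
def yearly_size_mb(kb_per_file, configs_per_day=10, days_per_year=100,
                   algorithms_per_config=100):
    """Worst case: every algorithm gets a new file in every configuration."""
    files_per_year = configs_per_day * days_per_year * algorithms_per_config
    return files_per_year * kb_per_file / 1000.0


# 10 * 100 * 100 = 100000 files per year, at 1 to 2 KB each
low, high = yearly_size_mb(1), yearly_size_mb(2)
```

With the quoted inputs this gives 100 MB at 1 KB per file and 200 MB at 2 KB per file, matching the range on the slide.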