preSel.py + yaml etc

rlemmon · January 20, 2022, 6:22pm

Hi,

I am interested in running an analysis over all the centrally produced files for the Z(mumu)H channel as opposed to just a single file. For example, I have already done an analysis on the file /eos/experiment/fcc/ee/generation/DelphesEvents/spring2021/IDEA/wzp6_ee_mumuH_ecm240/events_012879310.root

Therefore I need to use preSel.py to run over the 10 files in this directory (and the 100 background files …).

However I am not sure how preSel.py works, how it finds the files, etc. For example, there is a line:

basedir=os.path.join(os.getenv('FCCDICTSDIR', deffccdicts), '') + "yaml/FCCee/spring2021/IDEA/"

Where is the yaml file, and can I read it? Also, does preSel.py run the analysis.py file that I have set up with the variables I need from the edm4hep.root file?

Thanks very much.

Cheers
Roy

eperez · January 21, 2022, 9:58am

Hi Roy,

However I am not sure how preSel.py works, how it finds the files, etc. For example, there is a line:
basedir=os.path.join(os.getenv('FCCDICTSDIR', deffccdicts), '') + "yaml/FCCee/spring2021/IDEA/"
Where is the yaml file and can I read it?

At the beginning of preSel.py, you see this line:

from config.common_defaults import deffccdicts

Indeed, your FCCAnalyses directory contains a “config” subdirectory, in which there is a file common_defaults.py, which contains the definition of the deffccdicts variable:
deffccdicts = "/afs/cern.ch/work/h/helsens/public/FCCDicts/"

So, the yaml file is in Clement’s directory. Users should send an email to Clement, with their lxplus userid, such that he gives them read access to this directory (I think you already did so).

In preSel.py, you specify in process_list the list of datasets that you want to process. Note that the line that you pointed to:
basedir=os.path.join(os.getenv('FCCDICTSDIR', deffccdicts), '') + "yaml/FCCee/spring2021/IDEA/"
also tells you that you will be looking at the datasets from the spring2021 campaign.
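
To see what that basedir line evaluates to on your own account, a minimal sketch using only the standard library is given here. The default path is the one quoted from common_defaults.py; the helper names resolve_basedir and list_yaml_entries are made up for this illustration and are not part of FCCAnalyses.

```python
import os

# Default location, as quoted from config/common_defaults.py in this thread.
deffccdicts = "/afs/cern.ch/work/h/helsens/public/FCCDicts/"


def resolve_basedir(campaign_subdir="yaml/FCCee/spring2021/IDEA/"):
    """Rebuild the basedir string the same way the preSel.py line does.

    Hypothetical helper for illustration only.
    """
    # FCCDICTSDIR, if set in the environment, overrides the default.
    # Joining with '' guarantees the root ends with exactly one separator.
    root = os.path.join(os.getenv("FCCDICTSDIR", deffccdicts), "")
    return root + campaign_subdir


def list_yaml_entries(basedir):
    """Return the sorted entries under basedir, or None if it is not readable.

    Hypothetical helper; None typically means the directory is missing or
    read access has not been granted yet.
    """
    if not os.path.isdir(basedir) or not os.access(basedir, os.R_OK):
        return None
    return sorted(os.listdir(basedir))


if __name__ == "__main__":
    basedir = resolve_basedir()
    print("basedir:", basedir)
    entries = list_yaml_entries(basedir)
    if entries is None:
        print("cannot read this directory (missing, or no read access yet)")
    else:
        print(len(entries), "entries found")
```

If the second message appears on lxplus, the most likely cause is that read access to the directory has not been granted yet.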

Also, does preSel.py run the analysis.py file that I have set up with the variables I need from the edm4hep.root file?

Exactly. preSel.py will run your analysis.py over the files of the selected datasets. This comes from
import config.runDataFrame as rdf
myana=rdf.runDataFrame(basedir,process_list)
and from the import:
import analysis as ana
in config/runDataFrame.py.
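
The wiring can be pictured with a toy model, sketched here under loud assumptions: ToyRunner, toy_file_lookup and toy_analysis are invented stand-ins, not the real classes or functions in config/runDataFrame.py, and the file names are fake. The only point is the pattern: a steering script supplies basedir and process_list, and the runner applies the user's analysis to every file of every selected dataset.

```python
def toy_file_lookup(basedir, process):
    """Hypothetical stand-in for reading a dataset's file list from basedir.

    Pretends that every dataset consists of two files; the names are fake.
    """
    return [f"{process}/events_{i}.root" for i in range(2)]


def toy_analysis(path):
    """Hypothetical stand-in for what the user's analysis.py does to one file."""
    return f"processed {path}"


class ToyRunner:
    """Toy model of a runner object; NOT the FCCAnalyses implementation."""

    def __init__(self, basedir, process_list, file_lookup, analysis):
        self.basedir = basedir
        self.process_list = process_list
        self.file_lookup = file_lookup
        # In the thread, the analysis arrives via "import analysis as ana"
        # inside config/runDataFrame.py; here it is passed in explicitly so
        # that the sketch stays self-contained.
        self.analysis = analysis

    def run(self):
        """Apply the analysis to each file of each dataset in process_list."""
        results = {}
        for process in self.process_list:
            files = self.file_lookup(self.basedir, process)
            results[process] = [self.analysis(f) for f in files]
        return results


# Steering part, playing the role of preSel.py.
basedir = "yaml/FCCee/spring2021/IDEA/"
process_list = ["wzp6_ee_mumuH_ecm240"]
runner = ToyRunner(basedir, process_list, toy_file_lookup, toy_analysis)
results = runner.run()
```

Adding a background sample in this picture just means appending its dataset name to process_list; the analysis itself is untouched.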

Cheers,
E.

rlemmon · January 21, 2022, 11:54am

Hi Emmanuel,

Thank you very much for the explanation of how preSel.py works. I found the “config” subdirectory with common_defaults.py in it. In the same directory I also found runDataFrame which, as you said, contains the line “import analysis as ana” to run my analysis.py.

I am starting to understand now. I do:

source /cvmfs/fcc.cern.ch/sw/latest/setup.sh
(or in fact source /cvmfs/sw-nightlies.hsf.org/key4hep/setup.sh from a previous post on here)

in my .bashrc file, and it presumably sets up the FCCAnalyses directories that I can see on GitHub?

So I have now run preSel.py, finalSel.py and plots.py for this channel using fraction=0.1, and the output in the plots looks correct.

Cheers
Roy.