Hi,
I’m modernising a previous analysis for some detector studies, and I like the built-in cluster submission and file bookkeeping available in the standard stage1.py-style analysis steps, as in the mH-recoil/mumu examples. Is it possible to take advantage of this in other kinds of analysis steps as well? For example, I use a BDT for some selection, so I have a step where I first pickle some files:
```python
from config import train_var_lists

def run(input_files, vars_list):
    import uproot
    print("input_files: ", input_files)
    for inf in input_files:
        print(inf)
        output_file = inf.replace("/stage1", "/stage1_pickles").replace(".root", ".pkl")
        # uproot >= 4 equivalent:
        # df = uproot.concatenate(inf + ":events", library="pd", how="zip", filter_name=vars_list)
        df = uproot.open(inf).get("events").pandas.df(vars_list)  # uproot3 API
        df.to_pickle(output_file)

def main():
    import argparse
    parser = argparse.ArgumentParser(description="Applies preselection cuts",
                                     formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument('--input', nargs="+", required=True, help='Select the input file(s).')
    #parser.add_argument('--output', type=str, required=True, help='Select the output file.')
    parser.add_argument('--vars', type=str, required=True,
                        help='Select the variables to keep, e.g. "train_vars_vtx", "train_vars_stage2"')
    parser.add_argument('--decay', type=str, required=True,
                        help='Select the decay key used to look up the variable list.')
    args = parser.parse_args()
    assert args.vars in train_var_lists
    run(args.input, vars_list=train_var_lists[args.vars][args.decay])

if __name__ == '__main__':
    main()
```
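As an aside, the output-path mapping and the pickle round-trip in that step can be sketched standalone, with a toy DataFrame standing in for the branches read from the `events` tree (`pickle_path` is a hypothetical helper name, not part of the framework):

```python
import os
import tempfile

import pandas as pd

def pickle_path(input_path):
    # Map a stage1 ROOT file path to its pickle counterpart,
    # mirroring the replace() calls in run() above.
    return input_path.replace("/stage1", "/stage1_pickles").replace(".root", ".pkl")

# Toy DataFrame standing in for the variables read from the TTree.
df = pd.DataFrame({"Z_mass": [91.2, 90.8], "n_muons": [2, 2]})

with tempfile.TemporaryDirectory() as tmp:
    out = os.path.join(tmp, "sample.pkl")
    df.to_pickle(out)
    restored = pd.read_pickle(out)
    # The pickle round-trip preserves the DataFrame exactly.
    assert restored.equals(df)

print(pickle_path("/eos/user/stage1/sample.root"))
# -> /eos/user/stage1_pickles/sample.pkl
```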
I guess I could edit this to be similar to mH-recoil/mumu/analysis_final.py, but it’s not obvious to me how to control which file gets opened, etc. Is this possible, or would I have to rely on my own bookkeeping and cluster submission?