Hi everyone,
I wanted to share a quick observation in case it helps anyone doing intermediate ML steps between analysis stages.
I was using uproot between stages 1 and 2 to add BDT scores to my trees, and I realized that the file uproot writes no longer contains the eventsProcessed metadata from the stage 1 output ROOT file.
When stage 2 runs and can’t find this tag, it seems to fall back to using the number of events in the filtered tree. If you applied any pre-selection cuts in stage 1, that count is smaller than the number of events originally processed, which quietly inflates the final cross-section scaling without throwing a warning.
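To give a feel for the size of the effect, here is a small illustration. It assumes the per-event scale factor is cross-section times luminosity divided by the event count, which is the usual normalization but not something checked against the stage 2 code here. All numbers and names are made up for the example.

```python
def scale_factor(xsec_pb, lumi_pb_inv, n_events):
    """Per-event weight, assuming the normalization is xsec * lumi / N."""
    return xsec_pb * lumi_pb_inv / n_events

# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
xsec_pb = 1.0            # cross-section in pb
lumi_pb_inv = 1000.0     # integrated luminosity in pb^-1
n_processed = 100_000    # events processed before any cuts (what eventsProcessed holds)
n_after_cuts = 20_000    # entries left in the tree after the pre-selection

correct = scale_factor(xsec_pb, lumi_pb_inv, n_processed)
fallback = scale_factor(xsec_pb, lumi_pb_inv, n_after_cuts)

# The ratio equals n_processed / n_after_cuts, i.e. one over the cut efficiency.
inflation = fallback / correct
print(f"correct={correct:.4g}  fallback={fallback:.4g}  inflation={inflation:.3g}x")
```

With a 20% pre-selection efficiency the scale factor comes out 5 times too large, so the tighter the stage 1 cuts, the worse the inflation.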
I got around this by just manually copying the TParameter over using PyROOT when recreating my BDT output ROOT file, but figured I’d flag this so others don’t accidentally get inflated results!
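For anyone who wants a starting point, a minimal sketch of such a copy with PyROOT might look like the following. It assumes the metadata is stored as a top-level TParameter under the key "eventsProcessed" and that the BDT output file has already been written and closed. The function name and arguments are hypothetical, so adapt them to your setup.

```python
def copy_events_processed(src_path, dst_path, key="eventsProcessed"):
    """Copy one top-level TParameter (assumed key name) from src_path into dst_path."""
    import ROOT  # PyROOT; imported here so the module loads even without ROOT

    src = ROOT.TFile.Open(src_path, "READ")
    if not src or src.IsZombie():
        raise OSError(f"cannot open {src_path}")

    param = src.Get(key)
    if not param:  # PyROOT null pointers evaluate as False
        src.Close()
        raise KeyError(f"{key!r} not found in {src_path}")
    value = param.GetVal()

    # UPDATE keeps the trees already in the file and only adds the new key.
    dst = ROOT.TFile.Open(dst_path, "UPDATE")
    if not dst or dst.IsZombie():
        src.Close()
        raise OSError(f"cannot open {dst_path}")

    dst.cd()            # make dst the current directory for Write()
    param.Write(key)
    dst.Close()
    src.Close()
    return value
```

You would call it once after writing the BDT trees, for example copy_events_processed("stage1_output.root", "bdt_output.root"), where both file names are placeholders. It is worth checking afterwards that the returned value matches the number you expect from stage 1.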
Cheers,
Shreyas