# Hierarchical Combinatoric ResNet (HCR)
This tutorial works through a complete example of training baseline FvT and SvB classifiers for the HH4b analysis, using the `datasets_HH4b_2024_v2` skim on `rogue`.
## Setup environment
- (optional) set up the Apptainer cache and temp directories
- start a container and enter the `/coffea4bees/` directory
- (optional) set the base path for the workflow files:

  ```bash
  export WFS="classifier/config/workflows/HH4b_2024_v2"
  ```
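Putting the steps above together, a session might look like the following sketch. This is illustrative only: the cache/temp paths and the image location are placeholders, not the documented setup.

```bash
# Hedged sketch: paths and image name are placeholders; see the Overview
# page for the actual container setup on rogue.
export APPTAINER_CACHEDIR=/scratch/$USER/apptainer/cache  # optional cache dir
export APPTAINER_TMPDIR=/scratch/$USER/apptainer/tmp      # optional temp dir
apptainer shell /path/to/coffea4bees-image.sif            # image path is site-specific
cd /coffea4bees
export WFS="classifier/config/workflows/HH4b_2024_v2"     # optional base path
```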
See Overview for details.
## FvT Training and Evaluation
- set the following variables in the `${WFS}/FvT/run.sh` script:
  - `MODEL`: the base path to store the FvT model
  - `FvT`: the base path to store the FvT friend trees
  - `PLOT`: the base path to store benchmark plots
  - all other variables are optional
- run the following command:

  ```bash
  source ${WFS}/FvT/run.sh
  ```
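For concreteness, the variable block in `${WFS}/FvT/run.sh` might be filled in like this. The paths are placeholders, and the plain-assignment style is an assumption; follow whatever form the script itself uses.

```bash
# Placeholder paths; point these at writable locations on your system.
MODEL="/data/${USER}/HH4b_2024_v2/FvT/models"   # base path for the trained FvT model
FvT="/data/${USER}/HH4b_2024_v2/FvT/friends"    # base path for the FvT friend trees
PLOT="/data/${USER}/HH4b_2024_v2/FvT/plots"     # base path for benchmark plots
```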
To understand the details of the whole workflow, check the comments in the following files, in order:

- `${WFS}/FvT/train.yml`
- `${WFS}/FvT/evaluate.yml`
- `${WFS}/common.yml`
- `${WFS}/FvT/run.sh`
## SvB Training and Evaluation
- set the following variables in the `${WFS}/SvB/run.sh` script:
  - `MODEL`: the base path to store the SvB models
  - `SvB`: the base path to store the SvB friend trees
  - `FvT`: the base path to the FvT friend trees (should be the same as in the FvT training)
  - `PLOT`: the base path to store benchmark plots
  - all other variables are optional
- run the following command:

  ```bash
  source ${WFS}/SvB/run.sh
  ```
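Analogously to the FvT step, the SvB variables might be set as below (paths again placeholders). The one constraint from this section is that `FvT` must point at the friend trees produced by the FvT evaluation above.

```bash
# Placeholder paths; FvT must match the path used in the FvT training step.
MODEL="/data/${USER}/HH4b_2024_v2/SvB/models"   # base path for the trained SvB models
SvB="/data/${USER}/HH4b_2024_v2/SvB/friends"    # base path for the SvB friend trees
FvT="/data/${USER}/HH4b_2024_v2/FvT/friends"    # existing FvT friend trees
PLOT="/data/${USER}/HH4b_2024_v2/SvB/plots"     # base path for benchmark plots
```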
To understand the details of the whole workflow, check the comments in the following files, in order (assuming you have already read the FvT config files):

- `${WFS}/SvB/train.yml`
- `${WFS}/SvB/evaluate.yml` (essentially the same as the FvT evaluation)
- `${WFS}/SvB/run.sh`
## Plotting
- make a local copy of the config `analysis_dask/config/userdata.cfg.yml` and fill in all required fields (see the sketch after this list for the copy step)
- make a local copy of the config `analysis_dask/config/classifier_plot_vars.cfg.yml`, change the SvB and FvT friend tree paths in `classifier_outputs<var>` according to the evaluation scripts, and modify `classifier_datasets<var>` to match the datasets you want to plot
- run the following command:
  ```bash
  python dask_run.py \
      analysis_dask/config/userdata.local.cfg.yml \
      analysis_dask/config/cluster.cfg.yml#rogue_local_huge \
      analysis_dask/config/classifier_plot_vars.local.cfg.yml#2024_v2 \
      analysis_dask/config/classifier_plot.cfg.yml#2024_v2
  ```
- the output will be available as `{output_dir}/classifier_plot_2024_v2_{timestamp}/hists/classifier_basic.coffea`
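The local copies in the first two steps can be created as below; a minimal sketch, where the `*.local.cfg.yml` names are chosen to match the `dask_run.py` invocation above.

```bash
# Create editable local copies of the two configs referenced above.
cp analysis_dask/config/userdata.cfg.yml \
   analysis_dask/config/userdata.local.cfg.yml
cp analysis_dask/config/classifier_plot_vars.cfg.yml \
   analysis_dask/config/classifier_plot_vars.local.cfg.yml
# Edit both local copies (required fields, friend tree paths, datasets)
# before running dask_run.py.
```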
See Histogram for details.
## Tips on Performance
- Training:
  - in the main task `train`, consider increasing `--max-trainers` to train multiple models in parallel (CPU, GPU, and memory bound)
  - in `-dataset HCR.*`, consider increasing `--max-workers` (I/O and CPU bound, requires extra memory)
  - in `-setting ml.DataLoader`:
    - always set `optimize_sliceable_dataset` to `True` if the dataset fits in memory. This option enables a custom data loader that uses `torch`'s C++-based parallel slicing, which is significantly faster and more memory efficient than the default `torch.utils.data.DataLoader`.
    - if `optimize_sliceable_dataset` is disabled, consider increasing `num_workers` to speed up batch generation (mainly CPU bound, requires extra memory)
    - consider increasing `batch_eval` to speed up evaluation (mainly GPU memory bound)
  - in `-setting torch.Training`, consider using `disable_benchmark` to skip all benchmark steps
- Evaluation:
  - in the main task `evaluate`, consider increasing `--max-evaluators` to evaluate multiple models in parallel (CPU, GPU, and memory bound)
  - in `-setting torch.DataLoader`, consider increasing `num_workers` and `batch_eval` (I/O and CPU bound, requires extra memory)
- Merging k-folds:
  - in `-analysis kfold.Merge`:
    - consider increasing `--workers` (I/O and CPU bound, requires extra memory)
    - consider using a finite `--step` to split ROOT files into smaller chunks
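As a purely illustrative sketch, the training-side knobs above might be combined in a single invocation along the following lines. The entry point, option grouping, and values are assumptions based only on the flag names quoted in this section; check `${WFS}/FvT/run.sh` for the actual command line.

```bash
# HYPOTHETICAL command: only the flag and setting names come from the tips
# above; the entry point, grouping, and values are illustrative guesses.
python run_classifier.py train --max-trainers 2 \
    -dataset HCR.FvT --max-workers 8 \
    -setting ml.DataLoader optimize_sliceable_dataset=True batch_eval=65536 \
    -setting torch.Training disable_benchmark=True
```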