Classifier¶
Folder Structure¶
- classifier/: main package for classifier
  - Machine Learning
    - ml/: high-level ML workflows and utilities
    - nn/: neural network models
  - Task System
    - task/: task protocols and command-line interface
    - config/: task configurations
    - test/: task configurations for testing
  - Monitor System
    - monitor/: monitor core and components
  - Others
    - data/: model data archives
    - algorithm/: algorithms implemented with torch.Tensor
    - compatibility/: 4b analysis related modules
    - root/: ROOT I/O utilities
    - df/: pd.DataFrame utilities
    - process/: multiprocessing utilities
    - patch/: unreleased critical bug fixes
- pyml.py: runs the classifier jobs; can be used as an executable.
Getting Started¶
Setup Environment¶
Note
The following commands assume that you are in the /coffea4bees/ directory.
Use Container (Recommended)¶
Warning
You may need to change the apptainer cache and temp directories before pulling any image, especially when your home directory has a limited quota. The directories are controlled by the following environment variables:
export APPTAINER_CACHEDIR=
export APPTAINER_TMPDIR=
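As a quick sanity check before pulling an image, the two variables can be verified to point at existing, writable directories. The helper below is a minimal sketch and not part of the repository; the function name `check_apptainer_dirs` is hypothetical.

```shell
# Hypothetical helper (not part of the repository): succeed only if both
# apptainer directory variables are set to existing, writable directories.
check_apptainer_dirs() {
    for name in APPTAINER_CACHEDIR APPTAINER_TMPDIR; do
        # Look up the value of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ] || [ ! -d "$value" ] || [ ! -w "$value" ]; then
            echo "$name is unset or not a writable directory" >&2
            return 1
        fi
    done
    return 0
}
```

Running `check_apptainer_dirs && echo ok` after setting the variables prints `ok` only when both directories are usable.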
The docker image is available as:
- docker://chuyuanliu/heptools:ml
- /cvmfs/unpacked.cern.ch/registry.hub.docker.com/chuyuanliu/heptools:ml (only when CVMFS is available)
The image is built from the following configurations:
- base.Dockerfile: base image
- ml.Dockerfile: ml image derived from base image
- base.yml: used by base.Dockerfile
- base-linux.yml: used by base.Dockerfile
- ml.yml: used by ml.Dockerfile
Run the following command to start an interactive shell:
apptainer exec \
-B .:/srv \
--nv \
--pwd /srv \
docker://chuyuanliu/heptools:ml \
bash --init-file /entrypoint.sh
where:
- -B .:/srv mount the current directory to /srv
- --nv enable GPU
- --pwd /srv equivalent to cd /srv when starting the container
- bash --init-file /entrypoint.sh (important) start a bash shell and run the initialization script.
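Since the image is available both on Docker Hub and, where CVMFS is mounted, as an unpacked directory, the choice between the two can be scripted. The following is a minimal sketch, not part of the repository: the function name `pick_image` is hypothetical, and it assumes that the unpacked CVMFS directory can be passed to apptainer in place of the docker:// URI.

```shell
# Image locations taken from the list of available images.
CVMFS_IMAGE=/cvmfs/unpacked.cern.ch/registry.hub.docker.com/chuyuanliu/heptools:ml
DOCKER_IMAGE=docker://chuyuanliu/heptools:ml

# Hypothetical helper: print the unpacked image directory if it exists,
# otherwise fall back to the docker:// URI.
# $1 (optional): directory to check, defaults to the CVMFS path.
pick_image() {
    candidate="${1:-$CVMFS_IMAGE}"
    if [ -d "$candidate" ]; then
        printf '%s\n' "$candidate"
    else
        printf '%s\n' "$DOCKER_IMAGE"
    fi
}

# Example usage (shown as a comment, not executed here):
# apptainer exec -B .:/srv --nv --pwd /srv "$(pick_image)" bash --init-file /entrypoint.sh
```

Using the unpacked directory when it is present avoids pulling and converting the image locally, which also reduces pressure on the cache directory.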
Use Conda¶
The conda environment can be created from the base.yml, base-linux.yml and ml.yml files listed above.
classifier/env.yml is deprecated and not actively maintained.
rogue01/rogue02 specific¶
- change the cache and temp directory for apptainer:
  mkdir -p /mnt/scratch/${USER}/.apptainer
- add the following to ~/.bashrc:
  export APPTAINER_TMPDIR=/mnt/scratch/${USER}/.apptainer/
  export APPTAINER_CACHEDIR=/mnt/scratch/${USER}/.apptainer/
Command-line Interface¶
See the Task System for details.
Setup Auto-completion¶
To register the auto-completion for the current shell session, run the following command:
source classifier/install.sh
To unregister the auto-completion, run:
source classifier/uninstall.sh
Auto-completion is triggered when the command starts with ./pyml.py and the <tab> key is pressed. It dynamically searches for available tasks in the classifier/config directory and suggests task names or arguments.

Help¶
Use the following command to print help for all tasks:
./pyml.py help --all
Training and Evaluation¶
See HCR Training for a complete example of training and evaluating an HCR model for SvB and FvT.
Monitor¶
A monitor is provided to collect logs, progress, resource metrics and other information from worker processes/nodes. See the Monitor System for details.
Histogram¶
Histogramming is handled by dask processors for better performance and compatibility. See Histogram for details.