Configuration System¶
The project uses Hydra for managing configurations. This provides a flexible, hierarchical configuration system that allows for easy experiment management and reproducibility.
Structure¶
The configuration is organized into several main components. Each component covers one part of the deep learning pipeline and serves as a base that can be selectively overridden by subsequent configurations, which keeps the system flexible and adaptable.
configs/
├── data/ # Dataset configurations
├── model/ # Model architectures and parameters
├── trainer/ # Training settings and hyperparameters
├── callbacks/ # Training callbacks (early stopping, checkpointing)
├── logger/ # Logging configurations (W&B, TensorBoard)
├── paths/ # Path configurations
├── experiment/ # Complete experiment configurations
├── modality/ # Modality-specific settings
├── debug/ # Debug configurations
├── extras/ # Additional configuration components
├── hparams_search/ # Hyperparameter optimization settings
└── hydra/ # Hydra-specific configurations
The following graph shows the relationships between the configuration components.
graph TD
E[Experiment Config] --> M[Model Config]
E --> D[Data Config]
E --> T[Trainer Config]
E --> L[Logger Config]
E --> C[Callbacks Config]
D --> MD[Modality Config]
D --> P[Paths Config]
M --> MD
subgraph "Experiment Types"
PT[Pre-training] --> E
FT[Fine-tuning] --> E
EM[Artifacts] --> E
end
subgraph "Runtime Options"
HP[Hyperparameter Search] -.-> E
DB[Debug Config] -.-> E
end
style E fill:#f9f,stroke:#333
style PT fill:#bbf,stroke:#333
style FT fill:#bbf,stroke:#333
style EM fill:#bbf,stroke:#333
style HP fill:#ddd,stroke:#333
style DB fill:#ddd,stroke:#333
Core Components¶
Data Configuration¶
The data configuration controls dataset loading and preprocessing:
data_root: ${paths.data_dir}
dataset_key: ${modality.dataset_key}
num_workers: 4
pin_memory: true
persistent_workers: true
use_train_subsample: false
Modality-specific configurations extend the default, like in the following example for CMR:
_target_: src.data.pytorch.datamodules.cmr_datamodule.CMRDataModule
defaults:
- default
batch_size: 32
augmentation_rate: 0.95
live_loading: false
cross_validation: false
fold_number: null
# Not done in eval dataset: https://github.com/oetu/MMCL-ECG-CMR/blob/bd3c18672de8e5fa73bb753613df94547bd6245b/mmcl/datasets/EvalImageDataset.py#L35
# Not done in imaging contrastive dataset: https://github.com/oetu/MMCL-ECG-CMR/blob/main/mmcl/datasets/ContrastiveImageDataset.py
manual_crop: null
# Manual cropping parameters (relative to img_size)
# This is not used for the finetuning as per https://github.com/oetu/MMCL-ECG-CMR/blob/bd3c18672de8e5fa73bb753613df94547bd6245b/mmcl/datasets/EvalImageDataset.py#L35
# This appears faulty to us.
# manual_crop:
# top: 0.21
# left: 0.325
# height: 0.375
# width: 0.375
img_size: 210
# From Turgut et al. (2025):
# "The CMR images are augmented using
# horizontal flips (probability=0.5),
# rotations (probability=0.5, degrees=45),
# color jitter (brightness=0.5, contrast=0.5, saturation=0.25),
# random resized cropping (size=210, scale=(0.6, 1))."
# Code ref eval:
# https://github.com/oetu/MMCL-ECG-CMR/blob/main/mmcl/datasets/EvalImageDataset.py#L11
# leads to: https://github.com/oetu/MMCL-ECG-CMR/blob/bd3c18672de8e5fa73bb753613df94547bd6245b/mmcl/utils/utils.py#L70
rotation_degrees: 45
brightness: 0.5
contrast: 0.5
saturation: 0.25
random_crop_scale: [0.6, 1.0]
Modality Configuration¶
Modality-specific configurations are closely tied to the data configuration, yet distinct from it. They define dataset specifics such as the number of classes for a supervised task. For example, the acdc.yaml configuration for the CMR modality defines the number of classes as 5.
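As an illustrative sketch, a modality config might look like the following. The field names are inferred from the interpolations used elsewhere on this page (`${modality.dataset_key}`, `${modality.num_classes}`, `${modality.task_type}`); the exact contents of the real file may differ.

```yaml
# Hypothetical sketch of configs/modality/cmr/acdc.yaml
dataset_key: acdc          # consumed via ${modality.dataset_key} in the data config
num_classes: 5             # consumed via ${modality.num_classes} in the model config
task_type: classification  # consumed via ${modality.task_type}
```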
Model Configuration¶
Model configurations define architecture and training specifics:
_target_: src.models.ecg_classifier.ECGClassifier
defaults:
- ecg_encoder # Same because both use ViT backbone
learning_rate: 3e-6 # TODO: Check what we can reference as baseline either from paper or code
weight_decay: .05 # https://github.com/oetu/mae/blob/ba56dd91a7b8db544c1cb0df3a00c5c8a90fbb65/main_finetune.py#L112
layer_decay: 0.75 # Paper does not mention final value used, but made a sweep: (0.5, 0.75)
drop_path_rate: 0.1 # https://github.com/oetu/mae/blob/ba56dd91a7b8db544c1cb0df3a00c5c8a90fbb65/main_finetune.py#L82
# https://github.com/oetu/mae/blob/ba56dd91a7b8db544c1cb0df3a00c5c8a90fbb65/main_finetune.py#L86
mask_ratio: 0.0
mask_c_ratio: 0.0
mask_t_ratio: 0.0
# Training parameters
# “[..] over 400 epochs with a 5% warmup.” (Turgut et al., 2025, p. 5)
warmup_epochs: 5 # This should be 5% of $trainer.max_epochs
max_epochs: ${trainer.max_epochs}
smoothing: 0.1 # https://github.com/oetu/mae/blob/ba56dd91a7b8db544c1cb0df3a00c5c8a90fbb65/main_finetune.py#L135
# Downstream task parameters
num_classes: ${modality.num_classes}
global_pool: "attention_pool" # “We replace the global average pooling of fs(·) used during pre-training with the attention pooling described in [28].” (Turgut et al., 2025, p. 5)
pretrained_weights: "model_weights/signal_encoder_mmcl.pth"
task_type: ${modality.task_type}
For more details on model specifics, see the Model Architectures section.
Training Configuration¶
The trainer configuration controls the training loop (hardware, precision, and epoch limits):
_target_: lightning.pytorch.trainer.Trainer
default_root_dir: ${paths.output_dir}
min_epochs: 1
max_epochs: 30
accelerator: auto
devices: 1
precision: 16-mixed
check_val_every_n_epoch: 1
log_every_n_steps: 0
# makes training slower but gives more reproducibility than just setting seeds
deterministic: False
Experiment Configuration¶
Experiments are the highest-level configuration component, bringing together all other core components into a complete setup. An experiment configuration defines how the components interact to form a full training system. For example, the ecg_arrhythmia experiment couples the ECG modality, the Arrhythmia dataset, and the ECG classifier into a complete configuration:
# @package _global_
defaults:
- override /model: ecg_classifier
- override /modality: ${data}/arrhythmia
- override /data: ecg
data:
batch_size: 128
downstream: true
# From Turgut et al. (2025):
# "We augment the 12-lead ECG data using
# random cropping (scale=0.5 only during pre-training),
# Gaussian noise (sigma=0.25 during pre-training, 0.2 during finetuning),
# amplitude rescaling (sigma=0.5 during pretraining and fine-tuning),
# and Fourier transform surrogates (phase noise magnitude=0.1 during pre-training, 0.075 during fine-tuning)."
jitter_sigma: 0.2
rescaling_sigma: 0.5
ft_surr_phase_noise: 0.075
model:
warmup_epochs: 10 # “[..] over 400 epochs with a 5% warmup.” (Turgut et al., 2025, p. 5)
learning_rate: 3e-6 # ([..] and the learning rate (10−6, 3·10−6, 10−5, 3·10−5)) (Turgut et al., 2025, p. 5)
trainer:
max_epochs: 200
This example demonstrates how an experiment configuration:
- Specifies the model architecture and its hyperparameters
- Defines data loading and augmentation pipeline
- Sets up training parameters and optimization strategy
- Selectively overrides desired components
Types¶
The following sections describe the three main archetypes of experiments supported by our configuration system. Each type builds upon the core components above but serves a distinct purpose in the model development lifecycle.
- Pre-training: Pre-training a model on a large dataset for self-supervised learning.
- Fine-tuning: Fine-tuning a model on a specific task using a pre-trained backbone.
- Embeddings/Artifacts: Generating embeddings and saliency maps for a given dataset using a pre-trained model.
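Assuming the experiment names used elsewhere on this page, each archetype is launched by selecting the corresponding experiment config (the exact names available depend on the files under configs/experiment/):

```
# Pre-training (self-supervised)
rye run train experiment=mae_pretraining

# Fine-tuning on a downstream task
rye run train experiment=ecg_arrhythmia

# Generating embeddings/artifacts with a pre-trained model
rye run generate_artifacts experiment=extract_features ckpt_path=path/to/model
```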
Runtime Features¶
The configuration system provides several powerful runtime features that allow you to modify and extend experiment configurations without changing the base configuration files.
Hyperparameter Optimization¶
For hyperparameter search:
# Run hyperparameter search for training
rye run train -m hparams_search=cmr_classifier
# Evaluate best model from sweep
rye run eval experiment=cmr_acdc ckpt_path=path/to/best_model
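The search space itself lives in the referenced hparams_search config. A minimal sketch of what such a file could contain, assuming the Hydra Optuna sweeper plugin is used (the actual sweeper, trial count, and parameter ranges in configs/hparams_search/cmr_classifier.yaml may differ):

```yaml
# Hypothetical sketch of configs/hparams_search/cmr_classifier.yaml
# @package _global_
defaults:
  - override /hydra/sweeper: optuna

hydra:
  sweeper:
    direction: maximize    # optimize the monitored validation metric
    n_trials: 20
    params:
      model.learning_rate: tag(log, interval(1e-6, 1e-4))
      data.batch_size: choice(32, 64, 128)
```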
Multi-Run Experiments¶
Run multiple configurations:
# Training with different mask ratios
rye run train -m model.mask_ratio=0.65,0.75,0.85
# Evaluate multiple checkpoints
rye run eval -m ckpt_path=path/to/ckpt1,path/to/ckpt2
# Generate artifacts for multiple models
rye run generate_artifacts -m ckpt_path=path/to/ckpt1,path/to/ckpt2
Command Line Overrides¶
Override any configuration value:
# Training overrides
rye run train experiment=mae_pretraining trainer.max_epochs=200 data.batch_size=32
# Evaluation overrides
rye run eval experiment=mae_pretraining ckpt_path=path/to/model data.batch_size=64
# Artifact generation overrides
rye run generate_artifacts experiment=extract_features \
ckpt_path=path/to/model accelerator=gpu splits=[train,test]
Debug Mode¶
For development and debugging:
# Debug training
rye run train +debug=default
# Debug evaluation
rye run eval +debug=default debug.log_level=DEBUG
# Debug artifact generation
rye run generate_artifacts +debug=default debug.enabled=true
Configuration Reference¶
Available Configurations¶
Data Configs¶
- data/default.yaml: Base dataset configuration
- data/cmr.yaml: CMR dataset settings
- data/ecg.yaml: ECG dataset settings
Model Configs¶
- model/mae.yaml: Masked Autoencoder
- model/sim_clr.yaml: SimCLR model
- model/cmr_encoder.yaml: CMR encoder (backbone of CMR classifier)
- model/ecg_encoder.yaml: ECG encoder (backbone of ECG classifier)
- model/cmr_classifier.yaml: CMR classifier
- model/ecg_classifier.yaml: ECG classifier
Trainer Configs¶
trainer/default.yaml: Default training settings
Experiment Configs¶
Various predefined experiment configurations combining the above components.
Debugging Configurations¶
As the composed configuration can grow quite complex, inspecting it directly is often useful.
To debug your configuration:
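Hydra can print the fully composed configuration without launching a run. Assuming the entry points expose Hydra's standard command-line flags, something like the following shows the merged result:

```
# Print the composed job config without running
rye run train experiment=ecg_arrhythmia --cfg job

# Additionally resolve ${...} interpolations
rye run train experiment=ecg_arrhythmia --cfg job --resolve
```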