Installation Guide

This guide provides detailed instructions for setting up the environment to run the Hierarchical Reasoning Model. The project requires specific versions of PyTorch and CUDA, and relies on custom CUDA extensions.

Step 1: Prerequisites - CUDA and PyTorch

The model's dependencies require a specific CUDA version. The commands below assume a Linux machine with sudo access; they install the CUDA 12.6 toolkit and a PyTorch build compiled against it.

# Install CUDA 12.6
CUDA_URL=https://developer.download.nvidia.com/compute/cuda/12.6.3/local_installers/cuda_12.6.3_560.35.05_linux.run
wget -q --show-progress --progress=bar:force:noscroll -O cuda_installer.run $CUDA_URL
sudo sh cuda_installer.run --silent --toolkit --override

# Point extension builds at the new toolkit (this export lasts only for the current shell session)
export CUDA_HOME=/usr/local/cuda-12.6

# Install PyTorch with CUDA 12.6 support
PYTORCH_INDEX_URL=https://download.pytorch.org/whl/cu126
pip3 install torch torchvision torchaudio --index-url $PYTORCH_INDEX_URL

# Install additional packages for building extensions
pip3 install packaging ninja wheel setuptools setuptools-scm
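
Before moving on, it can help to confirm that the toolkit and the PyTorch wheel agree on the CUDA version, since a mismatch is a common cause of extension build failures. The following is a minimal sketch and not part of the HRM repository: the helper name cuda_release is hypothetical, and it assumes that nvcc --version prints a line containing "release X.Y".

```shell
# Hypothetical helper (not part of HRM): print the CUDA release reported by
# the nvcc binary under the given toolkit directory, e.g. "12.6".
cuda_release() {
    nvcc_bin="$1/bin/nvcc"
    if [ ! -x "$nvcc_bin" ]; then
        echo "nvcc not found under $1" >&2
        return 1
    fi
    # nvcc prints a line such as "Cuda compilation tools, release 12.6, V12.6.85"
    "$nvcc_bin" --version | sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Report the toolkit release, then the CUDA version PyTorch was built against.
cuda_release "${CUDA_HOME:-/usr/local/cuda-12.6}" || echo "CUDA toolkit check failed"
python3 -c "import torch; print(torch.version.cuda, torch.cuda.is_available())" \
    || echo "PyTorch check failed"
```

Both checks should report the same CUDA version before you build FlashAttention in the next step.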

Step 2: Install FlashAttention

HRM uses FlashAttention for optimized attention computation. The version you need depends on your GPU architecture.

For NVIDIA Hopper GPUs (e.g., H100): Install FlashAttention 3 by building from source.

git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention/hopper
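python setup.py install
cd ../..  # return to the starting directory so later steps do not run inside flash-attention/hopper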

For NVIDIA Ampere or Ada GPUs (e.g., A100, RTX 3090/4090): Install FlashAttention 2 from pip.

pip3 install flash-attn --no-build-isolation
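
The choice between the two install paths can be sketched as a small helper keyed on the GPU's compute capability (9.0 for Hopper, 8.x for Ampere and Ada). The function name flash_attn_variant is hypothetical, and the compute_cap query field assumes a reasonably recent NVIDIA driver; on older drivers, look up the capability of your GPU manually.

```shell
# Hypothetical helper: map a compute capability such as "9.0" or "8.6"
# to the FlashAttention install path described in this step.
flash_attn_variant() {
    case "${1%%.*}" in
        9) echo "FlashAttention 3 (build from flash-attention/hopper)" ;;
        8) echo "FlashAttention 2 (flash-attn from pip)" ;;
        *) echo "not covered by this guide (compute capability $1)" ;;
    esac
}

# Query the first GPU; skipped when nvidia-smi is unavailable.
if command -v nvidia-smi >/dev/null 2>&1; then
    cap=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)
    flash_attn_variant "$cap"
fi
```

Only the major version is inspected, so an RTX 4090 (8.9) and an A100 (8.0) both map to FlashAttention 2.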

Step 3: Clone the Repository

Clone the HRM repository from GitHub.

git clone https://github.com/sapientinc/HRM.git
cd HRM

Step 4: Initialize Raw Datasets

The raw datasets for ARC, ConceptARC, etc., are included as Git submodules. Initialize them with the following command:

git submodule update --init --recursive

This downloads the raw data into the dataset/raw-data/ directory.
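
To confirm that the submodules were actually fetched, one option is to scan the output of git submodule status, which prefixes each uninitialized submodule with "-". The sketch below is illustrative only; the helper name count_uninitialized is hypothetical.

```shell
# Hypothetical helper: count uninitialized submodules from
# `git submodule status` output on stdin (such lines start with "-").
count_uninitialized() {
    # grep -c exits non-zero when the count is 0, so mask the status.
    grep -c '^-' || true
}

# Run from the repository root; skipped outside a Git work tree.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    missing=$(git submodule status --recursive | count_uninitialized)
    echo "uninitialized submodules: $missing"
fi
```

A count above zero suggests that the submodule update was interrupted and should be run again.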

Step 5: Install Python Dependencies

Install the required Python packages using the requirements.txt file.

pip3 install -r requirements.txt

Step 6: Log in to Weights & Biases

The project uses Weights & Biases (W&B) for experiment tracking and visualization. You will need a W&B account.

Log in to your account from the command line:

wandb login
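
On machines where an interactive login is inconvenient (for example a remote training node), W&B can also be configured through environment variables. The following sketch relies on W&B's WANDB_API_KEY and WANDB_MODE variables; the helper name configure_wandb is hypothetical and not part of HRM.

```shell
# Hypothetical helper: use an API key from the environment when present,
# otherwise fall back to offline mode so runs are logged locally only.
configure_wandb() {
    if [ -n "${WANDB_API_KEY:-}" ]; then
        echo "W&B: using API key from WANDB_API_KEY"
    else
        export WANDB_MODE=offline
        echo "W&B: no API key set, logging offline"
    fi
}

configure_wandb
```

Runs recorded in offline mode can be uploaded later with wandb sync once credentials are available.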

After completing these steps, your environment is ready for building datasets and running training experiments.