Quick Start Guide

This guide walks you through the basic setup, configuration, and execution steps needed to run your first SPARC simulation.

Set Environment Variables

export VASP_PP_PATH=/path/to/vasp/potcar_files    # POTCAR files path (VASP only)

If PLUMED was installed from source (skip if you used conda-forge):

export PLUMED_KERNEL="$CONDA_PREFIX/lib/libplumedKernel.so"
export PYTHONPATH="$CONDA_PREFIX/lib/plumed/python:$PYTHONPATH"
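Before launching a run, it can help to confirm that the variables above are visible to the process. The snippet below is a minimal sketch of such a check. The helper name `missing_env_vars` is hypothetical and not part of SPARC, and the list of required names is an assumption that depends on your setup (VASP only, PLUMED built from source, and so on).

```python
import os


def missing_env_vars(required, environ=None):
    """Return the names in `required` that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]


# Assumed list: VASP_PP_PATH matters only for VASP runs, and
# PLUMED_KERNEL only when PLUMED was installed from source.
required = ["VASP_PP_PATH", "PLUMED_KERNEL"]
missing = missing_env_vars(required)
if missing:
    print("Unset variables:", ", ".join(missing))
else:
    print("All listed variables are set.")
```

Trim the `required` list to the variables that apply to your installation before relying on the result.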

Basic Usage

SPARC requires a YAML input file that defines the structure, DFT calculator, MD settings, and active learning parameters.

Example Input File

The example below shows a complete active learning workflow using VASP:

general:
  structure_file: "POSCAR"

dft_calculator:
  engine: "VASP"
  template_file: "INCAR"
  exe_command: "mpirun -np 4 vasp_std"

aimd_setup:
  ensemble: "NVT"
  temperature: 300.0
  timestep_fs: 1.0
  steps: 100
  thermostat:
    type: "Nose"
    tdamp: 2.0

mlip_setup:
  training: true
  data_dir: "Dataset"
  input_file: "input.json"
  num_models: 4
  MdSimulation: true
  ensemble: "NVT"
  temperature: 300.0
  timestep_fs: 1.0
  md_steps: 2000
  train_ratio: 0.8

active_learning: true
iteration: 10
model_dev:
  f_min_dev: 0.05
  f_max_dev: 0.30

See Input File for a full description of all available options.
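A few values in the example are easy to mistype, so a quick sanity check before a long run may save time. The sketch below works on a plain dictionary that mirrors part of the example file (loading the YAML itself is left out). The function `check_config` and the rules it applies are assumptions made for illustration, not SPARC's own validation: a split ratio strictly between 0 and 1, at least two models for a committee, and `f_min_dev` below `f_max_dev`.

```python
def check_config(cfg):
    """Return a list of human-readable problems found in a parsed config dict."""
    problems = []
    mlip = cfg.get("mlip_setup", {})
    dev = cfg.get("model_dev", {})

    # Assumed rule: train_ratio is a fraction of the data set.
    ratio = mlip.get("train_ratio")
    if ratio is not None and not (0.0 < ratio < 1.0):
        problems.append("train_ratio should lie strictly between 0 and 1")

    # Assumed rule: model deviation needs more than one model.
    if mlip.get("num_models", 2) < 2:
        problems.append("a committee needs at least 2 models")

    # Assumed rule: the lower force-deviation bound sits below the upper one.
    lo, hi = dev.get("f_min_dev"), dev.get("f_max_dev")
    if lo is not None and hi is not None and not (0.0 <= lo < hi):
        problems.append("expected 0 <= f_min_dev < f_max_dev")
    return problems


# Values copied from the example input file above.
example = {
    "mlip_setup": {"num_models": 4, "train_ratio": 0.8},
    "model_dev": {"f_min_dev": 0.05, "f_max_dev": 0.30},
}
print(check_config(example))  # prints []
```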

Running a Simulation

sparc -i input.yaml

Directory Structure

After the first iteration, the following layout is created:

Project Root/
├── POSCAR
├── INCAR
├── input.json
├── input.yaml
├── Dataset/
│   ├── training_data/
│   └── validation_data/
├── iter_000000/
│   ├── 00.dft/           DFT / AIMD labelling
│   ├── 01.train/         MLIP training
│   │   ├── training_1/
│   │   ├── training_2/
│   │   └── ...
│   └── 02.dpmd/          ML-MD + model deviation
├── iter_000001/
│   └── ...

  • 00.dft/: DFT calculations used to label selected structures

  • 01.train/: ML model training; one training_N/ folder per model

  • 02.dpmd/: ML-MD simulation and Query-by-Committee model deviation
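The naming pattern in the tree (a six-digit, zero-padded iteration index with three stage folders) can be reproduced in a few lines, which is handy when writing post-processing scripts. This is only a sketch of the layout shown above. The helper names `iteration_dirs` and `make_iteration` are hypothetical, and SPARC creates these folders itself during a run.

```python
from pathlib import Path
import tempfile

# Stage folders as they appear in the directory tree above.
SUBDIRS = ("00.dft", "01.train", "02.dpmd")


def iteration_dirs(root, iteration):
    """Return the paths of the three stage folders for one iteration."""
    base = Path(root) / f"iter_{iteration:06d}"
    return [base / name for name in SUBDIRS]


def make_iteration(root, iteration):
    """Create the stage folders (no error if they already exist)."""
    paths = iteration_dirs(root, iteration)
    for path in paths:
        path.mkdir(parents=True, exist_ok=True)
    return paths


# Demonstrate in a throwaway directory so no project files are touched.
with tempfile.TemporaryDirectory() as tmp:
    for p in make_iteration(tmp, 0):
        print(p.relative_to(tmp))
```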

Sample Output (Sparc.log)

================================================================================
BEGIN CALCULATION - 2025-04-08 22:30:32
================================================================================

        ######  ########     ###    ########   ######
        ##    ## ##     ##   ## ##   ##     ## ##    ##
        ##       ##     ##  ##   ##  ##     ## ##
        ######  ########  ##     ## ########  ##
              ## ##        ######### ##   ##   ##
        ##    ## ##        ##     ## ##    ##  ##    ##
        ######  ##        ##     ## ##     ##  ######
        --v0.2.0

================================================================================
Creating Directories for Iteration: 000000
================================================================================
├── iter_000000
│   ├── 00.dft
│   ├── 01.train
│   └── 02.dpmd

================================================================================
Starting AIMD Simulation [Nose-Hoover]
================================================================================
Step     Epot (eV)    Ekin (eV)    Temp (K)
--------------------------------------------------------------------------------
0           -36.0932      0.3102    300.00
1           -36.1182      0.4385    424.04
2           -36.1058      0.4062    392.84

================================================================================
MLIP Training — 4 models
================================================================================
RUNNING TRAINING IN FOLDER (iter_000000/01.train/training_1)
...
frozen_model_1.pth saved

================================================================================
Starting ML-MD Simulation
================================================================================
Step     Epot (eV)    Ekin (eV)    Temp (K)
--------------------------------------------------------------------------------
0           -29.8049      0.1939    300.00
5           -29.7611      0.1458    225.61
10          -29.7915      0.1711    264.75
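
The step tables in the log are plain whitespace-separated columns, so they are straightforward to read back for plotting or quick checks. The sketch below assumes the four-column layout shown in this sample (step, Epot, Ekin, Temp). The regular expression and the function `parse_md_rows` are illustrative and may need adjusting if the log format differs in your version.

```python
import re

# A data row: an integer step followed by three decimal columns.
ROW = re.compile(r"^\s*(\d+)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s*$")


def parse_md_rows(text):
    """Extract (step, epot, ekin, temp) tuples from log text."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            step, epot, ekin, temp = match.groups()
            rows.append((int(step), float(epot), float(ekin), float(temp)))
    return rows


# Rows copied from the AIMD table in the sample output above.
sample = """\
Step     Epot (eV)    Ekin (eV)    Temp (K)
0           -36.0932      0.3102    300.00
1           -36.1182      0.4385    424.04
2           -36.1058      0.4062    392.84
"""
rows = parse_md_rows(sample)
mean_temp = sum(r[3] for r in rows) / len(rows)
print(len(rows), round(mean_temp, 2))
```

Because the AIMD and ML-MD tables share the same column layout, this parser does not tell them apart. Split the log on the section banners first if the two need to be kept separate.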

Core Components

1. MD Simulation

  • NVE, NVT (Nose-Hoover / Langevin), and NPT (Berendsen) ensembles

  • Supports both ab initio MD (VASP, CP2K, ORCA, QE, xTB, Gaussian) and ML-driven MD (ML-MD)

  • Checkpoint/restart capabilities

  • PLUMED integration for enhanced sampling (Metadynamics, Umbrella Sampling)

2. MLIP Training

  • Automated DeepMD-kit training pipeline

  • Ensemble model generation for uncertainty quantification

  • Fine-tuning of universal potentials (DPA-3, MACE-MP) from a pre-trained checkpoint

3. Active Learning

  • Query-by-Committee (QbC) for candidate selection based on force deviation

  • RMSD-based duplicate filtering for diverse training data

  • Automated DFT labelling and model retraining

  • fparam support for universal models (e.g., DPA-3)
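
To make the Query-by-Committee step more concrete, the sketch below computes a maximum force deviation across an ensemble and sorts a frame into one of three bins using `f_min_dev` and `f_max_dev`. It follows the definition commonly used in DP-GEN-style workflows (the largest per-atom standard deviation of the force vector across models). Whether SPARC uses exactly this formula is an assumption, and the function names, bin labels, and eV/Å units are chosen here for illustration only.

```python
import math


def max_force_deviation(forces):
    """Largest per-atom standard deviation of the force across models.

    forces[m][i] is the (fx, fy, fz) force on atom i predicted by model m.
    """
    n_models = len(forces)
    n_atoms = len(forces[0])
    worst = 0.0
    for i in range(n_atoms):
        # Committee-mean force on atom i.
        mean = [sum(forces[m][i][k] for m in range(n_models)) / n_models
                for k in range(3)]
        # Mean squared distance of each model's force from that mean.
        var = sum(
            sum((forces[m][i][k] - mean[k]) ** 2 for k in range(3))
            for m in range(n_models)
        ) / n_models
        worst = max(worst, math.sqrt(var))
    return worst


def classify(dev, f_min_dev=0.05, f_max_dev=0.30):
    """Bin a frame by its deviation (labels are illustrative)."""
    if dev < f_min_dev:
        return "accurate"   # models agree; no new label needed
    if dev < f_max_dev:
        return "candidate"  # uncertain but plausible; send to DFT
    return "failed"         # models disagree strongly; likely unphysical


# Two models, one atom: predictions differ by 0.2 along x.
forces = [[(0.0, 0.0, 0.0)], [(0.2, 0.0, 0.0)]]
dev = max_force_deviation(forces)
print(round(dev, 3), classify(dev))
```

In this picture, only frames in the middle bin would be passed on to duplicate filtering and DFT labelling, which keeps the number of expensive calculations small.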