Welcome to SPARC's documentation!
SPARC (Smart Potential with Atomistic Rare Events and Continuous Learning) is a Python package that automates the active learning workflow for developing machine learning interatomic potentials (MLIPs). It is built on top of ASE and ties together DFT labelling, MLIP training, and ML-driven molecular dynamics into a single iterative loop.
Scientific Overview
Training a stable and reactive MLIP requires a training dataset that covers the configuration space relevant to the target conditions. The SPARC package automates the generation of this dataset through an active learning loop that runs in three stages per iteration: ab initio MD or first-principles calculations to generate reference data (00.dft), MLIP training on the accumulated dataset (01.train), and ML-driven MD together with advanced sampling techniques to generate a diverse set of candidate structures (02.dpmd). Structures where the inter-model force deviation exceeds a threshold are selected as candidates for labelling and passed back to the DFT stage. The loop stops when no new uncertain structures are found, meaning the potential energy surface is well represented for the thermodynamic conditions of interest.
See Workflow Overview for a full description.
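The control flow of the loop described in this section can be sketched in a few lines of Python. This is an illustrative outline only, not SPARC's actual API: the function `active_learning_loop`, its `label`, `train`, and `explore` callables, and the `max_iterations` safety cap are all hypothetical names introduced here to stand in for the 00.dft, 01.train, and 02.dpmd stages.

```python
from typing import Callable, List, Sequence, Tuple


def active_learning_loop(
    label: Callable[[Sequence], List],   # hypothetical stand-in for 00.dft
    train: Callable[[List], object],     # hypothetical stand-in for 01.train
    explore: Callable[[object], List],   # hypothetical stand-in for 02.dpmd
    initial_structures: Sequence,
    max_iterations: int = 10,            # assumed safety cap, not from the docs
) -> Tuple[object, List, int]:
    """Run label -> train -> explore until no uncertain structures remain."""
    dataset: List = []
    candidates = list(initial_structures)
    models = None
    for iteration in range(max_iterations):
        # Stage 1: compute reference labels for the current candidates.
        dataset.extend(label(candidates))
        # Stage 2: retrain on the accumulated dataset.
        models = train(dataset)
        # Stage 3: sample with the new models and collect uncertain structures.
        candidates = explore(models)
        if not candidates:
            # Converged: exploration found nothing above the threshold.
            return models, dataset, iteration + 1
    return models, dataset, max_iterations
```

The stopping rule is the point of the sketch: each iteration grows the dataset only with structures the current models are unsure about, and the loop ends as soon as exploration returns none.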
Getting Started
User Guide
Key Features
DFT engines — VASP, CP2K, ORCA, xTB, Quantum ESPRESSO, and Gaussian via ASE calculators
DeepMD-kit v2 and v3 — supports TensorFlow and PyTorch backends; automatically detected at runtime
GNN potentials — MACE and NequIP training via deepmd-gnn using the same workflow
Fine-tuning — initialise from pre-trained DPA-3 universal models instead of training from scratch
Active learning — Query-by-Committee force deviation selects uncertain structures for DFT relabelling
Enhanced sampling — PLUMED integration for metadynamics, umbrella sampling, and any CV/bias
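To make the Query-by-Committee selection listed above concrete, here is a minimal sketch of one common definition of force deviation: for each atom, the standard deviation of the predicted force vector across the committee of models, with the maximum over atoms taken as the frame's score. Whether SPARC uses exactly this formula is an assumption, and the function names `max_force_deviation` and `select_candidates` and the single `threshold` parameter are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def max_force_deviation(forces: Sequence[Sequence[Vector]]) -> float:
    """Return the largest per-atom force deviation across a model committee.

    forces[m][i] is the 3-component force on atom i predicted by model m.
    """
    n_models = len(forces)
    n_atoms = len(forces[0])
    worst = 0.0
    for i in range(n_atoms):
        # Committee-mean force on atom i, component by component.
        mean = [
            sum(forces[m][i][k] for m in range(n_models)) / n_models
            for k in range(3)
        ]
        # Mean squared distance of each model's force from the committee mean.
        variance = sum(
            sum((forces[m][i][k] - mean[k]) ** 2 for k in range(3))
            for m in range(n_models)
        ) / n_models
        worst = max(worst, math.sqrt(variance))
    return worst


def select_candidates(
    frames: Sequence[Sequence[Sequence[Vector]]], threshold: float
) -> List[int]:
    """Return indices of frames whose deviation exceeds the threshold."""
    return [
        index
        for index, frame in enumerate(frames)
        if max_force_deviation(frame) > threshold
    ]
```

A frame on which all models agree scores zero and is skipped, while a frame on which they disagree strongly on even a single atom is flagged for labelling.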