Welcome to SPARC's documentation!

SPARC (Smart Potential with Atomistic Rare Events and Continuous Learning) is a Python package that automates the active learning workflow for developing machine learning interatomic potentials (MLIPs). It is built on top of ASE and ties together DFT labelling, MLIP training, and ML-driven molecular dynamics into a single iterative loop.

Scientific Overview

Training a stable and reactive MLIP requires a training dataset that covers the configuration space relevant to the target conditions. The SPARC package automates the generation of this dataset through an active learning loop that runs in three stages per iteration: ab initio MD or first-principles calculations to generate reference data (00.dft), MLIP training on the accumulated dataset (01.train), and ML-driven MD combined with advanced sampling techniques to generate a diverse set of candidate structures (02.dpmd). Structures where the inter-model force deviation exceeds a threshold are selected as candidates for labelling and passed back to the DFT stage. The loop stops when no new uncertain structures are found, meaning the potential energy surface is well represented for the thermodynamic conditions of interest. See Workflow Overview for a full description.
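The selection step can be illustrated with a minimal NumPy sketch. It assumes one common definition of the inter-model force deviation: for each atom, the root-mean-square distance of each committee member's force vector from the committee mean, with the maximum over atoms taken per frame. The function names, the array layout, and the single `threshold` argument are hypothetical choices for illustration and are not SPARC's actual API.

```python
import numpy as np


def max_force_deviation(forces: np.ndarray) -> np.ndarray:
    """Per-frame maximum (over atoms) of the committee force deviation.

    `forces` has shape (n_models, n_frames, n_atoms, 3): the forces that
    each committee model predicts for the same set of MD frames.
    This array layout is an assumption made for this sketch.
    """
    mean = forces.mean(axis=0)                     # (n_frames, n_atoms, 3)
    # Squared distance of each model's force vector from the committee mean.
    sq_dist = ((forces - mean) ** 2).sum(axis=-1)  # (n_models, n_frames, n_atoms)
    # Root-mean-square over models gives a per-atom deviation.
    per_atom = np.sqrt(sq_dist.mean(axis=0))       # (n_frames, n_atoms)
    # The most uncertain atom characterises the whole frame.
    return per_atom.max(axis=-1)                   # (n_frames,)


def select_candidates(forces: np.ndarray, threshold: float) -> np.ndarray:
    """Indices of frames whose deviation exceeds the (hypothetical) threshold."""
    return np.flatnonzero(max_force_deviation(forces) > threshold)
```

Under these assumptions, an empty array returned by `select_candidates` corresponds to the stopping condition described above: no frame is uncertain enough to be sent back for DFT labelling.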

Key Features

  • DFT engines — VASP, CP2K, ORCA, xTB, Quantum ESPRESSO, and Gaussian via ASE calculators

  • DeepMD-kit v2 and v3 — supports TensorFlow and PyTorch backends; automatically detected at runtime

  • GNN potentials — MACE and NequIP training via deepmd-gnn using the same workflow

  • Fine-tuning — initialise from pre-trained DPA-3 universal models instead of training from scratch

  • Active learning — Query-by-Committee force deviation selects uncertain structures for DFT relabelling

  • Enhanced sampling — PLUMED integration for metadynamics, umbrella sampling, and any collective variable (CV) or bias

Indices and Tables