Quick Start Guide

Welcome to the Quick Start Guide for setting up and running your first simulation with SPARC. This guide walks you through the basic setup, configuration, and execution steps, allowing you to quickly begin your calculations. Follow these steps to configure your environment and set up an input file. An example input file is provided in the Basic Usage section below.

Environment Variables:

When using different ab initio packages, certain environment variables must be defined prior to executing the workflow. These ensure that the underlying electronic structure code can locate the required pseudopotentials, basis sets, and related data files.

First-principle Calculator
- VASP
  
  The pseudopotential library must be made available by setting the path to the POTCAR files through VASP_PP_PATH. See the ASE VASP documentation.
```
export VASP_PP_PATH=/path/to/vasp/potcar_files    # path to POTCAR files path
```
- CP2K
  
  CP2K requires access to basis set and pseudopotential library, which is specified through the CP2K_DATA_DIR variable. See the ASE CP2K documentation.
```
export $CP2K_DATA_DIR=/path/to/cp2k/data        # path to CP2K basis set data directory
```

PLUMED

If PLUMED was installed manually (skip: if used conda-forge), you need to set the plumed environment variable.

export PLUMED_KERNEL="$CONDA_PREFIX/lib/libplumedKernel.so"
export PYTHONPATH="$CONDA_PREFIX/lib/plumed/python:$PYTHONPATH"

Basic Usage

SPARC code requires a YAML input file that specifies the essential parameters of the workflow. Each block in the file controls different part of the workflow, such as system setup, MD settings, or the DFT calculator.

An example input file is shown below.

Input File

general:
  structure_file: "POSCAR"

md_simulation:
  thermostat: "Nose"
  temperature: 300.0
  steps: 100
  timestep_fs: 1.0

dft_calculator:
  name: "VASP"
  prec: "Normal"
  incar_file: "INCAR"
  exe_command: "mpirun -np 2 /path/to/vasp_std"

Running the WorkFlow

Once the input file is configured, execute the workflow with command:

sparc -i input.yaml

Directory Structure

After the first iteration, SPARC creates the following directory structure:

>>> Project Root
├── POSCAR
├── INCAR
├── input.json
├── input.yaml
├── Dataset
│   ├── training_data
│   └── validation_data
├── iter_000000
│   ├── 00.dft
│   ├── 01.train
│   └── 02.dpmd
├── iter_000001
│   ├── 00.dft
│   ├── 01.train
│   └── 02.dpmd

This layout organizes the outputs as follows:

POSCAR / INCAR: Standard VASP structure and input file

input.yaml: User-defined configuration for SPARC execution.

input.json: Configuration DeePMD-kit ML training

Dataset: Stores training and validation data for ML training

iter_000000, iter_000001, …: Iteration folders containing:

00.dft: First-principles calculation

01.train: DeepMD-kit ML model training

02.dpmd: Molecular dynamics simulations with ML potential

Each new iteration (e.g., iter_000001, iter_000002, …) corresponds to an active learning cycle, in which the potential is refined with additional labelled data and retrained models.

By default SPARC writes all log messages to an output file sparc.log. A sample output file looks like this:

========================================================================
BEGIN CALCULATION - 2025-04-08 22:30:32
========================================================================

       ######  ########     ###    ########   ######
      ##    ## ##     ##   ## ##   ##     ## ##    ##
      ##       ##     ##  ##   ##  ##     ## ##
       ######  ########  ##     ## ########  ##
            ## ##        ######### ##   ##   ##
      ##    ## ##        ##     ## ##    ##  ##    ##
       ######  ##        ##     ## ##     ##  ######
       --v0.1.0

========================================================================
Creating directories for Iteration: 000000
========================================================================
├── iter_000000/
│   ├── 00.dft/
│   ├── 01.train/
│   └── 02.dpmd/
========================================================================

! ab-initio MD Simulations will be performed at Temp.: 300K !

========================================================================

========================================================================
                  Starting AIMD Simulation [Langevin]
========================================================================
Steps: 0, Epot: -23.900369, Ekin: 0.193890, Temp: 300.00
Steps: 1, Epot: -23.868841, Ekin: 0.163737, Temp: 253.35
Steps: 2, Epot: -23.809660, Ekin: 0.114040, Temp: 176.45
Steps: 3, Epot: -23.785705, Ekin: 0.095163, Temp: 147.24
========================================================================
          DEEPMD WILL TRAIN 2 MODELS !
========================================================================

========================================================================
RUNNING TRAINING IN FOLDER (iter_000000/01.train/training_1) !
DeepMD Model Evaluation Results:
------------------------------------------------------------------------
DEEPMD INFO    # ---------------output of dp test---------------
DEEPMD INFO    # testing system : Dataset/validation_data
DEEPMD INFO    # number of test data : 45
DEEPMD INFO    Energy MAE         : 1.000640e+01 eV
DEEPMD INFO    Energy RMSE        : 2.874460e+01 eV
DEEPMD INFO    Energy MAE/Natoms  : 2.001280e+00 eV
DEEPMD INFO    Energy RMSE/Natoms : 5.748920e+00 eV
DEEPMD INFO    Force  MAE         : 7.150255e-02 eV/A
DEEPMD INFO    Force  RMSE        : 9.012610e-02 eV/A
DEEPMD INFO    Virial MAE         : 2.775489e-01 eV
DEEPMD INFO    Virial RMSE        : 3.691740e-01 eV
DEEPMD INFO    Virial MAE/Natoms  : 5.550977e-02 eV
DEEPMD INFO    Virial RMSE/Natoms : 7.383479e-02 eV
DEEPMD INFO    # -----------------------------------------------
========================================================================
DeepPotential model successfully loaded and tested:
iter_000000/01.train/training_1/frozen_model_1.pb
========================================================================

========================================================================
          Initializing DeepPotential MD Simulation [Langevin]
========================================================================
Steps: 0, Epot: -29.804899, Ekin: 0.193890, Temp: 300.00
Steps: 70, Epot: -29.761078, Ekin: 0.145811, Temp: 225.61
Steps: 140, Epot: -29.791506, Ekin: 0.171109, Temp: 264.75
Steps: 210, Epot: -29.771736, Ekin: 0.170868, Temp: 264.38
Steps: 280, Epot: -29.755832, Ekin: 0.133304, Temp: 206.26
Steps: 350, Epot: -29.801518, Ekin: 0.177911, Temp: 275.28
Steps: 420, Epot: -29.736048, Ekin: 0.165758, Temp: 256.47
Steps: 490, Epot: -29.717222, Ekin: 0.161525, Temp: 249.92
========================================================================

Note

SPARC integrates three key components: molecular dynamics simulations, ML potential training, and active learning. For a more details, see Scientific Overview.