DeePMD

Training a DeePMD model requires a working DeePMD-kit installation (the dp CLI must be available in $PATH; see the SPARC Installation Guide) and an input JSON file (input.json) compatible with DeePMD-kit.

Within each iter_xxxx directory, a 01.train folder is created. For each model specified by num_models, a subdirectory named training_n is created, in which the training is executed. The training configuration is taken from the user-defined input_file; see the DeepMD section for details.
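The layout described above can be sketched as follows. This is a minimal illustration, not SPARC's own code: the helper name training_dirs is hypothetical, and the six-digit iteration index and 1-based model numbering are assumptions inferred from paths such as iter_000000/01.train/training_1 used elsewhere on this page.

```python
from pathlib import Path


def training_dirs(root, iteration, num_models):
    """Return the per-model training directories for one iteration.

    Assumes (hypothetically) the pattern iter_<6 digits>/01.train/training_<n>
    with n starting at 1. Nothing is created on disk here.
    """
    train_dir = Path(root) / f"iter_{iteration:06d}" / "01.train"
    return [train_dir / f"training_{n}" for n in range(1, num_models + 1)]
```

A caller would typically create each returned directory (for example with mkdir(parents=True, exist_ok=True)) before launching the training inside it.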

SPARC edits the base input.json for each model so that the training data path stays the same across all models, while each training run is assigned a unique seed value. The following fields in the training configuration are updated:

Replace every seed entry with a random value for each run,
Set type_map to the system-specific atom_types,
Set training_data.systems = [ "<datadir>/training_data" ],
Set validation_data.systems = [ "<datadir>/validation_data" ].

{
   "model": {
      "type_map": ["O", "H"]
   },
   "training": {
      "seed": 123456,
      "training_data": {
         "systems": []
      },
      "validation_data": {
         "systems": []
      }
   }
}
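The updates listed above could be applied to a loaded configuration roughly as follows. This is a sketch under stated assumptions rather than SPARC's implementation: the function names replace_seeds and prepare_model_config are hypothetical, and drawing an independent random integer for every seed key is only one possible reading of "a random value for each run".

```python
import copy
import random


def replace_seeds(node, rng):
    """Recursively overwrite every "seed" key in a nested dict/list in place."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "seed":
                node[key] = rng.randrange(2**31)  # assumption: any non-negative int is acceptable
            else:
                replace_seeds(value, rng)
    elif isinstance(node, list):
        for item in node:
            replace_seeds(item, rng)


def prepare_model_config(base, datadir, atom_types, rng=None):
    """Return a per-model copy of the base config with the fields listed above updated."""
    if rng is None:
        rng = random.Random()
    cfg = copy.deepcopy(base)  # leave the user's base configuration untouched
    replace_seeds(cfg, rng)
    cfg.setdefault("model", {})["type_map"] = list(atom_types)
    training = cfg.setdefault("training", {})
    training.setdefault("training_data", {})["systems"] = [f"{datadir}/training_data"]
    training.setdefault("validation_data", {})["systems"] = [f"{datadir}/validation_data"]
    return cfg
```

The base dictionary would normally come from json.load on input.json, and the result would be written with json.dump into the corresponding training_n directory.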

Logs are written to deepmd_training.log. If more than four models are requested, a warning is logged. The function evaluate_model_accuracy evaluates each frozen model at the end of its training run.

Tip

Train multiple models to improve accuracy.

After training, each iter_xxxx/01.train/training_n directory should contain a frozen_model_n.pb file. These models are then used in a query-by-committee approach to find new candidates for labelling.
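Before moving on to candidate selection, it can be useful to confirm that every training run actually produced its frozen model. The check below is a hedged sketch: missing_frozen_models is a hypothetical helper, and the 1-based numbering of training_n and frozen_model_n.pb is an assumption based on the example paths on this page.

```python
from pathlib import Path


def missing_frozen_models(train_dir, num_models):
    """Return the expected frozen_model_n.pb paths that do not exist yet."""
    missing = []
    for n in range(1, num_models + 1):
        model_file = Path(train_dir) / f"training_{n}" / f"frozen_model_{n}.pb"
        if not model_file.is_file():
            missing.append(model_file)
    return missing
```

An empty return value means all num_models frozen models are in place under the given 01.train directory.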

Note

Query by committee (QbC): Identifies candidate configurations by measuring the disagreement among an ensemble of models. This lets the model learn only what it needs, without wasting resources on redundant data. See also DeePMD-kit model deviation for more details.
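To make the idea of "disagreement" concrete, the sketch below computes a force-based model deviation in plain Python: for each atom, the root-mean-square spread of the predicted force vectors around the committee mean, maximised over atoms. This mirrors the usual definition of the maximum force deviation, but the function names and the lower/upper thresholds are hypothetical, and this page does not state which thresholds SPARC applies.

```python
import math


def max_force_deviation(forces):
    """Maximum over atoms of the committee spread in predicted forces.

    forces[m][i] is the (fx, fy, fz) force on atom i predicted by model m.
    """
    n_models = len(forces)
    n_atoms = len(forces[0])
    worst = 0.0
    for i in range(n_atoms):
        # Committee-mean force on atom i, component by component.
        mean = [sum(forces[m][i][c] for m in range(n_models)) / n_models for c in range(3)]
        # Mean squared distance of each model's force from that mean.
        msd = sum(
            sum((forces[m][i][c] - mean[c]) ** 2 for c in range(3))
            for m in range(n_models)
        ) / n_models
        worst = max(worst, math.sqrt(msd))
    return worst


def is_candidate(deviation, lower, upper):
    """Hypothetical selection rule: uncertain enough to be informative,
    but not so far off that the configuration is likely unphysical."""
    return lower <= deviation < upper
```

With this rule, configurations on which the models agree (deviation below lower) are skipped as redundant, while those in the window are passed on for labelling.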

If candidates are found, a subdirectory dft_candidates is created under the 02.dft folder. It contains one POSCAR for each candidate.

>>> tree dft_candidates
    ├── 0001
    │   └── POSCAR
    ├── 0002
    │   └── POSCAR
    ├── 0003
    │   └── POSCAR
    ├── 0004
    │   └── POSCAR
    └── 0005
        └── POSCAR
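The candidate folders shown above can be collected for downstream DFT jobs with a few lines of standard-library Python. This is only an illustrative sketch: candidate_poscars is a hypothetical helper, and the demo builds a throwaway directory that mimics the layout rather than reading a real SPARC run.

```python
import tempfile
from pathlib import Path


def candidate_poscars(dft_dir):
    """Return the POSCAR path of every candidate, sorted by directory name."""
    root = Path(dft_dir) / "dft_candidates"
    return sorted(p for p in root.glob("*/POSCAR") if p.is_file())


# Demo on a temporary directory that mimics the tree shown above.
with tempfile.TemporaryDirectory() as tmp:
    for n in range(1, 6):
        cand = Path(tmp) / "dft_candidates" / f"{n:04d}"
        cand.mkdir(parents=True)
        (cand / "POSCAR").write_text("placeholder\n")
    found = [p.parent.name for p in candidate_poscars(tmp)]

print(found)  # ['0001', '0002', '0003', '0004', '0005']
```

In a real run, dft_dir would be the iteration's 02.dft folder.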

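from ase.build import molecule
from sparc.src.deepmd import setup_DeepPotential

# Build a water molecule with a real geometry; Atoms("H2O") would place all atoms at the origin.
atoms = molecule("H2O")
# Attach the frozen model from the first training directory as an ASE calculator.
system, calc = setup_DeepPotential(atoms, model_path='iter_000000/01.train/training_1', model_name='frozen_model_1.pb')
print(system.get_potential_energy())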

setup_DeepPotential returns the ASE Atoms object with the DeepPotential calculator attached, together with the corresponding deepmd.calculator.DP instance.

Example: Evaluate a Frozen Model

from sparc.src.deepmd import evaluate_model_accuracy

model = "iter_000000/01.train/training_1/frozen_model_1.pb"
test  = "Dataset/validation_data"
evaluate_model_accuracy(model, test)

References

For more details on DeePMD-kit, visit: https://github.com/deepmodeling/deepmd-kit