DeePMD
Training a DeePMD model requires a working DeePMD-kit installation (the dp CLI must be available in $PATH; see the SPARC Installation Guide) and an input JSON file (input.json) compatible with the DeePMD-kit package.
Within each iter_xxxx directory, a 01.train folder is created. For each model specified by num_models, a subdirectory named training_n is created, in which the training is executed. The training configuration is taken from the user-defined input_file; see the DeepMD section for details.
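The directory layout described above can be sketched as follows. This is only an illustration, not SPARC's own code: the helper name make_training_dirs is hypothetical, and the six-digit iteration padding and numbering from 1 are assumptions inferred from paths such as iter_000000/01.train/training_1 used elsewhere on this page.

```python
from pathlib import Path


def make_training_dirs(root, iteration, num_models):
    """Create iter_xxxxxx/01.train/training_n for n = 1..num_models.

    Hypothetical helper: the zero padding and 1-based numbering are
    assumptions taken from the example paths on this page.
    """
    train_root = Path(root) / f"iter_{iteration:06d}" / "01.train"
    dirs = []
    for n in range(1, num_models + 1):
        training_dir = train_root / f"training_{n}"
        # parents=True also creates iter_xxxxxx and 01.train when missing
        training_dir.mkdir(parents=True, exist_ok=True)
        dirs.append(training_dir)
    return dirs
```

Calling make_training_dirs(".", 0, 4) would create training_1 through training_4 under iter_000000/01.train.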
SPARC edits the base input.json for each model so that the training data path stays the same across all models, while each training run is assigned a unique seed value. The following fields in the training configuration are updated:
Replace seed everywhere with a random value for each run,
Set type_map for the system-specific atom_types,
Set training_data.systems = [ "<datadir>/training_data" ],
Set validation_data.systems = [ "<datadir>/validation_data" ].
{
  "model": {
    "type_map": ["O", "H"]
  },
  "training": {
    "seed": 123456,
    "training_data": {
      "systems": []
    },
    "validation_data": {
      "systems": []
    }
  }
}
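The four updates listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not SPARC's actual implementation: the function names set_seeds and prepare_model_config are hypothetical, and the seed range is an arbitrary choice.

```python
import copy
import random


def set_seeds(node, rng):
    """Recursively replace every "seed" value in a nested dict/list, in place."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "seed":
                # Seed range is an assumption; any positive integer would do
                node[key] = rng.randrange(1, 2**31)
            else:
                set_seeds(value, rng)
    elif isinstance(node, list):
        for item in node:
            set_seeds(item, rng)


def prepare_model_config(base, atom_types, datadir, rng=None):
    """Return a per-model copy of the base config with the four updates applied."""
    if rng is None:
        rng = random.Random()
    cfg = copy.deepcopy(base)  # leave the user's base config untouched
    set_seeds(cfg, rng)
    cfg.setdefault("model", {})["type_map"] = list(atom_types)
    training = cfg.setdefault("training", {})
    training.setdefault("training_data", {})["systems"] = [f"{datadir}/training_data"]
    training.setdefault("validation_data", {})["systems"] = [f"{datadir}/validation_data"]
    return cfg
```

Each model would get its own call to prepare_model_config, so the data paths are identical across models while the seeds differ.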
Logs are written to deepmd_training.log. If more than four models are requested, a warning is logged. The function evaluate_model_accuracy evaluates each frozen model at the end of its training run.
Tip
Train multiple models to improve accuracy.
After training, each iter_xxxx/01.train/training_n directory should contain a frozen_model_n.pb file.
These models are then used in a Query-by-committee approach to find new candidates for labelling.
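Before the committee step, it can be useful to confirm that every expected frozen model is present. The check below is a sketch only: find_frozen_models is a hypothetical helper, and it assumes the training_n/frozen_model_n.pb naming described above with n starting at 1.

```python
from pathlib import Path


def find_frozen_models(train_dir, num_models):
    """Return (found, missing) lists of expected frozen_model_n.pb paths."""
    found, missing = [], []
    for n in range(1, num_models + 1):
        path = Path(train_dir) / f"training_{n}" / f"frozen_model_{n}.pb"
        # Sort each expected path into the matching list
        (found if path.is_file() else missing).append(path)
    return found, missing
```

A non-empty missing list suggests that one of the training runs did not finish; deepmd_training.log is the place to look first.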
Note
Query by committee (QbC): Identifies informative configurations by measuring the disagreement among an ensemble of models. This lets the model learn only what it needs, without wasting resources on redundant data. See also DeePMD-kit model deviation for more details.
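As a rough illustration of the disagreement measure, the sketch below computes the maximum force deviation over atoms, following the definition used by DeePMD-kit's model deviation: for each atom, the root of the ensemble-averaged squared distance between each model's force and the ensemble-mean force. The function names and the lower/upper selection window are assumptions for this example; this page does not state which thresholds SPARC applies.

```python
import math


def max_force_deviation(forces):
    """Maximum over atoms of the ensemble force deviation.

    forces[m][i] is the (fx, fy, fz) force on atom i predicted by model m.
    """
    n_models = len(forces)
    n_atoms = len(forces[0])
    worst = 0.0
    for i in range(n_atoms):
        # Ensemble-mean force on atom i
        mean = [sum(forces[m][i][k] for m in range(n_models)) / n_models
                for k in range(3)]
        # Mean squared distance of each model's force from that mean
        var = sum(
            sum((forces[m][i][k] - mean[k]) ** 2 for k in range(3))
            for m in range(n_models)
        ) / n_models
        worst = max(worst, math.sqrt(var))
    return worst


def is_candidate(deviation, lower, upper):
    """Hypothetical selection rule: uncertain enough, but not unphysical."""
    return lower <= deviation < upper
```

Frames where the models agree (small deviation) add little new information, which is why only frames above a lower bound would be sent for labelling.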
If candidates are found, a subdirectory dft_candidates is created under the 02.dft folder. It contains one POSCAR for each candidate.
>>> tree dft_candidates
├── 0001
│   └── POSCAR
├── 0002
│   └── POSCAR
├── 0003
│   └── POSCAR
├── 0004
│   └── POSCAR
└── 0005
    └── POSCAR
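To hand the candidates to a DFT step, the POSCAR files can be collected in folder order. The helper below is a hypothetical sketch that relies only on the numbered-folder layout shown above.

```python
from pathlib import Path


def list_candidate_poscars(dft_candidates):
    """Return POSCAR paths from numbered subfolders, sorted by folder name."""
    root = Path(dft_candidates)
    # Folders without a POSCAR are skipped rather than reported as errors
    return sorted(p for p in root.glob("[0-9]*/POSCAR") if p.is_file())
```

The length of the returned list gives the number of structures awaiting labelling in the current iteration.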
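Example: Set Up a DeepPotential Calculator
from ase.build import molecule
from sparc.src.deepmd import setup_DeepPotential

# Build a water molecule with a real geometry; Atoms("H2O") alone would place all atoms at the origin
atoms = molecule("H2O")
# Attach the frozen model from the first training directory as the calculator
system, calc = setup_DeepPotential(atoms, model_path='iter_000000/01.train/training_1', model_name='frozen_model_1.pb')
print(system.get_potential_energy())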
setup_DeepPotential returns the ASE Atoms object with the DeepPotential calculator attached, together with the corresponding deepmd.calculator.DP instance.
Example: Evaluate a Frozen Model
from sparc.src.deepmd import evaluate_model_accuracy
model = "iter_000000/01.train/training_1/frozen_model_1.pb"
test = "Dataset/validation_data"
evaluate_model_accuracy(model, test)
References
For more details on DeePMD-kit, visit: https://github.com/deepmodeling/deepmd-kit