.. _deepmd:

DeePMD
======

Training a DeePMD model requires a working DeePMD-kit installation (the ``dp`` CLI available in ``$PATH``, see :ref:`InstalltionGuide`) and an input ``json`` file (``input.json``) compatible with the DeePMD-kit package.

Within each ``iter_xxxx`` directory, a ``01.train`` folder is created. For each model specified by ``num_models``, a subdirectory named ``training_n`` is created, in which the training is executed. The training configuration is taken from the user-defined ``input_file``; see the :ref:`deepmd_section` section for details.

SPARC edits the base ``input.json`` for each model so that the training data path stays the same across all models, while assigning a **unique** ``seed`` value to each training run. The following fields in the training configuration are updated:

- Replace ``seed`` everywhere with a random value for each run,
- Set ``type_map`` from the system-specific ``atom_types``,
- Set ``training_data.systems = [ "/training_data" ]``,
- Set ``validation_data.systems = [ "/validation_data" ]``.

.. code-block:: json

   {
     "model": {
       "type_map": ["O", "H"]
     },
     "training": {
       "seed": 123456,
       "training_data": { "systems": [] },
       "validation_data": { "systems": [] }
     }
   }

Logs are written to ``deepmd_training.log``. If more than four models are requested, a warning is logged. The function ``evaluate_model_accuracy`` evaluates each frozen model at the end of its training run.

.. tip::

   Train multiple models to improve accuracy.

After training, each ``iter_xxxx/01.train/training_n`` directory should contain a ``frozen_model_n.pb`` file. These models are then used in a *Query-by-committee* approach to find new candidates for labelling.

.. note::

   **Query by committee (QbC)**: Identifies informative configurations by measuring the disagreement among an ensemble of models. This allows the model to learn only **what it needs to** without wasting resources on redundant data. See also the DeePMD-kit `model deviation <qbc_>`_ documentation for more details.

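
The disagreement measure behind QbC can be illustrated with a small NumPy sketch. It computes the maximum per-atom force deviation across an ensemble, following the definition given in the DeePMD-kit model deviation documentation. The function names ``max_force_deviation`` and ``select_candidates`` and the ``lower``/``upper`` thresholds are hypothetical and used here for illustration only; they are not part of the SPARC or DeePMD-kit API, and the selection rule is an assumption.

.. code-block:: python

   import numpy as np


   def max_force_deviation(forces):
       """Maximum per-atom force deviation for one configuration.

       ``forces`` has shape (n_models, n_atoms, 3): the forces predicted
       by each model of the ensemble for the same configuration.
       """
       forces = np.asarray(forces, dtype=float)
       mean_force = forces.mean(axis=0)  # ensemble average, (n_atoms, 3)
       # Squared norm of each model's deviation from the ensemble mean.
       sq_norm = np.sum((forces - mean_force) ** 2, axis=2)  # (n_models, n_atoms)
       per_atom_std = np.sqrt(sq_norm.mean(axis=0))  # (n_atoms,)
       return float(per_atom_std.max())


   def select_candidates(ensemble_forces, lower, upper):
       """Return indices of frames whose deviation lies in [lower, upper).

       Hypothetical selection rule: frames below ``lower`` are treated as
       already well described, frames at or above ``upper`` as unreliable.
       """
       return [
           index
           for index, forces in enumerate(ensemble_forces)
           if lower <= max_force_deviation(forces) < upper
       ]

Frames on which all models agree give a deviation close to zero and are skipped; only frames with a deviation inside the chosen window would be passed on for labelling.
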

If candidates are found, a subdirectory ``dft_candidates`` is created under the ``02.dft`` folder. It contains one ``POSCAR`` for each candidate.

.. code-block:: bash

   $ tree dft_candidates
   ├── 0001
   │   └── POSCAR
   ├── 0002
   │   └── POSCAR
   ├── 0003
   │   └── POSCAR
   ├── 0004
   │   └── POSCAR
   └── 0005
       └── POSCAR

.. (# change the path to sparc.src.deepmd later)

.. automodule:: sparc.src.deepmd
   :members: setup_DeepPotential
   :exclude-members:
   :undoc-members:
   :show-inheritance:

.. code-block:: python

   from ase.build import molecule

   from sparc.src.deepmd import setup_DeepPotential

   # Build a water molecule with a real geometry; Atoms("H2O") would
   # place all three atoms at the origin.
   atoms = molecule("H2O")

   # Attach the frozen model from the first training directory.
   system, calc = setup_DeepPotential(
       atoms,
       model_path='iter_000000/01.train/training_1',
       model_name='frozen_model_1.pb',
   )
   print(system.get_potential_energy())

``setup_DeepPotential`` returns the ASE ``Atoms`` object with the DeepPotential calculator attached, together with the corresponding ``deepmd.calculator.DP`` instance.

Example: Evaluate a Frozen Model
--------------------------------

.. code-block:: python

   from sparc.src.deepmd import evaluate_model_accuracy

   # Path to a frozen model and to the validation data set.
   model = "iter_000000/01.train/training_1/frozen_model_1.pb"
   test = "Dataset/validation_data"

   evaluate_model_accuracy(model, test)

.. automodule:: sparc.src.deepmd
   :members: evaluate_model_accuracy

References
----------

For more details on DeePMD-kit, visit: https://github.com/deepmodeling/deepmd-kit

.. _qbc: https://docs.deepmodeling.com/projects/deepmd/en/stable/test/model-deviation.html