Fine-Tuning vs. Training From Scratch
SPARC supports two MLIP training strategies. Choose based on how much DFT data you have and whether a suitable pre-trained foundation model exists.
Training from scratch (default)
SPARC trains num_models DeePMD models from random initialisation using
the accumulated dataset. Weights are updated entirely from your DFT data.
Use this when:
- No relevant pre-trained model exists for your chemical system
- You have a large DFT dataset (typically > 500 frames)
- Maximum control over the model architecture is required
Configuration: leave finetune.enabled set to false (the default) and
configure mlip_setup normally.
Fine-tuning a DeePMD universal model
Set finetune.enabled: true to initialise from a pre-trained DeePMD
foundation model (DPA-1, DPA-2, or DPA-3). This requires DeePMD-kit v3
with the PyTorch backend.
Fine-tuning typically converges with far fewer DFT calculations than training from scratch, often 50–200 frames instead of 500+.
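The requirement above (DeePMD-kit v3 with the PyTorch backend) can be checked before enabling fine-tuning. The following is a minimal sketch, not part of SPARC: the helper names `major_version` and `finetune_prerequisites_met` are hypothetical, and the distribution name `deepmd-kit` is assumed to match your installation. Finding an importable `torch` module is a necessary condition only; it does not prove that DeePMD-kit was installed with its PyTorch backend.

```python
from importlib import metadata, util


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '3.0.1'."""
    return int(version.lstrip("v").split(".")[0])


def finetune_prerequisites_met(deepmd_version: str, torch_available: bool) -> bool:
    """True if DeePMD-kit is v3 or newer and PyTorch is importable."""
    return major_version(deepmd_version) >= 3 and torch_available


if __name__ == "__main__":
    try:
        # Assumption: the package is installed under the name "deepmd-kit".
        installed = metadata.version("deepmd-kit")
    except metadata.PackageNotFoundError:
        installed = None

    has_torch = util.find_spec("torch") is not None

    if installed is None:
        print("DeePMD-kit is not installed")
    else:
        ok = finetune_prerequisites_met(installed, has_torch)
        print(f"deepmd-kit {installed}, torch found: {has_torch}, "
              f"prerequisites met: {ok}")
```

If the check fails, keep finetune.enabled at false and train from scratch.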
When to use fine-tuning
| Situation | Recommendation |
|---|---|
| Small dataset (< 300 frames) | Fine-tuning converges faster |
| Inorganic materials | DPA-3 |
| Organic / reactive systems | DPA-3 |
| Novel system, no related pre-trained model | Train from scratch |
| Using TF/TF2 DeePMD (v2.x) | Train from scratch (fine-tuning requires v3+) |
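The guidance in the table can be condensed into a small decision helper. This is an illustrative sketch only: `recommend_strategy` is a hypothetical function, not a SPARC API, and the thresholds (300 and 500 frames) are the rough figures quoted on this page rather than hard limits. The page gives no rule for datasets between those two figures, so the helper returns "either" there.

```python
def recommend_strategy(n_frames: int, pretrained_available: bool,
                       deepmd_major: int) -> str:
    """Return 'scratch', 'finetune', or 'either' for a training setup."""
    # Fine-tuning requires DeePMD-kit v3+, so v2.x (TF/TF2) trains from scratch.
    if deepmd_major < 3:
        return "scratch"
    # Without a related pre-trained model there is nothing to fine-tune.
    if not pretrained_available:
        return "scratch"
    # Small datasets benefit most from a pre-trained starting point.
    if n_frames < 300:
        return "finetune"
    # Large datasets are listed as a case for training from scratch.
    if n_frames > 500:
        return "scratch"
    # Between the two thresholds the guidance does not pick a side.
    return "either"


print(recommend_strategy(120, True, 3))  # prints "finetune"
```

Treat the result as a starting point; the choice of DPA model for a given class of system still follows the table.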
For full configuration options, see Fine-Tuning Universal Models.