Data repository for the publication: Kazuki Morita, Daniel W. Davies, Keith T. Butler, and Aron Walsh, "Modelling the dielectric constants of crystals using machine learning".
This work is composed of two main parts: machine learning and SHAP analysis. The overall aim is to extract physically and chemically meaningful trends from a trained machine learning model.
- build docker image
docker build --rm -t mlshap .
- run container (this will run jupyter lab directly)
docker run -it -p 8080:8080 --name mlshap mlshap
- port 8080 is used by default
Step 1: Machine learning
- Train a support vector regression (SVR) model
- Check the performance as done in the paper
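As a rough illustration of this step, the sketch below trains an SVR model and checks it on a held-out split. It is not the code from the notebooks: it assumes scikit-learn is available, uses synthetic stand-in data rather than the Materials Project dataset, and the hyperparameters (`C`, `epsilon`, the RBF kernel) and metrics are placeholder choices, not the values used in the paper.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (assumption): 200 samples, 4 features.
# This is NOT the dataset shipped with the repository.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * rng.normal(size=200)

# Hold out a quarter of the samples to check performance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# SVR is sensitive to feature scale, so standardise inside a pipeline.
# Hyperparameters here are illustrative placeholders.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
model.fit(X_train, y_train)

# Evaluate on the held-out samples.
y_pred = model.predict(X_test)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"MAE = {mae:.3f}, R^2 = {r2:.3f}")
```

Putting the scaler and the regressor in one pipeline keeps the scaling parameters fitted on the training split only, which avoids leaking information from the test split.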
Step 2: Shapley additive explanation analysis
- Analyse the model and its predictions using Shapley additive explanations (SHAP)
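To make clear what a SHAP analysis attributes to each feature, the sketch below computes exact Shapley values for a tiny toy function in pure Python. It does not call any SHAP library and is not the analysis from the notebooks: `shapley_values`, `toy_model`, and the all-zero baseline are hypothetical names and choices introduced only for illustration. Exact enumeration scales exponentially with the number of features, which is why practical SHAP tools rely on approximations.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to baseline.

    Features outside a coalition are set to their baseline value.
    """
    n = len(x)

    def value(subset):
        # Model output when only the features in `subset` take their real values.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = set(combo)
                # Marginal contribution of feature i to coalition s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical model: two linear terms plus one interaction term.
    return 2.0 * z[0] - 1.0 * z[1] + 0.5 * z[0] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction at `x` and the prediction at the baseline, and the interaction term is split equally between the two features involved.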
All the required datasets are in the dataset directory.
The notebooks and the dataset make use of many Python packages:
The Jupyter notebooks and the datasets follow the copyrights of the above packages.
For specific package versions, see the Dockerfile.
- The dataset was downloaded from Materials Project in March 2020 and may differ from the current version of the database.
- Docker should reproduce the results to a high degree, but they may differ slightly from those in the main paper.
- Many different libraries are used and I am not an expert in all of them: some of the code is probably far from elegant!