MATERIALS INFORMATICS
STUDENT NAME: PRAVEEN M
STUDENT ROLL NO: CB.SC.P2PHY20013
PROJECT GUIDE: Dr. M. DHARANI
INTRODUCTION
Materials informatics is a field of study that applies the principles of informatics to materials science and engineering to improve the understanding, use, selection, development, and discovery of materials. It is an emerging field whose goal is the high-speed and robust acquisition, management, analysis, and dissemination of diverse materials data, greatly reducing the time and risk required to develop, produce, and deploy a new material, which generally takes longer than 20 years.
OBJECTIVE
To extract data from materials database repositories, perform data analysis, and build a machine learning model that predicts the bulk modulus of a given material using a Composition-Based Feature Vector (CBFV).
METHODOLOGY
Extracting data: The data is accessed through the matminer interface, from "A complete copy of the Materials Project database", which consists of 83,989 compositions.
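A minimal sketch of this step, assuming matminer's load_dataset interface; the dataset key "mp_all_20181018" is an assumption and should be replaced with the repository entry actually used.

# Hypothetical retrieval of the Materials Project snapshot through matminer.
from matminer.datasets import get_available_datasets, load_dataset

print(get_available_datasets())           # inspect the available repository entries
df_raw = load_dataset("mp_all_20181018")  # assumed key for "A complete copy of the Materials Project database"
print(df_raw.shape)                       # expect on the order of ~84k compositions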
Loading and examining the data: Using Pandas, we read the dataset into a DataFrame, examine a few rows, and look at the data's basic statistics.
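A small sketch of the loading and inspection step, assuming the raw data has been saved to a CSV file (the filename is illustrative).

import pandas as pd

# Read the raw dataset into a DataFrame (filename is illustrative).
df = pd.read_csv("mp_bulk_modulus_raw.csv")

print(df.head())      # look at the first few rows
print(df.describe())  # basic statistics of the numeric columns
print(df.dtypes)      # column data types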
Data Cleaning: We use the built-in Pandas methods to check for NaN (missing) values in the dataset. After cleaning and processing the data, we save it to disk in a cleaned state for later use; Pandas allows us to save the data as a comma-separated value (.csv) file.
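A sketch of the cleaning step using built-in Pandas methods; the column names "formula" and "K_VRH" are assumptions.

# Count missing (NaN) values per column.
print(df.isna().sum())

# Keep only the columns needed for modeling and drop rows with missing targets.
# "formula" and "K_VRH" (bulk modulus) are assumed column names.
df_clean = df[["formula", "K_VRH"]].dropna()

# Optionally drop non-physical entries, e.g. negative bulk moduli.
df_clean = df_clean[df_clean["K_VRH"] > 0]

# Save the cleaned data as a comma-separated value file for later use.
df_clean.to_csv("bulk_modulus_cleaned.csv", index=False)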
Splitting data into train/validation/test datasets: It is important to split the full dataset into train/validation/test datasets and to reliably use the same datasets for the modeling tasks later. By saving these splits to files, we can reproducibly use the exact same splits during subsequent model training and comparison steps. Using the same datasets for all models ensures a fair comparison.
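A sketch of the splitting step with scikit-learn's train_test_split; the 70/20/10 proportions and the random seed are illustrative choices, not the exact splits used.

from sklearn.model_selection import train_test_split

# First carve out a held-out test set, then split the remainder into train/validation.
df_trainval, df_test = train_test_split(df_clean, test_size=0.10, random_state=42)
df_train, df_val = train_test_split(df_trainval, test_size=0.22, random_state=42)

# Save the exact splits so every model later sees identical data.
df_train.to_csv("train.csv", index=False)
df_val.to_csv("val.csv", index=False)
df_test.to_csv("test.csv", index=False)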
Data Featurization: We featurize the composition data using so-called composition-based feature vectors (CBFVs). This method represents a single chemical formula as one vector based on its constituent atoms' chemical properties.
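A sketch of the featurization step, assuming the CBFV package's composition.generate_features helper; the column names and the "oliynyk" element-property set are assumptions.

from CBFV import composition

# generate_features expects columns named "formula" and "target".
df_feat = df_train.rename(columns={"K_VRH": "target"})

# Each formula becomes one fixed-length vector built from its elements' properties.
X_train, y_train, formulae, skipped = composition.generate_features(
    df_feat, elem_prop="oliynyk")
print(X_train.shape)   # (n_samples, n_CBFV_features)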
Modeling using "classical" machine learning models: We implement several classical ML models from sklearn: Ridge regression, support vector machine, linear support vector machine, random forest, extra trees, adaptive boosting, gradient boosting, k-nearest neighbors, and dummy regression. Since our target variable is a continuous value (bulk modulus), this is a regression task.
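A sketch of the classical modeling step with scikit-learn regressors; hyperparameters are left at their defaults for brevity.

from sklearn.dummy import DummyRegressor
from sklearn.linear_model import Ridge
from sklearn.svm import SVR, LinearSVR
from sklearn.ensemble import (RandomForestRegressor, ExtraTreesRegressor,
                              AdaBoostRegressor, GradientBoostingRegressor)
from sklearn.neighbors import KNeighborsRegressor

# One instance of each classical regressor; default hyperparameters for brevity.
models = {
    "dummy": DummyRegressor(),
    "ridge": Ridge(),
    "svr": SVR(),
    "linear_svr": LinearSVR(),
    "random_forest": RandomForestRegressor(),
    "extra_trees": ExtraTreesRegressor(),
    "adaboost": AdaBoostRegressor(),
    "gradient_boosting": GradientBoostingRegressor(),
    "knn": KNeighborsRegressor(),
}

# Fit every model on the same featurized training data (regression on bulk modulus).
for name, model in models.items():
    model.fit(X_train, y_train)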
Evaluating model performance on the validation dataset: We use the same validation set to evaluate all models, which ensures a fair comparison, and we plot predicted vs. actual values using the predictions made by each trained model on that validation set. After the model is finalized, it can be re-trained on the combined train + validation data and evaluated once on the held-out test dataset to obtain an unbiased estimate of its performance. Because a single split can be misleading, the evaluation is repeated over several different splits and the average of all the scores is reported, as this gives a much more accurate estimate of how well the model actually performs.
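A sketch of the evaluation step on the shared validation set; X_val and y_val are assumed to have been featurized with the same CBFV scheme as the training data.

from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error
import numpy as np

# Score every fitted model on the same validation set for a fair comparison.
# Repeating this over several train/validation splits and averaging the scores
# gives a more reliable estimate of model performance.
for name, model in models.items():
    y_pred = model.predict(X_val)
    r2 = r2_score(y_val, y_pred)
    mae = mean_absolute_error(y_val, y_pred)
    rmse = np.sqrt(mean_squared_error(y_val, y_pred))
    print(f"{name:>18}  r2={r2:.3f}  MAE={mae:.2f}  RMSE={rmse:.2f}")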
Modeling using a neural network: We define a simple dense, fully connected neural network which we call DenseNet. The input layer of DenseNet accepts input with the dimension of each row of the input data, which is equal to the number of features in our CBFV featurization scheme. The output layer dimension of DenseNet is 1, because we want to predict one value (bulk modulus). Finally, we train the network, evaluate it on the validation dataset using val_loader, and plot the predicted vs. actual values.
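A minimal PyTorch sketch of a DenseNet-style fully connected network; the hidden-layer sizes, batch size, learning rate, and loss function are assumptions, not the exact architecture used in this work.

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class DenseNet(nn.Module):
    """Fully connected network: CBFV features in, one bulk-modulus value out."""
    def __init__(self, input_dim, hidden_dim=256):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(input_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1),   # output dimension 1: predicted bulk modulus
        )

    def forward(self, x):
        return self.layers(x)

def to_loader(X, y, shuffle):
    # Wrap the featurized split (DataFrame/Series) as float tensors.
    ds = TensorDataset(torch.tensor(X.values, dtype=torch.float32),
                       torch.tensor(y.values, dtype=torch.float32))
    return DataLoader(ds, batch_size=128, shuffle=shuffle)

train_loader = to_loader(X_train, y_train, shuffle=True)
val_loader = to_loader(X_val, y_val, shuffle=False)

# Input dimension equals the number of CBFV features per row.
model = DenseNet(input_dim=X_train.shape[1])
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(100):
    model.train()
    for xb, yb in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(xb).squeeze(-1), yb)
        loss.backward()
        optimizer.step()

# Validation pass: collect predictions for the predicted-vs-actual plot.
model.eval()
with torch.no_grad():
    preds = torch.cat([model(xb).squeeze(-1) for xb, _ in val_loader])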
RESULT AND ANALYSIS
The figures in this section of the poster show: the raw data from the matminer data repository; the data after cleaning; the classical ML models fitted with various parameters; the best fitted model after evaluation; the performance of the models over 10 different splits; and the evaluated DenseNet neural network model.
CONCLUSION
In this study, we were able to predict the bulk modulus of 83,989 compositions just by specifying the chemical formula, with the help of Composition-Based Feature Vectors (CBFVs).
We obtained an appropriate model (Extra Trees regression) and fitted it in just 42 seconds.
We also successfully implemented the DenseNet neural network model on the extracted data and evaluated its performance.
Finally, we have successfully created a linkage between the composition and the property to be predicted.
REFERENCE
Sterling G. Baird, Marianne Liu, Hasan M. Sayeed, and Taylor D. Sparks, "Data-Driven Materials Discovery and Synthesis Using Machine Learning Methods," Comprehensive Inorganic Chemistry III.
Taylor D. Sparks, "Inaugural Congress to Focus on Artificial Intelligence," JOM, 73, 3679-3680 (2021).
Debanshu Banerjee and Taylor D. Sparks, "Comparing Transfer Learning to Feature Optimization in Microstructure Classification," Computational Materials Science, 195, 110452 (2021).