MATERIAL         FEATURES              PROPERTY
TiO2 rutile      F11  F12  …  F1N      gap = 3.0 eV
C diamond        F21  F22  …  F2N      gap = 5.5 eV
…                …    …    …  …        …
PbTe rocksalt    FM1  FM2  …  FMN      gap = 0.3 eV
[Figure: matminer overview. Data retrieval from materials databases (MPDS, Citrine, Materials Project), data featurization, data visualization, and interfaces to Python ML libraries.]
Software tools for data-driven research and
their application to thermoelectrics materials discovery
Anubhav Jain, Energy Storage & Distributed Resources Division, Berkeley Lab
Poster sections: Software tools; Materials Project integration; Thermoelectrics discovery
Atomate is a library of
standardized workflows for
high-throughput computational
materials science. One can
provide a crystal structure and
select from >15 different types
of calculation procedures (e.g.,
band structure, elastic tensor,
thermal expansion, etc.) to
perform. Atomate scales to
millions of calculations at large
supercomputing centers and
produces a database of
organized results.
www.github.com/hackingmaterials/atomate
(production)
Matminer retrieves data from
the APIs of several large
databases and assists the user
with feature extraction,
implementing over 20 different
featurization classes that can
generate thousands of
physically-relevant materials
descriptors. Matminer includes
a built-in visualization package
and interfaces to standard
Python-based libraries for
machine learning.
www.github.com/hackingmaterials/matminer
(released)
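
The MATERIAL / FEATURES / PROPERTY layout that featurization produces can be sketched without the library itself. The toy code below is a hypothetical stand-in: the names `ELEMENT_DATA`, `parse_formula`, and `featurize` are illustrative and are not matminer's API. It turns a formula into a few composition-weighted statistics of atomic number and atomic mass, then pairs them with the band gaps listed in the table on this poster.

```python
import re

# Elemental data for the example rows only: (atomic number, atomic mass in u).
ELEMENT_DATA = {
    "Ti": (22, 47.867),
    "O": (8, 15.999),
    "C": (6, 12.011),
    "Pb": (82, 207.2),
    "Te": (52, 127.60),
}


def parse_formula(formula):
    """Parse a simple formula such as 'TiO2' into {'Ti': 1.0, 'O': 2.0}.

    Supports element symbols with optional numeric counts only
    (no parentheses, charges, or hydrates).
    """
    amounts = {}
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?", formula):
        amounts[symbol] = amounts.get(symbol, 0.0) + (float(count) if count else 1.0)
    return amounts


def featurize(formula):
    """Return a small dictionary of composition-based descriptors (illustrative)."""
    amounts = parse_formula(formula)
    total = sum(amounts.values())
    numbers = [ELEMENT_DATA[el][0] for el in amounts]
    return {
        "n_elements": len(amounts),
        "mean_Z": sum(n * ELEMENT_DATA[el][0] for el, n in amounts.items()) / total,
        "mean_mass": sum(n * ELEMENT_DATA[el][1] for el, n in amounts.items()) / total,
        "max_Z": max(numbers),
        "min_Z": min(numbers),
    }


# Formulas and band gaps (eV) taken from the example table on the poster.
rows = [("TiO2", 3.0), ("C", 5.5), ("PbTe", 0.3)]
table = [{"formula": f, **featurize(f), "gap_eV": gap} for f, gap in rows]
for row in table:
    print(row)
```

Each row of `table` is one material with its feature columns and target property, which is the shape a Python machine learning library expects as input.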
The AMSET project develops
and implements an ab initio
model for electron transport
(mobility, Seebeck) that
balances accuracy and
computational efficiency. It
achieves this by adapting
classical scattering equations
intended for single 1D
parabolic bands to complex,
anisotropic band structures
computed from density
functional theory methods.
www.github.com/hackingmaterials/amset
(in development)
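
The mobility figure on this poster plots separate curves labeled IMP, PIE, ACD, and POP alongside an "overall" curve. One common textbook way to combine independent scattering channels into an overall mobility is Matthiessen's rule, 1/mu_total = sum of 1/mu_i. The sketch below assumes that rule purely for illustration; the poster does not state how AMSET combines mechanisms, and the numerical values are hypothetical.

```python
def combine_mobilities(mobilities):
    """Combine per-mechanism mobilities with Matthiessen's rule.

    mobilities: dict mapping a mechanism label to its mobility limit
    (any consistent unit, e.g. cm2/V*s). Returns the overall mobility
    in the same unit.
    """
    if not mobilities:
        raise ValueError("at least one mechanism is required")
    if any(mu <= 0 for mu in mobilities.values()):
        raise ValueError("mobilities must be positive")
    return 1.0 / sum(1.0 / mu for mu in mobilities.values())


# Hypothetical per-mechanism limits in cm2/V*s (not data from the poster).
limits = {"IMP": 2.0e6, "PIE": 5.0e6, "ACD": 1.0e6, "POP": 1.0e4}
overall = combine_mobilities(limits)
print(f"overall mobility: {overall:.3e} cm2/V*s")
```

Under this rule the overall mobility is always below the smallest individual limit, so the strongest scattering mechanism dominates.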
The Materials Project produces
computationally-generated
materials data that is
disseminated to over 45,000
registered users. The software
tools developed in our project
are integrated with the
Materials Project. For example,
the atomate software will be
used by Materials Project to run
millions of simulations in the
upcoming year.
We have recently developed a set
of order parameters to characterize
the local environment of a site in a
structure. Order parameter
functions produce a value of 1 when
matching a target geometry (e.g.,
“tetrahedral”) and continuously
fade to zero as the neighbor
geometry deviates from the target.
The set of order parameters is
assembled into a “fingerprint” that
characterizes a site’s resemblance to
~20 coordination motifs. The site
fingerprints are then assembled into
a structure fingerprint.
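
The behavior described above (a value of 1 at the target geometry that fades continuously to zero) can be illustrated with a simplified order parameter. The sketch below is not the functional form used in this work; it is an assumed Gaussian penalty on the deviation of every neighbor-pair bond angle from the ideal tetrahedral angle, and the function names and the `width_deg` parameter are hypothetical.

```python
import math
from itertools import combinations

TETRAHEDRAL_ANGLE = math.acos(-1.0 / 3.0)  # about 109.47 degrees, in radians


def bond_angle(u, v):
    """Angle in radians between two neighbor vectors from the central site."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def tetrahedral_order_parameter(neighbor_vectors, width_deg=12.0):
    """Return a value in (0, 1]: 1 when all pair angles are tetrahedral.

    Simplification: only angles are scored; the coordination number and
    bond lengths are ignored, which a production order parameter would
    also need to consider.
    """
    if len(neighbor_vectors) < 2:
        return 0.0
    width = math.radians(width_deg)
    scores = [
        math.exp(-((bond_angle(u, v) - TETRAHEDRAL_ANGLE) ** 2) / (2 * width**2))
        for u, v in combinations(neighbor_vectors, 2)
    ]
    return sum(scores) / len(scores)


ideal_tetrahedron = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
square_planar = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)]
print(tetrahedral_order_parameter(ideal_tetrahedron))
print(tetrahedral_order_parameter(square_planar))
```

Evaluating one such function per coordination motif and collecting the results in a fixed order gives a site fingerprint vector.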
One application deployed on MP is
to compute “similarities” between
crystal structures in terms of
fingerprint distance, which is robust
even under distortion or alloying.
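
A minimal sketch of the similarity calculation follows. The poster says only that site fingerprints are "assembled" into a structure fingerprint and compared by "fingerprint distance"; the element-wise mean and the Euclidean distance used here are assumptions, and the three-motif fingerprints are hypothetical (the real fingerprints cover ~20 motifs).

```python
import math


def structure_fingerprint(site_fingerprints):
    """Average the site fingerprints element-wise (assumed aggregation)."""
    if not site_fingerprints:
        raise ValueError("at least one site fingerprint is required")
    length = len(site_fingerprints[0])
    if any(len(fp) != length for fp in site_fingerprints):
        raise ValueError("all site fingerprints must have the same length")
    n_sites = len(site_fingerprints)
    return [sum(fp[i] for fp in site_fingerprints) / n_sites for i in range(length)]


def fingerprint_distance(fp_a, fp_b):
    """Euclidean distance between two structure fingerprints (assumed metric)."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(fp_a, fp_b)))


# Hypothetical motif order: (tetrahedral, octahedral, bcc).
target = structure_fingerprint([[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]])
distorted = structure_fingerprint([[0.0, 0.05, 0.9], [0.0, 0.1, 0.85]])
different = structure_fingerprint([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])

print(fingerprint_distance(target, distorted))  # small: similar structure
print(fingerprint_distance(target, different))  # large: dissimilar structure
```

Because the order parameters vary continuously, a mild distortion shifts the fingerprint only slightly, which is why the distance remains small for distorted or alloyed variants of the same structure type.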
We used the atomate library to
compute the electron transport
properties of ~48,000 inorganic
materials using the BoltzTraP
method under a constant
relaxation time approximation.
This database (~300 GB) is
downloadable online through
the Dryad repository and is
used in our project as the basis
for identifying potential new
thermoelectric compositions.
One novel thermoelectric
composition identified from the
computational screening is p-type
YCuTe2, which exhibits
favorable band convergence
and a peak zT of ~0.75
(in line with computational
estimates). To our knowledge,
this is the highest experimental
zT from a composition first
suggested from computational
guidance. Another material,
TmAgTe2, was suggested but
its experimental zT (~0.35) was
limited by doping issues.
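
For reference, the figure of merit is zT = S^2 * sigma * T / kappa, where S is the Seebeck coefficient, sigma the electrical conductivity, T the absolute temperature, and kappa the total thermal conductivity. Under a constant relaxation time approximation, Boltzmann transport codes report conductivity per unit relaxation time, so a value of tau must be assumed to obtain sigma. The sketch below shows that arithmetic; the function names are illustrative and all input values are hypothetical, not results for YCuTe2 or TmAgTe2.

```python
def conductivity_from_crta(sigma_over_tau, tau):
    """Electrical conductivity (S/m) from sigma/tau (S/(m*s)) and tau (s)."""
    return sigma_over_tau * tau


def figure_of_merit(seebeck, sigma, kappa, temperature):
    """Thermoelectric figure of merit zT (dimensionless).

    seebeck: Seebeck coefficient in V/K
    sigma: electrical conductivity in S/m
    kappa: total thermal conductivity in W/(m*K)
    temperature: absolute temperature in K
    """
    if kappa <= 0 or temperature <= 0:
        raise ValueError("kappa and temperature must be positive")
    return seebeck**2 * sigma * temperature / kappa


# Hypothetical inputs chosen only to illustrate the calculation.
sigma = conductivity_from_crta(sigma_over_tau=5.0e18, tau=1.0e-14)  # about 5e4 S/m
zt = figure_of_merit(seebeck=200e-6, sigma=sigma, kappa=1.5, temperature=600.0)
print(f"zT = {zt:.2f}")
```

The quadratic dependence on S and the inverse dependence on kappa are why both favorable band structure and low thermal conductivity matter when screening candidates.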
[Figure: structure similarity search on the Materials Project. Target: W (www.materialsproject.org/materials/mp-91/); similar structures detected: Cs3Sb, TiGaFeCo, CeMg2Cu.]
[Figure: mobility (cm2/V*s) versus temperature (K), 200 to 1100 K, logarithmic mobility axis from 1.0E+02 to 1.0E+08. Curves: IMP, PIE, ACD, POP, overall, expt.]
[Figure legend: experiment, computation.]
We have also investigated
traditionally under-explored
chemistries such as phosphide
thermoelectrics. Contrary to
the belief that phosphides
should have high thermal
conductivity (κ), our
computations suggest that the
minimum κ of these
compounds can be quite low.
Experimental tests of cubic
NiP2 confirm low κ (~1.0
W/m*K) but also show poor
electronic properties.
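
The poster does not say which model was used to estimate the minimum κ. One widely used estimate is the high-temperature limit of the Cahill-Pohl minimum thermal conductivity, kappa_min = (1/2) * (pi/6)^(1/3) * k_B * n^(2/3) * (v_L + 2 * v_T), where n is the atomic number density and v_L, v_T are the longitudinal and transverse sound velocities. The sketch below assumes that model, and its inputs are hypothetical rather than values for NiP2.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K (exact SI value)


def cahill_minimum_kappa(number_density, v_longitudinal, v_transverse):
    """High-temperature Cahill-Pohl minimum thermal conductivity in W/(m*K).

    number_density: atoms per cubic meter
    v_longitudinal, v_transverse: sound velocities in m/s
    (one longitudinal and two transverse acoustic branches are assumed)
    """
    if number_density <= 0:
        raise ValueError("number_density must be positive")
    prefactor = 0.5 * (math.pi / 6.0) ** (1.0 / 3.0) * BOLTZMANN
    return prefactor * number_density ** (2.0 / 3.0) * (v_longitudinal + 2.0 * v_transverse)


# Hypothetical inputs: 5e28 atoms/m^3, v_L = 5000 m/s, v_T = 3000 m/s.
kappa_min = cahill_minimum_kappa(5.0e28, 5000.0, 3000.0)
print(f"kappa_min = {kappa_min:.2f} W/(m*K)")
```

In this model a low minimum κ follows from low sound velocities or a low atomic density, independent of the anion chemistry.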
Funding
Funding for this research was provided by the U.S.
Department of Energy, Basic Energy Sciences, Materials
Science Division through an Early Career Grant and through
the Materials Project center grant EDCBEE. Computing
resources were provided by the National Energy Research
Scientific Computing Center.
[Figure caption: Registered users over time (www.materialsproject.org)]
[Figure caption: Electron mobility of n-type GaAs (n_e = 3.0 * 10^13 cm^-3)]
