A picture is worth a thousand words: applying natural language processing tools for creating a quantum materials database map

Published online by Cambridge University Press:  07 October 2019

Vineeth Venugopal
Affiliation:
Department of Materials Design and Innovation, University at Buffalo, Buffalo, NY, USA
Scott R. Broderick
Affiliation:
Department of Materials Design and Innovation, University at Buffalo, Buffalo, NY, USA
Krishna Rajan*
Affiliation:
Department of Materials Design and Innovation, University at Buffalo, Buffalo, NY, USA
*Address all correspondence to Krishna Rajan at [email protected]

Abstract

This paper demonstrates the application of Natural Language Processing (NLP) tools to explore large libraries of documents and to establish heuristic associations between text descriptions in figure captions and interpretations of images and figures. Visualization tools based on NLP methods permit one to quickly assess the extent of the research described in the literature on a specific topic. The authors demonstrate that applying NLP methods to the figure captions alone, without having to navigate the entire text of a document, can provide an accelerated assessment of the literature in a given domain.
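
The abstract does not say which NLP technique is applied to the captions. As one illustrative possibility, the minimal sketch below assumes a TF-IDF weighting followed by cosine similarity, which is a common way to compare short texts such as figure captions. The caption strings, the whitespace tokenizer, and the function names are hypothetical and are not taken from the paper or its corpus.

```python
import math
from collections import Counter

# Hypothetical figure captions, invented for illustration only.
captions = [
    "Raman spectrum of graphene under pressure",
    "Raman spectrum of lithium niobate crystals",
    "Resistance versus temperature for a topological insulator",
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse {term: weight} TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of captions that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


vectors = tfidf_vectors(captions)
sim_raman_pair = cosine(vectors[0], vectors[1])  # captions sharing "raman spectrum of"
sim_unrelated = cosine(vectors[0], vectors[2])   # captions with no shared terms
print(sim_raman_pair > sim_unrelated)
```

Under these assumptions, captions that share vocabulary score higher than captions that do not, so a pairwise similarity matrix built this way could be passed to a dimensionality-reduction step to produce a two-dimensional map of the literature.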

Type
Artificial Intelligence Research Letters
Copyright
Copyright © Materials Research Society 2019
