Published online by Cambridge University Press: 14 March 2014
Twenty-five years ago, the desktop computer started becoming ubiquitous in the scientific lab. Researchers were delighted with its ability to both control instrumentation and acquire data on a single system, but they were not completely satisfied: there were often gaps in knowledge that they believed could be closed if they just had more data, acquired faster. Computer technology has evolved in keeping with Moore's Law, meeting those desires; however, those improvements have lately become both a boon and a bane for researchers. Computers are now capable of producing high-speed data streams containing terabytes of information, capabilities that evolved faster than was envisioned last century. Software to handle large scientific data sets has not kept up. How much information might be lost through accidental mismanagement, and how many discoveries might be missed through data overload, are now vital questions. An important new task in most scientific disciplines is to develop methods that address those issues and to create software that can handle large data sets with an eye toward scalability. Such software must produce archived, indexed, and searchable data from heterogeneous instrumentation in order to implement a strong data-driven materials development strategy. At the National Center for Photovoltaics in the National Renewable Energy Laboratory, we began developing a Laboratory Information Management System (LIMS) a few years ago, designed to handle lab-wide scientific data acquisition, management, processing, and mining needs for physics and materials science data, with a specific focus on future scalability for new equipment and research directions. We will present the decisions, processes, and problems we worked through while building our LIMS for materials research, its current operational state, and our plans for future development.
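The abstract does not describe an implementation, but as a rough illustration of what "archived, indexed, and searchable data from heterogeneous instrumentation" can mean in practice, the following minimal Python sketch copies raw instrument files into an archive and maintains a searchable metadata index. Every name in it (the measurements table, the ingest and search helpers, the instrument labels) is a hypothetical illustration under assumed conventions, not the actual design of the NREL system.

    # Hypothetical sketch: archive heterogeneous instrument files and keep a
    # searchable metadata index. Illustrative only; not the NREL implementation.
    import json
    import shutil
    import sqlite3
    import uuid
    from pathlib import Path

    ARCHIVE = Path("archive")  # where raw instrument files are copied
    ARCHIVE.mkdir(exist_ok=True)

    conn = sqlite3.connect("lims_index.db")
    conn.execute(
        """CREATE TABLE IF NOT EXISTS measurements (
               id TEXT PRIMARY KEY,  -- archive identifier
               instrument TEXT,      -- e.g. 'XRD', 'PL', 'Hall'
               sample_id TEXT,       -- lab-wide sample identifier
               acquired TEXT,        -- ISO-8601 timestamp
               path TEXT,            -- location of the archived raw file
               meta TEXT             -- JSON for instrument-specific fields
           )"""
    )

    def ingest(raw_file: Path, instrument: str, sample_id: str,
               acquired: str, extra: dict) -> str:
        """Copy a raw data file into the archive and index its metadata."""
        uid = uuid.uuid4().hex
        dest = ARCHIVE / f"{uid}{raw_file.suffix}"
        shutil.copy2(raw_file, dest)  # archive the raw bytes untouched
        conn.execute(
            "INSERT INTO measurements VALUES (?, ?, ?, ?, ?, ?)",
            (uid, instrument, sample_id, acquired, str(dest), json.dumps(extra)),
        )
        conn.commit()
        return uid

    def search(instrument: str | None = None, sample_id: str | None = None):
        """Look up archived measurements by instrument and/or sample."""
        clauses, args = [], []
        if instrument:
            clauses.append("instrument = ?")
            args.append(instrument)
        if sample_id:
            clauses.append("sample_id = ?")
            args.append(sample_id)
        where = " WHERE " + " AND ".join(clauses) if clauses else ""
        return conn.execute("SELECT * FROM measurements" + where, args).fetchall()

One plausible route to the scalability the abstract emphasizes is visible in this sketch: shared fields (sample, instrument, timestamp) are first-class indexed columns, while instrument-specific fields live in a free-form JSON column, so new instruments can be added without schema changes.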