The Spectra Online Database
Content
The spectral data in the Spectra Online database comes from a variety of sources. Many of the collections come from public domain sources such as other web sites. This data was typically found by users of Thermo Electron software and by our employees. Other sources of public domain data were suggested in posts to the sci.chem Usenet newsgroups and other public bulletin boards. Examples of collections of this type are the EPA Vapor Phase FT-IR Library, the EPA-AECD Gas Phase FT-IR Database of HAPs and the U.S.G.S. Spectral Library of Minerals.
Still other collections were either donated or are copyrighted collections that Thermo Electron was given special permission to post in the Spectra Online database. Collections in this category are the Notre Dame Organics Workbook Spectra, the McCreery Raman Library, the NIST Chemistry WebBook and the USDA Instrumentation Research Lab NIR Library.
In all cases, the compound name was supplied with the spectral data. Some collections also came with extra information about the material, such as the CAS Registry Number or the molecular formula. Whenever this type of information was available, it was used to construct the database. Where the CAS Registry Number was not supplied as part of the collection, Thermo Electron looked it up for all known "pure" compounds.
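Because CAS Registry Numbers that are looked up and entered by hand are easy to mistype, a check-digit test is a cheap safeguard. The following is a minimal sketch in Python of the standard CAS check-digit rule. It is not part of the original Spectra Online tooling, and the function name cas_is_valid is hypothetical.

```python
import re


def cas_is_valid(cas: str) -> bool:
    """Return True if `cas` is well formed and its check digit is correct.

    Hypothetical helper for illustration; not part of the original tooling.
    """
    # A CAS Registry Number has the form NNNNNNN-NN-N (2 to 7 leading digits).
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", cas)
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    # Weight the digits 1, 2, 3, ... starting from the rightmost one,
    # then compare the sum modulo 10 with the final check digit.
    total = sum(weight * int(d) for weight, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


print(cas_is_valid("7732-18-5"))  # water
print(cas_is_valid("7732-18-4"))  # same number with a wrong check digit
```

A test like this catches only transcription errors. It cannot confirm that a valid number belongs to the named compound.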
A few collections also included the molecular structures for the compounds. Wherever possible, these were used to complete the compound record information in the database. The NIST Chemistry WebBook came with a sizable collection of molecular structures, which accounts for the majority of the structures in the Spectra Online database. The public-domain National Cancer Institute (NCI) structure database was used to fill in additional structures. Although both of these structure collections are quite large, a number of compounds in the Spectra Online database still have no chemical structure assigned.
Compiling the Database
Before the Spectra Online database was built, the spectra in each collection were first translated into the SPC file format using GRAMS software (except for those collections already supplied in the SPC format). In addition, all the supplied compound information, the CAS Registry Numbers (looked up by Thermo Electron personnel if not supplied with the collection) and the names of the SPC files were entered into a Microsoft Access database, one for each collection.
A series of steps was required to populate the web server database (Microsoft SQL Server 7). These steps were carried out by a set of custom Visual Basic applications built on the Spectral Server software components:
1. The first VB application read the information from the individual collection databases. As the spectral data was copied over to the server database, the application checked the compound information for duplicate records by comparing the CAS numbers. It also checked the server database for missing compound information that was available in the collection database and filled it in where necessary. In addition, if the collection database had a different name for the compound than the server, the new name was entered as a synonym.
2. Another application was developed to copy the chemical structures (as MDL MOL files) into the database, matching them to the compound records by CAS number. This was run on the collections that were supplied with structures as well as on the NIST and NCI structure databases. The structures were "cleaned up" and output to GIF files using another application developed around the CambridgeSoft ChemOffice component toolkit.
3. Finally, another application was developed to generate the set of compressed spectral search libraries for all the FT-IR, Raman, UV-Vis, NIR and MS spectra in the database. The application automatically created a set of direct links to the spectral records in the primary server database to allow retrieval of the full resolution spectrum.
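The merge logic described in step 1 can be illustrated with a short sketch. The original applications were written in Visual Basic against Microsoft Access and SQL Server; the Python below is only an approximation under stated assumptions. The Compound class, the merge_collection function, the record field names (name, cas, formula, spc_file) and the fallback to a name-based key for records without a CAS number are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Compound:
    """One compound record in the (hypothetical) server table."""
    name: str
    cas: Optional[str] = None
    formula: Optional[str] = None
    synonyms: Set[str] = field(default_factory=set)
    spectra: List[str] = field(default_factory=list)


def merge_collection(server: Dict[str, Compound], collection: List[dict]) -> None:
    """Merge one collection's records into `server`, keyed by CAS number."""
    for rec in collection:
        # Assumption: records without a CAS number are keyed by their name.
        key = rec.get("cas") or "name:" + rec["name"].lower()
        existing = server.get(key)
        if existing is None:
            # First time this compound is seen: create a new record.
            server[key] = Compound(
                name=rec["name"],
                cas=rec.get("cas"),
                formula=rec.get("formula"),
                spectra=[rec["spc_file"]],
            )
            continue
        # Duplicate compound: attach the spectrum to the existing record.
        existing.spectra.append(rec["spc_file"])
        # Fill in information the server record is missing.
        if existing.formula is None and rec.get("formula"):
            existing.formula = rec["formula"]
        # A different name for the same compound is stored as a synonym.
        if rec["name"].lower() != existing.name.lower():
            existing.synonyms.add(rec["name"])


server: Dict[str, Compound] = {}
merge_collection(server, [{"name": "Ethanol", "cas": "64-17-5", "spc_file": "a.spc"}])
merge_collection(server, [{"name": "Ethyl alcohol", "cas": "64-17-5",
                           "formula": "C2H6O", "spc_file": "b.spc"}])
print(server["64-17-5"])
```

Keying on the CAS number is what lets a second collection add spectra, missing fields and synonyms to an existing compound instead of creating a duplicate record. The same key could be used to attach MOL-file structures as in step 2.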


©2012 Thermo Fisher Scientific, Inc. All rights reserved.