|
| Content |
|
| The spectral data in the Spectra Online
database has come from a variety of sources. Many of the
collections have come from public domain sources such as other web sites. This data was
typically found by users of Thermo Electron software as well as by our employees. Some
other sources of public domain data were found through suggestions posted in the sci.chem
Usenet newsgroups as well as on other public bulletin boards. Some examples of this type of
data collection are the EPA Vapor Phase FT-IR Library,
EPA-AECD Gas Phase FT-IR Database of HAPs and the U.S.G.S. Spectral Library of Minerals. |
|
| Still other collections were either
donated or are copyrighted collections that Thermo Electron was given special permission
to post in the Spectra Online database. Collections that fall into this category are
the Notre Dame Organics Workbook Spectra, McCreery Raman Library, NIST Chemistry WebBook and the USDA Instrumentation Research Lab NIR Library. |
|
| In all cases, the compound name was
supplied with the spectral data. Some collections also came with additional
information about the material, such as the CAS Registry Number or molecular formula.
Whenever this type of information was available, it was used to construct the database.
Thermo Electron also looked up the CAS Registry Number for all known "pure"
compounds where it was not supplied as part of the collection. |
|
| A few collections also included the
molecular structures for the compounds. Wherever possible, these were used to complete
the compound record information in the database. The NIST
Chemistry WebBook came with a sizable collection of molecular structures, which
represents a majority of the structures in the Spectra Online database. The public-domain National Cancer
Institute (NCI) structure database was used to fill in additional structures in the
database. Although these collections of structures are both quite large, there are still a
number of compounds in the Spectra Online database that have no chemical structure
assigned. |
|
| Compiling the Database |
|
| Before building the Spectra Online
database, the spectra in each collection were first translated into the SPC file format using GRAMS
software (except for those collections already supplied in the SPC format). In addition, all
the supplied compound information, the CAS Registry Numbers (looked up by Thermo Electron
personnel if not supplied with the collection) and the names of the SPC files were entered
into a Microsoft Access database, one for each collection. |
|
| A series of steps was required to
populate the web server database (Microsoft SQL Server 7). These were accomplished by
developing a set of custom Visual Basic applications that use the Spectral
Server software components: |
|
| 1. |
The first VB application
read the information from the individual collection databases. As the spectral data was
copied over to the server database, the application checked the compound information for
duplicate records by comparing the CAS numbers. It also checked the server database for
missing compound information that was available in the collection database and filled it
in where necessary. In addition, if the collection database had a different name for the
compound than the server, the new name was entered as a synonym. |
|
|
| 2. |
Another application was
developed to copy the chemical structures (as MDL MOL files) into the database, matching
them to the compound records by CAS number. This was run on the collections that were
supplied with structures as well as the NIST and NCI structure databases. The structures
were "cleaned up" and output to GIF files using another application developed
around the CambridgeSoft ChemOffice component toolkit. |
|
|
| 3. |
Finally, another application
was developed to generate the set of compressed spectral search libraries for all the FT-IR,
Raman, UV-Vis, NIR and MS spectra in the database. The application automatically created a
set of direct links to the spectral records in the primary server database to allow
retrieval of the full-resolution spectrum. |
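The merge logic described in step 1 (detect duplicates by CAS number, fill in missing compound information, and record differing names as synonyms) can be illustrated with a minimal sketch. This is not the original Visual Basic application and it does not use the Spectral Server components. It is a Python illustration in which the `Compound` class, the `merge_collection` function, the in-memory dictionary standing in for the server database, and the record keys (`cas`, `name`, `formula`, `spc_file`) are all hypothetical. It also assumes every record carries a CAS number and that names are compared case-insensitively, neither of which is stated above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Compound:
    """Hypothetical compound record; field names are illustrative only."""
    cas: str
    name: str
    formula: Optional[str] = None
    synonyms: set = field(default_factory=set)
    spectra: list = field(default_factory=list)  # names of SPC files


def merge_collection(server: dict, collection: list) -> None:
    """Merge one collection's records into `server`, keyed by CAS number."""
    for rec in collection:
        existing = server.get(rec["cas"])
        if existing is None:
            # First time this CAS number is seen: create the compound record.
            server[rec["cas"]] = Compound(
                cas=rec["cas"],
                name=rec["name"],
                formula=rec.get("formula"),
                spectra=[rec["spc_file"]],
            )
            continue
        # Duplicate compound: attach the spectrum to the existing record.
        existing.spectra.append(rec["spc_file"])
        # Fill in information the server record is missing.
        if existing.formula is None and rec.get("formula"):
            existing.formula = rec["formula"]
        # A different name for the same compound is kept as a synonym
        # (case-insensitive comparison is an assumption of this sketch).
        if rec["name"].lower() != existing.name.lower():
            existing.synonyms.add(rec["name"])
```

Calling `merge_collection` once per collection database, in any order, leaves one record per CAS number, with all spectra from all collections attached to it. The same CAS-keyed lookup is the natural way to attach structures in step 2.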
|
|
|
|
|