SpiderMass is a program for semantic chemical database generation, metabolite identification and de novo formula generation from mass spectrometry data.
The software is a re-implementation of the "Seven Golden Rules" by Kind & Fiehn (2007) with additional features.
Identification of metabolites from mass spectrometry data is challenging. Even perfectly accurate mass measurements are insufficient for the unequivocal identification of compounds unless additional information, such as the isotope pattern, is used. Kind and Fiehn developed the Seven Golden Rules ("7GR"; webpage, article), which enable the heuristic filtering of chemical formulas.
The "7GR" software runs from an Excel sheet and uses various libraries and helper applications, which limits its use to Windows platforms. Additionally, installing 7GR can be cumbersome. Therefore, SpiderMass was initially written as a re-implementation of the 7GR with cross-platform compatibility.
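To illustrate what heuristic filtering of formulas can look like, the following minimal sketch applies one rule type, an element-to-carbon ratio check. The function name `passes_ratio_check` and the limits in `RATIO_LIMITS` are assumptions made for this example; they are not taken from the SpiderMass source, and the numeric ranges should be verified against Kind & Fiehn (2007) before use.

```python
# Illustrative element-to-carbon ratio limits (assumed values for this sketch;
# verify against Kind & Fiehn, 2007 before relying on them).
RATIO_LIMITS = {"H": (0.2, 3.1), "N": (0.0, 1.3), "O": (0.0, 1.2)}


def passes_ratio_check(formula, limits=RATIO_LIMITS):
    """Return True if every element/carbon ratio lies within its limits.

    `formula` maps element symbols to atom counts, e.g. {"C": 6, "H": 12}.
    """
    carbons = formula.get("C", 0)
    if carbons == 0:
        # Ratios are undefined without carbon; this sketch rejects such formulas.
        return False
    for element, (low, high) in limits.items():
        ratio = formula.get(element, 0) / carbons
        if not low <= ratio <= high:
            return False
    return True


print(passes_ratio_check({"C": 6, "H": 12, "O": 6}))  # glucose
print(passes_ratio_check({"C": 2, "H": 20, "O": 1}))  # H/C ratio of 10
```

A real filter would combine several such rules (for example isotope pattern and valence checks) rather than rely on a single ratio test.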
Further, some improvements were made:
- Calculations are based on ions, not on neutral molecules => Better matching of high-quality data.
- Isotope-distribution fit algorithm that is independent of user input => More convenient and less error-prone.
- Inclusion of stable isotope calculations => Usability for metabolic flux data or other stable isotope experiments.
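The first point in the list above, calculating with ions instead of neutral molecules, can be made concrete with a short sketch. It computes the m/z of a singly protonated ion and accounts for the missing electron. The helper names `neutral_mass` and `protonated_mz` are hypothetical and only serve this example; SpiderMass itself may organize the calculation differently.

```python
# Monoisotopic masses in unified atomic mass units (u), rounded to 8 decimals.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
ELECTRON = 0.00054858  # electron mass in u


def neutral_mass(formula):
    """Monoisotopic mass of a neutral molecule given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


def protonated_mz(formula):
    """m/z of the singly charged [M+H]+ ion.

    Adding a hydrogen atom and removing one electron is equivalent to
    adding a proton.
    """
    return neutral_mass(formula) + MONO["H"] - ELECTRON


glucose = {"C": 6, "H": 12, "O": 6}
print(f"neutral: {neutral_mass(glucose):.4f}  [M+H]+: {protonated_mz(glucose):.4f}")
```

For an ion near m/z 181, ignoring the electron mass shifts the calculated value by roughly 3 ppm, which is relevant when matching high-resolution data.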
Automated Generation of Target Databases (using ChemSpider)
The use of target databases with a collection of expected metabolites greatly enhances the chance of correct hits (see Kind & Fiehn, 2007). However, creating such databases by hand can be tedious.
Therefore, we programmed a function that searches a given compound list (consisting of common names, CAS numbers, sum formulas, etc. in .txt format) for corresponding entries in the ChemSpider meta-database and creates a so-called SpiderMass target database. This semantic database generator is a distinctive feature of SpiderMass and drastically improves the speed and selectivity of MS database searches, even for data of modest quality.
SpiderMass databases and result tables are written as simple comma-separated values (.csv) files. This allows easy editing with text processing tools or editors.
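Because the databases are plain .csv files, they can also be queried with a few lines of script. The sketch below searches a small in-memory table for entries within a ppm tolerance of a measured m/z. The column names `name`, `formula` and `mz`, the function `search_targets` and the example rows are assumptions for illustration; the actual SpiderMass column layout is not described here and may differ.

```python
import csv
import io

# Hypothetical target database; real SpiderMass files may use other columns.
EXAMPLE_DB = (
    "name,formula,mz\n"
    "glucose,C6H12O6,181.0707\n"
    "alanine,C3H7NO2,90.0550\n"
)


def search_targets(csv_text, measured_mz, tol_ppm=5.0):
    """Return (name, ppm error) for all rows within tol_ppm of measured_mz."""
    hits = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        target_mz = float(row["mz"])
        ppm_error = (measured_mz - target_mz) / target_mz * 1e6
        if abs(ppm_error) <= tol_ppm:
            hits.append((row["name"], round(ppm_error, 2)))
    return hits


print(search_targets(EXAMPLE_DB, 181.0710))
```

To read a file from disk instead, the same loop can be run over `csv.DictReader(open(path, newline=""))`.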
Connection with Informatics Pipelines
Peak lists produced by mass spectrometry data processing pipelines (e.g. OpenMS/TOPPAS, MZmine) can be processed in batch mode.
Screenshot of SpiderMass DB Generator (small window) and Identifier (window below), running on MASSyPup:
The Graphical User Interface (GUI) and most mathematical, text and web functions were written in Python (v3). The Python 3 osa library is employed for SOAP requests. Time-critical operations, such as the generation of chemical formulas and the calculation of isotope patterns, are implemented in C/C++. The programs should run or compile on any modern computing platform. All parts are licensed as Open Source and may be modified as needed. SpiderMass was tested on Windows and Linux.
To use the ChemSpider API, a token is required, which can be obtained free of charge (academic users) after registration with RSC.
The source code is available on Bitbucket.
For end users: SpiderMass is also installed on MASSyPup. You just have to place your personal ChemSpider token into the file chemspider-token.txt (located in /usr/local/spidermass/main).
I also compiled a Windows installer of SpiderMass. It was generated and tested on 64-bit Windows 10.
Please read the Readme at https://bitbucket.org/lababi/spidermass/ :
- The default SpiderMass directory is 'USER/App/Roaming'. Other directories might cause trouble with access rights.
- Place a file named 'chemspider-token.txt' with your RSC ChemSpider token (free for academic users from http://www.chemspider.com/) into the same directory as spidermass.exe (should be 'USER/App/Roaming/SpiderMass'). You need this token for online database generation and direct searches in ChemSpider.
- Results are placed in the main SpiderMass directory as well ('SpiderMassResults.csv').
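If searches fail, a common first check is whether the token file is where the program expects it. The following sketch reads 'chemspider-token.txt' from a given directory and reports a clear error if it is missing or empty. The function `read_token` is hypothetical and not part of SpiderMass; only the file name is taken from the notes above.

```python
from pathlib import Path

TOKEN_FILENAME = "chemspider-token.txt"  # file name as given in the notes above


def read_token(directory):
    """Return the ChemSpider token stored in `directory`, without whitespace."""
    token_file = Path(directory) / TOKEN_FILENAME
    if not token_file.is_file():
        raise FileNotFoundError(f"Token file not found: {token_file}")
    token = token_file.read_text(encoding="utf-8").strip()
    if not token:
        raise ValueError(f"Token file is empty: {token_file}")
    return token


# Example call (the directory is a placeholder; adjust it to your installation):
# token = read_token("/usr/local/spidermass/main")
```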
I appreciate any comments about your experience with the program (see contact below).
Winkler R.: SpiderMass: Semantic database creation and tripartite metabolite identification strategy, Journal of Mass Spectrometry, 2015, 50(3), 538-541, http://onlinelibrary.wiley.com/doi/10.1002/jms.3559/abstract