Provide MassBank as SQLite database #181
Comments
Dear @jorainer, where does the MassBank SQL data come from? Is it a compilation of all spectra as provided on the MassBank web page? I am asking for the sake of confidence in my target identification.
Hey @YangjjMSresearch, you can find the massbank.sql at our GitHub site. It is available beginning with database version 2020.11. We usually announce new releases on Twitter or the MassBank Europe website, or you can watch the MassBank data repository. Best wishes,
@tsufz Noted with many thanks! I noticed that there are around 80000 spectra in the massbank.sql database. On the MoNA website, I see 196,159 spectra. Is this massbank.sql repository different from the MoNA repository? Best regards,
@YangjjMSresearch, I reviewed MassBank of North America (MoNA). They actually provide 73k MassBank Europe records and, in total, 175k GC and LC records. MoNA and MassBank hold different datasets. MassBank Europe records are only one part of MoNA, alongside GNPS, HMDB, ReSPECT and others. Thus, the contents of MassBank Europe and MoNA are different. I am also not sure about the update frequency of MoNA. The structure of the SQL files is also different: we provide the dump of our internal database, and MoNA provides a dump of theirs. @jorainer may be able to explain whether the MoNA SQL files are also usable. If you want to use MassBank Europe records only, please use our databases. They guarantee reproducibility, as we provide versioned releases. In addition, you may use the MoNA SQL files, which contain various other libraries, as appropriate. Best wishes,
I've never looked into the MoNA SQL files. I was not even aware that they provide their data as SQL. Maybe I will have a look into that someday.
Hi all, I still think it would be great to serve an SQLite database that can be used with RforMassSpectrometry directly; otherwise the users have to do the conversion themselves...
Agree @meowcat! Note that I have one pre-built database here: https://github.com/jorainer/SpectraTutorials/releases/tag/2021.03 . I've also included a super-simple short function to My other plan is (if I finally find the time) to create such MassBank SQLite versions (maybe as
Hi, I made a small converter that takes MassBank dumps and converts them to SQLite as well as mzVault formats. It runs in Docker and has no external requirements. https://github.com/meowcat/MassBank-convert Note that the mzVault converter collapses compound information by InChIKey. A further function, which isn't working well, collapses by 1D InChIKey, but it is deactivated in the config. Can we add that to some CI that generates "best-effort community-contributed conversions"?
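To illustrate what collapsing by InChIKey could look like, here is a minimal Python sketch. It is not the code of the converter linked above: the record layout, the function name `collapse_by_inchikey`, and the placeholder accessions and keys are all hypothetical, and it assumes that "1D InChIKey" means the first 14-character block of the key.

```python
from collections import defaultdict


def collapse_by_inchikey(records, first_block_only=False):
    """Group record accessions by InChIKey.

    With first_block_only=True, only the part before the first hyphen is
    used as the grouping key (assumed here to be the "1D InChIKey").
    """
    groups = defaultdict(list)
    for rec in records:
        key = rec["inchikey"]
        if first_block_only:
            key = key.split("-")[0]
        groups[key].append(rec["accession"])
    return dict(groups)


# Placeholder records: accessions and keys are made up for illustration.
records = [
    {"accession": "REC0001", "inchikey": "AAAAAAAAAAAAAA-BBBBBBBBBB-N"},
    {"accession": "REC0002", "inchikey": "AAAAAAAAAAAAAA-BBBBBBBBBB-N"},
    {"accession": "REC0003", "inchikey": "AAAAAAAAAAAAAA-CCCCCCCCCC-N"},
]

full = collapse_by_inchikey(records)
first = collapse_by_inchikey(records, first_block_only=True)
print(len(full), "groups by full key;", len(first), "group(s) by first block")
```

Grouping on the first block merges records that differ only in the later blocks of the key, so it yields fewer, larger groups than grouping on the full key.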
+1 on "best-effort community-contributed conversions", the OpenMS Team has an mzML converter mentioned in #31. Yours, Steffen |
For some users a MySQL database might be a little too demanding (see also sneumann/xcms#534). It would be nice to provide the MassBank data additionally as an SQLite database. This would be fairly straightforward:
Convert the MySQL dump to SQLite SQL calls using mysql2sqlite
The only problem is that the views are not correctly converted (actually, it seems mysql2sqlite ignores them completely). To fix that, one could simply add the views to the SQLite database afterwards from a views.sql file, which simply contains the create view... calls from the 01-init-massbank.sql script.
With this SQLite database it would be super-easy to use the MassBank data in R, so one would have full access to MassBank (cc @tsufz).
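The idea of adding the views after the conversion can be sketched with Python's standard `sqlite3` module. This is only an illustration under stated assumptions: the tables `record` and `compound`, the view `record_with_name`, and the inserted rows are hypothetical stand-ins, not the real MassBank schema, and view definitions written for MySQL may need adjusting before SQLite accepts them.

```python
import sqlite3

# Hypothetical stand-in for the converted database. The real tables come
# from the MySQL dump and are not reproduced here.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE compound (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE record (accession TEXT PRIMARY KEY, compound_id INTEGER, ms_level TEXT);
INSERT INTO compound VALUES (1, 'compound A');
INSERT INTO record VALUES ('REC0001', 1, 'MS2');
INSERT INTO record VALUES ('REC0002', 1, 'MS2');
""")

# Stand-in for the contents of views.sql; with a real file one would read
# it from disk, e.g. views_sql = open("views.sql").read().
views_sql = """
CREATE VIEW record_with_name AS
SELECT r.accession, r.ms_level, c.name
FROM record AS r JOIN compound AS c ON c.id = r.compound_id;
"""
con.executescript(views_sql)  # runs every statement in the script

# The views are now part of the database and can be queried like tables.
rows = con.execute(
    "SELECT accession, name FROM record_with_name ORDER BY accession"
).fetchall()
view_names = [
    row[0]
    for row in con.execute("SELECT name FROM sqlite_master WHERE type = 'view'")
]
print(rows)
print(view_names)
con.close()
```

With a database file on disk, the same two steps apply: open the file with `sqlite3.connect`, then pass the view definitions to `executescript`.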