Home
OpenVirus is a project that aims to develop knowledge resources and tools to help tackle the COVID-19 outbreak.
The world faces and will continue to face viral epidemics that arise suddenly. Scientific and medical knowledge is a critical resource for battling epidemics. Although over 100 billion USD is spent on medical research worldwide, much of the resulting knowledge is behind publisher paywalls and is available only to rich universities. Moreover, it is usually badly published and dispersed, without coherent knowledge tools. This particularly disadvantages the Global South.
This project aims to use modern tools, especially Wikidata (and Wikipedia), R, Java and text mining, together with semantic tools, to create a modern integrated resource of all current published information on viruses and their epidemics. It relies on collaboration and gifts of labour and knowledge.
The main documentation is on the Wiki (see the sidebar). https://github.com/petermr/openVirus/wiki/GETTING-STARTED will list some of the most important topics.
Take a look at the project README and the How Can I Help section of the FAQ.
Feel free to raise issues or ask questions on the project issue tracker.
We have 8 mini-projects. The details about each one of these can be found below:
Owner and Collaborator of the Mini-projects | Mini-Project | Dictionary |
---|---|---|
Ambreen H, Pooja Pareek, Ayush | miniproject: viral epidemics and country (What countries do viral epidemics occur in?) | Country Dictionary |
Priya, Dheeraj Kumar | miniproject: viral epidemics and disease (What diseases co-occur with epidemics? Not necessarily causation) | Disease Dictionary |
Pruthiv Rajan, Urja Biswas, Israel | miniproject: viral epidemics and drugs (What drugs are used during epidemics?) | Drug Dictionary |
Vaishali Arora, Simranleen Singh, Shweata Hegde | miniproject: viral epidemics and funders (Which funders support research on viral epidemics?) | Funders Dictionary |
Charles Li, Anugrah | miniproject: viral epidemics and non pharmaceutical interventions (What non-pharma interventions are used during epidemics?) | NPI Dictionary |
Kareena Singh, Jitu Ram Bhargav | miniproject: viral epidemics and viruses (What are the main viruses causing epidemics?) | Virus Dictionary |
Sana Saifi | miniproject: viral epidemics and zoonoses (What is the role of zoonosis, i.e., animal hosts?) | Zoonosis Dictionary |
Vanisha Arora, Om Prakash | miniproject: testing and tracing in viral epidemics (Who reports Test and Trace strategies?) | Testing and Tracing Dictionary |
- All the dictionaries are made available on our Dictionary GitHub page.
We have done a lot of work in Jupyter Notebooks. Follow the links below to find out more.
https://github.com/petermr/openVirus/tree/master/jupyter
https://github.com/petermr/openVirus/blob/master/Wikicite%20Presentations%20of%20presenters/Wikimedia_Hamadani_1.ipynb
https://github.com/petermr/openVirus/blob/master/Wikicite%20Presentations%20of%20presenters/Wikimedia_Hamadani_2.ipynb
- Country, Disease, Drug, Organization: these are generic to almost any biomedical project. We should continue to clean them up, maintain them and offer them to the world. They are clearly based directly on Wikidata.
- Human Virus, TestTrace, NPI, Zoonosis: these are specific to viruses and epidemics and are less well developed. They probably need more cleaning.
Action: Create a (meta-)dictionary project with a specification, testing/validation so that we know dictionaries are fit for purpose.
We also need:
(i) a micro-test corpus for testing code (locate within AMI, e.g. ZIKA10)
(ii) a tutorial/test corpus (e.g. 200 entries)
(iii) larger more specific corpora for "research"
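The dictionary validation action above could start from something as small as the following sketch. It is only an illustration: the entry layout (a Python dict with `term` and `wikidata` keys) and the function name `validate_dictionary` are assumptions made for this example, not the project's actual dictionary format or API.

```python
import re

# A Wikidata item ID is the letter Q followed by a positive integer, e.g. Q668.
WIKIDATA_ID = re.compile(r"^Q[1-9]\d*$")


def validate_dictionary(entries):
    """Return a list of human-readable problems found in the entries.

    `entries` is a list of dicts with a 'term' key and an optional
    'wikidata' key (an assumed layout, used here for illustration only).
    """
    problems = []
    seen = set()
    for i, entry in enumerate(entries):
        term = (entry.get("term") or "").strip()
        qid = entry.get("wikidata")
        if not term:
            problems.append(f"entry {i}: empty term")
            continue
        key = term.lower()  # treat terms differing only by case as duplicates
        if key in seen:
            problems.append(f"entry {i}: duplicate term '{term}'")
        seen.add(key)
        if qid is not None and not WIKIDATA_ID.match(qid):
            problems.append(f"entry {i}: malformed Wikidata ID '{qid}'")
    return problems


sample = [
    {"term": "India", "wikidata": "Q668"},
    {"term": "india", "wikidata": "Q668"},  # duplicate, differs only by case
    {"term": "Brazil", "wikidata": "155"},  # missing the leading Q
    {"term": "", "wikidata": "Q30"},        # empty term
]
for problem in validate_dictionary(sample):
    print(problem)
```

Checks like these are cheap to run on every dictionary after each edit, which is one way to decide whether a dictionary is "fit for purpose" before it is used on a corpus.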
We are basing the next phase on Python libraries within Notebooks. We believe that everyone can run simple Python calls to `numpy`, `pandas`, `nltk`, `matplotlib` and later `scikit-learn` (and maybe other tools - `word2vec` and `keras`). `ami-picocli` can be run from Notebooks (but needs installing). This allows us to extract sections from `Ctree` documents and do powerful exploratory work. Everyone can practice this. Exploration shows us what we might be able to do. However, science requires us to validate and test the results, and all community code must be reviewed, tested and validated.
We now have enough software-experienced people to start developing new software, initially through Python libraries. The software includes:
a. rewriting `getpapers` in Python to be more maintainable and extend to new sources
b. `ami-words` so we can do word frequencies from CProjects
c. `ami-search` for better search and more tools such as regex, abbreviations, etc.
(b. and c. will possibly be multilingual).
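To make the word-frequency and search ideas concrete, here is a minimal sketch using only the Python standard library. It is not the `ami-words` or `ami-search` implementation; the function names `word_frequencies` and `search_terms`, the tokenising rule and the sample text are all assumptions chosen for illustration.

```python
import re
from collections import Counter


def word_frequencies(text, stopwords=frozenset()):
    """Count lower-cased words (letters, optionally hyphenated) in the text."""
    words = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    return Counter(w for w in words if w not in stopwords)


def search_terms(text, terms):
    """Count case-insensitive whole-word occurrences of each dictionary term."""
    counts = {}
    for term in terms:
        # re.escape keeps terms containing regex metacharacters literal
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        counts[term] = len(pattern.findall(text))
    return counts


text = "Zika virus spread in Brazil. Brazil reported Zika cases in 2015."
freqs = word_frequencies(text, stopwords={"in"})
print(freqs.most_common(2))
print(search_terms(text, ["Zika", "Brazil", "India"]))
```

A real tool would read the text from the sections of a CProject and the terms from one of the dictionaries; the ASCII-only tokeniser here would also need replacing before any multilingual use.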
ALL software development should be testable (with Python unit tests) and should follow test-driven development (TDD).
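As a rough illustration of what test-driven development looks like in practice, the sketch below uses the standard `unittest` module. The helper `normalise_term` is hypothetical and exists only for this example; in TDD the three tests would be written first and the function filled in until they pass.

```python
import unittest


def normalise_term(term):
    """Lower-case a term and collapse runs of whitespace (hypothetical helper)."""
    return " ".join(term.split()).lower()


class NormaliseTermTest(unittest.TestCase):
    def test_lowercases(self):
        self.assertEqual(normalise_term("Zika Virus"), "zika virus")

    def test_collapses_whitespace(self):
        self.assertEqual(normalise_term("  contact   tracing "), "contact tracing")

    def test_empty_string(self):
        self.assertEqual(normalise_term(""), "")


if __name__ == "__main__":
    # argv and exit are set explicitly so the tests also run inside a Notebook cell
    unittest.main(argv=["ignored"], exit=False)
```

Keeping the tests next to the code in this way also gives reviewers from outside the miniproject something concrete to run.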
We expect there to be 4-5 software miniprojects.
- dictionaries/validation/maintenance
- `getpapers`
- `ami-words`
- `ami-search`
- Containerization using Docker
Every project must have:
- an Issue
- at least 2 people, one of whom should represent the users
- Every project should be reviewed/cross-validated by non-project members.
A new repository, `openvirusdev`, has been created for development work so that we can manage it better. A link to the `openvirusdev` GitHub page can be found here.