miniprojects
Each intern has their own project. The current ones are:
- countries (Ambreen) "what countries are viral epidemics reported in?" A possible outcome is a spreadsheet or a world map.
- diseases (Priya) "what diseases co-occur with epidemics?" Co-occurrence does not necessarily imply causation.
- viruses (Kareena) "what are the main viruses causing epidemics?"
- drugs (Rajan) "what drugs are used during epidemics?" Some may be antiviral, palliative, or antibacterial (against secondary infections).
- funders (Vaishali) "which funders support research on viral epidemics?"
All projects have an element of machine classification ("learning") and natural language processing (NLP). The main uses are:
- is this paper really/mainly about viral epidemics?
- does your concept (above) co-occur in the same sentence as the virus/disease, i.e. is it tightly coupled? For example, is "India" part of "virus in India", or is it unrelated (e.g. the reagent came from an Indian supplier)?
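The same-sentence check above can be sketched in a few lines of Python. This is only an illustration, not how ami does it: the function names, the naive sentence splitter and the example text are all invented here, and a real project would probably want a proper NLP sentence tokenizer.

```python
import re

def split_sentences(text):
    """Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def has_any(sentence, terms):
    """True if any term occurs in the sentence as a whole word, ignoring case."""
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", sentence, re.IGNORECASE)
        for term in terms
    )

def cooccurring_sentences(text, concept_terms, anchor_terms):
    """Return sentences that mention a concept term AND an anchor term."""
    return [
        s for s in split_sentences(text)
        if has_any(s, concept_terms) and has_any(s, anchor_terms)
    ]

# Invented example text, for illustration only.
text = (
    "Cases of the virus were reported in India. "
    "The reagent came from an Indian supplier."
)
hits = cooccurring_sentences(text, ["India"], ["virus", "epidemic"])
print(hits)
```

Only the first sentence is returned: the second mentions no virus term, and whole-word matching keeps "Indian" from counting as "India". Whether adjectives such as "Indian" should count is a per-project decision, probably best handled in the dictionary.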
The main packages will be:
- ami for sectioning in CProjects and dictionary searching.
- KNIME for workflow and analytical tools
- R for workflow and analytical tools
- Keras for machine learning
- Jupyter for logging and reusable scripts
You will use whatever you are most comfortable with; we are not forcing one-size-fits-all. However, there will be a need for converters. If you are ingesting from the CProject into (say) R, Keras or KNIME, let us know now, as I may need to write exporters. Structured formats such as XML or JSON are valuable because the consuming tool can often use XPath or similar to ingest the bits it wants. @clyde davies
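As a rough illustration of why structured formats help, here is a minimal Python sketch that pulls the paragraphs of one section out of an XML document using the XPath subset in the standard library. The element names (`article`, `sec`, `title`, `p`) and the sample content are assumptions made for this example; they are not a description of what ami actually writes into a CProject.

```python
import xml.etree.ElementTree as ET

# Hypothetical sectioned document; the real files may be structured differently.
SAMPLE_XML = """
<article>
  <sec id="s1"><title>Introduction</title><p>Background text.</p></sec>
  <sec id="s2"><title>Methods</title><p>First step.</p><p>Second step.</p></sec>
</article>
"""

def paragraphs_in_section(xml_text, section_title):
    """Return the text of every <p> directly inside sections with the given title."""
    root = ET.fromstring(xml_text.strip())
    paragraphs = []
    # ".//sec" is an XPath expression: every <sec> element at any depth.
    for sec in root.findall(".//sec"):
        if sec.findtext("title") == section_title:
            for p in sec.findall("p"):
                # itertext() also collects text nested in inline markup.
                paragraphs.append("".join(p.itertext()).strip())
    return paragraphs

methods = paragraphs_in_section(SAMPLE_XML, "Methods")
print(methods)
```

The resulting list of strings can be written out as CSV or JSON for R or KNIME, or passed straight to a tokenizer for Keras.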
WIKIDATA: I have struggled with lookup because there isn't a simple API (I may be out of date?). It used to work; I think the interface has changed. I got blocked by a mixture of lazy loading and mojibake.
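One possible route for lookup, offered as a sketch and not as a tested fix for the problem above, is the `wbsearchentities` action of the MediaWiki API at `https://www.wikidata.org/w/api.php`. The function names, the User-Agent string and the default limit below are invented for this example. The response bytes are decoded explicitly as UTF-8, which is one common way to avoid mojibake.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://www.wikidata.org/w/api.php"

def build_search_url(term, language="en", limit=5):
    """Build a wbsearchentities query URL for a search term."""
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "limit": limit,
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)

def parse_search_response(raw_bytes):
    """Decode the response as UTF-8 and return (id, label, description) tuples."""
    data = json.loads(raw_bytes.decode("utf-8"))
    return [
        (hit.get("id"), hit.get("label"), hit.get("description"))
        for hit in data.get("search", [])
    ]

def search_wikidata(term, language="en", limit=5):
    """Run the search over the network (needs internet access)."""
    request = Request(
        build_search_url(term, language, limit),
        # Hypothetical User-Agent; use one that identifies your own project.
        headers={"User-Agent": "miniprojects-lookup-sketch/0.1"},
    )
    with urlopen(request, timeout=30) as response:
        return parse_search_response(response.read())

# Building the URL needs no network access.
print(build_search_url("influenza"))
```

Calling `search_wikidata("influenza")` should return a short list of candidate entities, each with its Wikidata ID, label and description. Choosing the right candidate for a dictionary entry is still a manual or rule-based step.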
USE THE WIKI! We should use the Wiki (or GitHub Pages) for almost all project/software support. Email is really awful and Slack is not appropriate: it is difficult to find anything more than a few days old and there's no context. (I shall copy this Slack to the Wiki.)
- every project should have a wiki page.
- every piece of software should have a wiki page.
- every technique should have a wiki page.