MoleculeGPT: Dataset+Model+Unit tests+Example #9698
Comments
I would like to contribute to this paper. I have listed what needs to be done; some details still need discussion ^^.
Dataset
Model
Hey @xnuohz, sorry for the delay! I just had a quick look at the paper, and it looks like they haven't published the code or the dataset they curated for it. As a general goal, we should aim to reproduce the results from the paper by re-implementing the dataset, preprocessing, and model, with an example script. We can also discuss this in PyG Slack :) (cc'ing @puririshi98 for when he's back)
@xnuohz I think they follow the data preprocessing steps at https://github.com/chao1224/MoleculeSTM/tree/main/data, as described in section 3.2.
🚀 The feature, motivation and pitch
Paper: https://ai4d3.github.io/papers/34.pdf
Part of the community sprint #9694
The goal of this project is to reproduce the work done in MoleculeGPT while tying it as closely as possible to the existing GNN+LLM frameworks in PyG. We recommend using as many existing PyG features as possible. Additional features that you feel will be reusable for other workflows should be added to PyG. One-off functions that are specific to this workflow can be left inside the example.
Most of the effort will likely go into building a PyG dataset that matches the one described in the paper. At a high level, the dataset is a collection of Q+A pairs from the molecular domain, with the matching molecules as context. These Q+A pairs focus on molecular property prediction.
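To make the dataset layout concrete, here is a minimal sketch of how one sample might look, assuming each record pairs a molecule (as a SMILES string) with a single question and answer. The names `MoleculeQA` and `build_samples` and the example values are hypothetical and only illustrative; they are not taken from the paper or from PyG, and a real PyG dataset would likely store the molecule as a graph rather than a raw string.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MoleculeQA:
    """One sample: a molecule plus a question/answer pair about it (hypothetical layout)."""
    smiles: str       # molecule used as context, as a SMILES string
    instruction: str  # question about a molecular property
    response: str     # reference answer


def build_samples(smiles: str, qa_pairs: List[Tuple[str, str]]) -> List[MoleculeQA]:
    """Attach every Q+A pair to the molecule it describes."""
    return [MoleculeQA(smiles, question, answer) for question, answer in qa_pairs]


# Illustrative values only (ethanol); not drawn from the paper's dataset.
samples = build_samples(
    "CCO",
    [
        ("What is the molecular formula of this molecule?", "C2H6O"),
        ("How many heavy atoms does this molecule have?", "3"),
    ],
)
```

Keeping one Q+A pair per record means a molecule with several questions simply appears in several samples, which keeps batching straightforward.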
Alternatives
No response
Additional context
No response