-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add GLEM model, TAGDataset and example of GLEM #9662
base: master
Are you sure you want to change the base?
Conversation
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #9662 +/- ##
==========================================
- Coverage 87.54% 86.92% -0.62%
==========================================
Files 482 483 +1
Lines 31414 31585 +171
==========================================
- Hits 27501 27455 -46
- Misses 3913 4130 +217 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM just get CI green
@rusty1s @akihironitta ready for your reviews |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we have type annotations all over the PR? Also, I'd suggest splitting this PR into smaller ones.
# Add the parent directory to sys.path | ||
parent_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), '..')) | ||
sys.path.append(parent_dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this necessary?
|
||
## Run GLEM for getting SOTA result on ogbn-products dataset | ||
|
||
`python glem.py` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
## Run GLEM for getting SOTA result on ogbn-products dataset | |
`python glem.py` |
ext_pred_path = download_google_url( | ||
id='15sO2m7BeW7C1Upmdw3Cx1JS__6nxTAzY', | ||
folder='/work/users/junhaos/glem_data/ogbn_products/ext_preds', | ||
filename='giant_sagn_scr.pt', log=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use a relative path for other people to use.
pretrain_augmented = True | ||
|
||
seed_everything(42) | ||
from ogb.nodeproppred import PygNodePropPredDataset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Let's move the import statement at the start of the file.
if em_phase == 'gnn': | ||
gnn_test_acc = max(gnn_test_acc, final_test_acc) | ||
model.gnn = model.gnn.to('cpu', non_blocking=True) | ||
em_phase = 'lm' | ||
else: | ||
lm_test_acc = max(lm_test_acc, final_test_acc) | ||
model.lm = model.lm.to('cpu', non_blocking=True) | ||
em_phase = 'gnn' | ||
torch.cuda.empty_cache() | ||
print(f'Best GNN acc: {gnn_test_acc}, LM acc: {lm_test_acc}') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same comment as #9467 (comment), but we shouldn't pick the best metric evaluated on the test set at the end of every EM step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't had a look outside the example script yet, but this addition is exciting! 🚀
reopened #9591
Feature summary: