RpHGNN

Source code and dataset of the paper "Efficient Heterogeneous Graph Learning via Random Projection", which has been accepted by IEEE Transactions on Knowledge and Data Engineering (TKDE 2024).

Homepage and Paper

Requirements

  • Linux
  • Python 3.7
  • torch==1.12.1+cu113
  • torchmetrics==0.11.4
  • dgl==1.0.2+cu113
  • ogb==1.3.5
  • shortuuid==1.0.11
  • pandas==1.3.5
  • gensim==4.2.0
  • numpy==1.21.6
  • tqdm==4.64.1
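
Because these versions are pinned exactly, it can help to confirm that the active environment matches them before running anything. The following is a minimal sketch and is not part of the repository: the names PINNED, installed_version, and find_mismatches are hypothetical, and it assumes that the installed distribution names match the list above and that the CUDA builds report the +cu113 suffix in their version strings.

```python
"""Sketch: compare installed package versions with the pinned list above.

Not part of RpHGNN; every name in this file is an illustrative assumption.
"""

try:  # Python 3.8+
    from importlib.metadata import PackageNotFoundError, version as _dist_version
except ImportError:  # Python 3.7: fall back to setuptools' pkg_resources
    import pkg_resources

    PackageNotFoundError = pkg_resources.DistributionNotFound

    def _dist_version(name):
        return pkg_resources.get_distribution(name).version


# Versions copied from the Requirements list.
PINNED = {
    "torch": "1.12.1+cu113",
    "torchmetrics": "0.11.4",
    "dgl": "1.0.2+cu113",
    "ogb": "1.3.5",
    "shortuuid": "1.0.11",
    "pandas": "1.3.5",
    "gensim": "4.2.0",
    "numpy": "1.21.6",
    "tqdm": "4.64.1",
}


def installed_version(name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return _dist_version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Return {package: (wanted, found)} for every package whose version differs."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

Calling find_mismatches(PINNED) returns an empty dictionary when every pinned package is installed at exactly the listed version. The comparison is a plain string match, so a build that reports a different suffix would be flagged even if it is otherwise compatible.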

Download Preparation

For HGB datasets (ACM, DBLP, Freebase, and IMDB):

sh download_hgb_datasets.sh 

For OGBN-MAG, the code will automatically download the dataset via the ogb package.

For OAG-Venue and OAG-L1-Field, we follow NARS' data preparation instructions at https://github.com/facebookresearch/NARS/tree/main/oag_dataset. After generating the *.pk and *.npy files:

  • Place these files in the directory ./datasets/nars_academic_oag/.
  • Rename graph_field.pk to graph_L1.pk.
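
The two steps above can be scripted. The sketch below is an assumption about how one might do it, not code from the repository: stage_oag_files is a hypothetical helper, and src_dir stands for wherever the NARS preparation wrote its output. It copies rather than moves, so the generated files stay in place.

```python
"""Sketch: stage NARS-generated OAG files for RpHGNN (illustrative only)."""
import shutil
from pathlib import Path


def stage_oag_files(src_dir, dst_dir="./datasets/nars_academic_oag/"):
    """Copy the *.pk and *.npy files from src_dir into dst_dir.

    graph_field.pk is written under the name graph_L1.pk, as the steps
    above require. Returns the sorted list of file names written.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(src.iterdir()):
        if path.is_file() and path.suffix in {".pk", ".npy"}:
            # Apply the single rename required by the steps above.
            name = "graph_L1.pk" if path.name == "graph_field.pk" else path.name
            shutil.copy2(str(path), str(dst / name))
            written.append(name)
    return sorted(written)
```

The returned list makes it easy to confirm that graph_L1.pk is present in the target directory before starting a run.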

Run RpHGNN

You can run RpHGNN on each dataset with the corresponding command:

sh scripts/run_ACM.sh

sh scripts/run_DBLP.sh

sh scripts/run_Freebase.sh

sh scripts/run_IMDB.sh

sh scripts/run_OGBN-MAG.sh

sh scripts/run_OAG-Venue.sh

sh scripts/run_OAG-L1-Field.sh
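
To run several datasets back to back, the commands above can be driven from a small wrapper. This is a minimal sketch under stated assumptions: DATASETS, build_command, and run_all are hypothetical names, the script paths are derived from the file names listed above, and the wrapper is assumed to be started from the repository root.

```python
"""Sketch: run the per-dataset scripts one after another (illustrative only)."""
import subprocess

# Dataset names taken from the script file names listed above.
DATASETS = [
    "ACM",
    "DBLP",
    "Freebase",
    "IMDB",
    "OGBN-MAG",
    "OAG-Venue",
    "OAG-L1-Field",
]


def build_command(dataset):
    """Return the argv list for one dataset's run script."""
    return ["sh", "scripts/run_{}.sh".format(dataset)]


def run_all(datasets=DATASETS, runner=subprocess.run):
    """Run each script in order and return {dataset: exit code}."""
    codes = {}
    for dataset in datasets:
        # Each script runs to completion before the next one starts.
        codes[dataset] = runner(build_command(dataset)).returncode
    return codes
```

Calling run_all(["ACM", "DBLP"]) would run only those two scripts; a non-zero exit code in the returned dictionary marks a script that failed.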

Run RpHGNN for OGB Leaderboards (ogbn-mag)

To reproduce the results on the OGB Leaderboards (ogbn-mag), follow the steps below:

  • Preparing Pre-trained Embeddings (Optional):

    • If the cache/mag.p file (embeddings pre-trained via LINE [1]) does not exist, our code will automatically pre-train the embeddings and save them to that path.
    • Alternatively, if you'd prefer to skip the pre-training step, download the pre-trained embeddings mag.p directly from Google Drive and place it in the cache directory.
  • Execute the script:

    sh scripts/run_leaderboard_OGBN-MAG.sh

    This script will run the training and evaluation using random seeds from 0 to 9. The output for seed i will be saved in the file nohup_leaderboard_mag_i.out.
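
Before and after a leaderboard run, it may be useful to check whether the embedding cache is present and whether all ten output files were written. The sketch below only inspects file names described above; the helper names are hypothetical, it assumes the output files land in the directory the script was started from, and it does not parse the files because their format is not specified here.

```python
"""Sketch: check leaderboard prerequisites and outputs (illustrative only)."""
from pathlib import Path


def embeddings_cached(cache_dir="cache"):
    """Return True if cache/mag.p exists, so pre-training should not be needed."""
    return (Path(cache_dir) / "mag.p").is_file()


def expected_outputs(seeds=range(10)):
    """Return the output file names for the given seeds (0 to 9 by default)."""
    return ["nohup_leaderboard_mag_{}.out".format(seed) for seed in seeds]


def missing_outputs(run_dir=".", seeds=range(10)):
    """Return the expected output files that are not present in run_dir."""
    root = Path(run_dir)
    return [name for name in expected_outputs(seeds) if not (root / name).is_file()]
```

An empty list from missing_outputs() indicates that a file exists for every seed; it says nothing about whether each run finished successfully.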

References:

  • [1] Jian Tang, Meng Qu, Mingzhe Wang, Ming Zhang, Jun Yan, and Qiaozhu Mei. "LINE: Large-scale Information Network Embedding." In Proceedings of the 24th International Conference on World Wide Web, pp. 1067-1077. 2015.

Cite

If you use RpHGNN in a scientific publication, we would appreciate citations to the following paper:

@ARTICLE{10643347,
  author={Hu, Jun and Hooi, Bryan and He, Bingsheng},
  journal={IEEE Transactions on Knowledge and Data Engineering}, 
  title={Efficient Heterogeneous Graph Learning via Random Projection}, 
  year={2024},
  volume={},
  number={},
  pages={1-14},
  doi={10.1109/TKDE.2024.3434956}}

License: GPLv3

Copyright (c) 2023-2024 Xtra Computing Group, NUS, Singapore.
