This repository contains the source code for the paper arXiv:2403.04990. The primary script for training the models is `main.ipynb`, with supplementary modules available in the `source/` directory. Notebooks related to IBMQ are prefixed with `ibmq_`, and the training results obtained with simulators can be found in `training_result.ipynb`.
The primary training workflow relies on PyTorch and PennyLane. Python 3.9 or newer is required. To install the necessary packages, execute the following commands:
```bash
# Install using `setup.py`
pip install .

# Optional cleanup
rm -rf build QCGNN.egg-info
```
Note: Ensure that the PennyLane version is 0.31.0 or later due to an issue with `qml.qnn.TorchLayer` (see this discussion for more details).
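Since the version requirement is easy to overlook, it can be checked programmatically before training. The `meets_minimum` helper below is an illustrative sketch, not a utility shipped with the repository:

```python
from importlib.metadata import version, PackageNotFoundError

def meets_minimum(installed: str, required: str = "0.31.0") -> bool:
    """Compare dotted version strings numerically (ignores pre-release tags)."""
    as_tuple = lambda v: tuple(int(part) for part in v.split(".")[:3])
    return as_tuple(installed) >= as_tuple(required)

try:
    pl_version = version("pennylane")
    print(f"PennyLane {pl_version}: {'OK' if meets_minimum(pl_version) else 'too old'}")
except PackageNotFoundError:
    print("PennyLane is not installed")
```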
Most of the Python scripts are written in Jupyter Notebook format (`.ipynb`). If you prefer to run them as traditional Python scripts (`.py`), you can convert the notebooks using the `ipynb-py-convert` package:
```bash
# Convert a notebook to a Python script
pip install ipynb-py-convert
ipynb-py-convert some_notebook_file.ipynb to_py_file.py
python to_py_file.py
```
The feasibility of each model is demonstrated using two different Monte Carlo datasets, both containing particle flow information of fatjets (R = 0.8) initiated by different particles. The datasets can be downloaded from the following sources:
- JetNet Dataset: Used for the multi-class classification task, featuring fatjets with transverse momentum of roughly 1 TeV, originating from gluons, light quarks, top quarks, W bosons, and Z bosons. For more details, refer to arXiv:2106.11535. Place the downloaded `hdf5` files in the `dataset/jetnet` directory.
- Top Quark Tagging Dataset: Used for the binary classification task, with fatjet momentum in the range [550, 650] GeV. Place the downloaded `hdf5` files in the `dataset/top` directory, and rename `val.h5` to `valid.h5`.
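After downloading, the expected layout can be verified with a short script. This checker is a convenience sketch based on the directory names above, not a utility shipped with the repository:

```python
from pathlib import Path

# Expected layout (inferred from the download instructions above):
#   dataset/jetnet/*.hdf5  - JetNet multi-class dataset
#   dataset/top/valid.h5   - Top tagging validation split (renamed from val.h5)
def check_datasets(root: str = "dataset") -> list[str]:
    """Return a list of layout problems; an empty list means all checks passed."""
    base = Path(root)
    problems = []
    if not any((base / "jetnet").glob("*.hdf5")):
        problems.append("no .hdf5 files found in dataset/jetnet")
    if (base / "top" / "val.h5").exists():
        problems.append("rename dataset/top/val.h5 to valid.h5")
    if not (base / "top" / "valid.h5").exists():
        problems.append("dataset/top/valid.h5 is missing")
    return problems
```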
The workflow of MPGNN and QCGNN is given by the following figure. For details, see Section III B of the paper.
The QCGNN model is implemented using PennyLane within the PyTorch framework, and can be found in `source/models/qcgnn.py`. The classical benchmarking models include:
- Particle Flow Network (PFN): A message-passing-based complete graph neural network.
- Particle Transformer (ParT): A transformer-based model that incorporates interaction features as residual values in attention masks.
- Particle Network (PNet): A dynamic graph convolutional neural network (DGCNN), with edges defined in latent space.
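The common thread in these message-passing models is permutation invariance: per-particle features are embedded by a shared network, aggregated order-independently, and only then classified. A framework-free toy sketch of that idea follows; the tiny linear "networks" and the `pfn_forward` function are illustrative stand-ins, not the actual benchmark models:

```python
# Toy sketch of the permutation-invariant structure behind PFN-style models:
# embed each particle independently, sum the embeddings, then classify.
def pfn_forward(particles, embed_w, classify_w):
    """particles: list of per-particle feature lists.
    embed_w / classify_w: weight matrices given as lists of columns."""
    def linear(x, w):
        return [sum(xi * wij for xi, wij in zip(x, col)) for col in w]
    # Per-particle embedding with shared weights
    embedded = [linear(p, embed_w) for p in particles]
    # Permutation-invariant aggregation: elementwise sum over particles
    pooled = [sum(col) for col in zip(*embedded)]
    return linear(pooled, classify_w)
```

Because the aggregation is a plain sum, shuffling the input particles leaves the output unchanged, which is the property these jet-level classifiers rely on.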
Pretrained checkpoints and training logs for each model are available in the `training_logs` folder on Google Drive - QCGNN. We use Wandb for monitoring the training process, so the main training logs are stored in `training_logs/WandbLogger`. To run `training_result.ipynb`, ensure the `training_logs` directory is placed at the root of the cloned repository.
The results of experiments conducted on IBMQ quantum computers, including metrics such as AUC and accuracy, are stored in the `ibmq_result` folder on Google Drive - QCGNN. The calibration data of the QPUs we used are provided as CSV files in `ibmq_result`.
- `ibmq_result/time_complexity`: Output of `ibmq_time_complexity.ipynb`, which measures the time required to operate quantum gates.
- `ibmq_result/pretrained`: Output of `ibmq_pretrained.ipynb`, which evaluates the performance of the pretrained QCGNN (trained using simulators) on actual IBMQ quantum devices.
- `ibmq_result/noise`: Output of `ibmq_noise.ipynb`, detailing metrics when running QCGNN simulators with varying noise levels.
The Particle Transformer (ParT) and Particle Net (PNet) models were provided by You-Ying Li and Zheng-Gang Chen. They were adapted from Particle Transformer for Jet Tagging and ParticleNet: Jet Tagging via Particle Clouds, respectively.