GitHub - MarioniLab/sagenet: Spatial reconstruction of dissociated single-cell data

SageNet: Spatial reconstruction of dissociated single-cell datasets using graph neural networks

SageNet is a robust and generalizable graph neural network approach that probabilistically maps dissociated single cells from an scRNA-seq dataset to their hypothetical tissue of origin, using one or more reference datasets acquired by spatially resolved transcriptomics techniques. It is compatible with both high-plex imaging (e.g., seqFISH, MERFISH) and spatial barcoding (e.g., 10x Visium, Slide-seq) datasets as the spatial reference.

SageNet is implemented with PyTorch and PyTorch Geometric to be modular, fast, and scalable. It uses anndata so that it is compatible with scanpy and squidpy for pre- and post-processing steps.

Installation

Note

v1.0

The dependency torch-geometric should be installed separately, according to the specifics of your system; see this link for instructions. We recommend using Miniconda.

GitHub (currently recommended)

First, clone the repository using git:

git clone https://github.com/MarioniLab/sagenet

Then, cd to the sagenet folder and run the install command:

cd sagenet
python setup.py install #or pip install .

PyPI

The easiest way to get SageNet is through pip using the following command:

pip install sagenet
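
Because torch-geometric is installed separately, it can help to confirm that every dependency is importable before running the examples below. The following is a minimal sketch using only the standard library; the helper name `missing_packages` and the list of top-level module names are illustrative assumptions, not part of SageNet.

```python
import importlib.util

def missing_packages(module_names):
    """Return the top-level modules from module_names that cannot be found."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]

# Assumed top-level import names for SageNet and the packages used in this README.
required = ["sagenet", "torch", "torch_geometric", "scanpy", "squidpy", "anndata"]

missing = missing_packages(required)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All dependencies found.")
```

An empty result only shows that the modules can be located; it does not check that the installed torch-geometric build matches your PyTorch and CUDA versions.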

Usage

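import sagenet as sg
import scanpy as sc
import squidpy as sq
import anndata as ad
import numpy as np
import torch
import random
random.seed(10)
# Device used for training below; falls back to the CPU when no GPU is available.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')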

Training phase:

Input:

Expression matrix associated with the (spatial) reference dataset (an anndata object)

adata_r = sg.MGA_data.seqFISH1()

Gene-gene interaction network

from sagenet.utils import glasso
glasso(adata_r, [0.5, 0.75, 1])
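
For orientation, the graphical lasso in its standard form estimates a sparse precision matrix Θ from the empirical gene-gene covariance matrix S by solving the problem below; nonzero off-diagonal entries of Θ are then read as edges of the gene interaction network. That the list passed to glasso above holds candidate values of the penalty λ is an assumption here, not something this README states.

```latex
\hat{\Theta} = \arg\min_{\Theta \succ 0} \; \operatorname{tr}(S\Theta) - \log\det\Theta + \lambda \lVert \Theta \rVert_{1}
```

Larger values of λ push more entries of Θ to zero and therefore yield a sparser network.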

One or more partitionings of the spatial reference into distinct connected neighborhoods of cells or spots

adata_r.obsm['spatial'] = np.array(adata_r.obs[['x','y']])
sq.gr.spatial_neighbors(adata_r, coord_type="generic")
# Leiden clustering on the spatial neighbor graph at several resolutions;
# each resolution yields one partitioning, stored as 'leiden_<resolution>'.
for res in [0.01, 0.05, 0.1, 0.5, 1]:
    sc.tl.leiden(adata_r, resolution=res, random_state=0, key_added=f'leiden_{res}', adjacency=adata_r.obsp["spatial_connectivities"])
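
Since each partitioning is meant to consist of connected neighborhoods, a quick sanity check is to confirm that every partition forms a single connected piece of the spatial graph. The sketch below does this with a breadth-first search on toy data; the helper `is_connected`, the adjacency dictionary, and the labels are hypothetical stand-ins for `adata_r.obsp["spatial_connectivities"]` and the `leiden_*` columns, not SageNet functions.

```python
from collections import deque

def is_connected(members, neighbors):
    """Check that `members` form one connected piece of the graph `neighbors`."""
    members = set(members)
    if not members:
        return True
    start = next(iter(members))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            # Only walk along edges that stay inside the partition.
            if nxt in members and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen == members

# Toy spatial graph: a chain of five cells 0-1-2-3-4 (hypothetical data).
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
labels = {0: "a", 1: "a", 2: "b", 3: "b", 4: "b"}

# Group cells by their partition label.
partitions = {}
for cell, label in labels.items():
    partitions.setdefault(label, []).append(cell)

connected = {label: is_connected(cells, neighbors)
             for label, cells in partitions.items()}
print(connected)
```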

Training:

sg_obj = sg.sage.sage(device=device)
sg_obj.add_ref(adata_r, comm_columns=['leiden_0.01', 'leiden_0.05', 'leiden_0.1', 'leiden_0.5', 'leiden_1'], tag='seqFISH_ref', epochs=20, verbose = False)

Output:

A set of pre-trained models (one for each partitioning)

import os
os.makedirs('models/seqFISH_ref', exist_ok=True)
sg_obj.save_model_as_folder('models/seqFISH_ref')

A set of Spatially Informative Genes

from matplotlib.pyplot import rc_context
# Keep the genes whose importance value is at most 5.
ind = np.where(adata_r.var['ST_all_importance'] <= 5)[0]
SIGs = list(adata_r.var_names[ind])
with rc_context({'figure.figsize': (4, 4)}):
    sc.pl.spatial(adata_r, color=SIGs, ncols=4, spot_size=0.03, legend_loc=None)

Mapping phase

Input:

Expression matrix associated with the (dissociated) query dataset (an anndata object)

adata_q = sg.MGA_data.scRNAseq()

Mapping:

sg_obj.map_query(adata_q)

Output:

The reconstructed cell-cell spatial distance matrix

adata_q.obsm['dist_map']
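
One way to use such a pairwise distance matrix is to look up, for each cell, its closest cells in the reconstructed space. The following is a minimal pure-Python sketch on a toy matrix; the function `k_nearest` and the values in `toy_dist` are hypothetical and not part of SageNet.

```python
def k_nearest(dist, k):
    """For each row of a square distance matrix, return the indices of the
    k closest other cells, nearest first."""
    result = []
    for i, row in enumerate(dist):
        # Rank all other cells by their distance to cell i.
        order = sorted((j for j in range(len(row)) if j != i),
                       key=lambda j: row[j])
        result.append(order[:k])
    return result

# Toy 4-cell distance matrix (hypothetical values, symmetric, zero diagonal).
toy_dist = [
    [0.0, 1.0, 4.0, 3.0],
    [1.0, 0.0, 2.0, 5.0],
    [4.0, 2.0, 0.0, 1.5],
    [3.0, 5.0, 1.5, 0.0],
]
print(k_nearest(toy_dist, 2))  # [[1, 3], [0, 2], [3, 1], [2, 0]]
```

The function relies only on row indexing, so it should also accept a dense square array, assuming that is the form in which the distance matrix is stored.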

A consensus scoring of mappability (uncertainty of mapping) of each cell to the references

adata_q.obs
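
This README does not name the columns that hold these scores or say how they are computed. As general background, a common way to quantify the uncertainty of a probabilistic assignment is the Shannon entropy of the predicted probabilities; whether SageNet's score is computed exactly this way is an assumption. The sketch below uses a hypothetical helper `normalized_entropy`, which returns 0 for a fully confident prediction and 1 for a uniform one.

```python
import math

def normalized_entropy(probs):
    """Shannon entropy of a probability vector, scaled to [0, 1]."""
    if len(probs) < 2:
        return 0.0
    # Zero-probability classes contribute nothing to the entropy.
    entropy = sum(-p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(len(probs))

# Hypothetical predicted probabilities over four spatial partitions.
confident = normalized_entropy([1.0, 0.0, 0.0, 0.0])
uncertain = normalized_entropy([0.25, 0.25, 0.25, 0.25])
print(confident, uncertain)
```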

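# Wrap the reconstructed distance matrix in its own AnnData object.
dist_adata = ad.AnnData(adata_q.obsm['dist_map'], obs=adata_q.obs)
n_neighbors = 50  # change to the number of neighbors you plan to use
# Note: the two scanpy helpers below are version-dependent and may be unavailable in recent releases.
knn_indices, knn_dists, forest = sc.neighbors.compute_neighbors_umap(dist_adata.X, n_neighbors=n_neighbors, metric='precomputed')
dist_adata.obsp['distances'], dist_adata.obsp['connectivities'] = sc.neighbors._compute_connectivities_umap(
    knn_indices,
    knn_dists,
    dist_adata.shape[0],
    n_neighbors,
)
sc.pp.neighbors(dist_adata, metric='precomputed', use_rep='X')
sc.tl.umap(dist_adata)
# celltype_colours is a user-supplied palette for the 'cell_type' categories; it is not defined in this README.
sc.pl.umap(dist_adata, color='cell_type', palette=celltype_colours)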

Notebooks

To see some examples of what the pipeline can do, look at the notebooks directory. The notebooks are also available on Google Colab:

Interactive examples

Spatial mapping of the mouse gastrulation atlas

Support and contribute

If you have a question, or a new architecture or model that could be integrated into our pipeline, you can post an issue or reach us by email.

Contributions

This work is led by Elyas Heidari and Shila Ghazanfar as a joint effort between MarioniLab@CRUK@EMBL-EBI and RobinsonLab@UZH.
