how to get log(TPM+1) values #44
Comments
Thanks for reporting, Shixiang will reply as soon as possible :)
Hi, for simple datasets, you can find the count data in the GDC hub and transform it into TPM.
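A minimal sketch of the count-to-TPM transformation mentioned above, in base R. It assumes the GDC hub `htseq_counts` values are stored as log2(count + 1) (please check the unit on the dataset page) and that you supply a per-gene length vector yourself; the function name, `toy_log2_counts`, and `toy_lengths` are hypothetical and are not part of UCSCXenaTools.

```r
# Hypothetical helper: genes x samples matrix of log2(count + 1) -> log2(TPM + 1).
# Assumption: gene_length_bp holds one length (in base pairs) per row, in row order.
xena_log2_counts_to_log2_tpm <- function(log2_counts, gene_length_bp) {
  stopifnot(nrow(log2_counts) == length(gene_length_bp),
            all(gene_length_bp > 0))
  # Undo the assumed log2(count + 1) transform; clamp tiny negative rounding errors.
  counts <- pmax(2^log2_counts - 1, 0)
  # Reads per kilobase: the length vector is recycled down each column,
  # so every row is divided by its own gene length.
  rate <- counts / (gene_length_bp / 1000)
  # Scale each sample (column) so that its values sum to one million.
  tpm <- t(t(rate) / colSums(rate)) * 1e6
  log2(tpm + 1)
}

# Toy data: 3 genes x 2 samples, made up for illustration only.
toy_log2_counts <- log2(matrix(c(10, 20, 30, 0, 5, 15), nrow = 3) + 1)
toy_lengths <- c(1000, 2000, 3000)
toy_result <- xena_log2_counts_to_log2_tpm(toy_log2_counts, toy_lengths)
```

Gene lengths are not shipped with the count matrix, so they have to come from the same annotation that was used for counting; a mismatched annotation would give misleading TPM values.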
Thank you for this, but it seems that I cannot download the counts:
library(UCSCXenaTools)
XE <- XenaGenerate(subset = XenaHostNames == "gdcHub")
XE %>% XenaFilter(filterDatasets = "clinical") -> XE_clinical
XE %>% XenaFilter(filterDatasets = "htseq_counts") -> XE_rna_counts
#download gdc
#download clinical information, this one works
XE_clinical.query <- XenaQuery(XE_clinical)
XE_clinical.download <- XenaDownload(XE_clinical.query,
destdir = "UCSC_Xena/TCGA/counts_Clinical", trans_slash = TRUE, force = TRUE
)
#try to download the counts
XE_rna_counts.query <- XenaQuery(XE_rna_counts)
XE_rna_counts.download <- XenaDownload(XE_rna_counts.query,
destdir = "UCSC_Xena/TCGA/counts_RNAseq", trans_slash = TRUE
)
if (!dir.exists("UCSC_Xena")) {
XE_clinical.query <- XenaQuery(XE_clinical)
XE_clinical.download <- XenaDownload(XE_clinical.query,
destdir = "UCSC_Xena/TCGA/counts_Clinical", trans_slash = TRUE
)
XE_rna_pancan.query <- XenaQuery(XE_rna_pancan)
XE_rna_pancan.download <- XenaDownload(XE_rna_pancan.query,
destdir = "UCSC_Xena/TCGA/counts_RNAseq", trans_slash = TRUE
)
}
Downloading of all the GDC counts fails:
Downloading TCGA-LAML.htseq_counts.tsv.gz
trying URL 'https://gdc.xenahubs.net/download/TCGA-LAML.htseq_counts.tsv.gz'
==> Trying #2
trying URL 'https://gdc.xenahubs.net/download/TCGA-LAML.htseq_counts.tsv.gz'
==> Trying #3
trying URL 'https://gdc.xenahubs.net/download/TCGA-LAML.htseq_counts.tsv.gz'
Tried 3 times but failed, please check your internet connection!
This is what the query looks like:
> head(XE_rna_pancan.download)
hosts datasets
1 https://gdc.xenahubs.net TCGA-BLCA.htseq_counts.tsv
2 https://gdc.xenahubs.net TCGA-LUSC.htseq_counts.tsv
3 https://gdc.xenahubs.net TCGA-ESCA.htseq_counts.tsv
4 https://gdc.xenahubs.net TARGET-RT.htseq_counts.tsv
5 https://gdc.xenahubs.net MMRF-COMMPASS.htseq_counts.tsv
6 https://gdc.xenahubs.net TCGA-MESO.htseq_counts.tsv
url fileNames
1 https://gdc.xenahubs.net/download/TCGA-BLCA.htseq_counts.tsv.gz TCGA-BLCA.htseq_counts.tsv.gz
2 https://gdc.xenahubs.net/download/TCGA-LUSC.htseq_counts.tsv.gz TCGA-LUSC.htseq_counts.tsv.gz
3 https://gdc.xenahubs.net/download/TCGA-ESCA.htseq_counts.tsv.gz TCGA-ESCA.htseq_counts.tsv.gz
4 https://gdc.xenahubs.net/download/TARGET-RT.htseq_counts.tsv.gz TARGET-RT.htseq_counts.tsv.gz
5 https://gdc.xenahubs.net/download/MMRF-COMMPASS.htseq_counts.tsv.gz MMRF-COMMPASS.htseq_counts.tsv.gz
6 https://gdc.xenahubs.net/download/TCGA-MESO.htseq_counts.tsv.gz TCGA-MESO.htseq_counts.tsv.gz
destfiles
1 UCSC_Xena/TCGA/counts_RNAseq/TCGA-BLCA.htseq_counts.tsv.gz
2 UCSC_Xena/TCGA/counts_RNAseq/TCGA-LUSC.htseq_counts.tsv.gz
3 UCSC_Xena/TCGA/counts_RNAseq/TCGA-ESCA.htseq_counts.tsv.gz
4 UCSC_Xena/TCGA/counts_RNAseq/TARGET-RT.htseq_counts.tsv.gz
5 UCSC_Xena/TCGA/counts_RNAseq/MMRF-COMMPASS.htseq_counts.tsv.gz
6 UCSC_Xena/TCGA/counts_RNAseq/TCGA-MESO.htseq_counts.tsv.gz
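One way to check whether a single URL from the query table can be fetched at all is to call base R's `download.file()` directly with a longer timeout (R's default `timeout` option is 60 seconds, which can be too short for large files). This is only a diagnostic sketch: whether a timeout is the actual cause of the failure above is an assumption, and `download_with_timeout` is a hypothetical helper, not a UCSCXenaTools function.

```r
# Hypothetical helper: download one file with a temporarily raised timeout.
# Returns TRUE on success and FALSE on failure.
download_with_timeout <- function(url, destfile, timeout_sec = 600) {
  old <- options(timeout = timeout_sec)   # raise the timeout for this call only
  on.exit(options(old), add = TRUE)       # restore the previous setting on exit
  dir.create(dirname(destfile), recursive = TRUE, showWarnings = FALSE)
  status <- tryCatch(
    utils::download.file(url, destfile, mode = "wb", quiet = TRUE),
    error = function(e) {
      message("Download failed: ", conditionMessage(e))
      1L
    }
  )
  identical(as.integer(status), 0L)       # download.file() returns 0 on success
}

# Example call (needs network access), using a URL from the output above:
# download_with_timeout(
#   "https://gdc.xenahubs.net/download/TCGA-LAML.htseq_counts.tsv.gz",
#   "UCSC_Xena/TCGA/counts_RNAseq/TCGA-LAML.htseq_counts.tsv.gz"
# )
```

If this succeeds where `XenaDownload()` fails, the problem is more likely in the package version than in the connection.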
How can I get those values using the Xena tools? https://xenabrowser.net/datapages/?dataset=tcga_RSEM_gene_tpm&host=https%3A%2F%2Ftoil.xenahubs.net&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443
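If the linked `tcga_RSEM_gene_tpm` dataset is used, note that its values are, as far as I know, stored as log2(TPM + 0.001) rather than log2(TPM + 1); please confirm the unit on the dataset page before relying on this. Under that assumption, a small sketch of the conversion to log2(TPM + 1) in base R (the function name and the toy vector are hypothetical):

```r
# Hypothetical helper: convert log2(TPM + pseudo_in) to log2(TPM + pseudo_out).
# Assumption: the input really uses the pseudocount given in pseudo_in.
rescale_log2_tpm <- function(x, pseudo_in = 0.001, pseudo_out = 1) {
  tpm <- pmax(2^x - pseudo_in, 0)   # back to linear TPM; clamp rounding noise at 0
  log2(tpm + pseudo_out)
}

# Toy values standing in for one gene across three samples (TPM of 0, 1 and 9).
toy_toil_values <- log2(c(0, 1, 9) + 0.001)
toy_log2_tpm_plus1 <- rescale_log2_tpm(toy_toil_values)
```

The function works element-wise, so it can be applied to a whole genes x samples matrix in the same way.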
Hi @sunta3iouxos, please rerun the code with the latest version from GitHub
And |
I will do so and report back.
This one works. Is this approach related to this:
Thank you for this tool.
I am a novice with TCGA data, but I would like to do some analysis, and I want to download TPM-normalised values so that I can combine them with my own RNA-seq data. For my purpose (GSVA), I think TPM should be more appropriate than the percentile ranking.
From some tutorials I got values that look more scaled than TPM-normalised.
I want to use the data for GSVA or singscore
Is there a way to accomplish this with UCSCXenaTools?
This is the code: (taken from https://github.com/XSLiuLab/tumor-immunogenicity-score)
The author of the code mentions:
The RNASeq data we downloaded are pancan normalized. For comparing data within independent cohort (like TCGA-LUAD), we recommend to use the "gene expression RNAseq" dataset. For questions regarding the gene expression of this particular cohort in relation to other types tumors, you can use the pancan normalized version of the "gene expression RNAseq" data. For comparing with data outside TCGA, we recommend using the percentile version if the non-TCGA data is normalized by percentile ranking. For more information, please see our Data FAQ: [here](https://docs.google.com/document/d/1q-7Tkzd7pci4Rz-_IswASRMRzYrbgx1FTTfAWOyHbmk/edit?usp=sharing)
Do you have any recommendations on this?
Theodoros