How to understand otherExon #22

Open
lbwfff opened this issue Dec 6, 2022 · 7 comments

Comments

@lbwfff

lbwfff commented Dec 6, 2022

Hi Jianhong,

I have a question about the otherExon category in the genomicElementDistribution function. How should I understand it, and what kind of peaks are assigned to otherExon?

Thanks,

LeeLee

@jianhong
Owner

jianhong commented Dec 6, 2022

Hi LeeLee,

Thank you for trying ChIPpeakAnno to annotate your data, and sorry for the unclear documentation. otherExon is defined as the exons extracted from the TxDb object that do not overlap any 5' UTR, 3' UTR, or CDS. In most cases, they belong to single-exon transcripts such as short noncoding RNAs.

@lbwfff
Author

lbwfff commented Dec 7, 2022

Hi Jianhong,

Thanks for your reply. I understand otherExon now, but I still have a question.
For example in my data:

> table(gr1[["peaks"]]$ExonIntron)

 exon 
10219 
> table(gr1[["peaks"]]$Exons)

      CDS otherExon      utr3      utr5 
     4058       583      4921       657 
> table(gr1[["peaks"]]$geneLevel)

      geneBody geneDownstream       promoter 
          8877            430            912 

There are a total of 10219 peaks in my data, all of them on exons. I expected the number of peaks in geneBody to equal CDS + otherExon + utr3 + utr5, but that is not the case. Instead, CDS + otherExon + utr3 + utr5 equals geneBody + geneDownstream + promoter, which means that some exonic peaks are also classified as geneDownstream or promoter at the gene level. How should I understand this?
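The totals in the tables above can be checked with a few lines of base R. This is only a sketch that re-enters the posted counts by hand (the vector names `exons` and `gene_level` are made up for illustration); it shows that the Exons level and the geneLevel each partition the same 10219 peaks independently.

```r
# Counts copied from the table() output above
exons      <- c(CDS = 4058, otherExon = 583, utr3 = 4921, utr5 = 657)
gene_level <- c(geneBody = 8877, geneDownstream = 430, promoter = 912)

# Each level assigns exactly one label per peak, so both sums
# should equal the number of exonic peaks (10219)
sum(exons)
sum(gene_level)
sum(exons) == sum(gene_level)
```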

Thanks,
LeeLee

@jianhong
Owner

jianhong commented Dec 7, 2022 via email

@lbwfff
Author

lbwfff commented Dec 7, 2022

I used the GRCh38 annotation from GENCODE. My guess is that some exon regions of one gene are classified as geneDownstream or promoter of another gene. This did not have much impact on my subsequent analysis, so it is not a big issue.

@jianhong
Owner

jianhong commented Dec 7, 2022

Two things affect this annotation. The first is keepExonsInGenesOnly: please try setting it to FALSE and see what happens. The second is the order of the labels, which determines the annotation precedence. Let me know the results. Thank you.

@lbwfff
Author

lbwfff commented Dec 8, 2022

I tried setting keepExonsInGenesOnly to both TRUE and FALSE, but it did not change the results, and the order of the labels was the same. If you need it, I can provide my BED file, which contains MeRIP-seq peaks called with exomePeak2.
cache.txt

@jianhong
Owner

jianhong commented Feb 7, 2023

Hi,
Sorry, I misunderstood your first post. The total count at the Exons level should equal the exon count at the ExonIntron level. The gene level includes the promoter region, the gene body (exons and introns), and the downstream region. The gene body does not include the regions that overlap the promoter or geneDownstream if you set the geneLevel order to promoter, geneDownstream, and then geneBody. In that case, geneBody runs from TSS + the downstream value of the promoterRegion parameter to TES - the upstream value of the geneDownstream parameter.
Hope this will help.
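The precedence rule described above can be illustrated with a toy model in base R. This is a sketch under stated assumptions, not ChIPpeakAnno's implementation: it uses a single hypothetical plus-strand gene, made-up coordinates and window sizes, and a hypothetical helper `label_position` that returns the first region, in label order, that contains a position.

```r
# Hypothetical gene and window sizes (illustrative values only)
tss <- 10000
tes <- 20000
promoterRegion <- c(upstream = 2000, downstream = 500)
geneDownstream <- c(upstream = 500, downstream = 1000)

# Regions listed in precedence order: promoter, geneDownstream, geneBody
regions <- list(
  promoter       = c(tss - promoterRegion[["upstream"]],
                     tss + promoterRegion[["downstream"]]),
  geneDownstream = c(tes - geneDownstream[["upstream"]],
                     tes + geneDownstream[["downstream"]]),
  geneBody       = c(tss, tes)
)

# Return the name of the first region that contains the position
label_position <- function(pos, regions) {
  for (name in names(regions)) {
    r <- regions[[name]]
    if (pos >= r[1] && pos <= r[2]) return(name)
  }
  "distalIntergenic"
}

# Peak midpoints; 10200 and 19800 lie inside the gene but are
# claimed by promoter and geneDownstream before geneBody is tried
peaks  <- c(9000, 10200, 15000, 19800, 20500, 30000)
labels <- vapply(peaks, label_position, character(1), regions = regions)
table(labels)
```

With this ordering, the effective gene body in the toy model runs from `tss + 500` to `tes - 500`, so a peak on an exon near either end of the gene is counted under promoter or geneDownstream at the gene level while still being counted as an exon at the Exons level.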
