Questions about post-processing Tomte for Scout #175

Closed
Jakob37 opened this issue Oct 21, 2024 · 3 comments

Labels
question Further information is requested

Comments

@Jakob37
Contributor

Jakob37 commented Oct 21, 2024

Hello from Lund!

We are in the process of an RNA-seq validation using Tomte, and as part of that I have been loading Tomte outputs into Scout.

At the moment, this requires several steps of post-processing beyond obtaining the results.

Currently:

  • Add HGNC IDs (this is already mentioned in "Add hgnc id to research output in DROP" #130)
  • Remove entries with no HGNC symbols in the original annotation (not sure about the impact of this, but it seems a lot of long non-coding RNAs do not have HGNC IDs, and these cannot be loaded by Scout atm)
  • Add annotations and genmod-scores (this is a big step - here I am currently running the annotation from our constitutional pipeline)
  • Filter for highly ranked variants (so that not all variants are loaded into Scout)
  • Generate the YAML file for Scout loading
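
A minimal sketch of the HGNC and rank-score filtering steps from the list above, using only the Python standard library. It is not part of Tomte or Scout. The column name `hgncSymbol`, the INFO key `RankScore` and its `family:score` layout are assumptions made for illustration and should be checked against the actual files.

```python
def add_hgnc_ids(rows, symbol_to_id, symbol_col="hgncSymbol"):
    """Attach an HGNC ID to each row and drop rows without a mappable symbol.

    rows: iterable of dicts (e.g. from csv.DictReader on a results table).
    symbol_to_id: dict from HGNC symbol to HGNC ID (hypothetical lookup table).
    symbol_col: name of the symbol column (an assumption, adjust as needed).
    """
    kept = []
    for row in rows:
        symbol = (row.get(symbol_col) or "").strip()
        hgnc_id = symbol_to_id.get(symbol)
        if hgnc_id is None:
            # No symbol, or a symbol without an HGNC ID (e.g. many lncRNAs).
            continue
        kept.append({**row, "hgnc_id": hgnc_id})
    return kept


def rank_score(info_field, key="RankScore"):
    """Return the score from an INFO entry such as 'RankScore=case1:17', else None."""
    for entry in info_field.split(";"):
        name, _, value = entry.partition("=")
        if name == key and value:
            try:
                # Assumed layout 'family:score'; take the part after the last ':'.
                return float(value.rsplit(":", 1)[-1])
            except ValueError:
                return None
    return None


def filter_vcf_lines(lines, min_score):
    """Yield header lines plus records whose rank score is at least min_score."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # INFO is the eighth tab-separated column of a VCF record.
        score = rank_score(fields[7]) if len(fields) > 7 else None
        if score is not None and score >= min_score:
            yield line
```

Records without a rank score are dropped here; whether that is the right default depends on how the scoring step treats unscored variants.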

(Edit: Maybe the annotation/scoring parts will not be relevant when we run this together with DNA-seq data)

Eventually, it would be nice to have this automated. Things are still under discussion, so maybe some steps will be added / removed, but I suspect we will need to do something similar going forward as well.

Would it make sense to include parts of this in Tomte? If parts are very Lund-specific it might make more sense for me to set up a Lund-Tomte-postprocessing. Parts that are of general interest, on the other hand, might be worth adding to Tomte. If so, let me know and I can splice out some concrete issues, and can also help out adding these.

Jakob37 added the "question" label on Oct 21, 2024
Jakob37 changed the title from "Post-processing Tomte for Scout" to "Post-processing Tomte for Scout (question)" on Oct 21, 2024
Jakob37 changed the title from "Post-processing Tomte for Scout (question)" to "Questions about post-processing Tomte for Scout" on Oct 21, 2024
@jemten
Contributor

jemten commented Oct 21, 2024

Hello!
Sounds like good ideas. Currently we're only loading the aberrant splicing and expression results into Scout, where we link them to a DNA case so that one can see genetic variants in genes affected by splicing or expression abnormalities. So we are actually not using the VCF generated by Tomte for much. Doing something with genmod is on the agenda as well. For cases where we have DNA results, we also want to be able to feed in that VCF and do allele-specific expression analysis using it rather than the RNA VCF. I think we have a lot of similar issues. How about we set up a meeting to discuss? Lucia is on vacation this week, but how about the week thereafter?

@Jakob37
Contributor Author

Jakob37 commented Oct 21, 2024

> Currently we're only loading the aberrant splicing and expression results into Scout, where we link them to a DNA case so that one can see genetic variants in genes affected by splicing or expression abnormalities. So we are actually not using the VCF generated by Tomte for much.

Hmm, yes, OK. Makes sense. The CLGs liked seeing the list of ranked variants for our samples currently loaded as RNA-only, which I remember now is why I initially included it. Maybe we could/should also try loading the RNA-seq on top of existing DNA-seq. Then we could skip the annotation step. Thanks for making me think this through again 😅

> Doing something with genmod is on the agenda as well. For cases where we have DNA results, we also want to be able to feed in that VCF and do allele-specific expression analysis using it rather than the RNA VCF.

Yes, that would be interesting for us as well.

> I think we have a lot of similar issues. How about we set up a meeting to discuss? Lucia is on vacation this week, but how about the week thereafter?

Yes, I agree, that sounds good! I think the other bioinfs in Lund in the RNA-seq project would be interested in joining as well (@ViktorHy and @A97paupic). I'll check with them and can get back to you on Slack.

@Jakob37
Contributor Author

Jakob37 commented Nov 1, 2024

I'll close this issue. Thanks for the discussion 😃

In conclusion:

  • Some post-processing outside Tomte will be needed to build the Scout YAML
  • We will be back with a suggested approach for the contamination check, and later a PR (and perhaps later with an issue on ID-SNPs)
  • DNA/RNA integration is still under fermentation, but seems like it will include:
    • In Tomte, using Genmod to rescore DNA SNVs based on allelic imbalance / silencing of alleles in RNA
    • Running MAE (monoallelic expression) in DROP
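
To make "allelic imbalance" concrete, here is a rough sketch of one way to test it for a single heterozygous DNA SNV given RNA read counts. It uses a plain exact binomial test, swapped in purely for illustration. It is not the method used by DROP's MAE module or by Genmod, and the function name and parameters are hypothetical.

```python
from math import exp, lgamma, log


def allelic_imbalance_p(alt_reads, total_reads, expected_alt_fraction=0.5):
    """Exact two-sided binomial p-value for the observed alt-allele read count.

    Sums the probabilities of all outcomes that are no more likely than the
    observed one, under the assumption that each RNA read carries the alt
    allele with probability expected_alt_fraction.
    """
    if total_reads <= 0 or not 0 <= alt_reads <= total_reads:
        raise ValueError("need 0 <= alt_reads <= total_reads and total_reads > 0")
    p = expected_alt_fraction
    if not 0.0 < p < 1.0:
        raise ValueError("expected_alt_fraction must be strictly between 0 and 1")

    def pmf(k):
        # Binomial probability computed in log space to avoid overflow at high depth.
        log_coeff = lgamma(total_reads + 1) - lgamma(k + 1) - lgamma(total_reads - k + 1)
        return exp(log_coeff + k * log(p) + (total_reads - k) * log(1.0 - p))

    probs = [pmf(k) for k in range(total_reads + 1)]
    observed = probs[alt_reads]
    # Small relative tolerance so floating-point ties count as equally likely.
    tail = sum(q for q in probs if q <= observed * (1.0 + 1e-7))
    return min(1.0, tail)
```

A small p-value for a site that is heterozygous in DNA would suggest that one allele is under-represented in RNA, which is the kind of signal a rescoring step could use. A real implementation would also need to handle overdispersion, reference mapping bias and multiple testing.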

Jakob37 closed this as completed on Nov 1, 2024