This document holds the learning goals for teaching bioinformatics to beginners.
For lesson plans, check:
- Software Carpentry's lessons
- Rosalind problems for biology-specific programming problems to practice with Python
- Python for Biologists
- Python Data Science Handbook
- Week 1. Goal:
2.
- Install Anaconda Python - Ahead of time
- Start a Jupyter notebook
- Change default fonts
Installing ...
- Anaconda Python
- PyCharm editor
- git
- Starting up a Jupyter notebook
- Changing the default font (because with Courier you can't tell the difference between 1 "one" and l "ell")
- Hack is nice
- Environments??
- Googling
- Stack Overflow
- Biostars
Could maybe use RBPs instead of TFs since I know them better
- grep a gtf file for the gene names and only get "gene" lines
- Use bedtools flank to get upstream 1000bp
- Unsupervised problem:
- Use HOMER or other motif enricher and use all genes' upstream sequences as background
- Get to talk about background which is important
- HOMER motifs are very satisfying
- Count enriched kmers (using Python??)
- Do a multiple sequence alignment
- Supervised problem:
- Use UCSC genome browser to download transcription factor binding sites
- Use bedtools intersect to overlap with another BED file
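The grep and bedtools flank steps above can also be sketched in plain Python, which may help students see what the tools are doing under the hood. This is a minimal sketch, assuming a standard nine-column, tab-separated GTF file; the function name `upstream_regions`, the example records, and the output tuple layout are made up for illustration. Unlike bedtools flank, it knows nothing about chromosome lengths, so it only clips windows at position 0.

```python
def upstream_regions(gtf_lines, size=1000):
    """Yield (chrom, start, end, strand) windows upstream of each gene.

    GTF coordinates are 1-based and inclusive; the output is 0-based and
    half-open, like BED.
    """
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue  # keep only "gene" features
        chrom, strand = fields[0], fields[6]
        start, end = int(fields[3]), int(fields[4])
        if strand == "+":
            # upstream of a + strand gene lies to the left of its start
            win_start, win_end = max(0, start - 1 - size), start - 1
        else:
            # upstream of a - strand gene lies to the right of its end
            win_start, win_end = end, end + size
        yield chrom, win_start, win_end, strand


# Made-up example records (not from a real annotation)
example = [
    'chr1\ttest\tgene\t5001\t6000\t.\t+\t.\tgene_id "A";',
    'chr1\ttest\texon\t5001\t5500\t.\t+\t.\tgene_id "A";',
    'chr1\ttest\tgene\t8001\t9000\t.\t-\t.\tgene_id "B";',
]
regions = list(upstream_regions(example))
for region in regions:
    print(*region, sep="\t")
```

The strand check is the part worth discussing with students: "upstream" means lower coordinates for a + strand gene and higher coordinates for a - strand gene.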
How can students have ownership?
- Get genes from a paper they picked
- Use a TF they're interested in
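For the "count enriched kmers" idea in the unsupervised problem, a Python version could look roughly like the following. This is a minimal sketch, assuming the upstream sequences are already loaded as plain uppercase DNA strings; the function names, the pseudocount, and the toy sequences are all made up for illustration. The score is a crude frequency ratio against the background, not a proper statistical test.

```python
from collections import Counter


def count_kmers(sequences, k):
    """Count every overlapping k-mer across a list of sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def kmer_enrichment(foreground, background, k, pseudocount=1):
    """Ratio of each k-mer's frequency in foreground vs background."""
    fg = count_kmers(foreground, k)
    bg = count_kmers(background, k)
    fg_total = sum(fg.values())
    bg_total = sum(bg.values())
    n_kmers = len(set(fg) | set(bg))
    scores = {}
    for kmer in fg:
        # the pseudocount avoids dividing by zero for k-mers absent
        # from the background
        fg_freq = (fg[kmer] + pseudocount) / (fg_total + pseudocount * n_kmers)
        bg_freq = (bg[kmer] + pseudocount) / (bg_total + pseudocount * n_kmers)
        scores[kmer] = fg_freq / bg_freq
    return scores


# Toy sequences, made up for illustration
foreground = ["ACGTGCATG", "TTGCATGAA"]
background = ["ACGTACGTA", "TTTTAAAAC", "GGCATCCAA"]
scores = kmer_enrichment(foreground, background, k=5)
top = sorted(scores, key=scores.get, reverse=True)[:3]
print(top)
```

This also gives a natural opening to talk about why the choice of background matters: swapping the background list changes which k-mers look enriched.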
- Get Sanger sequencing reads
- Assemble subsets into contigs using velvet
- Align contigs to known sequence using bwa
- Find mismatches/SNPs using ??? vcftools?
How can students have ownership?
- Use contigs from something they created
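To make the "find mismatches/SNPs" step concrete before settling on a tool, students could first compare a contig against the known sequence position by position. This is a minimal sketch, assuming the two sequences are already aligned and the same length, with no insertions or deletions; real aligner output would need a proper variant caller. The function name and the sequences are made up for illustration.

```python
def find_mismatches(reference, contig):
    """Return (position, ref_base, contig_base) for each differing position.

    Positions are 1-based. Assumes both sequences are already aligned and
    have equal length (no insertions or deletions).
    """
    if len(reference) != len(contig):
        raise ValueError("sequences must be the same length")
    return [
        (i, ref_base, alt_base)
        for i, (ref_base, alt_base) in enumerate(zip(reference, contig), start=1)
        if ref_base != alt_base
    ]


# Toy sequences, made up for illustration
reference = "ACGTACGTAC"
contig = "ACGTTCGTAA"
mismatches = find_mismatches(reference, contig)
print(mismatches)
```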
- Get metadata from MACA
- Convert date strings, e.g. "170517", to an actual date
- Make histograms of how often something was made
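The date conversion and histogram steps could be sketched as follows. This is a minimal sketch, assuming the six-digit strings are in YYMMDD order (an assumption; the metadata may use a different layout) and using made-up dates in place of the real metadata. It uses only the standard library and prints a text histogram; in a notebook, students would more likely plot the counts.

```python
from collections import Counter
from datetime import datetime


def parse_date(text):
    """Parse a six-digit date string, assuming YYMMDD order."""
    return datetime.strptime(text, "%y%m%d").date()


# Made-up date strings standing in for a metadata column
raw_dates = ["170517", "170518", "170602", "170517"]
dates = [parse_date(d) for d in raw_dates]

# Count how many entries fall in each (year, month)
per_month = Counter((d.year, d.month) for d in dates)
for (year, month), n in sorted(per_month.items()):
    print(f"{year}-{month:02d} {'#' * n}")
```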
Teach them a bunch of different things and separate the problem into three steps, where steps 1 and 2 are independent and feed into 3.
Modify Software Carpentry lessons
- ls
- Anatomy of a command
- ls -l
- ls -lh
- chmod
- Octal codes
- chmod 775
- chmod ug+r
- chmod og-w
- cd
- pwd
- mkdir
- mkdir folder/subfolder
- head
- head -n 17
- less
- less -S
- tail
- tail -n 2
- echo
- touch
- cat
- wc
- wc -l
- cp
- mv
- rm
- Search for specific term
- Search for lines that don’t match
- Add lines after
- E.g. fastq
- Select columns
- Get number of fields
- Cheatsheets repo
- intersect
- flank
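As a companion to the intersect item, the core idea can be sketched in Python: report the overlapping part of each pair of intervals that share a chromosome. This is a minimal sketch, assuming BED-style intervals (0-based, half-open) stored as tuples; the function name and coordinates are made up for illustration. It compares every pair, which is fine for teaching but far too slow for real files.

```python
def intersect(a_intervals, b_intervals):
    """Return the overlapping part of every pair of intervals from a and b.

    Intervals are (chrom, start, end) tuples, 0-based and half-open like BED.
    """
    overlaps = []
    for chrom_a, start_a, end_a in a_intervals:
        for chrom_b, start_b, end_b in b_intervals:
            if chrom_a != chrom_b:
                continue  # intervals on different chromosomes never overlap
            start = max(start_a, start_b)
            end = min(end_a, end_b)
            if start < end:  # a non-empty overlap
                overlaps.append((chrom_a, start, end))
    return overlaps


# Made-up intervals for illustration
upstream = [("chr1", 4000, 5000), ("chr1", 9000, 10000)]
binding_sites = [("chr1", 4900, 5100), ("chr2", 4900, 5100), ("chr1", 9500, 9600)]
hits = intersect(upstream, binding_sites)
print(hits)
```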