sampleInformation_wiki
All possible command line arguments for method sampleInformation
Argument | Defaults | Description |
---|---|---|
--bpm | required by user | Full path to the bead pool manifest file (.bpm); must be the same one used to generate the gtc files |
--gtcDir | required by user | Full path to the directory containing the gtc files to process (files must end in .gtc); sub-directories are not searched unless --recursive is specified |
--outDir | optional, default=current working directory | Full path to the directory to output results. If the path does not exist, the program will attempt to create it |
--logName | optional, default=gtcFuncs.log | Name of the log file to output; will be created in directory --outDir |
--modDir | optional, default=current working directory | Full path to the module files (.py) from github; default is the current working directory with modules folder appended |
--prefix | optional, default=None | String prefix used to name text files and image files, which will be created in directory --outDir. If nothing is provided, the following png files will be created and overwritten: callRatePlots.png, gc10Plots.png, logrDevPlots.png |
--fileOutName | optional, default=allSampleInfo.txt | Name of the final text file to output; will be created in directory --outDir |
--recursive | optional, flag | If the flag is specified, gtc files will be found recursively from the base --gtcDir; otherwise, only gtcs located directly in --gtcDir will be used |
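The difference between the default and --recursive file discovery can be illustrated with a minimal sketch. The helper name `find_gtc_files` is hypothetical and is not taken from gtcFuncs.py; it only mirrors the behavior described in the table, assuming files are matched by the `.gtc` extension.

```python
from pathlib import Path


def find_gtc_files(gtc_dir, recursive=False):
    """Return sorted paths of files ending in .gtc under gtc_dir.

    Hypothetical helper: it mirrors the documented behavior of --gtcDir
    and --recursive, not the actual gtcFuncs.py implementation.
    """
    base = Path(gtc_dir)
    # rglob descends into sub-directories; glob only looks at the top level
    matches = base.rglob("*.gtc") if recursive else base.glob("*.gtc")
    return sorted(p for p in matches if p.is_file())
```

Calling `find_gtc_files("/path/to/gtcLocations/")` would correspond to the default behavior, and passing `recursive=True` would correspond to specifying the --recursive flag.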
The minimum command required to get sample information from files is the following:
python3 gtcFuncs.py sampleInformation --bpm /path/to/manifest.bpm --gtcDir /path/to/gtcLocations/
This would create two tab-delimited text files, allSampleInfo.txt and summaryStatsTable.txt, from all gtc files in the directory given to --gtcDir. Both files, along with a log file called gtcFuncs.log, are written to the current working directory.
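Because both outputs are tab-delimited text, they can be loaded with the Python standard library. The sketch below assumes the first line of each file is a header row, which is not stated above; the function name `read_tab_delimited` is illustrative, and no column names are assumed.

```python
import csv


def read_tab_delimited(path):
    """Read a tab-delimited text file into a list of dicts, one per row.

    Assumption: the first line of the file is a header row naming the columns.
    """
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

For example, `rows = read_tab_delimited("allSampleInfo.txt")` would return one dictionary per data row, keyed by whatever column names the file actually contains.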
Below is an example of the output allSampleInfo.txt generated:
For the actual file allSampleInfo.txt generated by sampleInformation, please download the text file here
Below is an example of the output summaryStatsTable.txt generated:
For the actual file summaryStatsTable.txt generated by sampleInformation, please download the text file here
Additionally, sampleInformation generates a series of image files (.png) to summarize the extracted gtc metrics across the full set of samples requested. Below is an example of the summary figure generated for the call rate metric:
- The first plot (left) is the call rate summarized across all samples; dots are color-coded by the sex listed in the gtc
- The second plot (middle) shows the same data as the first plot, annotated with mean and standard deviation lines. The green line represents the mean, the blue lines represent +/- 3 standard deviations from the mean, and the orange lines represent +/- 6 standard deviations from the mean.
- The third plot (right) removes samples that are more than 3 standard deviations from the mean and recalculates the box plot from the remaining samples.
- The actual file can be viewed and downloaded here
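The thresholds described in the plots above can be sketched as follows. This is an illustration only, not the plotting code used by sampleInformation: the function names are hypothetical, and the use of the sample standard deviation (rather than the population standard deviation) is an assumption.

```python
from statistics import mean, stdev


def sd_bounds(values, n_sd):
    """Return (lower, upper) bounds at n_sd standard deviations around the mean."""
    center = mean(values)
    spread = stdev(values)  # sample standard deviation; an assumption
    return center - n_sd * spread, center + n_sd * spread


def drop_outliers(values, n_sd=3):
    """Keep only values within n_sd standard deviations of the mean."""
    lower, upper = sd_bounds(values, n_sd)
    return [v for v in values if lower <= v <= upper]
```

Here `sd_bounds(values, 3)` and `sd_bounds(values, 6)` correspond to the blue and orange lines in the middle plot, and `drop_outliers(values)` gives the samples that would remain for the third plot. The same functions apply unchanged to gc10 or logrDev values.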
Here is an example for the gc10 plot generation (same methods applied as above except now using gc10 instead of call rate):
Here is an example for the logrDev plot generation (same methods applied as above except now using logrDev instead of call rate):