hmm @grennfp I think this is worth a Zoom discussion ... maybe between you and @hsun3163 is good enough for starters, then Hao can fill me in. Could you guys arrange something offline for next week? You can also show this to us during the Monday WG meeting. Thanks for looking carefully at the diagnostic plot and catching the possible bug!
I've been testing the bulk_expression_QC.ipynb notebook to run QC on a 559-sample RNA-seq dataset. All three steps (hierarchical clustering, D-statistic correlations, RLE) produce candidate outliers, but none of the candidate lists overlap, leading to a final outlier count of zero.
I noticed that the samples on the right of the RLE plot (those with high IQRs) are not the samples printed to the log file. The samples logged for the RLE step are the last 5% of the samples in the input TPM matrix, not the actual RLE outliers.
I believe the issue lies in this line of code:
Replacing bymedian with levels(bymedian) seemed to fix the issue. Using this code gave me the correct RLE outlier samples:
With this change, the corrected RLE outliers overlap with candidate outliers from the hierarchical clustering and D-statistic steps, whereas before the change there were no overlaps.
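The behaviour described above can be reproduced on toy data. This is a minimal sketch, not the notebook's code: it assumes bymedian is a factor built with reorder(), so that its levels are sorted by per-sample IQR while its elements stay in input order. The data frame res, its columns ID and rle, the sample names, and n_out are all hypothetical.

```r
# Hypothetical long-format RLE table: one row per (sample, gene) deviation.
# S1 has the widest spread (highest IQR) but appears FIRST in the input;
# S4 has the narrowest spread but appears LAST.
res <- data.frame(
  ID  = rep(c("S1", "S2", "S3", "S4"), each = 5),
  rle = c(-4, -2, 0, 2, 4,          # S1: IQR = 4
          -1, -0.5, 0, 0.5, 1,      # S2: IQR = 1
          -2, -1, 0, 1, 2,          # S3: IQR = 2
          -0.2, -0.1, 0, 0.1, 0.2)  # S4: IQR = 0.2
)

# Assumption: the factor is built so that its LEVELS are ordered by IQR.
# Its elements are still in the original row (input) order.
bymedian <- with(res, reorder(ID, rle, IQR))

n_out <- 1  # number of samples to flag (stands in for the top 5%)

# Indexing the factor's elements picks the last samples in input order.
by_position <- tail(unique(as.character(bymedian)), n_out)

# Indexing the factor's levels picks the samples with the highest IQR.
by_level <- tail(levels(bymedian), n_out)

print(by_position)  # "S4": last in the input, lowest IQR
print(by_level)     # "S1": highest IQR, the sample on the right of the plot
```

Under these assumptions, taking the tail of the factor itself returns whichever samples come last in the input matrix, while taking the tail of its levels returns the samples a boxplot ordered by IQR would place on the right.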