As currently written, the left join with peptide_predictions duplicates all of the information from every left-joined dataframe for any peptide that appears more than once in peptide_predictions. Outputting two TSV files would avoid this: one with the peptide predictions (in which peptide_id is non-unique) and one with the joined metadata dataframes (in which peptide_id is unique). Two files avoid the duplication, but a single file might be easier to work with in practice. We should assess this after running the pipeline a few times. If there is relatively little duplication of peptide_id values, separating the two files might not matter in practice.
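
To make the trade-off concrete, here is a minimal sketch of both options, assuming the pipeline uses pandas. Only peptide_predictions and peptide_id come from this issue; the peptide_metadata frame, the length and prediction_tool columns, the join direction, and the to_tsv helper are hypothetical stand-ins for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the pipeline's dataframes. Only the names
# peptide_predictions and peptide_id come from the issue text.
peptide_metadata = pd.DataFrame(
    {"peptide_id": ["p1", "p2", "p3"], "length": [12, 30, 8]}
)
peptide_predictions = pd.DataFrame(
    {
        "peptide_id": ["p1", "p1", "p2"],
        "prediction_tool": ["tool_a", "tool_b", "tool_a"],
    }
)

# Single-file option: metadata is left-joined onto the predictions, so the
# metadata for p1 is repeated once per prediction row.
joined = peptide_predictions.merge(peptide_metadata, on="peptide_id", how="left")

# Fraction of prediction rows whose peptide_id occurs more than once; a rough
# measure of how much duplication the single-file option would carry.
duplicated_fraction = peptide_predictions["peptide_id"].duplicated(keep=False).mean()


def to_tsv(df: pd.DataFrame) -> str:
    """Serialize a dataframe as tab-separated text (hypothetical helper)."""
    buffer = io.StringIO()
    df.to_csv(buffer, sep="\t", index=False)
    return buffer.getvalue()


# Two-file option: predictions (peptide_id non-unique) and metadata
# (peptide_id unique) are written separately, so nothing is repeated.
predictions_tsv = to_tsv(peptide_predictions)
metadata_tsv = to_tsv(peptide_metadata)
```

Running something like the duplicated_fraction check on real pipeline output would be one way to decide whether the split is worth it.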