--remove-duplicates #50

Open
sjellerstrand opened this issue Mar 21, 2024 · 1 comment

Comments

@sjellerstrand

Hello!
I'm trying to remove duplicate sequences using https://github.com/veg/hyphy-analyses/tree/master/remove-duplicates.

It all works well when I run it on the example files, or on one of my alignments by itself. However, I run into problems when I also supply a tree for my alignment so that it gets trimmed too. I then get the following error message:

Error:
'/crex/proj/snic2020-2-25/bin/hyphy-analyses/remove-duplicates/example1.nwk' could not be opened for reading by fscanf. Path stack:
        /proj/snic2020-2-25/nobackup/simon/conda/envs/hyphy/share/hyphy/
        /crex/proj/snic2020-2-25/bin/hyphy-analyses/remove-duplicates/ in call to fscanf(filter.tree,"Raw",filter.tree_string);

Function call stack
1 :  fscanf(filter.tree,"Raw",filter.tree_string);

        Keyword arguments:
                {
                 "output":"./uniq_seq"
                }
-------

Check errors.log for execution error details.

The program does seem to make some progress if I rename my tree file to "example.nwk". But I still get the following error message, in which the expected sequence names appear to be the ones from the example files:

Error:
Node 'seq_991' not found in the tree or is the root node in _List _TreeTopology::RemoveANode(HBLObjectRef)

Function call stack
1 :  T-utility.Keys(filter.delete_leaves);

        Keyword arguments:
                {
                 "output":"./uniq_seq"
                }
-------

Check errors.log for execution error details.
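An error like "Node 'seq_991' not found in the tree" suggests that the sequence names in the alignment and the leaf names in the tree being read do not match. A minimal sketch for checking this outside of HyPhy is shown here, assuming a FASTA alignment and a Newick tree with unquoted leaf labels. The function names and the inline example data are hypothetical, and the sketch does not account for any renaming HyPhy itself may apply to sequence names.

```python
import re


def fasta_names(text):
    """Return the set of sequence names (first word after '>') in FASTA text."""
    return {
        line[1:].split()[0]
        for line in text.splitlines()
        if line.startswith(">") and line[1:].strip()
    }


def newick_leaf_names(newick):
    """Return the set of leaf labels in a Newick string.

    Assumes unquoted labels. A leaf label directly follows '(' or ',',
    whereas internal node labels follow ')' and are therefore skipped.
    """
    return set(re.findall(r"[(,]\s*([^\s(),:;]+)", newick))


def compare_names(fasta_text, newick_text):
    """Return (names only in the alignment, names only in the tree), sorted."""
    in_alignment = fasta_names(fasta_text)
    in_tree = newick_leaf_names(newick_text)
    return sorted(in_alignment - in_tree), sorted(in_tree - in_alignment)


if __name__ == "__main__":
    # Hypothetical data: 'seq_991' is in the alignment but not in the tree.
    alignment = ">seq_1\nACGT\n>seq_2\nACGT\n>seq_991\nACGT\n"
    tree = "((seq_1:0.1,seq_2:0.2)n1:0.3,seq_3:0.4);"
    only_alignment, only_tree = compare_names(alignment, tree)
    print("Only in alignment:", only_alignment)
    print("Only in tree:", only_tree)
```

If both lists come back empty for the real alignment and tree, the mismatch is more likely caused by a different tree file being read than the one intended.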

Since I mainly work with population data, my alignments contain many conspecific individuals. Duplicates are therefore common, and I suspect that removing them would speed up my analysis significantly, since I want to loop this over all genes in the genome.

Thank you!

Simon

@spond
Member

spond commented Mar 21, 2024

Dear @sjellerstrand,

Can you include the command you use to call hyphy? One suggestion is to use absolute paths and see if that helps.
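As a rough illustration of the absolute-path suggestion, the sketch below resolves every file argument before building the command. It makes several assumptions: the script name `remove-duplicates.bf` and the `--msa` and `--tree` argument names are not confirmed anywhere in this thread (only the `output` keyword appears in the error log), and the file names are hypothetical. Please check the analysis README for the actual argument names.

```python
from pathlib import Path


def absolute(path):
    """Expand '~' and turn a possibly relative path into an absolute one."""
    return str(Path(path).expanduser().resolve())


def build_command(script, msa, tree, output):
    """Build a hyphy argument list in which every path is absolute.

    '--msa' and '--tree' are ASSUMED argument names; '--output' mirrors
    the "output" keyword argument shown in the error log.
    """
    return [
        "hyphy", absolute(script),
        "--msa", absolute(msa),
        "--tree", absolute(tree),
        "--output", absolute(output),
    ]


def missing_inputs(*paths):
    """Return the input paths that do not exist, to catch typos early."""
    return [p for p in paths if not Path(p).expanduser().exists()]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_command("remove-duplicates.bf", "gene1.fas", "gene1.nwk", "./uniq_seq")
    print(" ".join(cmd))
    print("Missing inputs:", missing_inputs("gene1.fas", "gene1.nwk"))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` once the argument names have been verified.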

Best,
Sergei
