Numba parallel weights computation + dataloader #5

Open, wants to merge 104 commits into base: master

Commits on May 11, 2021

  1. Added some slurm scripts, small changes to VAE_model for file handling

    Lodevicus Van Niekerk committed May 11, 2021
    4e98cb2

Commits on May 12, 2021

  1. defined instance variables in __init__ for clarity; extracted one_hot_encoding function; minor linting, changed seq_name_to_sequence to a string instead of list of chars

    loodvn committed May 12, 2021
    fcb8d1b

Commits on May 23, 2021

  1. change focus_seq_trimmed to a string instead of list of chars; added minor printouts to MSA_processing

    loodvn committed May 23, 2021
    9606f5a

Commits on Jun 16, 2021

  1. error checking in compute_evol_indices

    passing z_dim into train_VAE script
    manually passing in args for VAE checkpoint reloading
    small typos in scripts
    loodvn committed Jun 16, 2021
    fb13194
  2. merged data_utils

    loodvn committed Jun 16, 2021
    09c8b8b

Commits on Jul 13, 2021

  1. c29fd56
  2. f0984cf
  3. ba75f74

Commits on Jul 15, 2021

  1. had to comment out the alternating joint training for now, to switch to mixed batch joint training.

    Initialising the bias to mean(y_train) for much better convergence, still not great performance though.
    Moved parameter reading outside of main function so that we can override the z_dim size
    loodvn committed Jul 15, 2021
    bf3f1c5

Commits on Aug 6, 2021

  1. joint training script improvements:

    saving vae checkpoints,
    checkpoint loading vs train from scratch,
    added sigmoid+bce loss,
    added 3 very long functions for mixed/alternating/frozen training modes to switch from command line,
    added linear model loss weight
    loodvn committed Aug 6, 2021
    8d54d4b

Commits on Sep 9, 2021

  1. Joint training: parameterize lm_loss_weight

    linting
    loodvn committed Sep 9, 2021
    a4c7be9

Commits on Mar 16, 2022

  1. ab1055d
  2. temp hehe

    loodvn committed Mar 16, 2022
    57b7fb7

Commits on Mar 18, 2022

  1. committing all ideas for parallelising the weights calculation for record, will delete the bad ones

    loodvn committed Mar 18, 2022
    9687b7c
  2. checking on O2 now

    loodvn committed Mar 18, 2022
    d65d435
  3. adding mapping files

    loodvn committed Mar 18, 2022
    0597e02
  4. running as array

    loodvn committed Mar 18, 2022
    636f88f
  5. changed logging dir

    loodvn committed Mar 18, 2022
    cc30589
  6. removed old training flags

    loodvn committed Mar 18, 2022
    b5af044
  7. some slurm script changes

    loodvn committed Mar 18, 2022
    87c30d5
  8. b9ab6b1
  9. print equality

    loodvn committed Mar 18, 2022
    7bf655f
  10. d8a0951
  11. running all proteins now

    loodvn committed Mar 18, 2022
    7611fd8
  12. 74 MSAs

    loodvn committed Mar 18, 2022
    d235b9d
  13. oops was still debugging

    loodvn committed Mar 18, 2022
    c29ddae
  14. using new MSAs

    loodvn committed Mar 18, 2022
    5993736
  15. wrong MSA location

    loodvn committed Mar 18, 2022
    2b66b67
  16. ad85d42
  17. also testing only 1 cpu

    loodvn committed Mar 18, 2022
    91e4ac0
  18. 8786baa
  19. 92a5d8b
  20. bugfixes

    loodvn committed Mar 18, 2022
    91efd61
  21. bugfixes

    loodvn committed Mar 18, 2022
    79b7030
  22. another bug

    loodvn committed Mar 18, 2022
    9a2ff26
  23. a024bb4
  24. c020bb8

Commits on Mar 21, 2022

  1. using multiprocessing + numba now, is roughly as fast as parallel numba, but both aren't scaling as well as expected..

    loodvn committed Mar 21, 2022
    c50aa89
  2. 0badb28
  3. typo

    loodvn committed Mar 21, 2022
    cebd2db
  4. moved all the tmp EVE vs EVCouplings checks out into calc_weights.py, where it'll be deleted soon

    also got rid of confusing calc_method flags, will just call specific methods from calc_weights.py
    loodvn committed Mar 21, 2022
    c915e5a
  5. going big, 40 cpus

    loodvn committed Mar 21, 2022
    289e84a

Commits on Mar 22, 2022

  1. moved all the tmp EVE vs EVCouplings checks out into calc_weights.py, where it'll be deleted soon

    loodvn committed Mar 22, 2022
    6429f69
  2. a3be96c
  3. 4b4ef45
  4. added flag option to train_VAE.py to fail if weights not found (useful to make sure that weights are precomputed on CPU)

    loodvn committed Mar 22, 2022
    6838046
  5. 93804fe
  6. 3a11b0c
  7. 417865f
  8. cleaned up one_hot_3D

    loodvn committed Mar 22, 2022
    0fe130c
  9. 99945e0
  10. check - editing readme

    loodvn committed Mar 22, 2022
    9a5484b
  11. 3fc7a2e
  12. a125835
  13. 4312346
  14. removed circular dependency in utils/weights

    refactored another one_hot_encoding
    loodvn committed Mar 22, 2022
    289c218
  15. e902012

Commits on Mar 23, 2022

  1. b8ad7c4
  2. eccc14b
  3. 7b1e5d1
  4. moved calc_weights to top level instead of nested in gen_alignment;

    also moved preprocess_MSA into its own function because it bugged me
    loodvn committed Mar 23, 2022
    fe0e95b
  5. bf45aba

Commits on Jul 28, 2022

  1. 95ca192
  2. merged with Marks-OATML master

    loodvn committed Jul 28, 2022
    1316094
  3. Merge remote-tracking branch 'marks/master'

    # Conflicts:
    #	train_VAE.py
    loodvn committed Jul 28, 2022
    c658d1c
  4. don't need weights for scoring

    loodvn committed Jul 28, 2022
    e0f3754
  5. Merge branch 'lood/speedup_weights'

    # Conflicts:
    #	EVE/VAE_model.py
    #	train_VAE.py
    loodvn committed Jul 28, 2022
    3efbda2
  6. 16c8c7b

Commits on Aug 2, 2022

  1. move constants out,

    added threshold_focus_cols_frac_gaps command line argument
    loodvn committed Aug 2, 2022
    c980752
  2. a153a1e
  3. b3f15c7
  4. b0f3467
  5. a0c3167
  6. 671669e
  7. c8917f9
  8. Merge remote-tracking branch 'origin/master'

    # Conflicts:
    #	scripts/2022_08_01_disorder_O2.sh
    loodvn committed Aug 2, 2022
    eec4a42
  9. added weight shape check, using threshold_focus_cols_frac_gaps = 1 since then it includes all sequences

    loodvn committed Aug 2, 2022
    e60ee9d

Commits on Aug 3, 2022

  1. 2c7338b

Commits on Aug 8, 2022

  1. 15abb33
  2. 534b0db

Commits on Aug 9, 2022

  1. turned DMS filename assertion into just a warning for now, need to figure out properly later

    loodvn committed Aug 9, 2022
    536b207
  2. 2cb5cd0

Commits on Aug 10, 2022

  1. 07d3984

Commits on Aug 11, 2022

  1. aeade79

Commits on Sep 23, 2022

  1. adpred scripts

    loodvn committed Sep 23, 2022
    f761149

Commits on Oct 18, 2022

  1. allowed to pass in an MSA file directly to calc_weights instead of a mapping file

    also added "identity" weights for completeness
    loodvn committed Oct 18, 2022
    af99627
  2. Merge branch 'lood/speedup_weights2'

    # Conflicts:
    #	EVE/VAE_model.py
    #	calc_weights.py
    #	compute_evol_indices.py
    #	data/mappings/example_mapping.csv
    #	examples/Step0_optional_calc_weights.sh
    #	examples/Step0_optional_calc_weights_slurm.sh
    #	train_VAE.py
    #	utils/data_utils.py
    #	utils/weights.py
    loodvn committed Oct 18, 2022
    bb902f7
  3. reformatted whitespace PEP8

    loodvn committed Oct 18, 2022
    f70712c
  4. 6043996
  5. 875a625
  6. 098ce58
  7. cb9aabd

Commits on Sep 13, 2023

  1. Weights calc:

    Added progress bar, weights-only calc mode
    loodvn committed Sep 13, 2023
    37375c4
  2. c8249f0
  3. f9c291c

Commits on Sep 14, 2023

  1. Streaming one-hot-encodings is working well

    Fallback to normal mode also works well
    loodvn committed Sep 14, 2023
    3709d7d

Commits on Sep 19, 2023

  1. e30f784

Commits on Feb 28, 2024

  1. 74238ea
  2. 455ffaf
  3. c801b52

Commits on Mar 6, 2024

  1. Computing one-hot encodings on the fly for evol_indices using dataloader, merged in changes from ProteinGym.

    Removed the aggregation methods for evol indices.
    loodvn committed Mar 6, 2024
    70f63e2

Commits on Mar 15, 2024

  1. Using dataloaders for train and validation, use multi-cpu weights by default, tested with DLG4

    (cherry picked from commit fcb7894)
    loodvn committed Mar 15, 2024
    e18c56f

Commits on Mar 16, 2024

  1. 3d48173
  2. deleted internal scripts

    loodvn committed Mar 16, 2024
    d129b81