Suggestions for discussing a more organized code structure #162
Example projects (for examples of structuring):
I have been planning this for a while, but it is a bit non-trivial. There are multiple cases you have to think about:
Draft about the structure:
Code structure:
Other tasks:
Practical organization:
|
To revive this issue, I would like to add a new draft to tackle this:
Reasons for Restructuring
Known Issues
There are multiple approaches used to access the Sprint. Common examples are:
or
where crnn is the old repository name of RETURNN. Those variants should continue to work. If the SprintInterface would now be part of a subpackage (e.g.
When running RETURNN just from the
All scripts that import from RETURNN will have broken dependencies when files are moved to subdirectories, so a root-level
All modifications need to be compatible with Python 2, but no issues have appeared yet. [6]
Proposal
These changes included:
The test cases used were:
Todo:
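To illustrate the import-compatibility concern raised under Known Issues above, here is a minimal sketch of one possible approach: registering a relocated module under its old flat name in `sys.modules`, so that existing `import OldName` statements keep resolving. The names `OldInterface` and `newpkg.sub.interface` are placeholders chosen for illustration and are not the actual RETURNN layout; whether this is sufficient for the Sprint use case is not established here.

```python
import sys
import types

# Hypothetical mapping: old flat module name -> new dotted module name.
# These names are placeholders, not the real RETURNN module names.
OLD_TO_NEW = {"OldInterface": "newpkg.sub.interface"}


def install_aliases(old_to_new, modules=sys.modules):
    """Register each already-imported new module under its old name too."""
    for old_name, new_name in old_to_new.items():
        if new_name in modules:
            # Both names now refer to the same module object.
            modules.setdefault(old_name, modules[new_name])


# Stand-in for the relocated module, so the sketch runs on its own.
new_mod = types.ModuleType("newpkg.sub.interface")
new_mod.init = lambda: "initialized"
sys.modules["newpkg.sub.interface"] = new_mod

install_aliases(OLD_TO_NEW)

# An old-style import is now served from the sys.modules cache.
import OldInterface

assert OldInterface is new_mod
```

The sketch only uses `sys.modules` and `types.ModuleType`, which exist in both Python 2 and Python 3. In a real package the alias step would have to run before any old-style import is attempted, for example from a root-level module.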
|
But the problem here is not so much the abbreviation. Having "inverted" instead of "inv", or "segmental" instead of "seg", or "module" instead of "mod", that doesn't really change much, or does it? I think the names themselves might be better chosen, but I would still keep using very standard abbreviations ("inv", "seg", "mod", etc). Actually, I think your examples are mostly from outdated/unused code anyway. |
No, I think we should not put too much code into
Yea, sure. I was anyway speaking mostly about clear and very common abbreviations. |
This is exactly why we should avoid it. This abbreviation was intended to be "segment model" not "segment module". And even |
Well, "mod" is definitely not common for "model", but it is very common for "module", especially in the context of Python. I would say, for Python code, having "mod" for "module" is always totally ok. I.e., it is not ok for "model", so "model" should be written out. I think this is a pretty straightforward case.
You can extend it, but this is not about the abbreviation here. Call it I would still say, for common abbreviations, we should definitely keep using the abbreviations. This is really pretty standard in Python code, and I think we should stick to Python conventions. E.g. Python uses So, this effectively becomes more a discussion now about which cases are ok. Let's also not put too much priority on these outdated code files, which are very bad examples anyway. Also, this is a question about consistency. Should we use both "rec" and "recurrent", or only one of these? |
What do you mean by Theano setup? A training setup? I can provide you with that.
Yes, but does it bother you in this branch? If it does, then we can delete it. I still have a local copy of this backend and nobody is using it anyway (I just wouldn't want to delete it completely). In general, I think it is a good thing to do this code restructuring. We could also make the Theano RETURNN a standalone version which people have to check out explicitly, while the actual RETURNN version only contains the TF backend. |
This is also a good opportunity to unify the use of underscores and dashes under |
Not sure about that. We are very careful here to not break old configs and scripts. This would likely break some scripts. We could add symlinks for old name -> new name for the tools. But not sure if this is really nicer than just leaving them as they are. |
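The symlink idea mentioned above (old name -> new name for the tools) could look roughly like the following sketch. The file names `some-tool.py` and `some_tool.py` are made up for illustration, and the helper `add_compat_symlinks` is hypothetical; it is not part of RETURNN. Note that `os.symlink` may need extra privileges on Windows.

```python
import os
import tempfile


def add_compat_symlinks(tools_dir, renames):
    """For each (old_name, new_name) pair, create old_name as a symlink to new_name.

    Returns the list of old names for which a link was created.
    """
    created = []
    for old_name, new_name in renames:
        old_path = os.path.join(tools_dir, old_name)
        if os.path.lexists(old_path):
            continue  # never overwrite an existing file or link
        # A relative target keeps the link valid if the checkout is moved.
        os.symlink(new_name, old_path)
        created.append(old_name)
    return created


# Demo in a temporary directory, using placeholder file names.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "some_tool.py"), "w") as f:
    f.write("print('tool ran')\n")

created = add_compat_symlinks(demo_dir, [("some-tool.py", "some_tool.py")])
```

Old scripts that call the dashed name would then still find the tool, at the cost of two entries per renamed tool in the directory listing, which is the trade-off questioned above.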
I just saw that these two points are still open in the draft:
I would definitely favour to use |
There are still some open questions:
About the last point, there are currently inconsistencies ( I would also vote for introducing |
So, I guess we agreed on
I vote for the latter.
Yes. In any case consistent to the TF structure.
I would vote for no. This would just make it even more chaotic.
Ok.
Ok.
Ok.
Let's just be specific and discuss for the cases we currently have here. I guess "rec" is fine. What else needs to be discussed?
There is never a must. Maybe it would be nice to be consistent. But this will anyway not be possible 100%. E.g. you will find sometimes "net", sometimes "network", used for variable names, class names, comments, whatever... But I think that's ok. Let's not be too pedantic. In general, I prefer short names.
They are not shorter, the complete names are in fact longer now (esp when we mostly use absolute imports).
Which files are moved there? |
Yes, I would say
For Theano we have all Do we put |
Ok to
This is also a question of dependencies. Code becomes much easier to maintain if the dependencies between components (e.g. at the level of packages or modules) are kept minimal, and go in only one direction if possible. E.g. In some cases, I also like to keep the code even independent from
I think it is most consistent if The stuff in |
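The point about keeping dependencies minimal and one-directional can be checked mechanically. The following is a minimal sketch, assuming a hypothetical package name `backend_pkg` that some other module is not supposed to import; it parses source text with the standard `ast` module and reports forbidden top-level imports. The function names are illustrative and not taken from the project.

```python
import ast


def imported_top_level_modules(source):
    """Return the set of top-level module names imported by the given source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Relative imports (level > 0) stay inside the package and are skipped.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


def forbidden_imports(source, forbidden):
    """Return the sorted list of forbidden top-level modules that source imports."""
    return sorted(imported_top_level_modules(source) & set(forbidden))


# Placeholder source for a module that should stay backend-independent.
example_source = """
import os
from backend_pkg.layers import SomeLayer
"""
violations = forbidden_imports(example_source, ["backend_pkg"])
```

Such a check could run in the test suite over the files of a package that is meant to stay independent, so that an accidental reverse dependency is noticed early.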
This is not enough. You would also need something like this:
Not sure if this is enough then, and works like this... (See e.g. here.)
There are still some question marks in the proposal. |
Ok. So then this can be deleted/moved at some later point. I think all things should be covered now. EDIT: I also went over the pull-requests again, and 3 out of 4 are about tests and tools, so for those there should be no problem merging them in after restructuring. The last one is really old, and I am not sure what will happen to it. |
Note: This is ongoing, in branch restructure, and mostly done. We will merge it soon to master. Please check. |
We have performed the changes, and pushed them into master now. |
I understand that removing/deactivating the Theano backend is not an option right now.
I don't have enough insight into these parts to know whether moving the corresponding files into a separate folder (instead of adding a prefix to every file) would break something.
However, there appears to be some work on support for a PyTorch backend (at least there is a new branch with work on this).
Would it be possible to move this stuff to a separate folder directly instead of adding more files to the root folder?
In general, I think it might be a good idea to collect some ideas on how we could improve the structure of the code.
I would suggest moving the layers into a folder structure similar to the organization in the documentation by @JackTemaki. (see: https://returnn.readthedocs.io/en/latest/layer_reference/index.html#layer-types )