Refactor codebase dictionary into CodeBase class #98
Represents a code base as a combination of:
- A list of (source) directories; and
- A list of exclude patterns.

The intent of this object is to replace usages of untyped dictionaries to store code base information, to improve documentation and usability.

Signed-off-by: John Pennycook <[email protected]>
Signed-off-by: John Pennycook <[email protected]>
Since a lot of the functionality driven by the tests has not yet been updated to use Path, a lot of casts between Path and string are introduced here. These casts will eventually be removed, but require changes to other functionality.

Signed-off-by: John Pennycook <[email protected]>
A side-effect of tracking all source files in the code base is that analysis can now pick up unexpected files that were previously never encountered in a compilation database. For example, any C++ files automatically generated by CMake will be identified as unused code if codebasin is invoked in a directory containing both build/ and src/ directories.

This commit updates the documentation to highlight the importance of running codebasin in the right directory (or otherwise separating build and src).

Signed-off-by: John Pennycook <[email protected]>
Signed-off-by: John Pennycook <[email protected]>
The recent failure isn't related to my latest push (which was a simple formatting change). The issue appears to be that one of the tests is relying on behavior that isn't guaranteed: the I'm not sure what we should do here, so I'm interested in your opinion @laserkelvin. The simplest thing to do would probably be to document that we can't guarantee that we'll visit files in any particular order, and then update the test to use |
You need to run |
Signed-off-by: John Pennycook <[email protected]>
rglob is not guaranteed to return a sorted list, and the output is OS-dependent. We may want to revisit this decision in future, but it will be easier to further constrain the behavior of the CodeBase iterator than to remove functionality that users rely on.

Signed-off-by: John Pennycook <[email protected]>
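A test can avoid depending on the order in which files are visited by comparing results as sets (or by sorting both sides first). The following is a minimal sketch of that idea using only `pathlib`; the file names are made up for illustration, and it is not taken from the codebasin test suite.

```python
import tempfile
from pathlib import Path

# Create a few throwaway files; the names are purely illustrative.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ["b.cpp", "a.cpp", "c.h"]:
        (root / name).write_text("")

    # rglob makes no promise about ordering, so do not compare this
    # list directly against a hard-coded list.
    visited = [p.name for p in root.rglob("*")]

expected = {"a.cpp", "b.cpp", "c.h"}

# Option 1: compare as sets, ignoring order entirely.
assert set(visited) == expected

# Option 2: sort both sides before comparing.
assert sorted(visited) == sorted(expected)
```

Comparing sets also ignores duplicates, so the sorted comparison is the stricter of the two if a file must be visited exactly once.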
finder.find() expects a string for compatibility with legacy interfaces. Performing the cast inside of the function is less error-prone, and will mean fewer updates later when all of the strings are replaced with Paths.

Signed-off-by: John Pennycook <[email protected]>
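The pattern described here, casting once inside a function rather than at every call site, might look roughly like the sketch below. Both `legacy_find` and `find_file` are hypothetical names invented for illustration; they do not reflect the actual signature of `finder.find()`.

```python
import os
from pathlib import Path


def legacy_find(filename):
    """Hypothetical legacy routine that only understands strings."""
    if not isinstance(filename, str):
        raise TypeError("legacy_find expects a str")
    return os.path.normpath(filename)


def find_file(path):
    """Hypothetical wrapper: accepts str or Path and casts in one place."""
    return legacy_find(str(path))


# A Path and the equivalent string never compare equal, which is why
# string-based internals need an explicit cast somewhere.
as_path = Path("src") / "main.cpp"
as_string = os.path.join("src", "main.cpp")
assert as_path != as_string

# Callers can pass either representation without casting themselves.
assert find_file(as_path) == find_file(as_string)
```

Keeping the cast in one place means that only the wrapper needs to change once the legacy routine accepts `Path` objects directly.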
Related issues

- Closes #86 (Track all source files in the source directory). A `CodeBase` tracks all source files in the directories.
- Closes #66 (Paths in tests should not be relative to the root). Since all tests had to be rewritten to use the `CodeBase` class, I updated the paths.
- Progress towards #58 (Explore use of pathlib instead of os.path). `CodeBase` uses `pathlib` internally to store and manipulate paths. The external interfaces still accept and return strings for now, but this is only temporary. We can move away from strings entirely once all interfaces accept `Path`.

Proposed changes

- `CodeBase` class storing all information about which directories make up a code base and which files should be excluded from analysis.
- `CodeBase` where appropriate: whether a file should be excluded from analysis is now implemented via `__contains__`; and listing the contents of a code base is now implemented via `__iter__`.
- `CodeBase`. Note that most of the changes here are actually related to casting between `Path` and strings, required because some legacy internals do not consider these representations to be equivalent.

Note that although most uses of `CodeBase` here only use a single directory, the intent is to enable a list of explicit directories to be passed in the future as: ... in order to support analysis of disjoint codebases, and to allow `codebasin` to be run from any directory.
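To make the `__contains__` and `__iter__` design concrete, here is a minimal sketch of a class with the same shape. `CodeBaseSketch` is a hypothetical stand-in, not the class added in this PR: the constructor signature, the use of `fnmatch` for exclude patterns, and matching patterns against paths relative to each directory are all assumptions made for illustration.

```python
import fnmatch
import tempfile
from pathlib import Path


class CodeBaseSketch:
    """Hypothetical stand-in for CodeBase; not the codebasin implementation."""

    def __init__(self, *directories, exclude_patterns=()):
        # Paths are stored internally as pathlib.Path objects.
        self.directories = [Path(d).resolve() for d in directories]
        self.exclude_patterns = list(exclude_patterns)

    def __contains__(self, path):
        # Accept str or Path; a file is "in" the code base if it lives
        # under one of the directories and matches no exclude pattern.
        path = Path(path).resolve()
        for directory in self.directories:
            if directory not in path.parents:
                continue
            relative = path.relative_to(directory).as_posix()
            # Assumption: patterns are shell-style globs on relative paths.
            return not any(
                fnmatch.fnmatch(relative, pattern)
                for pattern in self.exclude_patterns
            )
        return False

    def __iter__(self):
        # Yield strings for now, mirroring the temporary string interfaces.
        # No ordering is promised, because rglob does not promise one.
        for directory in self.directories:
            for candidate in directory.rglob("*"):
                if candidate.is_file() and candidate in self:
                    yield str(candidate)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    (root / "src").mkdir()
    (root / "build").mkdir()
    (root / "src" / "main.cpp").write_text("int main() {}\n")
    (root / "build" / "generated.cpp").write_text("// generated\n")

    codebase = CodeBaseSketch(root, exclude_patterns=["build/*"])
    in_src = str(root / "src" / "main.cpp") in codebase
    in_build = str(root / "build" / "generated.cpp") in codebase
    found = sorted(Path(p).name for p in codebase)
```

In this sketch, excluding `build/*` keeps the generated file out of the listing even though it sits under the same root as `src/`, which is one way to separate build output from sources when both live in one directory.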