(Python) Validate code components extraction #20

Open
betogaona7 opened this issue Aug 10, 2023 · 1 comment

betogaona7 commented Aug 10, 2023

For Python documents, we can validate (or replace) GPT's extraction of code components using the ast module from the standard library. For example:

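import ast

def extract_classes_and_functions(source_code):
    """Return the class and function definition nodes found in source_code."""
    parsed_tree = ast.parse(source_code)

    classes = []
    functions = []

    # ast.walk visits every node, so nested classes and methods are included.
    for node in ast.walk(parsed_tree):
        if isinstance(node, ast.ClassDef):
            classes.append(node)
        # Also match "async def", which ast.FunctionDef alone would miss.
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            functions.append(node)

    return classes, functions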
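
As a rough sketch of what the validation step could look like, the names GPT reports can be compared against the names ast finds. The function name validate_extraction and the shape of the GPT output (a plain list of names) are assumptions for illustration, not part of the current pipeline.

```python
import ast

def validate_extraction(source_code, gpt_names):
    """Compare names reported by GPT (hypothetical input) with names found by ast."""
    tree = ast.parse(source_code)
    # Collect every class and function name, including methods and nested definitions.
    ast_names = {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef))
    }
    reported = set(gpt_names)
    return {
        # Defined in the code but not reported by GPT.
        "missing": sorted(ast_names - reported),
        # Reported by GPT but not defined in the code.
        "unexpected": sorted(reported - ast_names),
    }
```

Two empty lists would mean GPT's extraction agrees with the parser. Note that ast.parse raises SyntaxError on invalid Python, so a caller would probably want to treat that case as "cannot validate" rather than as a mismatch.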

betogaona7 added the enhancement and research labels Aug 10, 2023

betogaona7 commented Aug 10, 2023

This is language-dependent, so it might work better as part of a set of validation functions selected according to the user's input, rather than as an optimization in the pipeline. Right now, the extraction is language-agnostic, limited only by the programming languages supported by LangChain's code splitter: https://python.langchain.com/docs/modules/data_connection/document_transformers/text_splitters/code_splitter
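
A minimal sketch of how such a set of per-language validators might be organized, assuming the language arrives as a plain string from the user's input. The names VALIDATORS and validate_components are hypothetical, and only a Python entry is shown; languages without a validator fall through so the extraction stays language-agnostic.

```python
import ast

def _python_names(source_code):
    """Names of all classes and functions that ast finds in Python source."""
    tree = ast.parse(source_code)
    return {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef))
    }

# Hypothetical registry: language name -> function returning the expected names.
VALIDATORS = {"python": _python_names}

def validate_components(language, source_code, extracted_names):
    """Return True/False if a validator exists for the language, else None."""
    extractor = VALIDATORS.get(language.lower())
    if extractor is None:
        # No validator registered: leave the extracted result unchecked.
        return None
    return set(extracted_names) == extractor(source_code)
```

Returning None for unsupported languages keeps validation optional, so it would not narrow the set of languages the pipeline already handles.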

betogaona7 self-assigned this Aug 13, 2023