refactor: work on programmatic interface, self-reviewing agent #199

ErikBjare · 2024-10-15T08:57:57Z

refactored LogManager into mutable manager and immutable Log dataclass
added wip treeofthought script

TODO

Manually test confirmation refactor

Important

Refactor logging system with new Log dataclass and add tree-branching conversation script.

Refactor LogManager:
- Introduced Log dataclass in logmanager.py for immutable message handling.
- Updated LogManager to use Log for managing conversation logs.
- Removed prepare_messages from LogManager, added as standalone function.
Scripts:
- Added treeofthoughts.py for tree-branching conversation evaluation.
Imports and Usage:
- Updated imports and usage of LogManager and Log in chat.py, cli.py, commands.py, server/api.py, and tools/chats.py.
- Replaced Conversation with ConversationMeta in cli.py and list_user_messages.py.
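
To illustrate the split described above, here is a minimal sketch of an immutable log paired with a mutable manager. It is not the code in gptme/logmanager.py: the `Message` fields, the tuple storage, and the `append`/`pop` helpers are assumptions chosen for the example. Only the names `Log` and `LogManager`, and the use of `dataclasses.replace`, come from this thread.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Message:
    # Hypothetical minimal message; the real class likely has more fields.
    role: str
    content: str


@dataclass(frozen=True)
class Log:
    """Immutable message sequence: every 'change' returns a new Log."""

    messages: Tuple[Message, ...] = ()

    def __len__(self) -> int:
        return len(self.messages)

    def __getitem__(self, index: int) -> Message:
        return self.messages[index]

    def append(self, msg: Message) -> "Log":
        # replace() copies the frozen dataclass with one field swapped out.
        return replace(self, messages=self.messages + (msg,))

    def pop(self) -> "Log":
        # Returns a new Log without the last message; the original is untouched.
        return replace(self, messages=self.messages[:-1])


class LogManager:
    """Mutable holder that swaps in a new immutable Log on each change."""

    def __init__(self, log: Optional[Log] = None):
        self.log = log if log is not None else Log()

    def append(self, msg: Message) -> None:
        self.log = self.log.append(msg)


manager = LogManager()
snapshot = manager.log  # stays valid and unchanged after later appends
manager.append(Message("user", "hello"))
print(len(snapshot), len(manager.log))
```

Because each `Log` is frozen, a script that explores several conversation branches can hold on to a snapshot and fork from it without copying or defensive locking.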

This description was created by Ellipsis for 11df31e. It will automatically update as commits are pushed.

ellipsis-dev

❌ Changes requested. Reviewed everything up to 36d1b7f in 1 minute and 29 seconds

Looked at 857 lines of code in 9 files
Skipped 0 files when reviewing.
Skipped posting 3 drafted comments based on config settings.

1. gptme/logmanager.py:6

Draft comment:
Ensure that the replace method is imported from dataclasses to avoid potential NameError issues.

from dataclasses import dataclass, field, replace

Reason this comment was not posted:
Comment did not seem useful.

2. gptme/server/api.py:113

Draft comment:
Ensure manager.log is not empty before accessing manager.log.messages to prevent potential AttributeError.

if manager.log:
    msgs = prepare_messages(manager.log.messages)
else:
    msgs = []

Reason this comment was not posted:
Comment did not seem useful.

3. scripts/treeofthoughts.py:23

Draft comment:
Ensure that _step always yields valid Message objects to prevent runtime errors when appending to log.
Reason this comment was not posted:
Comment did not seem useful.

Workflow ID: wflow_La8vX5FjzlZhjaLR

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev · 2024-10-15T08:59:33Z

gptme/logmanager.py


        # don't save backup branch if undoing a command
-        if not self[-1].content.startswith("/"):
+        if not self.log[-1].content.startswith("/"):


Add a check to ensure self.log is not empty before accessing self.log[-1] to prevent potential IndexError.

Suggested change

if not self.log[-1].content.startswith("/"):

if self.log and not self.log[-1].content.startswith("/"):
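
The guard in the suggestion relies on short-circuit evaluation: when the left operand of `and` is falsy, the index expression is never evaluated. The sketch below shows the same pattern on a plain list of strings; `should_save_backup` is a hypothetical helper, not a function from gptme. Note that the guard only works on a custom class if it defines `__len__` or `__bool__`; whether the `Log` dataclass in this PR does so is not shown in this thread.

```python
from typing import List


def should_save_backup(messages: List[str]) -> bool:
    """Return True when the last message exists and is not a slash command."""
    # bool(messages) is False for an empty list, so messages[-1] is skipped
    # and no IndexError can be raised.
    return bool(messages) and not messages[-1].startswith("/")


print(should_save_backup([]))                  # empty log: no backup
print(should_save_backup(["hello"]))           # regular message: backup
print(should_save_backup(["hello", "/undo"]))  # slash command: no backup
```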

codecov-commenter · 2024-10-15T08:59:39Z

❌ 1 Tests Failed:

Tests completed	Failed	Passed	Skipped
80	1	79	0

View the top 1 failed tests by shortest run time

tests.test_cli test_fileblock

Stack Traces | 0.659s run time

args = ['--name', 'test-58859-test_fileblock', '/impersonate ```patch hello/hello.py\n<<<<<<< ORIGINAL\nprint("hello")\n=======\nprint("hello world")\n>>>>>>> UPDATED\n```']
runner = <click.testing.CliRunner object at 0x7f940fd8fb80>

    @pytest.mark.slow
    def test_fileblock(args: list[str], runner: CliRunner):
        args_orig = args.copy()
    
        # tests saving with a ```filename.txt block
        tooluse = ToolUse("save", ["hello.py"], "print('hello')")
        args.append(f"/impersonate {tooluse.to_output()}")
        print(f"running: gptme {' '.join(args)}")
        result = runner.invoke(gptme.cli.main, args)
        assert result.exit_code == 0
    
        # read the file
        with open("hello.py") as f:
            content = f.read()
        assert content == "print('hello')\n"
    
        # test append
        args = args_orig.copy()
        tooluse = ToolUse("append", ["hello.py"], "print('world')")
        args.append(f"/impersonate {tooluse.to_output()}")
        print(f"running: gptme {' '.join(args)}")
        result = runner.invoke(gptme.cli.main, args)
        assert result.exit_code == 0
    
        # read the file
        with open("hello.py") as f:
            content = f.read()
        assert content == "print('hello')\nprint('world')\n"
    
        # test write file to directory that doesn't exist
        tooluse = ToolUse("save", ["hello/hello.py"], 'print("hello")')
        args = args_orig.copy()
        args.append(f"/impersonate {tooluse.to_output()}")
        print(f"running: gptme {' '.join(args)}")
        result = runner.invoke(gptme.cli.main, args)
        assert result.exit_code == 0
    
        # test patch on file in directory
        patch = '<<<<<<< ORIGINAL\nprint("hello")\n=======\nprint("hello world")\n>>>>>>> UPDATED'
        tooluse = ToolUse("patch", ["hello/hello.py"], patch)
        args = args_orig.copy()
        args.append(f"/impersonate {tooluse.to_output()}")
        print(f"running: gptme {' '.join(args)}")
        result = runner.invoke(gptme.cli.main, args)
        assert result.exit_code == 0
    
        # read the file
>       with open("hello/hello.py") as f:
E       FileNotFoundError: [Errno 2] No such file or directory: 'hello/hello.py'

.../gptme/tests/test_cli.py:182: FileNotFoundError

To view individual test run time comparison to the main branch, go to the Test Analytics Dashboard

ellipsis-dev

👍 Looks good to me! Incremental review on e63d525 in 33 seconds

Looked at 295 lines of code in 2 files
Skipped 0 files when reviewing.
Skipped posting 7 drafted comments based on config settings.

1. scripts/treeofthoughts.py:25

Draft comment:
Consider handling exceptions like subprocess.CalledProcessError to ensure robustness when running subprocess commands.
Reason this comment was not posted:
Confidence changes required: 50%
The use of subprocess.run without handling potential exceptions can lead to unhandled errors if the command fails. It's a good practice to handle exceptions like subprocess.CalledProcessError to ensure robustness.

2. scripts/treeofthoughts.py:31

Draft comment:
Consider handling exceptions like subprocess.CalledProcessError to ensure robustness when running subprocess commands.
Reason this comment was not posted:
Confidence changes required: 50%
The use of subprocess.run without handling potential exceptions can lead to unhandled errors if the command fails. It's a good practice to handle exceptions like subprocess.CalledProcessError to ensure robustness.

3. scripts/treeofthoughts.py:39

Draft comment:
Consider handling exceptions like subprocess.CalledProcessError to ensure robustness when running subprocess commands.
Reason this comment was not posted:
Confidence changes required: 50%
The use of subprocess.run without handling potential exceptions can lead to unhandled errors if the command fails. It's a good practice to handle exceptions like subprocess.CalledProcessError to ensure robustness.

4. scripts/treeofthoughts.py:97

Draft comment:
Consider handling exceptions like subprocess.CalledProcessError to ensure robustness when running subprocess commands.
Reason this comment was not posted:
Confidence changes required: 50%
The use of subprocess.run without handling potential exceptions can lead to unhandled errors if the command fails. It's a good practice to handle exceptions like subprocess.CalledProcessError to ensure robustness.

5. scripts/treeofthoughts.py:126

Draft comment:
Consider handling exceptions like subprocess.CalledProcessError to ensure robustness when running subprocess commands.
Reason this comment was not posted:
Confidence changes required: 50%
The use of subprocess.run without handling potential exceptions can lead to unhandled errors if the command fails. It's a good practice to handle exceptions like subprocess.CalledProcessError to ensure robustness.

6. scripts/treeofthoughts.py:187

Draft comment:
Consider handling exceptions like subprocess.CalledProcessError to ensure robustness when running subprocess commands.
Reason this comment was not posted:
Confidence changes required: 50%
The use of subprocess.run without handling potential exceptions can lead to unhandled errors if the command fails. It's a good practice to handle exceptions like subprocess.CalledProcessError to ensure robustness.

7. gptme/chat.py:164

Draft comment:
Use Union[Log, List[Message]] for type hinting instead of Log | list[Message] for better clarity and compatibility.
Reason this comment was not posted:
Confidence changes required: 50%
The current implementation of the step function in gptme/chat.py uses Log | list[Message] as a type hint for the log parameter. This can be improved for clarity and type safety by using Union[Log, List[Message]].
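
As a rough illustration of what the subprocess drafts above are asking for, the sketch below wraps `subprocess.run` and handles the two failure modes most likely to matter. `run_command` is a hypothetical helper and is not taken from scripts/treeofthoughts.py. One point worth keeping in mind: `subprocess.CalledProcessError` is only raised when `check=True` is passed; without it, a non-zero exit code is returned silently and the only exception to expect from a missing executable is `FileNotFoundError`.

```python
import subprocess
import sys
from typing import List, Optional


def run_command(cmd: List[str]) -> Optional[str]:
    """Run cmd and return its stdout, or None if it could not run or failed."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    except FileNotFoundError:
        # The executable itself does not exist on this machine.
        print(f"command not found: {cmd[0]}", file=sys.stderr)
        return None
    except subprocess.CalledProcessError as e:
        # Raised only because check=True was passed above.
        print(f"command failed with exit code {e.returncode}", file=sys.stderr)
        return None
    return result.stdout


# Usage: the current interpreter is used so the example runs anywhere.
print(run_command([sys.executable, "-c", "print('ok')"]))
```

Returning `None` is one design choice among several; a script that cannot proceed without the command output may prefer to let the exception propagate instead.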

Workflow ID: wflow_PxtchXuYJyA1Sxt8

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

👍 Looks good to me! Incremental review on 76b9abf in 48 seconds

Looked at 295 lines of code in 2 files
Skipped 0 files when reviewing.
Skipped posting 5 drafted comments based on config settings.

1. scripts/treeofthoughts.py:47

Draft comment:
Use triple backticks for code blocks in Python strings.

        context += f"

{f}\n"

Reason this comment was not posted:
Comment looked like it was already resolved.

2. scripts/treeofthoughts.py:25

Draft comment:
Consider adding exception handling for subprocess.run to handle potential errors from the git command.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment is relevant because subprocess.run is used in the new functions, and it can raise exceptions. Adding exception handling could improve the robustness of the code. However, the current usage does not use check=True, so the main concern would be handling FileNotFoundError. The comment is actionable and suggests a clear improvement to the code.
The comment does not specify which exceptions to handle or how to handle them, which might make it less actionable. Additionally, if the subprocess.run calls are expected to always succeed in the given environment, exception handling might be unnecessary.
Even if the environment is controlled, handling potential exceptions can prevent unexpected crashes and improve code robustness. The comment is a general suggestion that can be refined by the developer.
Keep the comment as it suggests a valid improvement to handle potential errors from subprocess.run, which is used in the new code.

3. scripts/treeofthoughts.py:97

Draft comment:
Consider adding exception handling for subprocess.run to handle potential errors from the make command.
Reason this comment was not posted:
Marked as duplicate.

4. scripts/treeofthoughts.py:187

Draft comment:
Consider adding exception handling for subprocess.run to handle potential errors from the git command.
Reason this comment was not posted:
Marked as duplicate.

5. scripts/treeofthoughts.py:204

Draft comment:
Ensure that the Log class supports the pop method or use an appropriate method to remove the last message.
Reason this comment was not posted:
Comment did not seem useful.

Workflow ID: wflow_jBkuZtgeBxNvQ5Nc



ellipsis-dev

❌ Changes requested. Incremental review on 11df31e in 38 seconds

Looked at 862 lines of code in 13 files
Skipped 0 files when reviewing.
Skipped posting 2 drafted comments based on config settings.

1. gptme/tools/save.py:9

Draft comment:
The ask_execute function is no longer used and should be removed to clean up the code.
Reason this comment was not posted:
Confidence changes required: 50%
The ask_execute function is no longer used in gptme/tools/save.py and should be removed to clean up the code.

2. gptme/tools/save.py:80

Draft comment:
The ask_execute function is no longer used and should be removed to clean up the code.
Reason this comment was not posted:
Confidence changes required: 50%
The ask_execute function is no longer used in gptme/tools/save.py and should be removed to clean up the code.

Workflow ID: wflow_MWodEYP7BDsjI8Z0

ellipsis-dev bot reviewed Oct 15, 2024

ErikBjare commented Oct 15, 2024

scripts/treeofthoughts.py (outdated, resolved)

ellipsis-dev bot reviewed Oct 15, 2024

ErikBjare mentioned this pull request Oct 15, 2024

Hooks? #156 (open)

ErikBjare force-pushed the dev/programmatic-api-and-treeofthoughts branch from e63d525 to 76b9abf on October 15, 2024 12:52

ErikBjare added 3 commits October 15, 2024 14:52

refactor: work on programmatic interface, refactored LogManager into mutable manager and immutable Log dataclass, added wip treeofthought script

b49f2ce

Apply suggestions from code review

7c6535a

fix: more fixes/improvements to treeofthoughts.py

76b9abf

ellipsis-dev bot reviewed Oct 15, 2024

refactor: refactor how confirmation works, enabling LLM-guided confirmation and simplifying confirmation support in server

11df31e

ellipsis-dev bot reviewed Oct 15, 2024

gptme/tools/python.py (outdated, resolved)

Update gptme/tools/python.py

1de2a02

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>