Commit

paul-gauthier committed Jun 1, 2024
1 parent 26edbcc commit bc4d39d
Showing 4 changed files with 39 additions and 39 deletions.
8 changes: 4 additions & 4 deletions _posts/2024-05-31-both-swe-bench.md
@@ -22,10 +22,10 @@ that was reported recently.

[![SWE Bench results](/assets/swe_bench.svg)](https://aider.chat/assets/swe_bench.svg)

-Aider was benchmarked on 570 of the 2294 SWE Bench problems.
-These were the same
-[randomly selected 570 problems](https://github.com/CognitionAI/devin-swebench-results/tree/main/output_diffs) that
-[Devin used in their evaluation](https://www.cognition.ai/post/swe-bench-technical-report).
+Aider was benchmarked on the same
+[random 570](https://github.com/CognitionAI/devin-swebench-results/tree/main/output_diffs)
+of the 2294 SWE Bench problems that were used in the
+[Devin evaluation](https://www.cognition.ai/post/swe-bench-technical-report).
Please see the [references](#references)
for more details on the data presented in this chart.

Binary file modified assets/swe_bench.jpg
68 changes: 34 additions & 34 deletions assets/swe_bench.svg
2 changes: 1 addition & 1 deletion benchmark/swe_bench.py
@@ -54,7 +54,7 @@ def plot_swe_bench(data_file, is_lite):
     if is_lite:
         colors = ["#17965A" if "Aider" in model else "#b3d1e6" for model in models]
     else:
-        colors = ["#155F91" if "Aider" in model else "#b3e6a8" for model in models]
+        colors = ["#155F91" if "Aider" in model else "#b3d1e6" for model in models]

     bars = []
     for model, pass_rate, color in zip(models, pass_rates, colors):
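
For illustration, the color selection touched by this hunk can be sketched as a standalone function. This is a minimal sketch, not code from the repository: the helper name `pick_bar_colors` and the sample model names are hypothetical, and only the hex values and the `"Aider" in model` test are taken from the diff.

```python
def pick_bar_colors(models, is_lite):
    """Return one bar color per model (hypothetical helper, not in the repo)."""
    # Aider bars are highlighted; the highlight differs between the Lite and full charts.
    highlight = "#17965A" if is_lite else "#155F91"
    # After this commit, non-Aider bars use the same light blue in both charts.
    other = "#b3d1e6"
    return [highlight if "Aider" in model else other for model in models]


# Sample model names are made up for the example.
print(pick_bar_colors(["Aider + GPT-4o", "Devin"], is_lite=False))
# prints ['#155F91', '#b3d1e6']
```

Factoring out the shared non-Aider color makes the effect of the one-line change visible: the two branches now differ only in the Aider highlight.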