
Use compact prompt when generating requests via Claude 3.5 #277

Open
actualwitch wants to merge 1 commit into main from fp-4088-tune-claude-prompts

actualwitch
Member

@actualwitch actualwitch commented Sep 24, 2024

this pr adds a slim version of the system prompt for ai request generation, plus a suite of tests to evaluate its effect on the results (using goose-quotes from examples). running these on every commit would be rather expensive, and the claude api is not fully deterministic even at temperature 0, so they are not run by default; instructions for running them are provided inside the test.
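Because the output can vary between runs even at temperature 0, one way to evaluate a prompt is to run each case several times and report a pass rate instead of a single pass/fail. The following is a minimal sketch of that idea, not the test suite from this PR: the `GeneratedRequest` shape, the `runTrials` helper, and the `RUN_PROMPT_EVALS` flag name are all assumptions made for illustration.

```typescript
// Hypothetical shape of a generated request; the real fpx types may differ.
type GeneratedRequest = {
  method: string;
  path: string;
  queryParams: Record<string, string>;
};

type TrialSummary = { trials: number; passed: number; passRate: number };

// Call `generate` several times and count how often `check` accepts the output.
async function runTrials(
  generate: () => Promise<GeneratedRequest>,
  check: (req: GeneratedRequest) => boolean,
  trials: number,
): Promise<TrialSummary> {
  let passed = 0;
  for (let i = 0; i < trials; i++) {
    const req = await generate();
    if (check(req)) passed++;
  }
  return { trials, passed, passRate: trials === 0 ? 0 : passed / trials };
}

// Opt-in switch so the paid API is never called by default.
// The variable name RUN_PROMPT_EVALS is an assumption, not taken from the PR.
function evalsEnabled(env: Record<string, string | undefined>): boolean {
  return env["RUN_PROMPT_EVALS"] === "1";
}
```

In a real suite, `generate` would wrap the call to the model, and `evalsEnabled(process.env)` could feed the test runner's skip mechanism so the expensive cases only run on request.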

actualwitch force-pushed the fp-4088-tune-claude-prompts branch 2 times, most recently from d488ad6 to 8dbd602 (September 24, 2024 12:43)

pkg-pr-new bot commented Sep 24, 2024


pnpm add https://pkg.pr.new/fiberplane/fpx/@fiberplane/studio@277
pnpm add https://pkg.pr.new/fiberplane/fpx/@fiberplane/hono-otel@277

commit: 48089a4

actualwitch force-pushed the fp-4088-tune-claude-prompts branch 3 times, most recently from 9a55362 to c268d09 (September 24, 2024 13:16)
@brettimus
Contributor

noice - just looked over the prompt changes, will test locally tomorra

@brettimus
Contributor

haven't fully finished the vibe check, but one thing i did notice is that it's a little flaky with query params now?

tested on a few routes in goose quotes that accepted query params and every once in a while it would add them, but most of the time not.

old prompt always adds query params fwiw.
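A check for this kind of regression could compare the query parameters in a generated URL against the ones a route is known to accept. This is only a sketch: the helper name `missingQueryParams` is made up here, and the route and parameter names in the usage note are placeholders, not actual goose-quotes routes.

```typescript
// Returns the expected query parameter names that are absent from a
// generated URL or path.
function missingQueryParams(generatedUrl: string, expected: string[]): string[] {
  // The base is only used to resolve relative paths such as "/api/items?x=1".
  const url = new URL(generatedUrl, "http://localhost");
  return expected.filter((name) => !url.searchParams.has(name));
}
```

For example, `missingQueryParams("/api/geese?limit=10", ["limit", "offset"])` returns `["offset"]`, and an empty result means every expected parameter was generated.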

other notes:

  • It's not taking much of a cue from the request history. If i just created a resource, it doesn't use the ID of that resource. However, this problem already appears to happen in the status quo with claude 👀 so not a regression. i think it might have to do with how chunky the history of requests is

  • It correctly determined multipart form data for a request! Wild

  • Hono url encoded forms were hard to sniff out, but i also think that's an existing problem

Will keep testing a bit more this eve

@brettimus
Contributor

separate question: you said you tested on the data we'd collected from our internal honcathons. do you have scripts available for those tests? how did you go about that?

@brettimus
Contributor

oh also! one more thing. it does tend to add the fpx-trace-id header into requests, which is a no-no. should probably add that instruction back to the system prompt

@actualwitch
Member Author

i had a bit in the prompt to emphasize reusing data from the history, but wasn't sure if that would be preferable so i left it out; will put it back. the fpx-trace-id is being added because we include it in the history of requests, so it will probably disappear if we filter the headers there.
i used https://github.com/actualwitch/experiment to explore the csv with our honcathon data.

actualwitch force-pushed the fp-4088-tune-claude-prompts branch 4 times, most recently from af8a30f to 9b1323a (October 10, 2024 19:24)
@actualwitch
Member Author

i have modified the inference logic to remove the trace header from history entries, so it shouldn't resurface in responses anymore. i also added a bit to integrate data from previous requests, so that should be resolved too.
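The header filtering described here could look roughly like the sketch below. It is not the code from this PR: the `HistoryEntry` shape and the `stripTraceHeader` name are assumptions, and only the `fpx-trace-id` header name comes from the discussion above.

```typescript
// Hypothetical shape of a request-history entry; the real fpx type may differ.
type HistoryEntry = {
  method: string;
  url: string;
  headers: Record<string, string>;
};

// Header named in the thread. HTTP header names are case-insensitive,
// so the comparison below is done in lowercase.
const TRACE_HEADER = "fpx-trace-id";

// Return copies of the history entries without the trace header, so the
// model never sees it and has nothing to echo back.
function stripTraceHeader(history: HistoryEntry[]): HistoryEntry[] {
  return history.map((entry) => {
    const headers: Record<string, string> = {};
    for (const name of Object.keys(entry.headers)) {
      if (name.toLowerCase() !== TRACE_HEADER) {
        headers[name] = entry.headers[name];
      }
    }
    return { method: entry.method, url: entry.url, headers };
  });
}
```

Filtering the input rather than adding a "do not include this header" instruction keeps the compact prompt short, and the original history entries are left untouched.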


2 participants