-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add LLMEvaluator to JS #1186
base: main
Are you sure you want to change the base?
add LLMEvaluator to JS #1186
Conversation
import * as uuid from "uuid"; | ||
import { EvaluationResult, EvaluationResults, RunEvaluator } from "./evaluator.js"; | ||
import type { Run, Example } from "../schemas.js"; | ||
export class CategoricalScoreConfig { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add good typedocs :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tried my best, lmk what needs improving
js/src/evaluation/llm_evaluator.ts
Outdated
properties.score = { | ||
type: "string", | ||
enum: scoreConfig.choices, | ||
description: `The score for the evaluation, one of ${scoreConfig.choices.join(", ")}.`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really confusing to me tbh.
Why do we put categorical in the numeric score field? Why call it a score? Why not call it "category" or something obvious?
Score implies sortability / comparability to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the key to "value", but we also do this in the output parsing which I think solves this a little? not 100% sure, lmk what you think:
if ("choices" in this.scoreConfig) {
const value = output.score;
const explanation = output[this.reasoningKey];
return {
key: this.scoreConfig.key,
value,
comment: explanation,
sourceRunId: output.runId
};
}
js/package.json
Outdated
@@ -106,14 +106,15 @@ | |||
"p-queue": "^6.6.2", | |||
"p-retry": "4", | |||
"semver": "^7.6.3", | |||
"uuid": "^10.0.0" | |||
"uuid": "^10.0.0", | |||
"langchain": "^0.3.3", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't make langchain a dependency of langsmith - it would be cyclic and not all langsmith users use langchain.
I believe we'll have to set them as peerDependencies
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Synced with Jacob to double check. Steps to handle this:
- Rm from dependencies
- Make the llm evaluator a separate entrypoint (see traceablea nd the create_entrypoints.js script)
- We should probably refrain from importing init chat model from the top since that requires langchain to be installed as well as the sub modules. Maybe accept a chat model as an argument and have them initialize, and in our docs we show using that. Core I think more OK
"openai": "^4.67.3", | ||
"prettier": "^2.8.8", | ||
"ts-jest": "^29.1.0", | ||
"ts-node": "^10.9.1", | ||
"typescript": "^5.4.5", | ||
"zod": "^3.23.8" | ||
"zod": "^3.23.8", | ||
"@langchain/core": "^0.3.14", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to keep these sorted and avoid touching unnecessary lines
@@ -12,6 +12,7 @@ const entrypoints = { | |||
traceable: "traceable", | |||
evaluation: "evaluation/index", | |||
"evaluation/langchain": "evaluation/langchain", | |||
"evaluation/llm": "evaluation/llm_evaluator", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming should match filepath
So rename file below to evaluation/llm.ts
) { | ||
try { | ||
// Store the configuration | ||
this.scoreConfig = scoreConfig; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More conventional to make this a constructor instead of a factory method
return variables; | ||
} | ||
|
||
private parseOutput( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer protected instead of private to make subclassing possible
|
||
constructor() {} | ||
|
||
static async create(params: LLMEvaluatorParams): Promise<LLMEvaluator> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You definitely shouldn't have two methods for this, ideally you just use the constructor
mapVariables?: (run: Run, example?: Example) => Record<string, any>; | ||
} | ||
|
||
export class LLMEvaluator implements RunEvaluator { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No docstring?
|
||
export class LLMEvaluator implements RunEvaluator { | ||
prompt: any; | ||
mapVariables?: (run: Run, example?: Example) => Record<string, any>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are mapVariables?
max: 1, | ||
}), | ||
chatModel: CHAT_MODEL, | ||
mapVariables: (run: any, _example?: any) => ({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a better name for this we can use? mapVariables
sounds like a map of variables or something and .map()
has a specific meaning in JS (and most languages?)
interface LLMEvaluatorParams { | ||
promptTemplate: string | [string, string][]; | ||
scoreConfig: ScoreConfig; | ||
chatModel: BaseLanguageModel; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be BaseChatModel
No description provided.