add LLMEvaluator to JS #1186

isahers1 · 2024-11-07T01:42:26Z

No description provided.

hinthornw · 2024-11-07T06:12:08Z

js/src/evaluation/llm_evaluator.ts

+import * as uuid from "uuid";
+import { EvaluationResult, EvaluationResults, RunEvaluator } from "./evaluator.js";
+import type { Run, Example } from "../schemas.js";
+export class CategoricalScoreConfig {


We should add good typedocs :)

tried my best, lmk what needs improving

hinthornw · 2024-11-07T06:18:19Z

js/src/evaluation/llm_evaluator.ts

+    properties.score = {
+      type: "string",
+      enum: scoreConfig.choices,
+      description: `The score for the evaluation, one of ${scoreConfig.choices.join(", ")}.`,


This is really confusing to me tbh.

Why do we put categorical in the numeric score field? Why call it a score? Why not call it "category" or something obvious?

Score implies sortability / comparability to me

I changed the key to "value", but we also do this in the output parsing which I think solves this a little? not 100% sure, lmk what you think:

if ("choices" in this.scoreConfig) {
  const value = output.score;
  const explanation = output[this.reasoningKey];
  return {
    key: this.scoreConfig.key,
    value,
    comment: explanation,
    sourceRunId: output.runId
  };
}
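To make the distinction in this thread concrete, here is a minimal sketch of output parsing that reports categorical results under `value` and reserves `score` for numeric, comparable results. The config shapes, the `FeedbackResult` type, and the `toFeedback` helper are simplified stand-ins invented for illustration, and the sketch assumes the model's structured output uses a `value` key for categorical configs; none of this is the PR's actual code.

```typescript
// Hypothetical, simplified shapes; the real types live in the PR's llm_evaluator.ts.
interface CategoricalScoreConfig {
  key: string;
  choices: string[];
}

interface ContinuousScoreConfig {
  key: string;
  min: number;
  max: number;
}

type ScoreConfig = CategoricalScoreConfig | ContinuousScoreConfig;

interface FeedbackResult {
  key: string;
  score?: number; // numeric, so sortable and comparable
  value?: string; // categorical label, no ordering implied
  comment?: string;
}

function toFeedback(
  config: ScoreConfig,
  output: Record<string, unknown>,
  reasoningKey = "reasoning"
): FeedbackResult {
  const rawComment = output[reasoningKey];
  const comment = typeof rawComment === "string" ? rawComment : undefined;
  if ("choices" in config) {
    // Categorical: report the chosen label under `value`, never under `score`.
    return { key: config.key, value: String(output["value"]), comment };
  }
  // Continuous: the result is a number, so `score` is the appropriate field.
  return { key: config.key, score: Number(output["score"]), comment };
}
```

With this split, a consumer can sort or average on `score` without first checking whether an evaluator happened to be categorical.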

hinthornw · 2024-11-08T23:37:23Z

js/package.json

@@ -106,14 +106,15 @@
    "p-queue": "^6.6.2",
    "p-retry": "4",
    "semver": "^7.6.3",
-    "uuid": "^10.0.0"
+    "uuid": "^10.0.0",
+    "langchain": "^0.3.3",


We can't make langchain a dependency of langsmith - it would be cyclic and not all langsmith users use langchain.

I believe we'll have to set them as peerDependencies

Synced with Jacob to double check. Steps to handle this:

Remove langchain from dependencies.

Make the LLM evaluator a separate entrypoint (see traceable and the create-entrypoints.js script).

We should probably refrain from importing init chat model at the top level, since that requires langchain to be installed as well as the submodules. Maybe accept a chat model as an argument and have users initialize it themselves, and show that usage in our docs. Depending on core is, I think, more OK.
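A rough sketch of the "accept a chat model as an argument" idea follows. `ChatModelLike` is a structural stand-in defined only for this example (in the real package the parameter type would presumably come from a type-only import of the peer dependency), and `InjectedModelEvaluator`, `formatPrompt`, and `grade` are hypothetical names.

```typescript
// Structural stand-in for a chat model; an assumption made for this sketch.
interface ChatModelLike {
  invoke(input: string): Promise<{ content: string }>;
}

interface InjectedEvaluatorParams {
  promptTemplate: string;
  chatModel: ChatModelLike;
}

class InjectedModelEvaluator {
  protected readonly promptTemplate: string;
  protected readonly chatModel: ChatModelLike;

  // The caller builds the model and hands it in, so this module never has to
  // import a model factory or a provider package at load time.
  constructor(params: InjectedEvaluatorParams) {
    this.promptTemplate = params.promptTemplate;
    this.chatModel = params.chatModel;
  }

  formatPrompt(input: string): string {
    return this.promptTemplate.replace("{input}", input);
  }

  async grade(input: string): Promise<string> {
    const response = await this.chatModel.invoke(this.formatPrompt(input));
    return response.content;
  }
}
```

Because the evaluator only depends on the shape of the model, users who never import this entrypoint do not need langchain installed at all.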

jacoblee93 · 2024-11-14T22:24:10Z

js/package.json

    "openai": "^4.67.3",
    "prettier": "^2.8.8",
    "ts-jest": "^29.1.0",
    "ts-node": "^10.9.1",
    "typescript": "^5.4.5",
-    "zod": "^3.23.8"
+    "zod": "^3.23.8",
+    "@langchain/core": "^0.3.14",


Good to keep these sorted and avoid touching unnecessary lines

jacoblee93 · 2024-11-14T22:26:31Z

js/scripts/create-entrypoints.js

@@ -12,6 +12,7 @@ const entrypoints = {
  traceable: "traceable",
  evaluation: "evaluation/index",
  "evaluation/langchain": "evaluation/langchain",
+  "evaluation/llm": "evaluation/llm_evaluator",


Naming should match filepath

So rename file below to evaluation/llm.ts

jacoblee93 · 2024-11-14T22:42:46Z

js/src/evaluation/llm_evaluator.ts

+  ) {
+    try {
+      // Store the configuration
+      this.scoreConfig = scoreConfig;


More conventional to make this a constructor instead of a factory method

jacoblee93 · 2024-11-14T22:44:04Z

js/src/evaluation/llm_evaluator.ts

+    return variables;
+  }
+
+  private parseOutput(


Prefer protected instead of private to make subclassing possible
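As a small illustration of why `protected` matters here, the sketch below overrides a parsing hook in a subclass. `BaseEvaluator` and `ClampedEvaluator` are hypothetical classes; only the `parseOutput` method name is taken from the diff.

```typescript
class BaseEvaluator {
  // `protected` keeps the method off the public API but visible to subclasses.
  protected parseOutput(output: Record<string, unknown>): { score: number } {
    return { score: Number(output["score"]) };
  }

  evaluate(output: Record<string, unknown>): { score: number } {
    return this.parseOutput(output);
  }
}

class ClampedEvaluator extends BaseEvaluator {
  // This override (and the super call) compiles only because the base method
  // is protected; with `private`, TypeScript rejects the subclass.
  protected parseOutput(output: Record<string, unknown>): { score: number } {
    const { score } = super.parseOutput(output);
    return { score: Math.min(1, Math.max(0, score)) };
  }
}
```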

jacoblee93 · 2024-11-14T22:44:45Z

js/src/evaluation/llm_evaluator.ts

+
+  constructor() {}
+
+  static async create(params: LLMEvaluatorParams): Promise<LLMEvaluator> {


You definitely shouldn't have two methods for this, ideally you just use the constructor
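One way to read this suggestion: if setup needs no `await`, the empty constructor plus `static async create` pair can collapse into a single constructor that validates its input and builds the output schema. The sketch below assumes a categorical config only; `CategoricalConfig`, `ValueSchema`, and `CategoricalEvaluator` are made-up names, and the schema uses the `value` key discussed earlier in the review.

```typescript
interface CategoricalConfig {
  key: string;
  choices: string[];
}

interface ValueSchema {
  type: "object";
  properties: {
    value: { type: "string"; enum: string[]; description: string };
  };
  required: string[];
}

class CategoricalEvaluator {
  protected readonly scoreConfig: CategoricalConfig;
  protected readonly schema: ValueSchema;

  // All setup is synchronous, so a plain constructor is enough.
  constructor(params: { scoreConfig: CategoricalConfig }) {
    const { scoreConfig } = params;
    if (scoreConfig.choices.length === 0) {
      throw new Error("scoreConfig.choices must not be empty");
    }
    this.scoreConfig = scoreConfig;
    this.schema = {
      type: "object",
      properties: {
        value: {
          type: "string",
          enum: [...scoreConfig.choices],
          description: `The category for the evaluation, one of ${scoreConfig.choices.join(", ")}.`,
        },
      },
      required: ["value"],
    };
  }

  getSchema(): ValueSchema {
    return this.schema;
  }
}
```

Callers then have one obvious path, `new CategoricalEvaluator({ scoreConfig })`, and a half-initialized instance cannot exist.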

jacoblee93 · 2024-11-14T22:45:09Z

js/src/evaluation/llm_evaluator.ts

+  mapVariables?: (run: Run, example?: Example) => Record<string, any>;
+}
+
+export class LLMEvaluator implements RunEvaluator {


No docstring?

jacoblee93 · 2024-11-14T22:45:17Z

js/src/evaluation/llm_evaluator.ts

+
+export class LLMEvaluator implements RunEvaluator {
+  prompt: any;
+  mapVariables?: (run: Run, example?: Example) => Record<string, any>;


What are mapVariables?

jacoblee93 · 2024-11-14T22:46:21Z

js/src/tests/llm_evaluator.int.test.ts

+      max: 1,
+    }),
+    chatModel: CHAT_MODEL,
+    mapVariables: (run: any, _example?: any) => ({


Is there a better name for this we can use? mapVariables sounds like a map of variables or something and .map() has a specific meaning in JS (and most languages?)
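For context on what this option appears to do, judging from its signature in the diff: it takes a run and an optional example and returns the variables used to fill the prompt template. The sketch below uses simplified stand-ins for `Run` and `Example`, and `fillTemplate` plus the `question`/`answer`/`reference` keys are hypothetical.

```typescript
// Simplified stand-ins for the Run and Example types in ../schemas.js.
interface RunLike {
  inputs: Record<string, unknown>;
  outputs?: Record<string, unknown>;
}

interface ExampleLike {
  outputs?: Record<string, unknown>;
}

type VariableMapper = (run: RunLike, example?: ExampleLike) => Record<string, unknown>;

// Extracts the prompt variables from a run and, if present, its reference example.
const mapVariables: VariableMapper = (run, example) => ({
  question: run.inputs["question"],
  answer: run.outputs?.["answer"],
  reference: example?.outputs?.["answer"],
});

// Substitutes {name} placeholders; unknown names become empty strings.
function fillTemplate(template: string, variables: Record<string, unknown>): string {
  return template.replace(/\{(\w+)\}/g, (_match: string, name: string) =>
    String(variables[name] ?? "")
  );
}
```

Seen this way, the function is an extractor from run data to prompt inputs rather than a `.map()` over a collection, which is the ambiguity the comment points at.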

jacoblee93 · 2024-11-14T22:46:30Z

js/src/evaluation/llm_evaluator.ts

+interface LLMEvaluatorParams {
+  promptTemplate: string | [string, string][];
+  scoreConfig: ScoreConfig;
+  chatModel: BaseLanguageModel;


Should be BaseChatModel

isahers1 added 2 commits November 6, 2024 17:42

wip

a70190f

trace properly

d89b1b2

isahers1 changed the title from "wip" to "add LLMEvaluator to JS" Nov 7, 2024

isahers1 marked this pull request as ready for review November 7, 2024 02:29

hinthornw reviewed Nov 7, 2024

isahers1 added 14 commits November 7, 2024 07:18

will comments

7e77aa7

tests

c6ee092

fmt

1b0eae3

fmt

b22d5fb

package stuff

ec3921f

fix

b86e743

fix

4e6cebf

fix

9a0268b

fix

e6f217b

debugging

e02f3df

debugging

df2c0a5

debugging

f1b6eef

forgot to add a file

334451a

fmt

6b572dc

hinthornw reviewed Nov 12, 2024

isahers1 added 4 commits November 12, 2024 10:56

peer dependency

0e3e935

dependencies

821c695

gmt

5eb1e01

fmt

a7a60b8

jacoblee93 reviewed Nov 14, 2024
