
Engine Reuse Fails with Different JSON Schemas - "Module has already been disposed" Error #560

Closed
SMarioMan opened this issue Sep 14, 2024 · 3 comments


@SMarioMan
Contributor

When using web-llm, reusing the same MLCEngine instance across multiple completion requests fails if the response schema changes between requests. Specifically, if we initialize the engine and generate a completion with one schema, a subsequent completion with a different schema causes the engine to throw:
Module has already been disposed
The engine can be reused for any number of requests as long as the schema stays the same or is omitted entirely; only switching to a different schema triggers the error. To maintain feature parity with the OpenAI API, the same MLCEngine instance should be able to handle multiple schemas.

This may be related to #486.

Below is my reproducer code:

reproducer.ts:

import * as webllm from "@mlc-ai/web-llm";

let engine: webllm.MLCEngine;
const selectedModel = "Llama-3.1-8B-Instruct-q4f32_1-MLC";

try {
    // Standard web-llm initialization.
    const initProgressCallback = (initProgress: webllm.InitProgressReport) => {
        console.log(initProgress);
    };
    engine = await webllm.CreateMLCEngine(selectedModel, {
        initProgressCallback: initProgressCallback,
    });
} catch (error) {
    console.error("Error initializing engine:", error);
    throw error;
}

export async function Completion(engine: webllm.MLCEngine, inputStr: string, 
                                 schemaStr?: string) {
    // Support an optional schema.
    let responseFormat: webllm.ResponseFormat | undefined = undefined;
    if (schemaStr) {
        responseFormat = { type: "json_object", schema: schemaStr };
    }
    // Standard OpenAI API style completion.
    const reply = await engine.chat.completions.create({
        model: selectedModel,
        response_format: responseFormat,
        messages: [{ role: "user", content: inputStr }],
    });
    console.log(inputStr, reply.choices[0].message.content);
    return null;
}

export { engine };

reproducer.astro:

---
---
<p>Check the console</p>
<script>
  import { engine, Completion } from "@utilities/reproducer";
  const schema1 = JSON.stringify({
    type: "object",
    properties: {
      reasoning: { type: "string" },
      result: { type: "string", enum: ["Yes", "No"] },
    },
    required: ["reasoning", "result"],
  });
  const schema2 = JSON.stringify({
    type: "object",
    properties: {
      reasoning: { type: "string" },
      result: { type: "number" },
    },
    required: ["reasoning", "result"],
  });

  // We can reuse the same engine for as many prompts as we want, so long as
  // they always use no schema or the first schema specified.

  // Properly uses schema1 to control output.
  Completion(engine, "Does the moon have its own light?", schema1);
  // Uses the same schema, so no issue.
  Completion(engine, "Do dolphins sleep with one eye open?", schema1);
  // Doesn't use a schema, so no problem.
  Completion(engine, "How many grams are in a pound?");

  // Uses a different schema on the same engine instance.
  // Uncaught (in promise) Error: Module has already been disposed
  Completion(engine, "How many continents are there on Earth?", schema2);
</script>

Example output:

Does the moon have its own light? {"reasoning": "No, the moon does not have its own light. It reflects the light from the sun.", "result": "No"}
reproducer.ts:32 Do dolphins sleep with one eye open? {"reasoning": "Dolphins are marine mammals that need to rest and sleep like all other animals. However, they have a unique way of sleeping due to their need to constantly be alert for predators and navigate their surroundings.", "result": "No"}
reproducer.ts:32 How many grams are in a pound? There are 453.592 grams in a pound.
index.js:2578 Uncaught (in promise) Error: Module has already been disposed
    at TVMArray.getHandle (index.js:2578:23)
    at Instance.setPackedArguments (index.js:3804:53)
    at GrammarFactory.packedFunc [as fGrammarSMFromTokenTable] (index.js:3874:22)
    at GrammarFactory.getGrammarStateMatcherFromTokenTable (grammar.ts:125:17)
    at LLMChatPipeline.<anonymous> (llm_chat.ts:522:31)
    at Generator.next (<anonymous>)
    at tslib.es6.mjs:121:69
    at new Promise (<anonymous>)
    at __awaiter (tslib.es6.mjs:117:10)
    at LLMChatPipeline.prefillStep (llm_chat.ts:446:33)
CharlieFRuan added a commit that referenced this issue Sep 23, 2024
Prior to this PR, reusing the same engine with a different schema would run into the error "Module has already been disposed". An example reproducing this is included in
#560.

This is because `this.tokenTable` is a `tvmjs.TVMObject` and will be
disposed after the scope ends.

We fix this by wrapping the call in
`this.tvm.detachFromCurrentScope()` and only disposing `this.tokenTable`
when we dispose the entire `LLMChatPipeline`.

Co-authored-by: SMarioMan <[email protected]>
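
For readers unfamiliar with tvmjs scoping, below is a minimal sketch of the pattern the commit describes, not the actual web-llm source. Objects created inside a beginScope()/endScope() pair are disposed when the scope ends, so anything that must outlive the scope has to be detached first and then disposed manually later. Here buildTokenTableFromTokenizer is a hypothetical stand-in for the real packed-function call, and the tvmjs import path may differ depending on how the runtime is packaged.

import * as tvmjs from "tvmjs";

// Hypothetical helper standing in for the packed-function call that
// produces the TVMObject-backed token table.
declare function buildTokenTableFromTokenizer(tvm: tvmjs.Instance): tvmjs.TVMObject;

class LLMChatPipelineSketch {
    private tokenTable?: tvmjs.TVMObject;

    constructor(private tvm: tvmjs.Instance) {}

    buildGrammarMatcher() {
        this.tvm.beginScope();
        // Without detachFromCurrentScope, the token table would be disposed
        // at endScope(), and any later use of it (e.g. building a grammar
        // state matcher for a second schema) would raise
        // "Module has already been disposed".
        this.tokenTable = this.tvm.detachFromCurrentScope(
            buildTokenTableFromTokenizer(this.tvm)
        );
        this.tvm.endScope(); // temporaries die here; tokenTable survives
    }

    dispose() {
        // Dispose the detached object only when the whole pipeline is torn
        // down, so the engine can be reused with a different schema until then.
        this.tokenTable?.dispose();
    }
}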
@CharlieFRuan
Contributor

This should be fixed by #571 and will be available in the next npm release!

@CharlieFRuan
Contributor

Should be fixed with npm version 0.2.66. Closing this for now; feel free to open another one if the issue persists!

@SMarioMan
Contributor Author

Thanks for the more proper fix! The issue has been resolved on my end.
