
Bug: Unable to create WebGLTexture #8423

Open
nickls opened this issue Oct 24, 2024 · 3 comments

nickls commented Oct 24, 2024

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow.js):
    Mostly stock inference code with a few small modifications.
  • OS Platform and Distribution: MacBook Pro (2020, Intel, 16 GB) running macOS 10.15.7
  • TensorFlow.js installed from: NPM
  • TensorFlow.js version: Reproduced with 3.9.0 and 4.21.0
  • Browser version:
    Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36
  • Tensorflow.js Converter Version:

Describe the current behavior
We're running a TF.js model in production that is a fine-tuned MobileNetV1. The model works perfectly for all of our users except one; we are unable to reproduce the issue locally or to detect it before it occurs so that we could switch to the CPU backend (a rough fallback sketch follows the problem list below). The issue started about a month ago, and we had not updated any of our TF code or components during that time.

Problem:

  • The model loads successfully using loadGraphModel.
  • When we attempt to warm the model, it goes into an infinite loop and the system becomes unresponsive.
  • The console reports "Unable to create WebGLTexture", but it is not clear whether this happens before or after the loop.
  • We have also seen "Error: Failed to link vertex and fragment shaders." when asking the user to reproduce the issue.
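A minimal sketch of the CPU fallback mentioned above, assuming the standard tf.setBackend / tf.getBackend / tf.ready API; it may not catch this particular failure, since the texture error seems to surface only once kernels actually execute:

import * as tf from "@tensorflow/tfjs";

// Prefer the WebGL backend, but fall back to CPU if it fails to initialize.
async function pickBackend() {
  const webglOk = await tf.setBackend("webgl").catch(() => false);
  if (!webglOk || tf.getBackend() !== "webgl") {
    await tf.setBackend("cpu");
  }
  await tf.ready();
  return tf.getBackend();
}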

You can see the stack trace from when the system goes into the loop (also attached):
image

Describe the expected behavior

  • The model loads successfully using loadGraphModel
  • The model warms within 50-500ms
  • The model can be used normally.

Here is what the stack trace looks like when the model loads and warms successfully:
image

Standalone code to reproduce the issue
We cannot reproduce the issue on our local systems, but we are open to any ideas on how to reproduce the problem.
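One untested idea (an assumption, not something we have confirmed reproduces the user's failure) is to hold enough large tensors to exhaust GPU memory on the WebGL backend so that texture creation eventually fails:

import * as tf from "@tensorflow/tfjs";

// Hold onto large tensors until the WebGL backend runs out of texture memory.
// Run in a throwaway tab; whether and how quickly this fails depends on the GPU.
async function exhaustGpuMemory() {
  const held = [];
  try {
    for (let i = 0; i < 500; i++) {
      const zeros = tf.zeros([4096, 4096]); // ~64 MB of float32
      const result = zeros.add(1);          // forces a WebGL texture for the output
      held.push(zeros, result);
      await result.data();                  // materialize the computation
    }
  } catch (e) {
    console.error("Allocation failed after", held.length, "tensors:", e);
  } finally {
    held.forEach((t) => t.dispose());
  }
}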

Other info / logs

tf.ENV.features:

{
  "IS_BROWSER": true,
  "IS_NODE": false,
  "DEBUG": false,
  "CPU_HANDOFF_SIZE_THRESHOLD": 128,
  "CANVAS2D_WILL_READ_FREQUENTLY_FOR_GPU": false,
  "IS_SAFARI": false,
  "IS_TEST": false,
  "SOFTWARE_WEBGL_ENABLED": false,
  "WEBGL_VERSION": 2,
  "HAS_WEBGL": true,
  "WEBGL_CHECK_NUMERICAL_PROBLEMS": false,
  "WEBGL_CPU_FORWARD": true,
  "WEBGL_PACK": true,
  "WEBGL_PACK_UNARY_OPERATIONS": true,
  "WEBGL_USE_SHAPES_UNIFORMS": false,
  "WEBGL_FORCE_F16_TEXTURES": false,
  "WEBGL_RENDER_FLOAT32_CAPABLE": true,
  "WEBGL_RENDER_FLOAT32_ENABLED": true,
  "WEBGL_SIZE_UPLOAD_UNIFORM": 4,
  "WEBGL_MAX_TEXTURE_SIZE": 16384,
  "WEBGL_MAX_SIZE_FOR_NARROW_TEXTURE": null,
  "WEBGL_AUTO_SQUARIFY_NARROW_TEXTURE_SHAPE": false,
  "WEBGL_ISNAN_CUSTOM": true,
  "ENGINE_COMPILE_ONLY_ON_DEMAND": true,
  "WEBGL_FLUSH_THRESHOLD": -1,
  "WEBGL_LAZILY_UNPACK": true,
  "WEBGL_BUFFER_SUPPORTED": true,
  "WEBGL_FENCE_API_ENABLED": true,
  "WEBGL_DELETE_TEXTURE_THRESHOLD": -1,
  "USE_SETTIMEOUTCUSTOM": false
}
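(The flags above can be dumped with something like the following; tf.env().getFlags() is assumed to return the same map as the older tf.ENV.features.)

import * as tf from "@tensorflow/tfjs";

// Log the resolved environment flags for bug reports.
console.log(JSON.stringify(tf.env().getFlags(), null, 2));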

We wrote a TF testing page to help isolate the issue; screenshots are below. These tests all pass for our dev and QA teams, but running the model fails for this user.

screenshot_1
screenshot_2

Trace-20240930T110703.json.zip

nickls added the type:bug (Something isn't working) label on Oct 24, 2024

nickls commented Oct 24, 2024

cc: @kevinwoolfolk97

shmishra99 self-assigned this on Oct 24, 2024

nickls commented Oct 24, 2024

Our loading and warming code:

model = await loadGraphModel(this.customModelPath);
...
tf.tidy(() => {
  const results = this.model.predict(
    tf
      .zeros([Detector.IMG_SIZE, Detector.IMG_SIZE, 3], "float32")
      .expandDims(0)
  );

  results.data().then(() => {
    this.setModelState("warmed");
  });
});
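For reference, a variant of the same warm-up (same model and Detector.IMG_SIZE names, and assuming predict returns a single tensor) that returns the output from tf.tidy, since tf.tidy disposes every tensor created in its callback as soon as the callback returns:

// Return the prediction from tf.tidy so it survives the cleanup, then read and
// dispose it explicitly. (Assumes predict returns a single tensor, as above.)
const output = tf.tidy(() =>
  this.model.predict(
    tf
      .zeros([Detector.IMG_SIZE, Detector.IMG_SIZE, 3], "float32")
      .expandDims(0)
  )
);
await output.data(); // compiles the WebGL programs and runs the graph once
output.dispose();
this.setModelState("warmed");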


nickls commented Nov 4, 2024

@shmishra99 -- Any ideas on this issue, or anything we can do to help debug it?
