convert_model.mnist_eg.tf errors, possibly due to different TF versions #35

Open
theta-lin opened this issue Apr 17, 2024 · 1 comment

@theta-lin

> python3 --version
Python 3.11.8
> pip list | grep tensorflow
tensorflow                   2.16.1

I encountered the errors below when executing python3 -m convert_model.mnist_eg.tf. Since the tensorflow version is unspecified in convert_model/requirements.txt, tensorflow may behave differently in my version than in the one the converter was written against. (The coremltools version is also unspecified, which might cause similar problems.)
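For what it's worth, pinning the versions in convert_model/requirements.txt would at least make the environment reproducible, along these lines (the exact versions are only my guess, not versions the project is known to work with):

tensorflow==2.15.*
coremltools==7.*

The full output of the failing run: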

2024-04-17 15:25:44.036395: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-04-17 15:25:44.040311: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-04-17 15:25:44.091677: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-04-17 15:25:46.018733: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-04-17 15:25:47.542628: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-04-17 15:25:47.544149: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2251] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/data/dku/fedcampus/FedKit/convert_model/mnist_eg/tf.py", line 32, in <module>
    tflite()
  File "/data/dku/fedcampus/FedKit/convert_model/mnist_eg/tf.py", line 25, in tflite
    save_model(model, SAVED_MODEL_DIR)
  File "/data/dku/fedcampus/FedKit/convert_model/tflite.py", line 66, in save_model
    parameters = model.parameters.get_concrete_function()
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/polymorphic_function.py", line 1251, in get_concrete_function
    concrete = self._get_concrete_function_garbage_collected(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/polymorphic_function.py", line 1221, in _get_concrete_function_garbage_collected
    self._initialize(args, kwargs, add_initializers_to=initializers)
  File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/polymorphic_function.py", line 696, in _initialize
    self._concrete_variable_creation_fn = tracing_compilation.trace_function(
                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/tracing_compilation.py", line 178, in trace_function
    concrete_function = _maybe_define_function(
                        ^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/tracing_compilation.py", line 283, in _maybe_define_function
    concrete_function = _create_concrete_function(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/tracing_compilation.py", line 310, in _create_concrete_function
    traced_func_graph = func_graph_module.func_graph_from_py_func(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/framework/func_graph.py", line 1059, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/polymorphic_function.py", line 599, in wrapped_fn
    out = weak_wrapped_fn().__wrapped__(*args, **kwds)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/polymorphic_function.py", line 1719, in bound_method_wrapper
    return wrapped_fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/autograph_util.py", line 52, in autograph_handler
    raise e.ag_error_metadata.to_exception(e)
  File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/autograph_util.py", line 41, in autograph_handler
    return api.converted_call(
           ^^^^^^^^^^^^^^^^^^^
  File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/autograph/impl/api.py", line 439, in converted_call
    result = converted_f(*effective_args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/__autograph_generated_file__ud6qml.py", line 12, in tf__parameters
    retval_ = {f'a{ag__.ld(index)}': ag__.converted_call(ag__.ld(weight).read_value, (), None, fscope) for index, weight in ag__.converted_call(ag__.ld(enumerate), (ag__.ld(self).model.weights,), None, fscope)}
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/__autograph_generated_file__ud6qml.py", line 12, in <dictcomp>
    retval_ = {f'a{ag__.ld(index)}': ag__.converted_call(ag__.ld(weight).read_value, (), None, fscope) for index, weight in ag__.converted_call(ag__.ld(enumerate), (ag__.ld(self).model.weights,), None, fscope)}
                                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: in user code:

    File "/data/dku/fedcampus/FedKit/convert_model/tflite.py", line 31, in parameters  *
        f"a{index}": weight.read_value()

    AttributeError: 'Variable' object has no attribute 'read_value'

After changing

f"a{index}": weight.read_value()

to

f"a{index}": weight.value.read_value()

this error is resolved, but I then encountered another error. (Its full log is below, after a short aside.)
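As an aside, a more version-tolerant way to write that line, assuming the only difference is that Keras 3 weights expose the wrapped tf.Variable via .value while older tf.Variable weights have read_value() directly, could be something like:

def read_weight(weight):
    # Older stacks: model.weights yields tf.Variable, which has read_value().
    if hasattr(weight, "read_value"):
        return weight.read_value()
    # TF 2.16 / Keras 3: weights are keras.Variable wrappers; the underlying
    # tf.Variable appears to be reachable via .value, which is what worked above.
    return weight.value.read_value()

Here is the second error: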

2024-04-17 15:54:33.916521: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-04-17 15:54:33.920357: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-04-17 15:54:33.969224: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-04-17 15:54:35.695073: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-04-17 15:54:37.321051: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-04-17 15:54:37.322647: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2251] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/data/dku/fedcampus/FedKit/convert_model/mnist_eg/tf.py", line 32, in <module>
    tflite()
  File "/data/dku/fedcampus/FedKit/convert_model/mnist_eg/tf.py", line 25, in tflite
    save_model(model, SAVED_MODEL_DIR)
  File "/data/dku/fedcampus/FedKit/convert_model/tflite.py", line 71, in save_model
    tf.saved_model.save(
  File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/saved_model/save.py", line 1392, in save
    save_and_return_nodes(obj, export_dir, signatures, options)
  File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/saved_model/save.py", line 1427, in save_and_return_nodes
    _build_meta_graph(obj, signatures, options, meta_graph_def))
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/saved_model/save.py", line 1642, in _build_meta_graph
    return _build_meta_graph_impl(obj, signatures, options, meta_graph_def)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/saved_model/save.py", line 1566, in _build_meta_graph_impl
    asset_info, exported_graph = _fill_meta_graph_def(
                                 ^^^^^^^^^^^^^^^^^^^^^
  File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/saved_model/save.py", line 933, in _fill_meta_graph_def
    signatures = _generate_signatures(signature_functions, object_map, defaults)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/saved_model/save.py", line 655, in _generate_signatures
    outputs = object_map[function](**{
              ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/saved_model_exported_concrete.py", line 45, in __call__
    export_captures = _map_captures_to_created_tensors(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/saved_model_exported_concrete.py", line 74, in _map_captures_to_created_tensors
    _raise_untracked_capture_error(function.name, exterior, interior)
  File "/data/dku/fedcampus/FedKit/backend/.venv/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/saved_model_exported_concrete.py", line 98, in _raise_untracked_capture_error
    raise AssertionError(msg)
AssertionError: Tried to export a function which references an 'untracked' resource. TensorFlow objects (e.g. tf.Variable) captured by functions must be 'tracked' by assigning them to an attribute of a tracked object or assigned to an attribute of the main object directly. See the information below:
	Function name = b'__inference_signature_wrapper_1685'
	Captured Tensor = <ResourceHandle(name="loss/total/10", device="/job:localhost/replica:0/task:0/device:CPU:0", container="Anonymous", type="tensorflow::Var", dtype and shapes : "[ DType enum: 1, Shape: [] ]")>
	Trackable referencing this tensor = <tf.Variable 'loss/total:0' shape=() dtype=float32>
	Internal Tensor = Tensor("1637:0", shape=(), dtype=resource)
8 restore test results.

According to answers such as https://stackoverflow.com/questions/73416907/model-save-tried-to-export-a-function-which-references-untracked-resource-eve, the use of static (class-level) members can cause this problem, but I don't know how to apply that fix to this project's code.
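The general rule behind that answer (and the AssertionError text itself) seems to be that every tf.Variable captured by an exported function must be reachable as an attribute of the object passed to tf.saved_model.save. A minimal, self-contained illustration of the rule (not this project's code; where the loss/total metric variable would need to be attached in this project is exactly what I have not figured out):

import tensorflow as tf

class Tracked(tf.Module):
    def __init__(self):
        super().__init__()
        # Tracked: the variable is an instance attribute of the module that
        # gets saved, so SavedModel can map the captured resource back to it.
        self.total = tf.Variable(0.0, name="total")

    @tf.function(input_signature=[tf.TensorSpec([], tf.float32)])
    def add(self, x):
        self.total.assign_add(x)
        return {"total": self.total.read_value()}

m = Tracked()
tf.saved_model.save(
    m,
    "/tmp/tracked_eg",
    signatures={"add": m.add.get_concrete_function()},
)
# Variables created somewhere that is not reachable from `m` (the situation
# the linked answer describes) trigger the same "untracked resource" error.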

@SichangHe (Contributor)

I did reproduce this. Unfortunately, I don't know how to solve it.

Ideas:

  1. Try the dependency versions pinned here: https://github.com/adap/flower/blob/main/examples/android-kotlin/gen_tflite/pyproject.toml
  2. Check out TFLite's latest on-device training example and compare it to the 2022 version this converter is based on (a rough sketch of that example's overall shape is below).
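For reference, this is roughly the shape of the official on-device training example, reconstructed from memory, so names, shapes, and compile settings are approximate, and it targets the pre-Keras-3 TF versions the tutorial was written for (the real example also exports save/restore checkpoint signatures, omitted here):

import tensorflow as tf

class TrainableModule(tf.Module):
    """Wraps a compiled Keras model with explicit train/infer signatures."""

    def __init__(self, model):
        super().__init__()
        self.model = model  # compiled Keras model; its optimizer/loss are reused below

    @tf.function(input_signature=[
        tf.TensorSpec([None, 28, 28], tf.float32),
        tf.TensorSpec([None, 10], tf.float32),
    ])
    def train(self, x, y):
        # One gradient step using the loss and optimizer set in compile().
        with tf.GradientTape() as tape:
            logits = self.model(x)
            loss = self.model.loss(y, logits)
        grads = tape.gradient(loss, self.model.trainable_variables)
        self.model.optimizer.apply_gradients(
            zip(grads, self.model.trainable_variables))
        return {"loss": loss}

    @tf.function(input_signature=[tf.TensorSpec([None, 28, 28], tf.float32)])
    def infer(self, x):
        return {"output": tf.nn.softmax(self.model(x))}

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10),
])
model.compile(
    optimizer=tf.keras.optimizers.SGD(),
    loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
)

module = TrainableModule(model)
saved_model_dir = "/tmp/on_device_training_eg"
tf.saved_model.save(
    module,
    saved_model_dir,
    signatures={
        "train": module.train.get_concrete_function(),
        "infer": module.infer.get_concrete_function(),
    },
)

converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,  # regular TFLite ops
    tf.lite.OpsSet.SELECT_TF_OPS,    # training ops need TF select ops
]
converter.experimental_enable_resource_variables = True
tflite_model = converter.convert()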
