
TFX + PyTorch Example #156

Open
hanneshapke opened this issue Aug 2, 2022 · 10 comments

Comments

@hanneshapke
Contributor

hanneshapke commented Aug 2, 2022

There are a few TFX examples showing how to train scikit-learn or JAX models, but I haven't seen an example pipeline for PyTorch.

The pipeline could use a known dataset, e.g. MNIST: ingest the data via CsvExampleGen, run the standard statistics and schema steps, perform a pseudo transformation (a passthrough of the values) with the new PandasTransform component from tfx-addons, add a custom run_fn function for PyTorch, and then add a TFMA example.

Any thoughts?

@hanneshapke
Contributor Author

Proposal for the TFX Addons Example: #157

@sayakpaul
Contributor

Yes, please. If possible, let's demonstrate it with a model from Hugging Face with a PyTorch backend.

@rcrowe-google
Collaborator

One of the things that we will need for this is an ONNX extractor for Evaluator. Maybe we should break that out as a separate project?

@sayakpaul
Contributor

Could you elaborate on this a bit more?

@rcrowe-google
Collaborator

> One of the things that we will need for this is an ONNX extractor for Evaluator. Maybe we should break that out as a separate project?

> Could you elaborate on this a bit more?

My understanding is that for PyTorch developers ONNX is a common format for saving trained models, while TF's SavedModel format introduces friction. For non-SavedModel models, Evaluator needs an Extractor in order to generate predictions to measure; see, for example, the existing ones for scikit-learn and XGBoost.

@sayakpaul
Contributor

sayakpaul commented Mar 11, 2023

ONNX is definitely used, but I am not sure it is the norm as you mentioned. This document gives a good rundown of the serialization semantics in PyTorch: https://pytorch.org/docs/stable/notes/serialization.html

ONNX is definitely quite popular there (PyTorch has a direct ONNX exporter too). What I am gathering here is that we would make ONNX the serialization format for PyTorch models to make them work in a TFX pipeline. Is that so?

@rcrowe-google
Collaborator

> ... we make ONNX the serialization format for the PyTorch models

My thought is more that ONNX would be one of the serialization formats, which to me suggests that breaking it out as a separate project might make sense. We could also add Extractors for TensorRT, TorchScript, or whatever makes sense (and here I'm displaying my ignorance about what makes sense) and let users choose the one they need.

@sayakpaul
Contributor

Got it. Yeah I concur with your thoughts now.

Moreover, the reason it might make even more sense is that users might want to choose an Extractor in accordance with their deployment infra. For example, ONNX might be better for CPU-based deployment, while TensorRT would be better suited for a GPU-based runtime (although ONNX can use TensorRT as a backend as well).

@hanneshapke
Contributor Author

I think Wihan wrote a custom TFMA extractor for PyTorch. We had everything done up to the trainer when we shared the notebook with Wihan. Last time we talked, he was in the process of cleaning up his implementation. He said it worked end-to-end.

@rcrowe-google
Collaborator

> I think Wihan wrote a custom TFMA extractor for PyTorch. We had everything done up to the trainer when we shared the notebook with Wihan. Last time we talked, he was in the process of cleaning up his implementation. He said it worked end-to-end.

@wihanbooyse - That would be great! It might make sense to refactor the example to break out the extractor separately, and follow that up with some more extractors for other formats.
