Shape related code updated to address TODO(b/208879020) #270

Open
wants to merge 3 commits into
base: master

pritamdodeja

Summary of change

Shape related code has been updated to treat each observation as a tensor of shape (1,).

Details

The feature spec is updated to use shape (1,), and the schema changes accordingly. education-num is now treated as a dense tensor instead of a sparse one: it may have missing values, but its length does not vary, so treating it as a RaggedTensor is not warranted. transform_dataset is updated to reshape the raw data so that each observation has shape (1,) before it passes through tft_layer.

This pull request includes pr268. I am open to making the two independent of each other, and to any other feedback. I would also like to make a notebook version of this example that walks through the entire lifecycle of the workflow in the context of tft. The details are in that pull request, but I would like to expand it to be more instructive through interactivity.
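
To illustrate the shape handling described above, here is a minimal sketch rather than the code in this pull request. It assumes TensorFlow 2.x; the column subset, the sample values, and the helper name `reshape_batch` are made up for illustration and do not come from the example's source.

```python
import tensorflow as tf

# Illustrative subset of numeric columns (assumption, not the full census schema).
NUMERIC_KEYS = ["age", "education-num"]

# Each observation of a feature is declared as a dense tensor of shape (1,).
feature_spec = {
    key: tf.io.FixedLenFeature([1], tf.float32) for key in NUMERIC_KEYS
}


def reshape_batch(batch):
    """Hypothetical helper: turn each (batch,) column into (batch, 1)."""
    return {key: tf.reshape(value, [-1, 1]) for key, value in batch.items()}


# A made-up raw batch in which every column is a flat vector of length 3.
raw_batch = {
    "age": tf.constant([39.0, 50.0, 38.0]),
    "education-num": tf.constant([13.0, 13.0, 9.0]),
}

reshaped = reshape_batch(raw_batch)
for key, value in reshaped.items():
    print(key, value.shape)
```

After the reshape, every observation in a column is a tensor of shape (1,), which matches what the feature spec declares, so the batch can be handed to a layer that expects that layout.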

When read_raw_data_for_training is set to False in the call to the main
function, common.transform_data was still being called on the raw train
and test data.  This fix moves the transformation into the block where
read_raw_data_for_training is True.  The scenario here is that the data
has already been preprocessed, and the user wishes to reuse that
preprocessed data.
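
The control-flow change described in this commit message can be sketched as follows. This is a simplified illustration, not the example's actual code: `transform_data` here is a hypothetical stand-in for common.transform_data, and the file names, signature, and return value are assumptions.

```python
# Records every call so the effect of the flag is easy to observe.
transform_calls = []


def transform_data(train_file, test_file, working_dir):
    """Hypothetical stand-in for common.transform_data."""
    transform_calls.append((train_file, test_file, working_dir))


def main(working_dir, read_raw_data_for_training=True):
    if read_raw_data_for_training:
        # Raw data is transformed only when the caller asks for it.
        transform_data("adult.data", "adult.test", working_dir)
    # Otherwise the previously transformed data in working_dir is reused.
    return "training from data in " + working_dir


main("/tmp/census", read_raw_data_for_training=False)
print(len(transform_calls))  # no transformation was run

main("/tmp/census", read_raw_data_for_training=True)
print(len(transform_calls))  # the transformation ran once
```

With the transformation inside the conditional, a run that sets the flag to False never touches the raw files.
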

Since this is tabular data, the code has been updated to treat it as
such.  The net result is simpler shape-related code.  education-num is
treated as dense here instead of sparse, as it was before.  It might
have missing values in the data, so some sort of imputation might be
needed.