
create helper function #26225

Open
wants to merge 50 commits into
base: master

@smeet07 (Contributor) commented Apr 11, 2023

Create helper function to convert rank 2 tensor to rank 1 tensor

addresses #24902


@codecov bot commented Apr 11, 2023

Codecov Report

Merging #26225 (f5b25b5) into master (d3f1acb) will increase coverage by 0.28%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master   #26225      +/-   ##
==========================================
+ Coverage   71.20%   71.49%   +0.28%     
==========================================
  Files         787      849      +62     
  Lines      103330   102820     -510     
==========================================
- Hits        73581    73507      -74     
+ Misses      28252    27816     -436     
  Partials     1497     1497              
Flag Coverage Δ
python 80.32% <0.00%> (+0.48%) ⬆️

Flags with carried forward coverage won't be shown.

Files Changed Coverage Δ
...am/testing/benchmarks/cloudml/criteo_tft/criteo.py 0.00% <0.00%> (ø)

... and 292 files with indirect coverage changes


@github-actions (Contributor)

Assigning reviewers. If you would like to opt out of this review, comment assign to next reviewer:

R: @damccorm for label python.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

@damccorm (Contributor)

Run TFT Criteo Benchmarks

@damccorm (Contributor) left a comment

Thanks for doing this! Looks like there are some formatting/linting issues - you can get more detail on them by looking in the logs (linked in the checks section) or by running them locally with the commands described here - https://cwiki.apache.org/confluence/display/BEAM/Python+Tips#PythonTips-LintandFormattingChecks

feature = tf.squeeze(feature, axis=1)
return feature

fill_in_missing(feature)
Contributor

We probably need to reassign these back to feature, right?

Contributor Author

right
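
The reassignment point in this thread can be illustrated with a small sketch. It is not the PR's code: it uses NumPy as a stand-in for TensorFlow, treats NaN as the "missing" marker, and the body of `fill_in_missing` here is an assumption made only for illustration. What it shows is that a helper which returns a new array (as `tf.squeeze` returns a new tensor) has no effect unless its result is assigned back.

```python
import numpy as np


def fill_in_missing(feature, default_value=-1):
    """NumPy stand-in for the helper: fill gaps, then drop axis 1."""
    # NaN marks a missing value in this sketch; the PR's helper takes a sparse tensor.
    dense = np.where(np.isnan(feature), default_value, feature)
    # np.squeeze returns a new array and leaves its input untouched.
    return np.squeeze(dense, axis=1)


feature = np.array([[1.0], [np.nan], [3.0]])  # rank 2, shape (3, 1)

fill_in_missing(feature)            # result discarded, feature is unchanged
shape_without_reassign = feature.shape

feature = fill_in_missing(feature)  # result assigned back to feature
shape_with_reassign = feature.shape
```

Only the second call changes what later code sees, which is why the bare `fill_in_missing(feature)` call in the diff needs an assignment.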

@smeet07 (Contributor Author) commented Apr 12, 2023

> Thanks for doing this! Looks like there are some formatting/linting issues - you can get more detail on them by looking in the logs (linked in the checks section) or by running them locally with the commands described here - https://cwiki.apache.org/confluence/display/BEAM/Python+Tips#PythonTips-LintandFormattingChecks

I tried fixing them by referring to the console output, but it seems I'm still missing something. I'll look into it again.

@damccorm (Contributor)

Run TFT Criteo Benchmarks

@smeet07 (Contributor Author) commented Apr 12, 2023

> Run TFT Criteo Benchmarks

Could you guide me on how to do it?

@damccorm (Contributor) commented Apr 12, 2023

> could you guide me how to do it ?

Sorry for the confusion, that was a command for our automation (our Jenkins setup triggers builds based on comments in the PR). So just commenting "Run TFT Criteo Benchmarks" should eventually cause it to run. I'll comment again so it triggers against the most recent commit.

Seems like we're having some issues with our Jenkins setup, so it may take a while (or I may need to try again later)

@damccorm (Contributor)

Run TFT Criteo Benchmarks

@smeet07 (Contributor Author) commented Apr 12, 2023

Oh, can only reviewers run the tests, or can contributors as well?

@damccorm (Contributor)

Contributors can as well

@damccorm (Contributor)

Run TFT Criteo Benchmarks

@smeet07 (Contributor Author) commented Apr 13, 2023

@damccorm the TFT Criteo benchmark test is giving the following error:
ImportError: cannot import name 'builder' from 'google.protobuf.internal' (/home/jenkins/jenkins-slave/workspace/beam_CloudML_Benchmarks_Dataflow_PR/src/build/gradleenv/-1734967050/lib/python3.9/site-packages/google/protobuf/internal/__init__.py)

@damccorm (Contributor)

Let's run it again; I think it may be fixed by 0dcb26d.

@damccorm (Contributor)

Run TFT Criteo Benchmarks

@smeet07 (Contributor Author) commented Apr 14, 2023

Run TFT Criteo Benchmarks

@AnandInguva (Contributor)

Run TFT Criteo Benchmarks

This benchmark is currently failing because we updated protobuf to 4.x.x and TFT still relies on protobuf < 4. I am watching TFT on PyPI for the next release, which will make the test suite green.

@AnandInguva (Contributor)

On that note, I would suggest doing some testing locally for now.

@smeet07 (Contributor Author) commented Jul 8, 2023

Run PythonLint PreCommit

@smeet07 (Contributor Author) commented Jul 8, 2023

Run PythonLint PreCommit

@smeet07 (Contributor Author) commented Jul 12, 2023

@AnandInguva
I was getting this mypy error:
apache_beam/testing/benchmarks/cloudml/criteo_tft/criteo_test.py:30: error: Incompatible types in assignment (expression has type "None", variable has type "Callable[[Any, Any], Any]") [assignment]
so I changed the import of fill_in_missing to account for None, as:
fill_in_missing: Optional[Callable[[tf.sparse.SparseTensor, int], tf.Tensor]] = None
But now I am getting AttributeError: 'NoneType' object has no attribute 'sparse', because tf is not being imported correctly and it is imported in the same try block. Can you suggest a workaround for this or for the original error?

@AnandInguva (Contributor)

Thanks. I will take a look in some time.
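
One possible way around the AttributeError described above, offered as a sketch rather than as the fix adopted in this PR: on Python versions that evaluate module-level variable annotations eagerly, the expression `tf.sparse.SparseTensor` is evaluated at import time, so it fails when `tf` has been set to None. Quoting the annotation stores it as a string instead, and a looser `Callable[..., Any]` avoids referencing `tf` at all. The variable name `fill_in_missing_loose` is hypothetical.

```python
from typing import Any, Callable, Optional

try:
    import tensorflow as tf
except ImportError:
    tf = None  # TensorFlow is treated as optional in this sketch

# Option 1: a quoted annotation is kept as a string and not evaluated here,
# so a missing TensorFlow does not trigger an attribute lookup on None.
fill_in_missing: "Optional[Callable[[tf.sparse.SparseTensor, int], tf.Tensor]]" = None

# Option 2: a looser annotation that never mentions tf.
fill_in_missing_loose: Optional[Callable[..., Any]] = None
```

Both forms import cleanly whether or not TensorFlow is installed; how mypy treats the `tf = None` fallback is a separate question that this sketch does not address.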

try:
  import tensorflow as tf
  import tensorflow_transform as tft
except ImportError as e:
Contributor

You don't need this try/except here. It is needed only in the tests.

Contributor Author

Leaving it out was causing an import error, which is why I added it. I'll remove it.

Contributor

You should import criteo.py in tests under a try except block since criteo.py has a dependency on tft which might not be installed on all test environments.

Contributor Author

Yeah I've imported criteo.py in a try block
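
The pattern discussed in this thread (guard the import in the test module, not in the module under test) might look roughly like the following. This is a sketch under assumptions: `hypothetical_tft_dependency` is a made-up module name standing in for the TFT-dependent import, and the use of `unittest.skipIf` is one common way to skip, not something confirmed by this PR.

```python
import unittest

try:
    # Hypothetical stand-in for the TFT-dependent import in the test module.
    import hypothetical_tft_dependency as optional_dep
except ImportError:
    optional_dep = None


@unittest.skipIf(optional_dep is None, "optional dependency is not installed")
class FillInMissingTest(unittest.TestCase):
    def test_runs_only_with_dependency(self):
        # Reached only when the guarded import succeeded.
        self.assertIsNotNone(optional_dep)


# Run the test case programmatically to see the skip take effect.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(FillInMissingTest)
result = unittest.TestResult()
suite.run(result)
```

With the dependency absent, the test is reported as skipped instead of failing at import time, while the module under test can keep plain, unguarded imports.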

@@ -110,6 +113,20 @@ def make_input_feature_spec(include_label=True):
  return result


def fill_in_missing(feature, default_value=-1):
  if tf is not None:
Contributor

Remove this if condition as well.

  import tensorflow as tf
  from apache_beam.testing.benchmarks.cloudml.criteo_tft.criteo import fill_in_missing
except ImportError:
  tft = None # type: ignore[assignment]
Contributor

My bad. Remove the # type: ignore[assignment].

Mypy check should pass after that

import numpy as np
import pytest

from typing import Any, Callable, Optional
Contributor

Remove Any, Callable from imports since they are not being used anywhere.

The lint errors should be resolved after this.

@smeet07 (Contributor Author) commented Jul 25, 2023

@AnandInguva I'm having trouble finding the exact cause of the failing GitHub workflow checks because the console output is too large. What do you look for while browsing through the files?

@AnandInguva (Contributor)

The Jenkins tests are passing but the GHA tests are not. Can you merge the master branch into yours?

@AnandInguva (Contributor)

Run TFT Criteo Benchmarks

@smeet07 (Contributor Author) commented Sep 28, 2023

@AnandInguva I'm not able to merge it on this computer because git clone is not cloning the whole repo.

@AnandInguva (Contributor)

Sorry. I didn't get what you are saying. Can you provide which commands you are using?

@smeet07 (Contributor Author) commented Oct 2, 2023

> Sorry. I didn't get what you are saying. Can you provide which commands you are using?

For some reason, git clone is not copying the whole code on my new laptop.

@damccorm (Contributor)

@smeet07 any luck getting this fixed up? It sounds like your local git config is currently broken; my recommendation would be to try uninstalling/reinstalling.

You could also try opening this in a GitHub codespace - https://github.com/features/codespaces

@smeet07 (Contributor Author) commented Oct 29, 2023

forgot about this issue tbh, will get this fixed up in 1-2 days

@liferoad (Collaborator)

@smeet07 are you still working on this?

4 participants