feat: Add spark-connect tests suite and run existing tests with SC #241
Comments
Hi @SemyonSinchenko - are you okay if I give this one a try? I would need some guidance, but I would like to try the first draft myself.
@nijanthanvijayakumar This one is a little tricky. It requires you to download and run a Spark Connect server before running the tests and also use
Feel free to ask me if you face any problems with it.
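For the "run the server before running tests" step, one possible building block is a small readiness check that a CI job or test fixture calls after launching the server. This is only a sketch using the standard library: the helper name `wait_for_port` is hypothetical, and the default port 15002 is assumed to be where the Spark Connect server listens (its usual default). Adjust both if the setup differs.

```python
import socket
import time


def wait_for_port(host: str = "localhost", port: int = 15002, timeout: float = 60.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: it only checks that something is listening,
    not that the listener is actually a Spark Connect server.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means a process is accepting connections.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Nothing is listening yet; wait briefly and retry.
            time.sleep(0.5)
    return False
```

A test session could call `wait_for_port()` once at start-up and abort early with a clear message if it returns `False`, instead of letting every test fail on a connection error.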
Thank you; will definitely do. A few initial questions:
BUT! To avoid very big PRs that are hard to review, please focus first on the CI/CD and the test suite. You may create a single test for that.
This comment was marked as resolved.
@SemyonSinchenko - an update on the above comment
I will raise a PR for you to review. Please let me know your comments on that.
@nijanthanvijayakumar I think you may continue to work on this issue. If you can, try to run all the existing tests with Spark Connect. Feel free to ask me if you face any issues.
Thank you @SemyonSinchenko. I will work on integrating the test cases and create a new PR.
@SemyonSinchenko - sorry I have been away from my computer for a while. I was able to fix the However, there are a couple of other test cases/functions that were failing too. Following are those:
=========================== short test summary info ============================
FAILED tests/test_functions.py::test_array_choice - TypeError: Unsupported Data Type Column
FAILED tests/test_transformations.py::test_sort_struct_nested_with_arraytypes_nullable - chispa.schema_comparer.SchemasNotEqualError:
@nijanthanvijayakumar Thank you! Could you post a more detailed stack trace? Or even run the tests in CI in your fork and give me a link to the logs?
Thank you @SemyonSinchenko. Yep, here they are:
@nijanthanvijayakumar It seems to me that you found an interesting behavior. Let's pause it for now.
Thank you @SemyonSinchenko; I tried fixing this for hours and couldn't. Keen to know/learn more about the behaviour here.
It seems to me that this is a bug in PySpark itself.
@SemyonSinchenko - good to know. I'm not in that channel, but would love to be part of it. Thanks.
@nijanthanvijayakumar Give me your email, please. I will ask @MrPowers to contact you.
Hello @SemyonSinchenko. I hope you got my email address (I noticed your thumbs up earlier and assumed you had got it). I have removed it from this chat for now, just for privacy reasons.
@nijanthanvijayakumar FYI apache/spark@536445c. Could you mark this function as "non-working" if the Spark version is less than 4.0 (or less than 3.5.2 for the 3.5 branch) and throw an exception, like "This function is not working in Connect environments for spark less than 3.5.2"?
Hi @SemyonSinchenko - if my understanding is right, do you want me to do the following?
I mean you should mark functions in chispa as non-working in Connect. These functions (array_choice, etc.) should throw an exception if you are trying to call them in a Connect environment with a Spark version less than 3.5.2.
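A rough sketch of what such a guard could look like is below. It is not the code from the PR: the names `parse_version` and `check_connect_support` are hypothetical, and the caller is assumed to already know the Spark version string and whether the session is a Connect session. The 3.5.2 threshold and the wording of the error follow the comment above.

```python
import re


def parse_version(version: str) -> tuple:
    """Extract (major, minor, patch) from strings like '3.5.1' or '4.0.0.dev1'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"Unrecognized Spark version: {version!r}")
    major, minor, patch = match.groups()
    # A missing patch component is treated as 0.
    return int(major), int(minor), int(patch or 0)


def check_connect_support(function_name: str, spark_version: str, is_connect: bool) -> None:
    """Raise if a function is called in a Connect environment on Spark < 3.5.2.

    Hypothetical guard: classic (non-Connect) sessions are never blocked.
    """
    if is_connect and parse_version(spark_version) < (3, 5, 2):
        raise RuntimeError(
            f"{function_name} is not working in Connect environments "
            f"for Spark less than 3.5.2 (found {spark_version})"
        )
```

A function such as `array_choice` could call `check_connect_support("array_choice", version, is_connect)` as its first statement, so that users get an explicit error rather than the `TypeError` seen in the failing test.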
Thanks for the guidance @SemyonSinchenko. I have updated the test cases and the actual function implementation accordingly. Here's the link to the PR
Feature Type
Adding new functionality to quinn
Changing existing functionality in quinn
Removing existing functionality in quinn
Problem Description
We need to test quinn against both connect and classic.
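One way to run the same tests in both modes is to let an environment variable decide which kind of session the test suite builds. The sketch below is an illustration only: the variable name `TEST_SPARK_REMOTE` and both function names are hypothetical, and it assumes PySpark 3.4 or later, where `SparkSession.builder.remote(...)` is available. PySpark is imported lazily so the mode selection itself has no dependencies.

```python
import os


def session_mode(env=None):
    """Return ("connect", url) if a remote URL is configured, else ("classic", None).

    TEST_SPARK_REMOTE is a hypothetical variable name, e.g. "sc://localhost:15002".
    """
    env = os.environ if env is None else env
    remote = env.get("TEST_SPARK_REMOTE")
    if remote:
        return "connect", remote
    return "classic", None


def build_session(env=None):
    """Create a SparkSession for the selected mode (requires PySpark installed)."""
    from pyspark.sql import SparkSession  # lazy import, only needed here

    mode, remote = session_mode(env)
    if mode == "connect":
        # Talks to an already running Spark Connect server.
        return SparkSession.builder.remote(remote).getOrCreate()
    # Classic mode: a local in-process Spark session.
    return SparkSession.builder.master("local[1]").appName("tests").getOrCreate()
```

With this layout, CI could run the test command twice, once with the variable unset and once with it pointing at the Connect server, without changing the tests themselves.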
Feature Description
.
Additional Context
This one may be used as an example: https://github.com/pyspark-ai/pyspark-ai/blob/master/run_spark_connect.sh