Create some "schema safe append" functionality #50
Comments
@MrPowers : Let me know if that's as per the expectation.
@puneetsharma04 - I realized I specified this issue pretty poorly, sorry for the confusion. Suppose there is an existing Parquet table. Something like this:
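For context, here is a minimal sketch of what such a schema-safe append could look like. The function name is taken from the test file mentioned later in the thread; the signature, error type, and Parquet write mode are assumptions made purely for illustration.

```python
# Hypothetical sketch only: the function name matches
# test_append_if_schema_identical.py from this thread; the signature,
# error type, and write mode are assumptions for illustration.
from pyspark.sql import DataFrame


def append_if_schema_identical(df: DataFrame, append_df: DataFrame, path: str) -> None:
    """Append append_df to the Parquet table at `path` only when its schema
    matches df's schema exactly; otherwise raise an error."""
    # StructType equality covers field names, types, nullability, and order,
    # which is what an "exact" schema match means here.
    if df.schema != append_df.schema:
        raise ValueError(
            "Schemas do not match:\n"
            f"existing: {df.schema.simpleString()}\n"
            f"append:   {append_df.schema.simpleString()}"
        )
    append_df.write.format("parquet").mode("append").save(path)
```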
@MrPowers : Thanks for sharing the update.
Could you please check whether this looks as per expectations, or whether you expect some more changes to it?
@MrPowers : Any updates on this issue, so that I can go ahead with creating the PR?
@MrPowers & @cosmincatalin Thanks in advance!
@puneetsharma04 I'm not sure why you've tagged me; does the issue have anything to do with #44? I can otherwise do my best to diagnose the issue anyway 🙂, I just want to make sure I have the correct context.
@cosmincatalin : Thanks for getting back 🙂. Actually, the issue is not related to #44; however, since you are among the other contributors, you may have faced the same kind of issue while testing the code in PyCharm.
Could you please share the whole error? Also, maybe a stupid question, but did you configure Java properly? Because it looks like a Java issue.
@SemyonSinchenko : Thanks for getting back on this issue. I have installed Java and tested running a Java program on macOS. However, below is the full description of the error; I am getting it for all the tests.

```
============================= test session starts ==============================
test_append_if_schema_identical.py::test_append_if_schema_identical ERROR
Exception in thread "main" java.lang.ExceptionInInitializerError
test setup failed
conftest.py:37:
../quinn/spark.py:49: in set_up_spark
    conf = <pyspark.conf.SparkConf object at 0x7fbe2875e150>
E   Exception: Java gateway process exited before sending its port number
../venv/lib/python3.7/site-packages/pyspark/java_gateway.py:105: Exception
==================================== ERRORS ====================================
conftest.py:37:
../quinn/spark.py:49: in set_up_spark
    conf = <pyspark.conf.SparkConf object at 0x7fbe2875e150>
E   Exception: Java gateway process exited before sending its port number
../venv/lib/python3.7/site-packages/pyspark/java_gateway.py:105: Exception

Process finished with exit code 1
```
Which Java version are you using?
Details of the Java version:
You should use Java 8 (1.8). Spark is so old that it still works only with Java 8.
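For reference, one common way to fix the "Java gateway process exited" error when running PySpark tests from PyCharm is to point JAVA_HOME at a Java 8 install before the SparkSession is created. This is only an illustration; the path below is an example and will differ per machine (on macOS, `/usr/libexec/java_home -v 1.8` prints the correct one).

```python
import os

from pyspark.sql import SparkSession

# JAVA_HOME must point at a Java 8 JDK before the JVM gateway is launched,
# i.e. before the first SparkSession/SparkContext is created.
# Illustration only: this path is machine-specific.
os.environ["JAVA_HOME"] = "/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home"

spark = SparkSession.builder.master("local[*]").appName("quinn-tests").getOrCreate()
```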
@puneetsharma04 - feel free to open up a pull request and we can help you push this over the finish line ;)
@MrPowers : Thanks for the assurance 👍🏻.
This error message is clear now and doesn't relate to Spark or Java. So please check your code: make sure such a method actually exists and that you have written all the imports in the right way.
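As an illustration of that advice (the module name below is hypothetical, not necessarily quinn's actual layout): a function defined in a submodule is only reachable as `quinn.append_if_schema_identical` if the package re-exports it where the test expects to find it.

```python
# quinn/__init__.py (hypothetical layout, for illustration only)
from quinn.append_funcs import append_if_schema_identical  # noqa: F401

# tests/test_append_if_schema_identical.py
import quinn


def test_append_if_schema_identical():
    # Resolves only if the re-export above (or an equivalent import) exists;
    # otherwise this raises AttributeError.
    assert callable(quinn.append_if_schema_identical)
```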
@SemyonSinchenko : Thanks a lot, it's working now. I will work towards creating the PR now.
#86 : Raised the PR for review.
The function should append the data if the `append_df` has a schema that matches the `df` exactly. If the schema doesn't match exactly, then it should error out.
This "schema safe append" could prevent bad appends.