AssertSmallDataFrameEquality with ignoreNullable set to true isn't working #118

labbedaine · 2023-07-05T11:17:02Z

Hello. I have a question about the ignoreNullable flag when a DataFrame is not created by createDF. The following test works properly:

test("IgnoreNullable") {
  val df1 =
    spark.createDF(
      List(("Hello, world!")),
      List(("Test", StringType, false))
    )

  val df2 =
    spark.createDF(
      List(("Hello, world!")),
      List(("Test", StringType, true))
    )

  assertSmallDataFrameEquality(df1, df2, ignoreNullable = true)
}

However, when I compare a DataFrame produced by production code (e.g. via .transform) with an expected DataFrame created with createDF, ignoreNullable seems to be ignored and the library throws an error on the schema.


test("IgnoreNullable: Not working") {
  Given("")
  //-- NOOP

  When("")
  val actualDF = spark.table("MyTable").transform(ApplyBuisnessLogic())

  Then("")
  val expectedDF =
    spark.createDF(
      List(("Hello, world!")),
      List(("Test", StringType, true))
    )

  assertSmallDataFrameEquality(actualDF, expectedDF, ignoreNullable = true)
}

Is this a bug, or am I simply not using the flag properly? I am using v1.3.0.

Thank you!


scheleaap · 2024-04-03T16:03:47Z

Are you sure there isn't a datatype difference somewhere? The way the output is formatted can be misleading and it has led me to mistakenly believe ignoreNullable doesn't work correctly several times.

In the example below, several rows are marked red because there is a difference. However, not all differences cause the test to fail. For example, transactionVersion (line 3) is colored red because its nullability is different. If you step through the code, you'll see that this doesn't cause the test to fail. In reality, it's line 17, where the field name and type are different, that causes the test to fail.
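
One way to check this without reading the colored output is to compare the two schemas yourself with nullability stripped out. The snippet below is a rough sketch, not the library's own comparison logic: SchemaDiff, asNullable and nonNullableDiffs are hypothetical helper names, and it only relies on the standard Spark types in org.apache.spark.sql.types. Anything it reports is a name, type, or metadata difference that ignoreNullable would not be expected to hide.

```scala
import org.apache.spark.sql.types._

// Hypothetical helper, not part of spark-fast-tests.
object SchemaDiff {

  // Recursively mark every field, array element and map value as nullable,
  // so that two schemas can be compared on names and types alone.
  def asNullable(dt: DataType): DataType = dt match {
    case s: StructType =>
      StructType(s.fields.map(f => f.copy(dataType = asNullable(f.dataType), nullable = true)))
    case ArrayType(elementType, _) =>
      ArrayType(asNullable(elementType), containsNull = true)
    case MapType(keyType, valueType, _) =>
      MapType(asNullable(keyType), asNullable(valueType), valueContainsNull = true)
    case other => other
  }

  // List the field positions that still differ once nullability is ignored.
  // Note: StructField equality also compares metadata, so a metadata
  // difference will show up here as well.
  def nonNullableDiffs(actual: StructType, expected: StructType): Seq[String] = {
    val a = asNullable(actual).asInstanceOf[StructType].fields.toSeq.map(Option(_))
    val e = asNullable(expected).asInstanceOf[StructType].fields.toSeq.map(Option(_))
    a.zipAll(e, None, None).zipWithIndex.collect {
      case ((x, y), i) if x != y =>
        s"field $i: actual=${x.getOrElse("<missing>")}, expected=${y.getOrElse("<missing>")}"
    }
  }
}
```

Calling SchemaDiff.nonNullableDiffs(actualDF.schema, expectedDF.schema).foreach(println) just before the assertion should print nothing if nullability is the only difference. If it prints a line, that field is the likely cause of the failure.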
