fix: improve errors on field cast failures (#2932)
# Description
Adds information on the field, to-type and from-type when casting fails.

We could consider using our own error type for the casting errors to
allow unrolling errors to get the full path to a field. Currently we
only give the last part of the path.
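A minimal sketch of what such a dedicated error type could look like, assuming each nesting level adds its own field name while the error propagates upward. `CastPathError`, `within`, `cast_leaf` and `cast_struct` are hypothetical names for illustration only; none of them exist in this commit or in arrow.

```rust
use std::fmt;

/// Hypothetical error type (not part of this commit): records the chain of
/// field names leading to the failed cast.
#[derive(Debug)]
struct CastPathError {
    /// Field names from innermost to outermost, pushed while unwinding.
    path: Vec<String>,
    /// The underlying cast failure message.
    message: String,
}

impl CastPathError {
    fn new(field: &str, message: impl Into<String>) -> Self {
        Self {
            path: vec![field.to_string()],
            message: message.into(),
        }
    }

    /// Called by each enclosing level as the error propagates upward.
    fn within(mut self, parent: &str) -> Self {
        self.path.push(parent.to_string());
        self
    }

    /// Dotted path from the outermost field down to the failing one.
    fn full_path(&self) -> String {
        self.path.iter().rev().cloned().collect::<Vec<_>>().join(".")
    }
}

impl fmt::Display for CastPathError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Failed to cast {}: {}", self.full_path(), self.message)
    }
}

impl std::error::Error for CastPathError {}

/// Stand-in for a leaf cast that always fails.
fn cast_leaf(field: &str) -> Result<(), CastPathError> {
    Err(CastPathError::new(field, "Can't cast value 1000 to type Int8"))
}

/// Stand-in for casting a struct field: delegates to the child and, on
/// failure, prepends its own name to the recorded path.
fn cast_struct(name: &str, child: &str) -> Result<(), CastPathError> {
    cast_leaf(child).map_err(|e| e.within(name))
}
```

With this shape, a failure in a nested field would report the whole path (for example `outer.inner`) rather than only the last segment.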

When looking at `cast_field` I noticed that we might be missing a match
on `(DataType::List(_), DataType::LargeList(_))`. Casting List to
LargeList can currently cause some tricky behaviour. I had a record
batch with a List type, and tried reading it with a LargeList schema.
For some choices of schemas it failed with an error message, for other
schemas it did not fail, but read the columns in the wrong order.
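To illustrate the concern, here is a simplified sketch of how a type pair without its own match arm ends up in the catch-all branches. `Ty`, `Route` and `route` are stand-ins invented for this example, not the real `DataType` or `cast_field`, and routing `List` to `LargeList` through the nested branch is only one possible choice.

```rust
/// Simplified stand-in for Arrow's DataType; only the variants needed here.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Int32,
    List(Box<Ty>),
    LargeList(Box<Ty>),
}

/// Which branch a (column type, target type) pair is routed to.
#[derive(Debug, PartialEq)]
enum Route {
    /// Recurse into the list's child type.
    Nested,
    /// Hand the whole column to a generic cast.
    GenericCast,
    /// Types already agree; reuse the column unchanged.
    Passthrough,
}

fn route(col: &Ty, field: &Ty) -> Route {
    match (col, field) {
        (Ty::List(_), Ty::List(_)) | (Ty::LargeList(_), Ty::LargeList(_)) => Route::Nested,
        // Hypothetical explicit arm for the pair mentioned above. Without it,
        // List -> LargeList would fall through to the catch-all arms below.
        (Ty::List(_), Ty::LargeList(_)) => Route::Nested,
        _ if col != field => Route::GenericCast,
        _ => Route::Passthrough,
    }
}
```

In this sketch the reverse direction (`LargeList` to `List`) has no arm of its own and still lands in the generic cast, which shows how easily one pairing can be overlooked.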

Signed-off-by: R. Tyler Croy <[email protected]>
jkylling authored Oct 19, 2024
1 parent a846879 commit 10c6b5c
Showing 2 changed files with 22 additions and 3 deletions.
19 changes: 18 additions & 1 deletion crates/core/src/operations/cast/mod.rs
@@ -144,7 +144,19 @@ fn cast_field(
add_missing,
)?) as ArrayRef),
_ if is_cast_required(col_type, field_type) => {
-cast_with_options(col, field_type, cast_options)
+cast_with_options(col, field_type, cast_options).map_err(|err| {
+    if let ArrowError::CastError(err) = err {
+        ArrowError::CastError(format!(
+            "Failed to cast {} from {} to {}: {}",
+            field.name(),
+            field_type,
+            col_type,
+            err
+        ))
+    } else {
+        err
+    }
+})
}
_ => Ok(col.clone()),
}
@@ -337,6 +349,11 @@ mod tests {
assert!(!is_cast_required(&field1, &field2));
}

+#[test]
+fn test_is_cast_required_with_smol_int() {
+    assert!(is_cast_required(&DataType::Int8, &DataType::Int32));
+}

#[test]
fn test_is_cast_required_with_list_non_default_item() {
let field1 = DataType::List(FieldRef::from(Field::new("item", DataType::Int32, false)));
6 changes: 4 additions & 2 deletions python/tests/test_writer.py
@@ -273,7 +273,8 @@ def test_write_type_castable_types(existing_table: DeltaTable):
engine="rust",
)
with pytest.raises(
-Exception, match="Cast error: Cannot cast string 'hello' to value of Int8 type"
+Exception,
+match="Cast error: Failed to cast int8 from Int8 to Utf8: Cannot cast string 'hello' to value of Int8 type",
):
write_deltalake(
existing_table,
@@ -284,7 +285,8 @@ def test_write_type_castable_types(existing_table: DeltaTable):
)

with pytest.raises(
-Exception, match="Cast error: Can't cast value 1000 to type Int8"
+Exception,
+match="Cast error: Failed to cast int8 from Int8 to Int64: Can't cast value 1000 to type Int8",
):
write_deltalake(
existing_table,
