Better error message when forming a DataFrame from a vector of dictionaries with missing data. #3410

Closed
LilithHafner opened this issue Dec 29, 2023 · 2 comments


@LilithHafner
Contributor

julia> DataFrame([Dict(:X => 2, :Y => 4), Dict(:X => 7, :Y => 49)])
2×2 DataFrame
 Row │ X      Y     
     │ Int64  Int64 
─────┼──────────────
   1 │     2      4
   2 │     7     49

julia> DataFrame([Dict(:X => 2, :Y => 4), Dict(:X => 7)])
ERROR: KeyError: key :Y not found
Stacktrace:
  [1] getindex(h::Dict{Symbol, Int64}, key::Symbol)
    @ Base ./dict.jl:510
  [2] getcolumn
    @ ~/.julia/packages/Tables/NSGZI/src/Tables.jl:143 [inlined]
  [3] eachcolumns
    @ ~/.julia/packages/Tables/NSGZI/src/utils.jl:127 [inlined]
  [4] __buildcolumns(rowitr::Vector{…}, st::Int64, sch::Tables.Schema{…}, columns::Tuple{…}, rownbr::Int64, updated::Base.RefValue{…})
    @ Tables ~/.julia/packages/Tables/NSGZI/src/fallbacks.jl:174
  [5] _buildcolumns(rowitr::Vector{…}, row::Dict{…}, st::Int64, sch::Tables.Schema{…}, columns::Tuple{…}, updated::Base.RefValue{…})
    @ Tables ~/.julia/packages/Tables/NSGZI/src/fallbacks.jl:198
  [6] buildcolumns
    @ ~/.julia/packages/Tables/NSGZI/src/fallbacks.jl:227 [inlined]
  [7] _columns
    @ ~/.julia/packages/Tables/NSGZI/src/fallbacks.jl:265 [inlined]
  [8] columns
    @ ~/.julia/packages/Tables/NSGZI/src/fallbacks.jl:258 [inlined]
  [9] DataFrame(x::Vector{Dict{Symbol, Int64}}; copycols::Nothing)
    @ DataFrames ~/.julia/packages/DataFrames/58MUJ/src/other/tables.jl:57
 [10] DataFrame(x::Vector{Dict{Symbol, Int64}})
    @ DataFrames ~/.julia/packages/DataFrames/58MUJ/src/other/tables.jl:48
 [11] top-level scope
    @ REPL[8]:1
Some type information was truncated. Use `show(err)` to see complete types.

It would be nice if

  1. The error message were clearer about the fact that the dictionaries do not all share the same key set
  2. There were an option (hinted at in the error message) to turn absent keys into `missing` entries in the DataFrame rather than raising an error
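
Until something like that exists, one possible workaround is to take the union of the keys yourself and fill the gaps with `missing`. This is only a minimal sketch, not anything DataFrames or Tables provides: the names `rows` and `allkeys` are made up for illustration, and the columns are sorted by name only to make the order deterministic.

```julia
using DataFrames

# Rows with differing key sets, as in the failing example above.
rows = [Dict(:X => 2, :Y => 4), Dict(:X => 7)]

# Union of every key that appears in any row, sorted so the
# column order does not depend on Dict iteration order.
allkeys = sort!(collect(mapreduce(keys, union, rows)))

# One column per key; rows lacking the key contribute `missing`.
df = DataFrame([k => [get(r, k, missing) for r in rows] for k in allkeys])
```

Here `df.X` stays a plain `Int64` column, while `df.Y` becomes a `Union{Missing, Int64}` column whose second entry is `missing`.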

From Logo on Zulip (link)

@jariji
Contributor

jariji commented Dec 29, 2023

Dup of JuliaData/Tables.jl#352 which I should have posted in the Zulip thread.

@LilithHafner
Contributor Author

Interesting that I'm getting a "key not found" error and you're getting a "has no field" error. Closing in favor of JuliaData/Tables.jl#352

@LilithHafner closed this as not planned on Dec 29, 2023