Should date.to_pandas() return datetime64? #8019

Open
1 task done
NickCrews opened this issue Jan 18, 2024 · 2 comments · May be fixed by #8784
Labels
bug Incorrect behavior inside of ibis

Comments

@NickCrews
Contributor

What happened?

This is what currently happens on main, per a change I instigated in #7299:

import ibis

ibis.options.interactive = True

d = ibis.memtable({"date": ["2024-01-01", "2024-01-02"]}).cast({"date": "date"}).date
s = d.to_pandas()
print(type(s[0]))
# <class 'datetime.date'>
s
# 0    2024-01-01
# 1    2024-01-02
# Name: date, dtype: object

Perhaps this is what is intended, but it is an annoying breaking change for me. I was making plots with altair via alt.Chart(table.to_pandas()), using tables with a date column. Before, the date column in the pandas DataFrame had dtype datetime64. That is serializable to JSON for altair/vega, so it worked. Now the column holds datetime.date objects, which are not serializable to JSON, and I get an error. I think the old behavior was better: .to_pandas() implies to me that we are going to pandas-land, so we should use dtypes that are as canonical to pandas as possible.
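
To illustrate the serialization failure without pulling in altair, here is a minimal sketch using the standard library's json module as a stand-in for whatever altair/vega does internally (that substitution is an assumption, and the record and iso_payload names are made up for the example):

```python
import datetime
import json

# One row as it might look after .to_pandas() on main: a plain datetime.date.
record = {"date": datetime.date(2024, 1, 1)}

# The default JSON encoder has no rule for datetime.date, so this raises TypeError.
try:
    json.dumps(record)
    serialized_ok = True
    error_message = ""
except TypeError as exc:
    serialized_ok = False
    error_message = str(exc)

# One possible workaround: convert to an ISO 8601 string before serializing.
iso_payload = json.dumps({"date": record["date"].isoformat()})
print(serialized_ok, error_message)
print(iso_payload)
```

The ISO-string conversion is only a stopgap on the caller's side; it does not address what .to_pandas() itself should return.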

It seems like you already thought carefully about the semantics of this, per some comments in that issue and the linked PR, but I'm curious whether there would be a problem with these semantics:

  • DateScalar.to_pandas() -> pd.Timestamp
  • DateColumn.to_pandas() -> pd.Series[datetime64]
  • DateScalar.execute() -> pd.Timestamp (because by definition this uses pandas)
  • DateColumn.execute() -> pd.Series[datetime64] (same)
  • repr(Date) -> "YYYY-MM-DD" (even if under the hood this uses .to_pandas() and we have pd.Timestamps in intermediate steps)
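
Until this is settled, the semantics in the list above can be approximated on the user side. This is a rough sketch, assuming pandas is installed. The current Series is built by hand to mimic what main returns, and format_date is a hypothetical helper, not an ibis function:

```python
import datetime

import pandas as pd

# Mimics the current result on main: an object-dtype Series of datetime.date.
current = pd.Series(
    [datetime.date(2024, 1, 1), datetime.date(2024, 1, 2)], name="date"
)

# Proposed column semantics: a datetime64 Series.
proposed_column = pd.to_datetime(current)

# Proposed scalar semantics: a pd.Timestamp.
proposed_scalar = pd.Timestamp(datetime.date(2024, 1, 1))


def format_date(ts: pd.Timestamp) -> str:
    """Hypothetical helper: render a Timestamp as YYYY-MM-DD for a repr."""
    return ts.strftime("%Y-%m-%d")


print(current.dtype, proposed_column.dtype)
print(format_date(proposed_scalar))
```

Applied to a whole table, the same pd.to_datetime call on each date column after .to_pandas() should give altair the datetime64 dtype it handled before.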

What version of ibis are you using?

main

What backend(s) are you using, if any?

NA

Relevant log output

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@hottwaj

hottwaj commented Jul 11, 2024

Just came across this too and wanted to add a vote for getting it fixed :)

@mfatihaktas
Contributor

@hottwaj Thanks for chiming in. The fix is implemented in this PR. It is waiting for the merge per this comment.
