Releases: tidyverse/duckplyr
duckplyr 0.4.1
Features
- `df_from_file()` and related functions support multiple files (#194, #195), show a clear error message for non-string `path` arguments (#182), and create a tibble by default (#177).
- New `as_duckplyr_tibble()` to convert a data frame to a duckplyr tibble (#177).
- Support descending sort for character and other non-numeric data (@toppyy, #92, #175).
- Avoid setting memory limit (#193).
- Check compatibility of join columns (#168, #185).
- Explicitly list supported functions, add contributing guide, add analysis scripts for GitHub activity data (#179).
Documentation
- Add contributing guide (#179).
- Show a startup message at package load if telemetry is not configured (#188, #198).
- `?df_from_file` shows how to read multiple files (#181, #186) and how to specify CSV column types (#140, #189), and is shown correctly in reference index (#173, #190).
- Discuss dbplyr in README (#145, #191).
- Add analysis scripts for GitHub activity data (#179).
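As an illustration of the multi-file reading and tibble conversion listed above, here is a minimal sketch. It assumes duckplyr 0.4.1 is installed (later releases may differ), uses `df_from_csv()` as one of the "related functions" of `df_from_file()`, and relies on the wildcard syntax that `?df_from_file` mentions. The directory, file names, and sample data are hypothetical.

```r
library(duckplyr)

# Hypothetical sample data: two small CSV files in a fresh temporary directory.
dir <- tempfile("duckplyr_demo_")
dir.create(dir)
write.csv(data.frame(a = 1:3, b = letters[1:3]), file.path(dir, "part_1.csv"), row.names = FALSE)
write.csv(data.frame(a = 4:6, b = letters[4:6]), file.path(dir, "part_2.csv"), row.names = FALSE)

# A wildcard pattern reads all matching files into a single data frame.
df <- df_from_csv(file.path(dir, "part_*.csv"))
nrow(df)

# Convert an existing data frame to a duckplyr tibble.
tbl <- as_duckplyr_tibble(data.frame(x = c(2, 1, 3)))
class(tbl)
```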
duckplyr 0.4.0
Features
- Use built-in rfuns extension to implement equality and inequality operators, improve translation for `as.integer()`, `NA` and `%in%` (#83, #154, #148, #155, #159, #160).
- Reexport non-deprecated dplyr functions (#144, #163).
- `library(duckplyr)` calls `methods_overwrite()` (#164).
- Only allow constant patterns in `grepl()`.
- Explicitly reject calls with named arguments for now.
- Reduce default memory limit to 1 GB.
Bug fixes
- Stricter type checks in the set operations `intersect()`, `setdiff()`, `symdiff()`, `union()`, and `union_all()` (#169).
- Distinguish between constant `NA` and those used in an expression (#157).
- `head(-1)` forwards to the default implementation (#131, #156).
- Fix cli syntax for internal error message (#151).
- More careful detection of row names in data frame.
- Always check roundtrip for timestamp columns.
- `left_join()` and other join functions call `auto_copy()`.
- Only reset expression depth if it has been set before.
- Require fallback if the result contains duplicate column names when ignoring case.
- `row_number()` returns integer.
- `is.na(NaN)` is `TRUE`.
- `summarise(count = n(), count = n())` creates only one column named `count`.
- Correct wording in instructions for enabling fallback logging (@TimTaylor, #141).
Documentation
- Mention wildcards to read multiple files in `?df_from_file` (@andreranza, #133, #134).
duckplyr 0.3.2
duckplyr 0.3.1
Bug fixes
- Forbid reuse of new columns created in `summarise()` (#72, #106).
- `summarise()` no longer restores subclass.
- Disambiguate computation of `log10()` and `log()`.
- Fix division by zero for positive and negative numbers.
Features
- New `fallback_sitrep()` and related functionality for collecting telemetry data (#102, #107, #110, #111, #115). No data is collected by default, only a message is displayed once per session and then every eight hours. Opt in or opt out by setting environment variables.
- Implement `group_by()` and other methods to collect fallback information (#94, #104, #105).
- Set memory limit and temporary directory for duckdb.
- Implement `suppressWarnings()` as the identity function.
- Prefer `cli::cli_abort()` over `stop()` or `rlang::abort()` (#114).
- Translate `.data$a` and `.env$a`.
- Strict checks for column class, only supporting `integer`, `numeric`, `logical`, `Date`, `POSIXct`, and `difftime` for now.
- If the environment variable `DUCKPLYR_METHODS_OVERWRITE` is set to `TRUE`, loading duckplyr automatically calls `methods_overwrite()`.
Internal
- Better duckdb tests.
- Use standalone purrr for dplyr compatibility.
Testing
- Add tests for correct base of `log()` and `log10()`.
Documentation
- `methods_overwrite()` and `methods_restore()` show a message.
duckplyr 0.3.0
Bug fixes
- `grepl(x = NA)` gives correct results.
- Fix `auto_copy()` for non-data-frame input.
- Add output order preservation for filters.
- `distinct()` now preserves order in corner cases (#77, #78).
- Consistent computation of `log(0)` and `log(-1)` (#75, #76).
Features
- Only allow constants in `mutate()` that are actually representable in duckdb (#73).
- Avoid translating `ifelse()`, support `if_else()` (#79).
Documentation
- Separate and explain the new relational examples (@wibeasley, #84).
Testing
- Add test that TPC-H queries can be processed.
Chore
- Sync with dplyr 1.1.4 (#82).
- Remove `dplyr_reconstruct()` method (#48).
- Render README.
- Fix code generated by `meta_replay()`.
- Bump constructive dependency.
- Fix output order for `arrange()` in case of ties.
- Update duckdb tests.
- Only implement newer `slice_sample()`, not `sample_n()` or `sample_frac()` (#74).
- Sync generated files (#71).
duckplyr 0.2.3
Performance
- Join using `IS NOT DISTINCT FROM` for faster execution (duckdb/duckdb-r#41, #68).
duckplyr 0.2.2
duckplyr 0.2.1
duckplyr 0.2.0
What's Changed
- Implement summarise() by @krlmlr in #1
- WIP: Draft support for window functions by @krlmlr in #4
- Benchmarking adjustments by @Tmonster in #6
- Revert prefix suffix optimizations, revert year extraction by @Tmonster in #7
- ci: add GHA standard check workflow by @maelle in #23
- docs: fix BugReports link by @maelle in #10
- Update scripts to run out of the box by @Tmonster in #8
- Various easy fixes after running R CMD check by @maelle in #16
- Make Duckplyr_df from csv more generic so it can be from json, parquet, csv, etc. by @Tmonster in #32
- docs: document undocumented arguments by @maelle in #28
- chore: R CMD check fixes by @maelle in #42
- chore: More R CMD check fixes by @krlmlr in #43
- chore: Almost ready for CRAN by @krlmlr in #44
- feat: Order-stable `union_all()` by @krlmlr in #45
- Sync documentation with v0.1.0 by @krlmlr in #46
Full Changelog: https://github.com/duckdblabs/duckplyr/commits/v0.2.0
duckplyr 0.1.0
Initial release, generics only.
Full Changelog: 1950e98...v0.1.0