Releases · tidyverse/dplyr

24 Feb 15:47

hadley

dplyr 0.1.2

New features

select() is substantially more powerful. You can use named arguments to
rename existing variables, and new functions starts_with(), ends_with(),
contains(), matches() and num_range() to select variables based on
their names. It now also makes a shallow copy, substantially reducing its
memory impact (#158, #172, #192, #232).
summarize() added as an alias for summarise() for people from countries
that don't spell things correctly ;) (#245).
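
As a rough illustration of the select() helpers listed above, here is a minimal sketch. It assumes a current dplyr installation (behaviour in 0.1.2 itself may differ in details), and the data frame and its column names are made up for the example.

```r
library(dplyr)

# Hypothetical data frame, used only for illustration.
df <- data.frame(x1 = 1:3, x2 = 4:6, y_total = 7:9, id = c("a", "b", "c"))

# Select columns by name pattern.
by_prefix <- select(df, starts_with("x"))       # x1, x2
by_suffix <- select(df, ends_with("total"))     # y_total
by_substr <- select(df, contains("_"))          # y_total
by_regex  <- select(df, matches("^x[0-9]$"))    # x1, x2
by_range  <- select(df, num_range("x", 1:2))    # x1, x2

# A named argument renames the column while selecting it.
renamed <- select(df, key = id, x1)             # columns: key, x1
```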

filter() now fails when given anything other than a logical vector, and
correctly handles missing values (#249). filter.numeric() proxies
stats::filter() so you can continue to use filter() with numeric
inputs (#264).
summarise() correctly uses newly created variables (#259).
mutate() correctly propagates attributes (#265) and mutate.data.frame()
correctly mutates the same variable repeatedly (#243).
lead() and lag() preserve attributes, so they now work with
dates, times and factors (#166).
n() never accepts arguments (#223).
row_number() gives correct results (#227).
rbind_all() silently ignores data frames with 0 rows or 0 columns (#274).
group_by() orders the result (#242). It also checks that columns
are of supported types (#233, #276).
The hybrid evaluator now handles some expressions it previously got
wrong: for example, in if(n() > 5) 1 else 2 the subexpression n() was
not substituted correctly. It also correctly processes $ (#278).
arrange() checks that all columns are of supported types (#266).
Working towards Solaris compatibility.
Benchmarking vignette temporarily disabled due to microbenchmark
problems reported by BDR.
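
A few of the fixes above can be sketched as follows. This is only an illustration, assuming a current dplyr installation rather than 0.1.2 itself; the data frame, column names and dates are invented for the example.

```r
library(dplyr)

# Hypothetical data, used only for illustration.
df <- data.frame(g = c("a", "a", "b"), x = c(1, NA, 3))

# filter() keeps rows where the condition is TRUE; rows where it
# evaluates to NA are dropped.
kept <- filter(df, x > 0)

# summarise() can refer to a variable created earlier in the same call.
totals <- summarise(df, total = sum(x, na.rm = TRUE), half = total / 2)

# lag() preserves attributes, so a Date vector stays a Date vector.
d <- as.Date(c("2014-01-17", "2014-01-30", "2014-02-24"))
shifted <- lag(d)

# n() used inside a larger expression, per group.
sizes <- summarise(group_by(df, g), label = if (n() > 1) "many" else "one")
```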

30 Jan 01:27

hadley

dplyr 0.1.1

New location() and changes() functions provide more information
about how data frames are stored in memory so that you can see what
gets copied.
Renamed explain_tbl() to explain() (#182).
tally() gains a sort argument that sorts the output so the highest
counts come first (#173).
ungroup.grouped_df(), tbl_df(), as.data.frame.tbl_df() now only
make shallow copies of their inputs (#191).
The benchmark-baseball vignette now contains fairer comparisons with
data.table, which include grouping times (#222).
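
The sort argument of tally() might be used as in the sketch below. It assumes a current dplyr installation, and the data frame is made up for the example.

```r
library(dplyr)

# Hypothetical data: group "c" appears three times, "b" twice, "a" once.
df <- data.frame(g = c("a", "b", "b", "c", "c", "c"))

# Count rows per group, with the highest counts first.
counts <- tally(group_by(df, g), sort = TRUE)
```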

filter() (#221) and summarise() (#194) correctly propagate attributes.
summarise() throws an error when asked to summarise an unknown variable
instead of crashing (#208).
group_by() handles factors with missing values (#183).
filter() handles scalar results (#217) and better handles scoping, e.g.
filter(., variable) where variable is defined in the function that calls
filter(). It also handles T and F as aliases for TRUE and FALSE
if there are no T or F variables in the data or in the scope.
select.grouped_df fails when the grouping variables are not included
in the selected variables (#170).
all.equal.data.frame() handles a corner case where the data frame has
NULL names (#217).
mutate() gives an informative error message on unsupported types (#179).
dplyr source package no longer includes pandas benchmark, reducing
download size from 2.8 MB to 0.5 MB.
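
The scoping behaviour of filter() described above can be illustrated with a small sketch. It assumes a current dplyr installation; keep_above() and the data frame are hypothetical names introduced only for this example.

```r
library(dplyr)

# Hypothetical helper: `threshold` is defined in the calling function,
# not in the data, and filter() still finds it.
keep_above <- function(data, threshold) {
  filter(data, x > threshold)
}

df <- data.frame(x = c(1, 5, 10))
result <- keep_above(df, 4)
```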

17 Jan 13:26

hadley

First release

v0.1

Update readme and notes for CRAN release
