Releases: tidyverse/dplyr

dplyr 0.1.2

24 Feb 15:47

New features

  • select() is substantially more powerful. You can use named arguments to
    rename existing variables, and new functions starts_with(), ends_with(),
    contains(), matches() and num_range() to select variables based on
    their names. It now also makes a shallow copy, substantially reducing its
    memory impact (#158, #172, #192, #232).
  • summarize() added as alias for summarise() for people from countries
    that don't spell things correctly ;) (#245).
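
The selection helpers and renaming described above can be sketched as
follows. This is a minimal illustration, assuming a recent dplyr installed
from CRAN (details may differ in 0.1.2 itself); the data frame and its column
names are made up for the example.

```r
library(dplyr)

# Hypothetical data frame, used only for illustration
df <- data.frame(
  id = 1:3,
  x1 = c(1.5, 2.5, 3.5),
  x2 = c(10, 20, 30),
  name_first = c("a", "b", "c"),
  name_last = c("d", "e", "f")
)

# Select variables based on their names
by_prefix <- select(df, starts_with("name"))
by_suffix <- select(df, ends_with("last"))
by_substring <- select(df, contains("_"))
by_regex <- select(df, matches("^x[0-9]$"))
by_range <- select(df, num_range("x", 1:2))

# A named argument renames the variable while selecting it
renamed <- select(df, key = id, x1)

# summarize() and summarise() are interchangeable
mean_uk <- summarise(df, m = mean(x1))
mean_us <- summarize(df, m = mean(x1))
```

Here by_regex and by_range should pick the same two columns, since the
pattern ^x[0-9]$ and the range x1 to x2 describe the same names in this
data frame.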

Bug fixes

  • filter() now fails when given anything other than a logical vector, and
    correctly handles missing values (#249). filter.numeric() proxies
    stats::filter() so you can continue to use the filter() function with
    numeric inputs (#264).
  • summarise() correctly uses newly created variables (#259).
  • mutate() correctly propagates attributes (#265) and mutate.data.frame()
    correctly mutates the same variable repeatedly (#243).
  • lead() and lag() preserve attributes, so they now work with
    dates, times and factors (#166).
  • n() never accepts arguments (#223).
  • row_number() gives correct results (#227).
  • rbind_all() silently ignores data frames with 0 rows or 0 columns (#274).
  • group_by() orders the result (#242). It also checks that columns
    are of supported types (#233, #276).
  • The hybrid evaluator now handles expressions it previously got wrong:
    for example, in if(n() > 5) 1 else 2 the subexpression n() was not
    substituted correctly. It also correctly processes $ (#278).
  • arrange() checks that all columns are of supported types (#266).
  • Working towards Solaris compatibility.
  • Benchmarking vignette temporarily disabled due to microbenchmark
    problems reported by BDR.
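
A few of the fixes above can be checked with a short sketch. It assumes a
recent dplyr from CRAN rather than 0.1.2 itself, so error messages and
internals will differ; the data frame is made up for the example, and the
dplyr:: prefix is used because filter() and lag() also exist in stats.

```r
library(dplyr)

# Hypothetical data frame with a missing value
df <- data.frame(g = c("a", "a", "b"), x = c(1, NA, 3))

# Rows where the condition evaluates to NA are dropped, not kept
kept <- dplyr::filter(df, x > 1)

# A non-logical condition is an error rather than a silent coercion
non_logical <- tryCatch(
  dplyr::filter(df, x),
  error = function(e) "error"
)

# lag() keeps the Date class of its input
dates <- as.Date(c("2014-01-01", "2014-01-02", "2014-01-03"))
lagged <- dplyr::lag(dates)

# n() nested inside a larger expression is evaluated per group
sizes <- summarise(group_by(df, g), size = if (n() > 1) "many" else "one")
```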

dplyr 0.1.1

30 Jan 01:27

Improvements

  • New location() and changes() functions provide more information
    about how data frames are stored in memory so that you can see what
    gets copied.
  • Renamed explain_tbl() to explain() (#182).
  • tally() gains a sort argument that sorts the output so the highest
    counts come first (#173).
  • ungroup.grouped_df(), tbl_df(), as.data.frame.tbl_df() now only
    make shallow copies of their inputs (#191).
  • The benchmark-baseball vignette now contains fairer comparisons with
    data.table, which include grouping times (#222).
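
As a small illustration of the sort argument to tally(), assuming a recent
dplyr from CRAN and a made-up data frame:

```r
library(dplyr)

# Hypothetical grouping column: "b" occurs 3 times, "c" twice, "a" once
df <- data.frame(g = c("a", "b", "b", "c", "b", "c"))

# With sort = TRUE the groups with the highest counts come first
counts <- tally(group_by(df, g), sort = TRUE)
```

The counts are returned in a column named n.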

Bug fixes

  • filter() (#221) and summarise() (#194) correctly propagate attributes.
  • summarise() throws an error when asked to summarise an unknown variable
    instead of crashing (#208).
  • group_by() handles factors with missing values (#183).
  • filter() handles scalar results (#217) and better handles scoping, e.g.
    filter(., variable) where variable is defined in the function that calls
    filter. It also handles T and F as aliases to TRUE and FALSE
    if there are no T or F variables in the data or in the scope.
  • select.grouped_df() fails when the grouping variables are not included
    in the selected variables (#170).
  • all.equal.data.frame() handles a corner case where the data frame has
    NULL names (#217).
  • mutate() gives an informative error message on unsupported types (#179).
  • dplyr source package no longer includes pandas benchmark, reducing
    download size from 2.8 MB to 0.5 MB.
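
The scoping and error-handling fixes above can be sketched like this. The
example assumes a recent dplyr from CRAN; keep_above(), threshold and
no_such_column are hypothetical names introduced only for illustration.

```r
library(dplyr)

df <- data.frame(x = c(1, 5, 10))

# threshold is not a column: it is found in the function that calls filter()
keep_above <- function(data, threshold) {
  dplyr::filter(data, x > threshold)
}
above <- keep_above(df, 4)

# A scalar result is accepted, and T resolves to TRUE
# because df has no column named T
all_rows <- dplyr::filter(df, T)

# Summarising an unknown variable raises an ordinary R error
unknown <- tryCatch(
  summarise(df, m = mean(no_such_column)),
  error = function(e) "error"
)
```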

First release

17 Jan 13:26
v0.1

Update readme and notes for CRAN release