Problem statement

When both plyr and dplyr are loaded, the package loaded last masks the same-named symbols exported by the package loaded first:

library(plyr)
library(dplyr)
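The masking mechanism can be reproduced without either package. The following is a minimal base R sketch: first_pkg and second_pkg are made-up lists standing in for two packages that export the same name, and the search path entries "demo:first" and "demo:second" are likewise illustrative. attach() inserts at position 2 of the search path by default, as library() does, so the entry attached last is found first.

```r
# Two stand-ins for packages that both export a function called `count`.
# Names and return values are invented for illustration only.
first_pkg  <- list(count = function(x) "first")
second_pkg <- list(count = function(x) "second")

# Attach the first stand-in; `count` now resolves to its definition.
attach(first_pkg, name = "demo:first", warn.conflicts = FALSE)
result_before <- count(1)

# Attach the second stand-in; it lands in front of the first one
# on the search path and therefore masks `count`.
attach(second_pkg, name = "demo:second", warn.conflicts = FALSE)
result_after <- count(1)

# The masked definition is still reachable through its own environment,
# comparable to writing plyr::count() explicitly.
result_explicit <- as.environment("demo:first")$count(1)

print(result_before)    # "first"
print(result_after)     # "second"
print(result_explicit)  # "first"

# Clean up the search path.
detach("demo:second", character.only = TRUE)
detach("demo:first", character.only = TRUE)
```

Nothing is deleted by the second attach; only the lookup order changes, which is why code written against the first definition can change behavior without any error.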

Currently, the following symbols are affected:

plyr_exports <- ls("package:plyr")
dplyr_exports <- ls("package:dplyr")
(both_exports <- intersect(plyr_exports, dplyr_exports))
## [1] "arrange"   "count"     "desc"      "failwith"  "id"        "mutate"   
## [7] "rename"    "summarise" "summarize"

This means that existing projects that use plyr cannot simply load dplyr using library(dplyr) without potentially breaking existing code. There are workarounds, but all of them seem to have specific disadvantages.

This document explores alternative solutions.

Analysis

Let’s take a closer look at the interface of the functions exported from both packages:

name      | plyr                                        | dplyr              | data_first | identical
mutate    | .data, …                                    | .data, …           | TRUE       | TRUE
summarise | .data, …                                    | .data, …           | TRUE       | TRUE
summarize | .data, …                                    | .data, …           | TRUE       | TRUE
arrange   | df, …                                       | .data, …           | TRUE       | FALSE
count     | df, vars, wt_var                            | x, …, wt, sort     | TRUE       | FALSE
rename    | x, replace, warn_missing, warn_duplicated   | .data, …           | TRUE       | FALSE
desc      | x                                           | x                  | FALSE      | TRUE
failwith  | default, f, quiet                           | default, f, quiet  | FALSE      | TRUE
id        | .variables, drop                            | .variables, drop   | FALSE      | TRUE
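A comparison of this kind can be sketched with formals() in base R. The helper names arg_string and compare_formals are hypothetical, the functions being compared are empty stand-ins that only mimic two of the signatures listed above, and the sketch compares argument names only (not default values), which is an assumption about how the identical column was derived.

```r
# Collapse the argument names of a function into one string, e.g. "df, ...".
arg_string <- function(f) paste(names(formals(f)), collapse = ", ")

# Build one table row for a pair of same-named functions.
# Only argument names are compared; default values are ignored.
compare_formals <- function(name, f, g) {
  data.frame(
    name = name,
    first = arg_string(f),
    second = arg_string(g),
    identical = identical(names(formals(f)), names(formals(g))),
    stringsAsFactors = FALSE
  )
}

# Empty stand-ins that reproduce two signatures from the table.
arrange_first  <- function(df, ...) NULL
arrange_second <- function(.data, ...) NULL
desc_first     <- function(x) NULL
desc_second    <- function(x) NULL

rows <- rbind(
  compare_formals("arrange", arrange_first, arrange_second),
  compare_formals("desc", desc_first, desc_second)
)
print(rows)
```

With both packages installed, the same helper could presumably be applied to the real pairs, for example plyr::arrange and dplyr::arrange.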

We can split the overlaps in two groups:

Data is first argument

name      | plyr                                        | dplyr            | identical
mutate    | .data, …                                    | .data, …         | TRUE
summarise | .data, …                                    | .data, …         | TRUE
summarize | .data, …                                    | .data, …         | TRUE
arrange   | df, …                                       | .data, …         | FALSE
count     | df, vars, wt_var                            | x, …, wt, sort   | FALSE
rename    | x, replace, warn_missing, warn_duplicated   | .data, …         | FALSE

Here, mutate, summari[sz]e and arrange behave the same way in both plyr and dplyr, although plyr::summari[sz]e on a grouped tbl_df seems to destroy the grouping. The count function is similar as well, except that plyr::count is closer to dplyr::count_, and that the first argument of dplyr::count is, for some reason, called x rather than .data. Likewise, plyr::rename seems to be closer to dplyr::rename_.

Identical formals

name     | plyr               | dplyr
desc     | x                  | x
failwith | default, f, quiet  | default, f, quiet
id       | .variables, drop   | .variables, drop

Here, we take a look at the bodies. We compare them for textual identity.

body_l %>%
  llply(. %>% { identical(as.character(.$plyr), as.character(.$dplyr)) } ) %>%
  ldply(. %>% data.frame(identical_body = .), .id = "name")
name identical_body
desc TRUE
failwith FALSE
id TRUE
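The same textual check can be tried on small stand-in functions in base R. The helper same_body_text and the three negate_* functions are hypothetical and serve only to show that textual identity is stricter than behavioral equivalence: two functions can do the same thing and still compare as different.

```r
# Compare two function bodies by their character representation,
# mirroring the identical(as.character(...), as.character(...)) check above.
same_body_text <- function(f, g) {
  identical(as.character(body(f)), as.character(body(g)))
}

# Three stand-ins with the same behavior.
negate_a <- function(x) -xtfrm(x)
negate_b <- function(x) -xtfrm(x)
negate_c <- function(x) { -xtfrm(x) }  # extra braces change the body text

same_ab <- same_body_text(negate_a, negate_b)
same_ac <- same_body_text(negate_a, negate_c)

print(same_ab)  # TRUE
print(same_ac)  # FALSE
```

A FALSE result therefore only says that the code was written differently; whether the behavior differs has to be checked separately.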

Only failwith is different, but the effect seems to be the same:

body_l$failwith$plyr
{
  f <- match.fun(f)
  function(...) try_default(f(...), default, quiet = quiet)
}
body_l$failwith$dplyr
{
  function(...) {
    out <- default
    try(out <- f(...), silent = quiet)
    out
  }
}
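To check that both bodies have the same effect, they can be rebuilt side by side in base R. This is a sketch under assumptions: try_default_sketch is a simplified stand-in for plyr's internal try_default, and failwith_first, failwith_second and strict_sqrt are names made up for the example.

```r
# Simplified stand-in for plyr's try_default(); an assumption of this sketch.
try_default_sketch <- function(expr, default, quiet = FALSE) {
  result <- default
  try(result <- expr, silent = quiet)  # on error, `result` keeps the default
  result
}

# Variant following the first body shown above.
failwith_first <- function(default = NULL, f, quiet = FALSE) {
  f <- match.fun(f)
  function(...) try_default_sketch(f(...), default, quiet = quiet)
}

# Variant following the second body shown above.
failwith_second <- function(default = NULL, f, quiet = FALSE) {
  function(...) {
    out <- default
    try(out <- f(...), silent = quiet)
    out
  }
}

# A function that fails for non-numeric input.
strict_sqrt <- function(x) {
  if (!is.numeric(x)) stop("not numeric")
  sqrt(x)
}

safe_first  <- failwith_first(NA, strict_sqrt, quiet = TRUE)
safe_second <- failwith_second(NA, strict_sqrt, quiet = TRUE)

print(safe_first(16))     # 4
print(safe_second(16))    # 4
print(safe_first("a"))    # NA
print(safe_second("a"))   # NA
```

One difference visible in the bodies themselves: the first variant resolves f through match.fun, so it also accepts a function name given as a string, whereas the second expects a function object.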

Concept

For a fully seamless transition between plyr and dplyr, a compatibility package looks like a possible option. It would provide the union of the exported functions of both packages, and compatibility wrappers for the two functions count and rename that need special attention.

Deprecating the functions count and rename in the plyr package seems simpler. For an even more radical solution, all overlapping functions could be deprecated, referring to the identical functionality in dplyr.
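A deprecation of this kind could be sketched with base R's .Deprecated(). Both function names are hypothetical: legacy_count is a base R stand-in for the old implementation (not plyr's actual code), and count_compat is the deprecated entry point that warns and then delegates unchanged.

```r
# Hypothetical stand-in for the legacy implementation: count the rows
# per distinct value of one column, given by name.
legacy_count <- function(df, vars) {
  freq <- as.data.frame(table(df[[vars]]), stringsAsFactors = FALSE)
  names(freq) <- c(vars, "freq")
  freq
}

# Deprecated entry point: emit a deprecation warning that points to the
# replacement, then delegate to the old behavior.
count_compat <- function(df, vars) {
  .Deprecated("dplyr::count_", package = "plyr", old = "count")
  legacy_count(df, vars)
}

# Capture the warning text while still obtaining the result.
warning_text <- NULL
result <- withCallingHandlers(
  count_compat(mtcars, "gear"),
  warning = function(w) {
    warning_text <<- conditionMessage(w)
    invokeRestart("muffleWarning")
  }
)

print(result)
cat(warning_text, "\n")
```

Existing code keeps working and returns the same result, while each call points the user to the replacement.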

Summary

For practical use, a thin compatibility layer seems to work reasonably well for a project that was created using plyr and is now transitioning towards dplyr:

attach(pdlyr::dplyr_compat)
## The following objects are masked from package:dplyr:
## 
##     count, mutate, rename
## 
## The following objects are masked from package:plyr:
## 
##     count, mutate, rename

Tests in the pdlyr package will ensure that this package can be safely loaded with warn.conflicts = FALSE. The original plyr implementations can be accessed via shortcuts (prefix p).

A few usage examples:

mtcars %>% mutate(lphkm = 100 * 3.785411784 / 1.609344 / mpg) %>% head
## Warning in mutate(., lphkm = 100 * 3.785411784/1.609344/mpg): Row names
## will be lost
mpg cyl disp hp drat wt qsec vs am gear carb lphkm
21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 11.20069
21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 11.20069
22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 10.31643
21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 10.99134
18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 12.57832
18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 12.99528
mtcars %>% plyr::mutate(lphkm = 100 * 3.785411784 / 1.609344 / mpg) %>% head
mpg cyl disp hp drat wt qsec vs am gear carb lphkm
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 11.20069
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 11.20069
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 10.31643
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 10.99134
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 12.57832
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 12.99528
mtcars %>% ddply("cyl", summarize, mean_mpg = mean(mpg))
cyl mean_mpg
4 26.66364
6 19.74286
8 15.10000
mtcars %>% ddply("cyl", plyr::summarise, mean_mpg = mean(mpg))
cyl mean_mpg
4 26.66364
6 19.74286
8 15.10000
mtcars %>% arrange(-wt) %>% head
mpg cyl disp hp drat wt qsec vs am gear carb
10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
mtcars %>% plyr::arrange(-wt) %>% head
mpg cyl disp hp drat wt qsec vs am gear carb
10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
mtcars %>% count("gear")
## Warning: 'count' is deprecated.
## Use 'dplyr::count_' instead.
## See help("Deprecated") and help("plyr-deprecated").
gear freq
3 15
4 12
5 5
mtcars %>% plyr::count("gear")
gear freq
3 15
4 12
5 5
mtcars %>% rename(list(mpg = "miles_per_gallon")) %>% extract(1:2) %>% head
## Warning: 'rename' is deprecated.
## Use 'dplyr::rename_' instead.
## See help("Deprecated") and help("plyr-deprecated").
miles_per_gallon cyl
Mazda RX4 21.0 6
Mazda RX4 Wag 21.0 6
Datsun 710 22.8 4
Hornet 4 Drive 21.4 6
Hornet Sportabout 18.7 8
Valiant 18.1 6
mtcars %>% plyr::rename(list(mpg = "miles_per_gallon")) %>% extract(1:2) %>% head
miles_per_gallon cyl
Mazda RX4 21.0 6
Mazda RX4 Wag 21.0 6
Datsun 710 22.8 4
Hornet 4 Drive 21.4 6
Hornet Sportabout 18.7 8
Valiant 18.1 6

Last changed: 2015-05-27 11:36:17 UTC