2019-08-25

How to rename suffixes when joining tables in dplyr

Problem

When joining two tables with dplyr, the suffixes .x and .y will be added to the non-joined duplicate variables to disambiguate them. In our example, to the column mpg: mpg.x y mpg.y. How can override these suffixes?

library(dplyr) 
left_join(mtcars, mtcars[, c("mpg", 'cyl')], by = c("cyl")) %>% head()
  • Output
  •   mpg.x cyl disp  hp drat   wt  qsec vs am gear carb mpg.y
    1    21   6  160 110  3.9 2.62 16.46  0  1    4    4  21.0
    2    21   6  160 110  3.9 2.62 16.46  0  1    4    4  21.0
    3    21   6  160 110  3.9 2.62 16.46  0  1    4    4  21.4
    4    21   6  160 110  3.9 2.62 16.46  0  1    4    4  18.1
    5    21   6  160 110  3.9 2.62 16.46  0  1    4    4  19.2
    6    21   6  160 110  3.9 2.62 16.46  0  1    4    4  17.8
    

    Solution

    We can pass the argument suffix, a character vector of length 2 with the desired names. In our example, mpg_original (suffix _original) and mpg_new (suffix_new).

    left_join(mtcars, mtcars[,c("mpg","cyl")], 
                  by = c("cyl"), 
                  suffix = c("_original", "_new")) %>% head()
    
  • Output
  •   mpg_original cyl disp  hp drat   wt  qsec vs am gear carb mpg_new
    1           21   6  160 110  3.9 2.62 16.46  0  1    4    4    21.0
    2           21   6  160 110  3.9 2.62 16.46  0  1    4    4    21.0
    3           21   6  160 110  3.9 2.62 16.46  0  1    4    4    21.4
    4           21   6  160 110  3.9 2.62 16.46  0  1    4    4    18.1
    5           21   6  160 110  3.9 2.62 16.46  0  1    4    4    19.2
    6           21   6  160 110  3.9 2.62 16.46  0  1    4    4    17.8
    

    References

    No hay comentarios:

    Publicar un comentario

    Nube de datos