2021-03-04

How to filter multiple values on a column in R

Problem

We want to keep only the rows of a data frame whose value in one column matches any of several values. In our example, we want to subset the rows of df where the column name is either "Tom" or "Lynn".

Example

  days  name
1   88  Lynn
2   11   Tom
3    2 Chris
4    5  Lisa
5   22  Kyla
6    1   Tom
7  222  Lynn
8    2  Lynn

# Create the example data frame shown above
df <-
  data.frame(
    days = c(88, 11, 2, 5, 22, 1, 222, 2),
    name = c("Lynn", "Tom", "Chris", "Lisa", "Kyla", "Tom", "Lynn", "Lynn")
  )

Solution

  • Base package

    df[df$name %in% c("Tom", "Lynn"), ] # or
    target <- c("Tom", "Lynn")
    df[df$name %in% target, ]
    
      days name
    1   88 Lynn
    2   11  Tom
    6    1  Tom
    7  222 Lynn
    8    2 Lynn
    
  • dplyr

    library(dplyr)
    filter(df, name %in% c("Tom", "Lynn")) # or
    target <- c("Tom", "Lynn")
    filter(df, name %in% target)
    
  • data.table

    library(data.table)
    target <- c("Tom", "Lynn")
    DT <- as.data.table(df)  # convert the data frame to a data.table
    DT[name %in% target]
    
  • sqldf

    library(sqldf)
    # Two alternatives (SQL string literals use single quotes):
    sqldf("SELECT *
          FROM df
          WHERE name = 'Tom' OR name = 'Lynn'")
    sqldf("SELECT *
          FROM df
          WHERE name IN ('Tom', 'Lynn')")
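
All of the R solutions above rely on %in%, which returns one TRUE/FALSE value per row. The base R sketch below makes that logical vector explicit, reuses it to select the complementary rows, and illustrates why == is not a substitute when comparing against several values. The variable names keep, matched, others and wrong are illustrative and not part of the original post.

```r
# Same example data as in the post
df <- data.frame(
  days = c(88, 11, 2, 5, 22, 1, 222, 2),
  name = c("Lynn", "Tom", "Chris", "Lisa", "Kyla", "Tom", "Lynn", "Lynn")
)
target <- c("Tom", "Lynn")

# %in% tests every name for membership in target: one TRUE/FALSE per row
keep <- df$name %in% target
print(keep)

# Rows whose name is in target
matched <- df[keep, ]

# Negating the vector keeps every other row instead
others <- df[!keep, ]
print(others)

# Caution: == recycles target along the column and compares element by
# element, so it silently misses rows instead of testing membership
wrong <- df[df$name == target, ]
print(nrow(wrong))
```

With this data, == finds far fewer rows than %in% does, because each name is compared only against the recycled element of target that happens to sit at the same position.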
    
