Problem
When we filter a data frame that contains a factor and then, for example, build a contingency table, R still reports the levels that no longer occur. This happens because subsetting does not, in general, drop unused levels.
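# stringsAsFactors = TRUE makes name a factor; since R 4.0.0 the default is FALSE
df <- data.frame(name = c("a", "a", "a", "b", "b", "c", "c", "c", "c"), x = 1:9,
                 stringsAsFactors = TRUE)
library(dplyr)
# Keep only the groups with fewer than 4 rows (a and b)
aa <- df %>%
  group_by(name) %>%
  filter(n() < 4)
table(aa$name)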
In our example, the level c is still included in the results. We'd like to remove it and display only the used levels a and b.
# Result
a b c
3 2 0
# Desired result
a b
3 2
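The same behaviour can be reproduced in base R without dplyr. The following is a minimal sketch; the names f and kept are illustrative and do not come from the example above.

```r
# A factor with three levels, mirroring the name column
f <- factor(c("a", "a", "a", "b", "b", "c", "c", "c", "c"))

# Subsetting keeps the full set of levels, even those with no remaining values
kept <- f[f != "c"]

levels(kept)  # still lists the level c
table(kept)   # c is reported with a count of 0
```

Checking levels() is a quick way to confirm whether a factor still carries levels that no longer occur in the data.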
Solution
There are two alternatives: the functions droplevels() and factor(). Both return a factor that keeps only the levels that actually occur.
table(droplevels(aa$name))
table(factor(aa$name))
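As a rough sketch of how the two alternatives compare, the code below applies both to a small stand-in factor. The names f, kept, by_droplevels, by_factor and df_kept are hypothetical and only serve the illustration. One practical difference is that droplevels() also accepts a whole data frame and cleans every factor column in it, whereas factor() works on a single vector.

```r
# Stand-in factor with an unused level after subsetting
f <- factor(c("a", "a", "a", "b", "b", "c", "c", "c", "c"))
kept <- f[f != "c"]

# Both calls return a factor without the unused level
by_droplevels <- droplevels(kept)
by_factor <- factor(kept)
levels(by_droplevels)
levels(by_factor)

# droplevels() can also be applied to a data frame as a whole
df_kept <- data.frame(name = kept, x = seq_along(kept))
levels(droplevels(df_kept)$name)
```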
If we are using dplyr and the pipe operator:
aa <- df %>%
  group_by(name) %>%
  filter(n() < 4) %>%
  droplevels()
table(aa$name)
# Better still
df %>%
  group_by(name) %>%
  filter(n() < 4) %>%
  droplevels() %>%
  {table(.$name)}
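The braces in the last step stop the magrittr pipe from also passing the data frame itself as the first argument of table(). A possible alternative, sketched here on the assumption that dplyr's pull() is acceptable in your pipeline, is to extract the column as a vector first and then drop the levels. The name counts is illustrative.

```r
library(dplyr)

# Same data as above, with name stored as a factor
df <- data.frame(name = c("a", "a", "a", "b", "b", "c", "c", "c", "c"), x = 1:9,
                 stringsAsFactors = TRUE)

counts <- df %>%
  group_by(name) %>%
  filter(n() < 4) %>%   # keep groups with fewer than 4 rows
  ungroup() %>%
  pull(name) %>%        # extract the factor column as a vector
  droplevels() %>%      # remove the unused level
  table()

counts
```

Whether this reads better than the braces is a matter of taste; both versions produce the table with only the levels in use.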