Problem
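We have the following scatterplot of two continuous variables, X and Y, with points coloured by a categorical variable (Label) that has two groups, A and B.
When we draw a 2D density plot of the same data, the densities of the two groups overlap. We want to control the number of contour bins.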
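library(ggplot2)

set.seed(123)  # reproducible simulated data

# Two groups of points with different centres
plot_data <-
  data.frame(
    X = c(rnorm(300, 3, 2.5), rnorm(150, 7, 2)),
    Y = c(rnorm(300, 6, 2.5), rnorm(150, 2, 2)),
    Label = c(rep('A', 300), rep('B', 150))
  )

# Scatterplot, coloured by group
ggplot(plot_data, aes(X, Y, colour = Label)) + geom_point()

# 2D density with the default number of contour levels
# (..level.. is deprecated since ggplot2 3.4.0; after_stat(level) is the newer form)
ggplot(plot_data, aes(X, Y)) +
  stat_density_2d(geom = "polygon", aes(alpha = ..level.., fill = Label))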
Solution
- Option 1: Pass the bins argument (the number of contour bins) to stat_density_2d. Fewer bins reduce overplotting and draw attention to a small number of density regions, with very little extra code.
ggplot(plot_data, aes(X, Y, group = Label)) +
  stat_density_2d(geom = "polygon",
                  aes(alpha = ..level.., fill = Label),
                  bins = 4)
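To see how many contour levels the bins argument actually produces, one can inspect the data computed by the stat. The following is a minimal sketch, assuming ggplot2 3.3.0 or later (for after_stat()) and using layer_data() to read the computed level column. The names base, p_default, p_binned and the helper count_levels are illustrative and not part of the original example.

```r
library(ggplot2)

set.seed(123)
plot_data <- data.frame(
  X = c(rnorm(300, 3, 2.5), rnorm(150, 7, 2)),
  Y = c(rnorm(300, 6, 2.5), rnorm(150, 2, 2)),
  Label = c(rep("A", 300), rep("B", 150))
)

base <- ggplot(plot_data, aes(X, Y, group = Label))

# Same layer twice: default contour breaks vs. bins = 4
p_default <- base +
  stat_density_2d(geom = "polygon",
                  aes(alpha = after_stat(level), fill = Label))
p_binned <- base +
  stat_density_2d(geom = "polygon",
                  aes(alpha = after_stat(level), fill = Label),
                  bins = 4)

# Hypothetical helper: number of distinct contour levels drawn by layer 1
count_levels <- function(p) length(unique(layer_data(p)$level))

n_default <- count_levels(p_default)
n_binned  <- count_levels(p_binned)
cat("default:", n_default, "levels; bins = 4:", n_binned, "levels\n")
```

The exact counts depend on the data and on the ggplot2 version, so the sketch prints them rather than assuming fixed numbers.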
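- Option 2: Convert the contour level to a factor and assign the fill colours by hand with scale_fill_manual. Here the three lowest levels are given NA, with the intent of leaving them unfilled, and the remaining levels get progressively darker blues. The values vector needs at least as many entries as there are contour levels.
ggplot(plot_data, aes(X, Y, group = Label)) +
  stat_density_2d(geom = "polygon", aes(fill = as.factor(..level..))) +
  scale_fill_manual(values = c(NA, NA, NA, "#BDD7E7", "#6BAED6", "#3182BD", "#08519C"))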
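A hard-coded palette like the one above only works when its length covers the number of contour levels, which changes with the data. Below is a minimal sketch of building the palette from the computed levels instead, assuming ggplot2 3.3.0 or later. The names p, n_levels, n_hidden and pal, and the use of "transparent" in place of NA for the hidden levels, are illustrative choices rather than part of the original example.

```r
library(ggplot2)

set.seed(123)
plot_data <- data.frame(
  X = c(rnorm(300, 3, 2.5), rnorm(150, 7, 2)),
  Y = c(rnorm(300, 6, 2.5), rnorm(150, 2, 2)),
  Label = c(rep("A", 300), rep("B", 150))
)

p <- ggplot(plot_data, aes(X, Y, group = Label)) +
  stat_density_2d(geom = "polygon",
                  aes(fill = as.factor(after_stat(level))))

# Count the distinct contour levels the stat actually computed
n_levels <- length(unique(layer_data(p)$level))

# Hide up to three of the lowest levels, but always keep at least one visible
n_hidden <- min(3, n_levels - 1)

# Light-to-dark blue ramp for the visible levels (lowest level first)
blues <- colorRampPalette(c("#BDD7E7", "#08519C"))(n_levels - n_hidden)
pal <- c(rep("transparent", n_hidden), blues)

p + scale_fill_manual(values = pal, name = "level")
```

Because the palette length is derived from the plot itself, the same code keeps working if the data or the bins setting changes.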