Problem
We'd like to report descriptive statistics in R by a grouping variable and keep only a subset of the output statistics.
Solution
We will use the iris data frame, taking the columns Sepal.Length and Sepal.Width and grouping by Species. In our example, we want to return the mean, the standard deviation, the skewness and the kurtosis.
library(psych)
# Variables by index
d <- describeBy(iris[1:2], group = iris$Species)
# Two options to subset the statistics:
lapply(d, "[", , c(3, 4, 11, 12))
lapply(d, subset, , c(3, 4, 11, 12))
# Variables by name
i <- match(c("Sepal.Length", "Sepal.Width"), names(iris))
d <- describeBy(iris[i], group = iris$Species)
lapply(d, subset, , c("mean", "sd", "skew", "kurtosis"))
$setosa
mean sd skew kurtosis
Sepal.Length 5.01 0.35 0.11 -0.45
Sepal.Width 3.43 0.38 0.04 0.60
$versicolor
mean sd skew kurtosis
Sepal.Length 5.94 0.52 0.10 -0.69
Sepal.Width 2.77 0.31 -0.34 -0.55
$virginica
mean sd skew kurtosis
Sepal.Length 6.59 0.64 0.11 -0.20
Sepal.Width 2.97 0.32 0.34 0.38
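As a quick sanity check that does not depend on psych, the grouped mean and standard deviation can be reproduced in base R. This is only a sketch: the names vars, by_species and check are illustrative, and skewness and kurtosis are left out because base R has no built-in functions for them.

```r
# Mean and standard deviation of each sepal column, by species (base R only)
vars <- c("Sepal.Length", "Sepal.Width")

# One data frame per species, holding only the two sepal columns
by_species <- split(iris[vars], iris$Species)

# For each species, build a small table with one row per variable
check <- lapply(by_species, function(g) {
  data.frame(mean = sapply(g, mean), sd = sapply(g, sd))
})

# Round to two decimals to compare with the printed describeBy output
lapply(check, round, 2)
```

The rounded values should agree with the mean and sd columns shown above.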
# Select the desired columns from the data frame (no grouping)
d <- describe(iris[1:2])
# Subsetting output statistics
d[, c(3, 4, 11, 12)]
mean sd skew kurtosis
Sepal.Length 5.84 0.83 0.31 -0.61
Sepal.Width 3.06 0.44 0.31 0.14
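The numeric positions 3, 4, 11 and 12 depend on the column order of the describe output. As a minimal sketch, assuming the psych package is installed, the positions can be looked up from the column names instead of being hard-coded (wanted and idx are illustrative names):

```r
library(psych)

d <- describe(iris[1:2])

# Statistics we want to keep, identified by column name
wanted <- c("mean", "sd", "skew", "kurtosis")

# Find their positions among the columns of the describe output
idx <- match(wanted, names(d))
idx

# Subset using the looked-up positions
d[, idx]
```

Looking the positions up this way keeps the code readable and lets you confirm which statistic each index refers to.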