
Descriptive statistics by group in R



We want to report descriptive statistics in R by a grouping variable and keep only a subset of the statistics in the output.


We will use the iris data frame, taking the columns Sepal.Length and Sepal.Width and grouping by Species. In our example, we want to return the mean, the standard deviation, the skewness and the kurtosis.

  • Subset of descriptive statistics by group
    library(psych)
    # Variables by index
    d <- describeBy(iris[1:2], group = iris$Species)
    # Two options to subset the statistics; columns 3, 4, 11 and 12
    # of each table are mean, sd, skew and kurtosis:
    lapply(d, "[", , c(3, 4, 11, 12))
    lapply(d, subset, , c(3, 4, 11, 12))
    # Variables by name
    i <- match(c("Sepal.Length", "Sepal.Width"), names(iris))
    d <- describeBy(iris[i], group = iris$Species)
    lapply(d, subset, , c("mean", "sd", "skew", "kurtosis")) 
    $setosa
                 mean   sd skew kurtosis
    Sepal.Length 5.01 0.35 0.11    -0.45
    Sepal.Width  3.43 0.38 0.04     0.60

    $versicolor
                 mean   sd  skew kurtosis
    Sepal.Length 5.94 0.52  0.10    -0.69
    Sepal.Width  2.77 0.31 -0.34    -0.55

    $virginica
                 mean   sd skew kurtosis
    Sepal.Length 6.59 0.64 0.11    -0.20
    Sepal.Width  2.97 0.32 0.34     0.38
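
To sanity-check the grouped table, the mean and sd columns can be recomputed with base R alone. This is only a minimal sketch: the names vars, by_species, summarise_group and check are made up for illustration and are not part of psych. Skewness and kurtosis are left out on purpose, because their estimators differ between packages.

```r
# Base R cross-check of the grouped mean and sd (psych is not needed here).
# All object names below are illustrative, not taken from psych.
vars <- c("Sepal.Length", "Sepal.Width")

# One data frame per species.
by_species <- split(iris[vars], iris$Species)

# For each group, build a small table: one row per variable,
# one column per statistic.
summarise_group <- function(df) {
  t(sapply(df, function(x) c(mean = mean(x), sd = sd(x))))
}

check <- lapply(by_species, summarise_group)

# Round to two decimals, as in the psych output.
lapply(check, round, 2)
```

The rounded values should agree with the mean and sd columns printed above for each species.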
  • Subset of descriptive statistics without grouping
    # Select the desired columns of the table
    d <- describe(iris[1:2])
    # Subsetting output statistics
    d[, c(3, 4, 11, 12)]
                 mean   sd skew kurtosis
    Sepal.Length 5.84 0.83 0.31    -0.61
    Sepal.Width  3.06 0.44 0.31     0.14
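
The positions 3, 4, 11 and 12 depend on the column order of the describe() output. Selecting by name, as in the grouped example, is a little more robust. The sketch below assumes psych is installed; stat_names and d_named are illustrative names.

```r
library(psych)

# Statistic names as they appear in the header of the describe() output.
stat_names <- c("mean", "sd", "skew", "kurtosis")

d <- describe(iris[1:2])

# Select the statistics by name rather than by position.
d_named <- d[, stat_names]
d_named
```

This should print the same table as the index-based call above.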

