
Transforming contingency tables into frequency tables in R


We want to transform a contingency table into a frequency table in R.

# Contingency table
tbl <- table(mtcars[, c("am", "gear")])
tbl
   gear
am   3  4  5
  0 15  4  0
  1  0  8  5

Frequency tables

We transform the contingency table into a data frame.

df <- as.data.frame(tbl)
  am gear Freq
1  0    3   15
2  1    3    0
3  0    4    4
4  1    4    8
5  0    5    0
6  1    5    5
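
In the resulting data frame the variables am and gear are factors and the counts go in a column named Freq. As a small sketch, as.data.frame also takes a responseName argument to rename that column; the names df_n and n below are only illustrative and do not come from the original post.

```r
tbl <- table(mtcars[, c("am", "gear")])

# Same conversion, but name the count column "n" instead of "Freq"
df_n <- as.data.frame(tbl, responseName = "n")

names(df_n)            # "am" "gear" "n"
is.factor(df_n$am)     # the table dimensions become factors
sum(df_n$n)            # 32, the number of rows in mtcars
```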

Transforming frequency tables into contingency tables

To transform a frequency table back into a contingency table, we use xtabs with the Freq column on the left-hand side of the formula.

ftable(xtabs(Freq ~ am + gear, data = df))
   gear  3  4  5
am              
0       15  4  0
1        0  8  5

It is the equivalent of:

ftable(mtcars[, c("am", "gear")])
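
As a quick check of the round trip, the sketch below rebuilds the table from the frequency data frame and compares it with the original. The object name back is only illustrative.

```r
tbl  <- table(mtcars[, c("am", "gear")])
df   <- as.data.frame(tbl)
back <- xtabs(Freq ~ am + gear, data = df)

# The rebuilt table should hold the same counts in the same order
all(as.vector(back) == as.vector(tbl))
names(dimnames(back))   # "am" "gear"
```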



Proportion tables in R


We want to create proportion tables for one or multiple variables.


  • One variable
  • tabla <- table(mtcars$am)
    prop.table(tabla)
          0       1 
    0.59375 0.40625
  • Two variables
  • tabla <- table(mtcars[, c("am", "gear")])
    prop.table(tabla)
    am        3       4       5
      0 0.46875 0.12500 0.00000
      1 0.00000 0.25000 0.15625
    The prop.table function has two arguments:

    • x, a table created with the function table
    • margin, with three possible values:
        NULL - x/sum(x), the default, as in the previous example.
        1 - proportions calculated by rows.
        2 - proportions calculated by columns.

    # By row
    prop.table(tabla, 1)
    am          3         4         5
      0 0.7894737 0.2105263 0.0000000
      1 0.0000000 0.6153846 0.3846154
    # By column
    prop.table(tabla, 2)
    am          3         4         5
      0 1.0000000 0.3333333 0.0000000
      1 0.0000000 0.6666667 1.0000000
  • Three variables
  • tabla <- table(mtcars[, c("am", "gear", "cyl")])
    prop.table(tabla)
    , , cyl = 4
    am        3       4       5
      0 0.03125 0.06250 0.00000
      1 0.00000 0.18750 0.06250
    , , cyl = 6
    am        3       4       5
      0 0.06250 0.06250 0.00000
      1 0.00000 0.06250 0.03125
    , , cyl = 8
    am        3       4       5
      0 0.37500 0.00000 0.00000
      1 0.00000 0.00000 0.06250
  • Flat Contingency Table
  • In the previous example, a better approach would be to create a flat contingency table.

    tabla <- ftable(mtcars[, c("am", "gear", "cyl")])
    prop.table(tabla)
            cyl       4       6       8
    am gear                            
    0  3        0.03125 0.06250 0.37500
       4        0.06250 0.06250 0.00000
       5        0.00000 0.00000 0.00000
    1  3        0.00000 0.00000 0.00000
       4        0.18750 0.06250 0.00000
       5        0.06250 0.03125 0.06250
  • Percentage table
  • We multiply the proportions by 100 and round them with the function round.

    round(prop.table(tabla)*100, 2)
             cyl     4     6     8
    am gear                      
    0  3         3.12  6.25 37.50
       4         6.25  6.25  0.00
       5         0.00  0.00  0.00
    1  3         0.00  0.00  0.00
       4        18.75  6.25  0.00
       5         6.25  3.12  6.25
    round(prop.table(tabla, 1)*100, 2) # By row: am and gear
            cyl     4     6     8
    am gear                      
    0  3         6.67 13.33 80.00
       4        50.00 50.00  0.00
       5          NaN   NaN   NaN
    1  3          NaN   NaN   NaN
       4        75.00 25.00  0.00
       5        40.00 20.00 40.00
    round(prop.table(tabla, 2)*100, 2) # By column, cyl
            cyl     4     6     8
    am gear                      
    0  3         9.09 28.57 85.71
       4        18.18 28.57  0.00
       5         0.00  0.00  0.00
    1  3         0.00  0.00  0.00
       4        54.55 28.57  0.00
       5        18.18 14.29 14.29
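
The meaning of margin can be checked by summing the proportions, and the same idea explains the NaN values in the row percentages above: a row with no observations has a total of 0, so its proportions are 0/0. The following is a minimal sketch; the names flat, counts and empty_rows are illustrative.

```r
tabla <- table(mtcars[, c("am", "gear")])

sum(prop.table(tabla))          # all cells together sum to 1
rowSums(prop.table(tabla, 1))   # with margin = 1 each row sums to 1
colSums(prop.table(tabla, 2))   # with margin = 2 each column sums to 1

# Rows of the flat table without observations have a total of 0,
# so their row proportions are 0/0, which R reports as NaN
flat       <- ftable(mtcars[, c("am", "gear", "cyl")])
counts     <- unclass(flat)               # plain matrix of counts
empty_rows <- which(rowSums(counts) == 0)
empty_rows   # 3 and 4: (am = 0, gear = 5) and (am = 1, gear = 3)
```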



Contingency tables in R


We want to create a contingency table for one or multiple variables.


  • One variable
  • table(mtcars$am)
     0  1 
    19 13 
  • Two variables
  • table(mtcars$am, mtcars$gear)
         3  4  5
      0 15  4  0
      1  0  8  5
    If we want to include the names of the variables:

    table(mtcars[, c("am", "gear")])
    # or selecting the columns by position:
    table(mtcars[, 9:10])
    # or the argument dnn:
    table(mtcars$am, mtcars$gear, dnn = c("am", "gear"))
    am   3  4  5
      0 15  4  0
      1  0  8  5
  • Three variables
  • table(mtcars[, c("am", "gear", "cyl")])
    , , cyl = 4
    am   3  4  5
      0  1  2  0
      1  0  6  2
    , , cyl = 6
    am   3  4  5
      0  2  2  0
      1  0  2  1
    , , cyl = 8
    am   3  4  5
      0 12  0  0
      1  0  0  2
  • Flat contingency tables
  • In the previous example, a better approach would be to create a flat contingency table.

    ftable(mtcars[, c("am", "gear", "cyl")])
            cyl  4  6  8
    am gear             
    0  3         1  2 12
       4         2  2  0
       5         0  0  0
    1  3         0  0  0
       4         6  2  0
       5         2  1  2
    We use the arguments row.vars and col.vars to give the numbers or names of the variables used for the rows and columns of the flat contingency table. If neither is given, the last variable is used for the columns: cyl in our example.

    ftable(mtcars[, c("am", "gear", "cyl")], col.vars = c(1, 2))
         am    0        1      
        gear  3  4  5  3  4  5
    4         1  2  0  0  6  2
    6         2  2  0  0  2  1
    8        12  0  0  0  0  2
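
Since col.vars accepts names as well as numbers, the previous call can also be written with the variable names. This sketch compares both forms; the object names d, by_number and by_name are illustrative.

```r
d <- mtcars[, c("am", "gear", "cyl")]

by_number <- ftable(d, col.vars = c(1, 2))
by_name   <- ftable(d, col.vars = c("am", "gear"))

# Both calls should give the same counts in the same layout
all(unclass(by_number) == unclass(by_name))
dim(by_name)   # 3 rows (cyl) by 6 columns (am x gear)
```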


The function xtabs creates contingency tables using a formula interface, with the variables separated by +.

# One variable
xtabs(~ am, mtcars)
# Two variables
xtabs(~ am + gear, mtcars)
# Three variables
xtabs(~ am + gear + cyl, mtcars)
# Flat contingency table
ftable(xtabs(~ am + gear + cyl, mtcars))
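
As a brief sketch, the formula interface yields the same counts as table and names the dimensions after the variables in the formula. The object names by_table and by_xtabs are illustrative.

```r
by_table <- table(mtcars[, c("am", "gear")])
by_xtabs <- xtabs(~ am + gear, data = mtcars)

all(as.vector(by_table) == as.vector(by_xtabs))
names(dimnames(by_xtabs))   # "am" "gear"
```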



How to draw square cells with geom_tile in ggplot2


In the following plot created using geom_tile we have rectangular cells. How can we draw square cells instead?

library(ggplot2)

df <- data.frame(val = rnorm(100),
                 gene = rep(letters[1:20], 5),
                 cell = rep(LETTERS[1:5], each = 20))
ggplot(df, aes(y = gene, x = cell, fill = val)) +
  geom_tile(color = "white")


We add coord_fixed() or coord_equal():

The default, ratio = 1, ensures that one unit on the x-axis is the same length as one unit on the y-axis.

ggplot(df, aes(y = gene, x = cell, fill = val)) +
  geom_tile(color = "white") +
  coord_fixed() # or coord_equal()




Solving a system of linear equations in R


We want to solve a system of linear equations in R.

2x + 3y + 3z = 20
  x + 4y + 3z = 15
5x + 3y + 4z = 30


We use the function solve.

a <- rbind(c(2, 3, 3), 
           c(1, 4, 3), 
           c(5, 3, 4))
b <- c(20, 15, 30)
solve(a, b)
[1] -1.25 -6.25 13.75

If we'd like the results as fractions, we use the function fractions from the package MASS.

library(MASS)
fractions(solve(a, b))
[1]  -5/4 -25/4  55/4
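
To check the solution, we can multiply the coefficient matrix by the result and compare it with the right-hand side. This is a small sketch; the object name x is illustrative.

```r
a <- rbind(c(2, 3, 3),
           c(1, 4, 3),
           c(5, 3, 4))
b <- c(20, 15, 30)
x <- solve(a, b)

# a %*% x should reproduce b up to floating-point error
all.equal(as.vector(a %*% x), b)

# A non-zero determinant means the system has a unique solution
det(a)
```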



Spacing between legend keys in ggplot2


We want to add some spacing between the elements of the legend to the following plot using ggplot2.

library(dplyr)
library(ggplot2)

mtcars %>%
  mutate(transmission = ifelse(am, "manual", "automatic")) %>%
  ggplot() +
  aes(x = transmission, fill = transmission) +
  geom_bar() +
  labs(fill = NULL) +
  theme(
    #legend.spacing.x = unit(.5, "char"), # adds spacing to the left too
    legend.position = "top",
    legend.justification = c(0, 0),
    legend.title = element_blank(),
    legend.margin = margin(5, 5, 5, 0))


  • First option: spacing between the 4 elements
  • We use legend.spacing.x to add spacing between all 4 elements of the legend, that is, each key and its text.

    mtcars %>%
      mutate(transmission = ifelse(am, "manual", "automatic")) %>%
      ggplot() +
      aes(x = transmission, fill = transmission) +
      geom_bar() +
      labs(fill = NULL) +
      theme(legend.spacing.x = unit(.5, "char"),
        legend.position = "top",
        legend.justification = c(0, 0),
        legend.title = element_blank(),
        legend.margin = margin(5, 5, 5, 0))
  • Second option: spacing between the 2 groups automatic and manual
  • We pass the argument margin to the element_text function to add spacing only between the 2 groups, automatic and manual. The space is added to the right of each label, so it appears between the text automatic and the key (blue square) of manual.

    mtcars %>%
      mutate(transmission = ifelse(am, "manual", "automatic")) %>%
      ggplot() +
      aes(x = transmission, fill = transmission) +
      geom_bar() +
      labs(fill = NULL) +
      theme(legend.position = "top",
        legend.justification = c(0, 0),
        legend.title = element_blank(),
        legend.margin = margin(5, 5, 5, 0),
        legend.text = element_text(margin = margin(r = 10, unit = "pt")))




    How to calculate elapsed and remaining days between dates in R


    We would like to calculate the number of elapsed and remaining days in a year.


  • From a starting date in the year
  • date <- as.Date("2019-03-29")
    # Elapsed days, counted from January 1 of the year of date
    date - as.Date(ISOdate(format(date, "%Y"), 1, 1)) + 1
    Time difference of 88 days
    # Remaining days, counted up to December 31 of the year of date
    as.Date(ISOdate(format(date, "%Y"), 12, 31)) - date
    Time difference of 277 days
  • From today
  • # Elapsed days (the output depends on the day the code is run)
    Sys.Date() - as.Date(ISOdate(format(Sys.Date(), "%Y"), 1, 1)) + 1 
    Time difference of 153 days
    # Remaining days
    as.Date(ISOdate(format(Sys.Date(), "%Y"), 12, 31)) - Sys.Date() 
    Time difference of 212 days
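
The two calculations can be wrapped in a small function that takes the year from the date itself, so it also works for leap years. This is a minimal sketch and the function name year_progress is hypothetical, not part of any package.

```r
# Hypothetical helper: elapsed and remaining days in the year of date d
year_progress <- function(d) {
  d     <- as.Date(d)
  year  <- format(d, "%Y")
  first <- as.Date(paste0(year, "-01-01"))
  last  <- as.Date(paste0(year, "-12-31"))
  c(elapsed   = as.integer(difftime(d, first, units = "days")) + 1L,
    remaining = as.integer(difftime(last, d, units = "days")))
}

year_progress("2019-03-29")   # elapsed 88, remaining 277
year_progress("2020-12-31")   # leap year: elapsed 366, remaining 0

# The elapsed days equal the day of the year, available as "%j"
as.integer(format(as.Date("2019-03-29"), "%j"))   # 88
```

Returning plain integers instead of difftime objects makes the result easier to reuse in later arithmetic.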


    Nube de datos