2019-10-23

How to reorder bar charts with ggplot2

Problem

We'd like to reorder the bars of a bar chart. In the example below, the bars follow the order of the factor levels (alphabetical by default) instead of the ascending or descending order of the counts.

library(tidyverse)
ggplot(df, aes(x = Position)) + geom_bar()
  • Data frame
    df <- structure(list(Position = structure(c(3L, 3L, 1L, 1L, 1L, 2L), .Label = c("Defense", 
    "Striker", "Zoalkeeper"), class = "factor"), Name = structure(c(2L, 
    1L, 3L, 5L, 4L, 6L), .Label = c("Frank", "James", "Jean", "John", 
    "Steve", "Tim"), class = "factor")), class = "data.frame", row.names = c(NA, 
    -6L))
    
        Position  Name
    1 Zoalkeeper James
    2 Zoalkeeper Frank
    3    Defense  Jean
    4    Defense Steve
    5    Defense  John
    6    Striker   Tim
    

    Solution

    We can use reorder to order the levels of the factor by count: ascending with n, descending with -n.

    • Descending order
      df %>%
        count(Position) %>%
        ggplot(aes(x = reorder(Position, -n), y = n)) +
        geom_bar(stat = 'identity') +
        xlab("Position")
      
    • Ascending order
      df %>%
        count(Position) %>%
        ggplot(aes(x = reorder(Position, n), y = n)) +
        geom_bar(stat = 'identity') +
        xlab("Position")
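
To see what reorder does to the factor levels, the sketch below rebuilds the counts in base R and inspects the resulting level order. The names counts, desc_levels and asc_levels are illustrative and not part of the original example, and the data frame is re-created with factor() for brevity rather than with structure().

```r
# Same six rows as the example data, built directly for brevity.
df <- data.frame(
  Position = factor(c("Zoalkeeper", "Zoalkeeper", "Defense",
                      "Defense", "Defense", "Striker")),
  Name = c("James", "Frank", "Jean", "Steve", "John", "Tim")
)

# Count rows per position (a base R stand-in for dplyr::count()).
counts <- as.data.frame(table(Position = df$Position), responseName = "n")

# Default level order is alphabetical, which is what the first plot shows.
levels(counts$Position)

# reorder() sorts the levels by the value passed as second argument.
desc_levels <- levels(reorder(counts$Position, -counts$n))
asc_levels  <- levels(reorder(counts$Position, counts$n))

desc_levels
asc_levels
```

Since ggplot2 draws the bars in level order, reorder(Position, -n) puts the most frequent position first. If the forcats package is available, forcats::fct_infreq() should give a similar descending-by-frequency ordering directly on the raw column.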
      

    2019-10-18

    How to subset a contingency table in R?

    Problem

    We'd like to subset a contingency table. In our example, we use the dataset chickwts, subsetting those types of feed for which we have more than 11 observations.

    table(chickwts$feed)
    
       casein horsebean   linseed  meatmeal   soybean sunflower 
           12        10        12        11        14        12
    

    Solution

    • Base package
    • We convert the table to a data frame and filter it with the function subset.

      subset(data.frame(table(chickwts$feed)), Freq > 11)
      
    • dplyr
      library(dplyr)
      chickwts %>% 
        count(feed) %>%
        filter(n > 11) 
      
      

    Results

    # base
          Var1 Freq
    1    casein   12
    3   linseed   12
    5   soybean   14
    6 sunflower   12
    
    # dplyr
    
    # A tibble: 4 × 2
           feed     n
          
    1    casein    12
    2   linseed    12
    3   soybean    14
    4 sunflower    12
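
As a further base R variant, a one-dimensional table can also be indexed directly with a logical condition, which avoids the conversion to a data frame. This is a minimal sketch; the names feed_counts and kept are illustrative.

```r
# chickwts ships with R in the datasets package.
feed_counts <- table(chickwts$feed)

# Logical indexing keeps only the cells whose count exceeds 11;
# the feed names are carried along with the counts.
kept <- feed_counts[feed_counts > 11]
kept

# The names alone, e.g. to filter the original data frame afterwards.
names(kept)
```

This form is convenient when only the counts or the category names are needed; the data frame versions are easier to pass on to further dplyr or ggplot2 steps.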
    

    2019-10-12

    Conditional formatting to hide cell content in Excel

    Problem

    We would like to hide the contents of a cell based on the value of another cell.

    Solution

    In A2 we apply a conditional formatting rule that hides the content of that cell if it is equal to B2.

    In Conditional Formatting we create a new rule with the formula =$A$2=$B$2, then press Format...
    1. In the Format Cells dialog, click the Number tab.
    2. Under Category, click Custom.
    3. In the Type box, type ;;; (that is, three semicolons in a row), and then click OK.

    Notes

    Why three semicolons? A custom number format can have up to four sections, separated by semicolons: positive numbers, negative numbers, zeros, and text. Typing three semicolons and nothing else defines four empty sections, so the cell appears empty whatever it contains.
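
The section logic can be mimicked in a few lines of R. This is only a simplified sketch: pick_section is a hypothetical helper, not an Excel or R function, and it assumes a format string with exactly four sections and no quoted semicolons.

```r
# Hypothetical helper: returns the format section that would apply to a value,
# assuming the layout positive;negative;zero;text.
pick_section <- function(value, fmt) {
  # strsplit() drops a trailing empty string, so append a separator first.
  sections <- strsplit(paste0(fmt, ";"), ";", fixed = TRUE)[[1]]
  stopifnot(length(sections) == 4)
  if (is.character(value)) return(sections[4])
  if (value > 0) return(sections[1])
  if (value < 0) return(sections[2])
  sections[3]
}

fmt <- "0.00;-0.00;0;@"
pick_section(5, fmt)      # section used for positive numbers
pick_section("abc", fmt)  # section used for text

# With ";;;" every section is empty, so no value has anything to display.
hidden <- vapply(list(5, -5, 0, "abc"), pick_section, character(1), fmt = ";;;")
hidden
```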

    2019-10-03

    How to add a text box to a plot in R

    Problem

    We want to add a text box to a plot in R. First, here is a rather convoluted way to do it by hand.

    # A plot
    plot(x = runif(1000), y = runif(1000), type = "p", pch = 16, col = "#40404050")
    
    # Text box parameters
    myText <- "some Text"
    posCoordsVec <- c(0.5, 0.5)
    cex <- 2
    
    # Rectangle
    textHeight <- graphics::strheight(myText, cex = cex)
    textWidth <- graphics::strwidth(myText, cex = cex)
    pad <- textHeight*0.3
    rect(xleft = posCoordsVec[1] - textWidth/2 - pad, 
            ybottom = posCoordsVec[2] - textHeight/2 - pad, 
            xright = posCoordsVec[1] + textWidth/2 + pad, 
            ytop = posCoordsVec[2] + textHeight/2 + pad,
            col = "lightblue", border = NA)
    
    # Text coordinates
    text(posCoordsVec[1], posCoordsVec[2], myText, cex = cex)
    

    Solution

    • Base graphics
    • We use the function legend with the same colour for the box border (box.col) and the background (bg), and adjust the position of the text inside the box with the argument adj.

      plot(x = runif(1000), y = runif(1000), type = "p", pch = 16, col = "#40404050")
      legend(0.4, 0.5, "Some text", box.col = "lightblue", bg = "lightblue", adj = 0.2)
      
    • ggplot2
    • With ggplot2 we use the label geom (geom_label). The text is aligned inside the box automatically, and we remove the border with label.size = NA.

      library(ggplot2)
      df <- data.frame(x = runif(1000), y = runif(1000))
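      ggplot(data = df, aes(x = x, y = y)) +
        geom_point(alpha = 0.2) +
        # annotate() draws a single label; geom_label() with constants in aes()
        # would draw one label per row of df
        annotate("label", x = 0.5, y = 0.5, label = "Some text",
                 fill = "lightblue", label.size = NA, size = 5)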
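
If precise control over the padding is still wanted, the manual rect and text approach from the Problem section can be wrapped in a small function. text_box, pad_frac and the returned coordinate vector are hypothetical names introduced here for illustration; pdf(NULL) is used only so the sketch runs without opening a window or writing a file.

```r
# Hypothetical helper wrapping the rect() + text() steps; not part of base R.
text_box <- function(x, y, label, cex = 1, pad_frac = 0.3, fill = "lightblue") {
  # Text size in user coordinates; requires an open plot.
  h <- graphics::strheight(label, cex = cex)
  w <- graphics::strwidth(label, cex = cex)
  pad <- h * pad_frac
  coords <- c(xleft = x - w / 2 - pad, ybottom = y - h / 2 - pad,
              xright = x + w / 2 + pad, ytop = y + h / 2 + pad)
  # Filled rectangle first, then the text centred on (x, y).
  graphics::rect(coords[["xleft"]], coords[["ybottom"]],
                 coords[["xright"]], coords[["ytop"]],
                 col = fill, border = NA)
  graphics::text(x, y, label, cex = cex)
  invisible(coords)
}

pdf(NULL)  # null device: nothing is written to disk
plot(x = runif(1000), y = runif(1000), type = "p", pch = 16, col = "#40404050")
coords <- text_box(0.5, 0.5, "Some text", cex = 2)
invisible(dev.off())
```

Returning the rectangle coordinates invisibly makes it possible to check or reuse the box position, for example to place a second element next to it.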
      

    Nube de datos