2020-03-20

Plotting coronavirus cases in R

Introduction

We want to show how the number of coronavirus cases evolves over time, using R to create both static and interactive plots.

Plots

  • Interactive (linear scale)
  • Interactive (log scale)
  • Solution

    We use the data repository maintained by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE). It provides three time series: confirmed cases, deaths, and recovered cases. First we prepare the data, then we plot the time series, using ggplot2 for the static version and plotly to add interactivity. The data source covers the whole world, but in this example we subset the time series to Germany, France, Italy, Spain, and the United Kingdom.
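
    The files store one row per region and one column per day, so the main preparation step is a wide-to-long reshape. Below is a minimal sketch of that step on a toy table; the values are made up and only the column layout is meant to mirror the real files.

    ```r
    library(dplyr)
    library(tidyr)
    library(lubridate)

    # Toy table mimicking the wide layout of the JHU CSSE files:
    # one row per region, one column per day (values are made up).
    wide <- tibble(
      `Province/State` = c(NA, NA),
      `Country/Region` = c("Italy", "Spain"),
      Lat = c(43, 40),
      Long = c(12, -4),
      `1/22/20` = c(0, 0),
      `1/23/20` = c(1, 2)
    )

    # Turn every date column into a (date, count) pair of rows
    long <- wide %>%
      pivot_longer(
        -c(`Province/State`, `Country/Region`, Lat, Long),
        names_to = "date",
        values_to = "count"
      ) %>%
      mutate(date = mdy(date))  # parse month/day/year strings into Date values

    long
    ```

    Each region now contributes one row per day, which is the shape ggplot2 expects for a time-series plot.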

    # Libraries
    library(magrittr)
    library(lubridate) 
    library(tidyverse)
    library(plotly)
    library(scales)
    
    # Importing data
    confirmed <- read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Confirmed.csv")
    deaths <- read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Deaths.csv")
    recovered <- read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Recovered.csv")
    
    # Data preparation
    # Stack the three tables and record which series each row came from
    df <- bind_rows(
      list(confirmed = confirmed, deaths = deaths, recovered = recovered),
      .id = "source"
    )
    data <- df %>%
      rename(province = `Province/State`, country = `Country/Region`) %>% 
      pivot_longer(
        -c(province, country, Lat, Long, source),
        names_to = "date",
        values_to = "count"
      ) %>% 
      mutate(date = mdy(date))  # parse the month/day/year column names into dates
    
    # Plot linear scale
    p <- data %>%
      filter(country %in% c("Germany", "France", "Italy", "Spain", "United Kingdom")) %>%
      # Sum over provinces to get one total per country, date and series
      group_by(country, date, source) %>%
      summarise(n = sum(count)) %>%
      ggplot(aes(date, n, colour = country)) +
      geom_line(linetype = 2) +
      geom_point(size = 1) +
      # One panel per series, each with its own y-axis range
      facet_wrap(~ source, scales = "free", nrow = 3) +
      theme_bw() +
      labs(title = "Cumulative Covid-19 cases (linear scale)") +
      ylab("") +
      scale_x_date(date_labels = "%b %d") +
      scale_y_continuous(labels = comma)
    p # Static
    ggplotly(p) # Interactive
    
    # Plot log scale
    p <- data %>%
      filter(country %in% c("Germany", "France", "Italy", "Spain", "United Kingdom")) %>%
      group_by(country, date, source) %>%
      summarise(n = sum(count)) %>%
      filter(n > 0) %>%  # zero counts cannot be drawn on a log axis
      ggplot(aes(date, n, colour = country)) +
      geom_line(linetype = 2) +
      geom_point(size = 1) +
      facet_wrap(~ source, scales = "free", nrow = 3) +
      theme_bw() +
      labs(title = "Cumulative Covid-19 cases (log scale)") +
      ylab("") +
      scale_x_date(date_labels = "%b %d") +
      scale_y_log10(breaks = c(1, 10, 100, 1000, 10000), labels = comma)
    p # Static
    ggplotly(p) # Interactive
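
    Both plots repeat the same filtering and aggregation steps. One possible way to avoid the repetition is to move those steps into a small helper. The sketch below assumes a long-format table with the columns country, date, source and count, as built in the data preparation step; summarise_cases is a hypothetical name, and the toy values are made up.

    ```r
    library(dplyr)

    # Hypothetical helper (not part of the original script): keep the chosen
    # countries and sum over provinces per country, date and series.
    summarise_cases <- function(data, countries) {
      data %>%
        filter(country %in% countries) %>%
        group_by(country, date, source) %>%
        summarise(n = sum(count)) %>%
        ungroup()
    }

    # Toy input: France is split into two provinces (values are made up)
    toy <- tibble(
      province = c("A", "B", NA),
      country = c("France", "France", "Italy"),
      source = "confirmed",
      date = as.Date("2020-03-01"),
      count = c(10, 5, 100)
    )

    totals <- summarise_cases(toy, c("France", "Italy"))
    totals
    ```

    The result could then be piped into either ggplot call, so that only the y-axis scale differs between the two versions.
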
    
    To highlight a series while hovering over it, we mark country as the linking key with highlight_key and then apply the highlight function, both from the plotly package.

    p <- data %>%
      filter(country %in% c("Germany", "France", "Italy", "Spain", "United Kingdom")) %>%
      group_by(country, date, source) %>%
      summarise(cases = sum(count)) %>%
      highlight_key(~ country) %>%
      ggplot(aes(date, cases, colour = country)) +
      geom_line(linetype = 2) +
      geom_point(size = 1) +
      facet_wrap(~ source, scales = "free", nrow = 3) +
      theme_bw() +
      labs(title = "Cumulative Covid-19 cases (linear scale)") +
      ylab("") +
      scale_x_date(date_labels = "%b %d") +
      scale_y_continuous(labels = comma)
    ggplotly(p, tooltip = c("country", "date", "cases")) %>%
      highlight(on = "plotly_hover")
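
    The highlight function takes further arguments that control how the selection behaves. Below is a minimal sketch on made-up toy data; the particular choices for off and opacityDim are illustrative assumptions, not settings used for the plots above.

    ```r
    library(ggplot2)
    library(plotly)

    # Toy data (made-up values): two series to highlight
    toy <- data.frame(
      country = rep(c("Italy", "Spain"), each = 3),
      date = rep(as.Date("2020-03-01") + 0:2, times = 2),
      cases = c(10, 20, 40, 5, 15, 30)
    )

    p_toy <- toy %>%
      highlight_key(~ country) %>%      # rows sharing a country are linked
      ggplot(aes(date, cases, colour = country)) +
      geom_line()

    w <- ggplotly(p_toy) %>%
      highlight(
        on = "plotly_hover",            # event that triggers the highlight
        off = "plotly_doubleclick",     # event that clears it
        opacityDim = 0.1                # opacity of the non-selected series
      )
    w
    ```

    A lower opacityDim makes the non-selected countries fade more strongly, which can help when many lines overlap.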
    
