2020-04-03

Creating coronavirus charts in R

Introduction

We want to show how coronavirus case counts evolve over time, using R to build both static and interactive charts.

Charts

  • Interactive (linear scale)
  • Interactive (logarithmic scale)
  • Solution

    We use the data from the repository maintained by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE). It provides three time series: confirmed, deaths and recovered cases. We first prepare the data, then build the static chart with ggplot2 and use plotly to add interactivity. The series cover the whole world, but in this example we use a subset: Germany, France, Italy, Spain and the United Kingdom.
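To make the reshaping step concrete, here is a minimal sketch on a made-up table that mimics the wide layout of the JHU files (one row per region, one column per date). The object names `toy` and `toy_long` and all the values are assumptions for illustration only, not data from the repository.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Hypothetical wide table: one column per date, named in m/d/yy format
toy <- tibble(
  `Province/State` = c(NA, NA),
  `Country/Region` = c("Italy", "Spain"),
  Lat = c(43, 40),
  Long = c(12, -4),
  `1/22/20` = c(0, 0),
  `1/23/20` = c(1, 2)
)

# Reshape to long format: one row per region and date
toy_long <- toy %>%
  rename(province = `Province/State`, country = `Country/Region`) %>%
  pivot_longer(
    -c(province, country, Lat, Long),
    names_to = "date",
    values_to = "count"
  ) %>%
  mutate(date = mdy(date))  # parse the "1/22/20" style column names into dates

toy_long
```

The long format is what ggplot2 expects: each row carries one date and one count, so the date can be mapped directly to the x axis.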

    # Libraries
    library(magrittr)
    library(lubridate) 
    library(tidyverse)
    library(plotly)
    library(scales)
    
    # Data import
    confirmed <- read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Confirmed.csv")
    deaths <- read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Deaths.csv")
    recovered <- read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Recovered.csv")
    # Note: JHU CSSE later deprecated these three files in favour of
    # time_series_covid19_confirmed_global.csv, time_series_covid19_deaths_global.csv
    # and time_series_covid19_recovered_global.csv in the same folder.
    
    # Data preparation
    # Stack the three tables and record which series each row came from
    df <- bind_rows(
      list(confirmed = confirmed, deaths = deaths, recovered = recovered),
      .id = "source"
    )
    data <- df %>%
      rename(province = `Province/State`, country = `Country/Region`) %>% 
      pivot_longer(
        -c(province, country, Lat, Long, source),
        names_to = "date",
        values_to = "count"
      ) %>% 
      mutate(date = mdy(date))
    
    # Linear-scale chart
    p <- data %>%
      filter(country %in% c("Germany", "France", "Italy",  "Spain", "United Kingdom")) %>% 
      group_by(country, date, source) %>%
      summarise(n = sum(count)) %>%
      ggplot(aes(date, n, colour = country)) +
      geom_line(linetype = 2) +
      geom_point(size = 1) +
      facet_wrap( ~  source  , scales = "free", nrow = 3) +
      theme_bw()+
      labs(title = "Cumulative Covid-19 cases (linear scale)")+
      ylab("")+
      scale_x_date(date_labels = "%b %d")+
      scale_y_continuous(labels = comma)
    p # Static
    ggplotly(p) # Interactive
    
    # Logarithmic-scale chart
    p <- data %>%
      filter(country %in% c("Germany", "France", "Italy",  "Spain", "United Kingdom")) %>% 
      group_by(country, date, source) %>%
      summarise(n = sum(count)) %>%
      ggplot(aes(date, n, colour = country)) +
      geom_line(linetype = 2) +
      geom_point(size = 1) +
      facet_wrap( ~  source, scales = "free",  nrow = 3) +
      theme_bw()+
      labs(title = "Cumulative Covid-19 cases (log scale)")+
      ylab("")+
      scale_x_date(date_labels = "%b %d")+
      scale_y_log10(breaks = c(1, 10, 100, 1000, 10000))
    p # Static
    ggplotly(p) # Interactive
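A cumulative count of zero has no finite logarithm, so on a log axis those points cannot be drawn. One option, sketched here on made-up numbers, is to drop them before plotting. The names `toy`, `toy_pos` and `p_log` are hypothetical and the filter is an assumption about how one might handle this, not part of the original solution.

```r
library(dplyr)
library(ggplot2)

# Hypothetical cumulative series that starts at zero
toy <- tibble(
  date = as.Date("2020-03-01") + 0:4,
  n = c(0, 1, 10, 100, 1000)
)

# Keep only strictly positive counts, since log10(0) is -Inf
toy_pos <- toy %>% filter(n > 0)

p_log <- ggplot(toy_pos, aes(date, n)) +
  geom_line(linetype = 2) +
  geom_point(size = 1) +
  scale_y_log10(breaks = 10^(0:3)) +  # one break per power of ten
  theme_bw()
p_log
```

Generating the breaks as powers of ten avoids accidentally skipping one when the list is typed by hand.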
    
    To highlight a series when the pointer hovers over it, we use the highlight function from the plotly package.

    p <- data %>%
      filter(country %in% c("Germany", "France", "Italy",  "Spain", "United Kingdom")) %>% 
      group_by(country, date, source) %>%
      summarise(cases = sum(count)) %>%
      highlight_key(~ country ) %>% 
      ggplot(aes(date, cases, colour = country)) +
      geom_line(linetype = 2)+
      geom_point(size = 1) +
      facet_wrap(~  source  , scales = "free", nrow = 3)+
      theme_bw()+
      labs(title = "Cumulative Covid-19 cases (linear scale)")+
      ylab("")+
      scale_x_date(date_labels = "%b %d")+
      scale_y_continuous(labels = comma)
    ggplotly(p, tooltip = c("country", "date", "cases")) %>%
      highlight(on = "plotly_hover")
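The highlighting step can also be tried in isolation. The following is a minimal sketch on a made-up data frame: `highlight_key()` marks `country` as the grouping key, and `highlight()` dims the other series on hover. The names `toy`, `p_toy` and `w` and the `opacityDim` value are assumptions chosen for illustration.

```r
library(dplyr)
library(ggplot2)
library(plotly)

# Hypothetical cumulative case counts for two countries
toy <- tibble(
  country = rep(c("Italy", "Spain"), each = 3),
  date = rep(as.Date("2020-03-01") + 0:2, times = 2),
  cases = c(10, 20, 40, 5, 15, 45)
)

p_toy <- toy %>%
  highlight_key(~country) %>%  # rows sharing a country are highlighted together
  ggplot(aes(date, cases, colour = country)) +
  geom_line() +
  theme_bw()

# Convert to an interactive widget; non-selected series fade to 20% opacity
w <- ggplotly(p_toy) %>%
  highlight(on = "plotly_hover", opacityDim = 0.2)
w
```

Because the key is the country, hovering over any point of a line highlights the whole line rather than a single observation.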
    
    Interactive chart here. Screenshot below.
