2019-12-29

Read compressed files in R using readr

Problem

We need to read compressed files in R.

Solution

We will use the readr package. According to its documentation:

Files ending in .gz, .bz2, .xz, or .zip will be automatically uncompressed. Files starting with http://, https://, ftp://, or ftps:// will be automatically downloaded. Remote gz files can also be automatically downloaded and decompressed.
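The behaviour described above can be checked without any external data. The following is a minimal sketch, assuming a small made-up sample (the three lines written below are illustrative, not the real IMDb file): it writes a gzip-compressed TSV to a temporary file with base R's gzfile() and reads it back with read_tsv().

```r
library(readr)

# Hypothetical sample data, written to a temporary gzip-compressed file
path <- tempfile(fileext = ".tsv.gz")
con <- gzfile(path, "w")
writeLines(c("tconst\taverageRating\tnumVotes",
             "tt0000001\t5.8\t1423",
             "tt0000002\t\\N\t168"), con)
close(con)

# read_tsv() recognises the compressed file and decompresses it on the fly;
# na = "\\N" turns the literal \N marker into NA
df_sample <- read_tsv(path, na = "\\N", quote = "")
df_sample
```

No explicit decompression step is needed; the call is the same one we would use for an uncompressed .tsv file.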

In our example we use the file title.ratings.tsv.gz.

library(readr)
df_ratings <- read_tsv('title.ratings.tsv.gz', na = "\\N", quote = '')
head(df_ratings)

We can also pass the URL directly, and the file will be downloaded and decompressed automatically:

df_ratings <- read_tsv('https://datasets.imdbws.com/title.ratings.tsv.gz', na = "\\N", quote = '')
head(df_ratings)
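Reading straight from the URL downloads the file again on every call. If that becomes slow, one possible approach is to keep a local compressed copy and read that instead. The sketch below assumes a helper named read_tsv_cached(), which is hypothetical and not part of readr; it relies only on base R's file.exists() and download.file().

```r
library(readr)

# Hypothetical helper (not part of readr): download `url` to `dest` only if
# `dest` does not exist yet, then read the local compressed copy.
read_tsv_cached <- function(url, dest = basename(url), ...) {
  if (!file.exists(dest)) {
    # mode = "wb" keeps the compressed bytes intact on Windows
    download.file(url, destfile = dest, mode = "wb", quiet = TRUE)
  }
  # Extra arguments such as na and quote are passed on to read_tsv()
  read_tsv(dest, ...)
}
```

Calling read_tsv_cached('https://datasets.imdbws.com/title.ratings.tsv.gz', na = "\\N", quote = '') a second time would then read title.ratings.tsv.gz from the working directory instead of downloading it again.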

Results

# A tibble: 6 x 3
  tconst    averageRating numVotes
  <chr>             <dbl>    <dbl>
1 tt0000001           5.8     1423
2 tt0000002           6.4      168
3 tt0000003           6.6     1016
4 tt0000004           6.4      100
5 tt0000005           6.2     1713
6 tt0000006           5.5       88
