Mostrando entradas con la etiqueta English. Mostrar todas las entradas
Mostrando entradas con la etiqueta English. Mostrar todas las entradas

2021-03-22

Ignore zero values in conditional formatting in Excel

Title

Problem

We want conditional formatting in Excel to ignore cells containing zero values. In the example below, after applying colour scales in conditional formatting, we can see that Excel applies red formatting to cells with zero values.

Solution

To make conditional formatting ignore zero values, we follow these steps:

  1. Create a new conditional formatting rule.
  2. Apply no format to cells with value equal to 0.
  3. Make sure that this rule is the first to be evaluated, and select the Stop If True check box.

Results

Related posts

Formato condicional

References

Use conditional formatting to highlight information

2021-03-09

How to highlight specific data points in a scatter plot with ggplot2

Problem

We would like to highlight specific data points in a scatter plot created with ggplot2. In our example, we will use the mpg dataset (Fuel economy data from 1999 and 2008 for 38 popular models of car). We want to highlight in red and a larger point size those data points whose displ (engine displacement) are greater than 5 and hwy (highway miles per gallon) greater than 20.

Solution

  • Option 1
  • We add the conditions for both attributes, colour and size, inside geom_point. Then we control manually those using scale_colour_manual and scale_size_manual respectively. Finally, we remove the legend.

    ggplot(data = mpg) + 
      geom_point(mapping = aes(x = displ, y = hwy, colour = displ > 5 & hwy > 20, size = displ > 5 & hwy > 20)) + 
      scale_colour_manual(values = c("black", "red")) + 
      scale_size_manual(values =c(1.5, 3))+
      theme(legend.position = "none")
    
  • Option 2
  • We create two layers of data points using geom_point. The first layer include all data points in black (by default but e). The second layer adds the points we would like to highlight in red with a larger point size.

    ggplot(data = mpg) + 
      geom_point(mapping = aes(x = displ, y = hwy), colour= "black") +
      geom_point(data = subset(mpg, displ > 5 & hwy > 20), aes(x = displ, y = hwy), colour= "red", size = 3)
     
    In ggplot2 the layers are added sequentially. If we change the order of the layers, we would get the following result.

    ggplot(data = mpg) + 
       geom_point(data = subset(mpg, displ > 5 & hwy > 20), aes(x = displ, y = hwy), colour= "red", size = 3) +
       geom_point(mapping = aes(x = displ, y = hwy), colour= "black")
     

    2021-03-04

    How to filter multiple values on a column in R

    Problem

    We want to filter multiple values on a column in R. In our example, we want to subset the rows containing the string Tom or Lynn for the column name.

    Example

      days  name
    1   88  Lynn
    2   11   Tom
    3    2 Chris
    4    5  Lisa
    5   22  Kyla
    6    1   Tom
    7  222  Lynn
    8    2  Lynn
    
    df <-
      data.frame(
        days = c(88, 11, 2, 5, 22, 1, 222, 2),
        name = c("Lynn", "Tom", "Chris", "Lisa", "Kyla", "Tom", "Lynn", "Lynn")
      ) 
    

    Solution

  • Base package
  • df[df$name %in% c("Tom", "Lynn"), ] # or
    target <- c("Tom", "Lynn")
    df[df$name %in% target, ]
    
      days name
    1   88 Lynn
    2   11  Tom
    6    1  Tom
    7  222 Lynn
    8    2 Lynn
    
  • dplyr
  • library(dplyr)
    filter(df, name %in% c("Tom", "Lynn")) # or
    target <- c("Tom", "Lynn")
    filter(df, name %in% target)
    
  • data.table
  • library(data.table)
    DT <- data.table(df)
    DT[name %in% target]
    
  • sqldf
  • library(sqldf)
    # Two alternatives:
    sqldf('SELECT *
          FROM df 
          WHERE name = "Tom" OR name = "Lynn"')
    sqldf('SELECT *
          FROM df 
          WHERE name IN ("Tom", "Lynn")')
    

    Related posts

    References

    2021-03-01

    How to draw a stratified sample in R

    Problem

    We want to draw a stratified sample from a data frame in R.

    Solution

    Let's look at two examples, with one or several groups.

    One group

    We extract 3 records from each of the species: setosa, versicolor y virginica.

  • Base package
  • set.seed(1)
    iris1 <- lapply(split(iris, iris$Species), function(x) x[sample(nrow(x), 3), ])
    do.call("rbind", iris1) 
    
                  Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
    setosa.14              4.3         3.0          1.1         0.1     setosa
    setosa.19              5.7         3.8          1.7         0.3     setosa
    setosa.28              5.2         3.5          1.5         0.2     setosa
    versicolor.96          5.7         3.0          4.2         1.2 versicolor
    versicolor.60          5.2         2.7          3.9         1.4 versicolor
    versicolor.94          5.0         2.3          3.3         1.0 versicolor
    virginica.148          6.5         3.0          5.2         2.0  virginica
    virginica.133          6.4         2.8          5.6         2.2  virginica
    virginica.131          7.4         2.8          6.1         1.9  virginica
    
  • dplyr
  • library(dplyr)
    set.seed(1)
    iris %>%
      group_by(Species) %>%
      sample_n(., 3)
    
     Source: local data frame [9 x 5]
    Groups: Species
    
      Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
    1          4.3         3.0          1.1         0.1     setosa
    2          5.7         3.8          1.7         0.3     setosa
    3          5.2         3.5          1.5         0.2     setosa
    4          5.7         3.0          4.2         1.2 versicolor
    5          5.2         2.7          3.9         1.4 versicolor
    6          5.0         2.3          3.3         1.0 versicolor
    7          6.5         3.0          5.2         2.0  virginica
    8          6.4         2.8          5.6         2.2  virginica
    9          7.4         2.8          6.1         1.9  virginica
    
    Two groups

    For each number of cylinders (4, 6 u 8) we will extract 2 records with automatic transmission (am = 0) and 2 with manual transmission (am = 1).

  • Base package
  • set.seed(1)
    mtcars1 <- lapply(split(mtcars, list(mtcars$cyl, mtcars$am)), function(x) x[sample(nrow(x), 2), ])
    do.call("rbind", mtcars1) 
    
                           mpg cyl  disp  hp drat    wt  qsec vs am gear carb
    0.4.Merc 240D         24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
    0.4.Toyota Corona     21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
    1.4.Fiat X1-9         27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
    1.4.Lotus Europa      30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
    0.6.Hornet 4 Drive    21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
    0.6.Merc 280          19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
    1.6.Ferrari Dino      19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
    1.6.Mazda RX4 Wag     21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
    0.8.Chrysler Imperial 14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
    0.8.Hornet Sportabout 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
    1.8.Ford Pantera L    15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
    1.8.Maserati Bora     15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
    
  • dplyr
  • set.seed(1)
    mtcars %>%
      group_by(cyl, am) %>%
      sample_n(., 2)
    
    Source: local data frame [12 x 11]
    Groups: cyl, am
    
        mpg cyl  disp  hp drat    wt  qsec vs am gear carb
    1  24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
    2  21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
    3  27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
    4  30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
    5  21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
    6  19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
    7  19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
    8  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
    9  14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
    10 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
    11 15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
    12 15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
    

    Related posts

    References

    2021-02-09

    How to set tick labels to the edges of a continuous ggplot2 legend

    Title

    Problem

    We want to show the minimum and maximum values in the legend of our plot in ggplot2.

    library(ggplot2)
    d <- subset(diamonds, price >= 257 & price <= 8888)
    ggplot(d, aes(depth, carat, colour = price)) +
      geom_point() +
      scale_colour_gradient(limits = c(257, 8888))
    

    The legend shows values from 2.000 al 8.000 but we do not know the minimum and the maximum values.

    Solution

    Within scale_colour_gradient, we pass the arguments breaks and labels.

    ggplot(d, aes(depth, carat, colour = price)) +
      geom_point() +
      scale_colour_gradient(limits = c(257, 8888), 
                            breaks = c(257, 2000, 4000, 6000, 8000, 8888),
                            labels = c(257, 2000, 4000, 6000, 8000, 8888))
    

    Now we have added to the legend the minimum 257 and the maximum 8.888 values.

    Notes

    Instead of using the default colour scale colours from light blue to dark blue, we can adjust colour choices with the arguments low and high.

    ggplot(d, aes(depth, carat, colour = price)) +
      geom_point() +
      scale_colour_gradient(limits = c(257, 8888), 
                            breaks=c(257, 2000, 4000, 6000, 8000, 8888),
                            labels=c(257, 2000, 4000, 6000, 8000, 8888), low = "red")
    

    References

    2021-01-06

    How to insert a picture in an Excel comment

    Title

    Problem

    We want to insert a picture in an Excel comment

    Solution

    1. Select cell, right click, and Insert Comment.
    2. Right click on the border of the comment and clic on Format Comment.
    3. On the Colors and Lines tab, click on the drop-down arrow for Color and select Fill Effects.
    4. On the picture tab, click on Select Picture.
    5. Select the picture, insert, and click OK.
    6. Resize the picture by draggin the border of the comment.
    7. Hide if you only want to show the picture when you hover over the cell.

    Related Posts

    Nube de datos