Lecture 14: Code profiling and performance enhancement

Reading: Matloff, Chapter 14

Hadley Wickham, Advanced R

Logistics:

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

Sir Tony Hoare, popularized by Donald Knuth

Counterpoint

Setup

You have:

You want:

Two steps:

What to do?

Finding what the problem is

Base R function for this task is Rprof.

Many other packages, e.g. profvis (I believe what RStudio uses now) are based on Rprof, usually with more nicely formatted output.

How does Rprof work?

For example

library(profvis)
f <- function() {
  pause(0.1)
  g()
  h()
}
g <- function() {
  pause(0.1)
  h()
}
h <- function() {
  pause(0.1)
}
source("profiling-example.R")
tmp <- tempfile()
Rprof(tmp, interval = 0.02)
f()
Rprof(NULL)
summaryRprof(tmp)

If we don’t sample frequently enough, we don’t get very useful output:

tmp <- tempfile()
Rprof(tmp, interval = 0.2)
f()
Rprof(NULL)
summaryRprof(tmp)

Using profvis:

profvis(f())

Checking whether code is faster

Several packages for testing alternative implementations

Strategies for writing faster code

Remove extra work

If you know something about the input to a function, you can speed up your code by skipping some of the steps R takes in checking what kind of data is used.

Here we’re bypassing the method dispatch for mean.

library(microbenchmark)
x <- runif(1e2)

microbenchmark(
  mean(x),
  mean.default(x), times = 1000
)
## Unit: microseconds
##             expr   min    lq     mean median     uq    max neval
##          mean(x) 2.048 2.152 2.310585 2.2135 2.2855 19.078  1000
##  mean.default(x) 1.091 1.154 1.203793 1.1820 1.2290  2.989  1000

The as.data.frame function performs many checks and is not optimized for performance.

Suppose that we have a list and want to change it to a data frame.

If we’re sure it’s formatted correctly, we can use the quickdf function.

quickdf <- function(l) {
  class(l) <- "data.frame"
  attr(l, "row.names") <- 1:length(l[[1]])
  l
}

l <- lapply(1:26, function(i) runif(1e3))
names(l) <- letters
microbenchmark(
  quick_df      = quickdf(l),
  as.data.frame = as.data.frame(l)
)
## Unit: microseconds
##           expr     min       lq      mean    median       uq       max neval
##       quick_df  10.597   11.918  164.2065   13.7245   17.432 14912.223   100
##  as.data.frame 997.955 1014.611 1102.2581 1028.5585 1069.960  4473.228   100

But we need to be sure the input is formatted correctly, because it does no checks:

quickdf(list(x = 1, y = 1:2))
## Warning in format.data.frame(if (omit) x[seq_len(n0), , drop = FALSE] else x, :
## corrupt data frame: columns will be truncated or padded with NAs
##   x y
## 1 1 1

Avoid copies

random_string <- function() {
  paste(sample(letters, 50, replace = TRUE), collapse = "")
}
strings10 <- replicate(10, random_string())
strings100 <- replicate(100, random_string())

collapse <- function(xs) {
  out = ""
  for (x in xs) {
    out <- paste0(out, x)
  }
  out
}

microbenchmark(
  loop10  = collapse(strings10),
  loop100 = collapse(strings100),
  vec10   = paste(strings10, collapse = ""),
  vec100  = paste(strings100, collapse = "")
)
## Unit: microseconds
##     expr     min      lq      mean   median       uq      max neval
##   loop10  21.934  22.348  24.02720  22.6275  23.3945   58.403   100
##  loop100 582.833 584.588 636.88791 586.5090 606.5575 3034.728   100
##    vec10   4.275   4.508   5.17937   4.7045   5.0455   19.173   100
##   vec100  28.946  29.338  31.04682  29.8150  30.4745   74.487   100

Vectorise

x <- matrix(rnorm(15), nrow = 3, ncol = 5)
row_means_loop <- function(x) {
    row_means <- numeric(nrow(x))
    for(i in 1:nrow(x)) {
        row_sum <- 0
        for(j in 1:ncol(x)) {
            row_sum <- row_sum + x[i,j]
        }
        row_means[i] <- row_sum / ncol(x)
    }
    return(row_means)
}
microbenchmark(
    no_vectorization = row_means_loop(x),
    apply = apply(x, 1, mean),
    c_version = rowMeans(x),
    times = 1000)
## Unit: microseconds
##              expr    min     lq      mean  median      uq      max neval
##  no_vectorization  5.199  5.607 16.300027  6.0025  6.4065 7169.270  1000
##             apply 22.297 23.259 28.150100 23.8715 25.7180  115.169  1000
##         c_version  2.889  3.333  4.273167  3.7890  4.1200   26.165  1000

Another example: a problem in probability.

First attempt:

# perform nreps repetitions of the marble experiment, to estimate
# P(pick blue from Urn 2)
sim1 <- function(nreps)  {
    nb1 <- 10 #10 blue marbles in Urn1
    n1 <- 18 # number of marbles in Urn 1 at 1st pick
    n2 <- 13 # number of marbles in Urn 2 at 2nd pick
    count <- 0 # number of repetitions in which get blue from Urn 2
    for (i in 1:nreps) {
        nb2 <- 6 # 6 blue marbles orig. in Urn2
        ## pick from Urn 1 and put in Urn 2; is it blue?
        if (runif(1) < nb1/n1) nb2 <- nb2 + 1
        ## pick from Urn 2; is it blue?
        if (runif(1) < nb2/n2) count <- count + 1
    }
    return(count/nreps) # est. P(pick blue from Urn 2)
}

Let’s try pre-generate all of our random numbers and remove the for loop:

sim2 <- function(nreps) {
    nb1 <- 10
    nb2 <- 6
    n1 <- 18
    n2 <- 13
    ## pre-generate all our random numbers, one row per repetition
    u <- matrix(c(runif(2 * nreps)), nrow = nreps, ncol = 2)
    ## define simfun for use in apply(); simulates one repetition
    simfun <- function(rw) {
        ## rw ("row") is a pair of random numbers
        ## choose from Urn 1
        if (rw[1] < nb1 / n1) nb2 <- nb2 + 1
        ## choose from Urn 2, and return boolean on choosing blue
        return (rw[2] < nb2 / n2)
    }
    z <- apply(u, 1, simfun)
    return(mean(z))
}
microbenchmark(sim1(100), sim2(100))
## Unit: microseconds
##       expr     min       lq     mean  median      uq       max neval
##  sim1(100) 260.834 425.9855 579.7945 437.471 454.409 10682.013   100
##  sim2(100) 154.577 159.9060 245.1615 164.410 177.711  7149.775   100

The timings here are different from in Matloff. I believe this is because more recent version of R byte compile functions that you write, and this was not true for the version of R used in the book.

Take 3:

sim3 <- function(nreps) {
    nb1 <- 10
    nb2 <- 6
    n1 <- 18
    n2 <- 13
    u <- matrix(c(runif(2 * nreps)), nrow = nreps, ncol = 2)
    ## set up the condition vector
    cndtn <-
        u[,1] <= nb1 / n1 & u[,2] <= (nb2+1) / n2 |
        u[,1] > nb1 / n1 & u[,2] <= nb2 / n2
    return(mean(cndtn))
}
microbenchmark(sim1(100), sim2(100), sim3(100))
## Unit: microseconds
##       expr     min      lq     mean   median       uq      max neval
##  sim1(100) 257.505 281.790 379.3203 426.5785 453.1620  662.683   100
##  sim2(100) 152.142 158.667 176.3794 167.3800 187.6515  301.012   100
##  sim3(100)  19.917  24.228 109.0899  26.2305  29.4800 8148.984   100

Overall strategies