Testing and debugging

Informal testing

Simple cases

Idea: make sure your function works on cases where you know what the answer should be.

You’re checking that the core behavior is correct.

As a running example, consider the following pair of functions, which are meant to find the smallest off-diagonal entry of a symmetric distance matrix:

## returns the minimum value of d[i,j], i != j, and
## the row/col attaining that minimum, for square
## symmetric matrix d; no special policy on ties;
## motivated by distance matrices
mind = function(d) {
    n = nrow(d)
    ## add a column to identify row number for apply()
    dd = cbind(d, 1:n)
    wmins = apply(dd[-n, ], 1, imin)
    ## wmins will be 2 x (n-1), 1st row being indices and 2nd being values
    i = which.min(wmins[1, ])
    j = wmins[2, i]
    return(c(d[i, j], i, j))
}

## finds the location, value of the minimum in a row x
imin = function(x) {
    n = length(x)
    i = x[n]
    j = which.min(x[(i + 1):(n - 1)])
    return(c(j, x[j]))
}

m = rbind(c(0, 12, 5), c(12, 0, 8), c(5, 8, 0))

Let’s write some tests for imin.

The comment says that it finds the location and value of the minimum in a row x, so let’s see if it does.

## the minimum of 1:5 is 1, and it sits at position 1
x = 1:5
index_and_value = imin(x)
## location of the minimum: should be 1, per the comment
index_and_value[1]
## [1] 3
## value of the minimum: should also be 1, per the comment
index_and_value[2]
## [1] 3

The comment was misleading: what the function is actually supposed to do is take a vector x whose last element indicates which row of the matrix the vector came from, and find the location and value of the minimum among the positions corresponding to the upper triangle of the original matrix.

This is very confusing, and it’s why there was a bug in the function to begin with.
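
To make the intended input concrete, here is what imin actually receives for row 1 of the matrix m defined above (this illustration is mine, not part of the original code):

## row 1 of m with the row number appended, as built by cbind(d, 1:n) inside mind()
d_augmented = cbind(m, 1:nrow(m))
d_augmented[1, ]
## [1]  0 12  5  1
## for row 1, the upper-triangle entries of m sit at positions 2 and 3,
## so imin should report position 3 (value 5)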

We might be tempted to change the function in the following way:

imin = function(x) {
    n = length(x)
    row = x[n]
    upper_tri_idx = (row + 1):(n - 1)
    idx_in_upper_tri = which.min(x[upper_tri_idx])
    idx = upper_tri_idx[idx_in_upper_tri]
    value = x[idx]
    return(c(idx, value))
}

And then we can test:

row = 1
x = c(5:1, row)
index_and_value = imin(x)
## index of the minimum should be 5
index_and_value[1] == 5
## [1] TRUE
## value of the minimum should be 1
index_and_value[2] == 1
## [1] TRUE

Edge cases

Idea: If the input to the function isn’t exactly what you expect, what happens?

Let’s try testing the imin function again with an edge case:

row = 5
x = c(5:1, row)
index_and_value = imin(x)
## there aren't any elements in the upper triangle in row 5, so these should be some sort of NA
index_and_value
## [1] 5 1

Based on our test, we might modify our function to look like this:

imin = function(x) {
    n = length(x)
    row = x[n]
    if(row >= n - 1) {
        upper_tri_idx = c()
    } else {
        upper_tri_idx = (row + 1):(n - 1)
    }
    idx_in_upper_tri = which.min(x[upper_tri_idx])
    idx = upper_tri_idx[idx_in_upper_tri]
    value = x[idx]
    return(c(idx, value))
}

And try testing again:

row = 5
x = c(5:1, row)
index_and_value = imin(x)
## there aren't any elements in the upper triangle in row 5, so these should be some sort of NA
index_and_value
## numeric(0)
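
If we want the “some sort of NA” behavior that the comment in the test asks for, one option (a sketch; other policies are possible) is to return NAs explicitly when the row has no upper-triangle entries:

imin = function(x) {
    n = length(x)
    row = x[n]
    ## the last row of the matrix has nothing above the diagonal
    if(row >= n - 1) {
        return(c(NA, NA))
    }
    upper_tri_idx = (row + 1):(n - 1)
    idx_in_upper_tri = which.min(x[upper_tri_idx])
    idx = upper_tri_idx[idx_in_upper_tri]
    value = x[idx]
    return(c(idx, value))
}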

The real moral of the story

This is a bad way to write a function that computes the minimum; we should throw it all out and start over.
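
For instance, one much simpler way to get the smallest off-diagonal entry of a symmetric matrix (a sketch of a possible rewrite, not code from the notes) is to blank out the diagonal and let which.min() do the work:

mind2 = function(d) {
    ## ignore the diagonal by setting it to Inf
    diag(d) = Inf
    ## arrayInd() converts the linear index from which.min() into (row, col)
    ij = arrayInd(which.min(d), dim(d))
    ## report the value and the row/col, ordered so that i < j
    return(c(d[ij], min(ij), max(ij)))
}

m = rbind(c(0, 12, 5), c(12, 0, 8), c(5, 8, 0))
mind2(m)
## [1] 5 1 3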

Test-based design

Example

I want to make a program that performs gradient descent for functions where I don’t have the derivative in closed form.

I’m going to need to write a function that computes the derivative numerically. In the spirit of test-based design, I first write down the tests it should pass:

## derivative of x^2, evaluated at x = 1
deriv(function(x) x^2, 1) == 2
## derivative of 2 * x, evaluated at x = -5
deriv(function(x) 2 * x, -5) == 2
## derivative of x^2, evaluated at x = 0
deriv(function(x) x^2, 0) == 0
## derivative of e^x, evaluated at x = 0
deriv(function(x) exp(x), 0) == exp(0)

Then I write the following function, based on advice from Wikipedia:

deriv = function(fn, x) {
    eps = .Machine$double.eps
    h = sqrt(eps) * x
    deriv = (fn(x + h) - fn(x - h)) / (2 * h)
    return(deriv)
}
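
For reference, this implements the central difference approximation \(f'(x) \approx \frac{f(x+h) - f(x-h)}{2h}\), with step size \(h = \sqrt{\epsilon}\, x\) (where \(\epsilon\) is machine epsilon), so that h is small relative to x but still large enough to avoid severe rounding error.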

And run through my tests:

## derivative of x^2, evaluated at x = 1
deriv(function(x) x^2, 1) == 2
## [1] TRUE
## derivative of 2 * x, evaluated at x = -5
deriv(function(x) 2 * x, -5) == 2
## [1] TRUE
## derivative of x^2, evaluated at x = 0
deriv(function(x) x^2, 0) == 0
## [1] NA
## derivative of 2 * x, evaluated at x = 0
deriv(function(x) 2 * x, 0) == 2
## [1] NA

The third and fourth tests failed, and not just because of precision. Why?
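
The answer: when \(x = 0\), the step size h = sqrt(eps) * x is also 0, so the difference quotient is 0/0 = NaN. We can check this directly (a quick illustration, not part of the original notes):

## with x = 0 the step size collapses to zero
h = sqrt(.Machine$double.eps) * 0
h
## [1] 0
## so the difference quotient is 0 / 0
(0 - 0) / (2 * h)
## [1] NaN
## and comparing NaN to anything gives NA
NaN == 0
## [1] NA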

Then we can modify the function so that it also handles \(x = 0\):

deriv = function(fn, x) {
    eps = .Machine$double.eps
    if(x == 0) {
        h = 2 * eps
    } else {
        h = sqrt(eps) * x
    }
    deriv = (fn(x + h) - fn(x - h)) / (2 * h)
    return(deriv)
}

Run through the tests again:

## derivative of x^2, evaluated at x = 1
deriv(function(x) x^2, 1) == 2
## [1] TRUE
## derivative of 2 * x, evaluated at x = -5
deriv(function(x) 2 * x, -5) == 2
## [1] TRUE
## derivative of x^2, evaluated at x = 0
deriv(function(x) x^2, 0) == 0
## [1] TRUE
## derivative of 2 * x, evaluated at x = 0
deriv(function(x) 2 * x, 0) == 2
## [1] TRUE
## derivative of e^x, evaluated at x = 0
deriv(function(x) exp(x), 0) == exp(0)
## [1] TRUE

More formal ways of integrating tests

Suppose we have testthat_example.R and numerical_deriv.R, with contents that look like this:

testthat_example.R:

context("Check numerical derivative")
source("numerical_deriv.R")

test_that("derivatives match on simple functions", {
    expect_equal(deriv(function(x) x^2, 1), 2)
    expect_equal(deriv(function(x) 2 * x, -5), 2)
    expect_equal(deriv(function(x) x^2, 0), 0)
    expect_equal(deriv(function(x) 2 * x, 0), 2)
    expect_equal(deriv(function(x) exp(x), 0), exp(0))
})

test_that("error thrown when derivative doesn't exist", {
    expect_error(deriv(function(x) log(x), 0))
}) 

numerical_deriv.R:

deriv = function(fn, x) {
    eps = .Machine$double.eps
    if(x == 0) {
        h = 2 * eps
    } else {
        h = sqrt(eps) * x
    }
    deriv = (fn(x + h) - fn(x - h)) / (2 * h)
    return(deriv)
}

With these two files in the working directory, we can run all of the tests at once:

library(testthat)
test_dir(".")
## ✔ |  OK F W S | Context
## ✖ |   5 1 1   | Check numerical derivative
## ───────────────────────────────────────────────────────────────────────────
## testthat_example.R:13: warning: error thrown when derivative doesn't exist
## NaNs produced
## 
## testthat_example.R:13: failure: error thrown when derivative doesn't exist
## `deriv(function(x) log(x), 0)` did not throw an error.
## ───────────────────────────────────────────────────────────────────────────
## 
## ══ Results ════════════════════════════════════════════════════════════════
## OK:       5
## Failed:   1
## Warnings: 1
## Skipped:  0
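
The failing test is telling us something real: deriv(function(x) log(x), 0) ends up evaluating log at a negative number, which produces NaN with a warning instead of an error. One way to make the failure explicit (a sketch of a possible fix; the specific check is my choice, not from the notes) is to stop when the result is not finite:

deriv = function(fn, x) {
    eps = .Machine$double.eps
    if(x == 0) {
        h = 2 * eps
    } else {
        h = sqrt(eps) * x
    }
    deriv = (fn(x + h) - fn(x - h)) / (2 * h)
    ## signal a problem instead of silently returning NaN or Inf
    if(!is.finite(deriv)) {
        stop("numerical derivative is not finite at x = ", x)
    }
    return(deriv)
}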

Idea behind testthat

Expectations

An expectation is the finest unit of testing: it checks whether a single call to a function does what you expect.

Some of the most useful expectations

a_int = 1:2
a_double = as.double(a_int)
a_named = a_int
names(a_named) = letters[1:2]
a_int
## [1] 1 2
a_double
## [1] 1 2
a_named
## a b 
## 1 2
## expect_identical requires the objects to match exactly, including type
expect_identical(a_int, a_double)
## Error: `a_int` not identical to `a_double`.
## Objects equal but not identical
## expect_equal only requires the values to be equal, so integer vs. double is fine
expect_equal(a_int, a_double)
## but expect_equal still compares attributes, such as names
expect_equal(a_int, a_named)
## Error: `a_int` not equal to `a_named`.
## names for current but not for target
## expect_equivalent ignores attributes and compares only the values
expect_equivalent(a_double, a_named)
a = list(1:10, letters)
str(a)
## List of 2
##  $ : int [1:10] 1 2 3 4 5 6 7 8 9 10
##  $ : chr [1:26] "a" "b" "c" "d" ...
## expect_output matches a pattern against printed output;
## fixed = TRUE treats the pattern as a literal string rather than a regular expression
expect_output(str(a), "List of 2")
expect_output(str(a), "int [1:10]", fixed = TRUE)

Tests

Advice on writing tests