Today:
Informal testing
Test-based design
Formal testing
Reading
We’ve done informal testing when debugging, and you’ve probably done it on your own.
Check whether the output is what we expect, either by inspection or using ==, identical, or something similar.
Idea: make sure your function works on cases where you know what the answer should be.
You’re checking that the core behavior is correct.
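For instance (a small illustration, not from the source), comparing floating-point results with == can be misleading, which is worth keeping in mind when choosing how to check output:

x <- sqrt(2)^2
x == 2                   ## FALSE: floating-point roundoff
identical(x, 2)          ## FALSE: same reason
isTRUE(all.equal(x, 2))  ## TRUE: equal within numerical tolerance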
## returns the minimum value of d[i,j], i != j, and
## the row/col attaining that minimum, for square
## symmetric matrix d; no special policy on ties;
## motivated by distance matrices
mind <- function(d) {
  n <- nrow(d)
  ## add a column to identify row number for apply()
  dd <- cbind(d, 1:n)
  wmins <- apply(dd[-n, ], 1, imin)
  ## wmins will be 2xn, 1st row being indices and 2nd being values
  i <- which.min(wmins[1, ])
  j <- wmins[2, i]
  return(c(d[i, j], i, j))
}
## finds the location, value of the minimum in a row x
imin <- function(x) {
  n <- length(x)
  i <- x[n]
  j <- which.min(x[(i + 1):(n - 1)])
  return(c(j, x[j]))
}
m <- rbind(c(0, 12, 5), c(12, 0, 8), c(5, 8, 0))
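For this matrix we know the answer in advance: the smallest off-diagonal entry is 5, attained at row 1, column 3. So an informal test (a sketch; the actual check isn’t shown in the source) is to compare against the known answer:

## should print TRUE TRUE TRUE
mind(m) == c(5, 1, 3)

With the buggy version above, this call doesn’t even return FALSE: it stops with an error, because mind reads the two rows of wmins in the wrong order and ends up using a distance value as a column index. That is exactly the kind of problem informal testing is meant to surface.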
The comment was misleading: what imin is actually supposed to do is take a vector x whose last element indicates the row the vector was taken from, and find the location and value of the minimum among the positions corresponding to the upper triangle of the original matrix.
This is very confusing, and it’s why there was a bug in the function to begin with.
We might be tempted to change the function in the following way:
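The modified code isn’t reproduced in the source; a plausible sketch of the fix repairs the index bookkeeping in imin (which.min indexes into the subvector, so the result must be shifted by i) and uses the two rows of wmins consistently in mind:

imin <- function(x) {
  n <- length(x)
  ## last element is the row of the original matrix x came from
  i <- x[n]
  ## shift which.min's subvector position by i to get the true column
  j <- i + which.min(x[(i + 1):(n - 1)])
  return(c(j, x[j]))
}

mind <- function(d) {
  n <- nrow(d)
  dd <- cbind(d, 1:n)
  wmins <- apply(dd[-n, ], 1, imin)
  ## row 1 of wmins holds column indices, row 2 holds values:
  ## minimize over the values, then read off the column index
  i <- which.min(wmins[2, ])
  j <- wmins[1, i]
  return(c(d[i, j], i, j))
}

With these versions, mind(m) == c(5, 1, 3) prints TRUE TRUE TRUE.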
Idea: If the input to the function isn’t exactly what you expect, what happens?
You’re checking that if something funny happens (bad input from a user or another function), your function will (best case) still work correctly or (at minimum) fail informatively.
It is very important to make sure that the function doesn’t fail silently, i.e., that it doesn’t look like it’s producing good results when they’re actually wrong.
This matters most when you are not the one directly providing the input to the function.
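For example (hypothetical calls, not from the source), we might feed mind inputs it was never designed to handle:

mind(m[, 1:2])   ## not square
mind(matrix(0))  ## 1 x 1: there are no off-diagonal entries
mind("hello")    ## not a matrix at all

Ideally each of these would produce a clear, immediate error rather than a cryptic one or, worse, a plausible-looking wrong answer.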
Based on our test, we might modify our function to look like this:
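The modified version isn’t shown in the source; a sketch of the kind of change meant here adds explicit input checks to mind (building on the corrected imin above) so that bad input fails loudly and informatively:

mind <- function(d) {
  ## fail informatively on input we were never designed to handle
  if (!is.matrix(d) || !is.numeric(d))
    stop("d must be a numeric matrix")
  if (nrow(d) != ncol(d))
    stop("d must be square")
  if (!isSymmetric(d))
    stop("d must be symmetric")
  if (nrow(d) < 2)
    stop("d must be at least 2 x 2 to have off-diagonal entries")
  n <- nrow(d)
  dd <- cbind(d, 1:n)
  wmins <- apply(dd[-n, ], 1, imin)
  i <- which.min(wmins[2, ])
  j <- wmins[1, i]
  return(c(d[i, j], i, j))
}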
Still, this is a bad way to write functions to compute the minimum; we should throw it all out and start over.
Decide what you want your function(s) to do
Write tests for those behaviors
Write the functions, check whether they pass the tests
If they pass, you’re done! Otherwise, cycle through changing the functions and testing.
I want to make a program that performs gradient descent for functions where I don’t have the derivative in closed form.
I need to:
Decide on the form the function should take.
Write tests for the function.
Write the function.
Run the tests.
The form the function should take:
The first argument is the function I want the derivative of.
The second argument is the value at which to compute the derivative.
Then I write tests:
## derivative of x^2, evaluated at x = 1
deriv(function(x) x^2, 1) == 2
## derivative of 2 * x, evaluated at x = -5
deriv(function(x) 2 * x, -5) == 2
## derivative of x^2, evaluated at x = 0
deriv(function(x) x^2, 0) == 0
## derivative of e^x, evaluated at x = 0
deriv(function(x) exp(x), 0) == exp(0)
Then I write the following function, based on advice from Wikipedia:
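The function’s code isn’t reproduced in the source; a sketch consistent with the test results below is the symmetric difference quotient, with the step size h = sqrt(machine epsilon) * x that is commonly recommended for this method:

deriv <- function(fn, x) {
  ## symmetric difference quotient with step proportional to x
  h <- sqrt(.Machine$double.eps) * x
  (fn(x + h) - fn(x - h)) / (2 * h)
}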
And run through my tests:
## [1] TRUE
## [1] TRUE
## [1] NA
## [1] NA
The third and fourth tests failed, and not just because of precision. Why?
Then we can modify the function so that it can also evaluate derivatives at \(x = 0\):
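The modified code is also missing from the source. The problem with the sketch above is that h = sqrt(machine epsilon) * x is exactly 0 when x = 0, so the difference quotient evaluates to 0/0 = NaN, and NaN == 0 gives NA rather than FALSE. A minimal repair (again a sketch) falls back to a tiny fixed step at x = 0:

deriv <- function(fn, x) {
  h <- sqrt(.Machine$double.eps) * x
  ## the step collapses to 0 at x = 0; use a tiny fixed step instead
  if (h == 0) h <- .Machine$double.eps
  (fn(x + h) - fn(x - h)) / (2 * h)
}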
Run through the tests again (now five, including a new check of the derivative of \(2x\) at \(x = 0\)):
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
## [1] TRUE
R package called testthat
Aimed more at package developers
Allows for tests to be stored and run automatically.
Suppose we have testthat_example.R and numerical_deriv.R, with contents that look like this:

testthat_example.R:
context("Check numerical derivative")
source("numerical_deriv.R")
test_that("derivatives match on simple functions", {
expect_equal(deriv(function(x) x^2, 1), 2)
expect_equal(deriv(function(x) 2 * x, -5), 2)
expect_equal(deriv(function(x) x^2, 0), 0)
expect_equal(deriv(function(x) 2 * x, 0), 2)
expect_equal(deriv(function(x) exp(x), 0), exp(0))
})
test_that("error thrown when derivative doesn't exist", {
expect_error(deriv(function(x) log(x), 0))
})
numerical_deriv.R:
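The source doesn’t reproduce this file; presumably it contains the modified deriv from above:

deriv <- function(fn, x) {
  h <- sqrt(.Machine$double.eps) * x
  if (h == 0) h <- .Machine$double.eps
  (fn(x + h) - fn(x - h)) / (2 * h)
}

Running the tests with, for example, testthat::test_file("testthat_example.R") then produces output like the following: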
## ✔ | F W S  OK | Context
## ✖ | 1 1     5 | Check numerical derivative
## ───────────────────────────────────────────────────────────────────────────────────────────────────
## Warning (testthat_example.R:13:5): error thrown when derivative doesn't exist
## NaNs produced
## Backtrace:
## 1. testthat::expect_error(deriv(function(x) log(x), 0))
## at testthat_example.R:13:4
## 6. global deriv(function(x) log(x), 0)
## 7. fn(x - h)
## at numerical_deriv.R:8:4
##
## Failure (testthat_example.R:13:5): error thrown when derivative doesn't exist
## `deriv(function(x) log(x), 0)` did not throw an error.
## ───────────────────────────────────────────────────────────────────────────────────────────────────
##
## ══ Results ════════════════════════════════════════════════════════════════════════════════════════
## [ FAIL 1 | WARN 1 | SKIP 0 | PASS 5 ]
## Error: Test failures
Expectations: Finest unit of testing, checks one aspect of a function’s output.
Tests: Groups of related expectations.
Contexts/files: Each context or file can contain a group of tests. Primarily useful for having the test output formatted nicely. Perhaps also useful if you have some tests that require a lot of setup and you don’t want to run them every time.
An expectation is the finest unit of testing; it checks whether a single call to a function does what you expect.
All expectations start with expect_ and all take two arguments: the actual result, and what you expect.
testthat will throw an error if the two don’t match.
Some of the most useful expectations:

expect_equivalent/expect_equal/expect_identical: Check for equality within numerical precision or exact equivalence (expect_identical is built on the identical function, which also checks type).

## [1] 1 2
## [1] 1 2
## a b
## 1 2
## Error: `a_int` not identical to `a_double`.
## Objects equal but not identical
## Error: `a_int` not equal to `a_named`.
## names for current but not for target
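The code behind this output isn’t included in the source; a sketch that would produce output like it (the names a_int, a_double, and a_named are taken from the error messages):

library(testthat)
a_int <- c(1L, 2L)          ## integer vector
a_double <- c(1, 2)         ## double vector
a_named <- c(a = 1, b = 2)  ## named double vector
a_int
a_double
a_named
expect_equal(a_int, a_double)      ## passes: equal up to type
expect_identical(a_int, a_double)  ## error: integer vs. double
expect_equivalent(a_int, a_named)  ## passes: ignores attributes
expect_equal(a_int, a_named)       ## error: names differ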
expect_match: Checks whether a string matches a regular expression.

expect_output: Checks the printed output of a function the same way expect_match would.
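For example (a hypothetical call consistent with the output below):

library(testthat)
str(list(1:10, letters))
expect_output(str(list(1:10, letters)), "List of 2")  ## passes silently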
## List of 2
## $ : int [1:10] 1 2 3 4 5 6 7 8 9 10
## $ : chr [1:26] "a" "b" "c" "d" ...
expect_warning/expect_error: Check whether the function gives a warning or an error.

expect_is: Checks whether the function gives a result of the correct class (we’ll talk more about classes in a couple of weeks).

expect_true/expect_false: Catch-alls for cases the other expectations don’t cover.

For testthat, a test is just a group of expectations.
You can group them however you like, but usually you think of them as covering one unit of functionality. Often this means one test per function.
Group expectations into tests so that when a test fails, it’s easy to figure out what part of the code caused the error.
When you add a new function/functionality, add a new test.
Write a test when you discover a bug.
It is most important to test code that is delicate: code that has complicated dependencies on other functions, has many edge cases, or does something complicated that you’re not sure about. (In that case you might want to re-think your function design, though.)
If tests are grouped according to the desired behavior of a function, they are easier to update later if you want to change the behavior of the function.