Stat 710: Statistical Computing

Logistics

Meetings: Tuesdays and Thursdays, 1-2pm, BH 317

Website: http://jfukuyama.github.io/teaching/stat710

Instructor: Prof. Julia Fukuyama (jfukuyam at iu dot edu)
Office hours: Thursdays 2-4pm
Office: Informatics East, Room 201

Associate Instructor: Mr. John Koo (johnkoo at iu dot edu)
Office hours: Mondays 10:30-11:30am and Wednesdays 1-2pm
Office: Informatics East, Room 103

Course Overview

As a statistician, you will need to manipulate data, optimize, and simulate. You will also need to know enough about how the methods you use work to diagnose problems when they arise and to be able to implement modified versions when the standard implementations don’t suit your purposes.

You also need to write accurate, clean, maintainable, demonstrably correct code. To that end, the first half of the class will be devoted to how to program well, with statistical tasks giving us the computational problems. The class will be primarily in R, with one homework in Python and the option of another homework in Python for those who would like more experience with the language.

Once we have the software engineering down, we will move on to the algorithms used in applied statistics. These can be roughly broken up into optimization methods and stochastic simulation methods. For optimization, we will cover gradient descent, stochastic gradient descent, the EM algorithm, and topics in convex optimization. Stochastic algorithms will include rejection sampling, Metropolis-Hastings, and Gibbs sampling.

Textbooks

The primary textbook for the course will be The Art of R Programming, by Norman Matloff.

The R Cookbook, by Paul Teetor, will also be useful.

Additional readings will be posted on the course website.

Assessment

Assessment will be based on a combination of homework, an in-class midterm, and an in-class final on the scheduled final exam date. Final grades will be based on:

There will be 10 homeworks over the course of the semester, generally graded out of 5 points, with 1 point for a good-faith effort at all the problems, 5 points for correct answers with clean code, and an intermediate number of points otherwise.

Homeworks will be assigned on Sundays and due on Tuesday of the following week (9 days later). At the time the homework is assigned, we will generally not have covered all the material needed to complete it, but we will have covered everything by the Thursday before the due date. The idea is to give you the homework early enough that you can think about it while the material is being covered in lecture. Therefore, it will generally be a good idea to take a look at the homework when it is assigned even if you aren’t able to complete all the problems yet.