Stat 470/670: Exploratory Data Analysis

Logistics

Meetings: Tuesdays and Thursdays, 1:15-2:30, GY 1050

Website: http://jfukuyama.github.io/teaching/stat670

Instructor: Prof. Julia Fukuyama, jfukuyam at iu dot edu
Office hours: Tuesdays 2:30-4:30, Myles Brand E218

Associate Instructor: Ms. Fatma Parlak, fparlak at iu dot edu
Office hours: Thursdays and Fridays 9-10, on Zoom (see Canvas for link)

Associate Instructor: Ms. Gandhali Marunmale, gamarunm at iu dot edu
Office hours: Wednesdays 10-11am, on Zoom (see Canvas for link)

Course Overview

Graphical and modeling techniques for exploring data, with an emphasis on visualization, interpretation, and clear communication of findings. Use of modern software tools for data manipulation and visualization. Connections to traditional statistical methods.

Textbooks

We will be drawing heavily on Cleveland’s Visualizing Data and Hadley Wickham’s ggplot2: Elegant Graphics for Data Analysis. Both of these are available online through the IU library.

R for Data Science by Wickham and Grolemund, available online, will also be useful.

Readings and notes for topics not covered in the textbooks will be posted to the course website and to Canvas.

Class Structure

Classes will be a combination of lecture and tool demonstration. It will generally be helpful to have an R session open so you can follow along with the code. Slides or notes, with R code, will be posted to the class website before each lecture.

Assessment

Grades will be assigned based on:

There will be no final exam; the last responsibility for the course will be the final project report, due on the last day of class.

All assignments will be graded on presentation as well as accuracy. This means there should be no extraneous material, plots should be readable, and text and figures should be formatted nicely.