Prof. James Bagrow

Email: james.bagrow [at]
Lectures: Tu/Th 18:00–19:15 in Farrell Hall Decision Theater
Office Hours:  Mo 10:00–11:30, We 09:30–11:00, or by appointment
Office: Farrell Hall room 212 (Map to my office)
Course syllabus

Extracting meaning from data remains one of the biggest tasks of science. The Internet and modern computers have given us vast amounts of data, so it is more important than ever to understand how to collect, process, and analyze these data while keeping the work reproducible by tracking data provenance, the "chain of custody" of the data.

In this course students will learn:

  1. scientific computing pipelines, software testing, “defensive” data analysis, and revision control,
  2. practical implementations of advanced statistical analyses,
  3. techniques for dealing with large-scale datasets, remote computing, and "big data"-ready pipelines,
  4. how to explore the literature of cutting-edge data analytics, and
  5. how to communicate data-driven results.

As with Data Science I, particular emphasis will be placed on nontraditional (non-numeric) data, such as networks and text corpora, and on developing good habits for rigorous and reproducible computational science.

Section expectations

The best way to learn is by doing. Lectures will be used for guidance, but students will directly develop their own computer programs and workflows. Students should expect an average of 6–8 hours of work outside of class per week, depending on skill level and experience entering the course. No textbook is required.

Course Prerequisites: STAT 287 Data Science I.


Grades will be based on homework assignments, in-class reading presentations, and a final research project and presentation.

Lecture and assignment materials

Access to course materials will be given through the class git repository. Please pull frequently.