Syllabus
A PDF of the syllabus is available here.
Personnel and Logistics
Meeting Times
In-person Lectures: Tu and Th 11:30am–12:45pm at E-Lab 325
Credit Hours: 3
Office Hours: Tu/We 2:20pm–3:20pm at Marston 214D (or Zoom)
Instructor
Name: Jimi Oke
Email: jimi@umass.edu
Office: 214D Marston Hall
Please allow up to 48 hours for a response to your email. Be sure to put “CEE616” in the subject to ensure a prompt response.
Course Information
Description
This course covers core concepts in machine learning (models and algorithms) from a probabilistic perspective. Key topics include:
- Linear methods for regression and classification (including flexible functional forms)
- Deep neural networks for structured data, sequences and images
- Nonparametric methods: kernels, support vector machines, decision trees
- Unsupervised learning (dimensionality reduction, clustering)
Applications to various subdisciplines will be highlighted, especially in transportation, environmental, structural and industrial engineering. Hands-on programming in Python (R will also be supported) throughout the course will enable students to analyze real-world datasets and train models on them. Through this course, students will understand the potential of machine learning in civil, environmental and industrial engineering, among other disciplines, and will learn to build and train models from data to solve challenging problems.
Objectives
- Understand the theory behind fundamental ML models and algorithms and apply them to engineering problems
- Develop and train ML models for various problems in engineering and beyond
- Learn to use Python or a similar programming language (e.g., R) to implement and run ML models
Texts
The primary texts for this course are:
- Murphy, K. (2022). Probabilistic Machine Learning: An Introduction. MIT Press. (This text is freely available at https://probml.github.io/pml-book/book1.html. Abbreviated as PMLI in lecture slides and handouts.)
- Goulet, J.-A. (2020). Probabilistic Machine Learning for Civil Engineers. MIT Press. (This text is freely available at http://profs.polymtl.ca/jagoulet/Site/Goulet_web_page_BOOK.html. Abbreviated as PMLCE in lecture slides and handouts.)
- Hastie, T., Tibshirani, R., & Friedman, J. (2017). The Elements of Statistical Learning: Data Mining, Inference and Prediction. Second Edition. Springer, New York, NY. (This text is freely available at https://web.stanford.edu/~hastie/ElemStatLearn/. Abbreviated as ESL in lecture slides and handouts.)
Supplementary text:
- Goodfellow, I., Bengio, Y. & Courville, A. (2016). Deep Learning, MIT Press. (This text is freely available at https://www.deeplearningbook.org/. Abbreviated as DL in lecture slides and handouts.)
Any other recommended or required reading will be provided on Moodle.
Prerequisites
College-level knowledge of probability, statistics, linear algebra and calculus. Some programming experience in any language is helpful, but you should be ready to get up to speed with any necessary technical skills. Familiarity with Python/R is encouraged.
Policies and Values
I will use slides in the classroom, and annotate them electronically when possible. These slides will be available to you prior to the lecture. I will endeavor to foster an equitable and inclusive learning environment that will spark your curiosity and challenge you to learn actively. I strongly urge you to come to class prepared, having done the reading, ready to reflect on your homework or problem set and to engage with new material. I will ask frequent questions of you, and will also expect you to ask as many questions as possible.
Assessments and Grading
There will be no grading on a curve. Consistent with this, students who remain in this class after the drop date are not in jeopardy of seeing their grades change due to changes in class composition.
| Assessment | Value (%) |
|---|---|
| Problem Sets (5) | 50 |
| Midterm Exam 1 | 15 |
| Midterm Exam 2 | 15 |
| Project | 20 |
| TOTAL | 100 |
Final letter grades will be based on the following scale:
| Grade | Range (%) |
|---|---|
| A | 93-100 |
| A- | 90-92 |
| B+ | 87-89 |
| B | 83-86 |
| B- | 80-82 |
| C+ | 77-79 |
| C | 73-76* |
| C- | 70-72* |
| D | 60-69* |
| F | ≤59 |
Note: Graduate students cannot earn grades of C-, D+ or D, so scores below 73% are failing grades for graduate students.
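As a worked illustration of how the assessment weights and letter scale above combine, here is a short Python sketch. The weights and cutoffs are taken from the tables in this syllabus; the function name and the sample scores are purely hypothetical.

```python
# Assessment weights from the syllabus (fractions of the total grade).
WEIGHTS = {"problem_sets": 0.50, "midterm_1": 0.15, "midterm_2": 0.15, "project": 0.20}

# Letter-grade cutoffs (lower bound of each range), checked from highest down.
SCALE = [(93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
         (77, "C+"), (73, "C"), (70, "C-"), (60, "D"), (0, "F")]

def final_grade(scores: dict) -> tuple:
    """Return the weighted total (0-100) and the corresponding letter grade."""
    total = sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)
    letter = next(grade for cutoff, grade in SCALE if total >= cutoff)
    return round(total, 1), letter

# Hypothetical student: problem-set average 88, midterms 90 and 85, project 95.
print(final_grade({"problem_sets": 88, "midterm_1": 90, "midterm_2": 85, "project": 95}))
```

In this hypothetical case the weighted total is 89.25, which falls in the B+ range.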
Problem Sets
Five problem sets will be assigned. Submission will be online (PDFs and other supporting code, or Jupyter notebooks) via Moodle. Each will be worth 10% of your total grade. Late problem sets will incur an automatic 25% penalty and will not be accepted more than 4 days past the due date (unless prior permission has been granted).
Midterms
There will be two take-home midterms, both open-resource. Previous exams may be made available for practice.
Programming
Some lectures will incorporate engineering applications of machine learning concepts using Python. Problem sets will also involve some coding in Python. I recommend installing JupyterLab. You are welcome to use other languages/platforms such as R/RStudio or Matlab for your assignments. However, I cannot guarantee the same level of support for Matlab in particular.
Computing Resource
Having a laptop is not a requirement for this course. However, if you own one and are able to bring it to the classroom, it may improve your learning experience during the programming segments of the lecture.
Project
The term project will be worth 20% of your total grade. You are encouraged to start thinking about the concepts and methods you would like to investigate further in a real-world setting. I will ask you to submit a project proposal (individually or with a partner or two of your choice) that applies two of the modeling approaches covered in class to a relevant problem. This may be related to your own research as well. Further guidance will be provided midway through the semester. The final exam time will be devoted to in-class presentations of each project.
Attendance and Participation
You are expected to show up to every class (either virtually or in-person), in the absence of any emergencies or illness (please email me ahead of time if any situations arise).
Academic Honesty Policy Statement
Since the integrity of the academic enterprise of any institution of higher education requires honesty in scholarship and research, academic honesty is required of all students at the University of Massachusetts Amherst. Academic dishonesty including but not limited to cheating, fabrication, plagiarism, and facilitating dishonesty, is prohibited in all programs of the University. Appropriate sanctions may be imposed on any student who has committed an act of academic dishonesty. For more information about what constitutes academic dishonesty, please see the Dean of Students’ website: https://www.umass.edu/honesty/
Disability Statement
The University of Massachusetts Amherst is committed to making reasonable, effective and appropriate accommodations to meet the needs of students with disabilities and help create a barrier-free campus. If you are in need of accommodation for a documented disability, register with Disability Services to have an accommodation letter sent to your faculty. For more information, consult the Disability Services website at http://www.umass.edu/disability/.
Title IX Statement
In accordance with Title IX of the Education Amendments of 1972 that prohibits gender-based discrimination in educational settings that receive federal funds, the University of Massachusetts Amherst is committed to providing a safe learning environment for all students, free from all forms of discrimination, including sexual assault, sexual harassment, domestic violence, dating violence, stalking, and retaliation. A summary of the available Title IX resources (confidential and non-confidential) can be found at: https://www.umass.edu/titleix/resources. If you need immediate support, you are not alone. Free and confidential support is available 24 hours a day/7 days a week/365 days a year at the SASA Hotline 413-545-0800.
Schedule
This course is broadly organized around 5 modules. The schedule may be adapted over the duration of the semester to suit the needs of the class. Readings will be provided in lecture notes and on Moodle.
Module 1: Foundations
| Day | Date | Topic | Assignments |
|---|---|---|---|
| Tu | Sep 2 | Introduction | |
| Th | Sep 4 | Probability | |
| Tu | Sep 9 | Statistics | PS1 assigned |
| Th | Sep 11 | Decision theory; Information theory | |
| Tu | Sep 16 | Linear Algebra | |
| Th | Sep 18 | Optimization | |
Module 2: Linear Methods
| Day | Date | Topic | Assignments |
|---|---|---|---|
| Tu | Sep 23 | Linear discriminant analysis | PS1 due; PS2 assigned |
| Th | Sep 25 | Logistic regression | |
| Tu | Sep 30 | Linear regression (OLS, WLS) | |
| Th | Oct 2 | Ridge and Lasso regression | |
| Tu | Oct 7 | Splines and generalized additive models (GAMs) | |
| Th | Oct 9 | Generalized linear models (GLMs) | |
| Tu | Oct 14 | Exam I (take-home; no class) | PS2 due |
Module 3: Deep Neural Networks (DNNs)
| Day | Date | Topic | Assignments |
|---|---|---|---|
| Th | Oct 16 | NNs for structured data I (MLP, backpropagation) | PS3 assigned |
| Tu | Oct 21 | NNs for structured data II (training, regularization) | |
| Th | Oct 23 | NNs for images (CNNs) | |
| Tu | Oct 28 | NNs for sequences (RNNs) | |
Module 4: Nonparametric Methods
| Day | Date | Topic | Assignments |
|---|---|---|---|
| Th | Oct 30 | Exemplar-based methods (KNN, KDE, LOESS) | PS3 due; PS4 assigned |
| Th | Nov 6 | Gaussian processes | Project proposal assigned |
| Tu | Nov 18 | Support vector machines | |
| Th | Nov 20 | Trees and ensemble methods | |
| Tu | Nov 25 | Exam II (take-home; no class) | PS4 due |
Module 5: Unsupervised Learning
| Day | Date | Topic | Assignments |
|---|---|---|---|
| Tu | Dec 2 | Principal components analysis & Factor analysis | Proposal due; PS5 assigned |
| Th | Dec 4 | Clustering (HAC, KMeans, MM) | |
| Tu | Dec 9 | Autoencoders (AEs, VAEs) | PS5 due (Th) |
| TBD | TBD | Project Presentations | |