# Course Syllabus

### Text and Resources

The primary course resources are Canvas and GitLab. There is no required textbook. Assignments and necessary resources will be posted on Canvas.

### Location

The class meets in person every Tuesday of the quarter at the MSDS Facility from 5:00 p.m. to 8:50 p.m. Some class meetings may be replaced by Zoom sessions. Any Zoom substitutions will be announced during class.

### Communication

The primary communication channel for this course is Canvas. Please use the discussion board to post questions, allowing everyone to benefit from the responses. This also enables other students and TAs to provide answers.  
Official UW messages will be sent via email.

### Workload

**This is tentative and subject to change.**

This is a hands-on course, so please bring your laptop to every class session.

Most assignments will involve working through data challenges. Submissions will be evaluated on timeliness, your attempt, and your concluding remarks. No late work will be accepted unless supported by official university accommodations.

Expect occasional in-class participation assignments. There will be no make-up for missed classes unless supported by official university accommodations.

A short final project will likely be assigned to tie everything together.

### Teamwork and Usage of Artificial Intelligence (AI)
You are encouraged to collaborate with classmates and/or use AI to help with data assignment challenges. However, the work you submit must be your own creation. Copying and pasting teammates' solutions or AI-generated content will be considered cheating and addressed accordingly.

### Grading

**This is tentative and subject to change.**

Grading will be weighted as follows:
- 60% Data Assignments
- 30% Final Project
- 10% Participation Assignments

There will be an opportunity for extra credit towards the end of the quarter. Details will be announced later.
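
As a rough illustration of how the weights above combine, here is a minimal Python sketch. The category names, the `weighted_grade` helper, and the example scores are all hypothetical and used only for illustration. Extra credit is not modeled because its details have not been announced.

```python
# Weights come from the grading list above; everything else is illustrative.
WEIGHTS = {
    "data_assignments": 0.60,
    "final_project": 0.30,
    "participation": 0.10,
}


def weighted_grade(scores: dict[str, float]) -> float:
    """Combine per-category scores (0-100) into one course percentage."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, not real student data.
example_scores = {
    "data_assignments": 90.0,
    "final_project": 80.0,
    "participation": 100.0,
}

# 0.60 * 90 + 0.30 * 80 + 0.10 * 100 = 88.0
print(round(weighted_grade(example_scores), 2))
```

Since these weights are tentative, treat the numbers as placeholders rather than a definitive grade calculator.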

### Grading Rubric

Your homework and final assignments will be technical. There is no single correct answer; instead, your assignments will be evaluated based on the following criteria:
- 60% Completion: Awarded for submitting the assignment on time with a meaningful attempt at both the code and narrative components. Partial functionality is acceptable if effort is evident.
- 30% Results: Your code produces outputs that are closely aligned with the expected results and adheres to any outlined design principles. Minor deviations are acceptable as long as your narrative clearly explains your approach and reasoning.
- 10% Efficiency: Your code runs within a reasonable time frame and avoids major technical inefficiencies.

### Course Schedule and Topics

**This is tentative and subject to change.**

Expect us to cover data infrastructure, big data processing, query engines, data pipelines, and related topics. Several guest speakers from industry may also present relevant subjects.

| Week | Date       | Modality         | Topic                                 | Assignment                                      | Due Date   |
| :--- | :--------- | ---------------- | :------------------------------------ | :---------------------------------------------- | :--------- |
| 1    | 2024-10-01 | UW MSDS Facility | Intro to class, syllabus              |                                                 |            |
| 2    | 2024-10-08 | UW MSDS Facility | Data basics, AWS Web Console, AWS CLI |                                                 |            |
| 3    | 2024-10-15 | UW MSDS Facility | Polars DataFrames                     | [homework 1](../assignments/hw1/homework_01.md) | 2024-10-28 |
| 4    | 2024-10-22 | UW MSDS Facility | Guest?, Polars DataFrames (cont.)     |                                                 |            |
| 5    | 2024-10-29 | UW MSDS Facility | Amazon Redshift                       | [homework 2](../assignments/hw2/homework_02.md) | 2024-11-11 |
| 6    | 2024-11-05 | UW MSDS Facility | Guest?, Amazon Redshift (cont.)       |                                                 |            |
| 7    | 2024-11-12 | UW MSDS Facility | Amazon Athena                         | [homework 3](../assignments/hw3/homework_03.md) | 2024-12-02 |
| 8    | 2024-11-19 | Async/Offline    | DataCamp Spark Lessons                |                                                 |            |
| 9    | 2024-11-26 | UW MSDS Facility | Guest?, Building a pipeline           | [final](../assignments/final/final.md)          | 2024-12-10 |
| 10   | 2024-12-03 | UW MSDS Facility | Building a pipeline (cont.)           |                                                 |            |
| 11   | 2024-12-10 | none             | No class meeting                      |                                                 |            |

| Assignment                                      | Topic                         |
| ----------------------------------------------- | ----------------------------- |
| [homework 1](../assignments/hw1/homework_01.md) | Data validation with Polars   |
| [homework 2](../assignments/hw2/homework_02.md) | Query warm data with Redshift |
| [homework 3](../assignments/hw3/homework_03.md) | Using MLlib via Spark         |
| [final](../assignments/final/final.md)          | Build a pipeline              |