Synthesis & Presentation Prep

Theory

Unify the course’s theoretical themes: learning from large-scale structured data (interactions, graphs, sequences) under constraints of computation, memory, and reliability. Revisit the central concepts: sparse matrices and factorization, graph communities and link analysis, many-series forecasting, frequent-pattern and sequential-pattern mining, and distributed algorithm patterns for ETL. Highlight open questions in research and practice for scalable recommenders, graph ML, and retail forecasting. Facilitate a discussion in which students explicitly articulate how their project deviated from a naive small-data solution and which approximations or design choices scale forced on them.

Technical

The technical portion focuses on polishing and rehearsal. Run structured dry runs: a subset of teams give short practice presentations and receive feedback on clarity, narrative, and emphasis on metrics and trade-offs. The remaining teams conduct internal rehearsals and finalize visualizations, tables, and code organization. Class time is also available for last-minute pipeline debugging and for confirming reproducibility on the ARC cluster.