I had a great time last week attending the summer school on Fundamentals of Data Analysis at UW-Madison. You can find more details on the school’s website, which may be updated with recordings of the talks at some point, and by searching for tweets with the hashtag #MadisonDataSS. Three courses introduced me to very interesting topics, and the concluding deep learning lab was a blast.
This post mixes some of my tweets with additional comments on those subjects.
Randomized Linear Algebra
This course started with estimating the product of very large matrices by using a sample of rows or columns. I left wondering whether one could do something similar for linear programs with too many variables or constraints and estimate the optimal value. Then Jeff Linderoth told me that Leo Liberti has done exactly that with some collaborators: there is a preprint at Optimization Online.
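To make the sampling idea concrete, here is a minimal NumPy sketch of one common way to estimate a matrix product from a sample of column/row pairs. It is an illustration under my own assumptions, not code from the course: the function name `sampled_matmul`, the choice to sample with probabilities proportional to the product of column and row norms, and the demo sizes are all hypothetical.

```python
import numpy as np


def sampled_matmul(A, B, c, rng=None):
    """Estimate A @ B from c sampled column/row pairs (hypothetical helper).

    Column k of A and row k of B are drawn together, with replacement,
    with probability proportional to ||A[:, k]|| * ||B[k, :]||.
    Each sampled outer product is rescaled by 1 / (c * p_k), which makes
    the estimate unbiased. Assumes at least one pair has nonzero norm.
    """
    rng = np.random.default_rng(rng)
    weights = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    p = weights / weights.sum()
    idx = rng.choice(A.shape[1], size=c, replace=True, p=p)
    scale = 1.0 / (c * p[idx])
    # Scale the sampled columns of A, then multiply by the sampled rows of B.
    return (A[:, idx] * scale) @ B[idx, :]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Demo data: an approximately low-rank pair, chosen only for illustration.
    A = rng.standard_normal((30, 5)) @ rng.standard_normal((5, 5000))
    B = A.T
    exact = A @ B
    approx = sampled_matmul(A, B, c=500, rng=1)
    rel_err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
    print(f"relative Frobenius error with 500 of 5000 pairs: {rel_err:.3f}")
```

The error of such an estimate shrinks as the number of samples `c` grows, and how quickly depends on the matrices involved, so the demo prints the error rather than promising a particular value.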
Active Machine Learning
Choosing judiciously which data points calibrate the model, as active learning does, is important when labeling those points has a cost, for example when a human has to determine what the label should be.
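A toy example shows how much a careful choice of queries can save. The sketch below is my own illustration, not material from the lectures: it assumes 1-D points whose labels switch from 0 to 1 at some unknown threshold, and a hypothetical `oracle` function standing in for the costly human labeler. By always asking about the middle of the still-uncertain range, it finds the switch with a logarithmic number of label requests instead of labeling every point.

```python
def learn_threshold_actively(points, oracle):
    """Find where labels switch from 0 to 1 using few label queries.

    Assumes every point left of an unknown threshold is labeled 0 and
    every point at or right of it is labeled 1. `oracle(x)` returns the
    label of x and represents the expensive labeling step.
    Returns (index of the first point labeled 1 in sorted order,
             number of labels requested).
    """
    xs = sorted(points)
    lo, hi = 0, len(xs)  # invariant: xs[:lo] are 0s, xs[hi:] are 1s
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1  # one more label paid for
        if oracle(xs[mid]) == 1:
            hi = mid
        else:
            lo = mid + 1
    return lo, queries


if __name__ == "__main__":
    pool = [i / 1000 for i in range(1000)]
    first_one, used = learn_threshold_actively(pool, lambda x: int(x >= 0.3141))
    print(f"label switches at index {first_one}; labels requested: {used} of {len(pool)}")
```

Real data is rarely this clean (labels can be noisy and features are not one-dimensional), so treat this as intuition for why adaptive querying helps rather than as a practical method.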
The tweet above about graph clustering was one of those memorable moments when my jaw dropped during a lecture. And those were just 3 slides!
The Jupyter notebooks of this lab take a bit longer than the lecture time, and I have yet to finish them, but they have been quite easy to follow on my own.