Some highlights on Madison’s IFDS summer school: Randomized linear algebra, active machine learning, random graphs (and, of course, deep learning)

I had a great time last week attending the summer school on Fundamentals of Data Analysis at UW-Madison. You can find more details on the school’s website, which may be updated with recordings of the talks at some point, and by searching for tweets with the hashtag #MadisonDataSS. Three courses introduced me to very interesting topics, and the concluding deep learning lab was a blast.

This post mixes some of my tweets with additional comments on those subjects.

Randomized Linear Algebra

Michael Mahoney (UC Berkeley) kicks off #MadisonDataSS with Randomization in Numerical Linear Algebra #orms

Slides and other materials can be found in the following link: https://t.co/oM950ZzvjQ pic.twitter.com/tvj3UsB1bD

— Thiago Serra (@thserra.bsky.social) (@thserra) July 24, 2018

This course started with estimating the product of very large matrices by using a sample of rows or columns. I left wondering whether one could do something similar for linear programs with too many variables or constraints, and estimate the optimal value. Then Jeff Linderoth told me that Leo Liberti has done exactly that with some collaborators: there is a pre-print at Optimization Online.
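To make the sampling idea concrete, here is a minimal sketch of one standard way to estimate a matrix product from a sample of column/row pairs. It is not taken from the lecture slides: the function name `sampled_matmul`, the norm-based sampling probabilities, and the test matrices are my own assumptions for illustration.

```python
import numpy as np

def sampled_matmul(A, B, c, rng):
    """Estimate A @ B from c sampled column/row pairs (illustrative sketch)."""
    # Assumption: sample index k with probability proportional to
    # ||A[:, k]|| * ||B[k, :]||, a common choice for this estimator.
    probs = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    probs = probs / probs.sum()
    idx = rng.choice(A.shape[1], size=c, replace=True, p=probs)
    # Rescale each sampled outer product so the estimate is unbiased.
    scale = 1.0 / (c * probs[idx])
    return (A[:, idx] * scale) @ B[idx, :]

rng = np.random.default_rng(0)
A = rng.random((50, 2000))   # made-up nonnegative test data
B = rng.random((2000, 40))
exact = A @ B
approx = sampled_matmul(A, B, 500, rng)
rel_err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
print(f"relative Frobenius error using 500 of 2000 pairs: {rel_err:.3f}")
```

The estimate only touches 500 of the 2000 inner-dimension indices. How well it works depends on the data: products with a lot of cancellation between terms are much harder to approximate this way.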

Active Machine Learning

Moving on to Robert Nowak teaching theory and applications of Active Machine Learning to learn in less time and with less labeled data #MadisonDataSS pic.twitter.com/kix3Scu3cI

— Thiago Serra (@thserra.bsky.social) (@thserra) July 25, 2018

In active learning, you choose the next data point to label aiming to refine the search for the most suitable model (instead of just picking points at random) #MadisonDataSS pic.twitter.com/JE4szWVShO

— Thiago Serra (@thserra.bsky.social) (@thserra) July 25, 2018

Particularly useful when there is discontinuity #MadisonDataSS pic.twitter.com/reSDwBGXwp

— Thiago Serra (@thserra.bsky.social) (@thserra) July 25, 2018

Notes on the theory of Active Machine Learning can be found in Nowak’s website: https://t.co/QXpjK7a95q #MadisonDataSS pic.twitter.com/hUEve1Kl7Y

— Thiago Serra (@thserra.bsky.social) (@thserra) July 25, 2018

This judicious choice of data points to calibrate the model in active learning is important when there is a cost associated with labeling those points, for example when you need human intervention to determine what the label should be.
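A toy example, which is my own and not from Nowak's notes, shows why choosing the points can pay off when the target has a discontinuity. Assume the model is a single threshold on [0, 1] and labels are noiseless; the helper names and the threshold value below are hypothetical.

```python
import random

TRUE_THRESHOLD = 0.37  # hypothetical ground truth, unknown to the learner

def oracle(x):
    """Stands in for the costly labeling step (e.g. a human annotator)."""
    return 1 if x >= TRUE_THRESHOLD else 0

def active_threshold(n_queries):
    """Active learning: always query the midpoint of the uncertain interval."""
    lo, hi = 0.0, 1.0
    for _ in range(n_queries):
        mid = (lo + hi) / 2
        if oracle(mid) == 1:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def passive_threshold(n_queries, rng):
    """Passive learning: label points drawn uniformly at random."""
    lo, hi = 0.0, 1.0
    for _ in range(n_queries):
        x = rng.random()
        if oracle(x) == 1:
            hi = min(hi, x)
        else:
            lo = max(lo, x)
    return (lo + hi) / 2

rng = random.Random(0)
active_err = abs(active_threshold(20) - TRUE_THRESHOLD)
passive_err = abs(passive_threshold(20, rng) - TRUE_THRESHOLD)
print(f"active error:  {active_err:.2e}")
print(f"passive error: {passive_err:.2e}")
```

With the same budget of 20 labels, the bisection strategy halves the uncertain interval at every query, whereas random points only shrink it when they happen to land inside it.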

Random Graphs

Next at #MadisonDataSS, Sébastien Roch is giving a mini-course about Probabilities on Graphs. Notes can be found on his website: https://t.co/iNTPlmlGXy #orms

— Thiago Serra (@thserra.bsky.social) (@thserra) July 26, 2018

What a beautiful result in graph clustering:

Assuming two balanced blocks and different probabilities for edges within and between partitions, the blocks emerge from the second eigenvector of the adjacency matrix #MadisonDataSS #orms pic.twitter.com/kwxkY5tl3e

— Thiago Serra (@thserra.bsky.social) (@thserra) July 26, 2018

The tweet above about graph clustering was one of those memorable moments when my jaw dropped during a lecture. And those were just 3 slides!
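The result is easy to try numerically. The following is a rough sketch under assumptions of my own (200 nodes, edge probability 0.5 within blocks and 0.1 between them, and hypothetical function names); it is not code from the course.

```python
import numpy as np

def sbm_adjacency(n, p_in, p_out, rng):
    """Random graph with two balanced blocks (n is assumed to be even)."""
    labels = np.repeat([0, 1], n // 2)
    same_block = labels[:, None] == labels[None, :]
    probs = np.where(same_block, p_in, p_out)
    # Draw each edge once in the upper triangle, then symmetrize.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    A = (upper | upper.T).astype(float)
    return A, labels

def spectral_bisection(A):
    """Split nodes by the sign of the second eigenvector of A."""
    # eigh returns eigenvalues in ascending order, so column -2 belongs
    # to the second largest eigenvalue.
    _, eigvecs = np.linalg.eigh(A)
    return (eigvecs[:, -2] > 0).astype(int)

rng = np.random.default_rng(0)
A, labels = sbm_adjacency(200, 0.5, 0.1, rng)
pred = spectral_bisection(A)
# Block names are arbitrary, so score the better of the two labelings.
agreement = np.mean(pred == labels)
accuracy = max(agreement, 1 - agreement)
print(f"fraction of nodes assigned to the right block: {accuracy:.2f}")
```

The gap between the within-block and between-block probabilities matters here: as the two get closer, the second eigenvector becomes noisier and the recovered blocks degrade.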

Deep Learning

Interestingly, Deep Learning is the last mini-course at #MadisonDataSS https://t.co/SsVFdFfERC

— Thiago Serra (@thserra.bsky.social) (@thserra) July 27, 2018

For those not attending #MadisonDataSS, slides and lab materials of the deep learning course are available at https://t.co/CYmxWblnug #orms https://t.co/mmwfqaktc7

— Thiago Serra (@thserra.bsky.social) (@thserra) July 28, 2018

The Jupyter notebooks for this lab take a bit longer than the lecture time, and I have yet to finish them, but they have been quite easy to follow on my own.
