Understanding Self-supervised Learning

Tengyu Ma (Stanford University)



Self-supervised learning has made empirical breakthroughs in producing representations that transfer to a wide range of downstream tasks. In this talk, I will primarily present a recent work that analyzes contrastive learning algorithms under realistic assumptions on the data distributions for vision applications. We prove that contrastive learning can be viewed as a parametric version of spectral clustering on a so-called population augmentation graph, analyze the linear separability of the learned representations, and provide sample complexity bounds. I will also briefly discuss two follow-up works studying the performance of self-supervised representations under imbalanced training datasets and under shifting test distributions. The talk is based on the following recent works: https://arxiv.org/abs/2106.04156, https://arxiv.org/abs/2110.05025, https://arxiv.org/abs/2204.00570, https://arxiv.org/abs/2204.02683.
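
For concreteness, below is a minimal PyTorch sketch of the spectral contrastive loss from the first paper above (arXiv:2106.04156), L(f) = -2 E[f(x)^T f(x+)] + E[(f(x)^T f(x'))^2], where x and x+ are two augmentations of the same natural image and x, x' are independent; minimizing this loss corresponds, at the population level, to spectral clustering on the augmentation graph. The minibatch estimator here (off-diagonal batch pairs standing in for independent pairs) and the function name are illustrative assumptions of this sketch, not necessarily the paper's exact implementation.

```python
import torch

def spectral_contrastive_loss(z1: torch.Tensor, z2: torch.Tensor) -> torch.Tensor:
    """Minibatch estimate of the spectral contrastive loss
    (HaoChen et al., arXiv:2106.04156):
        L(f) = -2 E[f(x)^T f(x+)] + E[(f(x)^T f(x'))^2].

    z1, z2: (batch, dim) embeddings of two random augmentations of the
    same batch of images. Off-diagonal batch pairs approximate the
    independent pairs (x, x'); this estimator is an assumption of the
    sketch, not necessarily the authors' exact implementation.
    """
    n = z1.shape[0]
    # Attraction term: inner products of positive pairs
    # (two augmented views of the same image).
    attract = -2.0 * (z1 * z2).sum(dim=1).mean()
    # Repulsion term: squared inner products between embeddings of
    # (approximately) independent images, taken from off-diagonal pairs.
    gram = z1 @ z2.t()                               # (n, n) pairwise inner products
    off_diag = gram[~torch.eye(n, dtype=torch.bool)]  # drop positive pairs
    repel = (off_diag ** 2).mean()
    return attract + repel
```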

