“The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks” Paper Summary & Analysis

Objectives / Goals of the Paper

What problem is the paper tackling?

In machine learning, pruning techniques remove unnecessary parameters from neural networks to reduce model size and speed up inference without materially affecting accuracy. Earlier work has demonstrated that up to 90% of a model's parameters can be removed without reducing accuracy, dramatically speeding up inference once the model is trained. However, the sparse architectures produced by pruning have been difficult to train from scratch: starting from a fresh random initialization, they reach lower accuracy than the original dense network. The paper's main contribution is the lottery ticket hypothesis: a randomly initialized dense network contains a subnetwork (a "winning ticket") which, when its weights are reset to their original initialization and trained in isolation, can match or exceed the accuracy of the full network in a similar or smaller number of training iterations. Finding such sparse, trainable subnetworks promises better computational efficiency and, in some cases, better generalization.
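
Below is a minimal sketch of the iterative magnitude-pruning-with-rewinding procedure the paper uses to find winning tickets, written against a generic PyTorch model. The `train_fn` helper, the number of rounds, and the per-round pruning fraction are placeholders for illustration, not the authors' code.

```python
import copy
import torch


def find_winning_ticket(model, train_fn, rounds=15, prune_frac=0.2):
    """Sketch of iterative magnitude pruning with weight rewinding.

    `train_fn(model, masks)` is a hypothetical helper that trains the model
    while keeping masked-out weights at zero; it is not part of the paper's code.
    """
    # Save the original random initialization (theta_0) so we can rewind to it.
    init_state = copy.deepcopy(model.state_dict())
    # Start with an all-ones mask: no weights pruned yet (weight matrices only).
    masks = {name: torch.ones_like(p)
             for name, p in model.named_parameters() if p.dim() > 1}

    for _ in range(rounds):
        # 1. Train the currently surviving weights to convergence.
        train_fn(model, masks)
        # 2. Prune the smallest-magnitude surviving weights in each layer.
        for name, p in model.named_parameters():
            if name not in masks:
                continue
            mask = masks[name]
            surviving = p.data[mask.bool()].abs()
            k = int(prune_frac * surviving.numel())
            if k == 0:
                continue
            threshold = surviving.sort().values[k - 1]
            masks[name] = mask * (p.data.abs() > threshold).float()
        # 3. Rewind the remaining weights to their original initialization.
        model.load_state_dict(init_state)
        for name, p in model.named_parameters():
            if name in masks:
                p.data *= masks[name]

    return model, masks
```

Because each round removes only a fixed fraction of the remaining weights (roughly 20% in the paper's experiments), reaching very high sparsity takes many rounds: after 15 rounds of 20% pruning, about 0.8^15 ≈ 3.5% of the weights remain, and every round requires a full training run. This is the computational cost discussed in the limitations below.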

Paper Limitations, Further Research, & Potential Applications

A major limitation of the paper is that it only evaluates on relatively small datasets such as MNIST and CIFAR-10, not on larger datasets like ImageNet. Later work from CSAIL extended this paper and ran it on ImageNet, improving on the methods first developed in the paper. Another drawback of the current method is that iterative pruning is computationally intensive, requiring the network to be trained 15 or more times. Subsequent work has focused on reducing this computational cost.

Cornell Data Science

Cornell Data Science is an engineering project team @Cornell that seeks to prepare students for a career in data science.