3 years ago

Deriving optimal weights in deep neural networks.

Neda Rohani, Aggelos Katsaggelos, Nima Dehmamy

Training deep neural networks generally requires massive amounts of data and is computationally intensive. We show here that it may be possible to circumvent the expensive gradient descent procedure and derive the parameters of a neural network directly from properties of the training data. We show that, near convergence, the gradient descent equations for layers close to the input can be linearized and become stochastic equations with noise related to the covariance of the data for each class. We derive the distribution of solutions to these equations and discover that it is related to a "supervised principal component analysis." We implement these results on the image datasets MNIST, CIFAR10 and CIFAR100 and find that layers pretrained using our findings do indeed perform comparably to, or better than, neural networks of the same size and architecture trained with gradient descent. Moreover, our pretrained layers can often be calculated using a fraction of the training data, owing to the quick convergence of the covariance matrix. Thus, our findings indicate that we can cut the training time both by requiring only a fraction of the data used for gradient descent and by eliminating layers from the costly backpropagation step of training. Additionally, these findings partially elucidate the inner workings of deep neural networks and allow us to mathematically calculate optimal solutions for some stages of classification problems, thus significantly boosting our ability to solve such problems efficiently.
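
The abstract does not give the paper's formulas, so the following is only a rough sketch of the general idea, not the authors' method. It assumes that a "supervised" PCA can be approximated by taking the leading eigenvectors of each class's covariance matrix and stacking them as candidate first-layer weights. The function names (`class_covariances`, `top_components`, `candidate_first_layer`), the synthetic data, and the choice of two components per class are all hypothetical. The last lines illustrate the claim about using a fraction of the data by comparing a covariance estimated from 20% of the samples with the full-data estimate.

```python
import numpy as np

def class_covariances(X, y):
    """Return {label: covariance matrix} estimated from the samples of each class."""
    covs = {}
    for label in np.unique(y):
        Xc = X[y == label]                      # samples belonging to this class
        covs[int(label)] = np.cov(Xc, rowvar=False)
    return covs

def top_components(cov, k):
    """Return the k unit eigenvectors of a symmetric matrix with the largest eigenvalues (as rows)."""
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]       # indices of the k largest eigenvalues
    return eigvecs[:, order].T

def candidate_first_layer(X, y, k_per_class):
    """ASSUMPTION: stack per-class leading eigenvectors as a stand-in for 'supervised PCA' weights."""
    covs = class_covariances(X, y)
    return np.vstack([top_components(covs[c], k_per_class) for c in sorted(covs)])

# Synthetic stand-in for an image dataset: 3 classes, each with one high-variance direction.
rng = np.random.default_rng(0)
n, d = 2000, 8
y = rng.integers(0, 3, size=n)
scales = np.ones((3, d))
scales[0, 0] = scales[1, 1] = scales[2, 2] = 3.0
X = rng.normal(size=(n, d)) * scales[y]

W = candidate_first_layer(X, y, k_per_class=2)  # shape: (3 classes * 2 components, d)

# Compare the class-0 covariance from a 20% subsample with the full-data estimate.
idx = rng.choice(n, size=n // 5, replace=False)
cov_full = class_covariances(X, y)[0]
cov_sub = class_covariances(X[idx], y[idx])[0]
rel_err = np.linalg.norm(cov_full - cov_sub) / np.linalg.norm(cov_full)
print("weight matrix shape:", W.shape)
print("relative covariance error with 20% of the data:", rel_err)
```

In a real experiment, `W` would be used to initialize or replace a layer near the input, with the remaining layers trained as usual. Whether that matches the paper's construction would need to be checked against the full text.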

Publisher URL: http://arxiv.org/abs/1703.04757

DOI: arXiv:1703.04757v2
