Sandbox 5

singular learning theory

Singular learning theory (SLT) is a mathematical framework that extends traditional statistical learning theory using techniques from algebraic geometry, Bayesian statistics, and statistical physics.

For learning machines such as deep neural networks, where multiple parameter values correspond to the same statistical distribution, the preimage of the target distribution may take the form of a singular subspace of the parameter space.

Motivation

  • Deep learning can model extremely complex functions because of its hierarchical structure and hidden variables. Such models are in general nonidentifiable (the map from parameters to statistical models is not one-to-one) and singular (the likelihood function cannot be approximated by any Gaussian function); almost all eigenvalues of the Fisher information matrix are zero.[1][2]
  • Maximum likelihood and Bayesian methods have different predictive performance, even as the sample size goes to infinity.
  • Even if both the statistical model and the prior distribution are overparametrized for an unknown uncertainty, the generalization error does not increase.[4]
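
The notions of nonidentifiability and singularity in the first point can be illustrated on a toy model. The sketch below is not taken from the cited references: it assumes a hypothetical two-parameter regression model y = a·tanh(b·x) + unit-variance Gaussian noise and an arbitrary grid of inputs, and compares the Fisher information matrix at a generic parameter with the one at a = 0, where every value of b gives the same distribution.

```python
import math

def mean(x, a, b):
    """Regression function f(x; a, b) = a * tanh(b * x) (illustrative toy model)."""
    return a * math.tanh(b * x)

def fisher_information(a, b, xs):
    """2x2 Fisher information for y ~ N(f(x; a, b), 1), averaged over the inputs xs.

    With unit-variance Gaussian noise, I(a, b) is the average of grad f * grad f^T.
    """
    i_aa = i_ab = i_bb = 0.0
    for x in xs:
        t = math.tanh(b * x)
        df_da = t                      # derivative of f with respect to a
        df_db = a * x * (1.0 - t * t)  # derivative of f with respect to b
        i_aa += df_da * df_da
        i_ab += df_da * df_db
        i_bb += df_db * df_db
    n = len(xs)
    return [[i_aa / n, i_ab / n], [i_ab / n, i_bb / n]]

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Input grid on [-2, 2]; the choice of grid is an assumption made for illustration.
xs = [i / 10.0 for i in range(-20, 21)]

regular = fisher_information(1.0, 0.5, xs)   # generic parameter
singular = fisher_information(0.0, 0.5, xs)  # parameter on the set a = 0

# Nonidentifiability: with a = 0, different values of b give the same function.
same_function = all(
    abs(mean(x, 0.0, 0.5) - mean(x, 0.0, 3.0)) < 1e-12 for x in xs
)

print("det of Fisher information at (a, b) = (1.0, 0.5):", det2(regular))
print("det of Fisher information at (a, b) = (0.0, 0.5):", det2(singular))
print("a = 0 gives the same function for different b:", same_function)
```

At the generic parameter the determinant is strictly positive, while at a = 0 the derivative with respect to b vanishes identically, so the Fisher information matrix has a zero eigenvalue and the usual Gaussian approximation of the likelihood is unavailable there.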

These facts show that deep learning is quite different from conventional statistical models. Singular learning theory seeks to construct a mathematical foundation for such models on the basis of algebraic geometry.

SLT derives mathematical theorems that hold for an arbitrary triple (true distribution, statistical model, prior), so they can be applied to real-world data science problems where the uncertainty is unknown. Furthermore, the marginal likelihood and the generalization error of singular learning machines are clarified and can be calculated in real-world problems.[3]

Bibliography
1. Shun-ichi Amari, T. Ozeki, H. Park, Learning and inference in hierarchical models with singularities, Syst. Comput. Japan 34:7 (2003) 34–42.
2. Sumio Watanabe, Almost all learning machines are singular, Proc. IEEE Symp. Found. Comput. Intell., Apr. 2007, 383–388.
3. Sumio Watanabe, Recent Advances in Algebraic Geometry and Bayesian Statistics, Information Geometry (2022), https://doi.org/10.1007/s41884-022-00083-9
4. Sumio Watanabe, Mathematical Theory of Bayesian Statistics for Unknown Information Source, Philosophical Transactions A (2023), https://doi.org/10.1098/rsta.2022.0151