A NEW APPROACH TO BUILDING ENERGY MODELS OF NEURAL NETWORKS

Yurii Parzhyn
Mykyta Lapin
Kostiantyn Bokhan

Abstract

Relevance. Modern artificial neural network models require significant energy and other resources for training and operation. Training generative models requires vast amounts of data. At the same time, these models face challenges related to the trustworthiness of the information they generate. An alternative to the current paradigms for building and training neural networks is the development of energy-based models, which could overcome these shortcomings and bring information processing closer to biologically and physically grounded processes. However, existing energy-based models differ little from classical models in their limitations and drawbacks. Therefore, developing new approaches to modeling energy-based information processing in neural networks is highly relevant. The object of research is the process of information processing in artificial neural networks. The subject of research is the mathematical models for constructing and training artificial neural networks. The purpose of this paper is to develop and experimentally validate a theoretical framework that postulates the energetic nature of information and its role in the self-organization and evolution of complex information systems. Research Results. A fundamental theory is proposed that describes information as a structure of perceived external energy parameters governing the formation of a system's internal energetic structure, that is, its model of the external world. The theory encompasses the concept of energy landscapes, principles of energy-based structural and parametric reduction, and a critical analysis of existing computational paradigms. Experiments on constructing and training the developed energy-based model confirm its high generalization ability on ultra-small training datasets under one-pass training without the backpropagation algorithm.
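For readers new to the energy-based setting, the sketch below is a minimal classical illustration, not the authors' model: a Hopfield network in the sense of Hopfield (1982), cited in the reference list. It combines the two ingredients the abstract highlights, an explicit energy function over network states and one-pass (Hebbian) learning that uses no backpropagation. The toy patterns and function names are ours and purely illustrative.

```python
# Minimal sketch of a classical energy-based model (a Hopfield network,
# cf. Hopfield 1982 in the reference list). This is NOT the authors' model;
# it only illustrates an explicit energy over states plus one-pass
# Hebbian learning with no backpropagation.
import numpy as np

def train_hebbian(patterns):
    """One-pass Hebbian learning: a single sweep over the data, no gradients."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:                 # each pattern is seen exactly once
        W += np.outer(p, p)
    np.fill_diagonal(W, 0.0)           # no self-connections
    return W / len(patterns)

def energy(W, s):
    """Hopfield energy E(s) = -1/2 s^T W s; stored patterns sit in its minima."""
    return -0.5 * s @ W @ s

def recall(W, s, sweeps=10):
    """Asynchronous updates that monotonically descend the energy landscape."""
    s = s.copy()
    for _ in range(sweeps):
        for i in np.random.permutation(len(s)):
            s[i] = 1 if W[i] @ s >= 0 else -1
    return s

# Store two +/-1 patterns, then recover one from a corrupted cue.
patterns = np.array([[1, -1, 1, -1, 1, -1, 1, -1],
                     [1, 1, 1, 1, -1, -1, -1, -1]])
W = train_hebbian(patterns)
noisy = patterns[0].copy()
noisy[:2] *= -1                        # corrupt two bits
restored = recall(W, noisy)
print(energy(W, noisy), "->", energy(W, restored))  # energy decreases
print(np.array_equal(restored, patterns[0]))        # True: pattern recovered
```

This baseline, of course, shares the limitations the abstract attributes to existing energy-based models; the paper's contribution concerns how the energy structure itself is formed, which the sketch does not attempt to reproduce.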

Article Details

How to Cite
Parzhyn, Y., Lapin, M., & Bokhan, K. (2025). A NEW APPROACH TO BUILDING ENERGY MODELS OF NEURAL NETWORKS. Advanced Information Systems, 9(4), 100–119. https://doi.org/10.20998/2522-9052.2025.4.13
Section
Intelligent information systems
Author Biographies

Yurii Parzhyn, Augusta University, Augusta, United States of America

Doctor of Technical Sciences, Postdoctoral Fellow, School of Computer and Cyber Sciences

Mykyta Lapin, National Technical University "Kharkiv Polytechnic Institute", Kharkiv, Ukraine

Postgraduate student of the Department of System Analysis and Information and Analytical Technologies

Kostiantyn Bokhan, National Technical University "Kharkiv Polytechnic Institute", Kharkiv, Ukraine

PhD, Associate Professor of the Department of System Analysis and Information and Analytical Technologies

References

Liu, Y., Cao, J., Liu, C., Ding, K. and Jin, L. (2024), “Datasets for Large Language Models: A Comprehensive Survey”, arXiv, arXiv:2402.18041, doi: https://doi.org/10.48550/arXiv.2402.18041

Villalobos, P., Ho, A., Sevilla, J., Besiroglu, T., Heim, L. and Hobbhahn, M. (2024), “Will we run out of data? Limits of LLM scaling based on human-generated data”, arXiv, arXiv:2211.04325v2, available at: https://arxiv.org/html/2211.04325v2

Henshall, W. (2024), “The Billion-Dollar Price Tag of Building AI”, TIME, available at: https://time.com/6984292/cost-artificial-intelligence-compute-epoch-report/

Zewe, A. (2025), “Explained: Generative AI’s environmental impact”, MIT News, Massachusetts Institute of Technology, January 17, available at: https://news.mit.edu/2025/explained-generative-ai-environmental-impact-0117

Luccioni, S., Trevelin, B. and Mitchell, M. (2024), “The Environmental Impacts of AI – Primer”, Hugging Face, published September 3, available at: https://huggingface.co/blog/sasha/ai-environment-primer

Zhou, L., Schellaert, W., Martínez-Plumed, F., Moros-Daval, Y., Ferri, C. and Hernández-Orallo, J. (2024), “Larger and more instructable language models become less reliable”, Nature, vol. 634, pp. 61–68, doi: https://doi.org/10.1038/s41586-024-07930-y

Williams, R. (2024), “Why Google’s AI Overviews gets things wrong”, MIT Technology Review, available at: https://www.technologyreview.com/2024/05/31/1093019/why-are-googles-ai-overviews-results-so-bad/

LeCun, Y. (2022), “A Path Towards Autonomous Machine Intelligence”, OpenReview, 62 p., available at: https://openreview.net/pdf?id=BZ5a1r-kVsf

Marcus, G. (2018), “Deep Learning: A Critical Appraisal”, arXiv, arXiv:1801.00631, doi: https://doi.org/10.48550/arXiv.1801.00631

Zador, A. (2019), “A critique of pure learning and what artificial neural networks can learn from animal brains”, Nat. Commun., vol. 10, 3770, doi: https://doi.org/10.1038/s41467-019-11786-6

Shumailov, I., Shumaylov, Z., Zhao, Y., Papernot, N., Anderson, R. and Gal, Y. (2024), “AI models collapse when trained on recursively generated data”, Nature, vol. 631, pp. 755–759, doi: https://doi.org/10.1038/s41586-024-07566-y

Alemohammad, S., Casco-Rodriguez, J., Luzi, L., Humayun, A.I., Babaei, H., LeJeune, D., Siahkoohi, A. and Baraniuk, R.G. (2023), “Self-Consuming Generative Models Go MAD”, arXiv, arXiv:2307.01850, doi: https://doi.org/10.48550/arXiv.2307.01850

Glansdorff, P. and Prigogine, I. (1971), Thermodynamic Theory of Structure, Stability and Fluctuations, Wiley-Interscience, 306 p., available at: https://archive.org/details/thermodynamicthe0000glan/page/n5/mode/2up

Hopfield, J. (1982), “Neural networks and physical systems with emergent collective computational abilities”, Proc. Natl. Acad. Sci. U.S.A., vol. 79(8), pp. 2554–2558, available at: https://www.pnas.org/doi/epdf/10.1073/pnas.79.8.2554

Hopfield, J. (1984), “Neurons with graded response have collective computational properties like those of two-state neurons”, Proc. Natl. Acad. Sci. U.S.A., vol. 81, pp. 3088–3092, available at: https://www.pnas.org/doi/10.1073/pnas.81.10.3088

Krauth, W. (2006), Statistical Mechanics: Algorithms and Computations, Oxford University Press, 360 p., available at: https://global.oup.com/booksites/content/9780198515364/

Sassano, M. and Astolfi, A. (2013), “Dynamic Lyapunov functions”, Automatica, vol. 49, pp. 1058–1067, available at: https://liberzon.csl.illinois.edu/teaching/dynamic-Lyapunov-functions.pdf

Walter, J. and Barkema, G. (2015), “An introduction to Monte Carlo methods”, Physica A: Statistical Mechanics and its Applications, vol. 418, pp. 78–87, doi: https://doi.org/10.48550/arXiv.1404.0209

LeCun, Y., Chopra, S., Hadsell, R., Ranzato, M. and Huang, F. (2006), “A Tutorial on Energy-Based Learning”, MIT Press, available at: http://web.stanford.edu/class/cs379c/archive/2012/suggested_reading_list/documents/LeCunetal06.pdf

Belanger, D. and McCallum, A. (2016), “Structured Prediction Energy Networks”, Proceedings of The 33rd Int. Conference on Machine Learning, PMLR, vol. 48, pp. 983–992, available at: https://proceedings.mlr.press/v48/belanger16.html

Grathwohl, W., Wang, K., Jacobsen, J., Duvenaud, D., Norouzi, M. and Swersky, K. (2020), “Your Classifier is Secretly an Energy Based Model and You Should Treat it Like One”, arXiv, arXiv:1912.03263, doi: https://doi.org/10.48550/arXiv.1912.03263

Friston, K. (2012), “A Free Energy Principle for Biological Systems”, Entropy (Basel), vol. 14(11), pp. 2100–2121, available at: https://pubmed.ncbi.nlm.nih.gov/23204829/

Bengio, Y., Lee, D., Bornschein, J., Mesnard, T. and Lin, Z. (2016), “Towards Biologically Plausible Deep Learning”, arXiv, arXiv:1502.04156, doi: https://doi.org/10.48550/arXiv.1502.04156

Landauer, R. (1961), “Irreversibility and Heat Generation in the Computing Process”, IBM Journal of Research and Development, vol. 5, no. 3, pp. 183–191, doi: https://doi.org/10.1147/rd.53.0183

Leff, H. and Rex, A. (1990), Maxwell's Demon: Entropy, Information, Computing (Princeton Series in Physics), Princeton University Press, 349 p., available at: https://www.amazon.co.uk/Maxwells-Demon-Entropy-Information-Computing/dp/0750300574

Middelburg, J. (2024), “The Gibbs Free Energy”, Thermodynamics and Equilibria in Earth System Sciences: An Introduction, SpringerBriefs in Earth System Sciences, Springer, Cham, pp. 35–46, doi: https://doi.org/10.1007/978-3-031-53407-2_4

Bishop, C. (2006), Pattern Recognition and Machine Learning, Springer, 778 p., available at: https://link.springer.com/book/

Brown, E. and Ahlers, G. (2008), “A model of diffusion in a potential well for the dynamics of the large-scale circulation in turbulent Rayleigh-Bénard convection”, arXiv, arXiv:0807.3193, 43 p., available at: https://arxiv.org/pdf/0807.3193

Shwartz-Ziv, R. and Tishby, N. (2017), “Opening the Black Box of Deep Neural Networks via Information”, arXiv, arXiv:1703.00810, doi: https://doi.org/10.48550/arXiv.1703.00810

Tishby, N., Pereira F. and Bialek, W. (2000), “The Information Bottleneck Method”, arXiv, arXiv:physics/0004057, available at: https://arxiv.org/abs/physics/0004057

Dawid, A. and LeCun, Y. (2023), “Introduction to Latent Variable Energy-Based Models: A Path Towards Autonomous Machine Intelligence”, arXiv, arXiv:2306.02572, doi: https://doi.org/10.48550/arXiv.2306.02572

Liu, J., Meng, Y., Fitzsimmons, M. and Zhou, R. (2025), “Physics-informed neural network Lyapunov functions: PDE characterization, learning, and verification”, Automatica, vol. 175, article no. 112193, doi: https://doi.org/10.1016/j.automatica.2025.112193

Acharya, J., Basu, A., Legenstein, R., Limbacher, T., Poirazi, P. and Wu, X. (2022), “Dendritic Computing: Branching Deeper into Machine Learning”, Neuroscience, vol. 489, pp. 275–289, doi: https://doi.org/10.1016/j.neuroscience.2021.10.001

Lerma-Usabiaga, G., Winawer, J. and Wandell, B. (2021), “Population Receptive Field Shapes in Early Visual Cortex Are Nearly Circular”, Journal of Neuroscience, vol. 41(11), pp. 2420–2427, doi: https://doi.org/10.1523/JNEUROSCI.3052-20.2021

Parzhyn, Y. (2025), “Architecture of Information”, arXiv, arXiv:2503.21794, doi: https://doi.org/10.48550/arXiv.2503.21794

Parzhin, Y., Kosenko, V., Podorozhniak, A., Malyeyeva, O. and Timofeyev, V. (2020), “Detector neural network vs connectionist ANNs”, Neurocomputing, vol. 414, pp. 191–203, doi: https://doi.org/10.1016/j.neucom.2020.07.025

Quiroga, R., Reddy, L., Kreiman, G., Koch, C. and Fried, I. (2005), “Invariant visual representation by single neurons in the human brain”, Nature, vol. 435, pp. 1102–1107, doi: https://doi.org/10.1038/nature03687

Bowers, J. (2009), “On the Biological Plausibility of Grandmother Cells: Implications for Neural Network Theories in Psychology and Neuroscience”, Psychological Review, vol. 116, pp. 220–251, available at: https://pubmed.ncbi.nlm.nih.gov/19159155/

Roy, A. (2013), “An extension of the localist representation theory: grandmother cells are also widely used in the brain”, Front. Psychol., vol. 4, doi: https://doi.org/10.3389/fpsyg.2013.00300

Gawne, T. (2015), “The responses of V1 cortical neurons to flashed presentations of orthogonal single lines and edges”, J. Neurophysiol., vol. 113, no. 7, pp. 2676–2681, doi: https://doi.org/10.1152/jn.00940.2014

Shevelev, I. (1998), “Second-order features extraction in the cat visual cortex: selective and invariant sensitivity of neurons to the shape and orientation of crosses and corners”, Biosystems, vol. 48, pp. 195–204, doi: https://doi.org/10.1016/s0303-2647(98)00066-5

Leopold, D., Bondar, I. and Giese, M. (2006), “Norm-based face encoding by single neurons in the monkey inferotemporal cortex”, Nature, vol. 442, pp. 572–575, doi: https://doi.org/10.1038/nature04951

Poirazi, P. and Mel, B. (2001), “Impact of active dendrites and structural plasticity on the memory capacity of neural tissue”, Neuron, vol. 29, pp. 779–796, doi: https://doi.org/10.1016/s0896-6273(01)00252-5

Nielsen, M. (2017), “Reduced MNIST: how well can machines learn from small data?”, Cognitive Medium, available at: https://cognitivemedium.com/rmnist

Zhang, L. (2019), “Overfitting problem”, University of Toronto, available at: https://www.cs.toronto.edu/~lczhang/360/lec/w05/overfit.html

Brigato, L. and Iocchi, L. (2020), “A Close Look at Deep Learning with Small Data”, arXiv, arXiv:2003.12843, doi: https://doi.org/10.48550/arXiv.2003.12843

Chen, W., Liu, Y., Kira, Z., Wang, Y. and Huang, J. (2019), “A closer look at few-shot classification”, ICLR 2019, 16 p., available at: https://openreview.net/pdf?id=HkxLXnAcFQ

Snell, J., Swersky, K. and Zemel, R. (2017), “Prototypical Networks for Few-shot Learning”, arXiv, arXiv:1703.05175, doi: https://doi.org/10.48550/arXiv.1703.05175