The Goal: A More Efficient Artificial Intelligence
Dr. Ofir Lindenbaum is working on various methods to improve, optimize, and accelerate training and inference processes for foundation models in artificial intelligence. His work earned him the Rector's Award for Breakthrough Scientific Excellence for 2025 in the field of machine learning.
There's no longer any doubt about it: we are in the midst of a major technological revolution, at the center of which lies artificial intelligence. The driving force behind AI, the thing that enables it to make such a tremendous impact on our lives, is machine learning. "As far as I'm concerned, artificial intelligence and machine learning are one and the same. The AI models everyone talks about, from language models like ChatGPT to image models like Midjourney, models that can do incredible things, are all based on machine learning," says Dr. Ofir Lindenbaum, recipient of the Rector's Award for Breakthrough Scientific Excellence for 2025. The award was given to Dr. Lindenbaum for groundbreaking scientific contributions in machine learning that advance the efficient training of foundation models – the large pretrained models on which modern AI systems are built.
Over the past year, Dr. Lindenbaum worked on three different projects focused on improving and optimizing foundation models. "A foundation model is essentially a machine learning model that can perform a huge number of tasks across different domains: language, software, audio processing, graphics, and more. Training and using these models is computationally very expensive, in terms of runtime and memory, requiring enormous resources and massive computers," explains Dr. Lindenbaum. "Across all the projects, our goal was to scale down the models and the memory and computing power required to train them, so that they demand fewer resources without sacrificing performance. This means that even small labs, such as those at universities, will be able to train models of this kind."
Reducing Memory and Accelerating Training Time
Dr. Lindenbaum and his team of four doctoral students and seven master's students worked on machine learning methods not tied to any single domain, focusing on language models and image models. "These are the largest and most widespread models today, and they stand to benefit the most from the methods we wanted to develop," says Dr. Lindenbaum.
They tackled the goal from multiple angles. "In one project, we exploited the fact that there is significant redundancy in the model's parameters, far more parameters than are actually needed. We therefore projected the search algorithm into a lower-dimensional space, and leveraged that lower space to reduce memory usage."
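To make the idea of projecting into a lower-dimensional space concrete, here is a minimal numpy sketch of one common way to do it. This is an illustration of the general technique, not Dr. Lindenbaum's actual algorithm; the matrix sizes and the chosen rank are arbitrary, and the projection via the gradient's top singular directions is an assumption for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 64, 4                 # full layer shape vs. a small chosen rank

grad = rng.standard_normal((d_out, d_in))  # a full-size gradient matrix

# Project onto the top-r left singular directions of the gradient.
U, _, _ = np.linalg.svd(grad, full_matrices=False)
P = U[:, :r]                               # d_out x r projection basis
low_rank_grad = P.T @ grad                 # r x d_in: what the optimizer would store

# Optimizer state now lives in the small space: r*d_in numbers instead of
# d_out*d_in, a 16x memory reduction in this toy setting.
full_update = P @ low_rank_grad            # project back to apply the weight update

print(low_rank_grad.shape)                 # (4, 64)
print(grad.size / low_rank_grad.size)      # 16.0
```

The point of the sketch is that the expensive per-parameter bookkeeping happens in the small `r x d_in` space, and only a cheap projection maps the result back to the full parameter space.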
In another project, Dr. Lindenbaum and his team focused on accelerating the training time of large models. A network comprises a vast space of parameters, an incredible number of values that need to be updated with each learning step. Rather than computing and updating all of them in full, they computed the update directions (gradients) within a smaller subspace containing fewer dimensions, requiring less memory. In the second stage, instead of applying these values as-is, they rebalanced their significance so that each direction in the space would carry similar weight. "With standard methods, you see that some parameters have a very strong influence while others have much less, but these differences don't always truly reflect what matters for learning. By balancing significance across directions and preventing a small subset from dominating the process, you allow optimization to advance faster and more efficiently, thereby shortening training time without hurting performance."
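The "balancing" step described above can be sketched in a few lines. This is a hypothetical illustration of the general idea, not the team's method: each update direction in the subspace is rescaled to the same magnitude so that no single direction dominates the optimization step.

```python
import numpy as np

rng = np.random.default_rng(1)
r, d_in = 4, 64
subspace_grad = rng.standard_normal((r, d_in))  # update directions in the small subspace
subspace_grad[0] *= 100.0                       # make one direction artificially dominant

# Rescale every direction to unit norm so each carries similar weight.
norms = np.linalg.norm(subspace_grad, axis=1, keepdims=True)
balanced = subspace_grad / (norms + 1e-8)

print(np.linalg.norm(balanced, axis=1))         # all rows now have (near-)unit norm
```

After rescaling, the formerly dominant direction no longer drowns out the others, which is the effect the quote attributes to faster, more efficient optimization.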
The Algorithm That Will Optimize the Entire Network
For their third project, Dr. Lindenbaum and his team focused on shrinking the network. "We came to a surprising result: we found that if you take a network trained on a general task, for example, a language model trained on a vast amount of text, you can find within it smaller sub-networks that can be better suited to specific tasks. If, for instance, I have a language model that is trained on a large amount of text and I want it to only answer medical questions, we can find within the large network a smaller, task-focused network, discard the rest of the network, and wind up with something faster, more efficient, and far less memory-intensive."
This optimization, according to Dr. Lindenbaum, is particularly effective because it requires no changes to the original network besides the removal of unnecessary parts. "The biggest innovation here is the search algorithm that finds the specific sub-network within the larger network. Once we defined the parameters and carved out the sub-network, we ended up with a network that is actually superior to the original because it is more focused, task-specific, consumes fewer resources in terms of memory and compute, and maintains performance."
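A toy version of carving out a sub-network can be written as a masking operation. This sketch is purely illustrative and is not the search algorithm the article describes: it scores weights by simple magnitude, where the real method would score them by relevance to the target task.

```python
import numpy as np

rng = np.random.default_rng(2)
W = rng.standard_normal((128, 128))        # weights of a trained "general" layer
keep_fraction = 0.25                       # keep only the top 25% of weights

# Keep the weights above the 75th-percentile magnitude; discard the rest.
threshold = np.quantile(np.abs(W), 1 - keep_fraction)
mask = np.abs(W) >= threshold              # the discovered sub-network
W_sub = W * mask                           # everything outside it is zeroed out

print(mask.mean())                         # ~0.25: a quarter of the original weights remain
```

Once the mask is fixed, the zeroed weights never need to be stored or multiplied, which is where the memory and compute savings come from.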
New Avenues for Learning
Dr. Lindenbaum presented his work on optimization for foundation model training over the past year at several prestigious venues dedicated to machine learning and AI, including NeurIPS, ICML, ICLR, and TMLR.
Alongside his optimization methods, Dr. Lindenbaum developed innovative approaches for unsupervised learning, feature extraction, and representation learning for scientific tabular data — providing tools to handle high dimensionality, noise, and heterogeneity, and enabling the discovery of hidden structures in complex data. "These contributions broaden the theoretical and algorithmic foundations of modern machine learning and open new avenues for learning from scientific data," he concludes.
Want to know more about Dr. Lindenbaum's research? Watch his lecture from GenML2025.
Last Updated Date: 04/03/2026