Optimization & Learning Theory

The mathematics under the hood: optimization methods, convergence guarantees, and the theory that explains when and why deep learning works. Explainers that take proofs seriously without losing the reader.

Why Batch Size Changes What Your Neural Network Learns

Analysis by the aitrendblend editorial team · January 2025 · Machine Learning Research Optimization Feature Learning

Figure: GD (left) settles near a dense interior minimum; SGD with b=1 (right) escapes to a single datapoint on the boundary. From Ghosh et al., JMLR 2025.

Pick any mainstream guide to training neural networks and you will read the same advice:

When Expected Improvement Falls Short and What EIC Does About It

Practical AI Bayesian Optimization · Analysis by the aitrendblend editorial team · Published in JMLR 26 (2025)

Figure: Cumulative regret curves from the EIC paper (Hu et al., JMLR 2025). EIC keeps pace with GP-UCB while closing the gap on traditional EI.

Every machine learning practitioner who has tuned a neural network with Bayesian optimization has silently trusted

Wasserstein Convergence Guarantees for Score-Based Generative Models

Generative Models · Journal of Machine Learning Research 26 (2025) 1–54 · 16 min read

A research team from the Chinese University of Hong Kong and Florida State University has delivered the first unified convergence theory for a broad class of score-based generative models in 2-Wasserstein distance, and it shows that the

How Dommel and Pichler Finally Cracked the Kernel Approximation Problem That Was Holding Machine Learning Back

Statistical Learning · Journal of Machine Learning Research 26 (2025) 1–30 · 18 min read

How Two Researchers from Chemnitz Quietly Fixed One of the Oldest Problems in

Sliced-Wasserstein Distances and Flows on Cartan-Hadamard Manifolds

Optimal Transport · Journal of Machine Learning Research 26 (2025) 1–76 · 18 min read

Measuring Distance Between Distributions on Curved Spaces Just Got a Lot Faster

Bonet, Drumetz, and Courty from ENSAE, IMT Atlantique, and Université Bretagne Sud

Statistical Inference via Sketched StoSQP: Online Second-Order Methods for Constrained Optimization

Optimization Theory · Journal of Machine Learning Research 26 (2025) 1–75 · 20 min read

The Online Inference Problem That Second-Order Methods Finally Solved, Without Projections

Sen Na at Georgia Tech and Michael Mahoney

The ODE Method for Stochastic Approximation with Markovian Noise: Breaking the Deadly Triad in Reinforcement Learning

Reinforcement Learning Theory · Journal of Machine Learning Research 26 (2025) 1–76 · 20 min read

The ODE Method Gets Its Markovian Upgrade, and Reinforcement Learning’s Most Stubborn

Orthogonal Bases for Equivariant Graph Learning with Provable k-WL Expressive Power

Graph Neural Networks · Journal of Machine Learning Research 26 (2025) 1–35 · 18 min read

High-Order GNNs Were Too Expensive, Until These Compact Orthogonal Bases Changed the Game

Jia He and Maggie

Riemannian Bilevel Optimization — When Machine Learning Leaves Flat Space Behind

Machine Learning Theory · Journal of Machine Learning Research 26 (2025) · University of Minnesota & Rice University · 20 min read

Why Machine Learning on Curved Surfaces Is the Next Big Leap, And
