Gradient History Aggregation via Ordered Weighted Averaging for Enhanced Adaptive Optimization
Abstract
We present OWA-Adam, an adaptive optimization algorithm that integrates Ordered Weighted Averaging (OWA) operators into the gradient moment estimation of Adam. Rather than weighting historical gradient information with fixed exponential decay coefficients, OWA-Adam employs rank-based aggregation, assigning importance to prior gradient steps according to their relative magnitude rather than solely their temporal recency. This reformulation allows the optimizer's effective memory to adapt to the local geometry of the loss landscape. Across ten independent trials, OWA-Adam configured with an exponential decay weighting scheme converges 38.6% faster than standard Adam while achieving statistically equivalent final model performance. Significance testing with Welch's t-test and the Mann–Whitney U test confirms that this performance equivalence is robust rather than coincidental. The proposed optimizer offers a principled and practically deployable improvement to the Adam family, with implications for training efficiency in deep learning pipelines.
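To make the core idea concrete, the following is a minimal sketch of rank-based OWA aggregation over a scalar gradient history. The function name `owa_aggregate` and the exponential rank-decay parameter `alpha` are illustrative assumptions, not the paper's exact formulation; the key point is that weights decay over magnitude ranks rather than over time steps.

```python
import numpy as np

def owa_aggregate(gradients, alpha=0.5):
    """Aggregate a gradient history with Ordered Weighted Averaging (OWA).

    History entries are ranked by absolute magnitude (largest first),
    and rank r receives weight proportional to alpha**r -- an exponential
    decay over ranks, not over temporal recency as in standard Adam.
    """
    grads = np.asarray(gradients, dtype=float)
    # Rank the history by absolute magnitude, largest first.
    order = np.argsort(-np.abs(grads))
    ranked = grads[order]
    # Exponential-decay OWA weights over ranks, normalized to sum to 1.
    weights = alpha ** np.arange(len(grads))
    weights = weights / weights.sum()
    return float(np.dot(weights, ranked))
```

With `alpha = 1.0` every rank gets equal weight and the aggregate reduces to a plain mean; smaller `alpha` concentrates weight on the largest-magnitude gradients in the window, regardless of when they occurred.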