Momentums in Deep Learning Optimization

Project number: 401
Status: Proposal
Academic supervisor:
Year: 2024

Project background:

A key ingredient of deep learning is (stochastic and nonconvex) optimization. Most popular optimizers include a momentum term. Yet, there are two common approaches to constructing a momentum term that are somewhat contradictory: one can be understood as interpolation and the other as extrapolation.
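
To make the two constructions concrete, below is a minimal NumPy sketch contrasting an "interpolation"-style update, in which the step direction is an exponential moving average (a convex combination) of the previous momentum and the current gradient, with an "extrapolation"-style (Nesterov-like) update, in which the gradient is evaluated at a look-ahead point. Whether these are exactly the two variants the proposal has in mind is an assumption, and the function names and hyperparameters are illustrative only.

    import numpy as np

    def interpolation_step(theta, m, grad_fn, lr=0.1, beta=0.9):
        # "Interpolation" momentum: the new momentum is a convex combination
        # (exponential moving average) of the previous momentum and the
        # current gradient, as in Adam's first-moment estimate.
        g = grad_fn(theta)
        m = beta * m + (1.0 - beta) * g
        return theta - lr * m, m

    def extrapolation_step(theta, v, grad_fn, lr=0.1, mu=0.9):
        # "Extrapolation" (Nesterov-style) momentum: the gradient is evaluated
        # at the look-ahead point theta + mu * v, i.e. the iterate extrapolated
        # along the current velocity.
        g = grad_fn(theta + mu * v)
        v = mu * v - lr * g
        return theta + v, v

    # Toy usage: minimize f(x) = ||x||^2 with both variants.
    grad = lambda x: 2.0 * x
    theta_i, m = np.ones(3), np.zeros(3)
    theta_e, v = np.ones(3), np.zeros(3)
    for _ in range(100):
        theta_i, m = interpolation_step(theta_i, m, grad)
        theta_e, v = extrapolation_step(theta_e, v, grad)
    print(np.linalg.norm(theta_i), np.linalg.norm(theta_e))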

Project goal:

The goal of the project is to provide insights (at least at an empirical level) into the pros and cons of each approach, whether one approach dominates the other, and whether there is room for an intermediate approach.

Project scope:

Extensive experimentation with common deep neural networks and optimizers using the different momentum approaches; gaining insight into the pros and cons of each approach and whether one dominates the other; and potentially developing a new, successful optimizer for deep learning and/or establishing theoretical reasoning (in a simplified setting) for the empirical observations.
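
As a rough, non-authoritative illustration of such an experimental comparison, the PyTorch sketch below trains the same small network with a few momentum variants and reports the final training loss. The toy data, architecture, hyperparameters, and choice of optimizers are placeholders; an actual study would use standard benchmarks and track test metrics as well.

    import torch
    import torch.nn as nn

    def make_model():
        # Small placeholder network; a real study would use standard architectures.
        return nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 1))

    def train(make_optimizer, steps=500, seed=0):
        torch.manual_seed(seed)
        model = make_model()
        opt = make_optimizer(model.parameters())
        X, y = torch.randn(256, 20), torch.randn(256, 1)  # toy regression data
        loss_fn = nn.MSELoss()
        for _ in range(steps):
            opt.zero_grad()
            loss = loss_fn(model(X), y)
            loss.backward()
            opt.step()
        return loss.item()

    # Momentum variants to compare (illustrative choices).
    variants = {
        "heavy-ball SGD": lambda p: torch.optim.SGD(p, lr=1e-2, momentum=0.9),
        "Nesterov SGD": lambda p: torch.optim.SGD(p, lr=1e-2, momentum=0.9, nesterov=True),
        "Adam (EMA momentum)": lambda p: torch.optim.Adam(p, lr=1e-3),
    }
    for name, make_opt in variants.items():
        print(name, train(make_opt))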

Prerequisite courses:

Introduction to Machine Learning, Optimization, and a strong command of linear algebra and multivariable calculus.

Sources:

Sutskever, Martens, Dahl, and Hinton, "On the importance of initialization and momentum in deep learning," ICML 2013: https://proceedings.mlr.press/v28/sutskever13.html

Last updated: 31/07/2023