Momentums in Deep Learning Optimization
Project background:
A key ingredient of deep learning is (stochastic and nonconvex) optimization. Most popular optimizers include a momentum term. Yet there are two common, somewhat contradictory approaches to constructing this term: one can be understood as interpolation, the other as extrapolation.
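The description above does not name the two constructions. As a minimal sketch, assuming that "interpolation" refers to an exponential-moving-average buffer (a convex combination of the old buffer and the new gradient) and "extrapolation" to a Nesterov-style look-ahead step, the contrast might look as follows. The function names, the step size `lr`, the coefficient `beta`, and the toy quadratic objective are all illustrative assumptions, not part of the project definition.

```python
# Sketch only: one possible reading of the two momentum constructions.
# All names and hyperparameters here are hypothetical.

def ema_momentum_step(x, m, grad, lr=0.1, beta=0.9):
    """Interpolation-style update (assumed reading): the buffer is a
    convex combination of the previous buffer and the current gradient."""
    m = beta * m + (1.0 - beta) * grad(x)
    return x - lr * m, m


def nesterov_momentum_step(x, v, grad, lr=0.1, beta=0.9):
    """Extrapolation-style update (assumed reading): the gradient is
    evaluated at a look-ahead point beyond the current iterate."""
    lookahead = x + beta * v
    v = beta * v - lr * grad(lookahead)
    return x + v, v


def run(step, x0=5.0, steps=500):
    """Minimize the toy objective f(x) = 0.5 * x**2 with the given step rule."""
    grad = lambda x: x  # gradient of 0.5 * x**2
    x, buf = x0, 0.0
    for _ in range(steps):
        x, buf = step(x, buf, grad)
    return x


x_ema = run(ema_momentum_step)
x_nag = run(nesterov_momentum_step)
print(x_ema, x_nag)
```

On this one-dimensional quadratic both rules converge toward the minimizer at zero; the differences the project is concerned with would only show up on stochastic, nonconvex problems.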
Project goal:
The goal of the project is to provide insights (at least at an empirical level) into the pros and cons of each approach, and to determine whether one approach dominates the other or whether there is room for an intermediate approach.
Project scope:
Extensive experimentation on common deep neural networks and optimizers with different momentum approaches, with the aim of gaining insights into the pros and cons of each approach and whether one approach dominates the other. The project may also lead to developing a new, successful optimizer for deep learning and/or establishing theoretical reasoning (in a simplified setting) for the empirical observations.
Prerequisite courses:
Introduction to Machine Learning, Optimization, and a high level of proficiency in linear algebra and multivariable calculus.
References:
Last updated: 31/07/2023