Monaural Audio Speaker Separation with Source Contrastive Estimation

Single-Channel Speaker Separation with Contrastive Learning

Project number
410
Status - Proposal
Proposal
Academic supervisor
Year
2024

Project background:

This project addresses the cocktail party problem: separating multiple speakers who talk simultaneously, given a recording from a single microphone. The proposed algorithm trains a deep learning model that maps the mixture into a vector space in which independent speakers are represented separately. By learning to distinguish between speaker masks with negative sampling, the model learns to separate the speakers. Potential applications include automatic speech recognition.
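
To make the negative sampling idea concrete, here is a minimal sketch of a contrastive loss for a single time-frequency bin. It assumes a word2vec-style logistic objective, in which the bin's embedding is pulled toward the vector of the speaker that dominates the bin and pushed away from a few sampled other speakers. The function names, the toy speaker vectors, and the exact form of the loss are illustrative assumptions and may differ from the formulation in the paper. It is written in plain Python for readability; a PyTorch version would operate on tensors.

```python
import math
import random

def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)).
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def contrastive_loss(bin_embedding, speaker_vectors, true_speaker, num_negatives, rng):
    """Negative-sampling loss for one time-frequency bin (illustrative only)."""
    # Pull the bin embedding toward the speaker that dominates this bin.
    loss = -log_sigmoid(dot(bin_embedding, speaker_vectors[true_speaker]))
    # Push it away from a few randomly sampled other speakers.
    others = [s for s in speaker_vectors if s != true_speaker]
    for s in rng.sample(others, min(num_negatives, len(others))):
        loss -= log_sigmoid(-dot(bin_embedding, speaker_vectors[s]))
    return loss

# Hypothetical 2-dimensional speaker vectors, chosen only for this example.
speakers = {"spk_a": [1.0, 0.0], "spk_b": [0.0, 1.0], "spk_c": [-1.0, 0.0]}
rng = random.Random(0)

# The same embedding scored against a matching and a non-matching label.
aligned = contrastive_loss([2.0, 0.0], speakers, "spk_a", 2, rng)
misaligned = contrastive_loss([2.0, 0.0], speakers, "spk_c", 2, rng)
print(aligned < misaligned)
```

The loss is lower when the embedding points toward its labeled speaker, which is the signal a training procedure of this kind would use to shape the vector space.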

Project goal:

The goal of this project is to develop an algorithm that addresses the "cocktail party problem" by separating multiple speakers who are speaking simultaneously, using only a single microphone.

Project scope:

  1. Watch the Stanford University CS231n (Spring 2017) lectures on YouTube.
  2. Read the paper.
  3. Download the dataset.
  4. Integrate a more advanced model as the central component of the network, replacing the traditional RNN approach.
  5. Train the model.
  6. Achieve satisfactory results.

The project will be implemented in PyTorch.
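
As a rough illustration of the data preparation and masking involved, the sketch below builds a two-speaker mixture from toy magnitude spectrograms, labels each time-frequency bin with the louder speaker (an ideal binary mask), and applies the mask to the mixture. This is a common practice in mask-based separation and is an assumption here, not a confirmed description of the paper's pipeline. The helper names and the numeric values are hypothetical, and plain Python lists stand in for the tensors a PyTorch implementation would use.

```python
def ideal_binary_masks(mag_a, mag_b):
    """Label each time-frequency bin with the louder of two speakers."""
    mask_a = [[1.0 if a >= b else 0.0 for a, b in zip(row_a, row_b)]
              for row_a, row_b in zip(mag_a, mag_b)]
    mask_b = [[1.0 - m for m in row] for row in mask_a]
    return mask_a, mask_b

def apply_mask(mixture, mask):
    """Keep only the mixture bins that the mask assigns to one speaker."""
    return [[x * m for x, m in zip(row_x, row_m)]
            for row_x, row_m in zip(mixture, mask)]

# Toy magnitude spectrograms (frames x frequency bins); values are made up.
speaker_a = [[0.9, 0.1], [0.2, 0.8]]
speaker_b = [[0.1, 0.7], [0.6, 0.1]]

# Simplifying assumption: the mixture magnitude is the sum of the sources.
mixture = [[a + b for a, b in zip(row_a, row_b)]
           for row_a, row_b in zip(speaker_a, speaker_b)]

mask_a, mask_b = ideal_binary_masks(speaker_a, speaker_b)
estimate_a = apply_mask(mixture, mask_a)
print(mask_a)
```

During training, masks of this kind could serve as the per-bin speaker labels. At inference time the model would have to predict the assignment itself, since the clean sources are not available.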

Prerequisite courses:

Deep Learning, Python, and PyTorch

Additional requirements:

Watching related videos on YouTube

Sources:

We will implement the following paper:
Monaural Audio Speaker Separation with Source Contrastive Estimation

Last updated: 31/07/2023