Monaural Audio Speaker Separation with Source Contrastive Estimation
Single-Channel Speaker Separation with Contrastive Learning
Project background:
This project addresses the cocktail party problem: separating multiple speakers who talk simultaneously, using a recording from a single microphone. The proposed algorithm uses a deep learning model that operates in a vector space representing independent speakers. The model learns to separate speakers by distinguishing between speaker masks and by using negative sampling. Potential applications include automatic speech recognition.
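The negative-sampling idea above can be illustrated with a small loss function. This is a minimal sketch, assuming a word2vec-style objective in which each time-frequency bin and each speaker has an embedding vector, and a label of +1 or -1 says whether that speaker dominates the bin. The names `contrastive_loss`, `bin_emb`, `speaker_emb`, and `labels` are hypothetical, and the exact formulation should be checked against the paper. NumPy is used here only for brevity; the project itself targets PyTorch.

```python
import numpy as np

def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)) = -log(1 + exp(-x)).
    return -np.logaddexp(0.0, -x)

def contrastive_loss(bin_emb, speaker_emb, labels):
    """Negative-sampling style loss (illustrative assumption, not the paper's code).

    bin_emb:     (n_bins, dim) embeddings of time-frequency bins
    speaker_emb: (n_speakers, dim) embeddings of speakers
    labels:      (n_bins, n_speakers), +1 where the speaker dominates the bin,
                 -1 for the contrasting (negative) speakers
    """
    scores = bin_emb @ speaker_emb.T          # (n_bins, n_speakers) dot products
    return -np.mean(log_sigmoid(labels * scores))

# Toy usage: two bins, two speakers, embeddings aligned with the labels.
bins = np.array([[5.0, 0.0], [0.0, 5.0]])
speakers = np.eye(2)
labels = np.array([[1, -1], [-1, 1]])
print(contrastive_loss(bins, speakers, labels))
```

Under this assumption the loss falls when a bin's embedding points toward its dominant speaker and away from the others, which is what lets the speaker masks be recovered from the embeddings.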
Project goal:
The goal of this project is to develop an algorithm that solves the cocktail party problem by separating multiple simultaneous speakers using only a single microphone.
Project scope:
- Watch the lectures on YouTube: Stanford University CS231n, Spring 2017.
- Read the paper.
- Download the dataset.
- Integrate a more advanced model as the central component of the network, replacing the traditional RNN approach.
- Train the model.
- Achieve satisfactory results.
The project will be implemented in PyTorch.
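As an illustration of the data preparation that the dataset and training steps imply, the following sketch builds a single-channel mixture from two sources and marks which source dominates each time-frequency bin. It is a rough sketch under stated assumptions: the dataset is not specified here, so two synthetic tones stand in for speakers, and the helper names `stft_mag` and `make_mixture`, the frame length, and the hop size are all hypothetical choices. NumPy is used for brevity; a PyTorch version would follow the same steps.

```python
import numpy as np

def stft_mag(signal, frame_len=256, hop=128):
    # Magnitude spectrogram from Hann-windowed, overlapping frames.
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))   # (n_frames, frame_len // 2 + 1)

def make_mixture(s1, s2):
    # Trim both sources to a common length and sum them: one "microphone".
    n = min(len(s1), len(s2))
    s1, s2 = s1[:n], s2[:n]
    mixture = s1 + s2
    # Boolean mask: True where source 1 has more energy than source 2.
    mask1 = stft_mag(s1) > stft_mag(s2)
    return mixture, mask1

# Placeholder "speakers": two pure tones sampled at 8 kHz (assumption).
sr = 8000
t = np.arange(sr) / sr
s1 = np.sin(2 * np.pi * 500 * t)
s2 = np.sin(2 * np.pi * 2000 * t)
mixture, mask1 = make_mixture(s1, s2)
print(mixture.shape, mask1.shape)
```

The dominance mask is the kind of per-bin label a contrastive objective can train against; with real speech the two sources would come from the downloaded dataset instead of tones.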
Prerequisite courses:
Deep Learning, Python, and PyTorch
Additional requirements:
Watching related videos on YouTube
References:
We will implement the following paper:
Monaural Audio Speaker Separation with Source Contrastive Estimation
Last updated: 31/07/2023