Ircam-Centre Pompidou

Document type	Journal article
Title	On the use of a spatial cue as prior information for stereo sound source separation based on spatially weighted non-negative tensor factorization
Lead author	Yuki Mitsufuji
Co-author	Axel Roebel
Published in	EURASIP Journal on Advances in Signal Processing, March 2014, Vol. 2014, No. 1
Peer-reviewed	Yes
Pages	pp. 1-9
Year	2014
Editorial status	Published
Abstract	This paper proposes a new method to enhance the performance of non-negative tensor factorization (NTF), one of the most prevalent source separation techniques today. The enhancement is mainly achieved by introducing weights on bin-wise NTF cost functions, which differentiate NTF target components from other components so that the target is approximated more precisely than the others. Assuming sources are distributed sparsely in a 2-D sound field, the target components approximating a target source are exclusively selected by a user, or from accompanying images, by providing a spatial cue to the NTF framework. The spatial cue is given in a format similar to the well-known binaural feature, inter-channel level difference (IID). This helps incorporate the spatial cue into the system, since features in the same format can easily be calculated from every spectrogram bin. The weighting functions are designed taking into account the distance between the spatial cue and the calculated features. Namely, the largest values are assigned to the spectrogram bins where the features are most similar to the spatial cue, and the value decreases in proportion to the distance between them. The method is evaluated in terms of separation quality, comparing the proposed algorithm to the conventional NTF technique, PARAFAC-NTF, as well as to other source separation techniques. The evaluation results, measured by the metrics signal-to-distortion ratio (SDR), signal-to-interference ratio (SIR), and signal-to-artifact ratio (SAR), demonstrate the effectiveness of the new method, improved primarily by the weighting function and the initialization based on IID, while also demonstrating a decrease in computational cost, a significant problem with NTF.
Keywords	non-negative tensor factorization / sound source separation / spatial cue
Team	Sound analysis and synthesis (Analyse et synthèse sonores)
Call number	Mitsufuji14a
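
The weighting idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the dB form of the level difference, the linear fall-off width, the weight floor, and the squared-error cost are all assumptions chosen for illustration. The abstract only states that the weight is largest where the bin feature matches the spatial cue and decreases with the distance between them.

```python
import math


def level_difference_db(left_mag, right_mag, eps=1e-12):
    """Inter-channel level difference of one spectrogram bin, in dB.

    Assumed feature format; eps avoids division by zero and log of zero.
    """
    return 20.0 * math.log10((left_mag + eps) / (right_mag + eps))


def bin_weight(feature_db, cue_db, width_db=12.0, floor=0.1):
    """Weight of one bin: 1.0 at the cue, falling linearly with distance.

    width_db and floor are hypothetical tuning parameters.
    """
    distance = abs(feature_db - cue_db)
    return max(floor, 1.0 - distance / width_db)


def weighted_cost(target, approx, weights):
    """Bin-wise weighted squared error (the cost type is an assumption)."""
    return sum(w * (t - a) ** 2 for t, a, w in zip(target, approx, weights))


if __name__ == "__main__":
    # Toy stereo magnitudes for three bins, with a cue at 0 dB (centre).
    left = [1.0, 2.0, 0.1]
    right = [1.0, 1.0, 1.0]
    features = [level_difference_db(l, r) for l, r in zip(left, right)]
    weights = [bin_weight(f, cue_db=0.0) for f in features]
    print(weights)
```

Bins whose level difference lies close to the cue dominate the cost, so a factorization that minimizes it is pushed to approximate the target source more accurately than the rest of the mixture.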

© Ircam - Centre Pompidou 2005.