Résumé |
This master's thesis is dedicated to incremental multi-source recognition using non-negative matrix factorization. A particular attention is paid to providing a mathematical framework for sparse coding schemes in this context. The applications of non-negative matrix factorization problems to sound recognition are discussed to give the outlines, positions and contributions of the present work with respect to the literature. The problem of incremental recognition is addressed within the framework of non-negative decomposition, a modified non-negative matrix factorization scheme where the incoming signal is projected onto a basis of templates learned off-line prior to the decomposition. As it appears that sparsity is one of the main issue in this context, a theoretical approach is followed to overcome the problem. The main contribution of the present work is in the formulation of a sparse non-negative matrix factorization framework. This formulation is motivated and illustrated with a synthetic experiment, and then addressed with convex optimization techniques such as gradient optimization, convex quadratic programming and second-order cone programming. Several algorithms are proposed to address the question of sparsity. To provide results and validations, some of these algorithms are applied to preliminary evaluations, notably that of incremental multiple-pitch and multiple-instrument recognition, and that of incremental analysis of complex auditory scenes. |