Abstract |
This article proposes a method for fast, automatic retrieval of factors of audio content from a large audio database, based on a user's audio query. Unlike most existing systems, the proposed method explicitly considers the temporal morphology of audio content. The work touches upon several existing approaches and technologies for sound manipulation: sound texture synthesis and music and audio mosaicing on the synthesis side, and audio matching, query by audio, and audio structure discovery on the analysis side. Intended for creative applications, the method is modular and allows interactive choice of search criteria. The analysis side of the proposed model features a new audio structure discovery algorithm called Audio Oracle, which describes the temporal morphologies of the underlying sound as a compact state-space model. The search engine, the main focus of this article, features a fast and novel algorithm based on dynamic programming, called Guidage, that is capable of reassembling the query audio by concatenating subclips of target audio files. Demonstrated results suggest a degree of semantically driven control for query-guided applications. The article concludes with examples of two immediate applications of audio matching using Guidage on music, speech, and natural sounds, and with a discussion of further development and use of such methods in interactive and creative environments. |