Summary

Workshop on Circuits, Systems and Information Technology

2014

Session Number:S2

Number:A1

Real-Time Voice Activity Detection using a Simple Webcam

Radu-Laurentiu Vieriu

Publication Date:2014/7/2

Online ISSN:2188-5079

DOI:10.34385/proc.20.A1

Summary:
Speech activity is an important behavioral cue when analyzing people in a social context or when they interact with a computer. In social interactions, such as a simple meeting scenario, speech activity becomes a powerful feature that offers valuable insight into not only individual profiles but also the social profile of the entire group. This paper proposes a simple and fast method for Visual Voice Activity Detection (V-VAD) using statistical features extracted from consecutive video frames. We build our classifier on top of a state-of-the-art facial landmark detector [1], which is used to isolate the mouth region. Experiments conducted on a large annotated corpus validate the proposed approach and the feasibility of detecting speech using only visual information.
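
The summary does not state which statistics or which classifier the paper uses, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the mouth region has already been cropped from each frame (for example by a landmark detector) and is given as small grayscale arrays. The function names, the choice of frame-difference statistics, and the fixed threshold standing in for a trained classifier are all hypothetical.

```python
from statistics import mean, pstdev


def mouth_motion_features(frames):
    """Return (mean, std) of per-pair mean absolute differences.

    `frames` is a list of equally sized 2-D grayscale mouth crops,
    each a list of rows of pixel intensities (hypothetical input format).
    """
    if len(frames) < 2:
        raise ValueError("need at least two consecutive frames")
    diffs = []
    for prev, curr in zip(frames, frames[1:]):
        # Sum of absolute intensity changes between two consecutive crops.
        total = sum(
            abs(c - p)
            for row_p, row_c in zip(prev, curr)
            for p, c in zip(row_p, row_c)
        )
        n_pixels = len(curr) * len(curr[0])
        diffs.append(total / n_pixels)
    return mean(diffs), pstdev(diffs)


def is_speaking(frames, threshold=2.0):
    """Toy decision rule: a fixed threshold on average mouth motion.

    The threshold value is an illustrative assumption; a real system
    would learn a classifier from annotated data.
    """
    avg_motion, _ = mouth_motion_features(frames)
    return avg_motion > threshold


# Usage with tiny synthetic 2x2 crops (illustrative data only).
still = [[[10, 10], [10, 10]]] * 4
moving = [
    [[10, 10], [10, 10]],
    [[20, 5], [10, 30]],
    [[10, 10], [10, 10]],
    [[25, 0], [5, 30]],
]
print(is_speaking(still), is_speaking(moving))
```

A window of unchanged crops yields zero motion and is labelled as silence, while the changing crops exceed the threshold. In practice the window length, the features, and the decision rule would all need to be chosen and validated on annotated video.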