International Telecommunication Union   ITU
   
ITU-T Study Group 16 (Study Period 2005-2008)
Question 8/16 - Generic Sound Activity Detection
(New Question)

  1. Motivation

    Voice Activity Detection (VAD) is widely used in telecommunications networks as a means of differentiating between wanted and unwanted in-band audio signals, for example, to obtain trunking efficiency in circuit multiplication equipment or to ensure correct operation of echo control and other signal enhancement devices.

    The proposal for Generic Sound Activity Detection (GSAD) is motivated by two problems:
    1. With rapid changes in the telecommunication network environment, more and more multimedia services are being provided. Although the network is evolving from a voice network to a multimedia network, most VAD algorithms are still designed mainly to handle voice signals and cannot work properly in the presence of rich audio signals, which include voice, music, background environmental noise, information tones, etc.
    2. Historically, VAD algorithms have been developed separately for individual network elements and applications, and there are currently numerous VAD algorithms. However, they are based on different principles, which makes it difficult to provide common performance enhancements across all VADs.
    Therefore it is beneficial to develop a Generic Sound (rather than Voice) Activity Detector, which can be applied across a range of applications. The benefits from a standardised GSAD are:
    • Enhanced performance to deal with new types of in-band audio signals
    • Reduced development time and cost for new equipment requiring sound activity detection, e.g. codecs, circuit multiplication equipment, echo control, signal enhancement devices, VoIP gateways, terminal adapters, etc.
    • Opportunity for use in existing speech and audio coders which do not include VAD


  2. Study Items

    Study items to be considered include, but are not limited to:
    • Definition and classification of applications and associated performance requirements for generic sound activity detection;
    • Definition of algorithm(s) suitable for generic sound activity detection meeting the applications and performance requirements;
    • Definition of the test conditions and evaluation procedures to be applied in selecting between candidate algorithms on the basis of objective and subjective performance, in conjunction with SG 12;
    • Selection and specification of procedures to be used in verifying the implementation of the selected algorithm or algorithms.


  3. Tasks

    Tasks include, but are not limited to:
    • Develop Terms of Reference for GSAD algorithms for different applications;
    • In conjunction with SG 12, develop new Recommendations on testing methodologies;
    • Solicit proposals and conduct selection test(s) for candidate algorithms to meet these Terms of Reference;
    • Develop new Recommendation(s) based on the outcome of the(se) selection test(s).


  4. Relationships

    Recommendations:
    • G.700-series speech and audio coding Recommendations;
    • G.76X-series circuit multiplication Recommendations;
    • G.799.X-series voice over IP gateway Recommendations;
    • G.16X-series speech enhancement Recommendations.

    Questions:
    • Q9, 10, 23/16 on speech coding;
    • Q11, 15, 16, 17, 18/16 on network signal processing.

    Study Groups:
    • ITU-T SG 2 to identify other potential user applications;
    • ITU-T SG 9 on applications within home networking;
    • ITU-T SG 11 on signalling requirements and protocols;
    • ITU-T SG 12 on voice quality evaluation of specified algorithms;
    • ITU-T SG 13 on NGN;
    • ITU-T SG 19 on speech coding in IMT-2000 and beyond;
    • ITU-R SG 8 to ensure compatibility with mobile transmission system constraints.

    Other bodies:
    • ETSI (TISPAN);
    • TIA;
    • IETF;
    • 3GPP, 3GPP2.


Copyright © ITU 2008 All Rights Reserved
Contact for this page : TSB EDH
Updated : 2008-11-03