Voice activity detection in noisy conditions using tiny convolutional neural network

The paper investigates the problem of voice activity detection in a noisy audio signal. An extremely compact convolutional neural network is proposed; the model has only 385 trainable parameters. The proposed model does not require many computational resources, which allows it to be used as part of the “...

Bibliographic Details
Main Authors:	R. S. Vashkevich, E. S. Azarov
Format:	Article
Language:	Russian
Published:	The United Institute of Informatics Problems of the National Academy of Sciences of Belarus, 2020-06-01
Series:	Informatika
Subjects:	voice activity detector; harmonic signal; convolutional neural network; pitch; speech processing
Online Access:	https://inf.grid.by/jour/article/view/968

id	doaj-892d054342b7416282cf66bd60992042
record_format	Article
spelling	doaj-892d054342b7416282cf66bd60992042; 2021-07-28T21:07:30Z; rus; The United Institute of Informatics Problems of the National Academy of Sciences of Belarus; Informatika; 1816-0301; 2020-06-01; 17; 2; 36; 43; 10.37661/1816-0301-2020-17-2-36-43; 928; Voice activity detection in noisy conditions using tiny convolutional neural network; R. S. Vashkevich (0); E. S. Azarov (1); Belarusian State University of Informatics and Radioelectronics; Belarusian State University of Informatics and Radioelectronics; The paper investigates the problem of voice activity detection in a noisy audio signal. An extremely compact convolutional neural network is proposed; the model has only 385 trainable parameters. The proposed model requires few computational resources, which allows it to be used on compact low-power devices as part of the “internet of things” concept. At the same time, the model provides state-of-the-art detection accuracy. These properties are achieved by using a special convolutional layer that takes into account the harmonic structure of voiced speech. This layer also eliminates redundancy in the model because it is invariant to changes in fundamental frequency. The model's performance is evaluated under various noise conditions with different signal-to-noise ratios. The results show that the proposed model provides higher accuracy than the voice activity detection model from Google's WebRTC framework.; https://inf.grid.by/jour/article/view/968; voice activity detector; harmonic signal; convolutional neural network; pitch; speech processing
collection	DOAJ
language	Russian
format	Article
sources	DOAJ
author	R. S. Vashkevich; E. S. Azarov
spellingShingle	R. S. Vashkevich; E. S. Azarov; Voice activity detection in noisy conditions using tiny convolutional neural network; Informatika; voice activity detector; harmonic signal; convolutional neural network; pitch; speech processing
author_facet	R. S. Vashkevich; E. S. Azarov
author_sort	R. S. Vashkevich
title	Voice activity detection in noisy conditions using tiny convolutional neural network
title_short	Voice activity detection in noisy conditions using tiny convolutional neural network
title_full	Voice activity detection in noisy conditions using tiny convolutional neural network
title_fullStr	Voice activity detection in noisy conditions using tiny convolutional neural network
title_full_unstemmed	Voice activity detection in noisy conditions using tiny convolutional neural network
title_sort	voice activity detection in noisy conditions using tiny convolutional neural network
publisher	The United Institute of Informatics Problems of the National Academy of Sciences of Belarus
series	Informatika
issn	1816-0301
publishDate	2020-06-01
description	The paper investigates the problem of voice activity detection in a noisy audio signal. An extremely compact convolutional neural network is proposed; the model has only 385 trainable parameters. The proposed model requires few computational resources, which allows it to be used on compact low-power devices as part of the “internet of things” concept. At the same time, the model provides state-of-the-art detection accuracy. These properties are achieved by using a special convolutional layer that takes into account the harmonic structure of voiced speech. This layer also eliminates redundancy in the model because it is invariant to changes in fundamental frequency. The model's performance is evaluated under various noise conditions with different signal-to-noise ratios. The results show that the proposed model provides higher accuracy than the voice activity detection model from Google's WebRTC framework.
topic	voice activity detector; harmonic signal; convolutional neural network; pitch; speech processing
url	https://inf.grid.by/jour/article/view/968
work_keys_str_mv	AT rsvashkevich voiceactivitydetectioninnoisyconditionsusingtinyconvolutionalneuralnetwork AT esazarov voiceactivitydetectioninnoisyconditionsusingtinyconvolutionalneuralnetwork
_version_	1721262790768852992
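
Note on the evaluation setup mentioned in the description: the abstract states that the model is evaluated under noise conditions with different signal-to-noise ratios, but this record does not say how the noisy test signals were produced. The sketch below shows one common way to mix a clean signal with noise at a target SNR; it is an illustration under stated assumptions, not the authors' procedure. The function names (power, mix_at_snr, measured_snr_db), the synthetic 200 Hz harmonic tone standing in for voiced speech, the Gaussian noise, and the SNR values are all hypothetical.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that the speech-to-noise power ratio equals
    `snr_db` decibels, then add it to `speech`.

    Returns the noisy mixture and the scaled noise.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    # SNR_dB = 10 * log10(P_speech / P_noise)  =>  P_noise = P_speech / 10^(SNR_dB / 10)
    target_noise_power = power(speech) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / power(noise))
    scaled = [gain * n for n in noise]
    mixed = [s + n for s, n in zip(speech, scaled)]
    return mixed, scaled


def measured_snr_db(speech, noise):
    """SNR in decibels computed from the two component signals."""
    return 10 * math.log10(power(speech) / power(noise))


# Hypothetical test signals (assumptions, not data from the paper):
# a 200 Hz fundamental plus harmonics at 400 and 600 Hz, sampled at 8 kHz,
# and white Gaussian noise from a seeded generator.
sample_rate = 8000
speech = [
    sum(math.sin(2 * math.pi * 200 * k * t / sample_rate) / k for k in (1, 2, 3))
    for t in range(sample_rate)
]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]

for snr_db in (20, 10, 0, -5):
    mixed, scaled = mix_at_snr(speech, noise, snr_db)
    print(snr_db, round(measured_snr_db(speech, scaled), 6))
```

Scaling the noise rather than the speech keeps the speech level fixed across conditions, so any change in detector accuracy can be attributed to the noise level alone.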
