A CRNN System for Sound Event Detection Based on Gastrointestinal Sound Dataset Collected by Wearable Auscultation Devices

In this article, we set up a novel audio dataset named Gastrointestinal (GI) Sound Set which includes 6 kinds of body sounds Bowel sound, Speech, Snore, Cough, Groan, and Rub. We do sound event detection (SED) based on it, and can accurately detect 6 types of sound events. First, the GI Sound Set is...

Full description

Bibliographic Details
Main Authors: Xue Zheng, Chun Zhang, Ping Chen, Kang Zhao, Hanjun Jiang, Zhiwei Jiang, Huafeng Pan, Zhihua Wang, Wen Jia
Format: Article
Language:English
Published: IEEE 2020-01-01
Series:IEEE Access
Subjects:
Online Access:https://ieeexplore.ieee.org/document/9179762/
Description
Summary:In this article, we set up a novel audio dataset named Gastrointestinal (GI) Sound Set which includes 6 kinds of body sounds Bowel sound, Speech, Snore, Cough, Groan, and Rub. We do sound event detection (SED) based on it, and can accurately detect 6 types of sound events. First, the GI Sound Set is collected by wearable auscultation devices. To ensure generalization, patients from five different hospital departments are recruited for data collection, along with a group of healthy subjects. GI Sound Set refers to Google AudioSet in data format but varies in audio length and sampling rate. Second, we extract Mel-filter features from the recordings and investigate the performance of different activation functions and neural network architectures for detecting sound events. We use data augmentation, class balance to deal with the problem of quantitative imbalance between classes on the dataset. We apply multiple instances learning(MIL) to give out not only bag-level results but also frame-level results. In this work, GI Sound Set is the largest body sound dataset to date, and our approach shows state-of-the-art performance with an average score of F1=81.06% evaluated on the test set. Due to its simple network and conventional processing method, our CRNN system has high universality, which can be used in other audio datasets, such as respiratory sound and heart sound.
ISSN:2169-3536