Papers
arxiv:2303.15703

AD-YOLO: You Look Only Once in Training Multiple Sound Event Localization and Detection

Published on Mar 28, 2023
Authors:
,
,

Abstract

AD-YOLO, an adaptation of YOLO for sound event localization and detection, improves performance and robustness in polyphony environments by assigning class responsibility to DOA predictions.

Sound event localization and detection (SELD) combines the identification of sound events with the corresponding directions of arrival (DOA). Recently, event-oriented track output formats have been adopted to solve this problem; however, they still have limited generalization toward real-world problems in an unknown polyphony environment. To address the issue, we proposed an angular-distance-based multiple SELD (AD-YOLO), which is an adaptation of the "You Only Look Once" algorithm for SELD. The AD-YOLO format allows the model to learn sound occurrences location-sensitively by assigning class responsibility to DOA predictions. Hence, the format enables the model to handle the polyphony problem, regardless of the number of sound overlaps. We evaluated AD-YOLO on DCASE 2020-2022 challenge Task 3 datasets using four SELD objective metrics. The experimental results show that AD-YOLO achieved outstanding performance overall and also accomplished robustness in class-homogeneous polyphony environments.

Community

Sign up or log in to comment

Get this paper in your agent:

hf papers read 2303.15703
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2303.15703 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2303.15703 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2303.15703 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.