James Parker
March 4, 2024
AudioSet is a large dataset created by Google, comprising around 2 million 10-second sound clips extracted from user-uploaded YouTube videos. Each clip is labelled by human annotators so that it can be used to train machine learning models, such as neural networks, to recognize and categorize different types of sounds and audio events. The dataset was designed with the goal of creating automatic audio event recognition systems that can label hundreds or thousands of different sound events in real-world recordings with a time resolution better than one second, similar to how human listeners can recognize and relate the sounds they hear.
- Announcing AudioSet: A Dataset for Audio Event Research. (n.d.). Google AI Blog. Retrieved December 8, 2020, from http://ai.googleblog.com/2017/03/announcing-audioset-dataset-for-audio.html
- Ellis, D. (2018, October 4). Recognizing Sound Events. https://jh.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=4a7e392c-5163-41a6-8229-aadc01099e63
- Gemmeke, J. F., Ellis, D. P., Freedman, D., Jansen, A., Lawrence, W., Moore, R. C., Plakal, M., & Ritter, M. (2017). Audio Set: An ontology and human-labeled dataset for audio events. 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 776–780.
Dataset Audit: a score