Tags
objects, reading
Contributor
James Parker
Date
March 4, 2024
Folgezettel
8a
AudioSet is a large dataset created by Google comprising around 2 million 10-second sound clips extracted from user-uploaded YouTube videos. Each clip is labelled by human annotators so that it can be used to train machine learning models, such as neural networks, to recognize and categorize different types of sounds and audio events. The dataset was designed to support automatic audio event recognition systems that can label hundreds or thousands of different sound events in real-world recordings with a time resolution better than one second, much as human listeners recognize and relate the sounds they hear.
Readings:
- Announcing AudioSet: A Dataset for Audio Event Research. (n.d.). Google AI Blog. Retrieved December 8, 2020, from http://ai.googleblog.com/2017/03/announcing-audioset-dataset-for-audio.html
- Ellis, D. (2018, October 4). Recognizing Sound Events. https://jh.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=4a7e392c-5163-41a6-8229-aadc01099e63
- Gemmeke, J. F., Ellis, D. P., Freedman, D., Jansen, A., Lawrence, W., Moore, R. C., Plakal, M., & Ritter, M. (2017). Audio Set: An ontology and human-labeled dataset for audio events. 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 776–780.
Experiment
Dataset Audit: a score