How to play a dataset

Date: 2025

Readme.txt

In this session, we will attempt to play some datasets together using Konvolute, a software instrument we’ve built. You can use this instrument in a few different ways:

  • to play datasets back like a record or a movie.
  • to search or investigate them like a detective.
  • but also to manipulate, compose, and perform them live like a piano, sampler, or analogue synthesiser.

Who are we?

Machine Listening is a platform for collaborative research and artistic experimentation, founded in 2020 by Sean Dockray, James Parker, and Joel Stern. We work across writing, installation, performance, software, curation, pedagogy, and radio. A lot of this work has involved thinking with and about datasets, including various experiments in dataset critique. Most recently:

55 Falls / Ambient Assisted Living (2025)

#C (2025)

Here is a dataset (2025)

What is a dataset?

A dataset is never just a collection of files. It is:

  • primary data (labels, recordings, measurements)
  • metadata describing how/when it was gathered
  • the code that processes it
  • the papers that cite it
  • the spreadsheets that organise it
  • the communities who interpret and repurpose it.

What is a dataset audit?

Every dataset undergoes some kind of auditing process. Sometimes this is more technically oriented (‘cleaning’, ‘augmentation’), sometimes more political (‘debiasing’, ‘bias mitigation’). But most of it is done in-house, by the computer scientists and engineers involved in producing the dataset, with little regulatory oversight or public scrutiny. As a result, dataset audits tend to be self-serving, as the high-profile firings of various whistleblowers and internal critics (e.g. Timnit Gebru) attest.

Computer scientists and engineers

Dulhanty and Wong - 2019 - Auditing ImageNet Towards a Model-driven Framework for Annotating Demographic Attributes of Large-S
Huang et al. - 2023 - A Dataset Auditing Method for Collaboratively Trained Machine Learning Models
Godinot et al. - 2024 - Under manipulations, are some AI models harder to
Gerchick et al. - 2025 - Auditing the Audits Lessons for Algorithmic Accountability from Local Law 144's Bias Audits
Lafargue et al. - 2025 - Fairness is in the details Face Dataset Auditing
Shao et al. - 2025 - DATABench Evaluating Dataset Auditing in Deep Learning from an Adversarial Perspective

Regulators

Andres - Auditing the quality of datasets used in algorithmic decision-making systems
European Parliament. Directorate General for Parliamentary Research Services. - 2022 - Auditing the quality of datasets used in algorithmic decision-making systems

What is dataset critique?

We are joining a tradition of artists and technology critics who practise a form of counter-auditing: diversifying and expanding these techniques, and especially widening the range of people who get to practise them. We are interested in developing more critical and inclusive (para-institutional/academic?) forms of dataset auditing, but we do not presume to know in advance what they might be.

Artists and tech critics

What is Konvolute?

Konvolute is custom software that maps datasets and then makes them playable and navigable according to how they sound. The name refers to Walter Benjamin’s Arcades Project, his unfinished collection of notes, fragments, and non-linear writings about the Paris arcades, in which Benjamin uses the term ‘konvolute’ for each unwieldy chapter or collection of materials. But we were also thinking of the convolution in convolutional neural nets, which suggests a more mathematical and systematic way of processing materials.
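For anyone curious about the second reference, here is a minimal sketch of what "convolution" means in the mathematical sense: a small pattern of weights (a kernel) slides along a signal and produces a weighted sum at each position. This is only an illustration of the general operation. It is not a description of how Konvolute works internally, and the function name and toy values are our own assumptions.

```python
def convolve_1d(signal, kernel):
    """Slide `kernel` along `signal`, keeping only positions where it fully overlaps."""
    width = len(kernel)
    # Textbook convolution flips the kernel first; many neural-net
    # libraries skip the flip (that variant is called cross-correlation).
    flipped = kernel[::-1]
    return [
        sum(s * k for s, k in zip(signal[i:i + width], flipped))
        for i in range(len(signal) - width + 1)
    ]


# A toy "recording" with one loud moment, blurred by a small weighting kernel.
signal = [0, 0, 3, 0, 0]
kernel = [1, 2, 1]
blurred = convolve_1d(signal, kernel)
print(blurred)  # [3, 6, 3]
```

The single spike in the toy signal gets spread across its neighbours, which is the sense in which convolution "processes" material systematically: the same small rule is applied at every position.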

Download Konvolute and read the user manual here

What are we going to do today?

Playing a dataset

Collecting a dataset

  • What datasets get collected? How? By whom? Why? What gets left out?
  • Mimi Onuoha, The Library of Missing Datasets
  • Jennifer Walshe, Ireland: A Dataset
  • What belongs in a New Plymouth dataset?

New Plymouth: a dataset

  • collect our own dataset together
  • play it

Plenary

https://textb.org/m/konvolute/