Datasets

This chapter briefly reviews datasets for music demixing by summarizing the previous tutorial, where you can find a more detailed introduction and explanation.

Data for Music Demixing

At a high level, the inputs and outputs of a source separation model look like this:

Fig. 1 Inputs and outputs of a source separation model.

MUSDB18: tutorial with 7 sec samples

The MUSDB18 dataset [RLStoter+17] is one of the most widely used datasets for music demixing. For example, its uncompressed version (also known as MUSDB18-HQ [RLS+19]) was the official training dataset for Leaderboard A of the MDX challenge.

This section shows how to play with the musdb package.

First, install the musdb package.

pip install musdb

After the installation, load musdb with download=True. This will download the 7-second sample tracks of MUSDB18.

import musdb
mus = musdb.DB(download=True)

We can use mus as an iterator.

print('number of tracks: {}'.format(len(mus)))
number of tracks: 144

Let us load the first track of the MUSDB18 dataset.

track = mus[0]
print('track name:\t', track)
print('track length:\t {} secs'.format(track.audio.shape[0]//track.rate))
track name:	 A Classic Education - NightOwl
track length:	 6 secs

Let us listen to the mixture (i.e., the input in Fig 1!)

from IPython.display import Audio, display

display(Audio(track.audio.T, rate=track.rate))

Let us listen to the target audio tracks (i.e., the outputs in Fig 1!)

for source in track.sources.keys():
    print('source name: {}'.format(source))
    display(Audio(track.sources[source].audio.T, rate=track.rate))    
source name: vocals
source name: drums
source name: bass
source name: other

Thus, the input and output of the music demixing task on MUSDB18 are:

  • input: track.audio

  • output: {source: track.sources[source].audio for source in ['vocals', 'drums', 'bass', 'other']}
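In MUSDB18, the mixture is the sample-wise sum of the four stems, so the target outputs add back up to the input. Here is a minimal NumPy sketch of that relationship, using synthetic arrays as stand-ins for track.audio and track.sources[source].audio (both of shape samples × channels):

```python
import numpy as np

rng = np.random.default_rng(0)
rate = 44100  # MUSDB18 sample rate

# Synthetic stand-ins for track.sources[source].audio: 1 second of stereo audio.
sources = {name: rng.standard_normal((rate, 2))
           for name in ['vocals', 'drums', 'bass', 'other']}

# The mixture (the model input) is the sum of the stems (the model outputs).
mixture = sum(sources.values())

print(mixture.shape)  # (44100, 2): samples x channels
print(np.allclose(mixture, sources['vocals'] + sources['drums']
                  + sources['bass'] + sources['other']))  # True
```

A demixing model learns the inverse mapping: given the mixture, recover the four stems.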

MUSDB18-HQ: How to use the full version?

You downloaded a sample dataset with 7-second tracks in the tutorial above.

To use the full version:

  1. Download the dataset here.

  2. Unzip musdb18hq.zip to $your_musdb18hq_dir.

  3. Load musdb.DB with is_wav=True and root=$your_musdb18hq_dir.

import musdb
musdb18hq_dir = '/mnt/d/repos/musdb18hq' # <= $your_musdb18hq_dir

mus = musdb.DB(root=musdb18hq_dir, is_wav=True)
print('number of tracks: {}\n'.format(len(mus)))

track = mus[0]
print('track name:\t', track)
print('track length:\t {} secs'.format(track.audio.shape[0]//track.rate))
number of tracks: 150

track name:	 A Classic Education - NightOwl
track length:	 171 secs
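Full-version tracks are complete songs (171 seconds here, versus 6 in the sample set), while demixing models are typically trained on short fixed-length excerpts. The sketch below slices a (samples, channels) array into such excerpts with NumPy; the 6-second chunk length is an illustrative choice, not a value mandated by MUSDB18, and a synthetic array stands in for track.audio:

```python
import numpy as np

def make_excerpts(audio, rate, seconds=6.0):
    """Split a (samples, channels) array into non-overlapping fixed-length excerpts.

    `seconds=6.0` is an illustrative choice, not a MUSDB18 requirement.
    """
    chunk = int(rate * seconds)
    n = audio.shape[0] // chunk  # drop the trailing remainder
    return audio[:n * chunk].reshape(n, chunk, audio.shape[1])

# Stand-in for track.audio of a 171-second stereo track at 44.1 kHz.
rate = 44100
audio = np.zeros((171 * rate, 2))

excerpts = make_excerpts(audio, rate, seconds=6.0)
print(excerpts.shape)  # (28, 264600, 2): 28 six-second excerpts
```

In practice, training pipelines often sample random (possibly overlapping) excerpts rather than a fixed grid, but the shape bookkeeping is the same.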

Quick overview of existing datasets

In the MDX challenge, participants in Leaderboard A must train their systems on the training set of the MUSDB18-HQ dataset (or the MUSDB18 dataset). For Leaderboard B, there are no constraints on the choice of training data; participants may use any available datasets.

Here’s a quick overview of existing datasets released after 2015 for Music Demixing:

| Dataset    | Year | Instrument categories | Tracks | Average duration (s) | Full songs | Stereo |
|------------|------|-----------------------|--------|----------------------|------------|--------|
| MedleyDB   | 2014 | 82                    | 63     | 206 \(\pm\) 121      | ✓          |        |
| DSD100     | 2015 | 4                     | 100    | 251 \(\pm\) 60       | ✓          | ✓      |
| MUSDB18    | 2017 | 4                     | 150    | 236 \(\pm\) 95       | ✓          | ✓      |
| MUSDB18-HQ | 2019 | 4                     | 150    | 236 \(\pm\) 95       | ✓          | ✓      |
| Slakh2100  | 2019 | 34                    | 2100   | 249                  | ✓          |        |

You can check the full list of datasets here. This extended table is based on SigSep/datasets and reproduced with permission.

Models trained with unreleased datasets

Some models, such as Spleeter [HKVM20] and UMXL [StoterULM19], were trained with unreleased datasets. Some models submitted to Leaderboard B of the MDX challenge were trained with private datasets.