musisep.dictsep package¶

Submodules¶

musisep.dictsep.__main__ module¶

Wrapper for the dictionary learning algorithm. When invoked, the audio sources in the supplied audio file are separated.

musisep.dictsep.__main__.main(mixed_soundfile, orig_soundfiles, inst_num, tone_num, pexp, qexp, har, sigmas, sampdist, spectheight, logspectheight, minfreq, maxfreq, out_name, runs, lifetime, num_dicts, mask, plot_range)[source]¶

Wrapper function for the dictionary learning algorithm.

Parameters:

mixed_soundfile (string) – Name of the mixed input file
orig_soundfiles (list of string or NoneType) – Names of the files with the isolated instrument tracks or None
inst_num (int) – Number of instruments
tone_num (int) – Maximum number of simultaneous tones
pexp (float) – Exponent for the addition of sinusoids
qexp (float) – Exponent to be applied on the spectrum
har (int) – Number of harmonics
sigmas (float) – Number of standard deviations after which to cut the window/kernel
sampdist (int) – Time intervals to sample the spectrogram
spectheight (int) – Height of the linear-frequency spectrogram
logspectheight (int) – Height of the log-frequency spectrogram
minfreq (float) – Minimum frequency in Hz to be represented (included)
maxfreq (float) – Maximum frequency in Hz to be represented (excluded)
out_name (string) – Prefix for the file names
runs (int) – Number of training iterations to perform
lifetime (int) – Number of steps after which to renew the dictionary
num_dicts (int) – Number of different dictionaries to generate and train
mask (bool) – Whether to apply spectral masking
plot_range (slice or NoneType) – Part of the spectrogram to plot

musisep.dictsep.adam_b module¶

Module containing the modified ADAM algorithm.

class musisep.dictsep.adam_b.Adam_B(init, lo=0, hi=1, alpha=0.0001, beta1=0.9, beta2=0.999, eps=1e-08)[source]¶

Bases: object

Object for the ADAM algorithm with bounds, adapted for the update of instrument dictionaries. Each column refers to one instrument, and the harmonics are in rows.

Parameters:

init (array-like) – Initial value for the dictionary
lo (float) – Lower bound for the dictionary entries
hi (float) – Upper bound for the dictionary entries
alpha (float) – Global step-size
beta1 (float) – Inertia of the first moment estimator
beta2 (float) – Inertia of the second moment estimator
eps (float) – Value to add in the denominator to avoid division by zero

reset(i)[source]¶

Reset an instrument to its initial state.

Parameters:	i (int) – Number of the instrument

step(stepdir)[source]¶

Update the dictionary.

Parameters:	stepdir (array-like) – Step direction (negative gradient)
Returns:	theta – New value of the dictionary
Return type:	ndarray

musisep.dictsep.dictlearn module¶

Module for the training of the dictionary. When invoked, a performance test on artificial data is performed.

class musisep.dictsep.dictlearn.Learner(fsigma, tone_num, inst_num, har, m, lifetime, pexp, qexp, init=None)[source]¶

Bases: object

Container object for the dictionary learning process.

Parameters:

fsigma (float) – Standard deviation (frequency)
tone_num (int) – Maximum number of simultaneous tones
inst_num (int) – Number of instruments in the dictionary
har (int) – Number of harmonics
m (int) – Height of the log-frequency spectrogram
lifetime (int) – Number of steps after which to renew the dictionary
pexp (float) – Exponent for the addition of sinusoids
qexp (float) – Exponent to be applied on the spectrum

get_dict()[source]¶

Get the active part of the dictionary.

Returns:	inst_dict – Dictionary with inst_num columns
Return type:	ndarray

learn(y)[source]¶

Learning step. Automatically renews the dictionary.

Parameters:	y (array_like) – Log-frequency spectrum
Returns:	reconstruction – Synthesized spectrum
Return type:	ndarray

renew_dict(headstart, newinsts)[source]¶

Renew the dictionary.

Parameters:

headstart (int) – Headstart in the lifetime counter (to help new instruments)
newinsts (int) – Number of instruments to be renewed

musisep.dictsep.dictlearn.gen_random_inst(har)[source]¶

Generate random harmonic amplitudes according to a Par(1,2) distribution.

Parameters:	har (int) – Number of harmonics
Returns:	inst – Harmonic amplitudes for one instrument, scaled to the interval [0,1]
Return type:	ndarray

musisep.dictsep.dictlearn.gen_random_inst_dict(har, inst_num)[source]¶

Generate a random instrument dictionary according to a Par(1,2) distribution.

Parameters:

har (int) – Number of harmonics
inst_num (int) – Number of instruments

Returns:	inst_dict – Dictionary with instruments in columns, scaled to the interval [0,1]
Return type:	ndarray

musisep.dictsep.dictlearn.learn_spect_dict(spect, fsigma, tone_num, inst_num, pexp, qexp, har, m, minfreq, maxfreq, runs, lifetime)[source]¶

Train the dictionary containing the relative amplitudes of the harmonics.

Parameters:

spect (array_like) – Original log-frequency spectrogram of the recording
fsigma (float) – Standard deviation (frequency)
tone_num (int) – Maximum number of simultaneous tones
inst_num (int) – Number of instruments in the dictionary
pexp (float) – Exponent for the addition of sinusoids
qexp (float) – Exponent to be applied on the spectrum
har (int) – Number of harmonics
m (int) – Height of the log-frequency spectrogram
minfreq (float) – Minimum frequency in Hz to be represented (included)
maxfreq (float) – Maximum frequency in Hz to be represented (excluded)
runs (int) – Number of training iterations to perform
lifetime (int) – Number of steps after which to renew the dictionary

Returns:	inst_dict – Dictionary containing the relative amplitudes of the harmonics
Return type:	ndarray

musisep.dictsep.dictlearn.mask_spectrums(spects, orig_spect)[source]¶

Mask the synthesized spectrograms with the original spectrogram.

Parameters:

spects (list of array_like) – List of synthesized spectrograms
orig_spect (array_like) – Original spectrogram

Returns:

spectrums (list of ndarray) – Masked spectrograms
mask_spect (ndarray) – Array mask

musisep.dictsep.dictlearn.stoch_grad(y, inst_dict, tone_num, adam, fsigma, harscale, baseshift, inst_spect, pexp, qexp)[source]¶

Perform a dictionary training step.

Parameters:

y (array_like) – Log-frequency spectrum to represent
inst_dict (ndarray) – Dictionary containing the relative amplitudes of the harmonics
tone_num (int) – Maximum number of simultaneous tones
adam (Adam_B) – Container object for the ADAM optimizer
fsigma (float) – Standard deviation (frequency)
harscale (float) – Scaling factor
baseshift (int) – Length to add to the spectrum in order to avoid circular convolution
inst_spect (array_like) – Spectra of the instruments, in the columns
pexp (float) – Exponent for the addition of sinusoids
qexp (float) – Exponent to be applied on the spectrum

Returns:

inst_dict (ndarray) – Updated dictionary
reconstruction (ndarray) – Synthesized spectrum
inst_amps (ndarray) – Summed amplitudes for each instrument

musisep.dictsep.dictlearn.synth_spect(spect, tone_num, inst_dict, fsigma, spectheight, pexp, qexp, minfreq, maxfreq)[source]¶

Separate and synthesize the spectrograms from the original spectrogram.

Parameters:

spect (array_like) – Original log-frequency spectrogram of the recording
tone_num (int) – Maximum number of simultaneous tones
inst_dict (ndarray) – Dictionary containing the relative amplitudes of the harmonics
fsigma (float) – Standard deviation (frequency)
spectheight (int) – Height of the linear-frequency spectrograms
pexp (float) – Exponent for the addition of sinusoids
qexp (float) – Exponent to be applied on the spectrum
minfreq (float) – Minimum frequency to be represented (included) (normalized to the sampling frequency)
maxfreq (float) – Maximum frequency to be represented (excluded) (normalized to the sampling frequency)

Returns:

dict_spectrum (ndarray) – Synthesized log-frequency spectrogram with all instruments
inst_spectrums (list of ndarray) – List of synthesized log-frequency spectrograms for the instruments
dict_spectrum_lin (ndarray) – Synthesized linear-frequency spectrogram with all instruments
inst_spectrums_lin (list of ndarray) – List of synthesized linear-frequency spectrograms for the instruments

musisep.dictsep.dictlearn.test_learn(fsigma, tone_num, inst_num, pexp, qexp, har, m, runs, test_samples, lifetime)[source]¶

Evaluate the performance of the dictionary learning algorithm via artificial spectra.

Parameters:

fsigma (float) – Width of the Gaussians in the log-frequency spectrogram
tone_num (int) – Maximum number of simultaneous tones
inst_num (int) – Number of instruments in the dictionaries
pexp (float) – Exponent for the addition of sinusoids
qexp (float) – Exponent to be applied on the spectrum
har (int) – Number of harmonics
m (int) – Height of the log-frequency spectrogram
runs (int) – Number of training iterations to perform
test_samples (int) – Number of test spectra to generate
lifetime (int) – Number of steps after which to renew the dictionary

Returns:	measures – Array containing, in that order, the SDR, SIR, SAR with the original dictionary and the SDR, SIR, SAR with the trained dictionary
Return type:	ndarray
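
For orientation, the sketch below computes a plain signal-to-distortion ratio in decibels as the energy of the reference divided by the energy of the error. How test_learn computes its measures is not specified here. SIR and SAR additionally require splitting the error into interference and artifact components, which this sketch does not attempt. The function name simple_sdr_db is hypothetical.

```python
import numpy as np


def simple_sdr_db(reference, estimate):
    """Energy ratio between the reference and the estimation error, in dB."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    error = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(error ** 2))


reference = np.array([1.0, 0.0, -1.0, 0.0])
estimate = 0.9 * reference  # estimate with a 10 % amplitude error
sdr = simple_sdr_db(reference, estimate)
```

A 10 % amplitude error gives an energy ratio of 100, which corresponds to 20 dB.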

musisep.dictsep.dictlearn.test_learn_multi(fsigma, tone_num, inst_num, pexp, qexp, har, m, runs, test_samples, lifetime, num_dicts)[source]¶

Evaluate the performance of the dictionary learning algorithm via artificial spectra.

Parameters:

fsigma (float) – Width of the Gaussians in the log-frequency spectrogram
tone_num (int) – Maximum number of simultaneous tones
inst_num (int) – Number of instruments in the dictionaries
pexp (float) – Exponent for the addition of sinusoids
qexp (float) – Exponent to be applied on the spectrum
har (int) – Number of harmonics
m (int) – Height of the log-frequency spectrogram
runs (int) – Number of training iterations to perform
test_samples (int) – Number of test spectra to generate
lifetime (int) – Number of steps after which to renew the dictionary
num_dicts (int) – Number of different dictionaries to generate and train

Returns:	measures – Array containing, in the rows, the SDR, SIR, SAR with the original dictionary and the SDR, SIR, SAR with the trained dictionary
Return type:	ndarray

musisep.dictsep.exptool module¶

Back-end module for the generation of spectrograms and their gradients.

musisep.dictsep.exptool.inst_scale()¶

musisep.dictsep.exptool.inst_scale_grad()¶

musisep.dictsep.exptool.inst_shift()¶

musisep.dictsep.exptool.inst_shift_dict_grad()¶

musisep.dictsep.exptool.inst_shift_grad()¶

musisep.dictsep.pursuit module¶

Module for the sparse pursuit algorithm and its helper functions.

class musisep.dictsep.pursuit.Peaks(amps, shifts, sigmas, spreads, insts)[source]¶

Bases: object

Object to represent the parameters for the peaks in the spectrogram.

Parameters:

amps (array_like) – Amplitudes
shifts (array_like) – Fundamental frequencies
sigmas (array_like) – Standard deviations (frequency)
spreads (array_like) – Inharmonicities
insts (array_like) – Instrument numbers

copy()[source]¶

Returns:	Copy of the contained peak parameters
Return type:	Peaks

classmethod empty()[source]¶

Construct an empty Peaks object.

Returns:	A Peaks object with zero peaks
Return type:	Peaks

classmethod from_array(array, insts)[source]¶

Construct a Peaks object from an array.

Parameters:

array (array_like) – Array that contains, in consecutive order, the amplitudes, the fundamental frequencies, the standard deviations, and the inharmonicities
insts (array_like) – Instrument numbers

get_array()[source]¶

Returns:	Array that contains, in consecutive order, the amplitudes, the fundamental frequencies, the standard deviations, and the inharmonicities
Return type:	array_like

get_params()[source]¶

Returns:

amps (array_like) – Amplitudes
shifts (array_like) – Fundamental frequencies
sigmas (array_like) – Standard deviations (frequency)
spreads (array_like) – Inharmonicities
insts (array_like) – Instrument numbers

merge(new)[source]¶

Merge the Peaks object with another Peaks object contained in new by concatenating the parameters.

Parameters:	new (Peaks) – Object to merge with
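
The interface described above can be mimicked with a small stand-in container. This is a sketch based only on the method descriptions in this section, not the musisep implementation, and the class name PeaksSketch is hypothetical. It shows the flat array layout (amplitudes, then fundamental frequencies, then standard deviations, then inharmonicities) and how merging concatenates the parameters.

```python
import numpy as np


class PeaksSketch:
    """Stand-in for the documented Peaks interface (illustrative only)."""

    def __init__(self, amps, shifts, sigmas, spreads, insts):
        self.amps = np.asarray(amps, dtype=float)
        self.shifts = np.asarray(shifts, dtype=float)
        self.sigmas = np.asarray(sigmas, dtype=float)
        self.spreads = np.asarray(spreads, dtype=float)
        self.insts = np.asarray(insts, dtype=int)

    @classmethod
    def from_array(cls, array, insts):
        # The flat array holds four equally long blocks, one per parameter.
        amps, shifts, sigmas, spreads = np.split(
            np.asarray(array, dtype=float), 4)
        return cls(amps, shifts, sigmas, spreads, insts)

    def get_array(self):
        return np.concatenate(
            [self.amps, self.shifts, self.sigmas, self.spreads])

    def merge(self, new):
        # Append the parameters of `new` to the parameters already stored.
        self.amps = np.concatenate([self.amps, new.amps])
        self.shifts = np.concatenate([self.shifts, new.shifts])
        self.sigmas = np.concatenate([self.sigmas, new.sigmas])
        self.spreads = np.concatenate([self.spreads, new.spreads])
        self.insts = np.concatenate([self.insts, new.insts])


a = PeaksSketch([1.0], [10.0], [0.5], [0.0], [0])
b = PeaksSketch([2.0], [20.0], [0.6], [0.1], [1])
a.merge(b)
flat = a.get_array()
restored = PeaksSketch.from_array(flat, a.insts)
```

The instrument numbers are kept out of the flat array because they are discrete labels, while the other four parameters are continuous.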

musisep.dictsep.pursuit.calc_harscale(minfreq, maxfreq, numfreqs)[source]¶

Calculate the scaling factor of the frequency axis for the log-frequency spectrogram.

Parameters:

minfreq (float) – Minimum frequency to be represented (included)
maxfreq (float) – Maximum frequency to be represented (excluded)
numfreqs (int) – Intended height of the spectrogram

Returns:	harscale – Scaling factor
Return type:	float
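
The formula used by calc_harscale is not given here. The sketch below assumes that the scaling factor is the number of spectrogram bins per unit of natural-log frequency. Under that assumption, the distance between a fundamental and its h-th harmonic is scale * log(h) bins, regardless of the fundamental frequency. The helper names and the example frequencies are arbitrary choices for illustration.

```python
import numpy as np


def log_axis_scale(minfreq, maxfreq, numfreqs):
    """Bins per unit of natural-log frequency (assumed definition)."""
    return numfreqs / (np.log(maxfreq) - np.log(minfreq))


def freq_to_bin(freq, minfreq, scale):
    """Position of a frequency on the log-frequency axis, in bins."""
    return scale * (np.log(freq) - np.log(minfreq))


# 20 Hz to 20480 Hz spans exactly 10 octaves; with 1000 bins
# that is 100 bins per octave.
scale = log_axis_scale(20.0, 20480.0, 1000)
octave = freq_to_bin(440.0, 20.0, scale) - freq_to_bin(220.0, 20.0, scale)
third_low = freq_to_bin(3 * 100.0, 20.0, scale) - freq_to_bin(100.0, 20.0, scale)
third_high = freq_to_bin(3 * 500.0, 20.0, scale) - freq_to_bin(500.0, 20.0, scale)
```

Here octave is 100 bins, and third_low equals third_high: the harmonic pattern of a tone keeps its shape and only shifts along the axis when the pitch changes.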

musisep.dictsep.pursuit.fft_selector(y, prenum, baseshift, inst_spect, qexp)[source]¶

Callback selector to find fundamental frequencies based on the correlation of the spectrum with the instrument spectra.

Parameters:

y (array_like) – Spectrum
prenum (int) – Number of peaks to consider
baseshift (int) – Length to add to the spectrum in order to avoid circular convolution
inst_spect (array_like) – Spectra of the instruments, in the columns
qexp (float) – Exponent to be applied on the spectrum

Returns:

amps (array_like) – Amplitudes
shifts (array_like) – Fundamental frequencies
insts (array_like) – Instrument numbers
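
The role of the padding can be illustrated with a plain FFT-based cross-correlation. The sketch below is not the fft_selector implementation: it handles a single instrument template, ignores qexp, and uses hypothetical function names. It zero-pads both inputs so that the circular correlation computed by the FFT equals the linear one for all non-negative shifts.

```python
import numpy as np


def correlate_via_fft(y, template):
    """corr[s] = sum_k y[s + k] * template[k] for s = 0 .. len(y) - 1."""
    y = np.asarray(y, dtype=float)
    template = np.asarray(template, dtype=float)
    # Padding to this length keeps wrapped-around terms out of the result.
    size = len(y) + len(template) - 1
    spec = np.fft.rfft(y, size) * np.conj(np.fft.rfft(template, size))
    return np.fft.irfft(spec, size)[:len(y)]


def top_shifts(y, template, prenum):
    """Return the prenum largest correlation values and their shifts."""
    corr = correlate_via_fft(y, template)
    shifts = np.argsort(corr)[::-1][:prenum]
    return corr[shifts], shifts


# Template: fundamental at offset 0 and a weaker partial at offset 3.
template = np.array([1.0, 0.0, 0.0, 0.5])
y = np.zeros(16)
y[5], y[8] = 2.0, 1.0  # a tone whose fundamental sits at bin 5
amps, shifts = top_shifts(y, template, prenum=1)
```

The best shift is 5, where both the fundamental and the partial line up with the template.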

musisep.dictsep.pursuit.gen_inst_spect(baseshift, fsigma, inst_dict, harscale, pexp, qexp, m)[source]¶

Generate an instrument log-frequency spectrum.

Parameters:

baseshift (int) – Length to add to the spectrum in order to avoid circular convolution
fsigma (float) – Standard deviation (frequency)
inst_dict (ndarray) – Dictionary containing the relative amplitudes of the harmonics
harscale (float) – Scaling factor
pexp (float) – Exponent for the addition of sinusoids
qexp (float) – Exponent to be applied on the spectrum
m (int) – Height of the spectrogram

Returns:	inst_spect – Spectra of the instruments, in the columns
Return type:	ndarray

musisep.dictsep.pursuit.inst_scale(peaks, inst_dict, pexp, m, n)[source]¶

Synthesize the linear-frequency spectrum.

Parameters:

peaks (Peaks) – Peak parameters
inst_dict (array_like) – Dictionary containing the relative amplitudes of the harmonics
pexp (float) – Exponent for the addition of sinusoids
m (int) – Height of the spectrogram
n (int) – Number of instruments

Returns:	Linear-frequency spectrum
Return type:	ndarray

musisep.dictsep.pursuit.inst_shift(peaks, inst_dict, harscale, pexp, m, n)[source]¶

Synthesize the log-frequency spectrum.

Parameters:

peaks (Peaks) – Peak parameters
inst_dict (array_like) – Dictionary containing the relative amplitudes of the harmonics
harscale (float) – Scaling factor
pexp (float) – Exponent for the addition of sinusoids
m (int) – Height of the spectrogram
n (int) – Number of instruments

Returns:	Log-frequency spectrum
Return type:	ndarray
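
A simplified picture of such a synthesis is sketched below. It is an assumption-laden illustration, not the inst_shift implementation: each tone is drawn as a sum of Gaussians, harmonic h is placed at shift + harscale * log(h) (which presumes harscale means bins per unit of natural-log frequency), and both pexp and the inharmonicity parameter are left out. The function name synth_log_spectrum is hypothetical.

```python
import numpy as np


def synth_log_spectrum(amps, shifts, sigmas, insts, inst_dict, harscale, m):
    """Sum of Gaussian harmonics on a log-frequency axis (simplified)."""
    inst_dict = np.asarray(inst_dict, dtype=float)
    bins = np.arange(m, dtype=float)
    harmonics = np.arange(1, inst_dict.shape[0] + 1)
    spectrum = np.zeros(m)
    for amp, shift, sigma, inst in zip(amps, shifts, sigmas, insts):
        # Harmonic h lies harscale * log(h) bins above the fundamental.
        centers = shift + harscale * np.log(harmonics)
        gauss = np.exp(
            -0.5 * ((bins[:, None] - centers[None, :]) / sigma) ** 2)
        # Weight each harmonic with its relative amplitude from the dictionary.
        spectrum += amp * (gauss @ inst_dict[:, inst])
    return spectrum


inst_dict = np.array([[1.0], [0.5]])  # 2 harmonics (rows), 1 instrument
harscale = 10.0 / np.log(2.0)         # second harmonic 10 bins higher
spectrum = synth_log_spectrum(
    amps=[2.0], shifts=[20.0], sigmas=[1.0], insts=[0],
    inst_dict=inst_dict, harscale=harscale, m=64)
```

The result has a peak of height 2 at bin 20 and a second peak of height 1 at bin 30, matching the 1 : 0.5 ratio stored in the dictionary column.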

musisep.dictsep.pursuit.inst_shift_dict_grad(peak_array, insts, inst_dict, harscale, pexp, qexp, m, n, y)[source]¶

Least-squares gradient function for the log-frequency spectrum w.r.t. the dictionary.

Parameters:

peak_array (array_like) – Peak parameters in array form
insts (array_like) – Instrument numbers
inst_dict (array_like) – Dictionary containing the relative amplitudes of the harmonics
harscale (float) – Scaling factor
pexp (float) – Exponent for the addition of sinusoids
qexp (float) – Exponent to be applied on the spectrum
m (int) – Height of the spectrogram
n (int) – Number of instruments
y (array_like) – Spectrum to compare with

Returns:	grad – Least-squares gradient w.r.t. the dictionary
Return type:	ndarray

musisep.dictsep.pursuit.inst_shift_grad(peak_array, insts, inst_dict, harscale, pexp, qexp, m, n, y)[source]¶

Least-squares gradient function for the log-frequency spectrum w.r.t. the parameters.

Parameters:

peak_array (array_like) – Peak parameters in array form
insts (array_like) – Instrument numbers
inst_dict (array_like) – Dictionary containing the relative amplitudes of the harmonics
harscale (float) – Scaling factor
pexp (float) – Exponent for the addition of sinusoids
qexp (float) – Exponent to be applied on the spectrum
m (int) – Height of the spectrogram
n (int) – Number of instruments
y (array_like) – Spectrum to compare with

Returns:	grad – Least-squares gradient
Return type:	ndarray

musisep.dictsep.pursuit.inst_shift_obj(peak_array, insts, inst_dict, harscale, pexp, qexp, m, n, y)[source]¶

Least-squares objective function for the log-frequency spectrum.

Parameters:

peak_array (array_like) – Peak parameters in array form
insts (array_like) – Instrument numbers
inst_dict (array_like) – Dictionary containing the relative amplitudes of the harmonics
harscale (float) – Scaling factor
pexp (float) – Exponent for the addition of sinusoids
qexp (float) – Exponent to be applied on the spectrum
m (int) – Height of the spectrogram
n (int) – Number of instruments
y (array_like) – Spectrum to compare with

Returns:	obj – Least-squares error
Return type:	float

musisep.dictsep.pursuit.make_bounds(fsigma, length)[source]¶

Compute sensible bounds for the peak parameters.

Parameters:

fsigma (float) – Standard deviation (frequency)
length (int) – Number of instruments

Returns:	bounds – Bounds for the optimizer
Return type:	list of tuple

musisep.dictsep.pursuit.max_selector(y, prenum, n)[source]¶

Callback selector to find peaks based on the local maxima which are dominant in a discrete interval, viewed from its midpoint.

Parameters:

y (array_like) – Spectrum
prenum (int) – Number of peaks to consider
n (int) – Length of the interval

Returns:

amps (array_like) – Amplitudes
shifts (array_like) – Frequencies
insts (array_like) – Instrument numbers (always 0)
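
One way to read the description above is sketched here: a bin counts as a peak if it is the largest value in a window of length n centred on it, and the prenum strongest such bins are returned. The edge handling (truncated windows), the exclusion of non-positive bins, and the function name dominant_maxima are assumptions, not documented behaviour.

```python
import numpy as np


def dominant_maxima(y, prenum, n):
    """Pick up to prenum bins that dominate their window of length n."""
    y = np.asarray(y, dtype=float)
    half = n // 2
    candidates = [
        i for i in range(len(y))
        if y[i] > 0 and y[i] == y[max(0, i - half):i + half + 1].max()
    ]
    # Strongest peaks first.
    candidates.sort(key=lambda i: y[i], reverse=True)
    shifts = np.array(candidates[:prenum], dtype=int)
    insts = np.zeros(len(shifts), dtype=int)  # always instrument 0
    return y[shifts], shifts, insts


y = np.array([0.0, 1.0, 0.0, 3.0, 2.0, 0.0, 0.0, 1.5, 0.0])
amps, shifts, insts = dominant_maxima(y, prenum=2, n=3)
```

Bin 4 (value 2.0) is not selected because its neighbour at bin 3 is larger within the window. The two strongest dominant maxima are bins 3 and 7.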

musisep.dictsep.pursuit.peak_pursuit(y, num, prenum, runs, inst_dict, fsigma, harscale, selector, selector_args, pexp, qexp, beta=1, init=None)[source]¶

Sparse pursuit algorithm for the identification of peaks in a spectrum.

Parameters:

y (array_like) – Spectrum
num (int) – Maximum number of peaks
prenum (int) – Number of new peaks to consider per step
inst_dict (ndarray) – Dictionary containing the relative amplitudes of the harmonics
fsigma (float) – Standard deviation (frequency)
harscale (float) – Scaling factor
selector (function) – Callback selector accepting y and prenum as arguments
selector_args (sequence) – Extra arguments to pass to the selector
pexp (float) – Exponent for the addition of sinusoids
qexp (float) – Exponent to be applied on the spectrum
beta (float) – Residual reduction factor
init (Peaks) – Initial value for the peaks

Returns:

peaks (Peaks) – Identified peaks
reconstruction (ndarray) – Synthesized spectrum
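
To give a feel for the greedy structure of a pursuit, the sketch below runs a much simpler loop in the style of matching pursuit. It is a stand-in, not the documented algorithm: it uses single Gaussian atoms of fixed width instead of dictionary-based harmonic patterns, takes the largest residual bin as the next peak centre instead of calling a selector callback, and performs no nonlinear refinement of the peak parameters. All names are hypothetical.

```python
import numpy as np


def gaussian_atom(m, center, sigma):
    """Gaussian of unit height on m bins."""
    bins = np.arange(m, dtype=float)
    return np.exp(-0.5 * ((bins - center) / sigma) ** 2)


def greedy_pursuit_sketch(y, num, sigma):
    """Repeatedly fit one Gaussian to the residual and subtract it."""
    y = np.asarray(y, dtype=float)
    residual = y.copy()
    peaks = []
    for _ in range(num):
        center = int(np.argmax(residual))
        atom = gaussian_atom(len(y), center, sigma)
        # Least-squares amplitude of this atom against the residual.
        amp = (residual @ atom) / (atom @ atom)
        if amp <= 0:
            break
        residual -= amp * atom
        peaks.append((amp, center))
    return peaks, y - residual


y = 2.0 * gaussian_atom(64, 20, 1.5) + 1.0 * gaussian_atom(64, 40, 1.5)
peaks, reconstruction = greedy_pursuit_sketch(y, num=2, sigma=1.5)
```

For this well-separated test spectrum, the loop recovers both peaks and the reconstruction matches the input. Overlapping peaks are where a greedy loop like this falls short and joint optimization of the parameters becomes necessary.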
