musisep.dictsep package

Submodules

musisep.dictsep.__main__ module

Wrapper for the dictionary learning algorithm. When invoked, the audio sources in the supplied audio file are separated.

musisep.dictsep.__main__.main(mixed_soundfile, orig_soundfiles, inst_num, tone_num, pexp, qexp, har, sigmas, sampdist, spectheight, logspectheight, minfreq, maxfreq, out_name, runs, lifetime, num_dicts, mask, plot_range)[source]

Wrapper function for the dictionary learning algorithm.

Parameters:
  • mixed_soundfile (string) – Name of the mixed input file
  • orig_soundfiles (list of string or NoneType) – Names of the files with the isolated instrument tracks or None
  • inst_num (int) – Number of instruments
  • tone_num (int) – Maximum number of simultaneous tones
  • pexp (float) – Exponent for the addition of sinusoids
  • qexp (float) – Exponent to be applied on the spectrum
  • har (int) – Number of harmonics
  • sigmas (float) – Number of standard deviations after which to cut the window/kernel
  • sampdist (int) – Time interval at which to sample the spectrogram
  • spectheight (int) – Height of the linear-frequency spectrogram
  • logspectheight (int) – Height of the log-frequency spectrogram
  • minfreq (float) – Minimum frequency in Hz to be represented (included)
  • maxfreq (float) – Maximum frequency in Hz to be represented (excluded)
  • out_name (string) – Prefix for the file names
  • runs (int) – Number of training iterations to perform
  • lifetime (int) – Number of steps after which to renew the dictionary
  • num_dicts (int) – Number of different dictionaries to generate and train
  • mask (bool) – Whether to apply spectral masking
  • plot_range (slice or NoneType) – Part of the spectrogram to plot

musisep.dictsep.adam_b module

Module containing the modified ADAM algorithm.

class musisep.dictsep.adam_b.Adam_B(init, lo=0, hi=1, alpha=0.0001, beta1=0.9, beta2=0.999, eps=1e-08)[source]

Bases: object

Object for the ADAM algorithm with bounds, adapted for the update of instrument dictionaries. Each column refers to one instrument, and the harmonics are in rows.

Parameters:
  • init (array-like) – Initial value for the dictionary
  • lo (float) – Lower bound for the dictionary entries
  • hi (float) – Upper bound for the dictionary entries
  • alpha (float) – Global step-size
  • beta1 (float) – Inertia of the first moment estimator
  • beta2 (float) – Inertia of the second moment estimator
  • eps (float) – Value to add in the denominator to avoid division by zero
reset(i)[source]

Reset an instrument to its initial state.

Parameters: i (int) – Number of the instrument
step(stepdir)[source]

Update the dictionary.

Parameters: stepdir (array-like) – Step direction (negative gradient)
Returns: theta – New value of the dictionary
Return type: ndarray
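
The class description suggests the usual ADAM moment updates followed by a projection onto [lo, hi]. The following is a minimal sketch under that assumption, not the package's implementation. The class name BoundedAdam, the per-column step counter, and the use of nested lists instead of arrays are illustrative choices that do not come from the documentation.

```python
import math


class BoundedAdam:
    """Hypothetical stand-in for Adam_B: standard ADAM plus clipping to [lo, hi]."""

    def __init__(self, init, lo=0.0, hi=1.0, alpha=1e-4,
                 beta1=0.9, beta2=0.999, eps=1e-8):
        # Rows hold harmonics, columns hold instruments.
        self.init = [list(row) for row in init]
        self.theta = [list(row) for row in init]
        self.lo, self.hi = lo, hi
        self.alpha, self.beta1, self.beta2, self.eps = alpha, beta1, beta2, eps
        rows, cols = len(init), len(init[0])
        self.m = [[0.0] * cols for _ in range(rows)]  # first moment estimate
        self.v = [[0.0] * cols for _ in range(rows)]  # second moment estimate
        self.t = [0] * cols                           # step counter per instrument (assumption)

    def reset(self, i):
        """Restore column i to its initial value and clear its moments."""
        for r in range(len(self.theta)):
            self.theta[r][i] = self.init[r][i]
            self.m[r][i] = 0.0
            self.v[r][i] = 0.0
        self.t[i] = 0

    def step(self, stepdir):
        """Move along stepdir (the negative gradient), then clip to the bounds."""
        for c in range(len(self.t)):
            self.t[c] += 1
            bc1 = 1 - self.beta1 ** self.t[c]  # bias correction, first moment
            bc2 = 1 - self.beta2 ** self.t[c]  # bias correction, second moment
            for r in range(len(self.theta)):
                d = stepdir[r][c]
                self.m[r][c] = self.beta1 * self.m[r][c] + (1 - self.beta1) * d
                self.v[r][c] = self.beta2 * self.v[r][c] + (1 - self.beta2) * d * d
                update = (self.alpha * (self.m[r][c] / bc1)
                          / (math.sqrt(self.v[r][c] / bc2) + self.eps))
                self.theta[r][c] = min(self.hi, max(self.lo, self.theta[r][c] + update))
        return self.theta
```

Because step receives the negative gradient, the update is added rather than subtracted. Clipping after every step keeps each entry inside the bounds regardless of the step size.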

musisep.dictsep.dictlearn module

Module for the training of the dictionary. When invoked, a performance test on artificial data is performed.

class musisep.dictsep.dictlearn.Learner(fsigma, tone_num, inst_num, har, m, lifetime, pexp, qexp, init=None)[source]

Bases: object

Container object for the dictionary learning process.

Parameters:
  • fsigma (float) – Standard deviation (frequency)
  • tone_num (int) – Maximum number of simultaneous tones
  • inst_num (int) – Number of instruments in the dictionary
  • har (int) – Number of harmonics
  • m (int) – Height of the log-frequency spectrogram
  • lifetime (int) – Number of steps after which to renew the dictionary
  • pexp (float) – Exponent for the addition of sinusoids
  • qexp (float) – Exponent to be applied on the spectrum
get_dict()[source]

Get the active part of the dictionary.

Returns: inst_dict – Dictionary with inst_num columns
Return type: ndarray
learn(y)[source]

Learning step. Automatically renews the dictionary.

Parameters: y (array_like) – Log-frequency spectrum
Returns: reconstruction – Synthesized spectrum
Return type: ndarray
renew_dict(headstart, newinsts)[source]

Renew the dictionary.

Parameters:
  • headstart (int) – Headstart in the lifetime counter (to help new instruments)
  • newinsts (int) – Number of instruments to be renewed
musisep.dictsep.dictlearn.gen_random_inst(har)[source]

Generate random harmonic amplitudes according to a Par(1,2) distribution.

Parameters: har (int) – Number of harmonics
Returns: inst – Harmonic amplitudes for one instrument, normalized to the interval [0,1]
Return type: ndarray
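
As a rough illustration of what such a generator could look like, the sketch below draws Pareto-distributed amplitudes with the standard library and rescales them. It makes two assumptions that the documentation does not confirm: that Par(1,2) denotes a Pareto law with scale 1 and shape 2, and that the result is brought into [0,1] by dividing by the largest value. The function name is hypothetical.

```python
import random


def random_harmonic_amplitudes(har, rng=random):
    """Draw `har` Pareto-distributed amplitudes and rescale them into [0, 1]."""
    # paretovariate(2.0) samples a Pareto law with shape 2 and minimum value 1;
    # reading "Par(1,2)" as (scale 1, shape 2) is an assumption.
    raw = [rng.paretovariate(2.0) for _ in range(har)]
    # Dividing by the maximum is one possible normalization (assumption).
    peak = max(raw)
    return [a / peak for a in raw]
```

Passing a seeded random.Random instance as rng makes the draw reproducible, which is convenient for tests on artificial data.
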
musisep.dictsep.dictlearn.gen_random_inst_dict(har, inst_num)[source]

Generate a random instrument dictionary according to a Par(1,2) distribution.

Parameters:
  • har (int) – Number of harmonics
  • inst_num (int) – Number of instruments
Returns:

inst_dict – Dictionary with instruments in columns, normalized to the interval [0,1]

Return type:

ndarray

musisep.dictsep.dictlearn.learn_spect_dict(spect, fsigma, tone_num, inst_num, pexp, qexp, har, m, minfreq, maxfreq, runs, lifetime)[source]

Train the dictionary containing the relative amplitudes of the harmonics.

Parameters:
  • spect (array_like) – Original log-frequency spectrogram of the recording
  • fsigma (float) – Standard deviation (frequency)
  • tone_num (int) – Maximum number of simultaneous tones
  • inst_num (int) – Number of instruments in the dictionary
  • pexp (float) – Exponent for the addition of sinusoids
  • qexp (float) – Exponent to be applied on the spectrum
  • har (int) – Number of harmonics
  • m (int) – Height of the log-frequency spectrogram
  • minfreq (float) – Minimum frequency in Hz to be represented (included)
  • maxfreq (float) – Maximum frequency in Hz to be represented (excluded)
  • runs (int) – Number of training iterations to perform
  • lifetime (int) – Number of steps after which to renew the dictionary
Returns:

inst_dict – Dictionary containing the relative amplitudes of the harmonics

Return type:

ndarray

musisep.dictsep.dictlearn.mask_spectrums(spects, orig_spect)[source]

Mask the synthesized spectrograms with the original spectrogram.

Parameters:
  • spects (list of array_like) – List of synthesized spectrograms
  • orig_spect (array_like) – Original spectrogram
Returns:

  • spectrums (list of ndarray) – Masked spectrograms
  • mask_spect (ndarray) – Array mask
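
The masking rule itself is not stated here. One common choice, shown below purely as a sketch, is a single ratio mask: the original spectrogram divided by the sum of the synthesized ones, applied to each synthesized spectrogram so that the masked results add up to the original. The function name, the eps guard, and the nested-list representation are assumptions made for illustration.

```python
def mask_with_original(spects, orig_spect, eps=1e-12):
    """Rescale synthesized spectrograms so that they add up to the original."""
    rows, cols = len(orig_spect), len(orig_spect[0])
    # Sum of all synthesized spectrograms, bin by bin.
    total = [[sum(s[r][c] for s in spects) for c in range(cols)]
             for r in range(rows)]
    # One shared mask: ratio of the original to the synthesized sum.
    # eps avoids division by zero in bins where nothing was synthesized.
    mask = [[orig_spect[r][c] / max(total[r][c], eps) for c in range(cols)]
            for r in range(rows)]
    # Apply the same mask to every instrument spectrogram.
    masked = [[[s[r][c] * mask[r][c] for c in range(cols)] for r in range(rows)]
              for s in spects]
    return masked, mask
```

With this rule, energy that the model failed to explain is distributed among the instruments in proportion to their synthesized share of each bin.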

musisep.dictsep.dictlearn.stoch_grad(y, inst_dict, tone_num, adam, fsigma, harscale, baseshift, inst_spect, pexp, qexp)[source]

Perform a dictionary training step.

Parameters:
  • y (array_like) – Log-frequency spectrum to represent
  • inst_dict (ndarray) – Dictionary containing the relative amplitudes of the harmonics
  • tone_num (int) – Maximum number of simultaneous tones
  • adam (Adam_B) – Container object for the ADAM optimizer
  • fsigma (float) – Standard deviation (frequency)
  • harscale (float) – Scaling factor
  • baseshift (int) – Length to add to the spectrum in order to avoid circular convolution
  • inst_spect (array_like) – Spectra of the instruments, in the columns
  • pexp (float) – Exponent for the addition of sinusoids
  • qexp (float) – Exponent to be applied on the spectrum
Returns:

  • inst_dict (ndarray) – Updated dictionary
  • reconstruction (ndarray) – Synthesized spectrum
  • inst_amps (ndarray) – Summed amplitudes for each instrument

musisep.dictsep.dictlearn.synth_spect(spect, tone_num, inst_dict, fsigma, spectheight, pexp, qexp, minfreq, maxfreq)[source]

Separate and synthesize the spectrograms from the original spectrogram.

Parameters:
  • spect (array_like) – Original log-frequency spectrogram of the recording
  • tone_num (int) – Maximum number of simultaneous tones
  • inst_dict (ndarray) – Dictionary containing the relative amplitudes of the harmonics
  • fsigma (float) – Standard deviation (frequency)
  • spectheight (int) – Height of the linear-frequency spectrograms
  • pexp (float) – Exponent for the addition of sinusoids
  • qexp (float) – Exponent to be applied on the spectrum
  • minfreq (float) – Minimum frequency to be represented (included) (normalized to the sampling frequency)
  • maxfreq (float) – Maximum frequency to be represented (excluded) (normalized to the sampling frequency)
Returns:

  • dict_spectrum (ndarray) – Synthesized log-frequency spectrogram with all instruments
  • inst_spectrums (list of ndarray) – List of synthesized log-frequency spectrograms for the instruments
  • dict_spectrum_lin (ndarray) – Synthesized linear-frequency spectrogram with all instruments
  • inst_spectrums_lin (list of ndarray) – List of synthesized linear-frequency spectrograms for the instruments

musisep.dictsep.dictlearn.test_learn(fsigma, tone_num, inst_num, pexp, qexp, har, m, runs, test_samples, lifetime)[source]

Evaluate the performance of the dictionary learning algorithm via artificial spectra.

Parameters:
  • fsigma (float) – Width of the Gaussians in the log-frequency spectrogram
  • tone_num (int) – Maximum number of simultaneous tones
  • inst_num (int) – Number of instruments in the dictionaries
  • pexp (float) – Exponent for the addition of sinusoids
  • qexp (float) – Exponent to be applied on the spectrum
  • har (int) – Number of harmonics
  • m (int) – Height of the log-frequency spectrogram
  • runs (int) – Number of training iterations to perform
  • test_samples (int) – Number of test spectra to generate
  • lifetime (int) – Number of steps after which to renew the dictionary
Returns:

measures – Array containing, in that order, the SDR, SIR, SAR with the original dictionary and the SDR, SIR, SAR with the trained dictionary

Return type:

ndarray
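
SDR, SIR, and SAR are the signal-to-distortion, signal-to-interference, and signal-to-artifacts ratios commonly used in source separation. How this module computes them is not documented here. As a minimal sketch, the simplest form of the SDR compares the energy of a reference with the energy of the total error; SIR and SAR additionally require splitting that error into interference and artifact parts, which is not shown. The function name is hypothetical.

```python
import math


def sdr_db(reference, estimate):
    """Signal-to-distortion ratio in dB, using the total error as distortion."""
    signal = sum(x * x for x in reference)
    error = sum((x - y) ** 2 for x, y in zip(reference, estimate))
    if error == 0:
        # A perfect reconstruction has no distortion at all.
        return math.inf
    return 10 * math.log10(signal / error)
```

Higher values are better: every 10 dB corresponds to a tenfold reduction of the error energy relative to the signal energy.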

musisep.dictsep.dictlearn.test_learn_multi(fsigma, tone_num, inst_num, pexp, qexp, har, m, runs, test_samples, lifetime, num_dicts)[source]

Evaluate the performance of the dictionary learning algorithm via artificial spectra.

Parameters:
  • fsigma (float) – Width of the Gaussians in the log-frequency spectrogram
  • tone_num (int) – Maximum number of simultaneous tones
  • inst_num (int) – Number of instruments in the dictionaries
  • pexp (float) – Exponent for the addition of sinusoids
  • qexp (float) – Exponent to be applied on the spectrum
  • har (int) – Number of harmonics
  • m (int) – Height of the log-frequency spectrogram
  • runs (int) – Number of training iterations to perform
  • test_samples (int) – Number of test spectra to generate
  • lifetime (int) – Number of steps after which to renew the dictionary
  • num_dicts (int) – Number of different dictionaries to generate and train
Returns:

measures – Array containing, in the rows, the SDR, SIR, SAR with the original dictionary and the SDR, SIR, SAR with the trained dictionary

Return type:

ndarray

musisep.dictsep.exptool module

Back-end module for the generation of spectrograms and their gradients.

musisep.dictsep.exptool.inst_scale()
musisep.dictsep.exptool.inst_scale_grad()
musisep.dictsep.exptool.inst_shift()
musisep.dictsep.exptool.inst_shift_dict_grad()
musisep.dictsep.exptool.inst_shift_grad()

musisep.dictsep.pursuit module

Module for the sparse pursuit algorithm and its helper functions.

class musisep.dictsep.pursuit.Peaks(amps, shifts, sigmas, spreads, insts)[source]

Bases: object

Object to represent the parameters for the peaks in the spectrogram.

Parameters:
  • amps (array_like) – Amplitudes
  • shifts (array_like) – Fundamental frequencies
  • sigmas (array_like) – Standard deviations (frequency)
  • spreads (array_like) – Inharmonicities
  • insts (array_like) – Instrument numbers
copy()[source]
Returns: Copy of the contained peak parameters
Return type: Peaks
classmethod empty()[source]

Construct an empty Peaks object.

Returns: A Peaks object with zero peaks
Return type: Peaks
classmethod from_array(array, insts)[source]

Construct a Peaks object from an array.

Parameters:
  • array (array_like) – Array that contains, in consecutive order, the amplitudes, the fundamental frequencies, the standard deviations, and the inharmonicities
  • insts (array_like) – Instrument numbers
get_array()[source]
Returns: Array that contains, in consecutive order, the amplitudes, the fundamental frequencies, the standard deviations, and the inharmonicities
Return type: array_like
get_params()[source]
Returns:
  • amps (array_like) – Amplitudes
  • shifts (array_like) – Fundamental frequencies
  • sigmas (array_like) – Standard deviations (frequency)
  • spreads (array_like) – Inharmonicities
  • insts (array_like) – Instrument numbers
merge(new)[source]

Merge the Peaks object with another Peaks object contained in new by concatenating the parameters.

Parameters: new (Peaks) – Object to merge with
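
To make the array layout and the merge semantics concrete, here is a small stand-in container. It is a sketch, not the actual Peaks class: it assumes that "consecutive order" means four equally long blocks laid end to end, and that merge modifies the object in place (the documentation lists no return value). The name PeaksSketch and the use of plain lists are illustrative.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PeaksSketch:
    """Hypothetical stand-in for Peaks, backed by plain Python lists."""
    amps: List[float] = field(default_factory=list)
    shifts: List[float] = field(default_factory=list)
    sigmas: List[float] = field(default_factory=list)
    spreads: List[float] = field(default_factory=list)
    insts: List[int] = field(default_factory=list)

    @classmethod
    def from_array(cls, array, insts):
        # Assumed layout: [amps | shifts | sigmas | spreads], each of length n.
        n = len(insts)
        return cls(list(array[0:n]), list(array[n:2 * n]),
                   list(array[2 * n:3 * n]), list(array[3 * n:4 * n]),
                   list(insts))

    def get_array(self):
        # Inverse of from_array: lay the four blocks end to end.
        return self.amps + self.shifts + self.sigmas + self.spreads

    def merge(self, new):
        # Concatenate every parameter list with its counterpart in `new`.
        self.amps += new.amps
        self.shifts += new.shifts
        self.sigmas += new.sigmas
        self.spreads += new.spreads
        self.insts += new.insts
```

Keeping the continuous parameters in one flat array is convenient for general-purpose optimizers, while the integer instrument numbers stay separate because they are not optimized.
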
musisep.dictsep.pursuit.calc_harscale(minfreq, maxfreq, numfreqs)[source]

Calculate the scaling factor of the frequency axis for the log-frequency spectrogram.

Parameters:
  • minfreq (float) – Minimum frequency to be represented (included)
  • maxfreq (float) – Maximum frequency to be represented (excluded)
  • numfreqs (int) – Intended height of the spectrogram
Returns:

harscale – Scaling factor

Return type:

float
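
The exact formula is not given here. One plausible definition, used below only as an illustration, maps bin k to the frequency minfreq * exp(k / harscale) and chooses harscale so that numfreqs bins cover the range from minfreq (included) to maxfreq (excluded). Both function names are hypothetical, and the mapping is an assumption.

```python
import math


def log_axis_scale(minfreq, maxfreq, numfreqs):
    """Bins per unit of natural-log frequency (assumed definition)."""
    return numfreqs / math.log(maxfreq / minfreq)


def bin_to_freq(k, minfreq, scale):
    """Frequency represented by bin k under the assumed mapping."""
    return minfreq * math.exp(k / scale)
```

Under this mapping, the h-th harmonic of any tone lies scale * log(h) bins above its fundamental, independent of the pitch. That is what allows a tone to be modeled as a shifted copy of one fixed harmonic pattern.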

musisep.dictsep.pursuit.fft_selector(y, prenum, baseshift, inst_spect, qexp)[source]

Callback selector to find fundamental frequencies based on the correlation of the spectrum with the instrument spectra.

Parameters:
  • y (array_like) – Spectrum
  • prenum (int) – Number of peaks to consider
  • baseshift (int) – Length to add to the spectrum in order to avoid circular convolution
  • inst_spect (array_like) – Spectra of the instruments, in the columns
  • qexp (float) – Exponent to be applied on the spectrum
Returns:

  • amps (array_like) – Amplitudes
  • shifts (array_like) – Fundamental frequencies
  • insts (array_like) – Instrument numbers

musisep.dictsep.pursuit.gen_inst_spect(baseshift, fsigma, inst_dict, harscale, pexp, qexp, m)[source]

Generate an instrument log-frequency spectrum.

Parameters:
  • baseshift (int) – Length to add to the spectrum in order to avoid circular convolution
  • fsigma (float) – Standard deviation (frequency)
  • inst_dict (ndarray) – Dictionary containing the relative amplitudes of the harmonics
  • harscale (float) – Scaling factor
  • pexp (float) – Exponent for the addition of sinusoids
  • qexp (float) – Exponent to be applied on the spectrum
  • m (int) – Height of the spectrogram
Returns:

inst_spect – Spectra of the instruments, in the columns

Return type:

ndarray
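
The sketch below illustrates the general idea of a harmonic pattern on a log-frequency axis: one Gaussian of width fsigma per harmonic, placed harscale * log(h) bins above the fundamental and weighted by the dictionary entry. It is deliberately simplified and is not the package's routine: it handles a single instrument, ignores pexp, qexp, and baseshift, and adds the Gaussians linearly. The function name is hypothetical.

```python
import math


def harmonic_template(amps, harscale, fsigma, m):
    """Log-frequency pattern of one instrument with its fundamental at bin 0."""
    spectrum = [0.0] * m
    for h, amp in enumerate(amps, start=1):
        # On a log-frequency axis, harmonic h sits harscale * ln(h) bins
        # above the fundamental.
        center = harscale * math.log(h)
        for k in range(m):
            spectrum[k] += amp * math.exp(-0.5 * ((k - center) / fsigma) ** 2)
    return spectrum
```

Shifting this pattern along the axis changes the pitch without changing the spacing between harmonics, so a single column of the dictionary can describe all tones of an instrument.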

musisep.dictsep.pursuit.inst_scale(peaks, inst_dict, pexp, m, n)[source]

Synthesize the linear-frequency spectrum.

Parameters:
  • peaks (Peaks) – Peak parameters
  • inst_dict (array_like) – Dictionary containing the relative amplitudes of the harmonics
  • pexp (float) – Exponent for the addition of sinusoids
  • m (int) – Height of the spectrogram
  • n (int) – Number of instruments
Returns:

Linear-frequency spectrum

Return type:

ndarray

musisep.dictsep.pursuit.inst_shift(peaks, inst_dict, harscale, pexp, m, n)[source]

Synthesize the log-frequency spectrum.

Parameters:
  • peaks (Peaks) – Peak parameters
  • inst_dict (array_like) – Dictionary containing the relative amplitudes of the harmonics
  • harscale (float) – Scaling factor
  • pexp (float) – Exponent for the addition of sinusoids
  • m (int) – Height of the spectrogram
  • n (int) – Number of instruments
Returns:

Log-frequency spectrum

Return type:

ndarray

musisep.dictsep.pursuit.inst_shift_dict_grad(peak_array, insts, inst_dict, harscale, pexp, qexp, m, n, y)[source]

Least-squares gradient function for the log-frequency spectrum w.r.t. the dictionary.

Parameters:
  • peak_array (array_like) – Peak parameters in array form
  • insts (array_like) – Instrument numbers
  • inst_dict (array_like) – Dictionary containing the relative amplitudes of the harmonics
  • harscale (float) – Scaling factor
  • pexp (float) – Exponent for the addition of sinusoids
  • qexp (float) – Exponent to be applied on the spectrum
  • m (int) – Height of the spectrogram
  • n (int) – Number of instruments
  • y (array_like) – Spectrum to compare with
Returns:

grad – Least-squares gradient w.r.t. the dictionary

Return type:

ndarray

musisep.dictsep.pursuit.inst_shift_grad(peak_array, insts, inst_dict, harscale, pexp, qexp, m, n, y)[source]

Least-squares gradient function for the log-frequency spectrum w.r.t. the parameters.

Parameters:
  • peak_array (array_like) – Peak parameters in array form
  • insts (array_like) – Instrument numbers
  • inst_dict (array_like) – Dictionary containing the relative amplitudes of the harmonics
  • harscale (float) – Scaling factor
  • pexp (float) – Exponent for the addition of sinusoids
  • qexp (float) – Exponent to be applied on the spectrum
  • m (int) – Height of the spectrogram
  • n (int) – Number of instruments
  • y (array_like) – Spectrum to compare with
Returns:

grad – Least-squares gradient

Return type:

ndarray

musisep.dictsep.pursuit.inst_shift_obj(peak_array, insts, inst_dict, harscale, pexp, qexp, m, n, y)[source]

Least-squares objective function for the log-frequency spectrum.

Parameters:
  • peak_array (array_like) – Peak parameters in array form
  • insts (array_like) – Instrument numbers
  • inst_dict (array_like) – Dictionary containing the relative amplitudes of the harmonics
  • harscale (float) – Scaling factor
  • pexp (float) – Exponent for the addition of sinusoids
  • qexp (float) – Exponent to be applied on the spectrum
  • m (int) – Height of the spectrogram
  • n (int) – Number of instruments
  • y (array_like) – Spectrum to compare with
Returns:

obj – Least-squares error

Return type:

float

musisep.dictsep.pursuit.make_bounds(fsigma, length)[source]

Compute sensible bounds for the peak parameters.

Parameters:
  • fsigma (float) – Standard deviation (frequency)
  • length (int) – Number of instruments
Returns:

bounds – Bounds for the optimizer

Return type:

list of tuple

musisep.dictsep.pursuit.max_selector(y, prenum, n)[source]

Callback selector to find peaks based on the local maxima which are dominant in a discrete interval, viewed from its midpoint.

Parameters:
  • y (array_like) – Spectrum
  • prenum (int) – Number of peaks to consider
  • n (int) – Length of the interval
Returns:

  • amps (array_like) – Amplitudes
  • shifts (array_like) – Frequencies
  • insts (array_like) – Instrument numbers (always 0)
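
One way to read the description is: a bin counts as a peak if its value is the largest within a window of n bins centered on it, and the prenum largest such bins are returned. The sketch below implements that reading with plain lists. The function name, the exclusion of non-positive values, and the handling of the window at the edges are assumptions.

```python
def dominant_maxima(y, prenum, n):
    """Pick up to `prenum` bins that are the maximum of their centered window."""
    half = n // 2
    candidates = []
    for i, value in enumerate(y):
        # Window of (up to) n bins centered on i, truncated at the edges.
        window = y[max(0, i - half):i + half + 1]
        if value > 0 and value == max(window):
            candidates.append(i)
    # Keep the strongest candidates first.
    candidates.sort(key=lambda i: y[i], reverse=True)
    chosen = candidates[:prenum]
    amps = [y[i] for i in chosen]
    shifts = [float(i) for i in chosen]
    insts = [0] * len(chosen)  # this selector does not distinguish instruments
    return amps, shifts, insts
```

A larger n suppresses side lobes and closely spaced maxima, at the cost of possibly missing weaker peaks next to strong ones.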

musisep.dictsep.pursuit.peak_pursuit(y, num, prenum, runs, inst_dict, fsigma, harscale, selector, selector_args, pexp, qexp, beta=1, init=None)[source]

Sparse pursuit algorithm for the identification of peaks in a spectrum.

Parameters:
  • y (array_like) – Spectrum
  • num (int) – Maximum number of peaks
  • prenum (int) – Number of new peaks to consider per step
  • inst_dict (ndarray) – Dictionary containing the relative amplitudes of the harmonics
  • fsigma (float) – Standard deviation (frequency)
  • harscale (float) – Scaling factor
  • selector (function) – Callback selector accepting y and prenum as arguments
  • selector_args (sequence) – Extra arguments to pass to the selector
  • pexp (float) – Exponent for the addition of sinusoids
  • qexp (float) – Exponent to be applied on the spectrum
  • beta (float) – Residual reduction factor
  • init (Peaks) – Initial value for the peaks
Returns:

  • peaks (Peaks) – Identified peaks
  • reconstruction (ndarray) – Synthesized spectrum

Module contents