"""Convert a power spectrogram (amplitude squared) to decibel (dB) units
"""Convert a power spectrogram (amplitude squared) to decibel (dB) units. This computes the scaling `10 * log10(spect / ref)` in a numerically stable way.
Args:
spect (np.ndarray): STFT power spectrogram of an input waveform.
ref (float, optional): Scaling factor of spectrogram. Defaults to 1.0.
amin (float, optional): Minimum threshold. Defaults to 1e-10.
top_db (Optional[float], optional): Threshold the output at `top_db` below the peak. Defaults to 80.0.
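Example: a minimal NumPy sketch of the scaling described above. The helper name `power_to_db_sketch` is hypothetical, and clamping at `amin` before taking the logarithm is an assumed way of obtaining numerical stability; it is not confirmed to be this function's actual implementation.

```python
import numpy as np


def power_to_db_sketch(spect, ref=1.0, amin=1e-10, top_db=80.0):
    """Hypothetical illustration of 10 * log10(spect / ref)."""
    spect = np.asarray(spect, dtype=np.float64)
    # Clamp at amin before the log so zero power does not produce -inf.
    log_spec = 10.0 * np.log10(np.maximum(amin, spect))
    # Dividing by ref becomes a subtraction in the log domain.
    log_spec -= 10.0 * np.log10(np.maximum(amin, ref))
    if top_db is not None:
        # Discard everything more than top_db below the peak.
        log_spec = np.maximum(log_spec, log_spec.max() - top_db)
    return log_spec


# Zero power is clamped to amin (-100 dB) and then raised to peak - top_db.
db = power_to_db_sketch(np.array([1.0, 0.1, 0.0]))  # approximately [0, -10, -80]
```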
"""Mu-law encoding. Encode waveform based on mu-law companding. When quantized is True, the result will be converted to integer in range `[0,mu-1]`. Otherwise, the resulting waveform is in range `[-1,1]`.
"""Mu-law decoding. Compute the mu-law decoding given an input code. It assumes that the input `y` is in range `[0,mu-1]` when quantize is True and `[-1,1]` otherwise.
Args:
y (np.ndarray): The encoded waveform.
mu (int, optional): The encoding parameter. Defaults to 255.
quantized (bool, optional): If `True`, the input is assumed to be quantized to `1 + mu` distinct integer values. Defaults to True.
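Example: a minimal NumPy sketch of a mu-law encode/decode round trip. Both helper names are hypothetical, and the rounding used for the quantized case is an assumption: it produces the `1 + mu` integer codes `0..mu`, which may differ from the exact integer range this function expects.

```python
import numpy as np


def mu_encode_sketch(x, mu=255, quantized=True):
    """Hypothetical mu-law compressor for x in [-1, 1]."""
    y = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)
    if quantized:
        # Assumed rounding: map [-1, 1] onto the integers 0..mu.
        y = np.floor((y + 1) / 2 * mu + 0.5).astype(np.int64)
    return y


def mu_decode_sketch(y, mu=255, quantized=True):
    """Hypothetical inverse of mu_encode_sketch."""
    if quantized:
        # Undo the integer mapping to get back to [-1, 1].
        y = 2.0 * np.asarray(y, dtype=np.float64) / mu - 1.0
    return np.sign(y) * ((1.0 + mu) ** np.abs(y) - 1.0) / mu


x = np.linspace(-1.0, 1.0, 11)
codes = mu_encode_sketch(x)           # integer codes
x_hat = mu_decode_sketch(codes)       # close to x, up to quantization error
x_exact = mu_decode_sketch(mu_encode_sketch(x, quantized=False), quantized=False)
```

Without quantization the round trip recovers `x` up to floating-point error; with quantization the error is small but nonzero.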
"""Do adpative spectrogram augmentation. The level of the augmentation is gowern by the paramter level, ranging from 0 to 1, with 0 represents no augmentation.
Args:
spect (np.ndarray): Input spectrogram.
tempo_axis (int, optional): Indicate the tempo axis. Defaults to 0.
level (float, optional): The level factor of masking. Defaults to 0.1.
Returns:
np.ndarray: The augmented spectrogram.
"""
"""
assert spect.ndim == 2, 'only supports 2d tensor or numpy array'
"""Compute spectrogram of a given signal, typically an audio waveform.
"""Compute spectrogram of a given signal, typically an audio waveform.
The spectorgram is defined as the complex norm of the short-time
The spectorgram is defined as the complex norm of the short-time
Fourier transformation.
Fourier transformation.
Args:
n_fft (int, optional): The number of frequency components of the discrete Fourier transform. Defaults to 512.
hop_length (Optional[int], optional): The hop length of the short-time FFT. If `None`, it is set to `win_length//4`. Defaults to None.
win_length (Optional[int], optional): The window length of the short-time FFT. If `None`, it is set to the same value as `n_fft`. Defaults to None.
window (str, optional): The window function applied to the signal before the Fourier transform. Supported window functions: 'hamming', 'hann', 'kaiser', 'gaussian', 'exponential', 'triang', 'bohman', 'blackman', 'cosine', 'tukey', 'taylor'. Defaults to 'hann'.
power (float, optional): Exponent for the magnitude spectrogram. Defaults to 2.0.
center (bool, optional): Whether to pad `x` so that the `t`-th frame is centered at :math:`t \times hop\_length`. Defaults to True.
pad_mode (str, optional): The padding pattern used when `center` is `True`. Defaults to 'reflect'.
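Example: a minimal NumPy sketch of how these arguments interact. The helper name `spectrogram_sketch` is hypothetical; it supports only a periodic Hann window, returns the `n_fft // 2 + 1` non-negative frequency bins, and zero-pads the window symmetrically when `win_length < n_fft`. All of these are assumptions made for illustration, not this class's actual implementation.

```python
import numpy as np


def spectrogram_sketch(x, n_fft=512, hop_length=None, win_length=None,
                       power=2.0, center=True, pad_mode="reflect"):
    """Hypothetical |STFT| ** power of a 1-D signal, Hann window only."""
    win_length = n_fft if win_length is None else win_length
    hop_length = win_length // 4 if hop_length is None else hop_length
    # Periodic Hann window, zero-padded to n_fft if it is shorter.
    n = np.arange(win_length)
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / win_length)
    left = (n_fft - win_length) // 2
    window = np.pad(window, (left, n_fft - win_length - left))
    if center:
        # Pad so that frame t is centered at sample t * hop_length.
        x = np.pad(x, n_fft // 2, mode=pad_mode)
    n_frames = 1 + (len(x) - n_fft) // hop_length
    frames = np.stack(
        [x[t * hop_length: t * hop_length + n_fft] for t in range(n_frames)],
        axis=1,
    )
    stft = np.fft.rfft(frames * window[:, None], axis=0)
    return np.abs(stft) ** power


# A sine with exactly 32 cycles per 512 samples peaks in frequency bin 32.
signal = np.sin(2.0 * np.pi * 32 * np.arange(1000) / 512)
spec = spectrogram_sketch(signal)  # shape (257, 8)
```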