Post on 07-Jul-2020
Further Topics in MIR
Meinard Müller, Christof Weiss, Stefan Balke
International Audio Laboratories Erlangen
{meinard.mueller, christof.weiss, stefan.balke}@audiolabs-erlangen.de
Tutorial: Automated Methods of Music Processing (Automatisierte Methoden der Musikverarbeitung)
47th Annual Conference of the Gesellschaft für Informatik (47. Jahrestagung der Gesellschaft für Informatik)
Why is Music Processing Challenging?
Example: Chopin, Mazurka Op. 63 No. 3
Waveform / Spectrogram
[Figure: waveform (amplitude over time in seconds) and spectrogram (frequency in Hz over time in seconds)]
Performance
– Tempo
– Dynamics
– Note deviations
– Sustain pedal
Polyphony
– Main melody
– Additional melody line
– Accompaniment
Source Separation
Decomposition of audio stream into different sound sources
Central task in digital signal processing
“Cocktail party effect”
Several input signals
Sources are assumed to be statistically independent
Source Separation (Music)
Main melody, accompaniment, drum track
Instrumental voices
Individual note events
Only mono or stereo
Sources are often highly dependent
Harmonic-Percussive Decomposition
Mixture decomposed into:
Harmonic component
• Clearly harmonic sounds of singing voice and accompaniment
Percussive component
• Drum hits
• Fricatives & plosives in singing voice
Residual component
• Noise-like sounds
• Vibrato/glissando sounds
Demo: https://www.audiolabs-erlangen.de/resources/2014-ISMIR-ExtHPSep/
Literature: [Driedger/Müller/Disch, ISMIR 2014]
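The slides do not spell out how the three components are computed. The following is only a minimal sketch under the assumption of a median-filtering approach with a separation factor: harmonic sounds form horizontal lines in a magnitude spectrogram, percussive sounds form vertical lines, and everything that is neither clearly one nor the other goes to the residual. The function names, filter lengths, and the parameter `beta` are hypothetical choices for illustration, not taken from the slides.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median_filter_1d(X, length, axis):
    """Median filter along one axis of a 2-D array (odd length, edge padding)."""
    pad = length // 2
    pad_width = [(0, 0), (0, 0)]
    pad_width[axis] = (pad, pad)
    Xp = np.pad(X, pad_width, mode="edge")
    windows = sliding_window_view(Xp, length, axis=axis)
    return np.median(windows, axis=-1)

def hpr_masks(Y, len_time=5, len_freq=5, beta=2.0, eps=1e-12):
    """Binary masks for harmonic, percussive, and residual parts (assumed scheme)."""
    Y_h = median_filter_1d(Y, len_time, axis=1)  # smooth over time: keeps horizontal lines
    Y_p = median_filter_1d(Y, len_freq, axis=0)  # smooth over frequency: keeps vertical lines
    M_h = Y_h / (Y_p + eps) > beta               # clearly harmonic
    M_p = Y_p / (Y_h + eps) >= beta              # clearly percussive
    M_r = ~(M_h | M_p)                           # neither: residual
    return M_h, M_p, M_r

# Toy magnitude spectrogram (frequency x time): one sustained tone, one click.
Y = np.full((30, 40), 0.01)
Y[10, :] = 1.0   # horizontal line (harmonic)
Y[:, 15] = 1.0   # vertical line (percussive)

M_h, M_p, M_r = hpr_masks(Y)
Y_harm, Y_perc, Y_res = Y * M_h, Y * M_p, Y * M_r
```

With `beta` larger than 1 the three masks are disjoint and add up to one, so the three component spectrograms sum back to the mixture spectrogram.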
Singing Voice Extraction
Original recording → HPR → harmonic component, percussive component, residual component
Processing blocks: MR, TR, SL (with F0 annotation as additional input)
Estimate singing voice = harmonic portion (singing voice) + fricatives (singing voice) + vibrato & formants (singing voice)
Estimate accompaniment = harmonic portion (accompaniment) + instrument onsets (accompaniment) + diffuse instrument sounds (accompaniment)
[Figure: spectrograms of the components, frequency over time]
Score-Informed Source Separation
Exploit musical score to support separation process
[Figure: score shown as pitch over time; recording shown as spectrogram, frequency (Hz) over time]
Parametric Model Approach
Parameters → Render → model spectrogram ≈ recorded spectrogram
Estimate parameters
Rebuild spectrogram information
[Figure: two spectrograms, frequency (Hz) over time (seconds)]
NMF (Nonnegative Matrix Factorization)
Magnitude Spectrogram (M × N) ≈ Templates (M × K) · Activations (K × N)
Templates: Pitch + Timbre (“How does it sound”)
Activations: Onset time + Duration (“When does it sound”)
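The factorization above can be illustrated with a small sketch. The slides do not state which update rule is used; the code below assumes the standard multiplicative updates for the Euclidean distance. The symbols `V` (magnitude spectrogram), `W` (templates), and `H` (activations) are names chosen here for illustration.

```python
import numpy as np

def nmf(V, K, n_iter=200, eps=1e-12, seed=0):
    """Approximate nonnegative V (M x N) by W (M x K) @ H (K x N)."""
    rng = np.random.default_rng(seed)
    M, N = V.shape
    W = rng.random((M, K)) + eps   # templates: "how does it sound"
    H = rng.random((K, N)) + eps   # activations: "when does it sound"
    errors = []
    for _ in range(n_iter):
        # Multiplicative updates only rescale entries, so W and H stay nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(V - W @ H))
    return W, H, errors

# Toy "spectrogram": two spectral templates that are active at different times.
W_true = np.array([[1.0, 0.0], [0.5, 0.0], [0.0, 1.0], [0.0, 0.8]])
H_true = np.array([[1.0, 1.0, 0.0, 0.0, 1.0], [0.0, 0.0, 1.0, 1.0, 1.0]])
V = W_true @ H_true

W, H, errors = nmf(V, K=2)
```

The approximation error does not increase from one iteration to the next, but the learnt columns of `W` and rows of `H` come out in arbitrary order and scale when the initialization is random.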
NMF-Decomposition
Random initialization → No semantic meaning
[Figure: initialized templates (frequency over note number) and initialized activations (note number over time); learnt templates and learnt activations on the same axes]
NMF-Decomposition
Constrained initialization → NMF as refinement
Template constraint for p=55
Activation constraints for p=55
[Figure: initialized templates (frequency over note number) and initialized activations (note number over time); learnt templates and learnt activations; original ("Org") and model spectrograms]
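Why a constrained initialization survives the NMF iterations can be sketched in a few lines: multiplicative updates only rescale entries, so every entry initialized to zero stays zero, and NMF merely refines the entries the score allows. The code below is a sketch under assumptions: the harmonic template shape, the tolerances, the frequency and time grids, and the note list are all hypothetical, and a random matrix stands in for a real spectrogram.

```python
import numpy as np

def init_template(pitch, freqs, n_harmonics=8, tol_semitones=0.5):
    """Allow energy only near the harmonics of a MIDI pitch; zero elsewhere."""
    f0 = 440.0 * 2.0 ** ((pitch - 69) / 12.0)
    template = np.zeros(len(freqs))
    for h in range(1, n_harmonics + 1):
        lo = h * f0 * 2.0 ** (-tol_semitones / 12.0)
        hi = h * f0 * 2.0 ** (tol_semitones / 12.0)
        template[(freqs >= lo) & (freqs <= hi)] = 1.0 / h
    return template

def init_activation(notes, pitch, times, tol_sec=0.2):
    """Allow activity only around the score's note events for this pitch."""
    activation = np.zeros(len(times))
    for p, onset, duration in notes:
        if p == pitch:
            mask = (times >= onset - tol_sec) & (times <= onset + duration + tol_sec)
            activation[mask] = 1.0
    return activation

freqs = np.arange(0, 2000, 10.0)          # hypothetical frequency grid in Hz
times = np.arange(0, 4, 0.1)              # hypothetical time grid in seconds
notes = [(55, 0.5, 1.0), (62, 2.0, 1.0)]  # hypothetical (pitch, onset, duration)
pitches = sorted({p for p, _, _ in notes})

W = np.stack([init_template(p, freqs) for p in pitches], axis=1)
H = np.stack([init_activation(notes, p, times) for p in pitches], axis=0)
W_zero, H_zero = (W == 0), (H == 0)       # remember the constraint regions

V = np.random.default_rng(0).random((len(freqs), len(times)))  # stand-in spectrogram
eps = 1e-12
for _ in range(50):
    H *= (W.T @ V) / (W.T @ W @ H + eps)
    W *= (V @ H.T) / (W @ H @ H.T + eps)
```

After the iterations, each template column still belongs to one pitch and each activation row is still confined to the neighborhood of that pitch's notes, which is what gives the factors their semantic meaning.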
Score-Informed Audio Decomposition
[Figure: two spectrograms, frequency (Hertz) over time (seconds); marked frequencies 500, 523, 580 in the first and 500, 554, 580 in the second]
Application: Audio editing
Informed Drum-Sound Decomposition
Demo: https://www.audiolabs-erlangen.de/resources/MIR/2016-IEEE-TASLP-DrumSeparation
Literature: [Dittmar/Müller, IEEE/ACM-TASLP 2016]
Remix:
Loop Decomposition of EDM
Demo: https://www.audiolabs-erlangen.de/resources/MIR/2016-ISMIR-EMLoop
Literature: [López-Serrano/Dittmar/Müller, ISMIR 2016]
Decomposition Patterns Activations
Audio Mosaicing
Source signal: Bees
Target signal: Beatles – Let it be
Mosaic signal: Let it Bee
Demo: https://www.audiolabs-erlangen.de/resources/MIR/2015-ISMIR-LetItBee
Literature: [Driedger/Müller, ISMIR 2015]
NMF-Inspired Audio Mosaicing
Non-negative matrix factorization (NMF):
Non-negative matrix (fixed) ≈ Components (learned) · Activations (learned)
Proposed audio mosaicing approach:
Target’s spectrogram (fixed) ≈ Source’s spectrogram (fixed) · Activations (learned) = Mosaic’s spectrogram
Core idea: support the development of sparse diagonal activation structures
Iterative updates
Preserve temporal context
[Figure: target, source, and mosaic spectrograms (frequency over target or source time) and the activation matrix (source time over target time)]
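The mosaicing idea can be sketched as NMF with the templates held fixed: the source spectrogram plays the role of the components, and only the activation matrix is learned. The code below is a simplified sketch, not the published procedure. It assumes multiplicative updates for the activations and uses a plain diagonal summation kernel to favor diagonal structures; the function names, the kernel length, and the random toy data are hypothetical.

```python
import numpy as np

def diagonal_kernel(H, length=3):
    """Add to each entry its neighbors along the diagonal direction."""
    K, N = H.shape
    out = np.zeros_like(H)
    for s in range(-(length // 2), length // 2 + 1):
        k0, k1 = max(0, -s), min(K, K - s)
        n0, n1 = max(0, -s), min(N, N - s)
        out[k0:k1, n0:n1] += H[k0 + s:k1 + s, n0 + s:n1 + s]
    return out

def mosaic_activations(V_target, W_source, n_iter=50, eps=1e-12, seed=0):
    """Learn activations only; the source spectrogram stays fixed."""
    rng = np.random.default_rng(seed)
    H = rng.random((W_source.shape[1], V_target.shape[1])) + eps
    for _ in range(n_iter):
        # Favor runs of consecutive source frames (preserves temporal context).
        H = diagonal_kernel(H)
        # Multiplicative update for H with W_source held fixed.
        H *= (W_source.T @ V_target) / (W_source.T @ W_source @ H + eps)
    return H

# Toy data: the target is a contiguous excerpt of the source frames.
rng = np.random.default_rng(1)
W_source = rng.random((20, 12))           # frequency x source time
V_target = W_source[:, 3:9].copy()        # frequency x target time
W_before = W_source.copy()

H = mosaic_activations(V_target, W_source)
V_mosaic = W_source @ H                   # mosaic spectrogram
```

Because whole source frames are reused, a diagonal run in the activation matrix corresponds to playing a stretch of the source in its original order, which is why diagonal structures help the mosaic keep the source's timbre.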
Teaching
Academic training of students
Fundamental research
Summary
Music information retrieval
Audio decomposition techniques
Machine learning
Music applications & musicology
Multimedia scenarios
Web-based interfaces
Book: Fundamentals of Music Processing
Meinard Müller
Fundamentals of Music Processing
Audio, Analysis, Algorithms, Applications
483 p., 249 illus., hardcover
ISBN: 978-3-319-21944-8
Springer, 2015
Accompanying website: www.music-processing.de