Music Decomposition and Synthesis of Musical Sound with Neural Networks


James Owers

We learn an instrument synthesizer by designing a network which generates sound in a similar way to an additive synthesizer. An instrument and pitch are provided as input, and a sound wave is generated as output. The instrument information is passed through an encoding layer; we explore how movement in this latent space corresponds to changes in the output, potentially yielding new instrument sounds. Additionally, we explore how changing the synthesizer parameters affects the position in latent space. This work contrasts with NSynth (Engel et al., 2017) in that it deliberately restricts the model and exploits what is known about combining sinusoids. This constraint allows the model to serve as an artist's tool, mapping opaque synthesizer parameters to a more fluid instrument space: for example, making a sound more brassy would correspond to moving towards the location of brass instruments in latent space. It is not immediately obvious how a non-expert would achieve this by manipulating the additive synthesizer parameters directly.
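The additive-synthesis principle the abstract refers to can be illustrated with a short sketch: a tone is built by summing harmonically related sinusoids with per-harmonic amplitudes. The function name, sample rate, and amplitude values below are illustrative assumptions, not parameters from the project itself.

```python
import numpy as np

def additive_synth(f0, amplitudes, duration=1.0, sr=16000):
    """Sum harmonically related sinusoids into one waveform.

    f0: fundamental frequency in Hz.
    amplitudes: amplitude of each harmonic (index k has frequency k * f0).
    """
    t = np.arange(int(duration * sr)) / sr
    wave = np.zeros_like(t)
    for k, a in enumerate(amplitudes, start=1):
        wave += a * np.sin(2 * np.pi * k * f0 * t)
    # Normalise to [-1, 1] so the summed partials do not clip.
    return wave / np.max(np.abs(wave))

# A 440 Hz tone with three decaying harmonics.
tone = additive_synth(440.0, amplitudes=[1.0, 0.5, 0.25])
```

In this framing, the synthesizer's "opaque parameters" are the per-harmonic amplitudes; the proposed model instead exposes a learned instrument embedding, so timbral changes become movements in latent space rather than manual edits to each amplitude.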
Supervisors: Amos Storkey & Mark Steedman