Music making is increasingly digitized here in 2020, but some analog audio effects are still very difficult to reproduce in software. One of them is the kind of screeching guitar distortion favored by rock gods everywhere. Until now, these amplifier-driven effects have been next to impossible to re-create convincingly in digital form.
That’s now changed thanks to the work of researchers in the department of signal processing and acoustics at Finland’s Aalto University. Using deep learning artificial intelligence (A.I.), they have created a neural network for guitar distortion modeling that, for the first time, can fool blind-test listeners into thinking it’s the genuine article. Think of it like a Turing Test, cranked all the way up to a Spinal Tap-style 11.
“It has been the general belief of audio researchers for decades that the accurate imitation of the distorted sound of tube guitar amplifiers is very challenging,” Professor Vesa Välimäki told Digital Trends. “One reason is that the distortion is related to dynamic nonlinear behavior, which is known to be hard to simulate even theoretically. Another reason may be that distorted guitar sounds are usually quite prominent in music, so it appears difficult to hide any problems there; all inaccuracies will be very noticeable.”
To train the neural network to re-create a variety of distortion effects, all that is needed is a few minutes of audio recorded from the target amplifier. The researchers recorded “clean” audio from an electric guitar in an anechoic chamber, then ran it through an amplifier. This provided both an input, the unprocessed guitar sound, and the corresponding “target” output from the amplifier.
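In code, assembling those paired recordings might look something like the minimal Python sketch below (using torchaudio). The file names, mono conversion, and one-second segment length are illustrative assumptions, not details taken from the study.

```python
# A minimal sketch of how the paired training data could be assembled.
# File names and segment length are hypothetical; the source only says a few
# minutes of clean guitar audio plus the corresponding amplifier output are used.
import torchaudio

def load_training_pairs(clean_path="clean_guitar.wav",
                        amp_path="amp_output.wav",
                        segment_len=22050):
    """Slice the clean (input) and amplified (target) recordings into aligned segments."""
    clean, sr = torchaudio.load(clean_path)   # unprocessed guitar signal
    target, _ = torchaudio.load(amp_path)     # same take, run through the amplifier

    # Fold to mono and trim both recordings to the same length so segments stay aligned.
    clean = clean.mean(dim=0, keepdim=True)
    target = target.mean(dim=0, keepdim=True)
    n = min(clean.shape[-1], target.shape[-1])
    clean, target = clean[..., :n], target[..., :n]

    # Split into equal-length (input, target) segment pairs for training.
    segments = []
    for start in range(0, n - segment_len + 1, segment_len):
        segments.append((clean[..., start:start + segment_len],
                         target[..., start:start + segment_len]))
    return segments, sr
```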
“Training is done by feeding the neural network a short segment of clean guitar audio, and comparing the network’s output to the ‘target’ amplifier output,” Alec Wright, a doctoral student focused on audio processing using deep learning, told Digital Trends. “This comparison is done in the ‘loss function,’ which is simply an equation that represents how far the neural network output is from the target output, or, how ‘wrong’ the neural network model’s prediction was. The key is a process called ‘gradient descent,’ where you calculate how to adjust the neural network’s parameters very slightly, so that the neural network’s prediction is slightly closer to the target amplifier’s output. This process is then repeated thousands of times — or sometimes much more — until the neural network’s output stops improving.”
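Wright’s description maps onto a standard supervised training loop: predict, measure the error against the target amplifier output, and nudge the parameters by gradient descent. The hedged PyTorch sketch below reuses the segments from the earlier example; the tiny convolutional model, mean-squared-error loss, Adam optimizer, and iteration count are stand-ins, not the researchers’ actual choices.

```python
# A hedged sketch of the training loop described above: compare the network's
# output to the target amplifier output with a loss function, then adjust the
# parameters slightly via gradient descent, repeated many times.
import torch
import torch.nn as nn

model = nn.Sequential(              # placeholder for the distortion-modeling network
    nn.Conv1d(1, 16, kernel_size=3, padding=1),
    nn.Tanh(),
    nn.Conv1d(16, 1, kernel_size=3, padding=1),
)
loss_fn = nn.MSELoss()              # the "loss function": how wrong the prediction is
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def train(segments, epochs=1000):
    """Repeat the compare-and-adjust step until the output stops improving."""
    for _ in range(epochs):
        for clean_seg, target_seg in segments:
            pred = model(clean_seg.unsqueeze(0))            # network's guess at the amp output
            loss = loss_fn(pred, target_seg.unsqueeze(0))   # distance from the target output
            optimizer.zero_grad()
            loss.backward()                                 # gradients for each parameter
            optimizer.step()                                # small adjustment (gradient descent)
```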
You can check out a demo of the A.I. in action at research.spa.aalto.fi/