What Is The Cocktail Party Problem?
The Cocktail Party Problem refers to the difficulty of picking out a single sound source, such as one voice, when multiple sound sources are present in a social setting. The varying frequencies of overlapping voices make it difficult to identify the main source.
Echoes compound the problem. Humans are good at focusing on one voice in such a setting; however, technology has struggled to replicate this ability.
How Is AI Helping To Solve The Problem?
Keith McElveen, founder and chief technology officer of Wave Sciences, has developed an AI that can analyze how sound bounces around a room before reaching the microphone or ear. He is using AI with complex mathematical algorithms to isolate different voices.
The company initially relied on array beamforming, which combines the signals from an array of microphones. However, feedback from potential commercial partners indicated that the system required too many microphones, and the cost involved in giving good results in many situations became a hurdle.
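To make the idea of array beamforming concrete, here is a minimal sketch of the classic delay-and-sum technique in plain Python. It is not Wave Sciences' algorithm; it only illustrates why microphone arrays can favor one talker. The signals, the four-microphone layout, and the arrival delays are all assumptions chosen so that the interfering voice cancels exactly, and the sketch ignores echoes entirely.

```python
import math

def target(t):
    """Hypothetical target voice: a slow sinusoid (period 50 samples)."""
    return math.sin(2 * math.pi * t / 50)

def interferer(t):
    """Hypothetical competing voice: a faster sinusoid (period 20 samples)."""
    return math.sin(2 * math.pi * t / 20)

def delay_and_sum(channels, delays):
    """Advance each channel by its steering delay (in samples), then average."""
    n = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[t + d] for ch, d in zip(channels, delays)) / len(channels)
        for t in range(n)
    ]

def rms(values):
    """Root-mean-square value of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))

# Assumed arrival delays (in samples) of each voice at four microphones.
target_delays = [0, 1, 2, 3]
interferer_delays = [0, 6, 12, 18]

# Each microphone hears both voices, each shifted by its own delay.
n_samples = 400
channels = [
    [target(t - dt) + interferer(t - di) for t in range(n_samples)]
    for dt, di in zip(target_delays, interferer_delays)
]

# Steer the array toward the target by undoing the target's delays.
enhanced = delay_and_sum(channels, target_delays)

# Compare one raw microphone and the beamformer output against the clean target.
clean = [target(t) for t in range(len(enhanced))]
error_before = rms([channels[0][t] - clean[t] for t in range(len(clean))])
error_after = rms([enhanced[t] - clean[t] for t in range(len(clean))])
print(f"error with one microphone: {error_before:.3f}")
print(f"error after delay-and-sum: {error_after:.3e}")
```

After steering, the target voice lines up across all four channels and adds coherently, while the copies of the interfering voice end up a quarter period apart and average out. In a real room the delays are not known in advance and echoes arrive from many directions, which is part of why practical arrays need so many microphones.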
McElveen founded Wave Sciences in 2009, hoping to develop a technology that could separate overlapping voices. After 10 years of internally funded research, the company filed a patent application in September 2019. The company’s solution was an AI that can analyze how sound bounces around a room before it reaches the microphone. The effect is similar to the way a camera lens brings one subject into focus while leaving the rest of the scene blurred.
The technology had its first real-world forensic use in a US murder case. The FBI arranged to trick a family suspected in the case into believing they were being blackmailed for their involvement. The court authorized the use of Wave Sciences’ algorithm, which meant that the audio went from being inadmissible to a pivotal piece of evidence. Since then, other government laboratories, including in the UK, have put it through a battery of tests. The company is now marketing the technology to the US military, which has used it to analyze sonar signals.
The technology could also have applications in hostage negotiations and suicide scenarios. Late last year, the company released a software application using its learning algorithm. Eventually, the company aims to introduce tailored versions of its product for use in audio recording kits, voice interfaces for cars, and smart speakers.
AI is already used elsewhere in audio analysis. Areas of forensics such as voice pattern analysis and the detection of manipulations use it for fraud detection, and Bosch’s SoundSee technology uses audio signal processing algorithms to analyze a motor’s sound and predict a malfunction before it happens.
In more recent tests, the Wave Sciences algorithm has reportedly performed on par with, or better than, human ears. Furthermore, only two microphones are required.