If you have ever heard of ‘deepfakes’, you are probably aware of the sinister side of deepfake video editing software. It can now use AI to manipulate videos and put words in people’s mouths.
In the latest advance in video manipulation, researchers have worked out how to turn written text into a realistic video of someone saying those very words.
We have seen this kind of ‘deepfake’ trickery before, where videos of well-known individuals are run through AI algorithms to make them say just about anything. But the technology keeps improving, and the fakes are becoming harder to identify.
The new program lets users edit a video’s written transcript and then applies those edits to the footage through layers of digital manipulation. It literally puts words in people’s mouths.
According to the researchers, this could be used to fix small flaws in an acting performance during post-production, but they also recognize that their creation could be used for more sinister purposes.
“This technology is really about better storytelling,” says Stanford University computer scientist Ohad Fried. “Visually, it’s seamless. Nothing needs to be re-recorded.”
Using machine learning methods, the newly developed algorithm matches phrases in a transcript with the movements of a talking head. So far it has only been tested on videos showing people from the shoulders up. It takes at least 40 minutes of sample footage to create a fake that looks realistic.
To make the speech seem more natural, an intelligent smoothing mechanism is applied. The resulting 3D model then goes through a process called neural rendering, which uses neural networks to bridge the gap between the 3D model and the actual face.
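As a rough illustration of the matching and smoothing ideas described above, here is a toy Python sketch. It is not the researchers' method: the `library` of per-word mouth-openness values, the function names, and the moving-average smoother are all hypothetical stand-ins chosen for simplicity, and the real system works on far richer data than a single number per frame.

```python
from typing import Dict, List


def assemble(edited_words: List[str], library: Dict[str, List[float]]) -> List[float]:
    """Build a per-frame mouth track for the edited transcript by stitching
    together snippets recorded when the speaker said each word.

    `library` is a hypothetical lookup: word -> mouth-openness value per frame.
    """
    track: List[float] = []
    for word in edited_words:
        if word not in library:
            raise KeyError(f"no sample footage for word: {word!r}")
        track.extend(library[word])
    return track


def smooth(track: List[float], window: int = 3) -> List[float]:
    """Centered moving average that shrinks at the edges, used here as a
    simple stand-in for a smoothing step across splice points."""
    half = window // 2
    smoothed: List[float] = []
    for i in range(len(track)):
        lo = max(0, i - half)
        hi = min(len(track), i + half + 1)
        smoothed.append(sum(track[lo:hi]) / (hi - lo))
    return smoothed


if __name__ == "__main__":
    # Made-up sample values purely for demonstration.
    library = {"hello": [0.0, 1.0], "world": [1.0, 0.0]}
    raw = assemble(["hello", "world"], library)
    print(raw)
    print(smooth(raw))
```

The sketch also hints at why a lot of sample footage is needed: any word missing from the lookup cannot be assembled at all.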
The final AI-produced videos were rated as “real” nearly 60 percent of the time when shown to 138 volunteers.
What we have here, in its most developed form, is the ability to edit a talking-head video almost as simply as you would edit a Word document. As time goes on, the technology will only become more efficient and precise. It can also be used with synthesized voices and language-to-language speech translation.
As we’ve seen over the past couple of years, various projects can now create realistic-looking talking heads reading from a script, and soon the individuals on video may not even have to exist. Given enough training data, AI can generate them too.
So, what about the technology’s potential misuse?
The researchers say they have taken this into account, although their solutions may not sound entirely convincing to everyone. They suggest that watermarking systems, improved forensic analysis, and better education about video manipulation might help viewers develop a healthy skepticism about the authenticity of video clips.
To support their case, they point out that we have already learned to live with this in photo editing, knowing that images can be manipulated and faked to a very high standard.
“Unfortunately, such techniques are always going to attract bad actors,” Fried says. “But the fight is worth it considering the many applications that allow creative video editing and content development.”