Every day, most of us use multiple screens to get our news, keep in touch with other people, learn something new, or be entertained.
How many screens do you look at during a day, and for how long?
Now for the critical question:
How much of the information that you see on these screens can you believe?
You’ve probably heard about deepfakes or “fake news” recently. But have you actually seen or heard what this technology can do?
Imagine a video posted online of a Fortune 100 company’s CEO announcing significant layoffs or a disastrous financial report.
What would that do to the employees’ morale and productivity?
If it were a publicly-traded company, what would happen to the stock price?
For a different example, think about the potential reaction if a video of Iran’s leader threatening immediate nuclear war were published on the Internet.
What could the possible outcomes be?
Later, you discover that both of these examples were fakes using artificial intelligence and that the videos’ messages weren’t real.
How much damage could be done from the moment when the video is published until it’s found to be fake?
Now let me give you another example.
Imagine a scenario where an employee receives a call from her boss or even the CFO.
She recognizes the voice, which gives her detailed wiring instructions and tells her to wire funds to an important client immediately.
Except the caller isn’t actually the person they claim to be, and by the time the truth is discovered, the funds have disappeared.
All of these examples are possible today with what is known as deepfake technology.
What Is It?
The data you see on screens every day is already manipulated more than you know, but new technologies have taken the potential for crime and even more dangerous uses to a new level.
Some of the first deepfake videos were pornography, where a celebrity’s head replaced an actor’s head.
There have been several deepfake videos of politicians, and I’ll show you some examples.
There are many new ways that this technology can be integrated with artificial intelligence to do other things. One example is to create a digital clone of your voice and then edit “your” voice to make it say almost anything.
Artificial intelligence is being used to create human-like digital “people.” This technology can simulate customer service agents, teachers, or digital spokespersons.
Several of these systems can use your device’s camera and microphone to interact with you directly and change their response based on your facial expression, voice tone, and soon, perhaps even your emotions.
But showing you what’s already possible is better than telling you, so let’s get started.
To watch the full video for each of the examples below, just click on the image.
Face2Face is a research program from the Technical University of Munich. Using a consumer webcam, researchers can manipulate the facial expressions of a target speaking in a YouTube video. The software then renders a new synthesized face of the target, producing an altered video.
The lab is also working on algorithms to help detect fake videos.
You can find more information on the lab’s website at https://www.tum.de/nc/en/about-tum/news/press-releases/details/35502/.
Talking Head Models from Still Photos
Other research, published on arXiv, uses artificial intelligence to create realistic “talking heads” from still photos. The accompanying video shows several animations generated from a variety of images, and even brings the Mona Lisa painting “to life.”
See more details about their research at https://arxiv.org/abs/1905.08233v1
Text-based Editing of Talking-head Video
This project shows examples of manipulating text from a video to alter the words that appear to be spoken by the subject in the video.
For more information, go to the project’s webpage at https://www.ohadf.com/projects/text-based-editing/
Lyrebird is a division of Descript that uses artificial intelligence to create a digital clone of a person’s voice from a small audio clip. With the clone, anyone with a text editor can create a statement that sounds like the target’s voice.
There is also a “do-it-yourself” demo to change the sample text and hear the altered output.
Learn more about Lyrebird and see the demonstrations on Lyrebird’s website at https://www.descript.com/lyrebird.
CereProc is a very advanced text-to-speech service that allows users to select from a wide variety of voices or clone their own. The website has a fascinating demo where you can type in your text and choose the voice you wish to hear speaking.
Another of the company’s products named “CereVoiceMe” uses artificial intelligence to create a clone of any person’s voice that can then be used to convert text into speech that sounds just like the original subject. One example on the website is a former radio host who suffered the loss of his speech due to illness. CereProc’s technology allowed him to clone his voice.
Jordan Peele Manipulates Video of Former President Obama
A YouTube clip from a Good Morning America broadcast in 2018 demonstrates how artificial intelligence can be misused. In the clip, comedian Jordan Peele manipulates the voice of former President Barack Obama and provides a warning about this technology.
This Person Does Not Exist
This website displays a different realistic photo of a non-existent person every time you visit the site. Artificial intelligence takes features from its image database and combines them to create new faces of any age, gender, and ethnicity. Visit the website five times, and you will see five different faces…of people who don’t exist.
Soul Machines, a New Zealand firm, has created what they describe as a “Digital Brain,” which uses “Embodied Cognitive User Experience” to create a “Digital Person.”
To demo their product, they’ll ask your permission to access your computer’s camera and microphone. You then have a choice to speak to either “Sam” or “Roman,” and provide your email address. You’ll then receive an invitation to carry on a conversation with whichever digital person you choose.
The AI is capable of using your device’s camera and microphone to not only converse with you but also to read your facial expressions and interpret your tone of voice in order to change how it responds.
Yes, there are times when the voice still sounds too much like a machine. You can throw the AI off with random questions or statements. Still, I think you’ll be surprised at how capable the technology already is.
The image above is from one of my previous conversations with “Sam.”
What’s Coming Next?
So now you can see how technology can be used to change reality…or at least what you perceive the truth to be.
Think about how each of these examples might be used to change your perception of what actually happened.
What evidence can you believe?
If someone alters a video or audio recording, how would you know?
For example, could this technology be used to create a fake emergency alert?
What about destroying someone’s marriage or reputation with a fake sex video?
Can you imagine the possible scandal over a fake video or audio recording of a political candidate shortly before an election?
When you consider how something like this could quickly go viral on social media, you can appreciate how dangerous this technology can be.
You’ve all heard the old saying that “Seeing is believing,” but the truth is that believing is seeing.
Studies have shown that humans tend to seek out information that supports what we want to believe and to ignore the rest.
With audio, it’s an even more difficult problem.
Research says that our brains have a hard time detecting the differences between real and artificial voices.
It’s easier for us to pick up on a fake image than to recognize an artificial voice.
Hacking that human tendency gives malicious people, criminals, and even nation-states a lot of power to control what we believe to be true.
What Can We Do?
First of all, we need to ask whether deepfakes are legal.
Certainly, if someone uses deepfake technology to commit a crime, then existing law can apply. But what about just the act of creating a deepfake?
It’s an interesting and problematic question currently unresolved in law, at least in the United States.
We need to consider the First Amendment of the U.S. Constitution, intellectual property law, privacy law, and the revenge-porn statutes that many U.S. states have recently enacted.
Now complicate the possibilities with deepfake “diplomacy manipulation.”
Think about the damage that could be done to a nation’s foreign policy or ongoing trade negotiations.
Many organizations are working on ways to detect deepfake videos, photos, and audio.
Some of these involve highly technical analysis of the data, but conducting that kind of analysis in real time may be challenging.
And, as we mentioned before, a lot can go wrong during the time between the production of a deepfake and the time when it’s found to be false.
Another possible answer to countering deepfakes would be to use blockchain technology to validate the original, whether it’s a photograph, a video, or an audio recording.
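The core of that validation idea is simple: compute a cryptographic fingerprint of the media file at publication time and record it somewhere tamper-resistant (a blockchain being one option). Anyone can later re-hash the file they received and compare. As a minimal sketch, assuming the digest was recorded when the original was published (the function names here are illustrative, not from any real verification service):

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest of the media bytes.

    In a real system, this digest would be recorded on a
    tamper-resistant ledger at publication time.
    """
    return hashlib.sha256(data).hexdigest()

def is_unaltered(data: bytes, recorded_digest: str) -> bool:
    """Check received media against the digest recorded for the original."""
    return fingerprint(data) == recorded_digest

# Publication time: hash the original and record the digest.
original = b"...original video bytes..."
digest = fingerprint(original)

# Later: any single-byte change to the file produces a different digest.
print(is_unaltered(original, digest))                      # True
print(is_unaltered(b"...tampered video bytes...", digest)) # False
```

Note that this only proves a file matches a previously recorded original; it says nothing about content that was never registered, which is exactly the gap the next question raises.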
But what about a live-streaming video, a live telephone call, or the person you think you see on a screen?
Justin Hendrix, the executive director of NYC Media Lab, says: “In the next two, three, four years we’re going to have to plan for hobbyist propagandists who can make a fortune by creating highly realistic, photo-realistic simulations. And should those attempts work, and people come to suspect that there’s no underlying reality to media artifacts of any kind, then we’re in a really difficult place. It’ll only take a couple of big hoaxes to really convince the public that nothing’s real.”