How Criminals Use AI to Mimic Your Boss’s Voice
...and demand $243,000 in fraudulent wire transfers
Facts:
The FBI says Business Email Compromise (BEC) scams netted thieves more than $26 billion between 2016 and 2019 alone.
In a common version of the BEC scam, an employee in a company’s accounting department wires money somewhere, based on what appears to be a legitimate email from the CEO.
To verify the legitimacy of the request, employees are advised to make a phone call to their boss, rather than relying on an email.
But as it turns out, you may not be able to trust voice instructions either.
A Cautionary Tale
Imagine initiating a wire transfer for hundreds of thousands of dollars, based on a call from your boss. You were sure it was him because you recognized his voice. However, you later discover that you were talking to a machine and the money is nowhere to be found.
One company executive doesn’t have to imagine this happening...because it already did.
When the CEO of a U.K.-based energy firm received a call from his boss, the chief executive of the firm's German parent company, he didn't think much of it. The caller's request was simple but time sensitive: send a wire transfer of $243,000 to one of their suppliers in Hungary within the hour. An email containing the wiring instructions soon followed. So the CEO fulfilled the order and went on with his day.
After the transfer went through, the CEO’s boss called him a second time, to say that he had transferred funds to the energy firm’s bank account as reimbursement.
But a few hours later, his boss called again. This time, he asked for a second wire transfer to be initiated. However, the CEO noticed that, instead of a German number, the call had come from Austria.
As the reimbursement funds had not yet arrived — and the phone number did not match up with anything on file — the CEO grew suspicious and did not comply with this second request.
What he wasn’t aware of at the time was that he had just followed the instructions of a scammer, who had used AI-powered voice technology to impersonate his German boss. By the time he realized the full extent of what had just happened, the money had already been moved out of the Hungarian bank account to Mexico. And that’s where the trail ended.
The Larger Scheme At Play
While it has long been predicted that criminals would one day use artificial intelligence (AI) to automate their cybercrimes, companies are still trying to navigate the challenges created by this new landscape. After all, traditional cybersecurity solutions are meant to keep hackers locked out of corporate networks, not to detect voice-spoofing attacks. That gap is what allowed whoever was behind this incident to use AI-based voice-mimicking software to impersonate the German executive's voice, right down to his slight German accent and the natural rhythm of his speech. Needless to say, it was very convincing.
And it’s among the first of its kind in the world of cybercrime.
Traditionally, hackers have relied on forging and spoofing emails to commit Business Email Compromise (BEC).
Business Email Compromise (n.): A type of attack in which cybercriminals target company email systems. By spoofing a corporate email account, these bad actors are able to impersonate high-ranking executives and request wire transfers into their own bank accounts.
However, the ability to use synthetic audio — also known as deepfake audio — provides attackers with a powerful new tool to optimize the effectiveness of this malicious activity. Being able to call employees up on the phone and use this technology to impersonate their bosses makes payment transfer requests seem more authentic. It also allows attackers to create more pressure to comply, elevating the urgency of requests and employing other social engineering tactics.
With a wide range of voice-synthesis software solutions to choose from, cybercriminals can produce convincing copies of their targets’ voices to exploit anything from business deals to family phone calls to presidential campaigns.
How It Works
Deepfake audio relies on machine learning algorithms to mimic the voice of a target. By processing a person's voice and breaking it down into sounds and syllables, the system can rearrange these components to form new phrases, all with speech patterns, pitch, and tone similar to the target's. The AI then uses generative adversarial networks (GANs), in which one network creates a fake and another tries to identify it as fake. By repeating this process again and again, both networks learn from each attempt and adapt accordingly.
Of course, for this to work properly, an attacker needs to feed the algorithm a variety of voice clips of the target. This is why CEOs make such appealing targets: their voices are recorded across a number of publicly available platforms, such as media appearances, speeches and presentations, YouTube videos, and so on. And all of this can be collected to build voice replicas.
Once the deepfake audio profile is robust enough, it can be paired with specialized text-to-speech software, letting an attacker type out scripts for the fake voice to read aloud.
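The "break a voice into sounds, then rearrange them" idea above can be illustrated with a toy sketch. This is not how production voice-cloning systems work (those train acoustic models on hours of audio); here, each "syllable" is just a short list of numbers standing in for a recorded clip, and a new phrase is assembled by concatenation:

```python
# Hypothetical bank of "recorded" syllables harvested from public audio.
# Each value is a stand-in for an audio clip (a few samples).
syllable_bank = {
    "trans": [0.1, 0.3, 0.2],
    "fer":   [0.4, 0.1],
    "now":   [0.5, 0.6, 0.5],
}

def synthesize(phrase):
    """Concatenate stored clips for each syllable to form a new utterance."""
    audio = []
    for syllable in phrase:
        if syllable not in syllable_bank:
            raise KeyError(f"no recording for syllable: {syllable}")
        audio.extend(syllable_bank[syllable])
    return audio

# A phrase the target never actually said, built from pieces of his speech:
fake_audio = synthesize(["trans", "fer", "now"])
print(len(fake_audio))  # 8 samples in this toy example
```

In a real attack, the GAN training loop described above smooths the seams between these pieces until the result is hard to distinguish from genuine speech.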
(You can see where this is going.)
Granted, it can take a considerable amount of time and resources to create the perfect deepfaked voice, and some of these attempts would fail to fool any listener sitting in a quiet room. But attackers seem to have worked around this by masking imperfections with background noise, giving the impression that the fake CEO is calling from an area with spotty cellular service or heavy street traffic.
And here’s the real kicker: all of this can be done using commercially available, voice-generating software, some of which is free to use and requires little sophistication, speech data, or computing power. In other words, you don’t need a Ph.D. in mathematics to create one of these fake voices.
In addition to the impressive capabilities of this technology, attackers will also use tried-and-true social engineering tactics — during the BEC call — to boost their odds of success. These include using time-sensitive deadlines and exercising authority to evoke a mix of fear and subservience. (After all, the boss is the head honcho.) Another crucial tactic involves targeting someone at the company who is authorized to initiate wire transfers and fulfill payment requests, without having to request permission from anyone else.
By creating a stressful, high-pressure situation, the attackers quickly erode the employee's ability to question themselves and the nature of the call, enabling the thieves to effectively pull off their heist.
Long story short, the scam and the end goals are the same as in traditional Business Email Compromise attacks. The only difference is the medium by which the plot is carried out.
Take Steps to Protect Yourself
While AI developers are working hard to build systems that can detect and combat deepfake audio, the technology itself is rapidly evolving. In the interim, protecting your organization against these kinds of attacks comes down to awareness and protocol.
Educate employees on how BEC scams work and call attention to the possibility that faked audio can be used to simulate a call from a superior. Doing this can motivate an employee to question an unusual payment or network access request — potentially saving your company from falling victim to these attacks.
Implement security protocols such as filtering and authentication frameworks wherever possible. Standard payment protocols that require multi-factor authentication or calling the authorizing party back on a number known to be authentic can go a long way in stopping cybercriminals.
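The call-back rule above can be sketched as a simple check: never approve a wire request using contact details supplied by the requester; verify against the number already on file, on a call placed by your own staff. All names and numbers below are invented for illustration:

```python
# Numbers on file for people authorized to request wire transfers.
# (Hypothetical directory; a real one would live in a vetted system.)
known_numbers = {
    "ceo@parentfirm.example": "+49-30-5550100",
}

def should_pay(requester_email, caller_number, confirmed_by_callback):
    """Approve only if the caller's number matches the one on file AND
    the request was confirmed on a call that *we* placed to that number."""
    on_file = known_numbers.get(requester_email)
    return on_file == caller_number and confirmed_by_callback

# A request arrives from an Austrian number claiming to be the boss:
print(should_pay("ceo@parentfirm.example", "+43-1-5550199", False))  # False
# The same request, confirmed by calling back the number on file:
print(should_pay("ceo@parentfirm.example", "+49-30-5550100", True))  # True
```

Note that in the incident described here, the mismatched Austrian number was exactly the signal that finally made the CEO refuse the second transfer.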
With any luck, future preventative solutions will include upper-level executives having more control over who can view their public audio and organizations implementing blockchain technology with voice-over-IP (VoIP) calls to authenticate the caller.
Until then, the best approach is to remain vigilant and verify wire transfer requests to the furthest extent possible.
Final Notes and Updates
While the CEO of the energy firm thought the initial transfer request was strange, the voice sounded so much like his boss that he felt he had to comply. But after the second request for payment, he decided to call his boss directly. As the two discussed the situation over the phone, the thieves called back a fourth time.
That’s when the entire ruse unraveled: with the fake German boss demanding to speak to the CEO while he was on the phone with his real German boss.
Fortunately, the firm’s insurance company covered their losses.
Unfortunately, the current whereabouts of the original $243,000 remain unknown...as do the individuals behind the attack.
In the wider scope of things, developers of voice-mimicking technology continue to emphasize the advantages of its applications, such as humanizing automated phone systems and helping people who have lost the ability to speak regain a voice. However, AI’s unregulated growth has emboldened criminals to expand their operations and go after bigger fish.
It’s no longer up for debate: the era of criminals exploiting artificial intelligence has arrived.
Useful Resources
To report a BEC crime:
https://www.ic3.gov/media/2018/180611.aspx
To report data theft and protect employees:
Editor’s Note: Have you been affected by fraud? Most people have, in some form or another. If you have a story you would like to share, we’re sure our readers would benefit from hearing it. Please send an email to editor@theconartist.pub detailing your experience, and we will be in touch. Your privacy and any wishes of anonymity will be respected.
Thanks for reading! If you haven’t already, consider joining our community to receive in-depth exposés on the latest scams, hoaxes, and other forms of fraud.