Some of you may have recently received a call from Google asking for business updates. The exchange couldn’t be more commonplace, but the fact that a computerized voice carried out the conversation broke new ground. The caller’s voice wasn’t just a pre-recorded message being played back: it reacted and adapted its script to the questions and answers of the person on the line.
The intonation wasn’t flat, with the choppy quality we call “robotic.” It was light, breezy, and a little sing-songy: the average intonation of a millennial inquiring over the phone. It featured speech disfluencies, filler sounds like “uh,” “hmm,” or the hit with the audience: “Mm-hmm.” It was a showcase of the newest advancement in the AI field: Google Duplex, an extension of Google Assistant that can make calls on behalf of humans, such as booking restaurant reservations, in a natural-sounding voice.
If you have been following the news about AI development, you may have heard about Google’s project, “Google Duplex.” If you haven’t, that’s all right: I will explain in depth what this revolutionary yet controversial AI actually is and what it might mean for AI development and human-computer interaction. Google Duplex is a tool developed by Google that uses Artificial Intelligence to accomplish real-world tasks over the phone. The interesting part is that this AI is built to replicate human interaction as closely as possible, meaning that it does many of the things that happen in normal human speech, like adding “umms” and “ahhs” to the conversation.
Currently Targeted at Specific Tasks
Google Duplex is currently directed towards completing specific tasks, such as scheduling appointments. While doing so, the system tries to make the conversation feel as natural as possible, allowing people to speak normally, as though they were interacting with another person and not with an AI. How this works is the interesting part. For Google Duplex to work as intended, the system must be able to do several things: understand what the person on the other end of the phone is saying, figure out what it means, and decide whether what the person is offering will work for you. How the system does this is even more interesting.
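To make the “understand, interpret, decide” steps a little more concrete, here is a minimal sketch in Python. It is not how Google implements Duplex; the function names, the regular expression, and the user’s acceptable time window are all assumptions made up for illustration. It takes an already-transcribed sentence, pulls out an offered time, and decides whether to accept it, counter it, or ask again.

```python
import re
from datetime import time
from typing import Optional, Tuple

# Hypothetical user preference: any slot between 10:00 and 12:00 is fine.
ACCEPTABLE_WINDOW: Tuple[time, time] = (time(10, 0), time(12, 0))


def extract_offered_time(utterance: str) -> Optional[time]:
    """Find a clock time such as '10:30 am' or '1 pm' in transcribed text."""
    match = re.search(r"\b(\d{1,2})(?::(\d{2}))?\s*(am|pm)\b", utterance.lower())
    if match is None:
        return None
    hour = int(match.group(1)) % 12          # 12 am -> 0, 12 pm handled below
    minute = int(match.group(2) or 0)
    if minute > 59:
        return None                          # not a valid clock time
    if match.group(3) == "pm":
        hour += 12
    return time(hour, minute)


def decide(utterance: str, window: Tuple[time, time] = ACCEPTABLE_WINDOW) -> str:
    """Return 'accept', 'counter', or 'clarify' for one reply from the business."""
    offered = extract_offered_time(utterance)
    if offered is None:
        return "clarify"                     # nothing usable was understood
    start, end = window
    return "accept" if start <= offered <= end else "counter"


if __name__ == "__main__":
    for reply in ("We have a slot at 10:30 am",
                  "The earliest is 1 pm",
                  "Sorry, who is this?"):
        print(reply, "->", decide(reply))
```

A real system would work on audio rather than clean text and would handle far more phrasings, but the three-way outcome mirrors the steps described above.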
Google Duplex Development
Development was initially announced in 2016. The first trials in the development stage featured the metallic-sounding voice that everyone associates with robocalls, and the reactions were predictably unfavorable. Businesses did not want to deal with such calls, and most of the time, they hung up. Hence, to gain the trust of the humans conversing with Duplex, developers employed speech disfluencies, non-lexical sounds such as “Uhm” and “hmm.” Other innovations in the development of humanized speech were elaborations (“for Friday next week, the 18th.”), syncs (“can you hear me?”), and interruptions (“the number is 212-” “sorry, can you start over?”).
A lot of emphasis was also placed on speech dialects and accents.
The Google Duplex system was trained in real time, with its behavior adjusted as needed. The company also chose not to rush the fully automated version because users and business owners needed time to adapt. Once the system reaches the desired level of competency and the desired level of trust with businesses and users, it will operate on its own without supervision.
The technology was initially rolled out in a trial phase to a limited number of users due to the caution inherent in introducing any new technology to a larger society.
The Google Duplex feature is another step in making Google Assistant the unlimited, most convenient, most-attuned-to-humans assistant tool for a person’s everyday needs, or as CEO Sundar Pichai said in 2016: “Think of it as building your own individual Google.”
Google Duplex Demo & Rollout
Google showed a demo of this new system at Google I/O 2018, where it became the talk of the technology world. One of those who covered it was Jerry Hildenbrand of androidcentral.com, who wrote an article about Google’s demo, “What is Google Duplex?” During the demo, Google had the new Assistant call a hair salon to set up an appointment; the person on the other end of the phone seems to have thought they were talking to an actual person instead of an AI.
As Hildenbrand himself puts it, “The person taking the appointment didn’t seem to know they were talking to a computer because it didn’t sound like a computer. Not even a little bit.”
From this, we can get a sense of how impressive this AI might be: Hildenbrand says it didn’t sound like a computer, not even a little. This in itself is remarkable for the world of Artificial Intelligence (AI). Natural-sounding conversation has been one of the biggest goals of human-computer interaction, even if the system can only do basic things like setting up appointments. Google Duplex was initially available only to a limited number of users.
By spring 2019, it became available to a large number of Android and iPhone users. Booking restaurant reservations, making hair appointments, and checking holiday hours were the only options available. At this point, Duplex was still dependent on human supervision: 25% of the calls made by Duplex started with a human from a call center, and 15% of those initiated by the AI had to be taken over by call center employees to complete the conversation.
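Read literally, these two figures imply that roughly 64% of calls were handled by the AI from start to finish. The short calculation below shows where that number comes from. It assumes the 15% handoff rate applies only to the 75% of calls the AI started; the function name is made up for this illustration.

```python
def automation_breakdown(human_started: float, handoff_rate: float) -> dict:
    """Split calls into AI-started, fully automated, and human-involved shares."""
    ai_started = 1.0 - human_started                      # calls the AI began
    fully_automated = ai_started * (1.0 - handoff_rate)   # ...and also finished
    return {
        "ai_started": ai_started,
        "fully_automated": fully_automated,
        "human_involved": 1.0 - fully_automated,
    }


if __name__ == "__main__":
    # Figures quoted above: 25% human-started, 15% of AI-started calls handed off.
    shares = automation_breakdown(human_started=0.25, handoff_rate=0.15)
    for name, share in shares.items():
        print(f"{name}: {share:.2%}")
```

In other words, 0.75 × 0.85 = 0.6375, so a human was involved at some point in a little over a third of calls.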
In November 2019, Google rolled out the option to use Duplex to purchase movie tickets online. This service, called “Duplex On The Web,” doesn’t involve phone calls. The user tells Google Assistant to search for movie showings at particular times and gives a geographic area. Duplex searches the web for times and lists the options to the user. It can also purchase the tickets at the user’s request by navigating the sites of major ticket vendors and theater venues and placing the order with the user’s payment information saved on Google.
In the spring of 2020, Google deployed Duplex to make calls to small businesses, asking them to update their listings for hours of operation and to note closures due to COVID restrictions. During the global pandemic, the rising number of infections prompted local governments to issue safeguards such as limiting store operations to curbside pickups. Customers were restricted to calling stores and asking about inventory before ordering an item for pickup. When stores reopened at limited capacity, it became necessary to avoid waiting in long lines for a product that wasn’t available. Google adapted to these challenges by offering a Duplex option to inquire about product availability in a particular store.
In September 2020, Google announced the feature “Hold For Me.” It uses Duplex’s technology to handle long phone holds for users by waiting in line for them. Once the wait is over and a live party is reached, Duplex alerts the user through phone vibration, sound, and a message prompt. Its technology can tell the difference between a recorded message and a real person answering the phone.
In October 2020, Google reported a significant increase in Duplex autonomy, with 99% of calls running unsupervised. The service expanded to eight countries. Google also announced that since the release, users had made over a million bookings.
What Does Google Duplex Mean for the World?
What does this new AI mean for the world, then? As innovative as Google Duplex might be, there is a lot of controversy about what it entails to have an AI capable of recreating human speech as accurately as Google Duplex can. This comes as no surprise given how capable this AI could be. If you look up Google Duplex, two things always seem to show up:
First, as amazing as this new technology is, the idea of making people think they are talking to a real person and not a computer comes off as “creepy,” as many describe it. This seems to be one of the biggest issues for people. Initially, Google did not want the person on the phone to know that they were talking to a computer, but it later decided that the AI should let the person on the other end know that they are talking to a computer and not an actual person.
Second, after the demo that Google gave at Google I/O 2018, rumors surfaced that the AI demonstration was fake and that the conversations showcased were actually pre-recorded. This isn’t surprising, as such an advancement in technology seems hard for many people to believe.
How does a conversation with AI work?
AI conversations are limited to specific reasoning patterns. When given an unclear or misleading answer, the AI rephrases or repeats the question. These are preprogrammed gambits, meaning that Duplex picks up only on a set of answers that fit the information it is looking for (the opposite applies when Duplex is on the receiving end of questions).
The AI cannot veer off a preprogrammed conversation, and its entire verbal repertoire is designed to get the information the user needs. Duplex can only share the contact information that the user allows. For instance, if a business on the other end of the line asks for a phone number, Duplex will decline and offer instead the e-mail address that the user has allowed. The overall tone of Google Duplex’s speech is polite and, at times, even apologetic. It can be interrupted or thrown off track with misleading statements and still handle the conversation well.
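The two behaviors described above, rephrasing a question until a recognizable answer arrives and sharing only the contact details the user has allowed, can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not Duplex’s actual logic: the keyword table, the class, and the function names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical set of answers the caller is able to recognize.
EXPECTED_ANSWERS: Dict[str, bool] = {
    "yes": True, "yeah": True, "sure": True, "no": False, "nope": False,
}


@dataclass
class UserPolicy:
    """Contact details the user has agreed to share (example data only)."""
    shareable: Dict[str, str] = field(default_factory=dict)


def ask_until_understood(question: str, rephrased: str, replies: List[str],
                         max_tries: int = 3) -> Tuple[Optional[bool], List[str]]:
    """Ask a yes/no question and rephrase it whenever the reply is not recognized."""
    prompts: List[str] = []
    for attempt, reply in zip(range(max_tries), replies):
        prompts.append(question if attempt == 0 else rephrased)
        words = reply.lower().replace(",", " ").replace(".", " ").split()
        for word in words:
            if word in EXPECTED_ANSWERS:
                return EXPECTED_ANSWERS[word], prompts
    return None, prompts                      # gave up without a usable answer


def answer_contact_request(requested: str, policy: UserPolicy) -> str:
    """Share a detail only if allowed; otherwise decline and offer an allowed one."""
    if requested in policy.shareable:
        return f"Sure, the {requested} is {policy.shareable[requested]}."
    if policy.shareable:
        alt, value = next(iter(policy.shareable.items()))
        return f"Sorry, I can't share a {requested}, but the {alt} is {value}."
    return f"Sorry, I can't share a {requested}."


if __name__ == "__main__":
    result, asked = ask_until_understood(
        "Do you have a table for four at 7?",
        "Is a table for four available at 7 pm?",
        ["Hold on a second", "Yes, we do"],
    )
    print(result, asked)
    policy = UserPolicy({"e-mail address": "user@example.com"})
    print(answer_contact_request("phone number", policy))
```

The polite, apologetic wording of the refusal is hard-coded here; the point is only that the conversation never leaves a small, predefined set of moves.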
Background of AI
Artificial Intelligence is the motor that drives Google Duplex. AI enables machines to learn, think, communicate, and act independently in ways that humans would. The lure of AI is its ability to rapidly process unimaginable amounts of data and sift through it to answer a query. It is given parameters to operate within, and in turn, it outputs results.
The field has developed the concept of neural networks, which, similar to the human nervous system, use vast amounts of data to establish systems and learn behavior. Over time, machines have learned to apply these principles to spoken words, establishing cognition and the ability to speak.
Google Duplex uses a recurrent (feedback) neural network, which allows the system to ‘memorize’ parts of the inputs and use them to make accurate predictions. These networks are at the heart of speech recognition, translation, and more (Simeon Kostadinov). In doing so, Google Duplex is basically talking to an entire network of high-powered computers in the cloud, which gather data and send the results to your phone.
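To show what “memorizing parts of the inputs” means, here is a toy recurrent step in plain Python. It uses the standard recurrent update, new state = tanh(w_x · input + w_h · previous state + b), with a single number as the hidden state. The weights are fixed, made-up values rather than trained ones, and the network behind Duplex is vastly larger; this is only a sketch of the feedback idea.

```python
import math
from typing import List


def rnn_step(x: float, h_prev: float,
             w_x: float = 0.5, w_h: float = 0.9, b: float = 0.0) -> float:
    """One recurrent update: mix the new input with the previous hidden state."""
    return math.tanh(w_x * x + w_h * h_prev + b)


def run_rnn(inputs: List[float], h0: float = 0.0) -> List[float]:
    """Feed a sequence through the cell and return the hidden state after each step."""
    h = h0
    states = []
    for x in inputs:
        h = rnn_step(x, h)      # h carries information forward from earlier inputs
        states.append(h)
    return states


if __name__ == "__main__":
    # Both sequences end with the same input (0.0), but their histories differ.
    print(run_rnn([1.0, 0.0]))
    print(run_rnn([0.0, 0.0]))
```

The two sequences end with the same input, yet the final hidden states differ because the first sequence saw a 1.0 earlier. That carried-over state is the “memory” a recurrent network uses when predicting what comes next in speech or text.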
In the case of Duplex, Google programmers have used an array of AI components. Natural Language Processing is a human language deciphering tool. The Recurrent Neural Network detects underlying patterns and relationships within a set of data. Automatic Speech Recognition translates spoken word into text. Concatenative Text To Speech turns written word into computer-generated speech.
Google Duplex is unique in AI because it has combined these principles into a single system and released it for large populations. Its limited scope (placing reservations) enables it to work on such a grand scale because it’s not too complex by today’s standards. The downside is that it will take time for Duplex to gather data, train, and be more automated and less reliant on human assistance from call centers.
Immediately following the 2018 demo, the first privacy concerns started to be voiced in the media. Some of the pressing questions were: Will Duplex identify itself as AI to the person answering the call? How will the recorded conversation be used? How will the user’s data be used? These are the perennial worries that have haunted humans since the advent of AI, as technology has become more and more intrusive on our privacy. Should AI be indistinguishable from a human voice, or should the developers cut down on the realism so that it is clear right away that a machine is calling? It is undeniably unethical for a computer to impersonate a living person while dealing with businesses over the phone without making it explicit that it is an AI.
At the beginning of each call, Google Duplex has to present itself as an AI to the business representative on the other end of the line and inform them that the conversation is being recorded. But Google is uneasy about the AI saying explicitly, “this is a robocall,” as it might turn off many people, who will hang up. For now, the company is trying to keep it ethical while hoping that society will organically evolve to co-exist with computerized human-like interactions, without the paranoia.
Using Google Duplex for benign, low-stakes activities such as booking a table at the local restaurant or ordering a pair of tickets seems trustworthy enough, especially as the AI is programmed by Google, one of the Big Tech companies under constant legal scrutiny over privacy concerns and obligated to be transparent about the way its technology intrudes on people’s lives. But what if this AI technology gets into the wrong hands? If the realistic-sounding voice cannot be told apart from a human’s, what’s stopping criminals from using it to place spam calls, run phishing scams, or lure people into worse hoaxes?
The technology could easily end up being used for deception, and that’s why there’s a clear need for oversight and regulation.
Future of Google Duplex
Duplex is clearly a feature that is here to stay, and it will master its functions and expand the domain of possibilities. Google is currently piloting a feature that enables shopping and food ordering with faster checkouts. Another announced task that Duplex will perform for the general public is renting cars.
At the Google I/O conference in May 2019, it was announced that Duplex would soon help users fill out online forms. This feature is already partially available: it is used when users ask Duplex to order tickets and finalize the purchase with saved credit card information. In the future, we might see this feature expanded and applied to any form, from shopping to job applications to maybe even filing taxes.
First appeared on https://visualwebz.com/google-duplex/