Yesterday Apple finally released their iOS 5 software update for the iPhone 4 and the blogs are buzzing about Siri, a virtual personal assistant that lets you use your voice to send messages, schedule meetings, place phone calls, and more. Earlier this month I received a tip that led me to believe Google would release a similar offering around the launch of Ice Cream Sandwich, and ever since I’ve been trying to dig up additional details about how their product might differ.
For starters, I have been unable to get anyone besides the original source to confirm this rumor. I asked around at CTIA if anyone had heard about this secret project, but I came up empty. However, Google has a history of surprise announcements and I still believe the original information that was shared with me.
Google’s speech technology efforts are being led by Mike Cohen, who joined the company back in 2004. Prior to joining Google, Mike spent 10 years at Nuance Communications, which he co-founded. That’s the same Nuance that actually powers the technology behind Siri. We don’t talk about Mike much but he has helped Google launch Voice Search, Voice Input and Voice Actions for our Android phones.
Voice Actions for Android is the closest product that Google has to Apple’s Siri and that’s what we heard was getting a major update. One way that Google’s assistant might differ from Siri is the addition of an animated avatar. I previously compared it to something like an intelligent version of Talking Tom Cat, but then I discovered a better example called Speaktoit Assistant.
Speaktoit is similar to Siri in that it supports natural language so you can converse with the virtual assistant in a normal human speaking style. The Android app also includes similar functionality that can “send emails, send texts, look up information, post to Twitter, check you in places, update your Facebook, find news, look up traffic, look up weather, call people, take notes, add things to your calendar, translate foreign languages, help you find nearby places like bars, and tons more.”
Natural language support allows you to talk like a human instead of a robot, and I was surprised by how well it works. Of course, the responses coming back from Speaktoit are still in that robotic-sounding voice we expect from our phones.
Google’s virtual assistant should solve that robotic voice output thanks to their acquisition of Phonetic Arts, which occurred at the tail end of 2010. The company had previously been working on speech synthesis in games, but Google was after their technology that could convert lines of recorded dialog into a speech library. The result was surprisingly realistic automated voices that sounded more human and fluid.
I’m not aware of Google incorporating Phonetic Arts’ technology into Android yet, but the acquisition happened right around the time that Android engineers started working on Ice Cream Sandwich.
We already expect the next version of Android to include some kind of facial recognition and tracking technology, so Google’s virtual assistant could turn out to be really creepy. Imagine a virtual character that you talk to in natural language, that responds in a human-sounding voice, scans your face to detect your emotions, and knows everything about you.
I don’t know if the average consumer is ready for this kind of futuristic artificial intelligence, but Google has the technology to pull it off. As Mike Elgan of Datamation puts it, it’s all about the data and Google has plenty of that. “Google already knows who your niece is and what her interests are. They know your web browsing history and what marketing people in your area are both looking for work and highly recommended. Google knows what you say on email, where you go and what you like to read.”
Hopefully, we will find out more in the coming weeks as Google unveils all the new features of Ice Cream Sandwich. Are you ready for a human-sounding virtual assistant that knows all of your private and public data? Download Speaktoit Assistant and let us know what kinds of features you would like to see from a Google-powered virtual assistant.