A couple days ago we posted about Majel, and now some more tips are starting to come in. We compared Majel to Apple’s Siri voice assistant because that’s how it was described to us, but the project could be much larger than we initially imagined. Read on for new details and some interesting quotes from Google employees.
First we had a tip from “Ted,” who described his experience with an early release of Majel on an Android tablet. Even though this tip was sent from an anonymous IP, we believe it to be accurate since it matched an earlier description we received.
Ted wrote: “It’s definitely as good, or better, than Siri. At least on the tablet you can sort through different answers with these swipe-able trays. Like, if you say ‘show me the Statue of Liberty’ it’ll automatically take you to Google Image results, but another tray beneath it might be its location on Google Maps and then another tray might have a Wikipedia page. It’s also pretty good at giving you succinct answers if you ask it a question. The UI is definitely more powerful than Siri’s, even if a little harder to navigate.
At least at one phase of the development you would activate it by saying ‘Computer…’ It was hard not to use a Jean-Luc Picard accent when doing it!”
As you can see, the first release of Majel might be rather simple and focus solely on natural language questions with answers from Google Search.
Next up we have some comments posted to Reddit from an ex-employee of Google who claims to have worked at the secret Google X Lab.
The anonymous Googler wrote: “This is in total violation of the NDA, but I don’t care anymore. Sue me.
The central focus of Google X for the past few years has been a highly advanced artificial intelligence robot that leverages the underlying technology of many popular Google programs. As of October (the last time I was around the project), the artificial intelligence had passed the Turing Test 93% of the time via an hour long IM style conversation. IM was chosen to isolate the AI from the speech synthesizer and physical packaging of the robot.
The robot itself isn’t particularly advanced because the focus was not on mechanics, but rather the software. It is basically a robotish looking thing on wheels. Speech recognition is somewhat better than what you would get with normal speech input, mostly because of the use of high quality microphones and lip-reading assistance.
I have had the chance to interact with the robot personally and it is honestly the most amazing thing that I have ever seen. I like to think of it like Stephen Hawking because it is extremely smart and you can interact with it naturally, but it is incapable of physically doing much. There is a planned phase two for development of an advanced robotics platform.”
This sounds more along the lines of the shoot-for-the-stars ideas that the NYTimes described when they wrote about Google X. Obviously, Google has been working on artificial intelligence for many years.
Moving along, we return to some comments from Mike Cohen, Google’s Manager of Speech Technology and co-founder of Nuance Communications (the company that powers some of the technology behind Siri).
Mike Cohen wrote: “In Star Trek, they don’t spend a lot of time typing things on keyboards–they just speak to their computers, and the computers speak back. It’s a more natural way to communicate, but getting there requires chipping away at a range of hard research problems.
We’ve recently made some strides with speech technologies and tools that take voice input. But what about when the computer speaks to you–in other words, voice output?
That’s why we’re pleased to announce we’ve acquired Phonetic Arts, a speech synthesis company based in Cambridge, England. Phonetic Arts’ team of researchers and engineers work at the cutting edge of speech synthesis, delivering technology that generates natural computer speech from small samples of recorded voice.
We are excited about their technology, and while we don’t have plans to share yet, we’re confident that together we’ll move a little faster towards that Star Trek future.”
Many readers joked in the comments of our previous article that they wished Majel Barrett-Roddenberry’s voice could be used for Google’s project, and it turns out Google has the technology to do it. The company would still need to license the rights to Majel’s voice samples, but it could essentially replicate any voice it wants.
Keeping with the Star Trek theme, we have more comments from Google’s Amit Singhal, taken from The Evolution of Search video posted in November.
Amit Singhal says: “My dream has always been to build the Star Trek computer, and in my ideal world, I would be able to walk up to a computer, and say, ‘Hey, what is the best time for me to sow seeds in India, given that monsoon was early this year?’ And once we can answer that question (which we don’t today), people will be looking for answers to even more complex questions. These are all genuine information needs. Genuine questions that if we — Google — can answer, our users will become more knowledgeable and they will be more satisfied in their quest for knowledge.”
Finally, we have the comments of Matias Duarte, the computer-interface designer and user-experience lead for Android, from an interview with The Daily Beast.
Matias Duarte said: “Voice is absolutely going to be an essential part of user interfaces. I mean Google and Android have been working on Voice for years. Even in Ice Cream Sandwich we released significant improvements to the way Voice dictation works. What I think is going to be interesting about Voice is trying to treat Voice as something that is universally accessible in every application and not confine it to just a gimmick or something you only use when you are in the car or on the go.
I really want computers to be multimodal. When you watch a science fiction show like Star Trek, someone walks up to a wall and starts touching things and speaking to a computer at the same time. That’s the way that I think our interfaces need to evolve. You need to be able to start using email, touching things on screen, speak to it, touch more things, and not really have to think about ‘am I using Voice now or not using Voice.’ You just use the computer input that is most natural at that time.”
That sounds a little more advanced than how we described the first release of Majel, but Matias said they were already working on the user interface for the next version of Android, codenamed Jelly Bean, and the next version after that.
We’re just in the early stages of comprehending how large a project Majel has become, but we still expect some kind of release on Android devices early next year. Google engineers are already testing a version of Majel that might be released as an upgrade to Google’s Voice Actions application, but we fully expect it will be a core part of Android’s next major release.
Hopefully, we will have some concrete details to share in the coming weeks.