Saturday, April 1, 2017

Making your mobile app respond to voice commands (iOS vs Android)

Making your app understand voice commands was not an easy task before the introduction of Siri (iOS) and Google Assistant. Both are now open to developers. However, as usual, Apple's Siri is not as open to developers as Google Assistant.

Siri is currently limited to the following domains:

- VoIP calling
- Messaging
- Payments
- Photos/Videos
- Workouts
- Ride booking
- Car commands
- CarPlay (automotive vendors only)
- Restaurant reservations (requires additional support from Apple)
- IoT (using HomeKit)

This means that Siri will only understand phrases that belong to one of these domains, and a phrase that fits one domain will not work for an app in another. For example, "Look for beach photos taken last summer in your-app-name" is a valid search only if your app handles photos. I'm afraid that at the moment Siri cannot be taught to respond to "Look for notes taken last summer in your-app-name". To be fair, Apple does at least support the kinds of apps that are used on the move. Apple will also not want its stock apps to become less popular :)
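To make the restriction concrete, here is a toy model in Python. It is not SiriKit code: the domain names come from the list above, while the `SUPPORTED_DOMAINS` set and the `can_handle` helper are hypothetical names invented purely for illustration.

```python
# Toy model of domain gating; this is NOT the SiriKit API.
# Domain names are taken from the list above; everything else is hypothetical.
SUPPORTED_DOMAINS = {
    "voip calling", "messaging", "payments", "photos/videos", "workouts",
    "ride booking", "car commands", "carplay", "restaurant reservations", "iot",
}


def can_handle(request_domain: str) -> bool:
    """Return True if a voice request falls into a supported domain."""
    return request_domain.strip().lower() in SUPPORTED_DOMAINS


# A photo search maps onto a supported domain, a note search does not.
print(can_handle("Photos/Videos"))  # True
print(can_handle("Notes"))          # False
```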

Both Siri and Google Assistant can be programmed to use your app to respond to voice commands even when the app is not running in the background. As anticipated, Google is a step ahead because it does not limit support to a selected set of domains. Google does this by allowing the user to switch to a private channel with an app of their choice. While the user is in this space, Google Assistant is deaf to the other commands it would normally support and responds only to the commands defined by the app. The Conversation API defines a request and response format that you must adhere to when communicating with the Google Assistant.
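A rough sketch of such a request/response exchange, written in Python. The field names (`user_text`, `speech`, `expect_user_response`) and the `handle_request` function are simplified assumptions for illustration; they do not reproduce the actual Conversation API wire format, which is defined in Google's documentation.

```python
import json

# Hypothetical, simplified request/response shapes; the real Conversation API
# format is defined in Google's documentation and differs from this sketch.


def handle_request(request: dict) -> dict:
    """Answer one turn of a conversation using an app-defined command set."""
    text = request.get("user_text", "").strip().lower()

    if text == "":
        # First turn: the app decides how the user is greeted.
        return {"speech": "Welcome to your-app-name. What would you like to do?",
                "expect_user_response": True}
    if text in ("bye", "exit"):
        # The app also decides when the conversation ends.
        return {"speech": "Goodbye!", "expect_user_response": False}
    if text.startswith("find notes"):
        return {"speech": "Searching your notes.", "expect_user_response": True}
    # Anything the app has not defined is rejected, but the channel stays open.
    return {"speech": "Sorry, I can't help with that here.",
            "expect_user_response": True}


print(json.dumps(handle_request({"user_text": "find notes from last summer"})))
```

Because the app defines its own commands, a request such as "find notes" can be served here even though it falls outside Siri's domains.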

Conversation Actions help you fulfill user requests by letting you have a two-way dialog with users. When users request an action, the Google Assistant processes this request, determines the best action to invoke, and invokes your Conversation Action if relevant. From there, your action manages the rest, including how users are greeted, how to fulfill the user's request, and how the conversation ends. Even though you control the user experience when your action is invoked, the Google Assistant still brokers and validates the communication between your action and the user.

The Actions SDK gives you all the tools you need to build Conversation Actions, but you might want to build with one of Google's supported tools, such as API.AI. These tools generally provide a better developer experience by making it easier to build and deploy actions within a single interface, and they provide additional features that make building actions easier. By using API.AI to build actions, you gain a lot of conveniences, such as:

• API.AI NLU - API.AI's built-in Natural Language Understanding does in-dialog processing of input. This gives you conveniences such as natural expansion of phrases and built-in features to easily distinguish and parse arguments from user input.
• A GUI - You can define and configure action components, such as dialogs and invocation, in an easy-to-use UI.
• Conversation building features - API.AI has advanced features such as contexts, which let you easily maintain state and contextualize user requests, a built-in simulator, and machine learning algorithms to better understand user input.
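
The idea behind contexts can be sketched in a few lines of Python. This is not API.AI code: the `Conversation` class, its `context` dictionary, and the phrases it matches are all hypothetical, and the simple keyword matching stands in for real natural language understanding.

```python
class Conversation:
    """Keeps per-conversation state so follow-up requests can be resolved."""

    def __init__(self) -> None:
        self.context: dict = {}  # hypothetical state carried between turns

    def handle(self, user_text: str) -> str:
        words = user_text.lower().rstrip("?.!").split()
        if "photos" in words or "notes" in words:
            # Remember what the user is searching for.
            self.context["item"] = "photos" if "photos" in words else "notes"
            return f"Searching {self.context['item']}."
        if "item" in self.context and "summer" in words:
            # Follow-up: no item named, so fall back to the stored context.
            return f"Searching {self.context['item']} from last summer."
        return "Sorry, I did not understand that."


chat = Conversation()
print(chat.handle("Look for notes"))          # Searching notes.
print(chat.handle("Only from last summer?"))  # Searching notes from last summer.
```

Without the stored context, the second request would be ambiguous, which is why a fresh `Conversation` answers it with the fallback message.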

Ready to dig in? Follow the links below to get started.

- Siri: https://developer.apple.com/sirikit/
- Google Assistant: https://developers.google.com/actions/develop/conversation
