Clint Rutkas
This week's theme in our Windows 10 by 10 development series is extending customer engagement using Windows 10, without your users even entering your app. Last week's topic, Live Tiles and notifications, showed one way to extend your app's experience; now let's show how to use the Windows 10 personal assistant, Cortana, to do so. To illustrate what you can do with Cortana, we'll be using the AdventureWorks sample on GitHub as the base for the code snippets in this blog post.
In this blog post, we'll talk about what Cortana is, how to have Cortana carry on a meaningful conversation with your customer, the initial work needed to get Cortana integrated into your app, and two of the many ways your app can interact with your users, depending on the scenario.
What exactly is Cortana?
One of the neatest capabilities introduced in Windows 10 is the Cortana personal assistant, who joins us from Windows Phone. Cortana is a front-and-center experience in Windows 10 that users can engage with via natural language. With Cortana, users can engage with Windows (both the core OS and apps such as yours) the same way they would speak to a person. For example, think about the questions that only your app can answer through Cortana's voice, such as "When is my next trip?", "Search for a radio station", or "Is Jack online?". Your app can answer these questions by providing the response for Cortana to say and display. Also think about the tasks the user can ask Cortana to do in your app, such as "Cancel my trip to London", "Like this station", or "Tell Jack I'm running late".
Voice commands can provide quick access to info inside your app when you use them as deep links into your application. Just as you currently create a tile to provide a shortcut into your app, you can also use a voice command as a shortcut to a screen inside your app. And just as you give the user the ability to pin that recent trip they created in your app onto their Start screen, you can also enable that same user to use a voice command in the Cortana experience to get to that same trip. This ability can make your users more productive, and your app's experience more accessible to them.
By extending Cortana, you can engage and delight your users by empowering them to get things done with a quick voice command. This blog post isn't about Cortana's end-user features, however; it's about how you can integrate Cortana into your app.
Hey Cortana, let’s have a conversation
Since interacting with Cortana is speech-based, your user experience needs to flow as if the user were having a natural conversation. There are general Cortana design guidelines on MSDN that explain best practices for user interactions. For successful Cortana interactions, also follow these principles: keep interactions efficient, relevant, clear, and trustworthy.
What do they actually mean?
- Efficient: Less is more. Be concise and use as few words as possible without losing meaning.
- Relevant: Keep the topic on track. If I request my favorite ABBA song be added to my playlist, don't tell me my battery is low as well. Instead, confirm that I'll be rocking out to ABBA shortly.
- Clear: Write the conversation for your audience. Be sure the dialogue uses everyday language instead of jargon that few people may know.
- Trustworthy: Responses should accurately represent what is happening and respect user preferences. If your app hasn't completed a task, don't say it has. And don't return dialogue that someone may not want to hear out loud.
Also, you should consider localizing your Cortana interactions, especially if you’ve already localized the rest of your app or are making it available globally. Cortana is currently available in the US, UK, China, France, Italy, Germany and Spain, with more markets coming on board in the future. Localizing and adapting the interactions will aid in encouraging your customers to use the Cortana feature of your app.
Teaching Cortana what to respond to
Cortana uses a Voice Command Definition (VCD) file to define the speech interactions the user can have with your app. The file can be XML-based or generated in code. When your app runs for the first time, the command sets in the VCD are installed. Here is a quick sample VCD:
<?xml version="1.0" encoding="utf-8"?>
<VoiceCommands xmlns="http://schemas.microsoft.com/voicecommands/1.1">
  <CommandSet xml:lang="en-us" Name="AdventureWorksCommandSet_en-us">
    <CommandPrefix> Adventure Works, </CommandPrefix>
    <Example> Show trip to London </Example>
    <Command Name="showTripToDestination">
      <Example> show trip to London </Example>
      <ListenFor RequireAppName="BeforeOrAfterPhrase"> show trip to {destination} </ListenFor>
      <Feedback> Showing trip to {destination} </Feedback>
      <Navigate/>
    </Command>
    <PhraseList Label="destination">
      <Item> London </Item>
      <Item> Dallas </Item>
    </PhraseList>
  </CommandSet>
  <!-- Other CommandSets for other languages -->
</VoiceCommands>
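As the closing comment indicates, the localization we mentioned earlier is handled by adding one CommandSet per language inside the same VoiceCommands element, each with its own xml:lang. Here is a hedged sketch of what a German command set might look like (the German phrasing and items are illustrative, not from the sample):
<CommandSet xml:lang="de-de" Name="AdventureWorksCommandSet_de-de">
  <CommandPrefix> Adventure Works, </CommandPrefix>
  <Example> Zeige Reise nach Berlin </Example>
  <Command Name="showTripToDestination">
    <Example> zeige Reise nach Berlin </Example>
    <ListenFor RequireAppName="BeforeOrAfterPhrase"> zeige Reise nach {destination} </ListenFor>
    <Feedback> Zeige Reise nach {destination} </Feedback>
    <Navigate/>
  </Command>
  <PhraseList Label="destination">
    <Item> Berlin </Item>
    <Item> München </Item>
  </PhraseList>
</CommandSet>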
When your app is activated, InstallCommandSetsFromStorageFileAsync should be called in the OnLaunched app event handler to register the commands that Cortana should listen for. Keep in mind that if a device backup is restored and your app is reinstalled, voice command data is not preserved. To ensure the voice command data for your app stays intact, consider initializing your VCD file each time your app launches or activates. You can also store a setting that indicates if the VCD is currently installed, then check that setting each time your app launches or activates. Here is some basic code to get the VCD loaded into your app:
// Load the VCD file packaged with the app
var storageFile = await Windows.Storage.StorageFile.GetFileFromApplicationUriAsync(new Uri("ms-appx:///CortanaVcd.xml"));
// Register its command sets so Cortana listens for them
await Windows.ApplicationModel.VoiceCommands.VoiceCommandDefinitionManager.InstallCommandSetsFromStorageFileAsync(storageFile);
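And here is a minimal sketch of the settings-guard approach mentioned above. The "VcdInstalled" setting name is illustrative, not from the sample:
// Only reinstall the VCD if our illustrative "VcdInstalled" flag isn't set.
var localSettings = Windows.Storage.ApplicationData.Current.LocalSettings;
if (!localSettings.Values.ContainsKey("VcdInstalled"))
{
    var vcd = await Windows.Storage.StorageFile.GetFileFromApplicationUriAsync(new Uri("ms-appx:///CortanaVcd.xml"));
    await Windows.ApplicationModel.VoiceCommands.VoiceCommandDefinitionManager.InstallCommandSetsFromStorageFileAsync(vcd);
    localSettings.Values["VcdInstalled"] = true;
}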
Being dynamic
Now that we have the grammar initialized, we can dynamically alter it at runtime. Here is a simple example of dynamically altering the VCD we loaded above:
Windows.ApplicationModel.VoiceCommands.VoiceCommandDefinition commandSetEnUs;
if (Windows.ApplicationModel.VoiceCommands.VoiceCommandDefinitionManager.InstalledCommandSets.TryGetValue("AdventureWorksCommandSet_en-us", out commandSetEnUs))
{
    // This fully replaces the existing "destination" phrase list
    await commandSetEnUs.SetPhraseListAsync("destination", new string[] { "Chicago", "Seattle", "New York", "Phoenix" });
}
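In practice, you would rebuild the phrase list from your user's data whenever it changes, rather than from a hard-coded array. A hedged sketch, where trips is an illustrative in-app collection (not from the sample) and a using System.Linq; directive is assumed:
// Rebuild the destination phrase list from the user's current trips
var destinations = trips.Select(trip => trip.Destination).Distinct().ToList();
await commandSetEnUs.SetPhraseListAsync("destination", destinations);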
How should my app interact with Cortana?
There are a number of ways for your app to interact with Cortana. The three most typical ways are:
- Have Cortana launch your app. Along with launching your app to the foreground, you can specify a deep link for an action or command to execute within the app.
- Within Cortana, allow simple user interaction for your app to store or return data in the background.
- Within Cortana, let your app and user interact with each other.
If you have a complex task and want the user to jump directly into your app, using Cortana is a great solution. Since some complex tasks can actually be done faster and more accurately by voice command, this may be the way to go. Your app handles the voice command activation in its OnActivated event handler, as shown below:
protected override void OnActivated(IActivatedEventArgs e)
{
    // Was the app activated by a voice command?
    if (e.Kind != Windows.ApplicationModel.Activation.ActivationKind.VoiceCommand)
    {
        return;
    }

    var commandArgs = e as Windows.ApplicationModel.Activation.VoiceCommandActivatedEventArgs;
    var navigationParameterString = "";
    Type navigateToPageType;

    // The recognition result that activated the app
    Windows.Media.SpeechRecognition.SpeechRecognitionResult speechRecognitionResult = commandArgs.Result;

    // Get the name of the voice command and the text spoken
    string voiceCommandName = speechRecognitionResult.RulePath[0];
    string textSpoken = speechRecognitionResult.Text;

    // The commandMode is either "voice" or "text", and it indicates how the voice command was entered by the user.
    // Apps should respect "text" mode by providing feedback in a silent form.
    // SemanticInterpretation is a small private helper, shown after this listing.
    string commandMode = this.SemanticInterpretation("commandMode", speechRecognitionResult);

    switch (voiceCommandName)
    {
        case "showTripToDestination":
            // Access the value of the {destination} phrase in the voice command
            string destination = speechRecognitionResult.SemanticInterpretation.Properties["destination"][0];

            // Create a navigation parameter string to pass to the page
            navigationParameterString = string.Format("{0}|{1}|{2}|{3}",
                voiceCommandName, commandMode, textSpoken, destination);

            // Set the page to navigate to for this voice command
            navigateToPageType = typeof(TripPage);
            break;

        default:
            // There is no match for the voice command name. Navigate to MainPage
            navigateToPageType = typeof(MainPage);
            break;
    }

    if (this.rootFrame == null)
    {
        // App needs to create a new Frame, not shown
    }

    if (!this.rootFrame.Navigate(navigateToPageType, navigationParameterString))
    {
        throw new Exception("Failed to create voice command page");
    }
}
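For reference, the SemanticInterpretation call in the listing above is a small private helper in the sample app. A minimal sketch of it (FirstOrDefault assumes a using System.Linq; directive):
private string SemanticInterpretation(string interpretationKey, Windows.Media.SpeechRecognition.SpeechRecognitionResult speechRecognitionResult)
{
    // Return the first semantic interpretation value for the given key,
    // e.g. "voice" or "text" for the "commandMode" key.
    return speechRecognitionResult.SemanticInterpretation.Properties[interpretationKey].FirstOrDefault();
}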
Simple interaction to store or return data to/from your app within Cortana
Now that you have Cortana connected to your VCD and executing basic interactions, we’ll dive into having Cortana do some of the heavier lifting. For example, you can have Cortana provide data back to the user, or store some data. MSDN has a comprehensive walkthrough for setting up a background app for Cortana. Here’s a quick summary of the steps.
- Create a Windows Runtime Component project in your solution.
- Create a new class that implements the IBackgroundTask interface, which will serve as our app service.
- In your UWP app’s Package.appxmanifest, add a new Extension for the new app service. The MSDN documentation goes through this step in detail.
Here is a sample of what the Package.appxmanifest XML will look like:
<Package>
  <Applications>
    <Application>
      <Extensions>
        <Extension Category="windows.appService"
                   EntryPoint="AdventureWorks.VoiceCommands.AdventureWorksVoiceCommandService">
          <AppService Name="AdventureWorksVoiceCommandService"/>
        </Extension>
      </Extensions>
    </Application>
  </Applications>
</Package>
Once launched, the app background service has 0.5 seconds to call ReportSuccessAsync. Cortana uses the data provided by the app to show and verbalize the feedback specified in the VCD file. If the app takes longer than 0.5 seconds to return from the call, Cortana inserts a hand-off screen and displays it until the application calls ReportSuccessAsync, for up to 5 seconds. If the app service doesn't call ReportSuccessAsync, or any of the VoiceCommandServiceConnection methods that provide Cortana with information, the user receives an error message and the app service call is cancelled.
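If your service knows up front that the work will run long, it can ask Cortana to keep waiting by reporting progress. A minimal hedged sketch, assuming a voiceServiceConnection like the one created in the fuller service sketch further below (the message strings are illustrative):
// Tell Cortana the service is still working so she stays on the progress screen
var progressMessage = new VoiceCommandUserMessage();
progressMessage.DisplayMessage = "Looking up your trips...";
progressMessage.SpokenMessage = "Looking up your trips";
await voiceServiceConnection.ReportProgressAsync(VoiceCommandResponse.CreateResponse(progressMessage));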
Here is the basic code needed for the IBackgroundTask implementation to act as an app service:
using Windows.ApplicationModel.Background;

namespace AdventureWorks.VoiceCommands
{
    public sealed class AdventureWorksVoiceCommandService : IBackgroundTask
    {
        public void Run(IBackgroundTaskInstance taskInstance)
        {
            // Take a deferral so the task isn't torn down while we talk to Cortana
            BackgroundTaskDeferral _deferral = taskInstance.GetDeferral();

            //
            // TODO: Insert code
            //

            _deferral.Complete();
        }
    }
}
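The TODO above is where your service actually talks to Cortana. Here is a hedged sketch of what that might look like, based on the AdventureWorks sample: get the voice command from the service connection and report success within the time budget. Note that Run becomes async void so the connection calls can be awaited:
using System;
using Windows.ApplicationModel.AppService;
using Windows.ApplicationModel.Background;
using Windows.ApplicationModel.VoiceCommands;

namespace AdventureWorks.VoiceCommands
{
    public sealed class AdventureWorksVoiceCommandService : IBackgroundTask
    {
        public async void Run(IBackgroundTaskInstance taskInstance)
        {
            BackgroundTaskDeferral deferral = taskInstance.GetDeferral();

            // Only respond to calls addressed to our app service (the Name from the manifest)
            var triggerDetails = taskInstance.TriggerDetails as AppServiceTriggerDetails;
            if (triggerDetails != null && triggerDetails.Name == "AdventureWorksVoiceCommandService")
            {
                var voiceServiceConnection =
                    VoiceCommandServiceConnection.FromAppServiceTriggerDetails(triggerDetails);

                // Which voice command launched us?
                VoiceCommand voiceCommand = await voiceServiceConnection.GetVoiceCommandAsync();

                // Respond within 0.5 seconds so Cortana can show and speak the feedback
                var userMessage = new VoiceCommandUserMessage();
                userMessage.DisplayMessage = userMessage.SpokenMessage =
                    "Handled the " + voiceCommand.CommandName + " command.";
                await voiceServiceConnection.ReportSuccessAsync(
                    VoiceCommandResponse.CreateResponse(userMessage));
            }

            deferral.Complete();
        }
    }
}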
Having user interactions within Cortana
Now that you know the basics, you’re ready for richer user interactions within Cortana. The app can specify different types of screens to support functionality that includes:
- Successful completion
- Hand-off
- Progress
- Confirmation
- Disambiguation
- Error
Let's dive into one of the scenarios above: disambiguation. There are times when your app will have multiple choices to return, and it needs the user to disambiguate which one to act on. If the user is picking music and could choose between ABBA, Nickelback, or Whitesnake as the next band to play, Cortana can handle this. The code below from the AdventureWorks sample shows how to handle disambiguation from within your app service:
// Create a VoiceCommandUserMessage for the initial question.
var userPrompt = new VoiceCommandUserMessage();
userPrompt.DisplayMessage = "Which one do you want to cancel?";
userPrompt.SpokenMessage = "Which Chicago trip do you wanna cancel?";

// Create a VoiceCommandUserMessage for the second question,
// in case Cortana needs to reprompt.
var userReprompt = new VoiceCommandUserMessage();
userReprompt.DisplayMessage = "Which one did you want to cancel?";
userReprompt.SpokenMessage = "Which one did you wanna cancel?";

// Create the list of content tiles to show the selection items.
var destinationsContentTiles = new List<VoiceCommandContentTile>();

// Create your VoiceCommandContentTiles.
for (int i = 0; i < 5; i++)
{
    var destinationTile = new VoiceCommandContentTile();
    destinationTile.ContentTileType = VoiceCommandContentTileType.TitleWith68x68IconAndText;

    // The AppContext is optional.
    // Replace this value with something specific to your app.
    destinationTile.AppContext = "id_Vegas_00" + i;
    destinationTile.Title = "Tech Conference";
    destinationTile.TextLine1 = "May " + i + "th";
    destinationsContentTiles.Add(destinationTile);
}

// Create the disambiguation response.
var response = VoiceCommandResponse.CreateResponseForPrompt(userPrompt, userReprompt, destinationsContentTiles);

// Request that Cortana shows the disambiguation screen.
var voiceCommandDisambiguationResult = await voiceServiceConnection.RequestDisambiguationAsync(response);
if (voiceCommandDisambiguationResult != null)
{
    // Use voiceCommandDisambiguationResult.SelectedItem to take action.
    // Respond to Cortana within 0.5 seconds to present the next screen
    // and avoid a transition screen.
}
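Confirmation, another of the screen types listed above, follows the same pattern. Here is a hedged sketch reusing the same voiceServiceConnection (the prompt strings are illustrative):
// Ask the user a yes/no question before doing something destructive.
var confirmPrompt = new VoiceCommandUserMessage();
confirmPrompt.DisplayMessage = confirmPrompt.SpokenMessage = "Cancel the Tech Conference trip?";
var confirmReprompt = new VoiceCommandUserMessage();
confirmReprompt.DisplayMessage = confirmReprompt.SpokenMessage = "Should I cancel the Tech Conference trip?";

var confirmResponse = VoiceCommandResponse.CreateResponseForPrompt(confirmPrompt, confirmReprompt);
var confirmationResult = await voiceServiceConnection.RequestConfirmationAsync(confirmResponse);
if (confirmationResult != null && confirmationResult.Confirmed)
{
    // The user said yes: cancel the trip, then call ReportSuccessAsync.
}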
Wrapping up Cortana for now
We hope that you now better understand how easily Cortana can be added to your application, opening up a multitude of interaction models with your customers. From launching your app, all the way to a complex interaction that doesn't even require the app to be launched, Cortana integration really does add to user engagement. We hope you've thought about how your app could take advantage of Cortana's extensibility – even if it's simply providing a new way of deep linking into your app experience.
If you feel Cortana makes sense for your apps, definitely take advantage of it. And once your updated app is submitted, be sure to redeem the “Adding Cortana to your app” DVLUP challenge, so you can claim points and XP for updating your apps. Also, let us know via @WindowsDev and #Win10x10 – we love to hear what developers are building on Windows.
Also, check out the full Windows 10 by 10 development series schedule for the topics we will be covering in the series. For more on Cortana, check back here in a couple weeks as we dive into using Cortana’s natural language capabilities in your app to deliver a more personal user experience.
Additional Resources on Extending Cortana
For more information on extending Cortana, below are some additional resources that we believe may be of use to you.
- MSDN article on Cortana interactions
- MSDN article on Cortana design guidelines
- MSDN article on Interacting with a background app in Cortana
- Nikola Metulev has a great blog post talking about Cortana and speech inside Windows 10. He has some extremely fun samples and a detailed walkthrough on programming with Cortana. This blog post actually uses some of his screenshots (with permission).
- Oliver Matis does an in-depth post with some great best practices on integrating your app with Cortana from beginning to end.