Monday, May 6, 2019

Roman Ruins Adventure, Part 2: Alexa Skill

In this series of posts I'm sharing my experiences creating an old-style text Adventure game, adapted somewhat for the 21st century. In Part 1 I reviewed the history of Adventure games and the game design for my version, Roman Ruins Adventure. Here in Part 2 I'm covering an Alexa Skill implementation of Roman Ruins Adventure.


I've already covered the basic game design, game data representation, and game code in JavaScript. Now we'll look at what it takes to implement the game as an Alexa Skill, a voice-driven app. Since classic Adventure games are textual in nature, they're a natural fit for a voice assistant.

Alexa Skills Project

Alexa Skills are defined in the Amazon Developer Console.

Alexa Skill Project

Our project has these intents defined:
  • north, east, west, south, up, down : movement commands
  • look : examine surroundings
  • examine object : examine an object
  • take object : take an object
  • drop object : drop an object
  • give object : give an object
  • help : explain available commands
  • mission : restate the mission
  • restart : restart a new game
Each intent includes a set of utterances, which gives us a chance to provide a range of possible words and wordings. For utterances that reference an object, such as take shown below, the object is defined as a slot named object. Elsewhere, under slots, we define the possible values for object that might be spoken, such as "bone" or "bottle".

Take Intent
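
For readers who prefer to see this as data, here is a rough sketch of how the take intent and its object slot could look in the skill's interaction model, written as JavaScript objects. The slot type name GAME_OBJECT, the sample utterances, and the expandSamples helper are assumptions for illustration; they are not copied from the actual skill.

```javascript
// A sketch of the "take" intent as it might appear in the interaction model.
// GAME_OBJECT and the sample utterances are hypothetical.
const takeIntent = {
    name: 'take',
    slots: [{ name: 'object', type: 'GAME_OBJECT' }],
    samples: ['take {object}', 'take the {object}', 'pick up the {object}', 'grab the {object}']
};

// A custom slot type lists the values Alexa should listen for.
const gameObjectType = {
    name: 'GAME_OBJECT',
    values: [{ name: { value: 'bone' } }, { name: { value: 'bottle' } }]
};

// Expand every sample with every slot value to see the phrases the intent covers.
function expandSamples(intent, slotType) {
    const phrases = [];
    for (const sample of intent.samples) {
        for (const entry of slotType.values) {
            phrases.push(sample.replace('{object}', entry.name.value));
        }
    }
    return phrases;
}

console.log(expandSamples(takeIntent, gameObjectType));
```

Four samples and two slot values already cover eight spoken phrases, which is why a handful of utterances per intent goes a long way.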

The Alexa Skill project will recognize intents out of what a user speaks and categorize parts of that speech as objects. The gameplay itself however needs to be in our back end.

Lambda Node Project

The back end for our skill is a serverless function, an AWS Lambda function. Lambda functions can be written in a variety of languages. We're using Node.js, because it fits the decision in Part 1 to write the game code in TypeScript/JavaScript.

Variables

At the top of our code are our variables, in an object named session. These include the current command and resulting output; the current room; a game over flag; the number of items being carried; an image URL for the current room; and lastly our arrays of rooms and objects. Those arrays get populated when the game initializes.

Lambda functions don't inherently preserve state, so we will be passing this object to and from Alexa between invocations in order to preserve our game state.
var session =
{
    command: '',
    output: '',
    room: 0,
    location: null,
    gameOver: true,
    itemsCarried: 0,
    it: null,
    imageUrl: image("0.jpg"),
    objects: [],
    rooms: []
}
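
To make the round trip concrete, here is a minimal sketch of how state travels out in one response and back in the next request. It assumes a buildResponse helper along the lines of the one in the quickstart boilerplate; the sample attribute values are made up.

```javascript
// Wraps a speechlet response in the envelope Alexa expects.
// Anything placed in sessionAttributes is sent back with the next request.
function buildResponse(sessionAttributes, speechletResponse) {
    return {
        version: '1.0',
        sessionAttributes,
        response: speechletResponse
    };
}

// Simulate two invocations: state leaves in the first response...
const firstResponse = buildResponse(
    { room: 3, itemsCarried: 1 },          // hypothetical game state
    { shouldEndSession: false }
);

// ...and arrives on the next request as session.attributes (serialized as JSON in between).
const nextRequest = {
    session: { attributes: JSON.parse(JSON.stringify(firstResponse.sessionAttributes)) }
};

console.log(nextRequest.session.attributes.room);
```

Because the attributes are serialized as JSON, only plain data survives the trip; functions or class instances stored in the state object would be lost.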

Boilerplate Functions

In a node.js Lambda function for an Alexa Skill, a number of boilerplate functions are used to interact with Alexa. I'm using ones that came with one of the quickstarts, ColorPicker.
  • onSessionStart fires when a new session is created. 
  • onLaunch fires when the skill has been opened.
  • exports.handler routes incoming requests based on type.  
  • onIntent is called by exports.handler to process an intent.
Let's look at one of those functions, onIntent. This dispatches each received intent from Alexa to an appropriate handler function.
/**
 * Called when the user specifies an intent for this skill.
 */
function onIntent(intentRequest, session, callback) {
    console.log(`onIntent requestId=${intentRequest.requestId}, sessionId=${session.sessionId}`);

    const intent = intentRequest.intent;
    const intentName = intentRequest.intent.name;

    // Dispatch to your skill's intent handlers
    if (intentName === 'help') perform("HELP", intent, session, callback);
    else if (intentName === 'mission') perform("MISSION", intent, session, callback);
    else if (intentName === 'actions') perform("ACTIONS", intent, session, callback);
    else if (intentName === 'look') perform("LOOK", intent, session, callback);
    else if (intentName === 'examine') performObject("EXAMINE", intent, session, callback);
    else if (intentName === 'take') performObject("TAKE", intent, session, callback);
    else if (intentName === 'drop') performObject("DROP", intent, session, callback);
    else if (intentName === 'use') performObject("USE", intent, session, callback);
    else if (intentName === 'inventory') perform("INVENTORY", intent, session, callback);
    else if (intentName === 'north') perform("NORTH", intent, session, callback);
    else if (intentName === 'east') perform("EAST", intent, session, callback);
    else if (intentName === 'west') perform("WEST", intent, session, callback);
    else if (intentName === 'south') perform("SOUTH", intent, session, callback);
    else if (intentName === 'up') perform("UP", intent, session, callback);
    else if (intentName === 'down') perform("DOWN", intent, session, callback);
    else if (intentName === 'repeat') performRepeat(intent, session, callback);
    else if (intentName === 'restart') perform("RESTART", intent, session, callback);
    
    else if (intentName === 'AMAZON.HelpIntent') {
        getWelcomeResponse(callback);
    } else if (intentName === 'AMAZON.StopIntent' || intentName === 'AMAZON.CancelIntent') {
        handleSessionEndRequest(callback);
    } else {
        throw new Error('Invalid intent');
    }
}
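
For context, here is a sketch of how the entry point might route requests to onLaunch and onIntent. It is modeled on the general shape of the quickstart boilerplate rather than copied from this skill, and the function bodies are stubs so the example runs on its own.

```javascript
// Stubs standing in for the real boilerplate functions (hypothetical bodies).
function onSessionStarted(info, session) {
    console.log(`session started: ${session.sessionId}`);
}
function onLaunch(request, session, callback) {
    callback({}, { note: 'welcome' });
}
function onIntent(request, session, callback) {
    callback(session.attributes || {}, { note: request.intent.name });
}
function buildResponse(sessionAttributes, speechletResponse) {
    return { version: '1.0', sessionAttributes, response: speechletResponse };
}

// Route each incoming request by its type.
const handler = (event, context, callback) => {
    try {
        // Every path ends by wrapping the attributes and speechlet in one envelope.
        const done = (attributes, speechlet) => callback(null, buildResponse(attributes, speechlet));

        if (event.session.new) {
            onSessionStarted({ requestId: event.request.requestId }, event.session);
        }
        if (event.request.type === 'LaunchRequest') {
            onLaunch(event.request, event.session, done);
        } else if (event.request.type === 'IntentRequest') {
            onIntent(event.request, event.session, done);
        } else if (event.request.type === 'SessionEndedRequest') {
            callback(); // nothing to say once the session has ended
        }
    } catch (err) {
        callback(err);
    }
};

// In a real Lambda this would be exported: exports.handler = handler;
```

The try/catch matters here because onIntent throws on an unrecognized intent; catching it lets Lambda report the failure instead of crashing silently.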

Command Functions

The perform function is called for simple commands that don't reference any objects, such as NORTH or LOOK. A command can be determined from the intent name, and is executed with the game function cmd(...).
// Perform a one-word command - ex: LOOK

function perform(command, intent, alexaSession, callback) {
    const repromptText = "Tell me a command, or say HELP for a list of commands.";
    let shouldEndSession = false;
    let speechOutput = '';
    
    if (alexaSession.attributes)
      session = alexaSession.attributes;

    speechOutput = cmd(command) + suffix();

    callback(session,
         buildSpeechletResponse(session.command.toUpperCase(), speechOutput, repromptText, shouldEndSession));
}
The performObject function is similar, but expects an object to be passed along with the intent for commands such as TAKE BOTTLE or EXAMINE BONE. The command is the intent name, a space, and the object name, which is passed as a skill slot value.
// Perform a command that references an object - ex: EXAMINE BONE

function performObject(command, intent, alexaSession, callback) {
    const repromptText = "Tell me a command, or say HELP for a list of commands.";
    let shouldEndSession = false;
    let speechOutput = '';
    
    if (alexaSession.attributes)
      session = alexaSession.attributes;
        
    command += " " + intent.slots.object.value;

    speechOutput = cmd(command) + suffix();

    callback(session,
         buildSpeechletResponse(session.command.toUpperCase(), speechOutput, repromptText, shouldEndSession));
}
Both perform and performObject must extract the saved session variables from alexaSession.attributes in order to know the game state. Both must pass session back through callback, alongside the response from buildSpeechletResponse, so that the updated game state is preserved.
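
One thing to watch for: a slot can arrive without a value when Alexa matches the intent but not the object, in which case intent.slots.object.value is undefined. The following is a small defensive sketch, not part of the skill as published; slotValue and buildObjectCommand are hypothetical helper names.

```javascript
// Returns the spoken value of a slot, or null when Alexa did not fill it.
function slotValue(intent, slotName) {
    const slot = intent && intent.slots && intent.slots[slotName];
    return slot && slot.value ? slot.value : null;
}

// Builds the text command, or returns null so the caller can ask "Take what?"
function buildObjectCommand(verb, intent) {
    const object = slotValue(intent, 'object');
    return object ? `${verb} ${object}` : null;
}

console.log(buildObjectCommand('TAKE', { slots: { object: { name: 'object', value: 'bottle' } } }));
console.log(buildObjectCommand('TAKE', { slots: { object: { name: 'object' } } }));
```

With a guard like this, the game never sees a command such as "TAKE undefined".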

buildSpeechletResponse

The buildSpeechletResponse function builds the speech output. Functions like perform and performObject pass output containing the text we want to speak. buildSpeechletResponse creates the JSON that causes that text to be spoken.
// --------------- Helpers that build all of the responses -----------------------

function buildSpeechletResponse(title, output, repromptText, shouldEndSession) {
    
    var outputSpeech = null;
    
    var cardText = replace(output, '', '\r\n\r\n');

    if (output.indexOf('<') != -1) {
        outputSpeech = {   // output contains markup (audio, breaks) - output SSML
            type: 'SSML',
            ssml: '<speak>' + output + '</speak>',
        };   
    }
    else {
        outputSpeech = {  // output is just text
            type: 'PlainText',
            text: output
        };
    }

    return {
        outputSpeech: outputSpeech,
        card: {
            type: 'Standard',
            title: `${title}`,
            text: cardText,
            content: `SessionSpeechlet - ${output}`,
            image: {
                "smallImageUrl": session.imageUrl,
                "largeImageUrl": session.imageUrl
            }
        },
        reprompt: {
            outputSpeech: {
                type: 'PlainText',
                text: repromptText,
            },
        },
        shouldEndSession,
    };
}
In the case of Alexa devices that can display cards (visuals), output is also created that includes a title of the spoken command, and an image to go with the current room. Those images are my photos of Pompeii, hosted in S3. Here's what the display looks like:

Card Display
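
The choice between SSML and plain text can be isolated into a small function. Here is a sketch of that decision on its own; buildOutputSpeech is a hypothetical name, and the sample sentences are invented. Alexa expects SSML to be wrapped in a speak element.

```javascript
// Choose SSML when the text contains markup, plain text otherwise.
// SSML sent to Alexa must be wrapped in a <speak> element.
function buildOutputSpeech(output) {
    if (output.indexOf('<') !== -1) {
        return { type: 'SSML', ssml: `<speak>${output}</speak>` };
    }
    return { type: 'PlainText', text: output };
}

console.log(buildOutputSpeech('You are in the forum.'));
console.log(buildOutputSpeech('A door creaks. <break time="1s"/> Then silence.'));
```

Since SSML is XML, any game text that takes the SSML path and contains a literal ampersand would need to be escaped first.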

And that's it. The rest of the code is game data and game code, which I've described somewhat in Part 1. I'm not going to post the full code just yet since that would give away secrets of the game.

I've created other Alexa skills, so the skills creation went pretty rapidly. One odd thing I ran into is that Alexa doesn't seem to recognize the term "statue" even though it's in the list of objects I defined for the skill. I had to code around this for now pending further study. Aside from that, I didn't run into any problems. If your skill wants to remain open pending further input, which this game does, you are required to prompt the user with each spoken response. That's why Roman Ruins is constantly asking "What next?" after it tells you something. It wouldn't pass certification otherwise.
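
The suffix() call in the command functions is where that prompt gets appended. The real function isn't shown in this post, so the following is only a guess at its shape: it takes the state as a parameter to stay self-contained, and the game-over wording is invented.

```javascript
// Hypothetical version of suffix(): every response ends with a prompt,
// because the session stays open and Alexa keeps listening.
function suffix(state) {
    return state.gameOver
        ? ' Say restart to play again.'   // assumed wording, not from the game
        : ' What next?';
}

console.log('You take the bottle.' + suffix({ gameOver: false }));
```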

I submitted the Roman Ruins skill over the weekend and it was approved this morning. Here's how the listing looks on Amazon.com:


Playing Roman Ruins Adventure on Alexa

So, what's it like to play this on Alexa? Below is an excerpt. It works pretty well, but you can always do more to handle a wider range of responses, and you particularly feel that when you're working on voice applications.

Playing Roman Ruins Adventure on Alexa

Feel free to play! Try "Alexa, open Roman Ruins" and let me know what you think. As I extend and polish the game, my goal is to make it better and better. Your feedback will help a great deal.

Previous: Part 1: Nostalgia and Game Design
Next: Part 3: Twitch Minigame


