What I Learned From Building an Alexa Kid’s Skill with the New Alexa Python SDK

Writing an Alexa kid’s skill is something I had been intending to do for a while. I have young ones at home, and they frequently use our Echo Dot. I like that Alexa’s emphasis on voice-first interactions provides a different experience than using a TV or iPad, even if the conversation is ultimately with a bot.

Recently, Amazon announced the availability of the Alexa Skills Kit SDK for Python, complementing their existing SDKs for Node.js and Java. As a regular Python user, I took this announcement as the impetus to dive in.

Animal Patterns

The possibilities for Alexa skills are virtually limitless. I settled on building a skill called “Animal Patterns,” intended for pre-school or kindergarten-age kids who are learning to recognize patterns. The skill is a simple quiz-style game that asks kids to complete a series of patterns. For example:

turkey dog turkey dog -> What comes next?

lion bear bear lion bear bear lion -> What comes next?

The skill randomly selects patterns and varies their difficulty based on prior responses.
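
As a rough illustration of how such patterns could be generated, here is a minimal sketch. It is not the skill's actual code: the animal list, the `make_pattern` and `next_difficulty` helpers, and the three difficulty levels are all assumptions made for this example. The idea is to repeat a short unit of animals and cut the sequence off so that the next animal is the answer.

```python
import random

# Hypothetical animal list; the real skill's vocabulary may differ.
ANIMALS = ["turkey", "dog", "lion", "bear", "cow", "pig"]

# Hypothetical difficulty levels. Each level lists the shapes of repeating
# unit it may use, written as index tuples: (0, 1, 1) means "A B B".
UNIT_SHAPES = {
    1: [(0, 1)],                    # A B A B ...
    2: [(0, 1, 1), (0, 0, 1)],      # A B B ... or A A B ...
    3: [(0, 1, 2), (0, 1, 1, 2)],   # A B C ... or A B B C ...
}


def make_pattern(difficulty, shown=7, rng=random):
    """Return (animals_shown, answer) for one question."""
    shape = rng.choice(UNIT_SHAPES[difficulty])
    # Pick as many distinct animals as the shape needs.
    names = rng.sample(ANIMALS, max(shape) + 1)
    unit = [names[i] for i in shape]
    # Repeat the unit; the item right after the shown ones is the answer.
    sequence = [unit[i % len(unit)] for i in range(shown + 1)]
    return sequence[:shown], sequence[shown]


def next_difficulty(difficulty, was_correct):
    """Step up after a correct answer and down after a wrong one."""
    step = 1 if was_correct else -1
    return min(max(difficulty + step, min(UNIT_SHAPES)), max(UNIT_SHAPES))
```

Calling `make_pattern(2)` with the "A B B" shape would produce something like the lion and bear example above, and feeding each answer's result into `next_difficulty` is one simple way to vary difficulty based on prior responses.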

Here’s a simplified flow, which omits the built-in user intents like “Help” and “Stop.”

[Figure: Animal Patterns flow]

I chose to implement the server-side logic using AWS Lambda, which is by far the easiest way to develop an Alexa skill.

The Python SDK is very similar in design to the Node.js and Java SDKs. It reduces boilerplate code in the areas of request parsing, request routing, response building, and session management. One convenience specific to Python is the ability to define request handlers with decorators. This further reduces boilerplate code by removing the need to define a class per request handler. For example, instead of writing the following code:

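from ask_sdk_core.skill_builder import SkillBuilder
from ask_sdk_core.dispatch_components import AbstractRequestHandler

sb = SkillBuilder()

class AnimalIntentHandler(AbstractRequestHandler):
    def can_handle(self, handler_input):
        # Match only IntentRequests whose intent is AnimalIntent
        request = handler_input.request_envelope.request
        return (request.object_type == "IntentRequest"
                and request.intent.name == "AnimalIntent")

    def handle(self, handler_input):
        # request handler code ...
        return handler_input.response_builder.response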

# register all request handler classes
sb.request_handlers.extend([
    AnimalIntentHandler(),
    ...])

One can simply write:

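from ask_sdk_core.skill_builder import SkillBuilder
from ask_sdk_core.utils import is_intent_name

sb = SkillBuilder()

@sb.request_handler(can_handle_func=is_intent_name("AnimalIntent"))
def animal_intent_handler(handler_input):
    # request handler code ...
    return handler_input.response_builder.response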

One gotcha I ran into that is specific to the Animal Patterns skill is a limit on the number of audio clips that can be embedded within an Alexa response. I initially intended for Alexa to play an animal sound every time it said the name of the animal in the pattern, like this:

lion <lion roar sound> bear <bear growl sound> bear <bear growl sound>
lion <lion roar sound> bear <bear growl sound> bear <bear growl sound>
lion <lion roar sound> -> What comes next?

But Alexa allows at most 5 audio clips in a single response. This limitation is documented here, and has been raised as an issue here. Lifting this limitation would certainly enable a wider range of Alexa skills. I ended up removing the animal sounds from the patterns altogether.
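
For what it's worth, one alternative to dropping the sounds entirely would be to play a sound only the first time each animal is named, and to stop adding clips once the limit is reached. The sketch below is only an illustration of that idea and not something the skill does: the `build_pattern_ssml` helper and the sound URLs are hypothetical, and only the SSML `<speak>` and `<audio>` tags come from Alexa itself.

```python
MAX_AUDIO_CLIPS = 5  # the per-response limit described above

# Hypothetical clip locations, used here purely as placeholders.
SOUND_URLS = {
    "lion": "https://example.com/sounds/lion.mp3",
    "bear": "https://example.com/sounds/bear.mp3",
}


def build_pattern_ssml(animals, sound_urls=SOUND_URLS,
                       max_clips=MAX_AUDIO_CLIPS):
    """Build SSML that names each animal and plays each sound at most once."""
    parts = []
    played = set()
    for animal in animals:
        parts.append(animal)
        url = sound_urls.get(animal)
        # Add a clip only on first mention, and only while under the limit.
        if url and animal not in played and len(played) < max_clips:
            parts.append('<audio src="{}"/>'.format(url))
            played.add(animal)
    return "<speak>{} What comes next?</speak>".format(" ".join(parts))


ssml = build_pattern_ssml(
    ["lion", "bear", "bear", "lion", "bear", "bear", "lion"])
```

With the seven-animal pattern above this yields two clips instead of seven, at the cost of the sounds no longer reinforcing every repetition.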

Tips for Making a Kid-Friendly Skill

My beta testers at home provided some valuable feedback that I incorporated into the final version of the skill. Their feedback led me to the following tips for building a kid-friendly skill:

Keep the interaction model as focused and simple as possible
The speech of young kids is imperfect and can be hard for Alexa to understand. Free-form input is especially difficult. Alexa is not very forgiving of long pauses, which kids sometimes take to think things over. Noise in the environment, made worse with multiple kids, is also a factor. And sometimes kids say completely nonsensical things. All of these factors point to keeping things simple. The more back and forth there is between the skill and the user, the greater the risk that the conversation goes sideways. With Animal Patterns, I intentionally chose to have it jump directly into the game after it was launched, skipping over any intermediate setup. With each question, kids are only asked to provide the name of a single animal. It’s about as directed as it can get.

Gamify the skill
Gamification is a tried-and-true strategy, for kids and adults alike. With Animal Patterns, I prefaced the presentation of the patterns with a story. For instance, Old MacDonald left the gate of his farm open and his animals escaped. He needs to get them back, and you can help him by completing the patterns. To make things more interesting, I came up with a couple of different stories, and the skill randomly chooses one each time it is launched.

Keep the conversation short
Kids, especially young kids, don’t have much of an attention span. I initially designed the skill to ask 7 to 10 pattern questions. That proved to be too many, and I brought the number down to 5.
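
To make the cap concrete, here is a minimal sketch of counting questions over the course of a game. It uses a plain dict as a stand-in for the session attributes that the SDK's session management provides; the key names and the `record_answer` and `closing_speech` helpers are hypothetical and not taken from the skill's source.

```python
QUESTIONS_PER_GAME = 5  # the number I settled on


def record_answer(session, was_correct):
    """Update the per-session counters; return True when the game is over."""
    session["asked"] = session.get("asked", 0) + 1
    if was_correct:
        session["correct"] = session.get("correct", 0) + 1
    return session["asked"] >= QUESTIONS_PER_GAME


def closing_speech(session):
    """Summarize the game once the last question has been answered."""
    return "You got {} out of {} patterns right. Thanks for playing!".format(
        session.get("correct", 0), session.get("asked", 0))
```

A handler would call `record_answer` after each response and, once it returns True, speak the closing message instead of asking another pattern.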

Conclusion

The process of creating a simple Alexa skill is pretty straightforward. The Alexa Skills Kit SDK (Python or otherwise) reduces boilerplate code and lets one focus more on the skill’s business logic and user experience. At the same time, from developing this simple skill, it’s apparent to me that there’s hidden complexity in scaling a skill to do more – a topic for a future post.

If you’re interested, full source for the skill is available here, and the skill itself is available on Alexa here.
