JeffBeal.net

Job 8: Amazon Alexa

Towards the end of my year at Google, I was finding myself generally disappointed with my experiences at the company. After a couple of years at Amazon building an amazing product, and learning and growing as much as I did on that project, I found my year at Google a little bit of a letdown. Specifically as it related to my career, I found that I missed being a manager. Google managers tended to be more business-oriented than technical, and I didn't see myself being as happy in that role, so I decided to reach out to a couple of my former managers at Amazon about the possibility of going back.

I interviewed for two roles at Amazon. The success of RDS as a service for AWS had allowed that organization to launch several new database products, and one of the roles I interviewed for was an early manager role on DynamoDB. My original manager at Amazon from the eCommerce Platform Product Group also had an intriguing position he thought I would be good for, but couldn't tell me much about.

"I can tell you it will be a new type of electronic device, think Kindle," he said, "but I can't tell you what it will be. I think it could potentially be in every home in America if we are successful."

While the DynamoDB role was extremely interesting, I ended up choosing the mystery device, and do not regret that decision. On my first day back at Amazon, after going through the standard new hire orientation, I met my manager for lunch, and he still couldn't tell me what we were working on. Only after getting to the Fiona building (named after the code name for the original Kindle device) and to our project's secure area on the fifth floor did I finally learn that I would be working on a small black cylinder with no screen that aspired to be the world's first computing device controlled with only your voice. The name "Alexa" hadn't been chosen yet. Internally, those of us on the project team called ourselves "Doppler", but outside of our secure walls, we were only "Project D".

As the father of three young children, one of the first things I realized about this device was that it posed some interesting challenges in the domain of identification, authentication, and authorization, so I asked to be put in charge of those areas. Our first challenge was just figuring out a decent way to register the device without a screen. My team built the first versions of the device registration flows for the Echo devices, which required users first to connect to a Wi-Fi network hosted by the device itself, and then to submit their home Wi-Fi credentials to the device. Once we had that part minimally working, we started to think about how to handle different aspects of the Alexa UX that needed to take into account the identity of the person issuing verbal commands.

Ideally, the voice recognition technology behind Alexa would have been able to automatically (and correctly) distinguish between different speakers in the household, but a cursory review of the state of the art at the time was not encouraging. The best speaker identification models then available either required extensive enrollment processes (something we wanted to avoid for Alexa) or were highly susceptible to false positives within a family. So, what we proposed to build for the initial launch of the Echo was that, for any activity where the specific identity of the human mattered, Alexa would ask the user to provide their name:

Alexa, remind me to pick up the cleaning on my way home tonight

OK. To confirm, whose reminder list should I add this to?

This is Jeff.

OK, Jeff; I've added that to your list.

For more sensitive actions, such as eventually ordering things from Amazon, we planned to offer either a spoken security code, or a manual confirmation step on the user's phone, to provide stronger authentication.

When it came time for us to start figuring out how to build this, we started to talk with our teammates on the platform team and the speech and language teams, and realized that there weren't any other use cases planned for the Alexa launch that required this sort of multi-step interactive dialogue between the human speaker and Alexa. Everything else was predominantly a one-shot command that Alexa would interpret and respond to with a final action. The framework for multi-step dialogue just didn't exist yet.

After a bit of conversation with the organization leadership, part of my team (myself included) was reorganized under the Alexa platform team to try to build some of the software primitives that would eventually enable conversational interactions with Alexa. This team ended up building a lot of the integration code between the speech and language systems of Alexa and what we considered the Alexa application teams (music, tasks, alarms, etc.). In building this, we worked closely with the ML scientists who were building the language understanding components, and tried to come up with good programming abstractions to allow application developers to respond to speech events, without needing to become experts in speech understanding themselves.

Since the idea for this team originated with a realization that the platform didn't have capabilities that my original team was looking for, I wanted to make sure we could anticipate functionality needed by other teams, and that we could build out support for features before they were needed. To that end, I spent a lot of my time trying to learn as much as I could about the desired feature set for Alexa, and the evolving capabilities of Alexa's speech and language components. I leaned on engineers on my own team for a lot of it, but also tried to talk with as many people across the larger organization as I could, including other team managers, ML scientists, product managers, and UX designers. For the most part, these teams were focused on the eventual end user experience; it fell to me and my team to figure out how to build out a roadmap of platform capabilities that would enable those experiences.

Although I had held the manager title for maybe a year total during my previous two roles at Amazon, the Alexa team was the first role where I was managing a team where I had never been an engineer. After my year at Google, where I found going back to being a software engineer very dissatisfying, I worked hard to understand the difference between being a successful engineer and a successful manager, and to really build and develop my leadership and management skillsets. I wasn't entirely successful at keeping my engineer hat off (a lot of the feedback I received from my team was that I didn't leave them enough room to decide how to implement things, and that my project roadmaps included too many of my own assumptions about how the system should be built), but I did try, and was definitely better towards the end of my time on the team.

In terms of my growth as a manager, my time on Alexa demanded a lot more from me than my previous short turns at the role had done. As I alluded to above, I had a lot of the responsibility for determining the strategy and vision for my team. Especially in the early days of the project, there weren't enough dedicated Product Managers to have anybody supporting my teams full-time, so the work of determining priorities fell largely on me. In this effort, I wrote several papers using Amazon's 6-Pager format describing things like Echo device registration, how Alexa would work with multiple people in a household, how we would create capabilities for multi-step conversational dialogue, and how we would measure success and efficiency as users were interacting with the device. Each of these papers involved weeks of writing and revising for me before reviewing it in an intense, two-hour session with all of Alexa team leadership.

As we got closer to the Alexa launch, it became clearer that we would launch without full support for a lot of what we had wanted to build on this team. At the same time, one of the other team managers decided to leave for a role elsewhere at Amazon, leaving a key application team without a manager. One of the engineers on my team had expressed a desire to take on a management role, so we worked on a reorganization where he would be promoted to manager of our original team (which, due to a simplified scope, could tolerate losing an engineer), and I would step into the newly open role in the applications organization. In that role, I helped get the timers, alarms, and reminders applications fully ready for the Alexa launch, and also led a remote team in Poland to build out the first version of the Daily Briefing, a relatively bare-bones news application for Alexa.

1: Lessons I'm Learning

Predicting the future is hard

From the beginning of the Alexa project, the organization's VPs emphasized the importance of being able to predict when features and software would be ready to launch. Launching Alexa to the public was going to involve a significant marketing push, and would require some lead time to make sure we had a stockpile of Echo devices manufactured and ready to sell, so they wanted to avoid scenarios where those processes needed to be put on hold because software wasn't ready when it was supposed to be. Every engineering team at Alexa maintained a fully scoped-out and estimated roadmap that, at least in theory, included everything that team needed to accomplish before Alexa was ready for public use. We kept track of how much work we did each 2-week sprint, and reported on an ongoing basis whether we were on schedule or running behind. Tracking projects with this level of detail required a LOT of work, but in the end, I don't know how successful it was. When I was hired in January 2012, the official target launch date was April of that year, but we were told that was already proving too optimistic, and that we should expect to ship by Thanksgiving for the Christmas rush. In the end, Alexa was announced a full two years after that, in November 2014, with limited availability. The dates shifted multiple times during the nearly three years I was on the project, and at every point in time, teams had a fully defined, estimated roadmap proving they would achieve that date.

Vision, or execution?

Relatively early in my time on Alexa, I asked one of my managers what I should focus on in terms of my own career growth, and his answer was that I should learn to shift my focus more towards defining and setting a vision for the team than on leading the team in execution. I still struggle with this distinction. I don't remember if I said this out loud at the time, but I remember thinking that there was more than enough vision in "Build the world's first fully voice-controlled computing devices" to go around, and if all of the managers were focused on defining this vision, who would focus on getting it done? In looking back at this role, I think I did a decent job at putting together an ambitious vision for my team, but I am proudest of the work we did as a team to execute, build, and ship software.