Job 5: Amazon eCommerce Platform

When thinking about where to go after Syndesis, my wife and I decided we wanted to look at opportunities outside of Pittsburgh. My wife had moved to Pittsburgh in 4th grade, and most of her family was still there, but there weren't a ton of job openings at any given time that were the type of work I was looking for, and it seemed there could be better options out there. I had grown up in Boise, and (strange as this may seem) generally liked it, so we started out by creating a job profile (I believe monster.com was the prevailing choice at the time) and targeting a job search in Boise.

I like to joke that I overshot: a recruiter from Amazon saw my profile and reached out about an engineer opening at their Seattle HQ on a team called the eCommerce Platform Product Group, or EPPG. In 2007, Amazon had already evolved from an online bookstore into an everything store, though most of its sales were still physical media (books, CDs, and DVDs). Amazon Web Services still looked like an experiment, and digital media like Kindle, Amazon's MP3 store, and Prime Video hadn't happened yet. The EPPG team was based on the idea of taking Amazon's internal eCommerce platform (the software and services that powered shopping on Amazon's own site) and making it available to developers at any other company, up to and including Amazon's largest direct competitors in the retail space. (Target.com, I learned, was actually built by Amazon, and part of the idea behind our product was that by giving Target developers direct access to Amazon APIs, we would give Target more control over their own website.)

Amazon's service architecture at the time was a mostly proprietary stack built on Tibco Rendezvous but based on a lot of pieces of the SOAP standard. SOAP was a popular and increasingly standard way to define how remote services could represent their APIs, but Amazon was concerned that XML-over-HTTP would add too much performance overhead, so they replaced it with Tibco Rendezvous as the transport protocol. In theory, all we had to do was build a proxy that could convert the XML-over-HTTP messages of standard SOAP into the Tibco Rendezvous message format and back, and we could easily expose any existing service at Amazon to external developers.

By the time I joined the team, the SOAP–Tibco proxy already existed and the team had demonstrated they could successfully make SOAP API calls that would get proxied to internal services. One of the things they had found, just in getting this set up, was that Amazon's internal teams were not very consistent about basic naming patterns. Punctuation, capitalization, and other API design patterns varied widely based on the preferences of each team, which was OK for internal-only services, but would be a major impediment to the experience of external developers working with the SDK. I heard a story that one of the most senior executives at Amazon was so upset about this type of inconsistency that he declared in one meeting that our project could be deemed a success if all we managed was to come up with a consistent way to spell MarketplaceId (a nearly universal parameter on most APIs) across all services.

After a couple of starter projects, the main piece I started working on was to solve this problem. I felt like my early experience working with XML transformation at ANSYS helped me to wrap my head around some of the things we needed to do as part of this API transformation piece. SOAP API specifications were all defined in XML (and this was one of the pieces of SOAP that Amazon kept), so I wrote a transformation language that would allow us to rename and replace different fields and parameters in our APIs, and include instructions in our proxy to perform similar translations on the HTTP requests and responses. We didn't succeed in standardizing how MarketplaceId was spelled internally to Amazon, but the external SDK we created was, at least, consistent. Since we were applying these changes starting at the API definition, we were also able to automatically generate documentation (in DocBook) for the APIs starting with the transformed names.
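To make the renaming idea concrete, here is a minimal sketch in Python. It is not the transformation language described above, and it works on plain dictionaries rather than SOAP messages. The service names and every internal spelling are invented for illustration; only MarketplaceId comes from the story. The point it shows is that one table of rename rules per service can drive both directions of the proxy: internal names become the external spelling on responses, and the inverse mapping restores each service's own spelling on requests.

```python
# Hypothetical per-service rename rules: internal spelling -> external spelling.
# The service names and internal spellings are made up for this sketch.
SERVICE_RULES = {
    "catalog": {"marketplaceID": "MarketplaceId"},
    "pricing": {"marketplace_id": "MarketplaceId"},
}


def rename_keys(message, rules):
    """Recursively rename dictionary keys; keys without a rule are left alone."""
    if isinstance(message, dict):
        return {rules.get(key, key): rename_keys(value, rules)
                for key, value in message.items()}
    if isinstance(message, list):
        return [rename_keys(item, rules) for item in message]
    return message


def to_external(service, message):
    """Translate a response from an internal service into external names."""
    return rename_keys(message, SERVICE_RULES[service])


def to_internal(service, message):
    """Translate an external request back into that service's own names."""
    inverse = {ext: internal for internal, ext in SERVICE_RULES[service].items()}
    return rename_keys(message, inverse)


if __name__ == "__main__":
    internal_response = {"marketplace_id": "US", "offers": [{"marketplace_id": "US"}]}
    print(to_external("pricing", internal_response))
    print(to_internal("catalog", {"MarketplaceId": "US"}))
```

Because the same rules table can also be applied to the API definition itself, generated artifacts such as documentation would pick up the external names without a separate mapping.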

While we were working on this project, Amazon was starting to grow its business via acquisition. Shopbop, Zappos, and CreateSpace were a few companies acquired during this time period. As somewhat of a captive audience, we started to work with engineers from these companies (Shopbop in particular) to try to migrate their applications to the SDK we were building. Up to this point, I had approached the Amazon APIs as something that had been built by engineers much smarter, more capable, and more experienced than I was. All that was probably true, but what I didn't account for was that APIs need to be built for the user, not the builder. When we started getting feedback from external developers, we quickly learned that what worked and made sense for internal Amazon systems didn't necessarily make sense externally.

As one example, Amazon's overall system depended heavily on caching at every level. In part to maximize cacheability of data, services tended to expose data very granularly. The search API didn't return full product data; it only returned a product ID. This allowed clients of the search API to cache the details of each product ID individually, which was more efficient for products that tended to appear in the results of many searches, but it increased the complexity of building the search page of a website. Developers couldn't just fire off a search and display the results; they then had to make multiple API requests for each product: one for the product details, one for the product price, and another for the product images (and those are just the ones I remember). When all of this was happening inside Amazon's data center, and caching was working well at every layer, the overall performance of the system was pretty decent. When each API request meant a round trip between client data centers and Amazon's data centers, the time it took to load each page was abysmal.
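A little arithmetic shows why the granular design hurt so much once it left the data center. The sketch below assumes one search call plus the three per-product calls mentioned above (details, price, images), made one after another with no caching. The result count and both latency figures are invented for illustration, not measurements from the real system.

```python
# The three per-product lookups named in the text: details, price, images.
CALLS_PER_PRODUCT = 3


def round_trips(num_results):
    """One search call, then three follow-up calls for every product returned."""
    return 1 + CALLS_PER_PRODUCT * num_results


def page_load_ms(num_results, latency_ms):
    """Total wait if every call is made sequentially and nothing is cached."""
    return round_trips(num_results) * latency_ms


if __name__ == "__main__":
    results = 20           # assumed number of products on a search page
    internal_latency = 1   # assumed milliseconds per call inside one data center
    external_latency = 80  # assumed milliseconds per call between data centers

    print(round_trips(results))                     # 61 calls for one page
    print(page_load_ms(results, internal_latency))  # 61 ms
    print(page_load_ms(results, external_latency))  # 4880 ms
```

Issuing the calls in parallel or caching on the client side would reduce the total, but the number of calls still grows with every product on the page, which is the part an external developer cannot design away.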

We had built a working system, but not, as it turns out, a viable product.

Around the same time we were starting to get this feedback from our captive beta testers at Shopbop, there was some key turnover in major roles in the organization. The Director of the organization left for a social media startup, and my manager left to run a separate organization at Amazon. We hired a couple of replacements, but I also was offered, and accepted, the responsibility to manage this small team of 3-4 engineers. After what I remember as a couple of months, the project was cancelled and our entire team was moved to a different organization to work on something completely different that has had a lasting impact on my professional development, Amazon's business success, and maybe even the software development industry as a whole.

1: Lessons I'm Learning

Even though this particular role at Amazon lasted just a little over a year, I don't think there is any way I can fit everything I learned into one article. I will probably touch on some other aspects of this role in more detail in later articles, but wanted to just highlight a couple of areas where this role left a particularly large impact on my career.

Failure isn't always failure

The biggest lesson for me was in how this project ended. In many ways, it was an abject failure (even though we did manage to spell MarketplaceId consistently), certainly from any business or profit/loss analysis perspective. However, both individually and as a team, we learned a lot. That learning was, in and of itself, enough of a success to make the endeavor something less than a total waste, but it also got somebody's attention enough that our team was attached to a much more important project than where we had started. I think a lot about what might have been different if Amazon of 2008 had decided to lay off our team when our project was cancelled instead of moving us to the new project. It certainly would have been different for the other engineers and me, but maybe Amazon also would have missed our contributions over the next few years.

Agile Software Development can work

The manager who was hired to be my supervisor towards the end of this project challenged our team to read some of the original books on the Scrum software development methodology, and our team agreed that adopting those practices would help us get more done than we were getting done at the time. Ever since, I've generally been an advocate for adopting some or all of the Scrum methodology on most teams I've been on. My opinions have definitely evolved over time, and I'm somewhat less of a zealous advocate now than I've been at some points in the past, but for a significant part of my career, Scrum was a tool I leaned on very heavily, and that started with this role.

JeffBeal.net
