Amazon and Testing in Production: Some good, some bad

7 August, 2015 (08:52) | Uncategorized | By: seth

Amazon has a well-deserved reputation for being data-driven in its decision making.  Testing in Production (TiP) is a vital part of this, but it may not always have been approached as a legitimate methodology rather than an ad hoc one.  An example of the latter can be seen by anyone on the production site who searches for {test ASIN}, where “ASIN” is the Amazon Standard Identification Number assigned to all items for sale on the Amazon site.  Such a search will turn up the following “Amazon items for sale”:

This is TiP done poorly: it diminishes the perceived quality of the website and exposes customers to risk. A $99,999 charge (or even a $200 one) for a bogus item would not be a customer-satisfying experience.

Another “TiP slip” occurred prior to the launch of Amazon Unbox (now Amazon Instant Video).  Amazon attempted to use Exposure Control to limit access to the not-yet-launched site; however, an enterprising “hacker” found the information anyway and made it public.
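Exposure Control only protects what it actually gates. As a minimal sketch of the idea, assuming a server-side allowlist plus a percentage rollout (every name here, including ALLOWED_USERS, ROLLOUT_PERCENT and is_exposed, is hypothetical and not Amazon's actual mechanism), the decision might look like this:

```python
import hashlib

# Hypothetical configuration: these names and values are illustrative only.
ALLOWED_USERS = {"employee-1", "employee-2"}  # internal users who may always see the feature
ROLLOUT_PERCENT = 0  # share of the general public (0-100) who may see it


def bucket(user_id: str, feature: str) -> int:
    """Map a user deterministically to a bucket in the range [0, 100)."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_exposed(user_id: str, feature: str = "unbox") -> bool:
    """Decide on the server whether this user may see the feature."""
    if user_id in ALLOWED_USERS:
        return True
    return bucket(user_id, feature) < ROLLOUT_PERCENT


def handle_request(user_id: str) -> int:
    """Return an HTTP-style status: 200 if exposed, 404 otherwise."""
    return 200 if is_exposed(user_id) else 404
```

The point of the sketch is that the check runs on every request for the unlaunched content, so a user outside the exposed group gets nothing back, rather than getting content that is merely unlinked.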

However, Amazon’s TiP successes should outweigh these missteps.  Greg Linden talks about the A/B experiment he ran to show that making recommendations based on the contents of your shopping cart was a good thing (where “good thing” equals more sales for Amazon).  A key takeaway was that prior to the experiment an SVP thought this was a bad idea, but as Greg says:

I heard the SVP was angry when he discovered I was pushing out a test. But, even for top executives, it was hard to block a test. Measurement is good. The only good argument against testing would be that the negative impact might be so severe that Amazon couldn’t afford it, a difficult claim to make. The test rolled out.

The results were clear. Not only did it win, but the feature won by such a wide margin that not having it live was costing Amazon a noticeable chunk of change. With new urgency, shopping cart recommendations launched.
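To make “the results were clear” concrete, here is a minimal sketch of how an A/B experiment like this could be evaluated, assuming the metric is a simple conversion rate and using a two-proportion z-test. This is a standard technique swapped in for illustration; the source does not say which analysis Amazon used, and the function name and numbers are made up.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Compare conversion rates of control (A) and treatment (B).

    Returns (z, p_value), where p_value is two-sided.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Illustrative numbers only: 1,000 users per arm, 100 vs. 150 purchases.
z, p = two_proportion_z_test(100, 1000, 150, 1000)
```

A large positive z with a small p-value is the statistical version of a feature that “won by such a wide margin”; a result near zero would have supported the SVP's skepticism instead.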

Another success involved the move of Amazon’s ordering pipeline (where purchase transactions are handled) to a new platform (along with the rest of the site).  Because this was a “simple” migration, the developers did not expect much trouble. However, the testers’ wisdom prevailed, and a series of online experiments used TiP to uncover revenue-impacting problems before the launch [Testing with Real Users, slide 56].

