Why hasn't Amazon fixed things overnight?

Well, if you have read Amazon CTO Werner Vogels' blog or seen any of his recent presentations, you'll definitely be inclined (like me) to favour "cock-up over conspiracy" as the explanation for the current shambles.

So why have things gone this desperately wrong this quickly?

The simple answer is Amazon's architecture. It's highly distributed, and there's no operations team. Each component (and over 200 go into a single page) is run by its own development team of four to five people. They are responsible for its features, its development - and for making sure it runs effectively. The result should be a company that can move quickly in response to outside events.

At least that's the theory.

I'm afraid the real world doesn't work like that. I've been a developer and I've managed developers and I can tell you that what really happens is something like this:

Someone comes up with a neat idea that they evangelise among the other developers, and it gets added to the platform. The developers become wedded to their idea, and they keep adding features. Something from the outside occurs that affects the data managed by the service, and they don't notice. After all, it's their design and it's perfect. The problem gets worse, and a few external symptoms are noted and passed on to the developers. They're too busy to pay much attention to them, and so they ignore them. Then suddenly, BANG, and everything breaks.

Oh, and it's a holiday weekend and there's no one there to actually handle the problem as the whole team's gone off on a skiing trip.

Now I can't guarantee that's what has happened with the deletion of GLBT content from the Amazon ratings system, but I suspect it's more likely than not.

So here's where my conjecture comes in:

Someone probably had the idea of reducing Amazon's exposure to bad publicity without increasing the site's legal liability. Manual censorship of the rankings would certainly make the service more liable, so the idea was probably a tool that would let the site's users do the work for it. After all, if the community doesn't like it, then, well, US community standards laws apply and you're safe. A group of developers coded it up, and it worked well - for a while.

Either a parameter wasn't quite right, or someone released a new version of a keyword file without testing - and, well, suddenly the GLBT books were off the list. Maybe someone gamed the system, too - it's impossible to tell from outside.

A separate test and operations team would probably have spotted the underlying flaw before it was released - or at least spotted the first wave of complaints and started to triage them effectively, with a more productive response than "It's a glitch".

So now Amazon has to unwind data that's spread across its distributed application platform, which may be stored in any or all of three different kinds of database, and in at least three different geographies and many more data centres.


That's going to take a while to deal with.

Meanwhile their Seattle-based PR team is just about to start a very long day - and a group of developers is going to be desperately trying to explain just what went wrong.

[ETA 23/4/2012. After three years of this post being targeted heavily by spammers, I have locked commenting.]


Apr. 14th, 2009 10:28 pm (UTC)
Hmm, I seem to be missing the edit button. Oh well.

To add to the previous post: In other words, call me when something significant actually happens, as opposed to when people say something significant will happen.

There was a specific action, but I'd love to hear how you've determined that the small movements we have seen were due to this action as opposed to noise.

There are no results for me to rationalize away. 3% down is not a significant result, no matter how much you wish it so.

Apr. 14th, 2009 10:52 pm (UTC)
"I'd love to hear how you've determined that the small movements we have seen were due to this action as opposed to noise."

Because we have six data points for comparison: AMZN vs. each index of the Dow, S&P, and NASDAQ, both today and yesterday. Of those six data points, AMZN has underperformed all six times. While it's possible 0-for-6 is a random result, it doesn't seem likely.
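
As a rough check on that 0-for-6 claim, here is a minimal Python sketch. It assumes each comparison is an independent coin flip, which is a strong assumption: the three indices tend to move together, so each day may be closer to one comparison than three. The function name `prob_at_most_k_wins` is made up for illustration.

```python
from math import comb


def prob_at_most_k_wins(n: int, k: int, p: float = 0.5) -> float:
    """Binomial probability of at most k 'wins' in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Assumption: six independent comparisons, each a fair coin flip.
p_six = prob_at_most_k_wins(6, 0)  # same as 0.5 ** 6

# Assumption: the three indices move together, so only the two days
# count as independent comparisons.
p_two = prob_at_most_k_wins(2, 0)  # same as 0.5 ** 2

print(f"0-for-6 by chance: {p_six:.4f}")
print(f"0-for-2 by chance: {p_two:.4f}")
```

Under the independence assumption the chance of 0-for-6 is about 1.6%. If only the two days are effectively independent, it is 25%, which is far less striking.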

"There are no results for me to rationalize away. 3% down is not a significant result..."

Hey, billg! We haven't talked in ages! (Not an exaggeration -- Atlanta, 1996. And Bill's known to post anonymously. Just ask Melinda.) May I have an insignificant $1.1B, please? Heck, I'll even take Jeff's insignificant quarter-billion. Beats the $92 million at MegaMillions...

Apr. 14th, 2009 11:58 pm (UTC)
Words, meet wall.

Apr. 15th, 2009 12:11 am (UTC)
Well, since you've already repeatedly demonstrated you're not reality-based, I've mostly been writing for the sake of the lurkers.

Good of you to admit here that no amount of evidence can persuade you, though. Admitting you have a problem is often the first step, and it explains why you've leaned so often on bald assertion. For your trouble, you get this week's Palin-Cleese That's Not an Argument Award.

Apr. 25th, 2009 02:28 pm (UTC)
What you have failed to demonstrate is that Amazon's stock movement is entirely (or even largely) related to "#amazonfail" (incidentally, if someone could explain when to #use @which &random *punctuation !character I'd be $most ?grateful) and not due to anything else. All you have is evidence of correlation, not of causation.

Apr. 25th, 2009 06:34 pm (UTC)
"What you have failed to demonstrate is that Amazon's stock movement is entirely (or even largely) related to "#amazonfail"..."

At the time, it seemed the best explanation for the facts in hand. Now, not so much. I was wrong.

"(incidentally, if someone could explain when to #use @which &random *punctuation !character I'd be $most ?grateful)"

The same way all spelling is determined -- usage.

In this instance, the most frequent use of the term was on Twitter, where a leading hash mark combined with a string-without-a-space has a specific use for searching purposes. This usage spilled into other media.

One would almost think English was a living, adaptive language, or something. Either that, or Chaucer (or even Dickens) would be much more readily readable to a contemporary audience.