Why hasn't Amazon fixed things overnight?

Well, if you have read Amazon CTO Werner Vogels' blog or seen any of his recent presentations, you'll definitely be (like me) inclined to "cock-up over conspiracy" as the explanation for the current shambles.

So why have things gone this desperately wrong this quickly?

The simple answer is Amazon's architecture. It's highly distributed, and there's no operations team. Each component (and over 200 go into a single page) is run by its own development team of four to five people. They are responsible for its features, its development - and for making sure it runs effectively. The result should be a company that can move quickly in response to outside events.

At least that's the theory.

I'm afraid the real world doesn't work like that. I've been a developer and I've managed developers and I can tell you that what really happens is something like this:

Someone comes up with a neat idea that they evangelise among the other developers, and it gets added to the platform. The developers become wedded to their idea, and they keep adding features. Something from the outside occurs that affects the data managed by the service, and they don't notice. After all, it's their design and it's perfect. The problem gets worse, and a few external symptoms are noted and passed on to the developers. They're too busy to pay much attention to them, and so they ignore them. Then suddenly, BANG, and everything breaks.

Oh, and it's a holiday weekend and there's no one there to actually handle the problem as the whole team's gone off on a skiing trip.

Now I can't guarantee that's what has happened with the deletion of GLBT content from the Amazon ratings system, but I suspect it's more likely than not.

So here's where my conjecture comes in:

Someone probably had the idea of reducing Amazon's exposure to bad publicity without increasing the site's legal liability. Manual censorship of the rankings would certainly make the service more liable, so the idea was probably a tool that would let the site's users do the work for it. After all, if the community doesn't like it, then, well, US community standards laws apply and you're safe. A group of developers coded it up, and it worked well - for a while.

Either a parameter wasn't quite right, or someone released a new version of a keyword file without testing - and, well, suddenly the GLBT books were off the list. Maybe someone gamed the system, too - it's impossible to tell from outside.

A separate test and operations team would probably have spotted the underlying flaw before it got released - or at least spotted the first wave of complaints and started to triage them effectively, with a more productive response than "It's a glitch".

So now Amazon has to unwind data that's spread across its distributed application platform, which may be stored in any or all of three different kinds of database, and in at least three different geographies and many more data centres.

Ooops.

That's going to take a while to deal with.

Meanwhile their Seattle-based PR team is just about to start a very long day - and a group of developers are going to be desperately trying to explain just what went wrong.

[ETA 23/4/2012. After three years of this post being targeted heavily by spammers, I have locked commenting.]

Comments

debgeisler
Apr. 13th, 2009 01:40 pm (UTC)
All of this is going to make a fascinating set of case studies in PR crisis management.
sbisson
Apr. 13th, 2009 01:48 pm (UTC)
Oh indeed.
therealdrhyde
Apr. 13th, 2009 02:38 pm (UTC)
nah, it's a storm in a teacup that everyone will have forgotten about by next week
hal_obrien
Apr. 13th, 2009 03:16 pm (UTC)
"a storm in a teacup that everyone will have forgotten about by next week"

Except for the writers whose income dramatically falls. And who, these days, have their blogs to let their readers know why.

How many of those writers are "Associates"? How many of their readers are "Associates"? (You're aware of this program, yes?)

AMZN stock is down today. The shareholder lawsuit should be fun.

Or, the other fun possibility: I've seen a lot of people say they're going through their archives, and re-writing the links to Amazon they've made for years when discussing books to point somewhere else. (B&N, Powell's, whatever.)

So, yes, people might "forget"... in the sense of, "Amazon? Who are they? Didn't they used to be somebody?"
qviri
Apr. 13th, 2009 10:50 pm (UTC)
Drive-by comment. Please. AMZN is down 1% on market close, and the worst it was during the day was 2.5% down. On NASDAQ, especially In This Economy (tm), this is precisely nothing. GM stock loses more when it rains in Detroit.
hal_obrien
Apr. 13th, 2009 11:18 pm (UTC)
Here's the graph showing AMZN underperforming all indices today -- the S&P, the Dow 30, and the NASDAQ -- on no other corporate news.

This is only the first day.

"GM stock loses more when it rains in Detroit."

And if Johnny were to jump off the Empire State Building, would you jump off the Empire State Building? If we're going to talk about This Economy™, then the slogan, "Amazon: Doesn't suck quite as badly as GM," isn't exactly a phrase to inspire confidence.

Those "Amazon.org" jokes may need to be dusted off yet.
hal_obrien
Apr. 13th, 2009 11:22 pm (UTC)
If it comes to that...

When you accuse me of a, "Drive-by comment," all I can say is, Better that than a drive-by LJ.

*^*^*

qviri's Journal
Created on 2003-09-12 19:30:43 (#1321750), last updated 2006-10-01
15 comments received, 352 comments posted
Basic Account [Gift]
2 Journal Entries, 0 Tags, 0 Memories, 0 Virtual Gifts, 0 Userpics


*^*^*

If that 15 comments received includes the one just posted, then counting this one I've made 2 of 16 to you. If not, then 2 of 17.

Either way: Pot. Kettle. Color blindness.
hal_obrien
Apr. 13th, 2009 11:31 pm (UTC)
Here we see that Jeffrey P Bezos holds 97,167,078 shares of AMZN. AMZN was down 0.83 today.

So Mr. Bezos lost roughly $80.6 million today.

I suspect he's not the only shareholder to have comparable losses. IANAL, but sounds actionable to me as the result of somebody's solitary fuck-up.
qviri
Apr. 14th, 2009 12:38 am (UTC)
Okay, sorry, just to explain, I meant mine was the drive-by comment -- as it is my only comment on this issue, and as you've so aptly discovered, I use the account overwhelmingly for commenting.

I stand by my statement - a change of 1% for the day either way is not significant. If it keeps on going 1% down per day for a week, then perhaps, but there is so much noise in the data that a single graph for a single day with a single-digit change cannot possibly tell a meaningful story. (Not to mention 1% down per day for a week probably wouldn't happen - when there's cause for concern, crashes happen faster, when there isn't, stuff fluctuates.)

While I feel for the money lost, that's the beauty of the stock market, and Mr. Bezos knows it full well.

If the fault can be pinpointed, will someone get fired? Probably, though it's more likely that a systemic software development/architecture problem at Amazon underlies this. Is there going to be a lawsuit over 1%? Not unless there's some poor litigious leech^W lawyers on the prowl.
debgeisler
Apr. 13th, 2009 05:38 pm (UTC)
It is certainly tempestuous. The search term #amazonfail yields 12,000 hits on the web search at Google...and 360 (including the Wall Street Journal) on the news search. Searching on amazon gay ranking glitch yields 237,000 hits.

I do think that the likelihood of people forgetting it by next week is going to be a real measure of PR success.
dampscribbler
Apr. 13th, 2009 06:53 pm (UTC)
As a long-time customer of Amazon.com who gets, on average, two deliveries a week from them, I guarantee you that *I* will not forget about this in a week. Or a month. Or maybe not even a year. Something has gone horribly awry for some reason, I suspect a combination of stupidity and damnfoolishness, and I am not comfortable with my relationship with this company. Even if it is "fixed overnight" (still waiting), the way I shop and utilize Amazon.com has changed and will remain changed -- more or less depending on how they deal with this in the coming days.
palmer_kun
Apr. 13th, 2009 07:13 pm (UTC)
People still have their panties in a twist over LJ drama like Strikethrough, long after it's been resolved.

This issue extends far outside the blogosphere, and is not going anywhere.
ksol1460
Apr. 13th, 2009 08:39 pm (UTC)
I've got two relatives who authored books and music that are sold on Amazon, I've been an Associate for eleven years and I'm sure as hell not going to forget.
estrelladesax
Apr. 13th, 2009 05:25 pm (UTC)
Yes! And I will get to study them! My term paper will be titled #amazonfail. I was having major trouble picking a topic...
(Anonymous)
Apr. 13th, 2009 08:22 pm (UTC)
#amazonfail
Would you think about sharing that term paper with those of us who have newsletters & blogs, so we can cite it? Attribution, credit, etc etc would all be yours.