On the Gmail problem today (somewhat techie, depending on your background :))

Based on the post on the Official Gmail Blog about today’s Gmail issue, it appears that Google had a classic case of cascading server failure. Lots of companies with high web traffic volumes run many servers doing essentially the same thing in parallel. For example, rather than have 1 server trying to handle 100% of the traffic, they will have 5 servers each handling 20% of the traffic. You do this for a number of reasons, one of the key ones being that 1 server can’t handle 100% of the traffic. However, 5 can. The question is, can 4 handle it? Or 3? Or 2? IT people, like myself, ask themselves this, and try to manage things so that even if 1 or more servers go down, the remaining servers can handle the load. (For example, I once read of a stock exchange that ran its servers at no more than 20% capacity, just to handle variable load or failure.)
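For anyone who wants to see the "can 4 handle it? Or 3?" arithmetic spelled out, here is a rough sketch in Python. It is only an illustration, not anything Google actually runs. The function names are made up for this post, and it assumes traffic is split evenly across whatever servers are still up, with load measured in units of "one server's full capacity".

```python
def load_per_server(total_load, servers_up):
    """Load each running server carries, assuming an even split (an assumption)."""
    if servers_up <= 0:
        raise ValueError("no servers left to carry the load")
    return total_load / servers_up


def survivable_failures(total_load, n_servers, capacity=1.0):
    """How many servers can go down before the rest are pushed past capacity.

    Returns -1 if even the full set of servers is already overloaded.
    """
    survivable = -1
    for failed in range(n_servers):
        if load_per_server(total_load, n_servers - failed) <= capacity:
            survivable = failed
        else:
            break
    return survivable


# 5 servers each running at 60% (total load = 3.0 servers' worth)
print(survivable_failures(3.0, 5))  # 2

# 5 servers each running at 20%, like the stock exchange example
print(survivable_failures(1.0, 5))  # 4
```

So at 60% busy you can lose 2 of your 5 servers, while at 20% busy a single server could carry everything on its own.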

The cascading server failure problem occurs when failing servers and too much load hit at the same time. If you have 5 servers that are all very busy and one goes down, its traffic shifts to the other servers, which can get too busy and also go down. Eventually they can all fall like dominoes. And getting them back up is difficult, because no sooner do you get one up than it can get swamped with traffic and go down again. Not much fun for the IT people, or the poor frustrated users.
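The domino effect can be sketched as a toy simulation too. Again, this is a simplified illustration and not a model of what happened at Google. The function name is hypothetical, traffic is assumed to be split evenly, and overloaded servers are assumed to drop one at a time (in real life several might go at once).

```python
def simulate_cascade(total_load, n_servers, capacity=1.0, initial_failures=1):
    """Return how many servers are up after each step of a toy cascade.

    Assumptions: load is spread evenly over the servers still running, and
    whenever the per-server load exceeds capacity, one more server drops.
    """
    up = max(n_servers - initial_failures, 0)
    history = [n_servers, up]
    while up > 0 and total_load / up > capacity:
        up -= 1  # another overloaded server falls over
        history.append(up)
    return history


# 5 servers at 90% busy each (total load 4.5): losing one takes down all
print(simulate_cascade(4.5, 5))  # [5, 4, 3, 2, 1, 0]

# 5 servers at 60% busy each (total load 3.0): losing one is absorbed
print(simulate_cascade(3.0, 5))  # [5, 4]
```

The first case also shows why recovery is hard: bring a single server back while the full 4.5 servers' worth of traffic is still arriving, and it is immediately far over capacity.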

3 responses to “On the Gmail problem today (somewhat techie, depending on your background :))”

  1. > “(very techie)”

    ???

    I expected some tech info about how balanced routers handle load and how increasing load can be managed. Well, nice try …

    What about congestion solutions/controls for routers?

  2. Bruce in New Zealand

    Yes, I would say the article is “very technical” – but only for those accustomed to reading press releases ;0

    • smartpeopleiknow

      I replied to Markus about this earlier, but since I am getting a number of comments about this now, I should comment generally.

      Yes, for IT people, this is not very technical at all. However, for most of the people who would read this blog, people who are not IT people, it would be “very techie” (and also somewhat boring, I suspect).

      I think I am going to have to modify the title so IT people can save their time and look elsewhere.