The 24-hour shutdown of Skype last week was caused by heavy traffic combined with a bug in version 5.0.0.152 of Skype for Windows. Other versions of the Skype client for Windows, as well as Skype for Mac, Skype for iPhone, Skype on your TV, and Skype Connect or Skype Manager for enterprises, were not affected by this initial problem.
In a blog post today, Skype’s CIO Lars Rabbe explains what happened:
On Wednesday, December 22, a cluster of support servers responsible for offline instant messaging became overloaded. As a result of this overload, some Skype clients received delayed responses from the overloaded servers. In a version of the Skype for Windows client (version 5.0.0.152), the delayed responses from the overloaded servers were not properly processed, causing Windows clients running the affected version to crash.
Around 50% of all Skype users globally were running the affected 5.0.0.152 version of Skype for Windows, and the crashes caused approximately 40% of those clients to fail. These clients included 25–30% of the publicly available supernodes, which also failed as a result of this problem.
Supernodes have a built-in mechanism to protect themselves and to avoid adverse impact on the systems hosting them when operational parameters do not fall into expected ranges. We believe that increased load in supernode traffic led to some of these parameters exceeding normal limits, and as a result, more supernodes started to shut down. This further increased the load on the remaining supernodes and caused a positive feedback loop, which led to the near-complete failure that occurred a few hours after the triggering event.
Regrettably, as a result of the confluence of events – server overload, a bug in Skype for Windows clients (version 5.0.0.152), and the decline in available supernodes – Skype’s functionality became unavailable to many of our users for approximately 24 hours.
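The feedback loop Rabbe describes can be illustrated with a toy model. The sketch below is not Skype's actual supernode logic: the function name, the 100-node network, the even load sharing, and the per-node limits are all hypothetical values chosen for illustration. It only shows how a fixed load spread over fewer and fewer nodes can push each survivor past its protective limit.

```python
def simulate_cascade(live_limits, total_load):
    """Return the count of live supernodes after each round of the cascade.

    live_limits: load limit of every supernode still running (hypothetical units).
    total_load:  traffic shared evenly among the live supernodes (an assumption).
    """
    live = list(live_limits)
    history = [len(live)]
    while live:
        per_node = total_load / len(live)
        # A node protects itself by shutting down once its share exceeds its limit.
        survivors = [limit for limit in live if limit >= per_node]
        if len(survivors) == len(live):
            break  # every node is within its limit: the network is stable
        live = survivors
        history.append(len(live))
    return history


# Hypothetical network: 100 supernodes carrying a load of 1.0 each,
# with protective limits spread between 1.2x and 1.6x of that baseline.
limits = [1.2 + 0.4 * i / 99 for i in range(100)]

# 10% of supernodes crash: the remaining 90 absorb the extra load.
mild = simulate_cascade([lim for i, lim in enumerate(limits) if i % 10 >= 1], 100.0)

# 30% crash (the upper end of the reported 25-30%): the loop runs away.
severe = simulate_cascade([lim for i, lim in enumerate(limits) if i % 10 >= 3], 100.0)

print(mild)    # [90]
print(severe)  # [70, 31, 0]
```

In this model a 10% loss is absorbed, while a 30% loss drives the load per node above the limit of more than half of the survivors, and the next round takes out the rest.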
This catastrophic failure has led Skype to examine how it automatically updates client software, as well as how it can react to failures better and restore the system more quickly.
Users can help by ensuring they have installed the latest version of the client. An updated version of the software was already available, but most users had not installed it, thus setting the scene for this Christmas disaster.
From: The Big Blog