This morning we experienced a failure cascade resulting in a loss of network connectivity to your server(s). One of our core routers suffered a hardware failure, and the automatic switchover to the spare did not operate properly. Normal service was restored at 08:39.
We now believe that the Sun hardware we had been using is unreliable, and we will be replacing the remaining units as soon as practicable. Additionally, we are undertaking a thorough review of our automated fail-over systems to be 100% certain that they will operate properly in future.
I cannot sufficiently apologise for this downtime. Obviously any failure of this sort is unacceptable, and my team and I will be working hard over the coming weeks not just to fix the issues seen today, but to undertake a comprehensive testing and investigation programme to ensure that any other potential weaknesses are found before they cause a problem.