Unexpected downtime

Teemu Mannermaa
Project administrator
Project developer
Project tester

Joined: 20 Apr 11
Posts: 360
Credit: 758,716,620
RAC: 121,005
Message 875 - Posted: 5 Aug 2011, 10:33:13 UTC

It seems we had our first unexpected downtime yesterday. :( We've had some offline times previously but most of them have been for a shorter period of time and on purpose due to maintenance and updates.

Access to our services was failing on Thursday from about 4:00 to 20:30, when I finally brought things back online. Those times are in my local timezone, EEST (UTC+3), which means from 1:00 to 17:30 UTC, or from Wed 18:00 to Thu 10:30 PDT. That is about sixteen and a half hours of lost time.
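
For anyone who wants to double-check the conversion, here is a minimal sketch using only the Python standard library. The calendar date (4 Aug 2011) is an assumption inferred from this post's timestamp, and the names `OUTAGE_START`, `OUTAGE_END` and `window_in` are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets that applied in August 2011 (both zones on summer time).
EEST = timezone(timedelta(hours=3), "EEST")
PDT = timezone(timedelta(hours=-7), "PDT")

# Outage window in local EEST time. The date is an assumption:
# the Thursday before this post was 4 Aug 2011.
OUTAGE_START = datetime(2011, 8, 4, 4, 0, tzinfo=EEST)
OUTAGE_END = datetime(2011, 8, 4, 20, 30, tzinfo=EEST)


def window_in(tz):
    """Return the outage start and end formatted in the given timezone."""
    fmt = "%a %H:%M"
    return (OUTAGE_START.astimezone(tz).strftime(fmt),
            OUTAGE_END.astimezone(tz).strftime(fmt))


if __name__ == "__main__":
    print("UTC:", window_in(timezone.utc))
    print("PDT:", window_in(PDT))
    print("Duration:", OUTAGE_END - OUTAGE_START)
```

Run as a script, this should agree with the figures quoted above: the same window in UTC and PDT, and a duration of 16 hours 30 minutes.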

Looking through the logs, this seems to have been caused by the server running out of memory and subsequently OOM-killing itself to death. I have a few things I can do to prevent the same problem from bringing us down in the future (like moving the DB to a different server, as the OOM-killer chose the poor DB to die in the first round).
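
As a rough illustration of why the DB was first in line: on Linux, each process exposes the kernel's current "badness" score in `/proc/<pid>/oom_score`, and the OOM-killer roughly prefers the highest one. The sketch below lists the top candidates. It is not what runs on our server, and the function names `read_oom_scores` and `likely_victims` are hypothetical.

```python
from pathlib import Path


def read_oom_scores(proc_root="/proc"):
    """Return a list of (oom_score, pid, name) for every readable process."""
    root = Path(proc_root)
    if not root.is_dir():
        return []  # not a Linux-style /proc, nothing to report
    results = []
    for entry in root.iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            score = int((entry / "oom_score").read_text().strip())
            name = (entry / "comm").read_text().strip()
        except (OSError, ValueError):
            continue  # process exited or is not readable by this user
        results.append((score, int(entry.name), name))
    return results


def likely_victims(scores, count=5):
    """Return the entries with the highest oom_score, highest first."""
    return sorted(scores, key=lambda item: item[0], reverse=True)[:count]


if __name__ == "__main__":
    for score, pid, name in likely_victims(read_oom_scores()):
        print(f"{score:>6}  {pid:>7}  {name}")
```

A memory-hungry database typically sits near the top of such a list, which fits with it being chosen first.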

I did notice the problems in the morning but due to unrelated complications (non-project ones) I didn't manage to get the server back online until that evening. I do apologize for this extending our downtime. :(

Everything should be back to normal now, but do let me know if there are still problems around. Thanks, and now let's crunch hard to make up for the lost time! ;)
Zydor
Project donor
Joined: 5 May 11
Posts: 233
Credit: 351,414,150
RAC: 0
Message 883 - Posted: 5 Aug 2011, 14:09:57 UTC - in response to Message 875.  

The cache took the strain :) Otherwise no dramas at this end, it's kinda what the cache is for anyway.

First significant downtime, and it wasn't really an issue thanks to the cache, so the silence on the boards is not remarkable. You could even say it was a testament of trust that it would get sorted, as any downtime is so rare.

Cue flood of posts yelling "not for me it wasn't" rofl ...... Murphy and I can be close acquaintances at times :)

Personally I'd say take a bow, not a slap round the head :)

Regards
Zy
Fire$torm [BlackOps]
Joined: 2 May 11
Posts: 4
Credit: 855,435,132
RAC: 308,313
Message 884 - Posted: 5 Aug 2011, 15:23:32 UTC - in response to Message 883.  

+++1

And thx for an honest report on the why. Good job.


Mad Matt
Joined: 7 May 11
Posts: 4
Credit: 61,768,083
RAC: 0
Message 892 - Posted: 7 Aug 2011, 13:28:22 UTC - in response to Message 884.  

+1 ... :) Good job, Teemu.

Copyright © 2011-2017 Moo! Wrapper Project