Archive for the ‘Emergency Work’ Category

Server SGC Emergency Reboot

Thursday, January 22nd, 2009

We need to apply a hotfix to IIS on sgc.gbdns.net to try and fix a major issue with websites failing to load correctly.

To complete the hotfix installation we need to reboot the server. This will take 3-5 minutes to complete.

We are sorry for the short notice, but this work is essential for all users' websites to function correctly.

Update 20:35 - Reboot complete. We are waiting to see if this fixes the problem and should know within the next 15-25 minutes.

Server SGC Issues

Friday, July 4th, 2008

At 5:10am this morning Server SGC's IIS services failed. We attempted to restart the services but this did not work.

While investigating this issue other services began to fail.

A reboot has been performed which should solve the issue. This server has been up for over 125 days without a reboot and it looks like this may be the cause.

Update 05:36am - The server rebooted normally and service was restored at 05:34am. We will be performing a second reboot at 05:40am to double check everything is working properly.

Update 05:42am - Second reboot is now complete and everything looks to be normal. We will be keeping a close eye on this box over the next 24-48 hours, but we do believe the problem was down to the amount of time the box had been up without a reboot.

Emergency Network Maintenance

Sunday, December 30th, 2007

Our network provider has identified a potential problem with one of their core distribution switches located at the Kent Science Park Data Center. The problem could potentially cause network instability at higher traffic levels. They have been advised by their hardware vendor that this can be resolved by booting into an alternative firmware as a permanent fix.

Due to the critical nature of this issue, the work will take place tonight between 23:45hrs and 00:15hrs. The new firmware has already been copied to the switch, which simply needs a restart in order to boot from it. During this time our servers in Kent will experience a connectivity drop of up to 30 seconds while the new firmware loads. Our network provider has tested the new firmware in a lab environment; however, the old firmware will be left on the switch so that they can revert to it should anything go wrong.

Update 31/12/2007 01:00 - This work was completed without issue.

Server SGC - Emergency Reboot

Sunday, November 18th, 2007

Server SGC is currently suffering IIS problems. We believe the only way to solve this is a full reboot, which is being carried out now.

Telehouse East - Emergency Work

Monday, July 23rd, 2007

Our interconnection provider between Telehouse East and Meridian Gate will be performing emergency maintenance between 01:00hrs and 03:00hrs today, once traffic levels reach their nightly low. The work should take no more than 10 minutes to complete and is required to ensure stability of the network.

Unfortunately, the mail server in our Kent facility will lose connectivity for the duration of this work. The secondary mail server will pick up any incoming email, but users will not be able to access POP3 or IMAP.

We must apologise for the short notice and any inconvenience this essential work may cause.
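For readers wondering how incoming email reaches the secondary mail server while the primary is unreachable, here is a minimal sketch of the usual MX preference rule: sending servers try the lowest preference number first and fall back to the next one. The host names and preference values are hypothetical and are not our actual DNS records.

```python
# Hypothetical MX records as (preference, host) pairs; these are
# illustrative values only, not real DNS data.
MX_RECORDS = [
    (20, "backup-mail.example.net"),
    (10, "mail.example.net"),
]


def delivery_order(mx_records):
    """Return hosts in the order a sender tries them (lowest preference first)."""
    return [host for _, host in sorted(mx_records)]


def pick_mail_server(mx_records, is_reachable):
    """Return the first reachable host in delivery order, or None if none respond."""
    for host in delivery_order(mx_records):
        if is_reachable(host):
            return host
    return None


if __name__ == "__main__":
    # Simulate the primary being offline during the maintenance window.
    primary_down = lambda host: host != "mail.example.net"
    print(pick_mail_server(MX_RECORDS, primary_down))
```

If no server responds at all, a sending server normally queues the message and retries later, which is why `None` here does not mean the email is lost.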

Emergency Maintenance

Friday, June 22nd, 2007

EMERGENCY MAINTENANCE WINDOW ANNOUNCEMENT

WINDOW: Friday June 22nd 22:00 - 23:59 GMT
ESTIMATED DOWNTIME: 10 MINUTES
NETWORK IMPACT: PARTIAL
DESCRIPTION:

Our provider will be swapping out a switch in lon1 (iP House), which they believe may be the cause of the VRRP issues they are currently seeing, and replacing it with a new Cisco 2960G series switch. This will mean that all devices are standard.

All customers will lose access to the network for a couple of minutes as this switch is swapped out.

We apologise for the short notice on this maintenance period; however, we believe it is necessary to perform this work for the ongoing health of the network.

Server SGC RAID Issue

Friday, June 8th, 2007

Over the past 2-3 days we have noticed an error appearing from the RAID controller on Server SGC.

From further investigation this looks to be either a faulty cable or a failing hard drive.

We will be replacing the SATA cable this evening. This work will be done A.S.A.P. due to the possibility of data corruption.

There will be a 10-minute window during which all websites on Server SGC will be unavailable. Mail and websites hosted on Server Atlantis will be unaffected.

If the replacement cable does not solve the problem, the hard drive will be disabled and then replaced on Monday evening.

We apologise for the short notice for this maintenance, but we believe a proactive response is required to maintain data integrity.

Server SGC Issue

Tuesday, May 29th, 2007

It appears server SGC is having issues and has rebooted itself with a BSOD error.

We are working on the issue at the moment.

An update will be posted shortly.

Emergency Maintenance Notification

Tuesday, April 17th, 2007

WINDOW: Thursday 19th April 22:00 - 23:59 GMT
ESTIMATED DOWNTIME: 5 MINUTES
NETWORK IMPACT: PARTIAL
DESCRIPTION:

JTAC (Juniper Technical Assistance Center) have informed our network provider (Goscomb) that a new software release will most likely solve their current VRRP-related issues. As such, they are scheduling the above emergency maintenance period to upgrade each of their 2 core routers (one in IPH and one in THE).

The routers will be upgraded one at a time, which will result in some network instability (because VRRP is currently not working as expected). Each router upgrade is expected to last approximately 5 minutes at the most.

A snapshot of the current router OS and config will be taken prior to each upgrade and should any part of the process fail these snapshots will be restored.

We apologise for the short notice of this maintenance work.
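The snapshot-and-restore approach described in this notice follows a common rollback pattern. A minimal sketch of the idea is below; it models a router as a plain dictionary purely for illustration and does not represent the provider's actual tooling or any Juniper commands.

```python
import copy


def upgrade_with_rollback(router: dict, upgrade) -> bool:
    """Apply `upgrade` to `router`, restoring the snapshot if any step fails.

    `router` is a hypothetical stand-in for the router's OS and config.
    Returns True if the upgrade succeeded, False if it was rolled back.
    """
    # Take a full copy of the current state before touching anything.
    snapshot = copy.deepcopy(router)
    try:
        upgrade(router)
        return True
    except Exception:
        # Any failure: put the router back exactly as it was.
        router.clear()
        router.update(snapshot)
        return False
```

Upgrading one router at a time with this pattern means that at most one device is ever in an unknown state.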

Server SGC Issue

Monday, March 19th, 2007

Some users may have noticed a performance issue when accessing our control panel and some websites today.

It appears after we fixed AWStats last night we may have found a corruption / drive issue. Currently server SGC is in the stopped state and is not acting as part of the NLB cluster.

We need to perform a full check disk, which will involve shutting down the server and rebooting it into a pre-Windows state. A schedule is currently being created as we need to attach a KVM to the box.

When this is performed the following services will be affected:
MSSQL - Customers without MSSQL load balancing will not be able to access their databases
MySQL 4 - The primary MySQL 4 instance will be unreachable. Users are welcome to use the read-only slave in the meantime (change the server to Atlantis)
DNS - ns3.gbdns.net will not be contactable
Control Panel - This will be offline as it only runs on server SGC
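
For customers who would rather have their application switch to the read-only slave automatically, a minimal sketch of the selection logic is shown here. The host names are hypothetical placeholders, and the `is_up` check is left to the caller (for example, a short connection attempt); this is an illustration rather than a supported configuration.

```python
from typing import Callable

# Hypothetical host names used only for illustration; substitute the
# server addresses from your own account details.
PRIMARY = "sgc.example.net"
READ_ONLY_SLAVE = "atlantis.example.net"


def pick_mysql_host(needs_write: bool, is_up: Callable[[str], bool]) -> str:
    """Return the host a query should be sent to.

    Writes can only go to the primary. Reads prefer the primary but
    fall back to the read-only slave when the primary is unreachable.
    """
    if is_up(PRIMARY):
        return PRIMARY
    if not needs_write and is_up(READ_ONLY_SLAVE):
        return READ_ONLY_SLAVE
    raise ConnectionError("no MySQL host available for this query")
```

Note that any query that writes data will still fail while the primary is offline, so applications should handle the `ConnectionError` gracefully.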

Everything else will be unaffected and most users will not notice any downtime, but we will try to resolve this issue as soon as possible.