Archive for January, 2008

Server Atlantis Network Issue

Tuesday, January 8th, 2008

Server Atlantis has stopped passing traffic in or out of its internet-facing NIC.

We have tried disabling and re-enabling the NIC with no success. We are currently performing a reboot.

If this does not work, we will move external traffic to the internal NIC.

Update 17:40 - The server is refusing to reboot; it seems to be hanging on shutdown. A request has been submitted for a manual system reboot, which should be completed within 10 minutes.

Update 18:00 - The hard reboot fixed the NIC issue. There are no errors in our logs, or anything else that would explain why the NIC stopped passing traffic inbound or outbound. We will monitor the system, and if it looks like the issue may occur again we will replace the network card. It is possible that Windows itself caused the issue, but we have never seen anything like this before.

All email was held at our secondary mail server and delivered to all inboxes.

Update 18:54 - This issue has occurred again; we are working to fix the problem.

Update 19:15 - An engineer is en route to replace the NIC and should be on site within 15 minutes.

Update 19:40 - The engineer has arrived at the Kent DC. The server will show as online for the next couple of minutes while network settings are removed from the old NIC. Once this is complete we can shut down the server, replace the old NIC, then boot it up. The NIC settings can then be set, and one final reboot should bring everything 100% live.

Update 20:35 - The server is now back online. We will be monitoring it very closely.

We apologise for the outage, but we hope users understand that this is a very unusual issue.

Network Maintenance

Monday, January 7th, 2008

Following extensive investigation of our network provider’s recent network glitches, it is understood that they need to upgrade the software on their router “rt1.the” (TeleHouse East). This needs to be performed urgently, as there are various network conditions that could cause this router to suffer repeat issues.

Performing such upgrades on a live router would be asking for trouble, so they will be migrating all core services to ‘rt2.the’ on Tuesday morning between 00:30 and 01:30. This is a much newer router and already runs the latest software installation. In particular, this installation irons out various stability bugs with regard to coping with physical link failures.

During this time customers may notice up to 10 seconds of packet loss per IP subnet, on a subnet-by-subnet basis, as BGP routes converge. Depending on your location and routes into the network, you may not notice anything at all.

Update 01:40 - This work was completed successfully.

Network Outage

Monday, January 7th, 2008

Between 03:38 and 04:15 today we experienced a network outage which affected all our servers.

This caused packet loss between some UK ISPs and EU / US providers.

We are currently awaiting a full RFO (reason for outage) from our network provider.

Server Atlantis Outage

Tuesday, January 1st, 2008

Server Atlantis is currently experiencing an outage. We are looking into the problem.

More information will be posted shortly.

Update 03:30 - Service was restored to this server at 03:28.

The outage started at 03:01 and ended at 03:28.

We were informed of the issue by our external monitoring at 03:12. Unfortunately, due to the new year, the message was delayed by 5 minutes because of mobile network congestion.

An RFO will be provided later this morning, when our investigation is complete.
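
For anyone wanting to check the timeline, the times quoted in this post can be worked through as below. This is only a rough sketch: the helper name `minutes_between` and the variable names are ours, purely for illustration, and it assumes all three times fall on the same day.

```python
from datetime import datetime

FMT = "%H:%M"


def minutes_between(start: str, end: str) -> int:
    """Whole minutes from start to end; both are 'HH:MM' on the same day."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds() // 60)


# Times quoted in the post (hypothetical variable names).
outage_start = "03:01"
alert_received = "03:12"
service_restored = "03:28"

# Total length of the outage.
outage_minutes = minutes_between(outage_start, service_restored)
# How long the outage had been running when the alert reached us.
minutes_to_alert = minutes_between(outage_start, alert_received)
# How long it took to restore service once we had the alert.
minutes_alert_to_fix = minutes_between(alert_received, service_restored)

print(f"Outage length: {outage_minutes} min")
print(f"Start to alert received: {minutes_to_alert} min")
print(f"Alert received to restore: {minutes_alert_to_fix} min")
```

On these figures the outage lasted 27 minutes, the alert reached us 11 minutes in, and service was restored 16 minutes after that.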