Drive Crash 4: Now That's What I Call rsync

From YSTV History Wiki

Attenborough disaster

 - Patching the AV Rack
 - Tom hears a beeping
 - OhShit.jpeg
 - Dead OS drive
 - No problem, we'll bung another drive in
 - After sorting through all the other dead OS drives, we found a healthy (we think) 500GB drive
 - Tried rebuilding onto the new drive
 - RAID status optimal
 - Reboot
 - RAID status degraded
 - Old drive passing SMART tests
 - Bung old drive into server
 - RAID card crashes
 - And again
 - Optimal
 - Degraded
 - Katherine Bell and Hui-Ling Phillips arrive with biscuits
 - RAID card crashes
 - Try rebuild again

Revision as of 05:38, 17 June 2017

At the end of the 2016/2017 academic year, during weeks 9 and 10, Sam Willcocks, Tom Lee and Matthew Stratford decided it would be a good idea to tear everything out of the AV and Computing racks (see The Great Tech Redo 2017). Unfortunately, some of the servers didn't like being turned off and moved around, and decided to fail.

The first server to complain was Backup, which reported a degraded array. This was caused by a High Impedance Air Gap between the hard drive power supply and the drive.

Shortly after the drive was plugged back into Backup, Web started to complain of a drive reporting SMART errors. This drive was replaced and Backup, not to be outdone by Web, decided to destroy one of its drives. This drive was also replaced and all was well.

The Attenborough Disaster

Matt and Tim Bradgate were happily patching Cat5 when Tom (who was patching SDI in the AV rack at the time) noticed a suspiciously familiar beeping noise.

The general reaction was "oh crap, not again".

Edwin Barnes was asked to stop editing so we could shut down Attenborough, and the dead drive was identified as one of the OS disks. The disk was replaced and the array began to rebuild, so Tim turned his attention to determining why the drive had failed, while Matt and Tom went back to patching the AV rack.

Tim determined that the drive was healthy, which was slightly concerning, but more concerning was the beeping that started to come from Attenborough.

At this point #Computing moved up a level of emergency.

4 Attenborough Dies.png

The rebuild onto the replacement drive failed, so the team decided to give the old "failed" drive a try. This started to rebuild fine, so everyone went to the pub.

One Courtyard meal later, the OS drive claimed to be rebuilt, but just to be sure, Attenborough was rebooted into the RAID BIOS. Much to the disappointment of all present, Attenborough promptly started to beep and reported the RAID array to be degraded. Just in case something other than the drive had failed, Attenborough was set to rebuilding his RAID array again. This failed.

At this point there was only one thing to do: call Sam Willcocks. Sam suggested installing Ubuntu on another, non-RAID drive and dumping all of the data on Attenborough onto Backup.

The new Ubuntu-based temporary Attenborough was given the hostname "TomScott" after YSTV alumnus Tom Scott, who is in part known for bodging together

Hui-Ling Phillips and Katherine Bell had arrived bearing the gift of biscuits, and everyone waited for data to start pouring onto Backup through the magic of rsync, with the Pirates of the Caribbean soundtrack playing in the background to match the atmosphere and keep morale high.

Pizza was then ordered.



Attenborough disaster

 - Phone call to Sam Willcocks, who is in Sheffield after a job interview in London
 - Discussions about transferring data to Bruce
 - Decision is made to transfer data off of Attenborough to Backup
 - Pirates of the Caribbean soundtrack used to complement the atmosphere (and keep spirits high)
 - Pizza ordered
 - Debian installed on another 500GB drive
 - Given the host name "TomScott" due to the new OS disk being a temporary bodge
 - ZFS pool mounted
 - The rsyncing begins
 - We rsync the most recent (and most important) productions to Backup
 - Edwin Barnes starts transferring footage from Pending Edits backup to Edit 2 to edit
 - All seems well
 - Tim Bradgate starts working on cron jobs to automate backups
 - Tom starts working on a media cache PC/Ingest station
 - Matt continues working on patching/routing
 - Backup starts reporting SMART errors
 - Goddammit
 - Scramble to retrieve data from Backup onto Edwin's SSD and one of YSTV's SSDs
 - Tom takes a break to have a shower while he has the opportunity
 - Shortly followed by Katherine and Hui-Ling
 - Meanwhile Matt and Tim pull Backup out of the rack
 - Tom returns and Tim goes for a shower
 - The 2TB drive in Obriain is sacrificed to Backup to rebuild the array
 - Matt and Tom take the opportunity during the rebuild to continue the ongoing attempt to tidy up the studio
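
For anyone repeating the "most recent productions first" rsync step from the list above, here is a minimal Python sketch of how the transfer order could be planned. It is not what was actually run on the night: the function names, the one-directory-per-production layout and the destination string are all assumptions made for illustration. It only builds the rsync commands and does not execute them.

```python
from pathlib import Path
from typing import List


def newest_first(productions: List[Path]) -> List[Path]:
    """Order production directories by modification time, newest first."""
    return sorted(productions, key=lambda p: p.stat().st_mtime, reverse=True)


def rsync_command(source: Path, dest: str) -> List[str]:
    """Build (but do not run) an rsync invocation for one production.

    -a preserves permissions and timestamps; --partial keeps partly
    transferred files so an interrupted copy can pick up where it left off.
    """
    return ["rsync", "-a", "--partial", str(source), dest]


def plan_transfer(root: Path, dest: str) -> List[List[str]]:
    """Return one rsync command per production directory, newest first.

    Assumes (hypothetically) that each production lives in its own
    directory directly under `root`.
    """
    productions = [p for p in root.iterdir() if p.is_dir()]
    return [rsync_command(p, dest) for p in newest_first(productions)]
```

Each planned command could then be handed to `subprocess.run(cmd, check=True)` in order, so that if the source dies partway through, the newest footage is already safe.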

Attendees

 - Katherine - Nervous and there
 - Hui-Ling - Cable Monkey
 - Tom - Chief Bodger
 - Tim - Linux Wrangler
 - Matt - Cat5 Patcher
 - Edwin - Stressed Out Editor
 - Sam - Remote Tech Support