Drive Crash 4: Now That's What I Call rsync

From YSTV History Wiki
Revision as of 05:03, 17 June 2017 by Tom.lee (talk | contribs)

At the end of the 2016/2017 academic year, during weeks 9 and 10, Sam Willcocks, Tom Lee and Matthew Stratford decided it would be a good idea to tear everything out of the AV and Computing racks (see The Great Tech Redo 2017). Unfortunately, some of the servers didn't like being turned off and moved around, and decided to fail.

The first server to complain was Backup, which reported a degraded array. This was caused by a High Impedance Air Gap between the hard drive power supply and the drive.

Shortly after the drive was plugged back into Backup, Web started to complain of a drive reporting SMART errors. This drive was replaced and Backup, not to be outdone by Web, decided to destroy one of its drives. That drive was replaced too, and all was well.

The Attenborough Disaster

Matt and Tim Bradgate were happily patching Cat5 when Tom (who was patching SDI in the AV rack at the time) noticed a suspiciously familiar beeping noise.

The general reaction was "oh crap, not again".

Edwin Barnes was asked to stop editing so we could shut down Attenborough, and the dead drive was identified as one of the OS disks. The disk was replaced and the array began to rebuild, so Tim turned his attention to working out why the drive had failed while Matt and Tom went back to patching the AV rack.

Tim determined that the drive was healthy, which was slightly concerning, but more concerning was the beeping that started to come from Attenborough.

At this point #Computing moved up a level of emergency.


Attenborough disaster

 - Patching the AV Rack
 - Tom hears a beeping
 - OhShit.jpeg
 - Dead OS drive
 - No problem, we'll bung another drive in
 - After sorting through all the other dead OS drives, we found a healthy (we think) 500GB drive
 - Tried rebuilding onto the new drive
 - RAID status optimal
 - Reboot
 - RAID status degraded
 - Old drive passing SMART tests
 - Bung old drive into server
 - RAID card crashes
 - And again
 - Optimal
 - Degraded
 - Katherine Bell and Hui-Ling Phillips arrive with biscuits
 - RAID card crashes
 - Try rebuild again
 - Phone call to Sam Willcocks, who is in Sheffield after a job interview in London
 - Discussions about transferring data to Bruce
 - Decision is made to transfer data off of Attenborough to Backup
 - Pirates of the Caribbean soundtrack used to complement the atmosphere (and keep spirits high)
 - Pizza ordered
 - Debian installed on another 500GB drive
 - Given the host name "TomScott" due to the new OS disk being a temporary bodge
 - ZFS pool mounted
 - The rsyncing begins
 - We rsync the most recent (and most important) productions to Backup
 - Edwin Barnes starts transferring footage from Pending Edits backup to Edit 2 to edit
 - All seems well
 - Tim Bradgate starts working on cron jobs to auto backup
 - Tom starts working on a media cache PC/Ingest station
 - Matt continues working on patching/routing
 - Backup starts reporting SMART errors
 - Goddammit
 - Scramble to retrieve data from Backup onto Edwin's SSD and one of YSTV's SSDs
 - Tom takes a break to have a shower while he has the opportunity
 - Shortly followed by Katherine and Hui-Ling
 - Meanwhile Matt and Tim pull Backup out of the rack
 - Tom returns and Tim goes for a shower
 - The 2TB drive in Obriain is sacrificed to Backup to rebuild the array
 - Matt and Tom take the opportunity during the rebuild to continue the ongoing attempt to tidy up the studio
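
For anyone facing the same "rsync the most recent productions first" job, here is a rough sketch of how the selection could be scripted. It is not what was run on the night: the function names are made up for illustration, and ordering by directory modification time is an assumption about what "most recent" means. The only rsync options used are `-a` (archive mode) and `--partial` (keep partially transferred files so an interrupted copy can pick up again).

```python
from pathlib import Path


def newest_dirs(root, limit):
    """Return up to `limit` immediate subdirectories of `root`, newest first.

    "Newest" is judged by directory modification time, which is an
    assumption; a real rescue might order by production date instead.
    """
    dirs = [p for p in Path(root).iterdir() if p.is_dir()]
    dirs.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return dirs[:limit]


def rsync_command(src, dest):
    """Build the argument list for copying directory `src` into `dest`.

    The list can be handed to subprocess.run() unchanged. Because `src`
    has no trailing slash, rsync recreates the directory itself inside
    `dest` rather than spilling its contents there.
    """
    # -a preserves permissions and timestamps; --partial keeps half-copied
    # files so a transfer that dies part-way does not start from scratch.
    return ["rsync", "-a", "--partial", str(src), str(dest)]
```

Calling `newest_dirs("/tank/productions", 5)` and passing each result to `rsync_command(..., "backup:/srv/rescue/")` would give one command per production; both paths are placeholders, not the real layout of Attenborough or Backup.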

Attendees

 - Katherine - Nervous and there
 - Hui-Ling - Cable Monkey
 - Tom - Chief Bodger
 - Tim - Linux Wrangler
 - Matt - Cat5 Patcher
 - Edwin - Stressed Out Editor
 - Sam - Remote Tech Support