At the end of the 2016/2017 academic year, during weeks 9 and 10, [[Sam Willcocks]], [[Tom Lee]] and [[Matthew Stratford]] decided it would be a good idea to tear everything out of the AV and Computing racks (see [[The Great Tech Redo 2017]]). Unfortunately, some of the servers didn't like being turned off and moved around and decided to fail.
The first server to complain was Backup, which complained of a degraded array. This was caused by a [[docs:Glossary | High Impedance Air Gap]] between the hard drive power supply and the drive.
Tim determined that the drive was healthy, which was slightly concerning, but more concerning was the beeping that started to come from Attenborough.
#Computing moved up a level of emergency.
[[File:4_Attenborough_Dies.png]]
Rebuilding the dead OS drive failed, so the team decided to give the old "failed" drive a try. This started to rebuild fine, so everyone went to the pub.
One Courtyard meal later, the OS drive claimed to be rebuilt, but just to be sure, Attenborough was rebooted into the RAID BIOS. Much to the disappointment of all present, Attenborough promptly started to beep and reported the RAID array to be degraded. Just in case something other than the drive had failed, Attenborough was set to rebuilding his RAID array again. This failed.
At this point there was only one thing to do: call [[Sam Willcocks|Sam]]. Sam suggested installing Ubuntu on another, unraided, drive in order to dump all of the data from Attenborough onto Backup.
The new Ubuntu-based temporary Attenborough was given the hostname "TomScott" after York alumnus [[w:Tom_Scott_(entertainer)|Tom Scott]], who is in part known for bodging together the Emoji keyboard.
[[Hui-Ling Phillips]] and [[Katherine Bell]] had arrived bearing the gift of biscuits, and everyone waited for data to start pouring onto Backup through the magic of rsync, with the Pirates of the Caribbean soundtrack playing in the background to match the atmosphere and keep morale high.
Pizza was then ordered.
The decision was made to prioritise current and paid productions over pending edits during the transfer, so these projects were synced to Backup first. Edwin then set about copying these projects from Backup to his SSD so that YSTV definitely, absolutely, without a doubt had a copy. In the meantime, [[Kenric Yuen|Kenric]], Hui-Ling, Katherine and Edwin started crimping some Cat5 cables to help while the computers were being dealt with.
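
For anyone repeating this under pressure, the "important stuff first" ordering can be sketched in a few lines of Python. This is only an illustration and not what was actually run on the night: the project names, status labels, and paths are all hypothetical, and the function only builds the rsync command rather than executing it.

```python
from dataclasses import dataclass

# Lower number = synced earlier. The ordering follows the article
# (current and paid productions before pending edits); the numeric
# values and status labels are assumptions made for this sketch.
PRIORITY = {"current": 0, "paid": 0, "pending": 1}


@dataclass
class Project:
    name: str    # hypothetical directory name under the source root
    status: str  # "current", "paid" or "pending"


def sync_order(projects):
    """Return projects with current/paid productions first, the rest after."""
    # sorted() is stable, so projects keep their original relative order
    # within the same priority band. Unknown statuses go last.
    return sorted(projects, key=lambda p: PRIORITY.get(p.status, 2))


def rsync_command(project, src_root, dest):
    """Build (but do not run) an rsync invocation for one project."""
    # -a preserves permissions and timestamps; --partial keeps partly
    # transferred files so an interrupted copy can be resumed.
    return [
        "rsync", "-a", "--partial",
        f"{src_root}/{project.name}/",
        f"{dest}/{project.name}/",
    ]
```

Each command list could then be handed to `subprocess.run` one at a time, so that if a drive dies halfway through, the projects that matter most are already across.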
All seemed well so the Tech/Computing teams went back to completing small jobs about the studio. Tim started working on cron jobs to automate backups, Tom started to assemble a media cache for the edit PCs as a way to combat drive failure, and Matt continued work on patching/routing various cables. All was fine, until Backup started reporting SMART errors.
Goddammit.
Now there was a mad scramble to retrieve data from Backup onto Edwin's SSD.
As there was not much else to do other than to wait for files to sync and pray that Backup lived long enough, Tom took this opportunity to go home, have a shower, and change, shortly followed by Katherine and Hui-Ling. Meanwhile, Matt and Tim pulled Backup out of the Computing rack. Upon Tom's return, Tim took his shift of showering and changing and Tom pulled the 2TB drive out of [[Obriain]] to be sacrificed to the great Backup RAID array. Matt and Tom took the opportunity during the rebuild to continue the ongoing attempt to tidy up the studio.
After Tim's return, he and Tom continued setting up the media cache, and Tom kept writing the wiki article for the 4<sup>th</sup> in the series of Drive Crashes as the crash unfolded around him.
=Attendees and Roles=
{| class="wikitable"
|-
! Person
! Role
|-
| Katherine
| Nervous and there
|-
| Hui-Ling
| Cable Monkey
|-
| Tom
| Chief Bodger
|-
| Tim
| Linux Wrangler
|-
| Matt
| Cat5 Patcher
|-
| Edwin
| Stressed out Editor
|-
| Sam
| Remote Tech Support
|-
| Kenric
| Crimping Party Starter
|}
=Drive Crash 4 as Told by #Computing=
[[File:1_Tempting_Fate_1_Rob.png]]
[[File:2_Tempting_Fate_2_Rob.png]]
[[File:3_Tempting_Fate_3_Matt.png]]
[[File:4_Attenborough_Dies.png]]
[[File:5_Optimism_Peter.png]]
[[File:6_Realism_Edwin.png]]
[[File:8_Optimism_Tim.png]]
[[File:9_Backup_Kills_A_Drive.png]]
[[File:10_All_Is_Well.png]]
=Lessons Learned=
* Keep regular backups
* Don't unplug the servers
** No, that's not a good reason to
** Seriously, they will fail
* Drives cling to life until powered off (mostly)
* Bring biscuits
* Sleep is good
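
On the "keep regular backups" point, one cheap safeguard is a check that complains when the last successful backup is too old. The sketch below assumes the backup job touches a marker file whenever it finishes cleanly; that marker, its path, and the one-day threshold are all hypothetical and are not a description of the cron jobs Tim actually wrote.

```python
import os
import time


def backup_is_stale(marker_path, max_age_seconds=24 * 3600, now=None):
    """Return True if the marker file is missing or older than max_age_seconds.

    marker_path is a hypothetical file that the backup job is assumed to
    touch after every successful run.
    """
    now = time.time() if now is None else now
    try:
        last_success = os.path.getmtime(marker_path)
    except OSError:
        # No marker at all means no backup has ever succeeded.
        return True
    return (now - last_success) > max_age_seconds
```

The marker file is used rather than the timestamps of the backed-up files themselves because rsync's archive mode preserves the source modification times, so those say nothing about when the copy was made.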
=The Final Fatality=
After returning home to recover from the ordeal, Tom sat down at his desktop to find it frozen. After months of neglected maintenance and several days of being left on, the OS drive in Tom's desktop had failed: the final victim of Drive Crash 4.
[[Category:Drive_Crashes]]