First Archiving

I have often claimed in this blog that backup and archiving are different, but I haven’t really explained why, or what I actually do.
I use a really old-fashioned three-generation approach, Grandfather > Father > Son, which originally came from using large reels of magnetic tape to archive mainframe data on larger installations.
The system is simple. Once a month (or at some other routine interval: fortnightly, bi-monthly, quarterly, yearly) a new master archive tape (or set of tapes) is written and given a sticky label reading SON (on the outside, not magnetically). The next month a new master is written on a fresh tape. The SON label is removed from the old tape and affixed to the new master (or a new label is used), and the older master is now labelled FATHER. The month after that, another new tape is written and labelled SON; the old son becomes FATHER and the old father becomes GRANDFATHER. It was common to keep either the father or the grandfather version offsite. In the fourth and subsequent months of the cycle, the GRANDFATHER tape is overwritten with a fresh full archive and becomes the SON. This approach means that you can recover items up to three months old even if they were corrupted (or lost) on the most recent archive. It also has the advantage that the tapes are regularly exercised, rather than sitting idle while the magnetic signal slowly leaks across the layers of tape and corrupts the data.
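For anyone who prefers to see the rotation as logic rather than sticky labels, here is a minimal Python sketch of the cycle described above. It is only an illustration: the function names `rotate` and `labelled`, the tape ids and the "contents" strings are all made up for the example, and nothing here touches real media.

```python
LABELS = ("SON", "FATHER", "GRANDFATHER")

def rotate(tapes, new_contents, blank_tapes):
    """Write one new archive and return the tape list, newest first.

    tapes: list of (tape_id, contents) pairs, newest first (index 0 is SON).
    blank_tapes: list of unused tape ids; one is removed from it each time
    a fresh tape is needed while the cycle is still filling up.
    """
    tapes = list(tapes)
    if len(tapes) < len(LABELS):
        # First three runs of the cycle: write onto a fresh tape.
        tape_id = blank_tapes.pop(0)
    else:
        # Fourth run onwards: the oldest tape (GRANDFATHER) is overwritten.
        tape_id, _ = tapes.pop()
    # The newly written tape goes to the front and so becomes the SON.
    return [(tape_id, new_contents)] + tapes

def labelled(tapes):
    """Map each sticky label to the tape currently carrying it."""
    return dict(zip(LABELS, tapes))
```

Starting with three blank tapes A, B and C and running `rotate` four times, tape A (the first one written) ends up overwritten with the fourth archive and carries the SON label again, while C is FATHER and B is GRANDFATHER.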
For the past few years I have followed this approach, but instead of tapes I’m using old hard disk drives recovered from superseded computers. I have a simple plug-in USB-to-SATA HDD adaptor (they are not expensive), and the old mechanical hard disks are essentially free. Reformatting and then re-using them also stops your data falling into the hands of unscrupulous computer recyclers and being sold to hackers or phishing schemers. Two years ago, in my first bout of frustration with Windows 10 updates, I gave up trying to install Windows 10 on a small Toshiba netbook and installed Linux on it instead; it works fine. It now forms the key component of my air-gapped archive. It’s not a top-secret, spy-style setup, just a way to keep viruses and the normal mechanical malfunctions that might affect my everyday computer network away from my archive files (and well away from any internet connection).
I have dropped back to doing the full archive update quarterly, but I do maintain a “candidates for the latest archive” area on one of the disks on my local area network that all my computers can copy to. Each quarter I transfer this to my Linux computer via an external USB drive and sneakernet (i.e. I unplug it from one and plug it into the other). This external drive is itself a backup of the latest SON archive (perhaps recreated then and there). So begins the process of cleaning, deduplication and so on, which can take anywhere from a few minutes to many hours, if not days, for poorly organised collections of everything!
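The deduplication step can be partly automated. The sketch below is one possible way to do it, not a description of any particular tool: it assumes that "duplicate" means "byte-for-byte identical content", and the names `file_digest` and `find_duplicates` are invented for the example. It only reports groups of identical files; deciding which copy to keep is still a job for a human.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def find_duplicates(root):
    """Return lists of paths under `root` whose contents are identical."""
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    # Only groups with more than one file are duplicates of each other.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Hashing every file is slow on a large candidates area, which fits the "minutes to days" estimate above; grouping files by size first and hashing only those that share a size would be a reasonable refinement.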
Just having everything fully backed up does not constitute an archive. An archive needs to be organised so that things are easily found (keywords are not enough). This is a big topic and will be the subject of a few posts to come.
Once I am happy with the new additions to the archive, I merge them into the full son set of directories. I then get the grandfather drive, which is normally kept offsite, so I need to collect it or take my Linux laptop to the offsite location. I then reformat the grandfather so it is basically a fresh, clean disk. Exercising (aka using) a mechanical hard drive actually improves its life span and significantly reduces the likelihood of data loss. This reformatted, empty disk now gets a full copy of the working son archive and is labelled SON. The old son becomes FATHER, and the old father gets labelled GRANDFATHER and is retired offsite.
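The "full copy onto the freshly formatted disk" step is worth checking before the labels are swapped. Here is a hedged sketch of one way to copy and then compare, using only the Python standard library. The function name `copy_and_verify` and the paths passed to it are hypothetical, and the destination is assumed to be a folder that does not exist yet on the newly formatted drive.

```python
import filecmp
import shutil
from pathlib import Path

def copy_and_verify(source, destination):
    """Copy the `source` tree to a new `destination` folder.

    Returns the relative paths of files that are missing from the copy
    or differ from the original; an empty list means the copy matched.
    """
    source, destination = Path(source), Path(destination)
    # copytree creates `destination`; it must not already exist.
    shutil.copytree(source, destination)
    mismatched = []
    for src_file in sorted(source.rglob("*")):
        if src_file.is_file():
            relative = src_file.relative_to(source)
            dst_file = destination / relative
            # shallow=False compares file contents, not just size and time.
            if not dst_file.is_file() or not filecmp.cmp(
                src_file, dst_file, shallow=False
            ):
                mismatched.append(relative)
    return mismatched
```

An empty result only says the copy reads back the same as the original as far as the operating system reports it, so unplugging and replugging the drive before a second comparison is a sensible extra precaution.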
This might sound complex, but it is easy and extra reliable. Despite two out of my three computers being down, I know my archive is safe.