adding archival backup to SD


ChrisRyland
01-20-2006, 06:11 PM
Please, please, please consider building a drop-dead simple archival backup feature into SuperDuper.

I know I've asked before, and you said it's Not Currently Your Mission.

But I believe such a product (or maybe make a new product and call it SuperSuperDuper, charging $100) would sell well, and would still be very easy to use: backup would be a volume, at whose top level is a directory with subdirectories for each backup date; each such subdirectory would correspond to the state of the backed-up volume at that point (and all files contained within would be hard-linked to any other unchanged files from previous backups, so you're really only storing the incremental changes, not the whole volume each time).

To restore, just find the file in question (in the Finder) under the most recent directory that's still in the state you wanted.

Boom, you're done.
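
For anyone who wants to see the idea in miniature, here is a rough Python sketch of the layout described above. It is only an illustration under stated assumptions, not how SuperDuper or any existing tool works: the function name `snapshot` and its parameters are made up for this example, backup labels are assumed to sort chronologically (e.g. ISO dates), the archive is assumed to live outside the source tree on a single file system, and resource forks, symlinks, permissions on directories and other metadata are ignored.

```python
import filecmp
import os
import shutil


def snapshot(source, archive_root, label):
    """Create archive_root/label so that it looks like a full copy of source.

    Files unchanged since the most recent earlier snapshot are hard-linked
    to that snapshot's copy; new or changed files are copied.
    """
    os.makedirs(archive_root, exist_ok=True)
    # Assumption: labels sort chronologically, so the last one is the newest.
    previous = sorted(os.listdir(archive_root))
    prev_dir = os.path.join(archive_root, previous[-1]) if previous else None
    dest = os.path.join(archive_root, label)

    for dirpath, dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        dest_dir = os.path.normpath(os.path.join(dest, rel))
        os.makedirs(dest_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dest_dir, name)
            old = os.path.join(prev_dir, rel, name) if prev_dir else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)       # unchanged: share the stored data
            else:
                shutil.copy2(src, dst)  # new or changed: store a fresh copy
    return dest
```

Calling something like `snapshot("/Users/chris", "/Volumes/Archive", "2006-01-20")` once per backup date (paths hypothetical) would give one dated folder per run, each browsable as an ordinary directory tree.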

dnanian
01-20-2006, 06:16 PM
Thanks for the suggestion, Chris. While you might find a volume comprised of a lot of hard links "easy to use", I'm not sure less experienced users would agree... not only that, but this volume wouldn't be restorable using normal means.

But, I do understand some users want similar functionality. We'll have to see how requests, and potential solutions, work out moving forward.

ChrisRyland
01-20-2006, 06:24 PM
dnanian wrote:
Thanks for the suggestion, Chris. While you might find a volume comprised of a lot of hard links "easy to use", I'm not sure less experienced users would agree... not only that, but this volume wouldn't be restorable using normal means.

I don't think you're quite seeing the simplicity of it, perhaps due to my terminology.

First, yes, don't call it a backup--that might confuse some people. It's an archive, not to be confused with a backup, and not to be depended on for a full restoration.

But, once you accept that it's a time-based archive, you can "go back in time" to any backup point, and descend one level into the state of the backed-up volume at that time. Once you're there, there are no links to worry about, just the exact hierarchy of the backed-up volume at that point in time. (The hard links are just an implementation detail of which the end user does not have to be aware at all.)

dnanian
01-20-2006, 06:27 PM
No, I do see the simplicity of it -- in fact, I know of tools that do exactly this. I really do understand exactly what you're talking about.... whether it belongs in SuperDuper!, though, is another question. (And it does have to be explained, in documentation and operation... really. It can't be magic: that's very much against my philosophy.)

ChrisRyland
01-20-2006, 06:33 PM
No, I didn't mean to make it seem like "magic", just that the machinery involved was nothing more than normal Unix file system semantics.

And, yes, maybe it belongs in a new product called SuperArchive!

I know even if it costs $100 I'd buy at least 5-10 copies. (And I wouldn't stop buying SuperDuper! either--they're really different beasts. It'd be nice if it were all one product, just so I wouldn't have to deal with two distinct UIs, but if that's necessary to keep things simpler, so be it.)

I'll stop pushing now, and see if anyone else chimes in with support or suggestions.

dnanian
01-20-2006, 06:37 PM
Yes. I do understand what you're trying to do, and I have looked at these various methods in the tools that do them. The question to my mind is: what percentage of people want this, and does that justify either the complexity of adding it to SuperDuper or creating (designing, writing, testing, documenting, marketing, supporting) another similar product?

Believe me, the effort involved would not be covered by 5-10 copies, even at $100 each. ;) :D

Anyway, it's something that's under consideration: I am aware of the technique. Thanks!

Syzygies
01-21-2006, 12:43 AM
I had this same idea, got quite excited about it, and within 5 minutes of googling realized (of course) it had been done. The idea has many virtues: Don't rely on a proprietary format, take advantage of 20 cents/GB hard drives to store the data Finder-readable. Don't implement a specialized "snapshot" for each backup date as Retrospect does; you've got a friggin' file system already available to store information about files. (This is the same revelation that took Apple decades to realize: Why implement a resource fork, when the application could be a file system directory storing whatever you want? Retrospect evolved in the days of tape, but modern efforts can rethink the paradigm.) If one daydreams about a time-traveling hard disk whose contents change with one's desired view date, there can't be a simpler interface than clicking on a dated folder holding that view.

There's a free cross-platform product called rsnapshot (http://www.rsnapshot.org/) which does exactly this. It depends on the Unix command line tool rsync, which was only updated to handle resource forks correctly in Tiger. Nevertheless, if one searches the web to confirm all is well in Tiger, one finds patch files for Tiger rsync, to fix issues with the current release. To put this all together, one needs to be very comfortable with makefiles, etc.

rsnapshot is a Perl script. rsync takes on a lot more than the problems faced copying between two local drives; its primary concern is fast network copying. So in a way, rsnapshot is a kludge, though it's well thought of. I agree that a solid, minimalist duplicate finder written in C could do the heavy lifting here, managing the resource forks, Spotlight data, etc. correctly while building hard links or whatever once it determined duplicates and carried out the needed copies.

If this is a 'column' view, then ChronoSync (http://www.econtechnologies.com/site/Pages/ChronoSync/chrono_overview.html) takes a 'row' view: An auxiliary top-level folder called _Archived Items holds a parallel copy of one's directory structure, with backup copies of revised files all in the same parallel location, automatically given names like Backups Library_v001. Handy when you've hosed a particular file, and want to peruse the earlier versions all at once. However, ChronoSync can't clone volumes; it operates using a particular user's permissions, so an administrative account can't easily archive an entire machine. I've had various 'issues' and one repeatable lethal crash that takes one of my user archives out of commission, and I've yet to hear back ever from ChronoSync's tech support. In contrast, while I've been able to discover some obscure minor issues with new releases of SD!, it overall has been astoundingly stable and reliable, and my questions get instant replies. I've been stressing SD!.

I come around more and more to the view that .dmg image files are a great currency. One knows Apple and others will be able to open them in a decade. With DVDs down to 30 cents or less each, breaking image files into 4.2 GB chunks and burning them asynchronously becomes very appealing. (I'll post my scripts for this soon.) I crave a small burn watcher application that notices whenever I walk by and pop in a blank DVD-R, fetches the next burn job in the queue, and pops out the verified DVD when it's done. With images on external hard drives, this asynchronous task could be moved to a different machine. Getting the initial copy to disk image done fast when it's convenient to do so is what SD! excels at.
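
As a stand-in until real scripts are posted, the chunking step might look roughly like the Python sketch below. This is not the poster's script, just a generic byte-level splitter; the function name `split_file`, the `.000`/`.001` suffix scheme and the buffer size are all assumptions made for the example. Note that it cuts the file at arbitrary byte offsets, so the pieces are only useful once concatenated back together in order.

```python
def split_file(path, chunk_size, buffer_size=1024 * 1024):
    """Split path into path.000, path.001, ... of at most chunk_size bytes each."""
    parts = []
    with open(path, "rb") as src:
        index = 0
        while True:
            # Read the first piece before creating a part, so no empty part is made.
            data = src.read(min(buffer_size, chunk_size))
            if not data:
                break
            part = f"{path}.{index:03d}"
            written = 0
            with open(part, "wb") as out:
                while data:
                    out.write(data)
                    written += len(data)
                    remaining = chunk_size - written
                    if remaining == 0:
                        break  # this part is full; start the next one
                    data = src.read(min(buffer_size, remaining))
            parts.append(part)
            index += 1
    return parts
```

For DVD-sized pieces one would call it with something like `chunk_size=4_200_000_000`; the returned list of part names could then feed the burn queue.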

That said, the exact file structure of an archive becomes increasingly irrelevant, as long as one uses the built-in file system somehow so Spotlight can find everything. (Apple's goal here is to get everyone to forget about actual file locations, so our drives can become every bit as messy as our physical offices.) The most convenient file structure would support burning older portions of the archive compactly to DVD, then deleting to make space. Hard linking doesn't achieve this goal; every view is a complete snapshot, taking as much space as the entire original. Rather, the views save space collectively by overlapping each other.

This argues in favor of "differential backup" directories, containing only the differences generated in a given time frame. (rsnapshot could presumably be extended to generate a second set of directories containing any desired difference views.)
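
A rough sketch of how such a difference view could be derived from two hard-linked snapshots, in Python. This is a guess at one possible approach, not something rsnapshot provides: the function name `differential` and its parameters are invented for the example, and it assumes unchanged files in the two snapshots are hard links to the same data. It records new and changed files only; deletions are not tracked.

```python
import os
import shutil


def differential(older, newer, diff_dir):
    """Copy into diff_dir only the files of `newer` that `older` does not share."""
    changed = []
    for dirpath, dirnames, filenames in os.walk(newer):
        rel = os.path.relpath(dirpath, newer)
        for name in filenames:
            new = os.path.join(dirpath, name)
            old = os.path.join(older, rel, name)
            if os.path.exists(old) and os.path.samefile(old, new):
                continue  # same underlying file: unchanged between snapshots
            target = os.path.normpath(os.path.join(diff_dir, rel, name))
            os.makedirs(os.path.dirname(target), exist_ok=True)
            shutil.copy2(new, target)
            changed.append(os.path.normpath(os.path.join(rel, name)))
    return sorted(changed)
```

The resulting directory holds real copies rather than links, so it can be burned to DVD compactly without dragging the rest of the snapshot along.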

One reasonable evolution for SD! would be a filtering option to not copy any file found on one volume or image, in copying to a second volume or image. So e.g. use a complete "March 1, 2006" image to help generate and keep current a "latest version changed in March" image. Come April, demote the "now" image to "April 1, 2006", burn the March changes, and continue. Do the same for shorter time frames, as desired...

ChrisRyland
01-21-2006, 01:05 AM
Thanks for those thoughts. Yes, I'm aware that other solutions exist--my favorite (theoretically so far, because I'm a Python addict) is rdiff-backup (http://www.nongnu.org/rdiff-backup/index.html).

And, yes, I was using Tiger rsync before (and the UM customized rsync before that) and found they all had reliability problems. So I gave up on those. rdiff-backup I haven't tried, so it is still theoretically useful. :)

And I'm currently using Apple "asr" on my server for volume image backups, so I can live with command lines, as long as they're reliable.

Since SuperDuper is so reliable and drop-dead easy to use (with a nice GUI), I was hoping to find out if others on this board agreed that this would be a great feature (or a great different product) to have.

And, yes, Dave, I understand that my paltry ca. $500 ain't gonna pay for development. I was hoping to hear others chime in with "sure, I'd pay for that!" to the point where you got some sense it might pay over the medium term. (No product is a drop-dead sure thing in the short term.)

Syzygies
01-22-2006, 06:32 PM
This hard linking feature is a command line one-liner, "cp -al", if one can make a parallel directory on the same file system. As far as I can see, SD! will only copy to volumes as targets, not subdirectories like Retrospect's "subvolume" feature. Also, SD! deletes everything else that it finds on the target. If SD! could loosen either of these behaviors, then the above one-liner would trivially support archiving as described in this thread.
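
For readers without a cp that accepts those flags (-a and -l together are GNU cp options, as far as I know), roughly the same effect can be sketched in a few lines of Python. This is only an approximation: the name `link_tree` is made up, the target must sit outside the source tree on the same file system, and symlinks, ownership and directory metadata are not handled the way cp -a would handle them.

```python
import os


def link_tree(src, dst):
    """Recreate the directory tree of src under dst, hard-linking every file."""
    for dirpath, dirnames, filenames in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            # A hard link adds a second name for the same data; nothing is copied.
            os.link(os.path.join(dirpath, name), os.path.join(target_dir, name))
```

The linked tree costs almost no extra space. It only works as an archive if the backup tool then replaces changed files with fresh copies rather than overwriting them in place, since an in-place write would change the linked copy too.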

There are other, tamer reasons to want SD! to copy directory to directory, as an option. If this is one's goal, with the current filtering setup SD! wades through all the other files on the volume, rather than restricting its focus to a given directory. For my directory -> disk image tasks, on updates this is where half the time goes.

There are other, tamer reasons to want SD! to have filtering to protect certain target files from deletion. This is dumb, but for example I like my backup volumes to have a different label color. Etc. Etc.

I played around for a bit looking for a hack to present a local directory as a volume. E.g. adding mounts using Netinfo Manager, trying to connect to afp://localhost. Nothing worked. There should be a routine way to mount a folder so it appears to be a mounted volume, but googling didn't turn up one for me.