View Full Version : Data integrity question- how much verification by SuperDuper?


stevea
05-12-2005, 02:43 PM
What kinds of verification tests does SuperDuper perform to ensure data integrity of the backup copy?

stevea
05-12-2005, 06:02 PM
We don't do any kind of "explicit" verification, since the disk controller itself performs that task for us.

Our general take is that you don't ask the Finder to verify things when you copy, and you don't do it when you write a file to disk with an editor (or whatever). That's because a hard disk has "built-in verification" -- its controller flags errors when they occur, on either read or write.

If we hit an error like this, we stop, because there's been a problem. But compare-after-write would both take time and cause unnecessary wear and tear on the drive.

(This is different from writing to a medium that is inherently prone to failure, like "older" Travan tapes, CDs or floppies...)




Yes, the built-in verification prevents media errors or device errors... but, it does not prevent errors or verify that the files are copied correctly to begin with. (For example, are all files copied? are permissions correct? etc.)

Also, the SuperDuper documentation specifies that "[files which Apple recommends not to copy]" are not copied... yet, there is no explicit documentation or verification to determine exactly what is copied and what is not copied.

Thus, it is not possible to determine whether the backup copy is identical to the original. Presumably it is not identical, but not knowing the exact differences makes it more difficult to determine whether data integrity was maintained.
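For anyone who wants to check "are all files copied? are permissions correct?" independently, here is a minimal sketch in Python. It is not how SuperDuper or any other tool mentioned in this thread works; the function names `list_tree` and `compare_trees` are made up for illustration. It only compares which paths exist and their permission bits, not file contents, and it does not account for the files a backup tool excludes on purpose.

```python
import os
import stat

def list_tree(root):
    """Map each path under root (relative to root) to its permission bits."""
    entries = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            # lstat so that symlinks are described themselves, not their targets
            entries[rel] = stat.S_IMODE(os.lstat(full).st_mode)
    return entries

def compare_trees(source, backup):
    """Return (missing_in_backup, extra_in_backup, mode_mismatches)."""
    src = list_tree(source)
    dst = list_tree(backup)
    missing = sorted(set(src) - set(dst))
    extra = sorted(set(dst) - set(src))
    modes = sorted(p for p in set(src) & set(dst) if src[p] != dst[p])
    return missing, extra, modes
```

Calling `compare_trees("/Volumes/Original", "/Volumes/Backup")` (both paths are placeholders) returns three sorted lists of relative paths, so intentionally excluded files would show up in the first list and need to be filtered by hand.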

dnanian
05-12-2005, 06:04 PM
Steve, if you don't trust us to copy the files properly, how can you trust us to "verify" those files?

The files that are not copied are in the scripts themselves. You're welcome to look in there: every file that's not copied is listed.

stevea
05-12-2005, 06:17 PM
Steve, if you don't trust us to copy the files properly, how can you trust us to "verify" those files?

I don't wish to argue with you. But, availability of a verification feature would at least indicate some concern for data integrity on your part.

On a technical level, it seems that verification is less prone to troubles that plague copying and backup utilities on OS X.

There are lots of other reasons why verification is valuable and important, which I am sure you don't need me to list here.


The files that are not copied are in the scripts themselves. You're welcome to look in there: every file that's not copied is listed.

Thanks, that's helpful!

dnanian
05-12-2005, 06:28 PM
I'm not trying to argue: I just want to make sure you understand our position.

We spend a lot of time testing and ensuring that we do the right thing when we copy. We obviously have a high regard for data integrity, and spend considerable effort ensuring that we do the right thing, and the software is never released to make a date: it's released when it's been thoroughly tested.

We have internal tools that do careful comparisons of source and destination to ensure we're doing the right thing, and we don't release versions that don't pass our extensive battery of tests.

Building those tools into the main application doesn't really offer any benefits. If we ensured we do the right thing beforehand, then it's going to do the right thing. If we didn't, we'd have made a mistake in both the comparison/testing tools AND the program. And thus, you wouldn't see the problem when you relied on our own tools to verify a copy.

If you're concerned about this, I'd suggest finding a comparison/verification tool from a different author -- that way, you'd get some "independence" when looking at the files, thus isolating yourself (at some level, at least) from any algorithmic issues that would affect both sides of our own implementation.

stevea
05-12-2005, 06:36 PM
Building those tools into the main application doesn't really offer any benefits. If we ensured we do the right thing beforehand, then it's going to do the right thing. If we didn't, we'd have made a mistake in both the comparison/testing tools AND the program. And thus, you wouldn't see the problem when you relied on our own tools to verify a copy.


Thanks. I understand your position, but very strongly disagree with your opinion and logic here.

Obviously, you rely and trust your verification tools and consider them very valuable to you. So, it makes little sense for you to turn around and say they would have little or no value to anyone else.

Do you think you are the only person concerned with matters of data integrity?

How do you think other people would want to trust your software, if you don't even want to make available tests to verify the integrity of its copying?




If you're concerned about this, I'd suggest finding a comparison/verification tool from a different author -- that way, you'd get some "independence" when looking at the files, thus isolating yourself (at some level, at least) from any algorithmic issues that would affect both sides of our own implementation.

Of course.

In part, my questions are related to resolving differences in results of using various other tools, both to perform copying tasks and verification tasks.

dnanian
05-12-2005, 06:40 PM
Thanks for your comments.

stevea
05-12-2005, 07:02 PM
Thanks for your replies! :)


Another situation-

I am trying SuperDuper to back up a disk that has some problems. (Disk Utility reported some problems, DiskWarrior reported the directory was too damaged to rebuild, Disk Utility repaired the problems and says the disk is OK, but DiskWarrior reports the directory is still too damaged to rebuild.)

Another verification tool (Synchronize! Pro X 3.6.1) is running into errors when comparing the original disk and the SuperDuper backup copy, so I cannot verify the backup completely. And yet, SuperDuper made the backup copy without reporting any errors, so I am trying to analyze and understand what data can be trusted here...

dnanian
05-12-2005, 07:07 PM
We're very conservative about error handling. If we get back any errors at all during the process, we stop and log the file (as well as a bunch of information about it).

Of course, if the directory damage involves the tree we're walking -- and leaves files out -- there's no way for us to know what's not shown to us by the OS...

What specific errors is SPX running into here? A comparison error, or an inability to walk the tree?

stevea
05-12-2005, 07:18 PM
What specific errors is SPX running into here? A comparison error, or an inability to walk the tree?

The SPX error: "* An error occurred while getting the location of the parent directory of a file to be written. Current privileges do not allow the operation."

This error appears during a verify-only run, so nothing is to be written. I'm not sure if it is a typo in the error message or a revelation of some hack or low-level error in getting data for verification purposes.

SPX has its share of bugs and I'm not very happy with it... it's another motivation for this thread- just seeking better solutions.

stevea
05-12-2005, 07:20 PM
Of course, if the directory damage involves the tree we're walking -- and leaves files out -- there's no way for us to know what's not shown to us by the OS...


Right. But, as a user, that and other problems are what I want to detect and know about (as I'm sure you understand)! :)

dnanian
05-12-2005, 07:27 PM
That's unusual. I have no idea what they're trying to do there...

dnanian
05-12-2005, 07:37 PM
(Something weird happened on the forums, and the original reply to this message has moved to a different thread with a deleted parent. Weird: go figure. Here's the original text.)

We don't do any kind of "explicit" verification, since the disk controller itself performs that task for us.

Our general take is that you don't ask the Finder to verify things when you copy, and you don't do it when you write a file to disk with an editor (or whatever). That's because a hard disk has "built-in verification" -- its controller flags errors when they occur, on either read or write.

If we hit an error like this, we stop, because there's been a problem. But compare-after-write would both take time and cause unnecessary wear and tear on the drive.

(This is different from writing to a medium that is inherently prone to failure, like "older" Travan tapes, CDs or floppies...)

Hope that helps!

stevea
05-12-2005, 08:02 PM
(Something weird happened on the forums, and the original reply to this message has moved to a different thread with a deleted parent. Weird: go figure.

I first posted the original message in a different thread, then, after I realized my mistake, deleted it and reposted here as a new thread.

You were just too quick, and replied to the first message that I deleted in the other thread! :D

dnanian
05-12-2005, 08:05 PM
Oh, Jeez, good. I was becoming concerned that things were well and truly hosed here!

stevea
05-16-2005, 01:19 PM
Just to illustrate, here is an important scenario which is not addressed by SuperDuper, but where verification is essential to maintain data integrity:

Copy Disk 1 to Disk 2 using SuperDuper

During the copy, contents of Disk 1 (or Disk 2) are altered. This can happen very easily, because both disks are available to the user during the copy operation. It's possible for the user to make changes, or other applications to make changes, with or without the user being aware of it.

Without verification, the user believes Disk 1 and Disk 2 are identical.

However, in this situation, they are not.

Verification not only reveals the existence of the differences in data on Disk 1 and Disk 2, but also pinpoints and identifies the exact differences, so the user can be aware and determine how to proceed.
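To make the scenario concrete, here is a rough Python sketch of a byte-by-byte comparison that would pinpoint files changed on either disk during or after the copy. It is only an illustration under simple assumptions (regular files, both trees readable), not the method used by any tool discussed here; `changed_files` is a hypothetical name.

```python
import filecmp
import os

def changed_files(source, backup):
    """Yield relative paths of files present in both trees whose bytes differ."""
    for dirpath, _dirnames, filenames in os.walk(source):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, source)
            dst = os.path.join(backup, rel)
            # shallow=False forces a content comparison instead of
            # trusting size and modification time alone
            if os.path.isfile(dst) and not filecmp.cmp(src, dst, shallow=False):
                yield rel
```

Something like `sorted(changed_files("/Volumes/Disk1", "/Volumes/Disk2"))` (placeholder paths) would list the files whose contents no longer match. Files missing from the backup are skipped here and would need a separate check.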

dnanian
05-16-2005, 01:39 PM
Thanks for the example, Steve.

marianco
05-22-2005, 12:42 PM
Steve,
I don't think your example is a good one. If a user deliberately makes changes to a disk while SuperDuper is running a backup, they should automatically realize that the changes may not be copied over to the backup. The solution would be to run SuperDuper again to update the changes. If anything, the user should refrain from using the Mac while running SuperDuper to avoid problems like this.

I also thought Dave's answer was pertinent and sufficient. The disk driver itself does the verification of the copy; otherwise verification becomes impractically time-consuming. SuperDuper's cloned disks have been the best, cleanest, most problem-free I have obtained from any utility so far. This includes Synchronize Pro (which I now use only for synchronization of folders) and Carbon Copy Cloner. What makes it so good is that even the aliases point to the correct files in the clone. Note that Synchronize Pro does have a simple verification procedure (checking file size, various flags, etc.) which is relatively quick. I have used this, but it hasn't been very useful, since too many files are listed as different when I know the data in the file is identical. After a while, I ended up ignoring the listing of files the verification process made in Synchronize Pro's log file.

Your initial scenario of copying a drive with known severe directory/corruption problems is also problematic. I don't think you can reliably get a clone of that drive in the first place, since the files themselves cannot be reliably found. By the way, I did have such a drive once, where Disk Utility and DiskWarrior both reported the drive was too damaged to use. In this case, I don't know which helped, but I ran Norton Utilities' Disk Doctor (which is why I still keep it around) and did a Safe Boot (holding the shift key while starting up, which runs the fsck command-line utility), and the drive was repaired. If the drive is corrupted beyond the usual repair utilities, you may have to use Data Rescue (a third-party utility) to scavenge the drive and attempt to recreate its files and structure on another drive. Data Rescue is probably the best at doing this.

If you really want to verify the exact file differences between the original and the clones, a utility that can do this very well is "You Synchronize". It uses 32-bit CRC (CRC-32) checksums to determine if files are different before synchronization. This is far more sophisticated than checking file lengths or modification dates. This will definitely tell you what differences there are between the original and clone. The problem is that if you have hundreds of thousands of files (as many people often have these days), it literally takes hours upon hours to complete the verification, since You Synchronize has to compute the CRC-32 checksum for each file on the original and each file on the clone, compare the checksums, and then add the data to its database. For myself, this made it very impractical to use on a daily basis, particularly since I like backing up to four clone hard drives daily. I would be limited to one overnight synchronization a day if I used You Synchronize on anything but a small number of files.
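As a rough idea of what a per-file CRC-32 check involves, here is a small Python sketch. I don't know how You Synchronize implements it internally; this just uses the standard `zlib.crc32` function, and `file_crc32` is a name invented for the example. It also shows why the approach is slow: every byte of every file has to be read on both the original and the clone.

```python
import zlib

def file_crc32(path, chunk_size=1 << 20):
    """Compute the CRC-32 of a file, reading it in chunks to limit memory use."""
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Passing the running value continues the checksum across chunks
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF
```

Two files with different checksums definitely differ. Two files with the same checksum are very likely, though not guaranteed, to be identical, since CRC-32 has only 32 bits.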
------
marianco