PDA

View Full Version : Getting RAID timeouts


Dewdman42
04-03-2009, 02:11 PM
I have been using SuperDuper with great success for the past few months, with a scheduled backup that takes place every night, which creates about 30GB read-only dmg file.

I have updated the backup to run a script at the end. I created a very simple shell script which looks like this:

cp -rp Retail_10.5.6_Latest.dmg /Volumes/PromiseRaid/Backup/SuperDuper/.

The PromiseRaid volume is a SMB share on another windows box.

I can run this shell script manually and it works fine.

When my SuperDuper backup runs, it has problems when it gets to this script. There is a dialog box that pops up that says something about AppleEvent timed out during copy job. I'm unable to close this box and instead have to resort to killing the copy process. I'm also getting time out messages from my RAID controller.

I don't understand why this copy script works fine on its own, but when Superduper tries to use this copy shell script, the RAID device on the other machine times out and the copy fails.

Any ideas?

dnanian
04-03-2009, 02:16 PM
I'm guessing, but perhaps because the copy isn't completely done? What I'd suggest is tossing that script in the background and sleeping, say, a minute before actually starting the copy.

Dewdman42
04-03-2009, 02:28 PM
Are you saying that SuperDuper is launching the shell script before it has completed its own previous steps? I can certainly add a 30 second sleep to the beginning of the script to see if that will fix things.

Its still odd that when I get up in the morning, I see this AppleEvents error dialog box(which I can't close), the shell script is no longer running, Super Duper is no longer running and the copy (from the shell script did not work). And why does it cause my RAID controller on the other machine to timeout? Nothing else I have been doing has caused that to happen.

dnanian
04-03-2009, 02:38 PM
We're just running the script, but do try what I suggested - it's easy to do, and since it'll let SD! get out of the way before you start doing your thing, I think it'll be more likely to be successful.

As far as why the controller times out - I have no idea. Obviously, we can't affect your RAID controller.

Dewdman42
04-03-2009, 02:54 PM
Since this is the only thing causing the RAID to timeout, and no copying is happening, there must be something about the way SuperDuper is launching the shell script. Somehow the shell script must be wedged waiting for something to clear on the mac, while the RAID is also waiting on the script. Like a deadlock.

I also don't get why there is a dialog box up in the morning saying "timed out during copy job", which I don't think should have anything to do with the shell script. I'll try to get a screenshot next time. That must be from SuperDuper also.

Anyway, I have updated the shell script to start with a 2 minute sleep. I have also updated the script to make sure that the full path name of both the source and dest files are specified. I have also modified the shell script to include this on the first line:


#!/bin/bash


Here is the whole script:


#!/bin/bash

SOURCE=/Volumes/Audio/SuperDuperImages/Retail_10.5.6_Latest.dmg
DEST=/Volumes/PromiseRaid/Backup/SuperDuper/Retail_10.5.6_Latest.dmg

sleep 120

cp -rp $SOURCE $DEST


Does SuperDuper go ahead and quit once it launches the shell script?

dnanian
04-03-2009, 03:36 PM
We're going to wait for the script to finish. You get the error because the script task likely timed out.

You'll need to put this script in the background. That is:


#!/bin/bash
/bin/bash -c "sleep 120; cp -rp sourcefile destfile" &

Dewdman42
04-03-2009, 03:44 PM
Oh I see, you're suggesting to put the shell script into the background IN ORDER SO THAT, Super Duper can terminate and get completely out of the way. I get it now. I will try that.

I was not under the impression that shell scripts and classic unix utils like "cp" would cause a gui dialog box to pop up under any circumstances. When I have a screen shot I will post it here so that you can see, its not likely caused by the script. In order to get rid of the box I have to go into Activity Monitor can kill some job called copy something er other...which does sound like it must be from "cp", but its very strange to me that a shell script should create a gui dialog box to report error.

dnanian
04-03-2009, 03:56 PM
The shell script didn't do that. Rather, because SD! itself stopped communicating with its applescript interface, the copy job 'schedule driver' got an error.

Dewdman42
04-03-2009, 04:04 PM
Which I guess is what putting cp in the background should end that problem. I should probably modify that script to send cp stderr to a file too.

Dewdman42
04-03-2009, 04:39 PM
Well, I ran the backup job with SuperDuper. SuperDuper finished what it needed to do right away, as suggested. It did not create the AppleEvent error dialog this time, and my RAID did not bark at me about timeouts or anything...but...the cp command didn't copy anything either.

Running the shell script manually works fine, does the job in the background as intended.

dnanian
04-03-2009, 04:41 PM
We're really just running the script -- nothing tricky. You might want to test out with something that just echos text into a file...

Dewdman42
04-04-2009, 01:51 PM
I dunno, I guess I give up. I put the script into the background as you suggested. I can run it manually from the command line and it works perfectly. When my automated backup ran last night (it takes 2 hours, so its difficult to test this), same problem as before this morning:

- SuperDuper did not create the dmg on the local drive

- nothing was copied

- Alert dialog with AppleEvent timed out

- My RAID on the windows box barked at me about timeout, 4 hours after SuperDuper was scheduled to start the backup, and then again two hours later

- I added a few lines to the shell script to send some text to a file before starting the "cp" command, and that did not occur either, which implies the script did not get launched at all, but that doesn't explain the AppleEvent dialog box coming up.

I'm starting to wonder if the problem has nothing to do with the script or my RAID or anything, but simply that SuperDuper is failing to even perform its initial dmg creation for some reason. Thus the "CopyJob" process is wedging up with that error dialog box. Could the screen saver have anything to do with this?

dnanian
04-04-2009, 02:42 PM
OK. If the DMG wasn't created at all, it's likely failing mid-creation. I'd suggest creating a small copy script that includes "Exclude all files" and then includes a small file. Test with that -- it should take almost no time -- to see if things work.

As far as the timeout goes -- well, if nothing was created, and nothing was copied, I can't see how the RAID could be affected.

Dewdman42
04-04-2009, 02:45 PM
I was trying to create a small copy script like that, but I could not for the life of me figure out how to do it. At some point I figured out how to include a small file set, but then the SuperDuper backup was failing with some message about detecting volume. Let me try to recreate after the current long backup job is done(I ran it manually since the scheduled one failed last night)

Dewdman42
04-04-2009, 04:38 PM
Ok. Full SuperDuper backup completed, (non-scheduled). It did create the read-only compressed DMG, it did NOT execute the script. It did not wedge during "CopyJob" or create an error dialog in the gui.

I am executing the script by hand now and its running fine.

Something about SuperDuper and calling that script does not work. I will try to see now if I can get that smaller copy job working in order to test it better.

So, there could be more than one thing going on. Could be related to scheduling, screen saver, SuperDuper copy script problems or Super Duper calling the shell script. I'm at a loss. I will continue to try to post test results in case you can spot the problem.

Dewdman42
04-04-2009, 05:00 PM
ok, made a small backup file set and ran it manually. This time it worked fine. Everything worked, including calling the script. I did have a script file permission problem the first time I tried this, but I'm not sure if that was there this morning I created that problem while tweaking the script just now for this small test. So I will run the full backup again to see if it all works 100%.

Beyond that, we'll see what happens when I schedule this for tonight again.

Could be something going on related to screensaver or the scheduler. When backups are run via the scheduler, what unix user executes the scripts?

dnanian
04-04-2009, 07:27 PM
Does it indicate in the log that it's calling the script? Did you "nohup" the background task?

We are really just asking the script to execute. There's no trick here...

Dewdman42
04-09-2009, 01:23 PM
After muddling around with this a few days, it started working and for a few days the schedule ran as expected every night at 3:30am, creating a backup and successfully running the shell script at the end to copy the backup to a remote drive.

However, this morning I arose to find the same error, an error reported from "Copy Job" (Which I now understand is indeed a SuperDuper thing). Copy Job has thrown up a dialog box saying "AppleEvent timed out". The ok button on this dialog box does not function. The SuperDuper GUI is not up. Copy Job is still showing in the Activity Monitor, so I kill it, and the error dialog goes away.

There is no dmg file that would have been created by SuperDuper, so the backup did not happen for some reason.

I looked at the SuperDuper log using the SuperDuper GUI (window menu) and I can't even see any entries since last saturday, even though I know the backup actually ran a number of times successfully since then.

dnanian
04-09-2009, 02:16 PM
The "Copy Job" application is the schedule driver. As I indicated, if the application crashed or failed, the script interface would fail, and time out... perhaps there's something in your system log that indicates what the real failure was?

Dewdman42
04-09-2009, 02:25 PM
Its not failing during the shell script and I'm not talking about that anymore.

SuperDuper is failing to do ANYTHING on the schedule. It appears that if AppleScript is jammed up for some reason, then SuperDuper will not be able to execute on schedule because somehow Copy Job is blocked. I looked in my console log to see what was happening around 3:30am, and there are a lot of Apple Script time out messages being posted for a little app called "Mail Unread Menu". This app has always functioned fine in the past but has been acting up lately. I have now disabled that app and we'll see what happens this week with SuperDuper.

Side note, seems unfortunate that another app like that should be able to block SuperDuper from running on schedule.

dnanian
04-09-2009, 02:51 PM
Maybe you should try rebuild your "Launch Services" database, actually. That will recreate the various script dictionary caches and might help.

Other applications shouldn't really be able to "block" us from running on schedule (unless, of course, cron isn't working at all, but in your case it is).

Dewdman42
04-09-2009, 02:53 PM
I will try that. I seem to recall that can be done with OnyX