NetWorker Daily Rerun Guidelines
General Information
Each day someone from the backup team needs to validate that every backup has run successfully. This
document outlines how that activity should be completed each day, and other suggestions and
guidelines to ensure proper backups are being performed.
Daily outline
The daily tasks follow this outline:
1. Review active backups and validate that none of them have been running for an excessive
amount of time. Resolve situations as necessary.
2. Find failed backups and validate that each failed backup did not already rerun successfully.
3. Re-run failed backups as necessary.
4. Monitor rerun jobs for completion.
Identifying Backup job status
NetWorker uses icons to show when jobs are running, completed successfully, interrupted, or failed.
Note that “User Input” will almost never show up for a backup job, but may be used when ejecting tapes
or other operations.
HAEA Internal Use Only
Reviewing active backups
Using the Monitoring tab, review the backups that are still running. This should be done by the end of
the backup window (6 AM PST). Most, if not all, backups should be complete by this time, but it’s not
unusual for a few backups to still be running from the night before. However, backups can sometimes get
“stuck” and attempt to run for days. This creates several issues, one being that subsequent backups will
not run until the current backup completes. Some backups, such as weekly full or monthly full backups, do
legitimately run for multiple days. Use the following guidelines to determine whether an active backup
needs to be stopped and rerun:
1. Incremental backup running longer than 24 hours
2. Full backup running longer than 5 days
If one of these situations frequently recurs for a specific policy/system, further troubleshooting should
be performed to prevent the situation from continuing.
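The two thresholds above can be expressed as a small check. This is only a sketch: the function name exceeds_runtime_limit is hypothetical, and it assumes you have already read the backup level and the elapsed run time (in whole hours) from the Monitoring tab.

```shell
# Hypothetical helper: returns 0 (true) when an active backup has run longer
# than the guideline allows and is a candidate to be stopped and rerun.
# Usage: exceeds_runtime_limit LEVEL HOURS_RUNNING
exceeds_runtime_limit() {
  level=$1
  hours=$2
  case "$level" in
    incr) limit=24 ;;    # incremental: longer than 24 hours
    full) limit=120 ;;   # full: longer than 5 days (5 x 24 = 120 hours)
    *) echo "unknown level: $level" >&2; return 2 ;;
  esac
  [ "$hours" -gt "$limit" ]
}

# Example: an incremental that has been running for 30 hours
if exceeds_runtime_limit incr 30; then
  echo "candidate to stop and rerun"
fi
```

A full backup at exactly 120 hours is not flagged, since the guideline says "longer than" 5 days.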
Find Failed Backups and Validate
NOTE: Only use these steps for failed “Backup” actions. For other failed actions (such as replication or tape
copies), refer to the relevant documents to troubleshoot and resolve them. Do not rerun those jobs unless
other documentation tells you to.
If a policy shows a “Failed” label, expand the workflow, right-click on the “backup” action, and select
“show details”.
This will bring up a window with details on all clients under that Policy/Workflow/Action.
At the top of the window is a dropdown named “Action Start Time”. You can use this to view the logs
from various action attempts.
Note that each filesystem/drive has its own line. To review the selected failures and see the error
messages, select the appropriate line and click the button that says “Show messages”. Note that you
only want to rerun the systems that had a complete failure (i.e. no filesystems were backed up) and not
the ones that had one or two files fail.
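The rerun rule above can be written as a simple test. This is a sketch only: is_complete_failure is a hypothetical helper, and the two counts are assumed to be tallied by hand from the per-filesystem lines in the details window.

```shell
# Hypothetical helper: returns 0 (true) only for a complete failure, meaning
# at least one filesystem failed and none were backed up.
# Usage: is_complete_failure SUCCEEDED_COUNT FAILED_COUNT
is_complete_failure() {
  succeeded=$1
  failed=$2
  [ "$failed" -gt 0 ] && [ "$succeeded" -eq 0 ]
}

# Example: 0 filesystems succeeded and 3 failed, so the client should be rerun
if is_complete_failure 0 3; then
  echo "rerun this client"
fi
```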
Re-run failed backups as necessary
There are two ways to re-run individual clients. You can use the GUI by following the steps below:
1. Right click on the workflow and select “start Individual client”
2. Select the clients you want to rerun (you can select multiple)
3. Click Start
You can also rerun backup jobs from the NetWorker server:
1. Log in as bkupadm
2. FOR NON VM JOBS: Run the following command, substituting in the correct
Policy/Workflow/Client:
nsrworkflow -p POLICY -w WORKFLOW -c CLIENTNAME -A "backup -l full"
This would run a full backup. To run an incremental backup:
nsrworkflow -p POLICY -w WORKFLOW -c CLIENTNAME -A "backup -l incr"
(Note that the character after “backup” is a lowercase L, for level. Also, the quotation marks in the
command must be plain straight quotes. If you copy/paste from a Microsoft Word version of this document,
Word may substitute start/end quotes, which the SSH session may not recognize, so retype them if needed.)
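When several non-VM clients in the same workflow need a rerun, the command can be generated once per client. The sketch below only prints the commands so they can be reviewed before anything runs. The function name print_rerun_commands and the client names are placeholders, not part of NetWorker, and only the nsrworkflow options shown in this document are used.

```shell
# Print (do not execute) one nsrworkflow rerun command per client.
# Usage: print_rerun_commands POLICY WORKFLOW LEVEL CLIENT...
# LEVEL is "full" or "incr", as in the commands above.
print_rerun_commands() {
  policy=$1
  workflow=$2
  level=$3
  shift 3
  for client in "$@"; do
    # Straight double quotes are written explicitly around the -A argument.
    printf 'nsrworkflow -p %s -w %s -c %s -A "backup -l %s"\n' \
      "$policy" "$workflow" "$client" "$level"
  done
}

# Example with placeholder names (not real policies or clients):
print_rerun_commands POLICY WORKFLOW incr client1 client2
```

Each printed line can then be pasted into the session as bkupadm once it has been checked.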
Rerunning VM jobs through the CLI takes a few more steps, but it is much easier than the GUI when there are dozens to rerun.
1. Get the VM UUID
mminfo -kvot -q vmname=HCAOSIVDI31
2. Review the output to get the UUID for the VM
NWddboostprod.001 Data Domain irvhcavcsp02v.hmfad.com 07/06/2024 09:59:50 PM 16 MB 260710595 cE full vm:50113a7c-c463-274a-2f0d-208868e7a82:irvhcavcsp02v.hmfad.com
3. Run the following command, substituting the appropriate values:
nsrworkflow -s hcaipntwkr01 -p <POLICYNAME> -w <WORKFLOW NAME> -L -c "work items:<UUID FROM ABOVE>"
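Steps 2 and 3 can be sketched as two small helpers, which is useful when many VMs need a rerun. Both function names are hypothetical. The sketch assumes each mminfo record arrives on a single line and that the UUID is the text between "vm:" and the next colon in the save set name; this document does not say whether the "vm:" prefix belongs in the work item, so check that against a known-good command. Commands are printed, not executed.

```shell
# Read mminfo output on stdin and print each distinct VM UUID, taken as the
# hex-and-dash text between "vm:" and the following ":".
extract_vm_uuids() {
  sed -n 's/.*vm:\([0-9a-fA-F-]*\):.*/\1/p' | sort -u
}

# Print (do not execute) the rerun command for one VM UUID.
# Usage: print_vm_rerun_command SERVER POLICY WORKFLOW UUID
print_vm_rerun_command() {
  printf 'nsrworkflow -s %s -p %s -w %s -L -c "work items:%s"\n' \
    "$1" "$2" "$3" "$4"
}

# Example using the sample output line from step 2
sample='NWddboostprod.001 Data Domain irvhcavcsp02v.hmfad.com 07/06/2024 09:59:50 PM 16 MB 260710595 cE full vm:50113a7c-c463-274a-2f0d-208868e7a82:irvhcavcsp02v.hmfad.com'
printf '%s\n' "$sample" | extract_vm_uuids
```

In practice the input would come from the mminfo command in step 1, and POLICY and WORKFLOW would be replaced with the real names before the printed command is run.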
Monitor rerun jobs for completion
The rerun job will show up in the Monitoring tab, and it should be monitored to ensure that it completes
successfully. If a lingering issue caused the original failure, the new job will generally fail fairly
quickly. These issues should be investigated and resolved immediately, and the job rerun again if
possible. Reruns that fail will show up on the next day’s failure report, so it’s not necessary to monitor the
job to absolute completion, especially in the case of longer (1+ hour) jobs.