NetBackup Dedupe Guide
Release 7.1
21159706
Legal Notice
Copyright 2011 Symantec Corporation. All rights reserved. Symantec, the Symantec Logo, and NetBackup are trademarks or registered trademarks of Symantec Corporation or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners. This Symantec product may contain third party software for which Symantec is required to provide attribution to the third party (Third Party Programs). Some of the Third Party Programs are available under open source or free software licenses. The License Agreement accompanying the Software does not alter any rights or obligations you may have under those open source or free software licenses. Please see the Third Party Legal Notice Appendix to this Documentation or TPIP ReadMe File accompanying this Symantec product for more information on the Third Party Programs. The product described in this document is distributed under licenses restricting its use, copying, distribution, and decompilation/reverse engineering. No part of this document may be reproduced in any form by any means without prior written authorization of Symantec Corporation and its licensors, if any. THE DOCUMENTATION IS PROVIDED "AS IS" AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID. SYMANTEC CORPORATION SHALL NOT BE LIABLE FOR INCIDENTAL OR CONSEQUENTIAL DAMAGES IN CONNECTION WITH THE FURNISHING, PERFORMANCE, OR USE OF THIS DOCUMENTATION. THE INFORMATION CONTAINED IN THIS DOCUMENTATION IS SUBJECT TO CHANGE WITHOUT NOTICE. 
The Licensed Software and Documentation are deemed to be commercial computer software as defined in FAR 12.212 and subject to restricted rights as defined in FAR Section 52.227-19 "Commercial Computer Software - Restricted Rights" and DFARS 227.7202, "Rights in Commercial Computer Software or Commercial Computer Software Documentation", as applicable, and any successor regulations. Any use, modification, reproduction release, performance, display or disclosure of the Licensed Software and Documentation by the U.S. Government shall be solely in accordance with the terms of this Agreement.
Symantec Corporation 350 Ellis Street Mountain View, CA 94043 http://www.symantec.com Printed in the United States of America.
Technical Support
Symantec Technical Support maintains support centers globally. Technical Support's primary role is to respond to specific queries about product features and functionality. The Technical Support group also creates content for our online Knowledge Base. The Technical Support group works collaboratively with the other functional areas within Symantec to answer your questions in a timely fashion. For example, the Technical Support group works with Product Engineering and Symantec Security Response to provide alerting services and virus definition updates. Symantec's support offerings include the following:
A range of support options that give you the flexibility to select the right amount of service for any size organization
Telephone and/or Web-based support that provides rapid response and up-to-the-minute information
Upgrade assurance that delivers software upgrades
Global support purchased on a regional business hours or 24 hours a day, 7 days a week basis
Premium service offerings that include Account Management Services
For information about Symantec's support offerings, you can visit our Web site at the following URL: www.symantec.com/business/support/ All support services will be delivered in accordance with your support agreement and the then-current enterprise technical support policy.
Hardware information
Available memory, disk space, and NIC information
Operating system
Version and patch level
Network topology
Router, gateway, and IP address information
Problem description:
Error messages and log files
Troubleshooting that was performed before contacting Symantec
Recent software configuration changes and network changes
Customer service
Customer service information is available at the following URL: www.symantec.com/business/support/ Customer Service is available to assist with non-technical questions, such as the following types of issues:
Questions regarding product licensing or serialization
Product registration updates, such as address or name changes
General product information (features, language availability, local dealers)
Latest information about product updates and upgrades
Information about upgrade assurance and support contracts
Information about the Symantec Buying Programs
Advice about Symantec's technical support options
Nontechnical presales questions
Issues that are related to CD-ROMs or manuals
Contents
Chapter 2
About fully qualified domain names
About scaling deduplication
Send initial full backups to the storage server
Increase the number of jobs gradually
Introduce load balancing servers gradually
Implement client deduplication gradually
Use deduplication compression and encryption
About the optimal number of backup streams
About storage unit groups for deduplication
About protecting the deduplicated data
Save the deduplication storage server configuration
Plan for disk write caching
How deduplication restores work
Replacing the PureDisk Deduplication Option with Media Server Deduplication on the same host
Migrating from PureDisk to the NetBackup Media Server Deduplication option
Migrating from another storage type to deduplication
Chapter 3
Chapter 4
Chapter 5
Enabling deduplication encryption
About backup policy configuration
Configuring optimized synthetic backups for deduplication
Configuring optimized duplication copy behavior
Configuring optimized duplication of deduplicated data
Throttling optimized duplication traffic
Adding a NetBackup load balancing server
About the deduplication pd.conf file
NetBackup deduplication pd.conf file settings
Editing the deduplication pd.conf file
About the deduplication storage server configuration file
Getting the deduplication storage server configuration
Editing a deduplication storage server configuration file
Setting the deduplication storage server configuration
About the deduplication host configuration file
Deleting a deduplication host configuration file
Resetting the deduplication registry
Configuring deduplication log file timestamps on Windows
Configuring a replication target for MSDP duplication to another domain
Setting NetBackup configuration options by using bpsetconfig
Reconfiguring the deduplication storage server and storage paths
Chapter 6
Chapter 7
Viewing deduplication storage server attributes
Managing NetBackup Deduplication Engine credentials
Adding NetBackup Deduplication Engine credentials
Changing NetBackup Deduplication Engine credentials
Deleting credentials from a load balancing server
Determining which media servers have deduplication credentials
Managing deduplication pools
Changing deduplication disk pool properties
Setting media server deduplication pool attributes
Clearing deduplication pool attributes
Changing the deduplication pool state
Changing the deduplication disk volume state
Deleting a deduplication pool
Determining the deduplication pool state
Determining the deduplication disk volume state
Viewing deduplication pools
Viewing media server deduplication pool attributes
Deleting backup images
About maintenance processing
Performing deduplication maintenance manually
Resizing the storage partition
Restoring files at a remote site
Specifying the restore server
Chapter 8
Deduplication fails after services are restarted or a domain controller is restarted
Cannot delete a disk pool
Media open error (83)
Media write error (84)
Storage full conditions
Viewing disk errors and events
Deduplication event codes and messages
Chapter 9
Chapter 10
Chapter
Reduce the amount of data that is stored.
Reduce backup bandwidth. Reduced bandwidth can be especially important when you want to limit the amount of data that a client sends over the network, whether to a backup server or for image duplication between remote locations.
Reduce backup windows.
Reduce infrastructure.
NetBackup deduplication
NetBackup Media Server Deduplication Option
NetBackup clients send their backups to a NetBackup media server, which deduplicates the backup data. A NetBackup media server hosts the NetBackup Deduplication Engine, which writes the data to the storage and manages the deduplicated data. See About the NetBackup Media Server Deduplication Option on page 21.
Appliance deduplication
The NetBackup OpenStorage option lets third-party vendor appliances function as disk storage for NetBackup. The disk appliance provides the storage and it manages the storage. A disk appliance may provide deduplication functionality. NetBackup backs up and restores client data and manages the life cycles of the data.
PureDisk deduplication
NetBackup PureDisk is a deduplication solution that is not part of the NetBackup distribution. PureDisk provides bandwidth-optimized backups of data in remote offices. You use PureDisk interfaces to install, configure, and manage the PureDisk servers, storage pools, and client backups. You do not use NetBackup to configure or manage the storage or backups. PureDisk has its own documentation set. See the NetBackup PureDisk Getting Started Guide. A PureDisk storage pool can be a storage destination for both the NetBackup Client Deduplication Option and the NetBackup Media Server Deduplication Option.
Deduplication significantly reduces the amount of storage space that is required for the NetBackup backup images. Figure 1-2 is a diagram of file segments that are deduplicated.
Figure 1-2 File deduplication
Client files to back up: File 1 (segments A B C D E) and File 2 (segments A B Q D L)
The following list describes how NetBackup derives unique segments to store:
The deduplication engine breaks file 1 into segments A, B, C, D, and E.
The deduplication engine breaks file 2 into segments A, B, Q, D, and L.
The deduplication engine stores file segments A, B, C, D, and E from file 1 and file segments Q and L from file 2. The deduplication engine does not store file segments A, B, and D from file 2. Instead, it points to the unique data copies of file segments A, B, and D that were already written from file 1.
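The segment handling described above can be sketched in a few lines of Python. This is an illustration only, not the NetBackup Deduplication Engine's implementation: the function names, the in-memory dictionaries, and the use of SHA-256 as the fingerprint are all assumptions made for the example.

```python
import hashlib


def deduplicate(files):
    """Store each distinct segment once and keep a per-file list of references."""
    store = {}    # fingerprint -> segment data, written only once
    recipes = {}  # file name -> ordered fingerprints that rebuild the file
    for name, segments in files.items():
        recipe = []
        for segment in segments:
            # Assumption: SHA-256 stands in for the real fingerprint algorithm.
            fingerprint = hashlib.sha256(segment).hexdigest()
            if fingerprint not in store:
                store[fingerprint] = segment  # first time seen: store the data
            recipe.append(fingerprint)        # always record a pointer
        recipes[name] = recipe
    return store, recipes


def restore(name, store, recipes):
    """Rebuild a file by following its pointers into the segment store."""
    return [store[fingerprint] for fingerprint in recipes[name]]


files = {
    "file1": [b"A", b"B", b"C", b"D", b"E"],
    "file2": [b"A", b"B", b"Q", b"D", b"L"],
}
store, recipes = deduplicate(files)
stored_segments = sorted(store.values())
```

With the two example files, ten segments are read but only seven are stored; file 2 contributes Q and L and otherwise points at segments already written for file 1.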
More detailed information is available. See Media server deduplication process on page 155.
Chapter
Planning your deduplication deployment
About the deduplication tech note
About the deduplication storage destination
About the NetBackup Media Server Deduplication Option
About NetBackup Client Deduplication
About NetBackup Deduplication Engine credentials
About the network interface for deduplication
About deduplication port usage
About deduplication compression
About deduplication encryption
About optimized synthetic backups and deduplication
About optimized duplication of deduplicated data
About deduplication and SAN Client
About deduplication performance
About deduplication stream handlers
About iSCSI, CIFS, and NFS
Deployment best practices
Replacing the PureDisk Deduplication Option with Media Server Deduplication on the same host
Migrating from PureDisk to the NetBackup Media Server Deduplication option
Migrating from another storage type to deduplication
Determine the storage destination
See About the deduplication storage destination on page 20.
Determine which type of deduplication to use
See About the NetBackup Media Server Deduplication Option on page 21.
See About NetBackup Client Deduplication on page 26.
Determine the requirements for deduplication hosts
See About deduplication servers on page 22.
See About deduplication server requirements on page 24.
See About client deduplication requirements and limitations on page 27.
See About the network interface for deduplication on page 28.
See About deduplication port usage on page 29.
See About scaling deduplication on page 46.
See About deduplication performance on page 43.
Determine the credentials for deduplication
See About NetBackup Deduplication Engine credentials on page 28.
Read about compression and encryption
See About deduplication compression on page 29.
See About deduplication encryption on page 30.
Read about optimized synthetic backups
See About optimized synthetic backups and deduplication on page 30.
Read about best practices for implementation
Determine the storage requirements and provision the storage
See About provisioning the storage on page 55.
See About deduplication storage requirements on page 55.
See About deduplication storage capacity on page 56.
See About the deduplication storage paths on page 56.
See Replacing the PureDisk Deduplication Option with Media Server Deduplication on the same host on page 51.
See Migrating from PureDisk to the NetBackup Media Server Deduplication option on page 52.
Currently supported systems
Media server and client sizing information
Configuration, operational, and troubleshooting updates
And more
A Media Server Deduplication Pool represents the disk storage that is attached to a NetBackup media server. If you use this destination, use this guide to plan, implement, configure, and manage deduplication and the storage. When you configure the storage server, select Media Server Deduplication Pool as the storage type. For a Media Server Deduplication Pool storage destination, all hosts that are used for the deduplication must be NetBackup 7.0 or later. Hosts include the master server, the media servers, and the clients that deduplicate their own data. Integrated deduplication means that the components installed with NetBackup perform deduplication.
A PureDisk Deduplication Pool represents a PureDisk storage pool. If you use a PureDisk Deduplication Pool, use the PureDisk documentation to plan, implement, configure, and manage the storage. A PureDisk Deduplication Pool destination requires that PureDisk be at release 6.6 or later. See the NetBackup PureDisk Getting Started Guide. After you configure the storage, use this guide to configure backups and deduplication in NetBackup. When you configure the storage server, select PureDisk Deduplication Pool as the storage type. For a PureDisk Deduplication Pool storage destination, all hosts that are used for the deduplication must be NetBackup 7.0 or later. Hosts include the master server, the media servers, and the clients that deduplicate their own data. Integrated deduplication means that the components installed with NetBackup perform deduplication.
A PureDisk storage pool destination is not represented by a NetBackup construct. The required PureDisk Deduplication Option (PDDO) agent presents the PureDisk storage pool to NetBackup. In PDDO, the PureDisk Storage Pool Authority provides the PureDisk Deduplication Option agent that is installed on the NetBackup media servers. If you use a PureDisk storage pool, use the PureDisk documentation to plan, implement, configure, and manage the storage. For a PureDisk storage pool destination, you can use NetBackup 6.5 or 7.0 hosts. Hosts include the master server and the media servers. See the NetBackup PureDisk Deduplication Option Guide.
Table 2-2 shows the storage destinations supported by the two NetBackup releases.
Planning your deployment About the NetBackup Media Server Deduplication Option
Table 2-2 (column: NetBackup release; rows: 6.5, 7.0)
Figure 2-1 (diagram; labels: NetBackup client, PureDisk plug-in)
A PureDisk storage pool may also be the storage destination. See About the deduplication storage destination on page 20. More detailed information is available. See Deduplication server components on page 153. See Media server deduplication process on page 155.
Deduplication storage server
One host functions as the storage server for a deduplication node; that host must be a NetBackup media server. The storage server does the following:
Writes the data to and reads data from the disk storage.
Manages that storage.
The storage server also deduplicates data. Therefore, one host both deduplicates the data and manages the storage. Only one storage server exists for each NetBackup deduplication node. You can use NetBackup deduplication with one media server host only: the media server that is configured as the deduplication storage server. See About deduplication nodes on page 23.
Load balancing server
You can configure other NetBackup media servers to help deduplicate data. They perform file fingerprint calculations for deduplication, and they send the unique results to the storage server. These helper media servers are called load balancing servers. See About deduplication fingerprinting on page 161.
A NetBackup media server becomes a load balancing server when two things occur:
You enable the media server for deduplication load balancing duties. You do so when you configure the storage server or later by modifying the storage server properties.
You select it in the storage unit for the deduplication pool.
See Introduce load balancing servers gradually on page 47.
Load balancing servers also perform restore and duplication jobs. Load balancing servers can be any supported server type for deduplication. They do not have to be the same type as the storage server.
its own storage. Deduplication within each node is supported; deduplication between nodes is not supported. Multiple media server deduplication nodes can exist. Nodes cannot share servers, storage, or clients.
Storage server: Symantec recommends at least a 2.2-GHz clock rate. A 64-bit processor is required. At least four cores are required. Symantec recommends eight cores.
Load balancing server or PureDisk Deduplication Option host: Symantec recommends at least a 2.2-GHz clock rate. A 64-bit processor is required. At least two cores are required. Depending on throughput requirements, more cores may be helpful.
See About deduplication performance on page 43. See About maintenance processing on page 118.
Table 2-4 Deduplication server minimum requirements (continued)
Load balancing server or PureDisk Deduplication Option host: 4 GBs.
Note: In some environments, a single host can function as both a NetBackup master server and as a deduplication server. Such environments typically run fewer than 100 total backup jobs a day. (Total backup jobs means backups to any storage destination, including deduplication and nondeduplication storage.) If you perform more than 100 backups a day, deduplication operations may affect master server operations.
Note: If you use an existing media server for deduplication, performance may be lower than that of equipment that meets the deduplication server minimum requirement guidelines.
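The processor guidelines in this section can be expressed as a small pre-deployment check. The following Python sketch is a hypothetical helper, not a NetBackup tool: the role labels, the Host fields, and the message wording are assumptions, and only the thresholds (64-bit processor, four or two cores, eight cores recommended, 2.2 GHz recommended) come from the requirements above.

```python
from dataclasses import dataclass


@dataclass
class Host:
    role: str         # "storage_server" or "load_balancing" (hypothetical labels)
    cores: int
    clock_ghz: float
    is_64bit: bool


# Thresholds taken from the processor requirements in this section.
MIN_CORES = {"storage_server": 4, "load_balancing": 2}
RECOMMENDED_STORAGE_SERVER_CORES = 8
RECOMMENDED_CLOCK_GHZ = 2.2


def check_host(host):
    """Return (errors, warnings): unmet requirements and unmet recommendations."""
    errors, warnings = [], []
    if not host.is_64bit:
        errors.append("a 64-bit processor is required")
    minimum = MIN_CORES[host.role]
    if host.cores < minimum:
        errors.append(f"at least {minimum} cores are required")
    elif (host.role == "storage_server"
          and host.cores < RECOMMENDED_STORAGE_SERVER_CORES):
        warnings.append("eight cores are recommended")
    if host.clock_ghz < RECOMMENDED_CLOCK_GHZ:
        warnings.append("a clock rate of at least 2.2 GHz is recommended")
    return errors, warnings
```

Requirements are reported separately from recommendations so that a host that merely falls short of a recommendation is not rejected outright.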
See Replacing the PureDisk Deduplication Option with Media Server Deduplication on the same host on page 51. See the NetBackup PureDisk Deduplication Option (PDDO) Guide. Deduplication within each media server deduplication node is supported; global deduplication between nodes is not supported.
Reduces network traffic. The client sends only unique file segments to the storage server. Duplicate data is not sent over the network.
Distributes some deduplication processing load from the storage server to clients. (NetBackup does not balance load between clients; each client deduplicates its own data.)
Remote office or branch office backups to the data center.
LAN-connected file server backups.
Virtual machine backups.
Client-side deduplication is also a useful solution if a client host has unused CPU cycles or if the storage server or load balancing servers are overloaded. Figure 2-2 shows client deduplication. The deduplication storage server is a media server on which the deduplication core components are enabled. The storage destination is a Media Server Deduplication Pool.
Figure 2-2 (diagram; labels: PureDisk plug-in, NetBackup Deduplication Engine, Deduplication storage server, Media server deduplication pool)
A PureDisk storage pool may also be the storage destination. See About the deduplication storage destination on page 20. More information is available. See Deduplication client components on page 158. See Deduplication client backup process on page 158.
For user names and passwords, you can use characters in the printable ASCII range (0x20-0x7E) except for the following characters:
Asterisk (*)
Backward slash (\) and forward slash (/)
Double quote (")
Left parenthesis [(] and right parenthesis [)]
The user name can be up to 127 characters in length.
The password can be up to 100 characters in length.
Leading and trailing spaces and quotes are ignored.
The user name and password cannot be empty or all spaces.
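Because the credentials cannot be changed after you enter them, it can help to check a candidate user name and password against these rules beforehand. The following Python sketch is a hypothetical checker, not part of NetBackup. It assumes that "quotes" covers both single and double quotes and that the length limits apply after leading and trailing spaces and quotes are dropped; neither point is stated in this guide.

```python
# Characters excluded from the printable ASCII range by the rules above.
FORBIDDEN = set('*\\/"()')
MAX_LENGTH = {"user name": 127, "password": 100}


def normalize(value):
    # Assumption: "quotes" means both ' and "; they are stripped from the
    # ends together with spaces, as the rules say they are ignored.
    return value.strip(" '\"")


def validate_credentials(user_name, password):
    """Return a list of rule violations; an empty list means both values pass."""
    problems = []
    for label, raw in (("user name", user_name), ("password", password)):
        value = normalize(raw)
        if not value:
            problems.append(f"{label} is empty or all spaces")
            continue
        if len(value) > MAX_LENGTH[label]:
            problems.append(f"{label} is longer than {MAX_LENGTH[label]} characters")
        bad = sorted({c for c in value
                      if not 0x20 <= ord(c) <= 0x7E or c in FORBIDDEN})
        if bad:
            listed = "".join(bad)
            problems.append(f"{label} contains disallowed characters: {listed}")
    return problems
```

The function returns every violation it finds rather than stopping at the first one, so that both values can be corrected in a single pass.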
Record and save the credentials in case you need them in the future. Caution: You cannot change the NetBackup Deduplication Engine credentials after you enter them. Therefore, carefully choose and enter your credentials. If you must change the credentials, contact your Symantec support representative.
10085
10102
If you enable encryption on a client that deduplicates its own data, the client encrypts the data before it sends it to the storage server. The data remains encrypted on the storage. Data also is transferred from the client over a Secure Sockets Layer to the server regardless of whether or not the data is encrypted. Therefore, data transfer from the clients that do not deduplicate their own data is also protected.
If you enable encryption on a load balancing server, the load balancing server encrypts the data. It remains encrypted on storage.
If you enable encryption on the storage server, the storage server encrypts the data. It remains encrypted on storage. If the data is already encrypted, the storage server does not encrypt it.
Deduplication uses the Blowfish algorithm for encryption. Note: Do not enable encryption by selecting the Encryption setting on the Attributes tab of the Policy dialog box. If you do, NetBackup encrypts the data before it reaches the PureDisk plug-in that deduplicates it. Consequently, deduplication rates are very low. See Use deduplication compression and encryption on page 48. See Enabling deduplication encryption on page 70. See About the deduplication pd.conf file on page 80. See NetBackup deduplication pd.conf file settings on page 80.
server constructs (or synthesizes) the backup image directly on the disk storage. Optimized synthetic backups require no data movement across the network. Optimized synthetic backups are faster than regular synthetic backups. Regular synthetic backups are constructed on the media server. They are moved across the network from the storage server to the media server and synthesized into one image. The synthetic image is then moved back to the storage server. The target storage unit's deduplication pool must be the same deduplication pool on which the source images reside. In NetBackup, the OptimizedImage attribute enables optimized synthetic backup. It applies to both storage servers and deduplication pools. Beginning with NetBackup 7.1, the OptimizedImage attribute is enabled by default on storage servers and media server deduplication pools. See Configuring optimized synthetic backups for deduplication on page 72. If NetBackup cannot produce the optimized synthetic backup, NetBackup creates the more data-movement intensive synthetic backup. See Setting deduplication storage server attributes on page 105. See Setting media server deduplication pool attributes on page 113.
If the source images reside on a NetBackup Media Server Deduplication Pool, the destination can be another Media Server Deduplication Pool or a PureDisk Deduplication Pool. (In NetBackup, a PureDisk storage pool is configured as a PureDisk Deduplication Pool.) If the destination is a PureDisk storage pool, the PureDisk environment must be at release level 6.6 or later.
If the source images reside on a PureDisk storage pool, the destination must be another PureDisk storage pool. Both PureDisk environments must be at release level 6.6 or later.
The source storage and the destination storage must have at least one media server in common. See About the media servers for optimized MSDP duplication within the same domain on page 33.
In the storage unit you use for the destination for the optimized duplication, you must select only the common media server or media servers. If you select more than one, NetBackup assigns the duplication job to the least busy media server. If you select a media server or servers that are not common, the optimized duplication job fails. For more information about media server load balancing, see the NetBackup Administrator's Guide for UNIX and Linux, Volume I or the NetBackup Administrator's Guide for Windows, Volume I.
The destination storage unit cannot be the same as the source storage unit.
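The requirements above lend themselves to a checklist. The following Python sketch is a hypothetical planning aid, not a NetBackup interface: the pool type labels, the parameter names, and the messages are assumptions. It covers the pool type pairing, the common media server, the storage unit selection, and the distinct storage units; it does not check the PureDisk release level.

```python
MSDP = "Media Server Deduplication Pool"
PDP = "PureDisk Deduplication Pool"

# Destination pool types allowed for each source pool type.
ALLOWED_DESTINATIONS = {MSDP: {MSDP, PDP}, PDP: {PDP}}


def check_optimized_duplication(source_type, dest_type,
                                source_unit, dest_unit,
                                source_servers, dest_servers,
                                selected_servers):
    """Return the list of unmet requirements (empty if the plan qualifies).

    source_servers and dest_servers are the media servers that can reach each
    storage; selected_servers are those chosen in the destination storage unit.
    """
    problems = []
    if dest_type not in ALLOWED_DESTINATIONS.get(source_type, set()):
        problems.append(f"{source_type} cannot duplicate to {dest_type}")
    if source_unit == dest_unit:
        problems.append("destination storage unit is the same as the source")
    common = set(source_servers) & set(dest_servers)
    if not common:
        problems.append("source and destination have no media server in common")
    if not selected_servers:
        problems.append("no media server is selected in the destination storage unit")
    not_common = sorted(set(selected_servers) - common)
    if not_common:
        problems.append("selected servers are not common: " + ", ".join(not_common))
    return problems
```

An empty result only means that the plan passes these checks; it does not guarantee that the duplication job will succeed.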
You cannot use optimized duplication from a PureDisk storage pool (a PureDisk Deduplication Pool) to a Media Server Deduplication Pool.
If an optimized duplication job fails after the configured number of retries, NetBackup does not run the job again. By default, NetBackup retries an optimized duplication job three times. You can change the number of retries. See Configuring optimized duplication copy behavior on page 73.
Optimized duplication does not work with storage unit groups. If you use a storage unit group as a destination for optimized duplication, NetBackup uses regular duplication.
Optimized duplication does not support multiple copies during an optimized duplication job. If NetBackup is configured to make multiple new copies from the (source) copy of the backup image, the following occurs:
In a storage lifecycle policy, one duplication job creates one optimized duplication copy. If multiple optimized duplication destinations exist, a separate job exists for each destination. This behavior assumes that the device for the optimized duplication destination is compatible with the device on which the source image resides. If multiple remaining copies are configured to go to devices that are not optimized duplication capable, NetBackup uses normal duplication. One duplication job creates those multiple copies.
For other duplication methods, NetBackup uses normal duplication. One duplication job creates all of the copies simultaneously. The other duplication methods include the following: NetBackup Vault, the bpduplicate command line, and the duplication option of the Catalog utility in the NetBackup Administration Console.
See Optimized MSDP duplication within the same domain requirements on page 32.
About the media servers for optimized MSDP duplication within the same domain
For optimized Media Server Deduplication Pool duplication within the same domain, the source storage and the destination storage must have at least one media server in common. The common server initiates, monitors, and verifies the
copy operation. The common server requires credentials for both the source storage and the destination storage. (For deduplication, the credentials are for the NetBackup Deduplication Engine, not for the host on which it runs.) Usually, you would use one of the storage servers as the common server, but you can use another media server to control the optimized duplication. It must have the credentials and the connectivity for both the source storage and the destination storage. (You configure the credentials when you configure the storage server or you can add credentials to a media server later.) Which server initiates the optimized duplication determines if the duplication is a push duplication or a pull duplication. Technically, no advantage exists with a push duplication or a pull duplication. However, the media server that initiates the duplication operation also becomes the write host for the new image copies.
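The role of the common server can be illustrated with a short Python sketch. It is a hypothetical model rather than NetBackup behavior: the credentials mapping, the node names, and the function names are invented for the example, and the push and pull classification reflects one reading of this section, namely that a duplication initiated from the source side is a push and one initiated from the destination side is a pull.

```python
def common_servers(credentials, source, destination):
    """Return the media servers that hold credentials for both storages.

    credentials maps a media server name to the set of storage names for
    which it has NetBackup Deduplication Engine credentials (hypothetical).
    """
    return sorted(server for server, storages in credentials.items()
                  if {source, destination} <= storages)


def duplication_mode(initiator, source_hosts, destination_hosts):
    """Classify a duplication by the side on which the initiating server sits."""
    if initiator in source_hosts:
        return "push"  # the source side sends the data
    if initiator in destination_hosts:
        return "pull"  # the destination side requests the data
    return "unclassified"  # for example, a separate media server controls the copy


# Example: host A has credentials for both nodes, host B only for node B.
credentials = {"hostA": {"nodeA", "nodeB"}, "hostB": {"nodeB"}}
candidates = common_servers(credentials, "nodeA", "nodeB")
mode = duplication_mode(candidates[0], {"hostA"}, {"hostB"})
```

In this example host A is the only common server, so it would be the one selected in the destination storage unit, and the duplication is a push.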
Push duplication (diagram; labels: Deduplication node B (duplicates), Host B, PureDisk plug-in, Credentials: Host B)
For Figure 2-3, the following screen shot shows the settings for the storage unit for the normal backups. The disk pool is the Media Server Deduplication Pool in the local environment. Because host A has credentials for both nodes, host A
appears in the storage units for both of the nodes. For node A normal backups, you do not want a remote host deduplicating data, so only host A is selected.
For optimized duplication for Figure 2-3, the following screen shot shows the storage unit settings. The destination is the Media Server Deduplication Pool in the remote environment. You must select the common server, so only host A is selected.
If you use node B for backups also, select host B and not host A in the storage unit for the node B backups. If you select host A, it becomes a load balancing server for the node B deduplication pool. Figure 2-4 shows a push deduplication from a Media Server Deduplication Pool to a PureDisk storage pool. (In NetBackup, a PureDisk storage pool is configured as a PureDisk Deduplication Pool.) The Media Server Deduplication Pool contains normal backups; the PureDisk Deduplication Pool is the destination for the optimized duplication copies. The local media server has credentials for both environments; it is the common server.
Figure 2-4
[Diagram labels: Deduplication node A (normal backups); Local_MediaServer; PureDisk plug-in; PureDisk]
For Figure 2-4, the following screen shot shows the settings for the storage unit for the normal backups. The disk pool is the Media Server Deduplication Pool in the local environment. For normal backups, you do not want a remote host deduplicating data, so only the local host is selected.
For optimized duplication for Figure 2-4, the following screen shot shows the storage unit settings. The disk pool is the PureDisk Deduplication Pool in the remote environment. You must select the common server, so only the local media server is selected. If this configuration were a pull configuration, the remote host would be selected in the storage unit.
Figure 2-5 shows optimized duplication between two PureDisk storage pools. NetBackup media server A has credentials for both storage pools; it initiates, monitors, and verifies the optimized duplication. In the destination storage unit, the common server (media server A) is selected. This configuration is a push configuration. For a PureDisk Deduplication Pool (that is, a PureDisk storage pool), the PureDisk content router functions as the storage server.
Figure 2-5
[Diagram labels: PureDisk plug-in; "Storage pool A, send your data to storage pool B"]
You can use a load balancing server when you duplicate between two NetBackup deduplication pools. However, it is more common between two PureDisk storage pools.
Figure 2-6 Pull duplication
[Diagram labels: Deduplication node B (duplicates); Host B; PureDisk plug-in; credentials: Host A, Host B; "Please verify that the data arrived"]
For Figure 2-6, the storage unit settings for the normal backups are the same as for the push duplication example. For the duplication destination, the following screen shot shows the storage unit settings. They are similar to the push example except host B is selected. Host B is the common server, so it must be selected in the storage unit.
If you use node B for backups also, select host B and not host A in the storage unit for the node B backups. If you select host A, it becomes a load balancing server for the node B deduplication pool.
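The push and pull examples differ only in which node the initiating (common) server belongs to. The following Python sketch illustrates that rule. The function name, the host names, and the idea of modelling each node as a set of hosts are assumptions made for illustration; they are not NetBackup interfaces.

```python
def duplication_direction(initiator, source_node_hosts, destination_node_hosts):
    """Classify an optimized duplication as 'push' or 'pull'.

    Assumption: each deduplication node is modelled as a set of host names.
    This is an illustration only, not a NetBackup data structure.
    """
    if initiator in source_node_hosts:
        # The server in the source environment sends the data: push.
        return "push"
    if initiator in destination_node_hosts:
        # The server in the destination environment fetches the data: pull.
        return "pull"
    raise ValueError(f"{initiator!r} is not a host in either node")


# Push example: host A (source node) is the common server.
print(duplication_direction("hostA", {"hostA"}, {"hostB"}))
# Pull example: host B (destination node) is the common server.
print(duplication_direction("hostB", {"hostA"}, {"hostB"}))
```

In either case the initiating server also becomes the write host for the new image copies, so the choice affects where the duplication workload runs.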
See Configuring a replication target for MSDP duplication to another domain on page 90. Duplicating images to another NetBackup domain is different than NetBackup optimized duplication. NetBackup optimized duplication copies images to different storage in the same NetBackup domain. See About optimized MSDP duplication within the same domain on page 32.
Table 2-7
When
Initial seeding
Normal operation
Normal operation is when all clients have been backed up once. Approximately 15 to 20 jobs can run concurrently and with high performance under the following conditions: The hardware meets minimum requirements. (More capable hardware improves performance.) No compression. If data is compressed, the CPU usage increases quickly, which reduces the number of concurrent jobs that can be handled. The deduplication rate is between 10% and 50%. The deduplication rate is the percentage of data that is already stored and therefore is not stored again. The amount of data that is stored is between 30% and 90% of the capacity of the storage.
Clean up periods
Clean up is when the NetBackup Deduplication Engine performs maintenance such as deleting expired backup image data segments. NetBackup maintains the same number of concurrent backup jobs as during normal operation. However, the average time to complete the jobs increases significantly.
Storage approaches full capacity
NetBackup maintains the same number of concurrent backup jobs as during normal operation under the following conditions: The hardware meets minimum requirements. (More capable hardware improves performance.) The amount of data that is stored is between 85% and 90% of the capacity of the storage. However, the average time to complete the jobs increases significantly.
Backup Exec Hyper-V NDMP VMWare vmdk VMWare vSphere backup types Windows System State
Symantec continues to develop additional stream handlers to improve backup deduplication performance.
Select Only use the following media servers. Select all of the load balancing servers but do not select the deduplication storage server.
The deduplication storage server performs storage server tasks only: storing and managing the deduplicated data, file deletion, and optimized duplication. If you configure client deduplication, the clients deduplicate their own data. Some of the deduplication load is removed from the deduplication storage server and load balancing servers. Symantec recommends the following strategies to scale deduplication:
For the initial full backups of your clients, use the deduplication storage server. For subsequent backups, use load balancing servers. Enable client-side deduplication gradually. If a client cannot tolerate the deduplication processing workload, be prepared to move the deduplication processing back to a server.
The deduplication storage server is CPU limited on any core. Memory resources are available on the storage server. Network bandwidth is available on the storage server. Back-end I/O bandwidth to the deduplication pool is available. Other NetBackup media servers have CPU available for deduplication.
Gigabit Ethernet should provide sufficient performance in many environments. If your performance objective is the fastest throughput possible with load balancing servers, you should consider 10 Gigabit Ethernet.
Use the storage server for the initial backup of the clients. Enable deduplication on only a few clients at a time. It may be easier to evaluate how your environment handles traffic and easier to troubleshoot any problems with fewer hosts added for deduplication.
If a client cannot tolerate the deduplication processing workload, be prepared to move the deduplication processing back to the storage server.
server deduplication may be a better solution when the application can provide multiple streams or channels. See Planning your deduplication deployment on page 18.
Use NetBackup optimized duplication to copy the images to another deduplication node at an off-site location. Optimized duplication copies the primary backup data to another deduplication pool. It provides the easiest, most efficient method to copy data off-site yet remain in the same NetBackup domain. You then can recover from a disaster that destroys the storage on which the primary copies reside by retrieving images from the other deduplication pool. See About optimized MSDP duplication within the same domain on page 32. See Configuring optimized duplication of deduplicated data on page 75.
For the primary deduplication storage, use a SAN volume with resilient storage methodologies to replicate the data to a remote site. If the deduplication database is on a different SAN volume, replicate that volume to the remote site also.
Also, you can use NetBackup to back up the deduplication storage server system or program disks. If the disk on which NetBackup resides fails and you have to replace it, you can use NetBackup to restore the media server.
A battery backup unit that supplies power to the cache memory so write operations can continue if power is restored within sufficient time. An uninterruptible power supply that allows the components to complete their write operations.
If your devices that have caches are not protected, Symantec recommends that you disable the hardware caches. Read and write performance may decline, but you help to avoid data loss.
The backup server may not be the server that performs the restore. If another server has credentials for the NetBackup Deduplication Engine (that is, for the storage server), NetBackup may use that server for the restore job. NetBackup chooses the least busy server for the restore. The following other servers can have credentials for the NetBackup Deduplication Engine:
A load balancing server in the same deduplication node. A deduplication server in a different deduplication node that is the target of optimized duplication. Optimized duplication requires a server in common between the two deduplication nodes. See About the media servers for optimized MSDP duplication within the same domain on page 33.
You can specify the server to use for restores. See Specifying the restore server on page 121.
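As a rough illustration of the restore server selection described above, the following Python sketch picks the least busy server among those that have credentials for the NetBackup Deduplication Engine. The text does not say how NetBackup measures "least busy"; the active job count used here, the function name, and the server names are assumptions for illustration only.

```python
def choose_restore_server(active_jobs, servers_with_credentials):
    """Pick the candidate with the fewest active jobs (ties broken by name).

    active_jobs: dict of server name -> number of active jobs (hypothetical metric).
    servers_with_credentials: servers that hold Deduplication Engine credentials.
    """
    candidates = [s for s in servers_with_credentials if s in active_jobs]
    if not candidates:
        raise ValueError("no server with credentials is available")
    return min(candidates, key=lambda s: (active_jobs[s], s))


# msC is idle but has no credentials, so it is never considered.
load = {"msA": 5, "msB": 2, "msC": 0}
print(choose_restore_server(load, ["msA", "msB"]))
```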
Replacing the PureDisk Deduplication Option with Media Server Deduplication on the same host
The PureDisk Deduplication Option (PDDO) provides deduplication of NetBackup backups for NetBackup release 6.5. The destination storage for PDDO is a PureDisk storage pool. The PureDisk agent that performs the deduplication is installed from the PureDisk software distribution, not from the NetBackup distribution. PDDO is not the same as integrated NetBackup deduplication.
You can upgrade a NetBackup media server that hosts a PDDO agent to NetBackup 7.0 and then use that server for integrated NetBackup deduplication. The storage can remain the PureDisk storage pool, and NetBackup maintains access to all of the valid backup images in the PureDisk storage pool. If you perform this procedure, the NetBackup PureDisk plug-in replaces the PureDisk agent on the media server. The NetBackup PureDisk plug-in can deduplicate data for either integrated NetBackup deduplication or for a PureDisk storage pool. The PDDO agent can deduplicate data only for a PureDisk storage pool.
Note: To use the NetBackup PureDisk plug-in with a PureDisk storage pool, the PureDisk storage pool must be part of a PureDisk 6.6 or later environment.
Planning your deployment Migrating from PureDisk to the NetBackup Media Server Deduplication option
Deactivate all backup policies that use the host: Deactivate the policies to ensure that no activity occurs on the host. See the NetBackup Administrator's Guide for UNIX and Linux, Volume I. See the NetBackup Administrator's Guide for Windows, Volume I.
Remove the PDDO plug-in: NetBackup deduplication components cannot reside on the same host as a PureDisk Deduplication Option (PDDO) agent. Therefore, remove the PDDO agent from the host. See the NetBackup PureDisk Deduplication Option Guide.
Upgrade the media server to 7.0 or later: If the media server runs a version of NetBackup earlier than 7.0, upgrade that server to NetBackup 7.0 or later. See the NetBackup Installation Guide for UNIX and Linux. See the NetBackup Installation Guide for Windows.
Configure the host: In the Storage Server Configuration Wizard, select PureDisk Deduplication Pool and enter the name of the Storage Pool Authority. See Configuring a deduplication storage server on page 63.
Activate your backup policies: See the NetBackup Administrator's Guide for UNIX and Linux, Volume I. See the NetBackup Administrator's Guide for Windows, Volume I.
Install and configure NetBackup Configure NetBackup deduplication Redirect your backup jobs
Redirect your backup jobs to the NetBackup media server deduplication pool. See the NetBackup Administrator's Guide for UNIX and Linux, Volume I. See the NetBackup Administrator's Guide for Windows, Volume I.
Uninstall PureDisk
After the PureDisk backup images expire, uninstall PureDisk. See your NetBackup PureDisk documentation.
After all of the backup images that are associated with the storage expire, repurpose that storage. If it is disk storage, you cannot add it to an existing media server deduplication pool. You can use it as storage for another, new deduplication node.
See Migrating from PureDisk to the NetBackup Media Server Deduplication option on page 52.
Chapter
About provisioning the storage
About deduplication storage requirements
About deduplication storage capacity
About the deduplication storage paths
About adding additional storage
Minimum performance
The storage must be configured and operational before you can configure deduplication in NetBackup. NetBackup requires exclusive use of the disk resources. If the storage is used for purposes other than backups, NetBackup cannot manage disk pool capacity or manage storage lifecycle policies correctly. Therefore, NetBackup must be the only entity that uses the storage. Local disk storage may leave you vulnerable in a disaster. SAN attached disk can be remounted at a newly provisioned server with the same name.
Use more than one media server deduplication node. Use a PureDisk storage pool as the storage destination. A PureDisk storage pool provides larger storage capacity; PureDisk 6.6 supports up to 100 TB of deduplicated data. It also provides global deduplication. See About the deduplication storage destination on page 20.
Only one deduplication storage path can exist on a media server. You cannot add another storage path to increase capacity beyond 32 TB.
Because the storage requires a directory path, do not use only a root node (/) or drive letter (G:\) as the storage path. You also can specify a different location for the deduplication database. The database path is the directory in which NetBackup stores and maintains the structure of the stored deduplicated data. For performance optimization, Symantec recommends that you use a separate disk, volume, partition, or spindle for the deduplication database. If the directory or directories do not exist, NetBackup creates them and populates them with the necessary subdirectory structure. If the directory or directories exist, NetBackup populates them with the necessary subdirectory structure. The path names must use ASCII characters only. The NetBackup Media Server Deduplication Option does not support NFS mounted file systems. Caution: You cannot change the paths after NetBackup configures the deduplication storage server. Therefore, carefully decide during the planning phase where and how you want the deduplicated backup data stored. See About deduplication storage requirements on page 55.
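Two of the path rules above (a directory is required, and path names must be ASCII only) can be checked before you run the configuration wizard. The following Python sketch is one possible pre-check. The function name and example paths are hypothetical, and the sketch does not detect NFS mounted file systems, which the text also rules out.

```python
import re


def check_storage_path(path):
    """Return a list of problems with a proposed deduplication storage path."""
    problems = []
    if not path:
        problems.append("path is empty")
    if not path.isascii():
        problems.append("path must use ASCII characters only")
    # A bare root node ("/") or a bare drive letter ("G:" or "G:\") has no
    # directory component, so it is not acceptable as a storage path.
    if path == "/" or re.fullmatch(r"[A-Za-z]:\\?", path):
        problems.append("path must name a directory, not only a root node or drive letter")
    return problems


for candidate in ["/", "G:\\", "/msdp/data", "G:\\msdp\\data"]:
    print(candidate, check_storage_path(candidate))
```

Because the paths cannot be changed after NetBackup configures the storage server, a check like this is cheapest when run during the planning phase.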
Chapter
Licensing deduplication
This chapter includes the following topics:
About licensing deduplication
About the deduplication license key
Licensing NetBackup deduplication
NetBackup supports deduplication on specific 64-bit host operating systems. If you intend to upgrade an existing media server and use it for deduplication, that host must be supported. For the supported systems, see the NetBackup Release Notes. NetBackup deduplication components cannot reside on the same host as a PureDisk Deduplication Option agent. To use a PDDO agent host for NetBackup deduplication, first remove the PDDO agent from that host. See the NetBackup PureDisk Deduplication Option (PDDO) Guide. Then, upgrade that host to NetBackup 7.0 or later. Finally, configure that host as a deduplication storage server or as a load balancing server.
1. On the Help menu of the NetBackup Administration Console, select License Keys.
2. In the NetBackup License Keys dialog box, click New.
3. In the Add a New License Key dialog box, enter the license key and click Add or OK.
4. Click Close.
5. Restart all the NetBackup services and daemons.
Chapter
Configuring deduplication
This chapter includes the following topics:
Configuring NetBackup deduplication
Configuring a deduplication storage server
About deduplication pools
Configuring a deduplication pool
Configuring a deduplication storage unit
Enabling client deduplication
Enabling deduplication encryption
About backup policy configuration
Configuring optimized synthetic backups for deduplication
Configuring optimized duplication copy behavior
Configuring optimized duplication of deduplicated data
Throttling optimized duplication traffic
Adding a NetBackup load balancing server
About the deduplication pd.conf file
Editing the deduplication pd.conf file
About the deduplication storage server configuration file
Getting the deduplication storage server configuration
Editing a deduplication storage server configuration file
Setting the deduplication storage server configuration
About the deduplication host configuration file
Deleting a deduplication host configuration file
Resetting the deduplication registry
Configuring deduplication log file timestamps on Windows
Configuring a replication target for MSDP duplication to another domain
Setting NetBackup configuration options by using bpsetconfig
Reconfiguring the deduplication storage server and storage paths
Optionally, enable client-side deduplication: See Enabling client deduplication on page 70.
Optionally, enable compression and encryption: See Enabling deduplication encryption on page 70.
Optionally, configure optimized synthetic backups: See Configuring optimized synthetic backups for deduplication on page 72.
Optionally, configure optimized duplication copy: See Configuring optimized duplication of deduplicated data on page 75. See Configuring optimized duplication copy behavior on page 73. See Throttling optimized duplication traffic on page 78.
Optionally, specify advanced deduplication settings: See About the deduplication pd.conf file on page 80. See Editing the deduplication pd.conf file on page 83. See NetBackup deduplication pd.conf file settings on page 80.
The type of storage server. For NetBackup media server deduplication, select Media Server Deduplication Pool for the type of disk storage. For a PureDisk deduplication pool, select PureDisk Deduplication Pool for the type of disk storage. The credentials for the deduplication engine. See About NetBackup Deduplication Engine credentials on page 28. The storage paths. See About the deduplication storage paths on page 56.
The network interface. See About the network interface for deduplication on page 28. The load balancing servers, if any. See About deduplication servers on page 22.
When you create the storage server, the wizard lets you create a disk pool and storage unit also. To configure a deduplication storage server by using the wizard
1. In the NetBackup Administration Console, expand Media and Device Management > Configure Disk Storage Servers.
2. Follow the wizard screens to configure a deduplication storage server.
3. After NetBackup creates the deduplication storage server, you can click Next to continue to the Disk Pool Configuration Wizard.
The type of disk pool (PureDisk). The NetBackup deduplication storage server to query for the disk storage to use for the pool. The disk volume to include in the pool. NetBackup exposes the storage as a single volume. The disk pool properties. See Media server deduplication pool properties on page 65.
Symantec recommends that disk pool names be unique across your enterprise. To create a NetBackup disk pool
1. In the NetBackup Administration Console, select the Media and Device Management node.
2. From the list of wizards in the Details pane, click Configure Disk Pool and follow the wizard instructions. For help, see the wizard help.
After NetBackup creates the deduplication pool, you have the option to create a storage unit that uses the pool.
Disk volume
Update Replication
Query the storage server for its replication capabilities. If the capabilities are different than the current NetBackup configuration, update the configuration. The amount of space available in the disk pool. The total raw size of the storage in the disk pool. A comment that is associated with the disk pool.
The default is 98%.
Low water mark: The Low water mark has no effect on the PureDiskVolume.
Limit I/O streams: Select to limit the number of read and write streams (that is, jobs) for each volume in the disk pool. A job may read backup images or write backup images. By default, there is no limit. If you select this property, also configure the number of streams to allow per volume. When the limit is reached, NetBackup chooses another volume for write operations, if available. If not available, NetBackup queues jobs until a volume is available. Too many streams may degrade performance because of disk thrashing. Disk thrashing is excessive swapping of data between RAM and a hard disk drive. Fewer streams can improve throughput, which may increase the number of jobs that complete in a specific time period.
per volume: Select or enter the number of read and write streams to allow per volume. Many factors affect the optimal number of streams. Factors include but are not limited to disk speed, CPU speed, and the amount of memory.
1. In the NetBackup Administration Console, expand NetBackup Management > Storage > Storage Units.
2. On the Actions menu, select New > Storage Unit.
3. Complete the fields in the New Storage Unit dialog box. For a storage unit for optimized duplication destination, select Only use the following media servers. Then select the media servers that are common between the two deduplication nodes.
See Deduplication storage unit properties on page 67. See Deduplication storage unit recommendations on page 68.
Storage unit name: A unique name for the new storage unit. The name can describe the type of storage. The storage unit name is the name used to specify a storage unit for policies and schedules. The storage unit name cannot be changed after creation.
Storage unit type: Select Disk as the storage unit type.
Disk type: Select PureDisk for the disk type for a media server deduplication pool, a PureDisk deduplication pool, or a PureDisk Deduplication Option storage pool.
Disk pool: Select the disk pool that contains the storage for this storage unit. All disk pools of the specified Disk type appear in the Disk pool list. If no disk pools are configured, no disk pools appear in the list.
NetBackup selects the media server to use when the policy runs.
Maximum fragment size: For normal backups, NetBackup breaks each backup image into fragments so it does not exceed the maximum file size that the file system allows. You can enter a value from 20 MB to 51200 MB.
The Maximum concurrent jobs setting specifies the maximum number of jobs that NetBackup can send to a disk storage unit at one time. (Default: one job. The job count can range from 0 to 256.) This setting corresponds to the Maximum concurrent write drives setting for a Media Manager storage unit. NetBackup queues jobs until the storage unit is available. If three backup jobs are scheduled and Maximum concurrent jobs is set to two, NetBackup starts the first two jobs and queues the third job. If a job contains multiple copies, each copy applies toward the Maximum concurrent jobs count. Maximum concurrent jobs controls the traffic for backup and duplication jobs but not restore jobs. The count applies to all servers in the storage unit, not per server. If you select multiple media servers in the storage unit and 1 for Maximum concurrent jobs, only one job runs at a time. The number to enter depends on the available disk space and the server's ability to run multiple backup processes.
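The queueing behavior of Maximum concurrent jobs can be illustrated with a small Python sketch. It reproduces the three-job example from the text and counts each copy toward the limit. The function name, the job names, and the first-come, first-served ordering are assumptions for illustration; the text does not describe how NetBackup orders its queue.

```python
def schedule_jobs(jobs, max_concurrent_jobs):
    """Split jobs into (started, queued) for one storage unit.

    jobs: list of (job_name, number_of_copies); each copy counts toward
    the Maximum concurrent jobs value.
    Assumption: once one job is queued, every later job queues behind it.
    """
    started, queued = [], []
    used = 0
    for name, copies in jobs:
        if not queued and used + copies <= max_concurrent_jobs:
            started.append(name)
            used += copies
        else:
            queued.append(name)
    return started, queued


# Three single-copy backup jobs with Maximum concurrent jobs set to two.
print(schedule_jobs([("job1", 1), ("job2", 1), ("job3", 1)], 2))
# A two-copy job uses the whole allowance by itself.
print(schedule_jobs([("job1", 2), ("job2", 1)], 2))
```

Because the count applies to the storage unit as a whole, adding media servers to the storage unit does not raise the number of jobs that start.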
Configure the media servers for NetBackup deduplication and configure the storage. Configure a disk pool. Configure a storage unit for your most important clients (such as STU-GOLD). Select the disk pool. Select Only use the following media servers. Select two media servers to use for your important backups. Create a backup policy for the 100 important clients and select the STU-GOLD storage unit. The media servers that are specified in the storage unit move the client data to the deduplication storage server. Configure another storage unit (such as STU-SILVER). Select the same disk pool. Select Only use the following media servers. Select the other two media servers. Configure a backup policy for the 500 regular clients and select the STU-SILVER storage unit. The media servers that are specified in the storage unit move the client data to the deduplication storage server.
The storage unit settings route backup traffic to the desired data movers.
Note: NetBackup uses storage units for media server selection for write activity (backups and duplications) only. For restores, NetBackup chooses among all media servers that can access the disk pool.
For example, two storage units use the same set of media servers. One of the storage units (STU-GOLD) has a higher Maximum concurrent jobs setting than the other (STU-SILVER). More client backups occur for the storage unit with the higher Maximum concurrent jobs setting.
1. In the NetBackup Administration Console, expand NetBackup Management > Host Properties > Master Servers.
2. In the details pane, select the master server.
3. On the Actions menu, select Properties.
4. On the Host Properties General tab, add the clients that use client direct to the Clients list.
5. Select one of the following Deduplication Location options:
Always use the media server disables client deduplication. By default, all clients are configured with the Always use the media server option.
Prefer to use client-side deduplication uses client deduplication if the PureDisk plug-in is active on the client. If it is not active, a normal backup occurs; client deduplication does not occur.
Always use client-side deduplication uses client deduplication. If the deduplication backup job fails, NetBackup retries the job.
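The three Deduplication Location options reduce to a small decision rule. The following Python sketch restates that rule; the constant names, the function name, and the returned labels are hypothetical and exist only to make the behavior explicit.

```python
ALWAYS_MEDIA_SERVER = "Always use the media server"
PREFER_CLIENT = "Prefer to use client-side deduplication"
ALWAYS_CLIENT = "Always use client-side deduplication"


def deduplication_host(option, plugin_active_on_client):
    """Return where deduplication happens for a backup: 'client' or 'media server'."""
    if option == ALWAYS_MEDIA_SERVER:
        # Client deduplication is disabled (the default for all clients).
        return "media server"
    if option == PREFER_CLIENT:
        # Falls back to a normal backup when the plug-in is not active.
        return "client" if plugin_active_on_client else "media server"
    if option == ALWAYS_CLIENT:
        # No fallback: a failed job is retried rather than rerouted.
        return "client"
    raise ValueError(f"unknown Deduplication Location option: {option!r}")


print(deduplication_host(PREFER_CLIENT, plugin_active_on_client=False))
```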
You can override the Prefer to use client-side deduplication or Always use client-side deduplication host property in the backup policies. See Disable client-side deduplication in the NetBackup Administrator's Guide for UNIX and Linux, Volume I. See Disable client-side deduplication in the NetBackup Administrator's Guide for Windows, Volume I.
You can enable encryption on all hosts that deduplicate their own data without configuring them individually.
Use this procedure if you want all of your clients that deduplicate their own data to encrypt that data. See To enable encryption on all hosts on page 71.
You can enable encryption on individual hosts. Use this procedure to enable compression or encryption on the storage server, on a load balancing server, or on a client that deduplicates its own data. See To enable encryption on a single host on page 71.
See About deduplication encryption on page 30. To enable encryption on all hosts
On the storage server, open the contentrouter.cfg file in a text editor; it resides in the following directory:
storage_path/etc/puredisk/contentrouter.cfg
Add agent_crypt to the ServerOptions line of the file. The following line is an example:
ServerOptions=fast,verify_data_read,agent_crypt
If you use load balancing servers, make the same edits to the contentrouter.cfg files on those hosts.
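If you need to make the same edit on several hosts, the change to the ServerOptions line can be scripted. The following Python sketch operates on the text of a contentrouter.cfg file and appends agent_crypt only if it is not already present. The function name is hypothetical, and the sketch assumes that the line begins with ServerOptions= exactly as in the example above.

```python
def add_server_option(cfg_text, option="agent_crypt"):
    """Return cfg_text with option appended to the ServerOptions line."""
    out = []
    for line in cfg_text.splitlines():
        if line.startswith("ServerOptions="):
            key, _, value = line.partition("=")
            options = [o for o in value.split(",") if o]
            if option not in options:
                options.append(option)
            line = f"{key}={','.join(options)}"
        out.append(line)
    return "\n".join(out) + "\n"


before = "ServerOptions=fast,verify_data_read\n"
print(add_server_option(before), end="")
```

The function is idempotent, so running it again on an already edited file leaves the line unchanged. Keep a copy of the original file before you write the result back.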
1. Use a text editor to open the pd.conf file on the host. The pd.conf file resides in the following directories:
2. For the line in the file that contains ENCRYPTION, remove the pound character (#) in column 1 from that line.
3. In that line, replace the 0 (zero) with a 1. Note: The spaces to the left and right of the equal sign (=) in the file are significant. Ensure that the space characters appear in the file after you edit the file.
4. Ensure that the LOCAL_SETTINGS parameter is set to 1. If LOCAL_SETTINGS is 0 (zero) and the ENCRYPTION setting on the storage server is 0, the client setting does not override the server setting. Consequently, the data is not encrypted on the client host.
5. Save and close the file.
6. If the host is the storage server or a load balancing server, restart the NetBackup Remote Manager and Monitor Service (nbrmms) on the host.
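The pd.conf edits in this procedure (remove the # in column 1, change 0 to 1, keep the spaces around the equal sign) can be expressed as a small text transformation. The following Python sketch assumes that the setting appears on its own line in the form `#ENCRYPTION = 0`; that line format and the function name are assumptions, so compare them against your own pd.conf file before you rely on the sketch.

```python
import re


def enable_pd_conf_setting(conf_text, name="ENCRYPTION"):
    """Uncomment '<name> = 0' and set it to 1, keeping the spaces around '='."""
    # Assumed line format: optional '#' in column 1, then 'NAME = 0'.
    pattern = re.compile(rf"^#?({re.escape(name)}) = 0\s*$")
    out = []
    for line in conf_text.splitlines():
        match = pattern.match(line)
        out.append(f"{match.group(1)} = 1" if match else line)
    return "\n".join(out) + "\n"


sample = "#ENCRYPTION = 0\n#LOCAL_SETTINGS = 0\n"
edited = enable_pd_conf_setting(sample, "ENCRYPTION")
edited = enable_pd_conf_setting(edited, "LOCAL_SETTINGS")
print(edited, end="")
```

The sketch only transforms text. Saving the file and restarting nbrmms on a storage server or load balancing server remain separate steps.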
Set the OptimizedImage attribute on your existing deduplication pools. (Any deduplication pools that you create after you set the storage server attribute inherit the new functionality.) See Setting media server deduplication pool attributes on page 113.
Table 5-4
Task
Configure a Standard or MS-Windows backup policy. Select the Synthetic backup attribute on the Schedule Attribute tab of the backup policy. See the administrator's guide for your operating system: NetBackup Administrator's Guide for UNIX and Linux, Volume I. NetBackup Administrator's Guide for Windows, Volume I.
Storage lifecycle policy retry wait period: If the optimized duplication job is configured in a storage lifecycle policy and the job fails, NetBackup retries the job three times. If the job is unsuccessful after three tries, NetBackup waits two hours and then retries the job. You can change the wait period.
Caution: These settings affect all optimized duplication jobs; they are not limited to optimized duplication to a Media Server Deduplication Pool or a PureDisk Deduplication Pool. To configure optimized duplication failover
See Setting NetBackup configuration options by using bpsetconfig on page 92. Alternatively on UNIX systems, add the entry to the bp.conf file on the NetBackup master server. To configure the number of duplication attempts
On the master server, create a file named OPT_DUP_BUSY_RETRY_LIMIT that contains an integer that specifies the number of times to retry the job before NetBackup fails the job. The file must reside in the following directory (depending on the operating system):
Change the wait period for retries by adding an IMAGE_EXTENDED_RETRY_PERIOD_IN_HOURS entry to the NetBackup LIFECYCLE_PARAMETERS file. The default for this value is two hours. For example, the following entry configures NetBackup to wait four hours before NetBackup tries the job again:
IMAGE_EXTENDED_RETRY_PERIOD_IN_HOURS 4
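Both retry settings are plain text, so they are easy to generate from a script. The following Python sketch writes an OPT_DUP_BUSY_RETRY_LIMIT file into a directory that the caller supplies and formats an IMAGE_EXTENDED_RETRY_PERIOD_IN_HOURS entry. The function names are hypothetical, the target directory is deliberately left to the caller because it depends on the operating system, and the trailing newline in the file is an assumption.

```python
import tempfile
from pathlib import Path


def write_retry_limit(directory, attempts):
    """Create OPT_DUP_BUSY_RETRY_LIMIT in directory, containing one integer."""
    if attempts < 0:
        raise ValueError("the number of attempts cannot be negative")
    path = Path(directory) / "OPT_DUP_BUSY_RETRY_LIMIT"
    path.write_text(f"{int(attempts)}\n")
    return path


def retry_period_entry(hours):
    """Format the wait-period entry for the LIFECYCLE_PARAMETERS file."""
    return f"IMAGE_EXTENDED_RETRY_PERIOD_IN_HOURS {int(hours)}"


# Demonstration in a temporary directory rather than a real NetBackup path.
with tempfile.TemporaryDirectory() as scratch:
    created = write_retry_limit(scratch, 5)
    print(created.name, created.read_text().strip())
print(retry_period_entry(4))
```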
Step 1 Configure the storage servers: One server must be common between the source storage and the destination storage. Which you choose depends on whether you want a push or a pull configuration. See About the media servers for optimized MSDP duplication within the same domain on page 33. For a push configuration, configure the common server as a load balancing server for the storage server for your normal backups. For a pull configuration, configure the common server as a load balancing server for the storage server for the copies at your remote site. Alternatively, you can add a server later to either environment. (A server becomes a load balancing server when you select it in the storage unit for the deduplication pool.) See Optimized MSDP duplication within the same domain requirements on page 32. See Configuring a deduplication storage server on page 63.
Step 2 Configure the deduplication pools: If you did not configure the deduplication pools when you configured the storage servers, use the Disk Pool Configuration Wizard to configure them. See Configuring a deduplication pool on page 64.
Step 3 Configure the backup storage unit: In the storage unit for your backups, do the following: For the Disk type, select PureDisk.
For the Disk pool, select one of the following: If you back up to integrated NetBackup deduplication, select your Media Server Deduplication Pool. If you back up to a PureDisk environment, select the PureDisk Deduplication Pool.
If you use a pull configuration, do not select the common media server in the backup storage unit. If you do, NetBackup uses it to deduplicate backup data. (That is, unless you want to use it for a load balancing server for the source deduplication node.) See Configuring a deduplication storage unit on page 67.
For the Disk type, select PureDisk. For the Disk pool, the destination can be a Media Server Deduplication Pool or a PureDisk Deduplication Pool.
Step 6 Configure a storage lifecycle policy for the duplication: Configure a storage lifecycle policy only if you want to use one to duplicate images. The storage lifecycle policy manages both the backup jobs and the duplication jobs. Configure the lifecycle policy in the deduplication environment that performs your normal backups. Do not configure it in the environment that contains the copies. When you configure the storage lifecycle policy, do the following: For the Backup destination, select the storage unit that is the target of your backups. That storage unit may use a Media Server Deduplication Pool or a PureDisk Deduplication Pool. These backups are the primary backup copies; they are the source images for the duplication operation. For the Duplication destination, select the storage unit for the destination deduplication pool. That pool may be a Media Server Deduplication Pool or a PureDisk Deduplication Pool. If the backup destination is a PureDisk Deduplication Pool, the duplication destination also must be a PureDisk Deduplication Pool.
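The destination rule for the storage lifecycle policy can be stated as a short check. The following Python sketch encodes only the constraint given above: a PureDisk Deduplication Pool backup destination requires a PureDisk Deduplication Pool duplication destination. The constant and function names are hypothetical.

```python
MSDP = "Media Server Deduplication Pool"
PDDP = "PureDisk Deduplication Pool"


def valid_duplication_pair(backup_pool_type, duplication_pool_type):
    """Return True if the pair of pool types is allowed for optimized duplication."""
    for pool_type in (backup_pool_type, duplication_pool_type):
        if pool_type not in (MSDP, PDDP):
            raise ValueError(f"unknown pool type: {pool_type!r}")
    if backup_pool_type == PDDP:
        # A PureDisk source can only duplicate to another PureDisk pool.
        return duplication_pool_type == PDDP
    # A Media Server Deduplication Pool source may target either pool type.
    return True


for source in (MSDP, PDDP):
    for destination in (MSDP, PDDP):
        print(source, "->", destination, valid_duplication_pair(source, destination))
```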
See the NetBackup Administrators Guide for UNIX and Linux or the NetBackup Administrators Guide for Windows. Step 7 Configure a backup policy Configure a policy to back up your clients. Configure the backup policy in the deduplication environment that performs your normal backups. Do not configure it in the environment that contains the copies. If you use a storage lifecycle policy to manage the backup job and the duplication job: Select that storage lifecycle policy in the Policy storage field of the Policy Attributes tab. If you do not use a storage lifecycle policy to manage the backup job and the duplication job: Select the storage unit for the Media Server Deduplication Pool that contains your normal backups. These backups are the primary backup copies.
See the NetBackup Vault Administrator's Guide.
Step 9 Duplicate by using the bpduplicate command
Use the NetBackup bpduplicate command to copy images manually. If you use a storage lifecycle policy or NetBackup Vault for optimized duplication, you do not have to use the bpduplicate command.
Duplicate from the source storage to the destination storage. The destination storage may be a Media Server Deduplication Pool or a PureDisk Deduplication Pool.
See NetBackup Commands Reference Guide.
See About deduplication servers on page 22.
To add a load balancing server
1  In the NetBackup Administration Console, expand Media and Device Management > Credentials > Storage Server.
2  Select the deduplication storage server.
3  On the Edit menu, select Change.
4  In the Change Storage Server dialog box, select the Media Servers tab.
5  Select the media server or servers that you want to use as a load balancing server. It must be a supported host. The media servers that are checked are configured as load balancing servers.
6  Click OK.
7  For all storage units in which Only use the following media servers is configured, ensure that the new load balancing server is selected.
BACKUPRESTORERANGE
Default value: N/A
Possible values: Classless Inter-Domain Routing format or comma-separated list of IP addresses
Action: Specifies the IP address or range of addresses of the local network interface card (NIC) for backups and restores.
BANDWIDTH_LIMIT
Possible values: 0 (no limit) to the practical system limit, in KBs/sec
Action: Determines the maximum bandwidth that is allowed when backing up or restoring data between the media server and the deduplication pool. The value is specified in KBytes/second. The default is no limit.
COMPRESSION
Possible values: 0 (off) or 1 (on)
Action: Specifies whether you want compression. By default, files are not compressed. If you want compression, change the value to 1. See About deduplication compression on page 29.
DONT_SEGMENT_TYPES
Default value: N/A
ENCRYPTION
Possible values: 0 (off) or 1 (on)
Action: Specifies whether to encrypt the data. By default, files are not encrypted. If you want encryption, change the value to 1. If you set this parameter to 1 on all hosts, the data is encrypted during transfer and on the storage. See About deduplication encryption on page 30.
LOCAL_SETTINGS
Possible values: 0 (allow override) or 1 (always use local settings)
Action: Specifies whether to allow the pd.conf settings of the deduplication storage server to override the settings in the local pd.conf file.
Default value: 0
Action: Specifies the amount of information that is written to the log file. The range is from 0 to 10, with 10 being the most logging.
OPTDUP_BANDWIDTH
Possible values: 0 (no limit) to the practical system limit, in KBs/sec
Action: Determines the maximum bandwidth that is allowed for optimized duplication. The value is specified in KBytes/second.
OPTDUP_COMPRESSION
Default value: 1
Possible values: 0 (off) or 1 (on)
Action: Specifies whether to compress optimized duplication data. By default, files are not compressed. If you want compression, change the value to 1. See About deduplication compression on page 29.
OPTDUP_ENCRYPTION
Default value: 0
Possible values: 0 (off) or 1 (on)
Action: Specifies whether to encrypt the optimized duplication data. By default, files are not encrypted. If you want encryption, change the value to 1. If you set this parameter to 1 on all hosts, the data is encrypted during transfer and on the storage. See About deduplication encryption on page 30.
OPTDUP_TIMEOUT
Default value: N/A
Action: Specifies the number of minutes before the optimized duplication times out.
Default value: 128
Possible values: 32 to 16384 KBs, increments of 32 only
Action: The file segment size in kilobytes. The value must be a multiple of 32.
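You may want to check candidate values against the constraints in the table before you edit the pd.conf file. The following Python sketch is illustrative only; it is not part of NetBackup, and the function names are hypothetical. It assumes that a BACKUPRESTORERANGE value is either CIDR notation or a comma-separated list of addresses, and that a segment size is valid when it is a multiple of 32 between 32 and 16384.

```python
import ipaddress


def valid_backuprestorerange(value):
    """Return True if every comma-separated item parses as an address or CIDR range."""
    try:
        for item in value.split(","):
            # strict=False accepts a host address with a prefix, e.g. 10.0.0.5/24
            ipaddress.ip_network(item.strip(), strict=False)
    except ValueError:
        return False
    return True


def valid_segment_size(kilobytes):
    """Return True for 32..16384 KB in increments of 32 (the range the table gives)."""
    return 32 <= kilobytes <= 16384 and kilobytes % 32 == 0


print(valid_backuprestorerange("192.168.10.0/24"))
print(valid_segment_size(128))
```

The sketch only validates format. It does not confirm that the addresses belong to a NIC on the host.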
1  Use a text editor to open the pd.conf file. The pd.conf file resides in the following directories:
(UNIX) /usr/openv/lib/ost-plugins/
(Windows) install_path\Veritas\NetBackup\bin\ost-plugins
2  To activate a setting, remove the pound character (#) in column 1 from each line that you want to edit.
3  To change a setting, specify a new value.
Note: The spaces to the left and right of the equal sign (=) in the file are significant. Ensure that the space characters appear in the file after you edit the file.
4  Save and close the file.
5  Restart the NetBackup Remote Manager and Monitor Service (nbrmms) on the host.
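The activation and change steps can also be scripted. The following Python sketch is a minimal illustration, not a NetBackup utility. The function name and the sample file contents are hypothetical, and the sketch assumes that each setting occupies one line of the form NAME = value with an optional pound character in column 1. It writes one space on each side of the equal sign, as the note requires.

```python
import re


def activate_setting(text, name, value):
    """Uncomment the line for `name` (if commented) and set it to `value`."""
    # Match the setting whether or not a pound character is in column 1.
    pattern = re.compile(r"^#?\s*" + re.escape(name) + r"\s*=.*$")
    lines = []
    for line in text.splitlines():
        if pattern.match(line):
            # Keep exactly one space on each side of the equal sign.
            line = f"{name} = {value}"
        lines.append(line)
    return "\n".join(lines) + "\n"


# Hypothetical pd.conf contents, for illustration only.
sample = "#COMPRESSION = 0\n#ENCRYPTION = 0\n"
print(activate_setting(sample, "COMPRESSION", "1"))
```

The function returns the edited text and leaves writing the file, and restarting nbrmms, to you.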
V7.0 represents the version of the I/O format not the NetBackup release level. The version may differ on your system. If you get the storage server configuration when the server is not configured or is down and unavailable, NetBackup creates a template file. The following is an example of a template configuration file:
V7.0 "storagepath" " " string
V7.0 "spalogin" " " string
V7.0 "spapasswd" " " string
V7.0 "spalogretention" "7" int
V7.0 "verboselevel" "3" int
V7.0 "dbpath" " " string
V7.0 "required_interface" " " string
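Each line of the template file holds a format version, a quoted setting name, a quoted value, and a type. The following Python sketch shows one way to read such lines into a dictionary, for example to review a saved file before you edit it. It is an illustration under that assumed line layout; the function name is hypothetical and the sketch is not a NetBackup tool.

```python
import re

# version, "name", "value", type
LINE_RE = re.compile(r'^(\S+)\s+"([^"]*)"\s+"([^"]*)"\s+(\w+)\s*$')


def parse_config(text):
    """Return {name: value}; values typed as int are converted to integers."""
    settings = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip blank or unrecognized lines
        _version, name, value, kind = match.groups()
        settings[name] = int(value) if kind == "int" else value
    return settings


template = '''V7.0 "storagepath" " " string
V7.0 "spalogin" " " string
V7.0 "spapasswd" " " string
V7.0 "spalogretention" "7" int
V7.0 "verboselevel" "3" int
V7.0 "dbpath" " " string
V7.0 "required_interface" " " string'''

config = parse_config(template)
print(config["spalogretention"])
```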
To use a storage server configuration file for recovery, you must edit the file so that it includes only the information that is required for recovery. See Getting the deduplication storage server configuration on page 85. See Editing a deduplication storage server configuration file on page 86. See Setting the deduplication storage server configuration on page 87.
For sshostname, use the name of the storage server. For file.txt, use a file name that indicates its purpose.
If you get the file when a storage server is not configured or is down and unavailable, NetBackup creates a template file.
V7.0 "replication_target(s)" "none" string
A value for replication_target(s) is required only if you configured optimized duplication. Otherwise, do not edit this line.
V7.0 "spalogin" "username" string
Replace username with the NetBackup Deduplication Engine user ID.
Replace password with the password for the NetBackup Deduplication Engine user ID.
See About the deduplication storage server configuration file on page 84. See Recovering from a deduplication storage server disk failure on page 145. See Recovering from a permanent deduplication storage server failure on page 147. To edit the storage server configuration
If you did not save a storage server configuration file, get a storage server configuration file. See Getting the deduplication storage server configuration on page 85.
Use a text editor to enter, change, or remove values. Remove lines from and add lines to your file until only the required lines (see Table 5-7) are in the configuration file. Enter or change the values between the second set of quotation marks in each line. A template configuration file has a space character (" ") between the second set of quotation marks.
On the master server, run the following command:
UNIX: /usr/openv/netbackup/bin/admincmd/nbdevconfig -setconfig -storage_server sshostname -stype PureDisk -configlist file.txt
For sshostname, use the name of the storage server. For file.txt, use the name of the file that contains the configuration.
The storage_server_name is the fully qualified domain name if that was used to configure the storage server. For example, if the storage server name is DedupeServer.symantecs.org, the configuration file name is DedupeServer.symantecs.org.cfg. The following is the location of the file: UNIX: /usr/openv/lib/ost-plugins Windows: install_path\Veritas\NetBackup\bin\ost-plugins
Delete the file; its location depends on the operating system type, as follows:
UNIX: /usr/openv/lib/ost-plugins
Windows: install_path\Veritas\NetBackup\bin\ost-plugins
Enter the following commands on the storage server to reset the deduplication registry file:
rm /etc/pdregistry.cfg
cp -f /usr/openv/pdde/pdconfigure/cfg/userconfigs/pdregistry.cfg /etc/pdregistry.cfg
HKLM\SOFTWARE\Symantec\PureDisk\Agent\ConfigFilePath HKLM\SOFTWARE\Symantec\PureDisk\Agent\EtcPath
Delete the storage path from the following key in the Windows registry. That is, delete everything after postgresql-8.3 -D in the key.
HKLM\SYSTEM\ControlSet001\Services\postgresql-8.3\ImagePath
For example, in the following example registry key, you would delete the content of the key that is in italic type:
"C:\Program Files\Veritas\pdde\pddb\bin\pg_ctl.exe" runservice -N postgresql-8.3 -D "D:\DedupeStorage\databases\pddb\data" -w
The database path may be the same as the configured storage path.
2  In the line that begins with log_line_prefix, change the value from %%t to %t. (That is, remove one of the percent signs (%).)
3  Save the file.
4  Run the following command:
install_path\Veritas\pdde\pddb\bin\pg_ctl reload -D dbpath\databases\pddb\data
If the command output does not include server signaled, use Windows Computer Management to restart the PostgreSQL Server 8.3 service. See About deduplication logs on page 123.
Configuring deduplication Configuring a replication target for MSDP duplication to another domain
1  In the NetBackup Administration Console, expand Media and Device Management > Credentials > Storage Server.
2  Select the deduplication storage server.
3  On the Edit menu, select Change.
4  In the Change Storage Server dialog box, select the Replication tab.
5  Enter the Storage Server Name.
6  Enter the Username and Password for the NetBackup Deduplication Engine on the target storage server.
7  Click Add.
All targets are considered for duplication, depending on the rules of the storage lifecycle policies that control the duplication.
On the host on which you want to set configuration options, write the current configuration to a file by running the following command:
UNIX: /usr/openv/netbackup/bin/admincmd/bpgetconfig -h hostname > file.txt
Windows: install_path\NetBackup\bin\admincmd\bpgetconfig -h hostname > file.txt
Edit and save the file. You can change the values of the options that are in the file. You can add option and value pairs. Ensure that you understand the values that are allowed and the format of any new options that you add.
Upload the configuration by running the following command:
UNIX: /usr/openv/netbackup/bin/admincmd/bpsetconfig -h hostname file.txt
Windows: install_path\NetBackup\bin\admincmd\bpsetconfig -h hostname file.txt
Configuring deduplication Reconfiguring the deduplication storage server and storage paths
made a mistake during configuration, you must manually deactivate the engine and physically delete the storage directory before you can reconfigure deduplication. Similarly, if you want to change the storage path, you must deactivate the engine and delete the storage directory. Common configuration mistakes include the following:
Choosing a media server with an unsupported operating system. All NetBackup media servers appear in the Storage Server Configuration Wizard. Choosing a storage path that does not include a directory name. For example, D:\DedupeData is a valid storage path, but D:\ is not.
Two aspects to the configuration exist: the record of the deduplication storage in the EMM database and the physical presence of the storage on disk (the populated storage directory). Deleting the deduplication storage server does not alter the contents of the storage on physical disk. To protect against inadvertent data loss, NetBackup does not automatically delete the storage when you delete the storage server.
Warning: Deleting valid backup images may cause data loss.
Table 5-8 Reconfigure media server deduplication
Task: Ensure that no deduplication activity occurs
Procedure: Deactivate all backup policies that use deduplication storage. See the NetBackup Administrator's Guide for UNIX and Linux, Volume I. See the NetBackup Administrator's Guide for Windows, Volume I.
Task: Expire backup images
Procedure: Expire all backup images that reside on the deduplication disk storage. See the NetBackup Administrator's Guide for UNIX and Linux, Volume I. See the NetBackup Administrator's Guide for Windows, Volume I.
Task: Delete the storage units that use the disk pool
Procedure: See the NetBackup Administrator's Guide for UNIX and Linux, Volume I. See the NetBackup Administrator's Guide for Windows, Volume I.
Task: Delete the disk pool
Procedure: See Deleting a deduplication pool on page 115.
Task: Stop the NetBackup services on the storage server
Procedure: See the NetBackup Administrator's Guide for UNIX and Linux, Volume I. See the NetBackup Administrator's Guide for Windows, Volume I.
Task: Delete the storage directory and the database directory
Procedure: Delete the storage directory and database directory (if you configured a database directory).
Task: Reset the deduplication registry
Procedure: See Resetting the deduplication registry on page 89.
Task: Start the NetBackup services on the media server
Procedure: See the NetBackup Administrator's Guide for UNIX and Linux, Volume I. See the NetBackup Administrator's Guide for Windows, Volume I.
Task: Delete the deduplication storage server
Procedure: See Deleting a deduplication storage server on page 106.
Task: Delete the storage server configuration file
Procedure: The storage server and every load balancing server contain a deduplication configuration file. Delete that file from every server that you use for deduplication. See Deleting a deduplication host configuration file on page 88.
Task: Reconfigure
Chapter
Monitoring the deduplication rate
Monitoring deduplication storage capacity and usage
Viewing disk reports
Monitoring deduplication processes
The Deduplication Rate column of the Activity Monitor Jobs tab. The Job Details dialog box. The Detailed Status tab shows detailed information, including the deduplication rate. The information depends on whether it is media server deduplication or client-side deduplication, as follows:
For media server deduplication, the Detailed Status tab shows the deduplication rate on the server that performed the deduplication. The following job details excerpt shows details for a client for which Server_A deduplicated the data (the dedup field shows the deduplication rate):
10/6/2010 10:02:09 AM - Info Server_A(pid=30695) StorageServer=PureDisk:Server_A; Report=PDDO Stats for (Server_A): scanned: 30126998 KB, stream rate: 162.54 MB/sec, CR sent: 1720293 KB, dedup: 94.3%, cache hits: 214717 (94.0%)
The other fields that show deduplication information are highlighted in the example. For the field descriptions, see Table 6-1.
For client-side deduplication jobs, the Detailed Status tab shows two deduplication rates. The first deduplication rate is always for the client data. The second deduplication rate is for the disk image header and True Image Restore information (if applicable). That information is always deduplicated on a server; typically, deduplication rates for that information are zero or very low. The following job details example excerpt shows the two rates:
10/8/2010 11:54:21 PM - Info Server_A(pid=2220) Using OpenStorage client direct to backup from client Client_B to Server_A
10/8/2009 11:58:09 PM - Info Server_A(pid=2220) StorageServer=PureDisk:Server_A; Report=PDDO Stats for (Server_A): scanned: 3423425 KB, stream rate: 200.77 MB/sec, CR sent: 122280 KB, dedup: 96.4%, cache hits: 49672 (98.2%)
10/8/2010 11:58:09 PM - Info Server_A(pid=2220) Using the media server to write NBU data for backup Client_B_1254987197 to Server_A
10/8/2010 11:58:19 PM - Info Server_A(pid=2220) StorageServer=PureDisk:Server_A; Report=PDDO Stats for (Server_A): scanned: 17161 KB, stream rate: 1047.42 MB/sec, CR sent: 17170 KB, dedup: 0.0%, cache hits: 0 (0.0%)
the requested operation was successfully completed(0)
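If you collect job details text for reporting, the statistics fields can be extracted with a regular expression. The following Python sketch is an illustration that assumes the field order and units shown in the excerpts in this section; the function and key names are hypothetical, and actual log lines may differ between releases.

```python
import re

# Matches the statistics portion of a "PDDO Stats" job details line.
STATS_RE = re.compile(
    r"scanned: (?P<scanned_kb>\d+) KB, "
    r"stream rate: (?P<stream_mb_sec>[\d.]+) MB/sec, "
    r"CR sent: (?P<cr_sent_kb>\d+) KB, "
    r"dedup: (?P<dedup_pct>[\d.]+)%, "
    r"cache hits: (?P<cache_hits>\d+) \((?P<cache_hit_pct>[\d.]+)%\)"
)


def parse_pddo_stats(line):
    """Return the statistics as a dict, or None if the line has no stats."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "scanned_kb": int(fields["scanned_kb"]),
        "stream_mb_sec": float(fields["stream_mb_sec"]),
        "cr_sent_kb": int(fields["cr_sent_kb"]),
        "dedup_pct": float(fields["dedup_pct"]),
        "cache_hits": int(fields["cache_hits"]),
        "cache_hit_pct": float(fields["cache_hit_pct"]),
    }


line = (
    "10/6/2010 10:02:09 AM - Info Server_A(pid=30695) "
    "StorageServer=PureDisk:Server_A; Report=PDDO Stats for (Server_A): "
    "scanned: 30126998 KB, stream rate: 162.54 MB/sec, CR sent: 1720293 KB, "
    "dedup: 94.3%, cache hits: 214717 (94.0%)"
)
stats = parse_pddo_stats(line)
print(stats["dedup_pct"])
```

For a client-side deduplication job, apply the function to each statistics line; the first result is the rate for the client data.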
The bpdbjobs command shows the deduplication rate if you configure a COLDREFS entry for DEDUPRATIO in the bp.conf file on the media server on which you run the command. See the NetBackup Administrator's Guide for UNIX and Linux, Volume I.
Many factors affect deduplication performance. See About deduplication performance on page 43.
CR sent
dedup
The percentage of data that was stored already. That data is not stored again.
scanned
The amount of data that the deduplication plug-in scanned.
stream rate
The speed of the scan: The kilobytes of data that are scanned divided by how long the scan takes.
Expired backups may not change the available size and the used size. An expired backup may have no unique data segments. Therefore, the segments remain valid for other backups. NetBackup Deduplication Manager clean-up may not have run yet. The Deduplication Manager performs clean up twice a day. Until it performs clean-up, deleted image fragments remain on disk.
If you use operating system tools to examine storage space usage, their results may not match the usage reported by NetBackup, as follows:
The operating system tools cannot report usage accurately. The storage implementation uses container files. Deleted segments can leave free space in container files, but the container file sizes do not change. If other applications use the storage, NetBackup cannot report usage accurately. NetBackup requires exclusive use of the storage.
Table 6-2 describes the options for monitoring capacity and usage. Table 6-2 Option
Change Storage Server dialog box
The Disk Pools window of the Administration Console displays a value that was stored when NetBackup polled the disk pools. NetBackup polls every 5 minutes; therefore, the value may not be as current as the value that is displayed in the Storage Server window. To display the window, expand Media and Device Management > Devices > Disk Pools.
The Disk Pool Status report displays the state of the disk pool and usage information. See Viewing disk reports on page 100.
License Keys dialog box
The summary of active capacity-based license features in the NetBackup License Keys dialog box. The summary displays the storage capacity for which you are licensed and the capacity used. It does not display the amount of physical storage space. On the Help menu in the NetBackup Administration Console, select License Keys.
View container command
A command that is installed with NetBackup provides a view of storage capacity and usage within the deduplication container files. See About deduplication container files on page 99. See Viewing capacity within deduplication container files on page 100.
The nbdevquery command
The nbdevquery command shows the state of the disk volume and its properties and attributes. It also shows capacity, usage, and percent used. See Determining the deduplication disk volume state on page 116.
NetBackup OpsCenter
The NetBackup OpsCenter also provides information about storage capacity and usage. See the NetBackup OpsCenter Administrator's Guide.
file sizes do not change. Segments are deleted from containers when backup images expire and the NetBackup Deduplication Manager performs clean-up.
On UNIX and Linux systems, the path name of the command is /usr/openv/pdde/pdcr/bin/crcontrol. On Windows systems, the path name of the command is install_path\Veritas\pdde\Crcontrol.exe.
The following is an example of the command usage on a Windows deduplication storage server. The command shows the data store statistics (--dsstat option).
C:\Program Files\Veritas\pdde>Crcontrol.exe --dsstat 1
************ Data Store statistics ************
Data storage      Size   Used  Avail  Use%
                 68.4G  46.4G  22.0G   68%
Number of containers             : 67
Average container size           : 187441541 bytes (178.76MB)
Space allocated for containers   : 12558583274 bytes (11.70GB)
Space used within containers     : 12551984871 bytes (11.69GB)
Space available within containers: 6598403 bytes (6.29MB)
Space needs compaction           : 508432 bytes (0.48MB)
Records marked for compaction    : 3
Active records                   : 95755
Total records                    : 95758
The NetBackup Deduplication Manager periodically compacts the space available inside the container files. Therefore, space within a container is not available as soon as it is free. Various internal parameters control whether a container file is compacted. Although space may be available within a container file, the file may not be eligible for compaction. The NetBackup Deduplication Manager checks for space every 20 seconds. For help with the command options, use the --help option.
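To track container usage over time, you can capture the --dsstat output and extract the numeric fields. The following Python sketch is an illustration that assumes the "label : number" layout of the example output above; the function name is hypothetical, and the layout may differ on your system.

```python
import re


def parse_dsstat(output):
    """Return {label: integer} for each 'label : number' line in the output."""
    stats = {}
    for match in re.finditer(r"^\s*([A-Za-z ]+?)\s*:\s*(\d+)", output, re.MULTILINE):
        stats[match.group(1)] = int(match.group(2))
    return stats


# Values copied from the example output in this section.
sample = """Number of containers : 67
Average container size : 187441541 bytes (178.76MB)
Space allocated for containers : 12558583274 bytes (11.70GB)
Space used within containers : 12551984871 bytes (11.69GB)
Space available within containers: 6598403 bytes (6.29MB)
Space needs compaction : 508432 bytes (0.48MB)
Records marked for compaction : 3
Active records : 95755
Total records : 95758"""

stats = parse_dsstat(sample)
allocated = stats["Space allocated for containers"]
used = stats["Space used within containers"]
# In the example, allocated minus used equals the reported available space.
print(allocated - used)
```

In the example output, the space available within containers is the allocated space minus the used space, and the active records plus the records marked for compaction equal the total records.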
Table 6-3 describes the disk reports available. Table 6-3 Report
Images on Disk
Disk Logs
The Disk Logs report displays the media errors or the informational messages that are recorded in the NetBackup error catalog. The report is a subset of the Media Logs report; it shows only disk-specific columns. Either PureDisk or Symantec Deduplication Engine in the description identifies a deduplication message. (The identifiers are generic because the deduplication engine does not know which application consumes its resources. NetBackup, Symantec Backup Exec, and NetBackup PureDisk are Symantec applications that use deduplication.)
The Disk Storage Unit Status report displays the state of disk storage units in the current NetBackup configuration. For disk pool capacity, see the disk pools window in Media and Device Management > Devices > Disk Pools. Multiple storage units can point to the same disk pool. When the report query is by storage unit, the report counts the capacity of disk pool storage multiple times.
The Disk Pool Status report displays the state of disk pool and usage information.
1  In the NetBackup Administration Console, expand NetBackup Management > Reports > Disk Reports.
2  Select the name of a disk report.
3  In the right pane, select the report settings.
4  Click Run Report.
On Windows systems, NetBackup Deduplication Manager appears in the Activity Monitor Services tab. On UNIX, the NetBackup Deduplication Manager appears as spad in the Administration Console Activity Monitor Daemons tab. The NetBackup bpps command shows the spad process.
The database processes (postgres)
On Windows systems, the postgres database processes appear in the Activity Monitor Processes tab. The NetBackup bpps command shows the postgres processes.
Chapter
Managing deduplication
This chapter includes the following topics:
Managing deduplication servers
Managing NetBackup Deduplication Engine credentials
Managing deduplication pools
Deleting backup images
About maintenance processing
Performing deduplication maintenance manually
Resizing the storage partition
Restoring files at a remote site
Specifying the restore server
1  In the NetBackup Administration Console, expand Media and Device Management > Credentials > Storage Server.
2  Select the deduplication storage server.
3  On the Edit menu, select Change.
4  In the Change Storage Server dialog box, select the Properties tab.
5  For the property to change, select the value in the Value column.
6  Change the value.
7  Click OK.
The following is the command syntax to set a storage server attribute. Run the command on the master server or on the storage server.
nbdevconfig -changests -storage_server storage_server -stype PureDisk -setattribute attribute -media_server media_server
The attribute is the name of the argument that represents the new functionality. For example, OptimizedImage specifies that the environment supports the optimized synthetic backup method.
-media_server media_server
A NetBackup media server that connects to the storage server. The media server queries the storage server for its capabilities. Because the storage server is also a media server, use the storage server name.
Run the following command on the NetBackup master server or on a storage server:
nbdevconfig -changests -storage_server storage_server -stype PureDisk -clearattribute attribute
-storage_server storage_server
The name of the storage server.
-clearattribute attribute
The attribute is the name of the argument that represents the functionality.
1  In the NetBackup Administration Console, expand Media and Device Management > Credentials > Storage Server.
2  On the Edit menu, select Delete.
3  Click Yes in the confirmation dialog box.
See Reconfiguring the deduplication storage server and storage paths on page 92.
Run the following command on the NetBackup master server or a deduplication storage server:
UNIX: /usr/openv/netbackup/bin/admincmd/nbdevquery -liststs -storage_server server_name -stype PureDisk -U
This example output is shortened; more flags may appear in actual output.
1  For every storage unit that specifies the media server in Use one of the following media servers, clear the check box that specifies the media server. This step is not required if the storage unit is configured to use any available media server.
2  In the NetBackup Administration Console, expand Media and Device Management > Credentials > Storage Server.
3  Select the deduplication storage server, then select Edit > Change.
4  In the Change Storage Server dialog box, select the Media Servers tab.
5  Clear the check box of the media server you want to remove.
6  Click OK.
In the NetBackup Administration Console, expand Media and Device Management > Credentials > Storage Server. The All Storage Servers pane shows all configured deduplication storage servers. Deduplication storage servers show PureDisk in the Disk Type column.
The following is the command syntax to view the attributes of a storage server. Run the command on the NetBackup master server or on the deduplication storage server:
UNIX: /usr/openv/netbackup/bin/admincmd/nbdevquery -liststs -storage_server server_name -stype PureDisk -U
This example output is shortened; more flags may appear in actual output.
On the host to which you want to add credentials, run the following command:
UNIX: /usr/openv/volmgr/bin/tpconfig -add -storage_server sshostname -stype PureDisk -sts_user_id UserID -password PassWord
On the load balancing server, run the following command:
UNIX: /usr/openv/volmgr/bin/tpconfig -delete -storage_server sshostname -stype PureDisk -sts_user_id UserID
1  In the NetBackup Administration Console, expand Media and Device Management > Credentials > Storage Server.
2  Select the storage server, then select Edit > Change.
3  In the Change Storage Server dialog box, select the Media Servers tab. The media servers for which credentials are configured are checked.
See Changing the deduplication disk volume state on page 114. See Deleting a deduplication pool on page 115. See Determining the deduplication pool state on page 115. See Determining the deduplication disk volume state on page 116. See Viewing deduplication pools on page 116. See Viewing media server deduplication pool attributes on page 117.
1  In the NetBackup Administration Console, expand Media and Device Management > Devices > Disk Pools.
2  Select the disk pool you want to change in the details pane.
3  On the Edit menu, select Change.
4  In the Change Disk Pool dialog box, change properties. See Media server deduplication pool properties on page 65.
The following is the command syntax to set a deduplication pool attribute. Run the command on the master server or on the storage server.
nbdevconfig -changedp -dp pool_name -stype PureDisk -setattribute attribute
The attribute is the name of the argument that represents the new functionality. For example, OptimizedImage specifies that the environment supports the optimized synthetic backup method.
The following is the command syntax to clear a deduplication pool attribute. Run the command on the master server or on the storage server.
nbdevconfig -changedp -dp pool_name -stype PureDisk -clearattribute attribute
The attribute is the name of the argument that represents the new functionality.
1  In the NetBackup Administration Console, expand Media and Device Management > Device Monitor.
2  Select the Disk Pools tab.
3  Select the disk pool.
4  Select either Actions > Up or Actions > Down.
See About deduplication pools on page 64. See Determining the deduplication pool state on page 115. See Media server deduplication pool properties on page 65. See Configuring a deduplication pool on page 64.
Determine the name of the disk volume. The following command lists all volumes in the specified disk pool:
The nbdevquery and the nbdevconfig commands reside in the following directory:
To display the disk volumes in all disk pools, omit the -dp option.
Change the disk volume state; the following is the command syntax:
nbdevconfig -changestate -stype PureDisk -dp disk_pool_name -dv vol_name -state state
1  In the NetBackup Administration Console, expand Media and Device Management > Devices > Disk Pools.
2  Select a disk pool.
3  On the Edit menu, select Delete.
4  In the Delete Disk Pool dialog box, verify that the disk pool is the one you want to delete and then click OK.
1  In the NetBackup Administration Console, expand Media and Device Management > Device Monitor.
2  Select the Disk Pools tab.
3  The state is displayed in the Status column.
Display the volume state by using the following command:
UNIX: /usr/openv/netbackup/bin/admincmd/nbdevquery -listdv -stype PureDisk -U
In the NetBackup Administration Console, expand Media and Device Management > Devices > Disk Pools.
The following is the command syntax to view the attributes of a deduplication pool. Run the command on the NetBackup master server or on the deduplication storage server:
UNIX: /usr/openv/netbackup/bin/admincmd/nbdevquery -listdp -dp pool_name -stype PureDisk -U
This example output is shortened; more flags may appear in actual output.
Expire all of the images by using the bpexpdate command and the -notimmediate option. The -notimmediate option prevents bpexpdate from calling the nbdelete command, which deletes the image. Without this option, bpexpdate calls nbdelete to delete images. Each call to nbdelete creates a job in the Activity Monitor, allocates resources, and launches processes on the media server.
After you expire the last image, delete all of the images by using the nbdelete command with the -allvolumes option. Only one job is created in the Activity Monitor, fewer resources are allocated, and fewer processes are started on the media servers. The entire process of expiring images and deleting images takes less time.
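The two-phase approach (expire every image without deleting, then delete once) can be sketched as a command plan. The following Python sketch only builds the command lines; it does not run them. The function name and the backup IDs are hypothetical. The -backupid and -d 0 arguments to bpexpdate are assumptions drawn from general bpexpdate usage rather than from this section; confirm them in the NetBackup Commands Reference Guide before use.

```python
def plan_bulk_expiration(backup_ids):
    """Return the commands to expire each image, followed by a single delete."""
    commands = []
    for backup_id in backup_ids:
        # -notimmediate keeps bpexpdate from calling nbdelete for each image.
        # -backupid and "-d 0" (expire now) are assumed arguments; verify them.
        commands.append(
            ["bpexpdate", "-backupid", backup_id, "-d", "0", "-notimmediate"]
        )
    # One nbdelete call removes all expired images in a single job.
    commands.append(["nbdelete", "-allvolumes"])
    return commands


# Hypothetical backup IDs, for illustration only.
plan = plan_bulk_expiration(["Client_B_1254987197", "Client_B_1254990000"])
for command in plan:
    print(" ".join(command))
```

Because nbdelete appears once at the end of the plan, only one deletion job is created regardless of how many images you expire.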
Because maintenance processing occurs automatically, you should not need to invoke those processes manually. However, you may do so. See Performing deduplication maintenance manually on page 119.
Because maintenance processing does not block any other deduplication process, rescheduling should not be necessary. Users cannot change the maintenance process schedules. However, if you must reschedule these processes, contact your Symantec support representative. See About deduplication server requirements on page 24.
Run the control command with the --processqueue option. The following is an example on a Windows system:
install_path\Veritas\pdde\Crcontrol.exe --processqueue
To examine the results, run the control command with the --dsstat 1 option (number 1 not lowercase letter l). The command may run for a long time; if you omit the 1, results return more quickly but they are not as accurate. See Viewing capacity within deduplication container files on page 100.
Run the control command with the -v -m +1,+2 --noreport options. The following is an example on a UNIX system:
/usr/openv/pdde/pdcr/bin/crcollect -v -m +1,+2 --noreport
1  Stop all NetBackup jobs on the storage on which you want to change the disk partition sizes and wait for the jobs to end.
2  Deactivate the media server that hosts the storage server. See the NetBackup Administrator's Guide for UNIX and Linux, Volume I or the NetBackup Administrator's Guide for Windows, Volume I.
3  Stop the NetBackup services on the storage server. Be sure to wait for all services to stop.
4  Use the operating system or disk manager tools to dynamically increase or decrease the deduplication storage area.
5  Restart the NetBackup services.
6  Activate the media server that hosts the storage server. See the NetBackup Administrator's Guide for UNIX and Linux, Volume I or the NetBackup Administrator's Guide for Windows, Volume I.
Use NetBackup Host Properties to specify a Media host override server. All restore jobs for any storage unit on the original backup server use the media server you specify. Specify the same server for the Restore server as for the Original backup server. See Forcing restores to use a specific server in the NetBackup Administrator's Guide for UNIX and Linux, Volume I or the NetBackup Administrator's Guide for Windows, Volume I. This procedure sets the FORCE_RESTORE_MEDIA_SERVER option. Configuration options are stored in the bp.conf file on UNIX systems and the registry on Windows systems.
Create the touch file USE_BACKUP_MEDIA_SERVER_FOR_RESTORE on the NetBackup master server in the following directory:
UNIX: /usr/openv/netbackup/db/config
Windows: install_path\veritas\netbackup\db\config
This global setting always forces restores to the server that did the backup. It applies to all NetBackup restore jobs, not just deduplication restore jobs. If this touch file exists, NetBackup ignores the FORCE_RESTORE_MEDIA_SERVER and FAILOVER_RESTORE_MEDIA_SERVER settings.
Always use a different server. Use NetBackup Host Properties to specify a Media host override server. See the previous explanation about Media host override, except: Specify the different server for the Restore server.
A single restore instance. Use the bprestore command with the -disk_media_server option. Restore jobs for each instance of the command use the media server you specify. See NetBackup Commands Reference Guide.
Chapter
Troubleshooting
This chapter includes the following topics:
About deduplication logs
Troubleshooting installation issues
Troubleshooting configuration issues
Troubleshooting operational issues
Viewing disk errors and events
Deduplication event codes and messages
Description
The client deduplication proxy plug-in on the media server runs under bptm, bpstsinfo, and bpbrm processes. Examine the log files for those processes for proxy plug-in activity. The strings proxy or ProxyServer embedded in the log messages identify proxy server activity. They write log files to the following directories:
For bpstsinfo:
UNIX: /usr/openv/netbackup/logs/admin
UNIX: /usr/openv/netbackup/logs/bpstsinfo
Windows: install_path\Veritas\NetBackup\logs\admin
Windows: install_path\Veritas\NetBackup\logs\stsinfo
The deduplication proxy server nbostpxy on the client writes messages to files in an eponymous directory, as follows: UNIX: /usr/openv/netbackup/logs/nbostpxy Windows: install_path\Veritas\NetBackup\logs\nbostpxy.
The following is the path name of the log file for the deduplication configuration script:
NetBackup creates this log file during the configuration process. If your configuration succeeded, you do not need to examine the log file. The only reason to look at the log file is if the configuration failed. If the configuration process failed after it created and populated the storage directory, this log file identifies when the configuration failed.
Deduplication database
The deduplication database log file (postgresql.log) is in the storage_path/databases/pddb directory. You can configure log parameters. For more information, see the following: http://www.postgresql.org/docs/current/static/runtime-config-logging.html
See Configuring deduplication log file timestamps on Windows on page 90.
The NetBackup Deduplication Engine writes several log files, as follows:
The spoold.log file is the main log file.
The storaged.log file is for queue processing.
A log file for each connection to the engine is stored in a directory in the storage path spoold directory. The following describes the pathname to a log file for a connection:
IP_address/application/TaskName/FirstDigitofSessionID/sessionID-current_time_in_seconds.log
For example, the following is a crcontrol connection log pathname on a UNIX system:
/storage_path/log/spoold/127.0.0.1/crcontrol/Control/2/2916742631-1257956402.log
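The connection log pathname pattern can be taken apart programmatically when you need to locate the log for a given session. The following is a minimal Python sketch, not a NetBackup tool; the function name and the returned field names are hypothetical and follow only the pattern described above.

```python
from pathlib import PurePosixPath


def parse_connection_log_path(path):
    """Split a connection log pathname into the fields of the documented pattern.

    Pattern: IP_address/application/TaskName/FirstDigitofSessionID/
             sessionID-current_time_in_seconds.log
    The function and key names are illustrative assumptions.
    """
    parts = PurePosixPath(path).parts
    if len(parts) < 5:
        raise ValueError("path is too short to match the documented pattern")
    ip_address, application, task_name, first_digit, file_name = parts[-5:]
    stem = PurePosixPath(file_name).stem  # drops the ".log" suffix
    session_id, _, seconds = stem.partition("-")
    if first_digit != session_id[:1]:
        raise ValueError("directory digit does not match the session ID")
    return {
        "ip_address": ip_address,
        "application": application,
        "task_name": task_name,
        "session_id": session_id,
        "time_in_seconds": int(seconds),
    }


info = parse_connection_log_path(
    "/storage_path/log/spoold/127.0.0.1/crcontrol/Control/2/"
    "2916742631-1257956402.log"
)
print(info["application"], info["session_id"])
```

The sketch checks that the single-digit directory matches the first digit of the session ID, which is the only consistency rule the pattern implies.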
Usually, the only reason to examine these connection log files is if a Symantec support representative asks you to.
A VxUL log file for the events and errors that NetBackup receives from polling. The originator ID for the deduplication engine is 364. See About VxUL logs for deduplication on page 125.
NetBackup Deduplication Manager
The log files are in the /storage_path/log/spad directory. You can set the log level and retention period in the Change Storage Server dialog box Properties tab. See Changing deduplication storage server properties on page 104.
PureDisk plug-in
You can configure the location and name of the log file and the logging level. To do so, edit the DEBUGLOG entry and the LOGLEVEL in the pd.conf file. See About the deduplication pd.conf file on page 80. See Editing the deduplication pd.conf file on page 83.
The messages that begin with a sts_ prefix relate to the interaction with the storage vendor software plug-in. Most interaction occurs on the NetBackup media servers. To view and manage VxUL log files, you must use NetBackup log commands. For information about how to use and manage logs on NetBackup servers, see the NetBackup Troubleshooting Guide. Table 8-2 Activity
NetBackup Deduplication Engine Backups and restores
N/A
The bpbrm backup and restore manager The bpdbm database manager The bpdm disk manager The bptm tape manager for I/O operations
117
Device 111 configuration and monitoring Device 178 configuration and monitoring Device 202 configuration and monitoring Device 230 configuration and monitoring
The Disk Service Manager process that runs in the Enterprise Media Manager (EMM) process.
The storage server interface process that runs in the Remote Manager and Monitor Service. RMMS runs on media servers. The Remote Disk Service Manager interface (RDSM) that runs in the Remote Manager and Monitor Service. RMMS runs on media servers.
Example
RDSM has encountered an STS error: Failed to update storage server ssname, database system error
Diagnosis
The PDDE_initConfig script was invoked, but errors occurred during the storage initialization. First, examine the deduplication configuration script log file for references to the server name. See About deduplication logs on page 123. Second, examine the tpconfig command log file for errors about creating the credentials for the server name. The tpconfig command writes to the standard NetBackup administrative commands log directory.
Diagnosis
Possible root causes: When you configured the storage server, you selected a media server that runs an unsupported operating system. All media servers in your environment appear in the Storage Server Configuration Wizard; be sure to select only a media server that runs a supported operating system. If you used the nbdevconfig command to configure the storage server, you may have typed the host name incorrectly. Also, case matters for the storage server type, so ensure that you use PureDisk for the storage server type.
If you cannot configure a deduplication storage server or load balancing servers, your network environment may not be configured for DNS reverse name lookup. You can edit the hosts file on the media servers that you use for deduplication. Alternatively, you can configure NetBackup so it does not use reverse name lookup. To prohibit reverse host name lookup by using the Administration Console
1. In the NetBackup Administration Console, expand NetBackup Management > Host Properties > Master Servers.
2. In the details pane, select the master server.
3. On the Actions menu, select Properties.
4. In the Master Server Properties dialog box, select the Network Settings properties.
5. Select one of the following options:
For a description of these options, see the NetBackup online Help or the administrator's guide. To prohibit reverse host name lookup by using the bpsetconfig command
Enter the following command on each media server that you use for deduplication:
echo REVERSE_NAME_LOOKUP = PROHIBITED | bpsetconfig -h host_name
The bpsetconfig command resides in the following directories: UNIX: /usr/openv/netbackup/bin/admincmd Windows: install_path\Veritas\NetBackup\bin\admincmd
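Because the command must be entered on each media server that you use for deduplication, it can help to generate the command lines from a host list. The following Python sketch only builds the command strings shown above; the function name and the host names in the example are hypothetical, and the sketch does not run anything.

```python
def reverse_lookup_commands(host_names):
    """Return one bpsetconfig command line per media server host name.

    The command text is the one documented above; only host_name varies.
    """
    return [
        f"echo REVERSE_NAME_LOOKUP = PROHIBITED | bpsetconfig -h {host}"
        for host in host_names
    ]


# Hypothetical media server names, for illustration only.
for line in reverse_lookup_commands(["media1.example.com", "media2.example.com"]):
    print(line)
```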
Invoke the following command on the master server or the media server that functions as the deduplication storage server: The following example output shows that the DiskPoolVolume is UP:
Disk Pool Name      : PD_Disk_Pool
Disk Type           : PureDisk
Disk Volume Name    : PureDiskVolume
Disk Media ID       : @aaaab
Total Capacity (GB) : 49.98
Free Space (GB)     : 43.66
Use%                : 12
Status              : UP
Flag                : ReadOnWrite
Flag                : AdminUp
Flag                : InternalUp
Num Read Mounts     : 0
Num Write Mounts    : 1
Cur Read Streams    : 0
Cur Write Streams   : 0
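If you capture this kind of output in a script, the volume state can be checked automatically. The following Python sketch assumes the output consists of "label : value" lines like the example; the function names and the SAMPLE_REPORT constant are hypothetical and are not part of NetBackup.

```python
SAMPLE_REPORT = """\
Disk Pool Name      : PD_Disk_Pool
Disk Type           : PureDisk
Disk Volume Name    : PureDiskVolume
Status              : UP
Flag                : ReadOnWrite
Flag                : AdminUp
Flag                : InternalUp
"""


def parse_volume_report(text):
    """Collect 'label : value' lines; repeated Flag lines go into a list."""
    fields = {}
    flags = []
    for line in text.splitlines():
        label, separator, value = line.partition(":")
        if not separator:
            continue  # skip lines that are not label/value pairs
        label, value = label.strip(), value.strip()
        if label == "Flag":
            flags.append(value)
        else:
            fields[label] = value
    fields["Flags"] = flags
    return fields


def volume_is_up(text):
    """Return True when the Status field of the report is UP."""
    return parse_volume_report(text).get("Status") == "UP"


print(volume_is_up(SAMPLE_REPORT))
```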
Mount the file system. After a brief period of time, the volume state changes to UP. No further action is required.
Set the memory size of each virtual machine to double the physical memory of the host. Set the minimum and the maximum values of each virtual machine to the same value (double the physical memory of the host). These memory settings prevent the virtual memory from becoming fragmented on the disk because it does not grow or shrink.
These recommendations may not be the best configuration for every virtual machine. However, Symantec recommends that you try this solution first when troubleshooting performance issues.
Amend the security policy to allow the purediskdbuser account to have the "log on as a service" right. Change the service manually to a new domain account or to any other account that can run services.
Note: If you choose to switch the service to a different pre-existing account, that account must be granted full permissions to the <DBPATH>\databases\pddb\data directory.
Change the service to run as a Local System. Note: If you change the service to run as a Local System, the account that runs the postgresql-8.3 service must be granted full permissions to the <DBPATH>\databases\pddb\data directory.
To delete the disk pool, you must first delete the image fragments. The nbdelete command deletes expired image fragments from disk volumes. To delete the fragments of expired images
The -allvolumes option deletes expired image fragments from all volumes that contain them. The -force option removes the database entries of the image fragments even if fragment deletion fails.
Determine if incomplete SLP duplication jobs exist by running the following command on the master server: Windows: install_path\NetBackup\bin\admincmd\nbstlutil stlilist -incomplete
Cancel the incomplete jobs by running the following command for each backup ID returned by the previous command (xxxxx represents the backup ID): Windows: install_path\NetBackup\bin\admincmd\nbstlutil cancel -backupid xxxxx
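When the first command returns many backup IDs, the cancel command has to be repeated once per ID. The following Python sketch builds one argument list per backup ID from the documented nbstlutil cancel -backupid form. The function name and the sample backup IDs are hypothetical, and the sketch does not parse nbstlutil output or run any command.

```python
def cancel_commands(backup_ids):
    """Build one 'nbstlutil cancel -backupid <id>' argument list per unique ID."""
    commands = []
    seen = set()
    for backup_id in backup_ids:
        if backup_id in seen:
            continue  # an ID listed twice needs only one cancel command
        seen.add(backup_id)
        commands.append(["nbstlutil", "cancel", "-backupid", backup_id])
    return commands


# Hypothetical backup IDs, for illustration only.
for command in cancel_commands(["client1_1300000000", "client2_1300000500"]):
    print(" ".join(command))
```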
Client-side deduplication can fail if the client cannot resolve the host name of the server. More specifically, the error can occur if the storage server was configured with a short name and the client tries to resolve a fully qualified domain name. To determine which name the client uses for the storage server, examine the deduplication host configuration file on the client. See About the deduplication host configuration file on page 88. To fix this problem, configure your network environment so that all permutations of the storage server name resolve. Symantec recommends that you use fully qualified domain names. See About fully qualified domain names on page 46.
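One way to confirm that all permutations of the storage server name resolve is to try each of them from the client. The following Python sketch uses the standard socket.gethostbyname call; the function names and the example short name and domain are hypothetical, and the resolver can be replaced for testing.

```python
import socket


def name_permutations(short_name, domain):
    """Return the short name and the fully qualified name for a host."""
    return [short_name, f"{short_name}.{domain}"]


def unresolved_names(names, resolve=socket.gethostbyname):
    """Return the names that the resolver cannot resolve."""
    failed = []
    for name in names:
        try:
            resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(name)
    return failed


# Hypothetical names; run this on the client that fails to deduplicate.
print(name_permutations("server1", "example.com"))
```

An empty result from unresolved_names means every permutation resolved from that client.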
The Disk Logs report. See Viewing disk reports on page 100. The NetBackup bperror command with the -disk option reports on disk errors. The command resides in the following directories: UNIX: /usr/openv/netbackup/bin/admincmd Windows: install_path\Veritas\NetBackup\bin\admincmd
1001
Error
1002
Warning
1003
Error
1004
Critical
1008
Error
1009
Authorization Authorization request from <IP> for user <USER> denied (<REASON>). Error Task initialization on server PureDisk:server1.symantecs.org on host server1.symantecs.org got an unexpected error. Task ended on server PureDisk:server1.symantecs.org on host server1.symantecs.org. A request for agent task was denied on server PureDisk:server1.symantecs.org on host server1.symantecs.org. Task session start request on server PureDisk:server1.symantecs.org on host server1.symantecs.org got an unexpected error. Task creation failed, could not initialize task class on server PureDisk:server1.symantecs.org on host server1.symantecs.org.
1010
1011
Error
1012
Error
1014
Critical
1015
Critical
1018
16
Info
1019
Critical
1020
Critical
1028
Critical
1029
Critical
1031
Critical
1032
Critical
1036
Warning
1037
Warning
1044
multiple
multiple
1040
Error
1043
Error
1047
Error
Deduplication event codes and messages (continued) Event Severity NetBackup Message example Severity
Error Low space threshold exceeded on the partition containing the storage database on server PureDisk:server1.symantecs.org on host server1.symantecs.org.
Chapter
Replacing the deduplication storage server host computer
Recovering from a deduplication storage server disk failure
Recovering from a permanent deduplication storage server failure
Recovering the storage server after NetBackup catalog recovery
About uninstalling media server deduplication
Deactivating media server deduplication
Host replacement, recovery, and uninstallation Replacing the deduplication storage server host computer
Warning: The new host must use the same byte order as the old host. If it does not, you cannot access the deduplicated data. In computing, endianness describes the byte order that represents data: big endian and little endian. For example, Sun SPARC processors and Intel processors use different byte orders. Therefore, you cannot replace a Solaris SPARC host with a host that has an Intel processor.
Table 9-1 How to replace the deduplication storage server host (Task: Procedure)
Change the disk volume state and disk pool state to DOWN: See Changing the deduplication disk volume state on page 114. See Changing the deduplication pool state on page 114.
Configure the new host so it meets deduplication requirements: See About deduplication servers on page 22. See About deduplication server requirements on page 24.
Move the storage to the new host: See the storage vendor's documentation.
Install the NetBackup media server software on the new host: See the NetBackup Installation Guide for UNIX and Linux. See the NetBackup Installation Guide for Windows.
Delete the NetBackup Deduplication Engine credentials: If you have load balancing servers, delete the NetBackup Deduplication Engine credentials on those media servers. On each load balancing server, run the following command: See Deleting credentials from a load balancing server on page 111.
Add the credentials to the storage server: Add the NetBackup Deduplication Engine credentials to the storage server. See Adding NetBackup Deduplication Engine credentials on page 110.
Get a configuration file template: If you did not save a storage server configuration file before the failure, get a template configuration file. See Getting the deduplication storage server configuration on page 85.
Edit the configuration file: See Editing a deduplication storage server configuration file on page 86.
Host replacement, recovery, and uninstallation Recovering from a deduplication storage server disk failure
Configure the storage server: Configure the storage server by uploading the configuration from the file you edited. If you saved a configuration file before the storage server failure, use that file. See Setting the deduplication storage server configuration on page 87.
Configure the load balancing servers: If you have load balancing servers, add them to the configuration. See Adding a NetBackup load balancing server on page 78.
Change configuration settings: If you edited the deduplication configuration file, make the same changes to that file. See About the deduplication pd.conf file on page 80. See Editing the deduplication pd.conf file on page 83.
Change the disk volume state and disk pool state to UP: See Changing the deduplication disk volume state on page 114. See Changing the deduplication pool state on page 114.
Restart the backup jobs: If any backup jobs failed, restart those jobs. Alternatively, wait until the next scheduled backup, at which time the backup jobs should succeed.
After recovery, your NetBackup deduplication environment should function normally. Any valid backup images on the deduplication storage should be available for restores. Symantec recommends that you use NetBackup to protect the deduplication storage server system or program disks. You then can use NetBackup to restore that media server if the disk on which NetBackup resides fails and you have to replace it. Table 9-2 Process to recover from media server disk failure
If the disk is a system boot disk, also install the operating system. See the hardware vendor and operating system documentation.
Ensure that the storage and database are mounted at the same locations. See the storage vendor's documentation.
See the NetBackup Installation Guide for UNIX and Linux. See the NetBackup Installation Guide for Windows. See About the deduplication license key on page 60.
Delete the configuration file on deduplication servers: If you use load balancing servers in your environment, delete the storage server configuration files on those servers. See Deleting a deduplication host configuration file on page 88.
Delete the credentials on deduplication servers: If you have load balancing servers, delete the NetBackup Deduplication Engine credentials on those media servers. See Deleting credentials from a load balancing server on page 111.
Add the credentials to the storage server: Add the NetBackup Deduplication Engine credentials to the storage server. See Adding NetBackup Deduplication Engine credentials on page 110.
Get a configuration file template: If you did not save a storage server configuration file before the disk failure, get a template configuration file. See Getting the deduplication storage server configuration on page 85.
Host replacement, recovery, and uninstallation Recovering from a permanent deduplication storage server failure
Configure the storage server: Configure the storage server by uploading the configuration from the file you edited. See Setting the deduplication storage server configuration on page 87.
Add load balancing servers: If you use load balancing servers in your environment, add them to your configuration. See Adding a NetBackup load balancing server on page 78.
Change the disk volume state and disk pool state to DOWN: See Changing the deduplication disk volume state on page 114. See Changing the deduplication pool state on page 114.
Table 9-3
Task
Configure the new host so it meets deduplication requirements: Use the same host name as the failed server. See About deduplication servers on page 22. See About deduplication server requirements on page 24.
Move the storage to the new host: Ensure that the storage and database are mounted at the same locations. See the storage vendor's documentation.
Install the NetBackup media server software on the new host: See the NetBackup Installation Guide for UNIX and Linux. See the NetBackup Installation Guide for Windows. See About the deduplication license key on page 60.
Delete the credentials on deduplication servers: If you have load balancing servers, delete the NetBackup Deduplication Engine credentials on those media servers. See Deleting credentials from a load balancing server on page 111.
Add the credentials to the storage server: Add the NetBackup Deduplication Engine credentials to the storage server. See Adding NetBackup Deduplication Engine credentials on page 110.
Get a configuration file template: If you did not save a storage server configuration file before the failure, get a template configuration file. See Getting the deduplication storage server configuration on page 85.
Edit the configuration file: See Editing a deduplication storage server configuration file on page 86.
Configure the storage server: Configure the storage server by uploading the configuration from the file you edited. See Setting the deduplication storage server configuration on page 87.
Configure the load balancing servers: If you have load balancing servers, add them to the configuration. See Adding a NetBackup load balancing server on page 78.
Host replacement, recovery, and uninstallation Recovering the storage server after NetBackup catalog recovery
Change configuration settings
Change the disk volume state and disk pool state to UP: See Changing the deduplication disk volume state on page 114. See Changing the deduplication pool state on page 114.
Restart the backup jobs: If any backup jobs failed, restart those jobs. Alternatively, wait until the next scheduled backup, at which time the backup jobs should succeed.
Reconfigure an existing deduplication environment. See Reconfiguring the deduplication storage server and storage paths on page 92.
Deactivate deduplication and remove the configuration files and the storage files. See Deactivating media server deduplication on page 150.
Disable client deduplication: Remove the clients that deduplicate their own data from the client deduplication list.
Delete the storage units that use the disk pool: See the NetBackup Administrator's Guide for UNIX and Linux, Volume I. See the NetBackup Administrator's Guide for Windows, Volume I.
Delete the disk pool: See Deleting a deduplication pool on page 115.
Delete the deduplication storage server: See Deleting a deduplication storage server on page 106.
Stop the services on the storage server:
See the NetBackup Administrator's Guide for UNIX and Linux, Volume I. See the NetBackup Administrator's Guide for Windows, Volume I.
Delete the storage directories Delete the storage directory and database directory. (Using a separate database directory was an option when you configured deduplication.) See the operating system documentation.
Table 9-4
On Windows, delete accounts and files: On Windows storage servers and load balancing servers, delete the following: The purediskdbuser account. The account is for the deduplication database administration. The purediskdbuser folder. See the operating system documentation.
On UNIX and Linux, remove files: On UNIX and Linux storage servers and load balancing servers, remove the following files:
The storage server and every load balancing server contain a deduplication configuration file. Delete that file from every server that you use for deduplication. See Deleting a deduplication host configuration file on page 88.
See the NetBackup Administrator's Guide for UNIX and Linux, Volume I. See the NetBackup Administrator's Guide for Windows, Volume I.
Start the NetBackup services on the media server: See the NetBackup Administrator's Guide for UNIX and Linux, Volume I. See the NetBackup Administrator's Guide for Windows, Volume I.
Chapter
10
Deduplication architecture
This chapter includes the following topics:
Deduplication server components
Media server deduplication process
Deduplication client components
Deduplication client backup process
About deduplication fingerprinting
Data removal process
Figure 10-1
PureDisk plug-in
Storage path
Catalog plugin
Database application
Separates the file's metadata from the file's content. Deduplicates the content (separates files into segments). Controls the data stream from NetBackup to the NetBackup Deduplication Engine and vice versa.
The plug-in runs on the deduplication storage server. The plug-in also runs on load balancing servers and on the clients that deduplicate their own data. NetBackup Deduplication Engine The NetBackup Deduplication Engine is one of the storage server core components. It stores and manages deduplicated file data. The binary file name is spoold, which is short for storage pool daemon; do not confuse it with a print spooler daemon. The spoold process appears as the NetBackup Deduplication Engine in the NetBackup Administration Console.
Catalog plug-in
The catalog plug-in implements a standardized catalog API, which lets the NetBackup Deduplication Engine communicate with the back-end database process. The catalog plug-in translates deduplication engine catalog calls into the calls that are native to the back-end database.
Deduplication database
The deduplication database stores and manages the metadata of deduplicated files. The metadata includes a unique fingerprint that identifies the file's content. The metadata also includes information about the file such as its owner, where it resides on a client, when it was created, and other information. NetBackup uses the PostgreSQL database for the deduplication database. You can use the NetBackup bpps command to view the database process (postgres). The deduplication database is separate from the NetBackup catalog. The NetBackup catalog maintains the usual NetBackup backup image information. On Windows systems, NetBackup creates a purediskdbuser account for database management.
Figure 10-2 (diagram labels: Master server, nbjm, bpdbm, bpbrm, bptm, Client)
The following list describes the backup process when a media server deduplicates the backups and the destination is a media server deduplication pool:
The NetBackup Job Manager (nbjm) starts the Backup/Restore Manager (bpbrm) on a media server. The Backup/Restore Manager starts the bptm process on the media server and the bpbkar process on the client. The Backup/Archive Manager (bpbkar) on the client generates the backup images and moves them to the media server bptm process. The Backup/Archive Manager also sends the information about files within the image to the Backup/Restore Manager (bpbrm). The Backup/Restore Manager sends the file information to the bpdbm process on the master server for the NetBackup database.
The bptm process moves the data to the PureDisk plug-in. The PureDisk plug-in retrieves a list of fingerprints from the last full backup for the client from the NetBackup Deduplication Engine. The list is used as a cache so the plug-in does not have to request each fingerprint from the engine. The PureDisk plug-in performs file fingerprinting calculations. The PureDisk plug-in compares the file fingerprints and the segment fingerprints against the fingerprint list in its cache.
The PureDisk plug-in sends only unique data segments to the NetBackup Deduplication Engine on the storage server. The NetBackup Deduplication Engine writes the data to the media server deduplication pool.
Figure 10-3 shows the backup process when a media server deduplicates the backups. The destination is a PureDisk storage pool. A description follows. Figure 10-3
(Diagram labels: Master server, nbjm, bpdbm, bpbrm, bpbkar)
The following list describes the backup process when a media server deduplicates the backups and the destination is a PureDisk storage pool:
The NetBackup Job Manager (nbjm) starts the Backup/Restore Manager (bpbrm) on a media server. The Backup/Restore Manager starts the bptm process on the media server and the bpbkar process on the client. The Backup/Archive Manager (bpbkar) generates the backup images and moves them to the media server bptm process. The Backup/Archive Manager also sends the information about files within the image to the Backup/Restore Manager (bpbrm). The Backup/Restore Manager sends the file information to the bpdbm process on the master server for the NetBackup database.
The bptm process moves the data to the PureDisk plug-in. The PureDisk plug-in retrieves a list of fingerprints from the last full backup for the client from the PureDisk storage pool. The list is used as a cache so the plug-in does not have to request each fingerprint from the storage pool.
The PureDisk plug-in performs file fingerprinting calculations. The PureDisk plug-in compares the file fingerprints and the segment fingerprints against the fingerprint list in its cache. The PureDisk plug-in sends only unique data segments to the PureDisk storage pool.
Deduplicates the content (separates files into segments). Controls the data stream from NetBackup to the NetBackup Deduplication Engine and vice versa.
Proxy server (host: client): The OpenStorage proxy server (nbostpxy) manages control communication with the media server.
Proxy plug-in (host: media server): The proxy plug-in manages control communication with the client.
Figure 10-4 (diagram labels: Master server, nbjm, bpbrm, Proxy plug-in, bptm)
The following list describes the backup process for a deduplication client to a media server deduplication pool:
The NetBackup Job Manager (nbjm) starts the Backup/Restore Manager (bpbrm) on a media server. The Backup/Restore Manager probes the client to determine if it is configured and ready for deduplication. If the client is ready, the Backup/Restore Manager starts the following processes: The OpenStorage proxy server (nbostpxy) on the client and the data moving processes (bpbkar on the client and bptm on the media server). NetBackup uses the proxy plug-in on the media server to route control information from bptm to nbostpxy.
The Backup/Archive Manager (bpbkar) generates the backup images and moves them to the client nbostpxy process by shared memory. The Backup/Archive Manager also sends the information about files within the image to the Backup/Restore Manager (bpbrm). The Backup/Restore Manager sends the file information to the bpdbm process on the master server for the NetBackup database.
The client nbostpxy process moves the data to the PureDisk plug-in.
The PureDisk plug-in retrieves a list of fingerprints from the last full backup for the client from the NetBackup Deduplication Engine. The list is used as a cache so the plug-in does not have to request each fingerprint from the engine. The PureDisk plug-in performs file fingerprinting calculations. The PureDisk plug-in sends only unique data segments to the storage server, which writes the data to the media server deduplication pool.
Figure 10-5 shows the backup process of a client that deduplicates its own data. The destination is a PureDisk storage pool. A description follows. Figure 10-5
(Diagram labels: Master server, nbjm, bpdbm, bpbkar, PureDisk plug-in, Proxy server (nbostpxy), bpbrm, Proxy plug-in)
The following list describes the backup process for a deduplication client to a PureDisk storage pool:
The NetBackup Job Manager (nbjm) starts the Backup/Restore Manager (bpbrm) on a media server. The Backup/Restore Manager probes the client to determine if it is configured and ready for deduplication. If the client is ready, the Backup/Restore Manager starts the following processes: The OpenStorage proxy server (nbostpxy) on the client and the data moving processes (bpbkar on the client and bptm on the media server). NetBackup uses the proxy plug-in on the media server to route control information from bptm to nbostpxy.
The Backup/Archive Manager (bpbkar) generates the backup images and moves them to the client nbostpxy process by shared memory. The Backup/Archive Manager also sends the information about files within the image to the Backup/Restore Manager (bpbrm). The Backup/Restore Manager sends the file information to the bpdbm process on the master server for the NetBackup database.
The client nbostpxy process moves the data to the PureDisk plug-in. The PureDisk plug-in retrieves a list of fingerprints from the last full backup for the client from the NetBackup Deduplication Engine. The list is used as a cache so the plug-in does not have to request each fingerprint from the engine. The PureDisk plug-in performs file fingerprinting calculations. The PureDisk plug-in sends only unique data segments to the PureDisk storage pool.
The PureDisk plug-in reads the backup image and separates the image into files. The plug-in separates files into segments. For each segment, the plug-in calculates the hash key (or fingerprint) that identifies that data segment. To create a hash, every byte of data in the segment is read and added to the hash. The plug-in compares its calculated fingerprints to the fingerprints that the NetBackup Deduplication Engine stores on the media server. Two segments that have the same fingerprint are duplicates of each other. The plug-in sends unique segments to the deduplication engine to be stored. A unique segment is one for which a matching fingerprint does not already exist in the engine. The first backup may have a 0% deduplication rate, which means that all file segments in the backup data are unique; however, a rate of exactly 0% is unlikely.
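The segmenting and fingerprinting steps described above can be sketched as follows. This is a minimal illustration, not the PureDisk plug-in's implementation: the fixed segment size, the function names, and the in-memory set that stands in for the NetBackup Deduplication Engine are all assumptions made for the example, and the deduplication rate is counted by segments rather than by bytes for simplicity.

```python
import hashlib
from typing import List, Set, Tuple

SEGMENT_SIZE = 128 * 1024  # hypothetical segment size, chosen for illustration


def split_into_segments(data: bytes, size: int = SEGMENT_SIZE) -> List[bytes]:
    """Split a file's bytes into fixed-size segments (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(segment: bytes) -> str:
    """Hash every byte of the segment with MD5, the algorithm the guide names."""
    return hashlib.md5(segment).hexdigest()


def backup_file(data: bytes, engine_fingerprints: Set[str],
                size: int = SEGMENT_SIZE) -> Tuple[List[bytes], float]:
    """Return (segments_sent, dedup_rate_percent) for one file.

    engine_fingerprints is an assumed stand-in for the fingerprints the
    engine already stores; it is updated as unique segments are "sent".
    """
    segments = split_into_segments(data, size)
    sent = []
    for seg in segments:
        fp = fingerprint(seg)
        if fp not in engine_fingerprints:  # unique segment: store it
            engine_fingerprints.add(fp)
            sent.append(seg)
    total = len(segments)
    rate = 0.0 if total == 0 else 100.0 * (total - len(sent)) / total
    return sent, rate
```

In this sketch, backing up the same data a second time against the same set sends no segments, which corresponds to a 100% deduplication rate.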
The NetBackup Deduplication Engine saves the fingerprint information for that backup.
The PureDisk plug-in retrieves a list of fingerprints from the last full backup for the client from the NetBackup Deduplication Engine. The list is used as a cache so the plug-in does not have to request each fingerprint from the engine. The PureDisk plug-in reads the backup image and separates the image into files. The PureDisk plug-in separates files into segments and calculates the fingerprint for each file and segment. The plug-in compares each fingerprint against the local fingerprint cache. If the fingerprint is not in the cache, the plug-in asks the engine to verify whether the fingerprint already exists. If the fingerprint does not exist, the segment is sent to the engine. If the fingerprint exists, the segment is not sent.
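The two-level lookup described above (local cache first, engine second) might look like the following sketch. The Engine class, its method names, and the query counter are hypothetical stand-ins invented for this example; they are not NetBackup interfaces. Adding newly seen fingerprints to the cache during the run is also an assumption.

```python
import hashlib


class Engine:
    """Hypothetical stand-in for the NetBackup Deduplication Engine."""

    def __init__(self):
        self.store = {}    # fingerprint -> segment bytes
        self.queries = 0   # number of existence checks the plug-in made

    def has_fingerprint(self, fp):
        self.queries += 1
        return fp in self.store

    def put(self, fp, segment):
        self.store[fp] = segment

    def fingerprints_of_last_full(self):
        # Simplification: treat everything stored as the last full backup's list.
        return set(self.store)


def send_segments(segments, engine):
    """Send only segments the engine does not already hold; return count sent."""
    cache = engine.fingerprints_of_last_full()  # fetched once, up front
    sent = 0
    for seg in segments:
        fp = hashlib.md5(seg).hexdigest()
        if fp in cache:
            continue                     # cache hit: no engine round trip
        if engine.has_fingerprint(fp):   # cache miss: ask the engine
            cache.add(fp)                # assumption: remember the answer
            continue
        engine.put(fp, seg)              # unknown everywhere: send the data
        cache.add(fp)
        sent += 1
    return sent
```

The point of the cache is visible in the query counter: a second backup of mostly unchanged data resolves most fingerprints locally and queries the engine only for segments that were not in the previous list.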
The fingerprint calculations are based on the MD5 algorithm. However, any segments that have different content but the same MD5 hash key get different fingerprints. NetBackup therefore prevents an MD5 collision from causing two different segments to be treated as duplicates.
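The guide does not say how NetBackup assigns different fingerprints to colliding segments. One possible approach, shown purely as an illustration, is to compare content whenever two segments share a hash and to append a disambiguating counter. The class name, the "hash-counter" fingerprint format, and the injectable hash function are all assumptions of this sketch.

```python
import hashlib


def md5_hex(segment: bytes) -> str:
    return hashlib.md5(segment).hexdigest()


class CollisionSafeIndex:
    """Hypothetical index that gives colliding segments distinct fingerprints."""

    def __init__(self, hash_fn=md5_hex):
        self.hash_fn = hash_fn      # injectable so a collision can be simulated
        self.by_fingerprint = {}    # fingerprint -> segment content

    def fingerprint_for(self, segment: bytes) -> str:
        base = self.hash_fn(segment)
        n = 0
        while True:
            fp = f"{base}-{n}"
            stored = self.by_fingerprint.get(fp)
            if stored is None:
                self.by_fingerprint[fp] = segment  # first time this content is seen
                return fp
            if stored == segment:
                return fp                          # true duplicate: same content
            n += 1                                 # same hash, different content
```

Because real MD5 collisions are impractical to produce in a short example, a deliberately weak hash function can be passed in to exercise the collision path, for example `CollisionSafeIndex(hash_fn=lambda s: "same")`.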
NetBackup removes the image record from the NetBackup catalog. NetBackup directs the NetBackup Deduplication Manager to remove the image. The deduplication manager immediately removes the image entry and adds a removal request for the image to the database transaction queue. From this point on, the image is no longer accessible. When the queue is next processed, the NetBackup Deduplication Engine executes the removal request. The engine also generates removal requests for the underlying data segments. At the following queue processing run, the NetBackup Deduplication Engine executes the removal requests for the segments.
Storage is reclaimed after two queue processing runs; that is, in one day. However, data segments of the removed image may still be in use by other images. If you manually delete an image that has expired within the previous 24 hours, the data becomes garbage. It remains on disk until removed by the next garbage collection process.
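The two-pass removal described above can be modeled with a small queue simulation. This is a sketch under stated assumptions, not the engine's actual data model: the DedupStore class, its fields, and the use of per-segment reference counts to track segments that other images still use are all hypothetical.

```python
from collections import deque


class DedupStore:
    """Hypothetical model of image removal through a transaction queue."""

    def __init__(self):
        self.images = {}       # image id -> list of segment fingerprints
        self.refcount = {}     # fingerprint -> number of images that use it
        self.segments = set()  # fingerprints physically on disk
        self.queue = deque()   # pending (kind, payload) removal requests

    def add_image(self, image_id, fingerprints):
        self.images[image_id] = list(fingerprints)
        for fp in set(fingerprints):
            self.refcount[fp] = self.refcount.get(fp, 0) + 1
            self.segments.add(fp)

    def delete_image(self, image_id):
        fps = self.images.pop(image_id)   # image is inaccessible from here on
        self.queue.append(("image", fps))

    def process_queue(self):
        # Requests generated during this run wait for the next run.
        pending, self.queue = self.queue, deque()
        for kind, payload in pending:
            if kind == "image":
                for fp in set(payload):
                    self.queue.append(("segment", fp))
            else:
                self.refcount[payload] -= 1
                if self.refcount[payload] == 0:  # no other image uses it
                    del self.refcount[payload]
                    self.segments.discard(payload)
```

In this model, deleting an image makes it inaccessible at once, the first process_queue call only turns the image request into segment requests, and the second call reclaims the segments that no remaining image references.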
See About maintenance processing on page 118. See Deleting backup images on page 117.
Index
A
about NetBackup deduplication 13 about NetBackup deduplication options 14 about the deduplication host configuration file 88 appliance deduplication 15 attributes clearing deduplication pool 113 clearing deduplication storage server 105 OptimizedImage 31 setting deduplication pool 113 setting deduplication storage server 105 viewing deduplication pool 117 viewing deduplication storage server 109
B
backup client deduplication process 158 big endian 147 byte order 147
C
cache hits field of the job details 97 capacity adding storage 57 capacity and usage reporting for deduplication 97 CIFS 45 clearing deduplication pool attributes 113 client deduplication about 26 components 158 host requirements 27 limitations 27 sizing the systems 19 Common Internet File System 45 compression and deduplication 29 pd.conf file setting 80 configuring a deduplication pool 64 configuring a deduplication storage server 63 configuring a deduplication storage unit 67
configuring deduplication 62 container files about 100 viewing capacity within 100 CR sent field of the job details 97 credentials 28 adding NetBackup Deduplication Engine 110 changing NetBackup Deduplication Engine 110
D
data removal process for deduplication 162 database system error 127 deactivating media server deduplication 150 dedup field of the job details 97 deduplication about credentials 28 about fingerprinting 161 about the license key 60 adding credentials 110 cache hits field of the job details 97 capacity and usage reporting 97 changing credentials 110 client backup process 158 compression 29 configuration file 80 configuring 62 configuring optimized synthetic backups 72 container files 100 CR sent field of the job details 97 data removal process 162 dedup field of the job details 97 encryption 30 event codes 137 garbage collection 118 how it works 15 license key for 60 licensing 60 limitations 25 maintenance processing 118 media server process 155
deduplication (continued) network interface 28 node 24 performance 43 planning deployment 18 requirements for optimized within the same domain 32 scaling 46 scanned field of the job details 97 storage capacity 56 storage destination 20 storage paths 56 storage requirements 55 storage unit properties 67 stream rate field of the job details 97 supported systems 19 deduplication configuration file editing 83 settings 80 deduplication database about 155 log file 124 purediskdbuser account on Windows 155 deduplication disk pool. See deduplication pool deduplication disk volume changing the state 114 determining the state 116 deduplication host configuration file about 88 deleting 88 deduplication hosts and firewalls 29 client requirements 27 load balancing server 23 server requirements 24 storage server 23 deduplication logs about 123 client deduplication proxy plug-in log 124 client deduplication proxy server log 124 configuration script 124 deduplication database 124 NetBackup Deduplication Engine 125 NetBackup Deduplication Manager 125 PureDisk plug-in log 125 VxUL deduplication logs 125 deduplication node about 24 adding a load balancing server 78
deduplication node (continued) removing a load balancing server 107 deduplication optimized synthetic backups about 31 deduplication pool about 64 changing properties 112 changing the state 114 clearing attributes 113 configuring 64 deleting 115 determining the state 115 properties 65 setting attributes 113 viewing 116 viewing attributes 117 deduplication port usage about 29 troubleshooting 135 deduplication processes do not start 130 deduplication rate how file size affects 45 monitoring 95 deduplication registry resetting 89 deduplication servers about 22 components 153 host requirements 24 deduplication storage capacity about 56 monitoring 97 viewing capacity in container files 100 deduplication storage destination 20 deduplication storage paths 56 deduplication storage requirements 55 deduplication storage server about 23 changing properties 104 clearing attributes 105 components 153 configuration failure 128 configuring 63 configuring target for remote master server duplication 90 deleting 106 determining the state 106 editing configuration file 86 getting the configuration 85
deduplication storage server (continued) reconfiguring 93 recovery 147 replacing the host 143 setting attributes 105 setting the configuration 87 viewing 108 viewing attributes 109 deduplication storage server configuration file about 84 deduplication storage type 20 Deduplication storage unit Only use the following media servers 68 Use any available media server 68 deleting backup images 117 deleting deduplication host configuration file 88 disaster recovery protecting the data 49 recovering the storage server after catalog recovery 149 disk failure deduplication storage server 145 disk logs 101 disk logs report 99 disk pool. See deduplication pool cannot delete 133 disk pool status report 98, 101 disk storage unit report 101 Disk type 67 disk volume changing the state 114 determining the state of a deduplication 116 volume state changes to down 131 duplicating images to another NetBackup domain about 42
E
encryption and deduplication 30 pd.conf file setting 81 endian big 147 little 147 event codes deduplication 137
F
fingerprinting about deduplication 161 firewalls and deduplication hosts 29
G
garbage collection for deduplication 118 invoke manually 119
H
host requirements 24 how deduplication works 15
I
images on disk report 101 initial seeding 44 iSCSI 45
L
license information failure for deduplication 128 license key for deduplication 60 licensing deduplication 60 limitations media server deduplication 25 little endian 147 load balancing server about 23 adding to a deduplication node 78 for deduplication 23 removing from deduplication node 107 logs about deduplication 123 client deduplication proxy plug-in log 124 client deduplication proxy server log 124 deduplication configuration script log 124 deduplication database log 124 disk 101 NetBackup Deduplication Engine log 125 NetBackup Deduplication Manager log 125 PureDisk plug-in log 125 VxUL deduplication logs 125
M
maintenance processing for deduplication 118
Mapped full VM backup 72 Maximum concurrent jobs 68 Maximum fragment size 68 media server deduplication process 155 sizing the systems 19 Media Server Deduplication Option about 21 media server deduplication pool. See deduplication pool migrating from PureDisk to NetBackup deduplication 52 migrating to NetBackup deduplication 53
N
NDMP 45 NetBackup Client Deduplication Option 15 NetBackup deduplication about 13 license key for 60 NetBackup Deduplication Engine about 154 about credentials 28 adding credentials 110 changing credentials 110 logs 125 NetBackup Deduplication Manager about 155 logs 125 NetBackup deduplication options 14 NetBackup Media Server Deduplication Option 15 network interface for deduplication 28 NFS 45 node deduplication 24
O
OpenStorage appliance deduplication 15 optimized deduplication copy configuring 75 guidance for 34 limitations 33 push configuration 36 throttling traffic 78 optimized MSDP deduplication about the media server in common within the same domain 34 pull configuration within the same domain 40 push configuration within the same domain 34 requirements 32 within the same domain 32 optimized MSDP duplication about 31 optimized synthetic backups configuring for deduplication 72 deduplication 31 OptimizedImage attribute 31
P
pd.conf file about 80 editing 83 settings 80 pdde-config.log 124 performance deduplication 43 monitoring deduplication rate 95 port usage and deduplication 29 troubleshooting 135 provisioning the deduplication storage 55 PureDisk deduplication 15 PureDisk Deduplication Option replacing with media server deduplication 51 PureDisk plug-in about 154 log file 125 purediskdbuser account 155
Q
queue processing invoke manually 119 queue processing for deduplication 118
R
reconfiguring deduplication 93 recovery deduplication storage server 147 from deduplication storage server disk failure 145 Red Hat Linux deduplication processes do not start 130 replacing PDDO with NetBackup deduplication 51 replacing the deduplication storage server 143
replication about duplicating images to another NetBackup domain 42 reports disk logs 99, 101 disk pool status 98, 101 disk storage unit 101 resetting the deduplication registry 89 restores at a remote site 120 how deduplication restores work 50 specifying the restore server 121 reverse host name lookup prohibiting 128 reverse name lookup 128
S
scaling deduplication 46 scanned field of the job details 97 seeding initial 44 server not found error 128 setting deduplication pool attributes 113 storage capacity about 56 adding 57 for deduplication 56 monitoring deduplication 97 viewing capacity in container files 100 storage paths for deduplication 56 reconfiguring 93 storage requirements for deduplication 55 storage server about the configuration file 84 changing properties for deduplication 104 components for deduplication 153 configuring for deduplication 63 configuring target for remote master server duplication 90 deduplication 23 deleting a deduplication 106 determining the state of a deduplication 106 editing deduplication configuration file 86 getting deduplication configuration 85 reconfiguring deduplication 93 recovery 147 replacing the deduplication host 143
storage server (continued) setting the deduplication configuration 87 viewing 108 viewing attributes 109 storage server configuration getting 85 setting 87 storage server configuration file editing 86 storage type for deduplication 20 storage unit configuring for deduplication 67 properties for deduplication 67 recommendations for deduplication 68 Storage unit name 67 Storage unit type 67 stream handlers Backup Exec 45 Hyper-V 45 NDMP 45 NetBackup 45 VMware vmdk 45 vSphere 45 Windows System State 45 stream rate field of the job details 97 supported systems 19
T
troubleshooting database system error 127 deduplication backup jobs fail 130 deduplication processes do not start 130 general operational problems 132 host name lookup 128 installation fails on Linux 127 no volume appears in disk pool wizard 129 server not found error 128
U
uninstalling media server deduplication 149
V
viewing deduplication pool attributes 117 viewing storage server attributes 109
W
Windows System State 45