NDMP Backup of Dell FS Series NAS using CommVault Simpana Dell EMC Engineering January 2017 Dell EMC Best Practices
Revisions

Date            Description
January 2013    Initial Release
June 2013       Results added for FS7600 and FS7610 platforms
January 2017    Updated to include new branding and formatting

Acknowledgements

This best practice white paper was produced by the following members of the Dell Storage team:

Engineering: Chidambara Shashikiran
Technical Marketing: Raj Hosamani
Editing: Camille Daily
Additional contributors: Jacob Cherian, Suresh Jasrasaria, Puneet Dhawan, Gabby Lavy, Mark Welker, Andrei Ivanov, and Mike Kosa
Table of contents
Revisions
Acknowledgements
Executive summary
5.2.2 Flexible restore
5.2.3 Direct Access Recovery
6 Best practices: Putting it all together
6.
Executive summary
The exponential growth of data presents several challenges for IT administrators tasked with protecting data. Two basic considerations come into play: how quickly data must be recovered and how to minimize the extent of data loss.
1 Introduction The storage industry is seeing an exponential increase in the growth rate of unstructured data. Analysts agree that the growth rate of unstructured data will continue to exceed that of other data types. This paper discusses best practices for protecting file data on Dell™ FS Series NAS Appliances using NDMP. It begins with a review of the data protection and integrity features built into the FluidFS architecture, followed by an in-depth discussion of the NDMP feature.
1.2 Audience The paper is intended for solution architects, application and storage engineers, system administrators, and IT managers who need to understand how to design, properly size, and deploy a backup solution for the FS Series based NAS appliance. It is expected that the reader has a working knowledge of NDMP architecture, FS Series NAS system administration and iSCSI SAN network design. 1.3 Terminology The following terms are used throughout this document.
Network Attached Storage (NAS): A self-contained computer or appliance that provides file-based data storage services to other devices on the network.
Network Data Management Protocol (NDMP): An open-standard protocol for performing backup and restore of heterogeneous NAS appliances. NDMP provides a common interface between the backup application and heterogeneous NAS devices without requiring any third-party software to be installed on the NAS server.
2 Fluid file system architecture
The FluidFS architecture shown in Figure 2 is highly available through an underlying cluster technology that consists of multiple controllers working together, monitoring each other, and providing automatic failover capabilities. The basic implementation is a pair of controllers (FluidFS Appliance) in a cluster that can be scaled by adding NAS appliances depending on client workload characteristics.
3 NDMP
NDMP is an open standard protocol for enterprise-wide backup of NAS devices on a client network. The main objective of NDMP is to address issues faced by Data Management Application (DMA) vendors such as Symantec and CommVault when attempting to back up networks of heterogeneous NAS devices.
3.1 Overview and benefits
If mission-critical business data cannot be restored after a system failure, the entire business is put at risk.
3.2 NDMP architecture
NDMP supports three methods of backup over the local area network.
- Local NDMP backup: The backup target is directly attached to the NAS. The backup data is transferred directly from block storage to the attached backup target without traveling across the LAN. Only the backup control data travels across the LAN from the NDMP client running the backup software.
- Remote NDMP backup: The backup target is attached to the NDMP client with backup software.
For a detailed discussion of various backup strategies using NDMP backup types to meet the required RTOs and RPOs, refer to Understanding Snapshots in Dell Fluid File System NAS.
3.4 NDMP direct access recovery
Data protection is a continuous process of ensuring that data can be quickly recovered if it is lost. The RPO specifies the tolerance for data loss, and the RTO specifies the tolerance for downtime while a recovery is in progress.
3.6 Backup and restoring data FluidFS NAS solutions support full, incremental, and differential NDMP backups (dump levels 0-9), as well as DAR, in the three-way (or remote) configuration shown in Figure 4. In this configuration, the DMA server mediates the data transfer between NAS appliance and storage device. The current release of FluidFS does not support backup to locally attached tape or disk devices.
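As a rough illustration of how dump levels 0-9 relate full, incremental, and differential backups, the sketch below applies the conventional dump-level rule: a level-N backup captures changes made since the most recent backup taken at any lower level. That rule is an assumption carried over from traditional dump semantics, and the function name and history format are hypothetical; none of this is FluidFS or CommVault code.

```python
from typing import List, Optional, Tuple

# A backup is recorded as (day, dump_level). This format is hypothetical.
Backup = Tuple[int, int]


def base_backup(history: List[Backup], level: int) -> Optional[Backup]:
    """Return the earlier backup that a new dump at `level` is relative to.

    Assumed rule: a level-N dump captures changes since the most recent
    dump taken at a level lower than N. Level 0 is a full backup.
    """
    if not 0 <= level <= 9:
        raise ValueError("dump level must be between 0 and 9")
    if level == 0:
        return None  # a full backup has no base
    for entry in reversed(history):
        if entry[1] < level:
            return entry
    return None  # no lower-level dump exists yet, so a full backup is needed


history = [(1, 0), (2, 1), (3, 2), (4, 1)]
print(base_backup(history, 1))  # relative to the last full backup
print(base_backup(history, 2))  # relative to the most recent level 1 backup
```

Repeating level 1 after each full backup gives differential behavior (each run is relative to the full), while stepping the level up on each run gives incremental behavior (each run is relative to the previous one).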
4 NDMP backup and recovery test methodology NDMP provides backup software vendors with the flexibility to offer backup and restore capabilities without installing any software agents on the NAS servers. There are many data protection products available for performing NDMP backup. In this solution, a Dell DL disk based backup and recovery appliance with CommVault Simpana software was used as the backup server.
4.2 Test objectives
The primary objectives of the tests were to characterize the NDMP backup and recovery scenarios using the FS76x0 for the use cases listed below.
- Unstructured data comprised of Microsoft® Office®, Adobe® PDF, and media files. These files are usually small to medium in size and range from 4 KB to 1 GB.
- File shares storing streaming video and media files. These files are usually large and range from 1 GB to 10 GB.
4.3 Test approach The test approach can be summarized as: 4.4 The backup operations were performed with simulated real-world file transactions. The NDMP backup performance of a single NAS container was measured with default network settings to obtain a baseline. The tests were executed to characterize the backup throughput using data sets consisting of small and large files.
4.5 Test tools
Load generation and monitoring tools were used to complete the tests and provide the best practices in this paper.
4.5.1 Load generation
The vdbench file system workload generator was used to populate the NAS container with different sized files. These files simulated a real-world NAS data distribution. A workload of 8 KB random I/Os with 70% reads and 30% writes was used for evaluating the performance impact on production I/O during NDMP backup.
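To make the shape of that workload concrete, the following sketch generates a sequence of block-aligned 8 KB operations with a 70/30 read/write split. It is not vdbench and does not reproduce the test setup; the function name, the 1 GiB file size, and the seed are hypothetical choices for illustration only.

```python
import random
from typing import Iterator, Tuple

BLOCK_SIZE = 8 * 1024   # 8 KB transfer size, as in the test workload
READ_FRACTION = 0.70    # 70% reads, 30% writes


def io_mix(count: int, file_size: int, seed: int = 0) -> Iterator[Tuple[str, int, int]]:
    """Yield (operation, offset, length) tuples for a random I/O mix."""
    rng = random.Random(seed)
    blocks = file_size // BLOCK_SIZE
    for _ in range(count):
        op = "read" if rng.random() < READ_FRACTION else "write"
        offset = rng.randrange(blocks) * BLOCK_SIZE  # block-aligned random offset
        yield op, offset, BLOCK_SIZE


ops = list(io_mix(10_000, file_size=1024 ** 3))
reads = sum(1 for op, _, _ in ops if op == "read")
print(f"read share: {reads / len(ops):.2%}")
```

Over a large number of operations the read share converges on 70%, which is the property the production I/O simulation relies on.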
5 NDMP backup and recovery test results and analysis This section describes the different NDMP backup and restore tests performed as well as the key findings from each test. For all 1 GbE configuration tests, a single FS7600 NAS appliance consisting of two active controllers and CommVault software installed on the backup server was used to perform NDMP backup. Three PS6100XV arrays were connected as NAS backend and a PS6100E array was used as the backup target.
5.1.1 NDMP backup performance impact
These tests were executed to measure the performance impact on a production NAS system during an NDMP backup. Vdbench was used to simulate a real-world NAS client working environment. A workload of 8 KB random I/Os with 70% reads and 30% writes was simulated to represent a NAS client end-user collaboration environment.
Note: The performance numbers displayed in the baseline graphs are not representative of the maximum performance capacity of the NAS.
5.1.2 Unoptimized NDMP backup performance for large and small sized files These tests were executed using the default NDMP configuration without any optimization to establish a performance baseline to evaluate the following objectives. 5.1.2.
was effectively used in the backup from the FS7600. Performing backup of more than three NAS containers did not result in improved throughput as the 1 Gb NIC was saturated to its maximum capacity. The CPU utilization on the FS7600 controllers did not exceed 40%.

Backup of 5 GB sized files

                Avg throughput   Max throughput   Avg CPU           Max CPU
                (MB/sec)         (MB/sec)         utilization (%)   utilization (%)
1 Container     76.84            90.73            29.1              34.24
2 Containers    112.04           116.76           25.71             36.54
3 Containers    116.9            117.57           34.
[Figure: Backup throughput and CPU utilization for small sized files]

As can be seen from Figure 9, the backup throughput scaled linearly as the number of file systems increased, which significantly reduced the backup time. This behavior is desirable because real-world deployments are likely to have many file systems configured on the NAS. It is also important to note that the average CPU utilization of the NAS controller is higher than for the backup of large files, as shown in Table 2.
The 10 GbE test configuration consisted of three PS6110XV arrays as NAS backend.

[Figure: FS7610 Configuration: NDMP backup throughput with twelve containers. Axes: Throughput (MB/s) versus elapsed time.]

This configuration utilized a single 10 GbE NIC on the backup server in the default NDMP configuration. More than 300 MB/sec backup throughput was achieved using the default FS7610 NDMP configuration, compared to 117 MB/sec on the 1 GbE FS7600 NAS configuration as discussed in Section 5.1.2.1.
5.1.4.1 Optimizing backup rate: Utilize multiple streams and multiple NICs for efficient backup
Without optimizations, the backup processes utilize a single network interface on the backup server and on the FS7600 appliance, which limits the maximum backup throughput to about 120 MB/sec (close to the theoretical maximum of a 1 Gb NIC). This is clearly demonstrated in the test results in Section 5.1.2.
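The single-NIC ceiling can be checked with simple arithmetic. The sketch below converts a link speed into a raw line rate and estimates how long a backup would take at a given sustained throughput. The helper names are hypothetical, decimal units are assumed (1 GB = 1000 MB), and the 1,200 GB data set size is used purely as an illustration.

```python
def line_rate_mb_per_sec(link_gbps: float) -> float:
    """Raw line rate of a link in MB/sec, before any protocol overhead."""
    return link_gbps * 1000 / 8


def backup_hours(data_gb: float, throughput_mb_per_sec: float) -> float:
    """Estimated backup duration in hours at a sustained throughput."""
    return data_gb * 1000 / throughput_mb_per_sec / 3600


one_gbe = line_rate_mb_per_sec(1)
print(one_gbe)                               # raw 1 GbE line rate in MB/sec
print(round(117 / one_gbe * 100, 1))         # share of line rate used at 117 MB/sec
print(round(backup_hours(1200, 117), 2))     # hours for 1,200 GB over one NIC
```

Because a single stream over one 1 GbE interface already runs close to line rate, further gains have to come from additional interfaces and streams rather than from tuning the single path.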
The following steps utilize multiple front end NICs on the FS7600 and map them to the NICs dedicated to the client network on the backup server.
1. Four virtual IP addresses were defined on the FS7600 as shown in Figure 12.
[Figure: FS Series NAS (FS7600) virtual IP configuration]
2. Four NICs were dedicated on the backup server and configured with separate IP addresses.
3.
This optimization achieved an average backup throughput of approximately 278 MB/sec as shown in Figure 14, compared to about 117 MB/sec for the unoptimized configuration. The optimization created four 1 GbE lanes for backup and raised backup throughput to roughly 2.4 times the unoptimized rate (an increase of about 138%).

[Figure: NDMP backup optimized throughput using FluidFS load balancing and data interface pairs]

Utilizing the data interface pairs feature and FluidFS VIP together more than doubled backup performance.
5.1.4.
As explained in Section 5.1.4.1, the FluidFS architecture allows multiple virtual IP addresses to be defined per NAS appliance, and these can be used to address the NAS independently.
1. Ensure that at least two virtual IP addresses are defined per NAS appliance.
[Figure: FS Series NAS (FS7610) virtual IP configuration]
2. Add a NAS client for CommVault (using the CommVault CommCell GUI) corresponding to each virtual IP address.
[Figure: Backup throughput scaling with VIPs. Throughput (MB/sec): 300 with 1 x Virtual IP, 440 with 2 x Virtual IPs.]

The optimization enabled the utilization of resources on all active controllers and this effectively improved the backup performance by approximately 47% versus using a single virtual IP.

5.1.4.3 Optimizing backup of large containers
One of the key differentiating features of FluidFS is the ability to transcend the limitations of traditional file systems.
To test this scenario, we populated a single file share with 1.2 TB of data that consisted of more than ten million files distributed across 17 top level directories as described in Section 5.1. NDMP backup was performed on this file share using both the brute force and divide-and-conquer approaches. The backup jobs were set up as follows:
Brute force approach: A single NDMP backup job was created that executed the backup on the entire container.
[Figure: NDMP backup throughput using the brute force approach]
[Figure: NDMP backup throughput using the divide-and-conquer approach]

In contrast, the divide-and-conquer approach benefited from the fact that multiple streams were used to perform the backup. This resulted in a high degree of parallelism during the initial phase of the backup, effectively eliminating the multiple peaks and valleys in network utilization. During the initial phase of backup, the cumulative backup throughput consistently exceeded 150 MBps.
Implementing a divide-and-conquer approach (simultaneous backup of directories) as opposed to the brute force approach (entire file system) for large containers enables the use of multiple streams and substantially reduces the backup time (the tests showed a 300% improvement).

5.2 NDMP recovery test scenarios
In this section, the performance characteristics of restore operations from an NDMP backup target are explored. The tests in this section utilized the backup sets from the previous sections.
[Figure: Restore network throughput with one and four NAS containers]

As seen in Figure 19, a single container restore delivered a throughput of 70 to 90 MBps. This is primarily because the restore operation was designed to utilize just one stream. Performing a restore of four containers simultaneously delivered roughly three times the performance, with restore throughput consistently close to 200 MBps. The restore throughput also scales linearly as the number of containers is increased.
[Figure: Multi-directory restore throughput, divide-and-conquer approach]

Performing simultaneous restore of directories utilized multiple data streams and delivered significantly higher restore throughput.

5.2.2 Flexible restore
Backup strategies serve three primary purposes. First, restore all the data defined by the RPO.
Test results for both in place and out of place restore scenarios are summarized in Table 3.

In place and out of place recovery

Recovery scenario                                      Duration (Hr:Min:Sec)   Remarks
Recovery to the same path                              01:16:27                All the files in the original location were manually removed before performing a restore
Recovery to the same path using the overwrite option   01:23:27                The NAS overwrite option was used on CommVault.
[Figure: Directory structure for the single file restored using DAR]

“vdb_f0008.file” (located under six levels of directories) was restored within one and a half minutes. The same operation would have taken more than one hour and forty minutes if DAR had not been enabled. The above results clearly show the significance and benefits of the NDMP DAR feature in a NAS environment with a huge amount of unstructured data.
6 Best practices: Putting it all together
As demonstrated in the previous sections, the NDMP functionality of FS Series NAS appliances, combined with the capabilities of CommVault Simpana, allows system administrators to implement effective data protection strategies. This section summarizes the key best practices identified through analyzing the test results.
Configure backup jobs to enable multiple streams. The CommVault implementation generates one backup stream per NAS container; therefore, backup jobs have to be structured to enable the creation of multiple streams. See Appendix B for an example of how multiple streams can be configured while performing backups. The recommendations are: - 6.
7 Conclusions
The growth of unstructured data presents significant challenges to system administrators in designing and implementing effective data protection strategies. NDMP has emerged as a popular choice for backup and restore of unstructured data. This paper evaluated NDMP V4 based backups using an FS76x0 NAS appliance. The FS76x0 NAS appliances provide near line rates for large file backups.
A Solution configuration
This appendix provides details of the test configurations used to support this paper.

Solution infrastructure components

Solution Configuration - Hardware Components: Description
Servers: Eight VMs were hosted on one of the ESXi servers for hosting the NFS export. Vdbench I/O workload was executed on these VMs during backup operation. Another eight VMs were hosted on the other ESXi server for hosting the CIFS share.
A.1 Solution architecture
The high-level architecture block diagram of the 1 GbE based FS7600 test configuration is shown in Figure 22. The network connectivity from the FS7600 to both LAN and SAN is illustrated, as well as the connection paths from the PS Series arrays to the SAN.
The high-level architecture block diagram of the 10 GbE based FS7610 test configuration is shown in Figure 23. The network connectivity from the FS7610 to both LAN and SAN is illustrated, as well as the connection paths from the PS Series arrays to the SAN.
A.1.1 PS Series array configuration Three PS6100XV arrays were used as back end for the FS7600 appliance. Four virtual IPs were configured on FS7600 and used for NDMP backup/recovery operations. Two dedicated Dell Networking N3000 switches were used for SAN connectivity. The front end client connections were made to a separate pair of Dell Networking N3000 switches. Three PS6110XV arrays were used as back end for the FS7610 appliance.
B Backup optimization techniques B.1 Using multiple data streams for backup CommVault does not support multiple streams within a single container for NDMP backups. Using multiple data streams is highly recommended for better backup and recovery performance. It is also important to set the Number of Data Readers variable equal to the number of NAS containers being backed up simultaneously. This allows a dedicated data stream to be allocated for each NAS container.
B.2 Scheduling multi-directory backups
If there are multiple directories within a NAS container to be backed up, a separate subclient should be created for each directory as shown in Figure 24.
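The per-directory layout can be planned before it is entered in the backup software. The sketch below builds one subclient definition per top level directory and derives a matching data reader count. The container name, directory names, dictionary keys, and function name are all hypothetical, the code does not call any CommVault interface, and sizing the data readers to the number of subclients is an assumption that extends the per-container guidance in B.1 to directories.

```python
from pathlib import PurePosixPath
from typing import Dict, List


def plan_subclients(container: str, directories: List[str]) -> List[Dict[str, str]]:
    """Build one hypothetical subclient definition per top level directory."""
    root = PurePosixPath("/") / container
    return [
        {"name": f"{container}_{directory}", "content_path": str(root / directory)}
        for directory in sorted(directories)
    ]


plan = plan_subclients("container1", ["projects", "media", "home"])
data_readers = len(plan)  # assumption: one data stream per concurrent subclient

for subclient in plan:
    print(subclient["name"], subclient["content_path"])
print("data readers:", data_readers)
```

Keeping the plan as data makes it easy to confirm that every directory is covered exactly once before the corresponding subclients are created by hand.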
Additional resources Support.dell.com is focused on meeting your needs with proven services and support. DellTechCenter.com is an IT Community where you can connect with Dell Customers and Dell employees for the purpose of sharing knowledge, best practices, and information about Dell products and your installations. Referenced or recommended Dell publications: Dell Fluid File System Overview: http://en.community.dell.