What is data deduplication?

Data deduplication is a technology that reduces storage requirements by identifying and removing redundant data.

 

What data deduplication techniques does DATASTOR Shield™ use?

All of our DATASTOR Shield™ products are built around the same enterprise-class, patent-pending Adaptive Content Factoring™ engine. DATASTOR Shield™ uses three data deduplication techniques to identify and remove redundant data. First, all data is compressed using advanced data compression. Second, a global single-instance-storage (G-SIS) technique removes redundant data at the file level regardless of file name, path, or even server. Third, active files are analyzed at a sub-file level to remove redundancy, which makes processing efficient for large files like PSTs, Exchange EDBs, and SQL databases.
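To make the first two techniques concrete, here is a minimal Python sketch of compression combined with file-level single-instance storage. It is a generic illustration only, not the Adaptive Content Factoring™ engine: the class name `SingleInstanceStore`, the use of SHA-256 and zlib, and the server names and paths are all assumptions made for the example.

```python
import hashlib
import zlib


class SingleInstanceStore:
    """Toy file-level store: keeps one compressed copy per unique content."""

    def __init__(self):
        self.blobs = {}    # content digest -> compressed bytes
        self.catalog = {}  # (server, path) -> content digest

    def put(self, server, path, data):
        """Record a file; return True only if its content was not stored before."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = zlib.compress(data)
        # The catalog entry is always recorded, even for duplicate content.
        self.catalog[(server, path)] = digest
        return is_new

    def get(self, server, path):
        """Return the original bytes for a catalogued file."""
        return zlib.decompress(self.blobs[self.catalog[(server, path)]])


store = SingleInstanceStore()
report = b"quarterly report " * 1000

# Same content under a different name, path, and (hypothetical) server.
first = store.put("srv-a", r"C:\docs\report.doc", report)
second = store.put("srv-b", r"D:\copies\renamed.doc", report)
```

Because the lookup key is derived from the file's content rather than its name or location, the second `put` adds only a catalog entry and no new data.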

 

Can DATASTOR Shield™ replace my existing backup application?

Yes, DATASTOR Shield™ can protect data without any additional third-party backup applications. This is just one more feature that adds to your cost savings.

 

Is my data deduplicated before it touches the network?

Yes, using “source-based” data deduplication, DATASTOR Shield™ removes redundant data on the protected server before any data is transferred across LAN or WAN network connections.

 

Does DATASTOR Shield™ require proprietary hardware?

No, DATASTOR Shield™ is a software-only solution that runs on Microsoft Windows. This means that you have the freedom to choose the best hardware to fit your needs and budget.

 

Can DATASTOR Shield™ protect data stored on non-Windows based servers (Linux, Solaris, HP-UX, etc.)?

Yes, DATASTOR Shield™ can “post-process” data that has been transferred from non-Windows-based servers. For example, a database backup on HP-UX is transferred to the DATASTOR Shield™ server and then deduplicated and stored efficiently by a local protection plan. Currently, DATASTOR Shield™ supports “source-based” data deduplication only on Microsoft Windows.

 

What is the difference between source-based, post-process, and in-line data deduplication?

One important aspect of data deduplication is WHERE the redundant data is processed and removed. “Source-based” products like DATASTOR Shield™ process and remove redundant data on the protected server, before it is transferred across the network. “Post-process” and “in-line” products process data in a central location and store only unique data. “Post-process” also requires extra disk space to cache the data before redundant data is removed.
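The effect of source-based processing on network traffic can be sketched in a few lines of Python. This is a simplified illustration, not DATASTOR Shield™'s actual protocol: the `Server` class, the `source_side_backup` function, the fixed 4096-byte chunk size, and the SHA-256 digests are all assumptions chosen for the example.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size


def split_chunks(data, size=CHUNK_SIZE):
    """Split data into fixed-size pieces."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class Server:
    """Toy central store that holds unique chunks keyed by digest."""

    def __init__(self):
        self.chunks = {}

    def missing(self, digests):
        """Return the digests the server does not hold yet."""
        return {d for d in digests if d not in self.chunks}

    def upload(self, digest, chunk):
        self.chunks[digest] = chunk


def source_side_backup(data, server):
    """Deduplicate on the protected machine; return payload bytes sent."""
    pieces = {hashlib.sha256(c).hexdigest(): c for c in split_chunks(data)}
    sent = 0
    # Only chunks the server has never seen cross the "network".
    for digest in server.missing(pieces.keys()):
        server.upload(digest, pieces[digest])
        sent += len(pieces[digest])
    return sent


server = Server()
# Four distinct chunks on day 1; only the last chunk changes on day 2.
day1 = b"".join(bytes([i]) * CHUNK_SIZE for i in range(4))
day2 = day1[:3 * CHUNK_SIZE] + b"\xff" * CHUNK_SIZE

sent_day1 = source_side_backup(day1, server)
sent_day2 = source_side_backup(day2, server)
```

In this sketch the second run sends one chunk instead of four. A post-process or in-line design would first move all four chunks to the central location and only then discard the three it already holds.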

 

Can iSCSI connected storage be used by DATASTOR Shield™ to store deduplicated data?

Yes, DATASTOR Shield™ uses standard NTFS volumes to store deduplicated data. These NTFS volumes can be internal, iSCSI-connected, or Fibre Channel-connected.

 

Can NAS-connected storage be used by DATASTOR Shield™ to store deduplicated data?

Yes, DATASTOR Shield™ 3.0 and later versions support NAS-connected shares (CIFS/SMB, NFS) to store deduplicated data. These shares do not require NTFS.

 

Does DATASTOR Shield™ install agent software on all the protected computers?

No, the only thing that DATASTOR Shield™ puts on the protected computer is a scheduled task. This scheduled task remotely executes the deduplication process directly from the DATASTOR Shield™ server. This configuration keeps the footprint to a minimum on the protected computer and simplifies future software upgrades because only the DATASTOR Shield™ server must be upgraded. Every scheduled task is still managed centrally through the DATASTOR Shield™ management console.

 

What is the overhead (CPU and memory) of the deduplication process running on the protected server?

The deduplication process running on the protected server uses less than 20 MB of memory. CPU utilization varies with the number and speed of the CPUs. On most modern servers, CPU utilization ranges from 25% to 35% while the plan is running.

 

Can the DATASTOR Shield™ server be a virtual machine?

Yes. Because DATASTOR Shield™ fully distributes the data deduplication process across the protected servers, the overhead on the DATASTOR Shield™ server itself is comparatively low. Note, however, that backend processes, like data expiration and data verification, require more CPU and memory, and they will take longer if the DATASTOR Shield™ server is running in a virtual machine. Storage scalability should also be considered: determine how much storage capacity can be connected to the virtual machine and verify that this meets the needs of your environment.

 

Can DATASTOR Shield™ deduplicate and store virtual machine images (VMDK, VHD, XVA)?

Our best practice is to run a protection plan within the VM as if it were a physical computer. There are several advantages to protecting the computer in this way. If the VM is an application server, our Exchange and SQL support quiesces the system during the protection plan run and integrates log truncation into the process. G-SIS stores the files more efficiently than VM image processing does. You will also see a shorter backup window, allowing for additional plan runs per day. However, if you need to protect VM image files, DATASTOR Shield™ can process the large image files themselves. You can simply store a copy of the virtual machine images directly on the DATASTOR Shield™ server and schedule a local protection plan to efficiently store these images. The original images can be overwritten every day with new images while DATASTOR Shield™ keeps a deduplicated backup history for you.

 

Can DATASTOR Shield™ deduplicate and store Microsoft Exchange storage groups?

Yes, DATASTOR Shield™ supports creating protection plans for Exchange 2003, 2007, 2010, and 2013. Exchange storage groups are backed up by integrating with the Microsoft Exchange VSS Writer found in Windows 2003 or later, which allows the plan to capture a consistent image of Exchange storage groups while they are mounted. After DATASTOR Shield™ has a consistent image, it uses sub-file data deduplication to remove redundant data found in the large EDB files. Every recovery point is a FULL backup, but the disk space used is far less. Exchange protection plans automatically discover storage group file locations, perform integrity checks on all EDB databases, and truncate logs after a successful backup.
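How every recovery point can be a full backup while using far less disk space can be sketched as follows. This is a generic illustration of sub-file deduplication, not the product's actual on-disk format: the `ChunkStore` class, the fixed 1024-byte chunk size, and the SHA-256 digests are assumptions made for the example.

```python
import hashlib

CHUNK_SIZE = 1024  # hypothetical fixed chunk size


class ChunkStore:
    """Toy store: each recovery point is a full list of chunk digests."""

    def __init__(self):
        self.chunks = {}           # digest -> chunk bytes, stored once
        self.recovery_points = []  # one manifest (list of digests) per backup

    def backup(self, data):
        """Store unseen chunks and record a complete manifest for this run."""
        manifest = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)
            manifest.append(digest)
        self.recovery_points.append(manifest)
        return len(self.recovery_points) - 1

    def restore(self, point):
        """Rebuild the whole file from a single recovery point."""
        return b"".join(self.chunks[d] for d in self.recovery_points[point])

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())


store = ChunkStore()
# A stand-in for a large database file: eight distinct chunks.
monday = b"".join(bytes([i]) * CHUNK_SIZE for i in range(8))
# The next day, only the second chunk has changed.
tuesday = monday[:CHUNK_SIZE] + b"\xee" * CHUNK_SIZE + monday[2 * CHUNK_SIZE:]

p0 = store.backup(monday)
p1 = store.backup(tuesday)
```

Either recovery point restores the complete file on its own, yet the store holds nine chunks rather than sixteen, because the seven unchanged chunks are shared between the two manifests.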

 

Can DATASTOR Shield™ deduplicate and store Microsoft SQL databases?

Yes, DATASTOR Shield™ supports creating protection plans for Microsoft SQL Server 2005, 2008, 2008 R2, 2012, and 2014. Databases are backed up by integrating with the SQL VSS Writer found in Windows 2003 or later to capture a consistent image of SQL databases while they are mounted. After DATASTOR Shield™ has a consistent image, it uses sub-file data deduplication to remove redundant data found in the large MDF files. Every recovery point is a FULL backup, but the disk space used is far less. Databases can be configured for either Simple Recovery mode or Full Recovery mode. Databases that use Full Recovery mode can be configured to have their logs truncated after a successful backup.