Data Deduplication with Amazon S3 Backups Can Save Money and Reduce Backup Windows
Enterprises are seeking new, cost-effective ways to tackle their data protection challenges. Backup solutions have continued to evolve to improve data protection performance, but they cannot keep pace with the growing amount of data. A new technique, called deduplication, is quickly gaining market interest because it offers companies a disruptive economic advantage: lower storage and bandwidth costs and shorter data recovery times. Data deduplication gives companies the opportunity to dramatically reduce the amount of storage required for backups and to centralize backup data across multiple sites more efficiently for assured disaster recovery.
How does it work?
In a typical backup cycle, all of the data is transmitted across the network to Amazon S3 every time a backup runs. If your backup vendor supports deduplication, the software removes this network burden by determining which data has changed and transferring only that data. The changed data is usually far smaller than the full backup set, so this is much more efficient than transferring and storing the entire backup with every run.
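The idea of sending only changed data can be illustrated with a minimal sketch. It assumes fixed-size chunks identified by their SHA-256 hash, and it uses an in-memory dictionary as a stand-in for the remote store; the function names (split_chunks, plan_upload) and the chunk size are hypothetical choices for illustration, not a description of how any particular vendor implements deduplication.

```python
import hashlib


def split_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def plan_upload(data: bytes, stored_hashes: set[str], chunk_size: int = 4096):
    """Work out what one backup run has to send.

    Returns (manifest, new_chunks):
      manifest   - ordered chunk hashes needed to rebuild this backup
      new_chunks - hash -> bytes for chunks the store does not hold yet
    """
    manifest = []
    new_chunks = {}
    for chunk in split_chunks(data, chunk_size):
        digest = hashlib.sha256(chunk).hexdigest()
        manifest.append(digest)
        if digest not in stored_hashes and digest not in new_chunks:
            new_chunks[digest] = chunk
    return manifest, new_chunks


# In-memory stand-in for the remote store (an assumption for this sketch).
stored: dict[str, bytes] = {}

# Day 1: nothing is stored yet, so every chunk must be uploaded.
day1 = b"".join(i.to_bytes(4, "big") for i in range(4096))  # 16 KiB sample
manifest1, upload1 = plan_upload(day1, set(stored))
stored.update(upload1)

# Day 2: a single byte changes, so only the chunk containing it is new.
day2 = bytearray(day1)
day2[5000] ^= 0xFF
day2 = bytes(day2)
manifest2, upload2 = plan_upload(day2, set(stored))
stored.update(upload2)

# The manifest is enough to rebuild the full day-2 backup from stored chunks.
restored = b"".join(stored[h] for h in manifest2)

print("chunks uploaded on day 1:", len(upload1))
print("chunks uploaded on day 2:", len(upload2))
print("day 2 restores correctly:", restored == day2)
```

Fixed-size chunking keeps the sketch short, but an insertion near the start of a file shifts every later chunk boundary; content-defined chunking is a common way around that.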
Traditional daily backups, as shown above, do not recognize that the content of the data has not changed significantly. Hence, a legacy backup solution makes a full 2GB backup every day. S3SQL, however, determines that the data is unchanged or only slightly changed, and hence transfers only the incremental changes.
Apart from storage and network bandwidth savings, one often-missed benefit is the reduction in backup time and backup window. At an upload speed of about 0.28MB/s (roughly 1GB/hour), a 2GB backup takes 2 hours to upload to Amazon S3. With deduplication, perhaps only 1% of the data needs to be transferred, so each 2GB backup on days 2 through 7 takes less than 2 minutes to complete.
Benefits
The key benefits of a backup solution using deduplication are:
• Reduce S3 storage costs
  • 90% reduction in storage costs at Amazon S3
  • Keeping 10 weekly backups of 10GB each requires 10 tapes, or 10*10=100GB of space at Amazon S3
  • With deduplication, it takes only about 10GB to store 10 weekly backups
  • This is a 10:1 reduction in storage space
• Reduce S3 bandwidth costs
  • Every weekly full backup requires only the changes to be transferred, so far less bandwidth is used than with a 10GB transfer each week
  • This is a 10:1 reduction in bandwidth usage
• Reduce time to backup
  • A 10GB+ backup over a WAN takes 10+ hours without deduplication, which is impractical; with deduplication it is done in minutes instead of hours
• Enable more frequent backups and mitigate data loss
• Enable a highly scalable solution
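The arithmetic behind these figures can be checked with a short calculation. This is a rough sketch that simply reuses the article's round numbers (1GB/hour upload rate, 1% of data changed per backup, 10 weekly backups of 10GB); the variable and function names are hypothetical, and the 10GB deduplicated total is the article's idealized figure, which ignores the small amount of changed data.

```python
def upload_hours(size_gb: float, rate_gb_per_hour: float = 1.0) -> float:
    """Time to upload size_gb at a constant rate, in hours."""
    return size_gb / rate_gb_per_hour


# Backup window: a 2GB daily backup at roughly 1GB/hour.
full_backup_gb = 2.0
changed_fraction = 0.01  # assumed share of data that changes per day

full_hours = upload_hours(full_backup_gb)
dedup_minutes = upload_hours(full_backup_gb * changed_fraction) * 60

# Storage: 10 weekly backups of 10GB each.
weekly_backups = 10
backup_gb = 10.0

storage_without_gb = weekly_backups * backup_gb
storage_with_gb = backup_gb  # idealized: one full copy, changes ignored

ratio = storage_without_gb / storage_with_gb
savings_pct = (1 - storage_with_gb / storage_without_gb) * 100

print(f"full upload: {full_hours:.1f} hours")
print(f"deduplicated upload: {dedup_minutes:.1f} minutes")
print(f"storage: {storage_without_gb:.0f}GB vs {storage_with_gb:.0f}GB")
print(f"reduction: {ratio:.0f}:1 ({savings_pct:.0f}% less storage)")
```

Actual savings depend on how much data really changes between backups, so the 1% change rate should be replaced with a measured value before the numbers are used for planning.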
Although deduplication is fairly common in traditional data storage and backup solutions, it is not yet widely available in solutions built on cloud storage such as Amazon S3. Vendors using Amazon S3 or other online storage in the cloud should use deduplication and compression technologies to greatly reduce bandwidth use, storage costs, and, most importantly, the amount of time needed for backup. It is a compelling technology that not only saves you money and time but also allows the solution to scale better than the traditional solutions available in the market.