News/Trends, Tech/Engineering, Product

3 Ways to Dedupe your Duplicated Duplicates

November 14, 2014 Yadin Porter de León

The biggest culprit in the growing costs of data storage and backup is duplicated files. In fact, in some companies, duplicated files account for up to 30% of data that is (re)created. Downloading multiple copies of a document and emailing files to yourself are just a few of the ways this duplication occurs.

It’s hard to predict storage costs if you can’t determine just how much data is going to be created — and this is a top concern for IT. In this video, Druva’s Chief Product Officer, Chandar Venkataraman, dives into the three attributes of inSync’s global deduplication.

Location: Where is the deduplication performed?

Client-side vs. server-side: When deduplication is handled on the client side, you get the same storage savings as server-side deduplication, plus bandwidth savings, because less data is shipped over to the servers.
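To make the bandwidth point concrete, here is a minimal sketch of client-side deduplication. It is an illustration under stated assumptions, not inSync's actual protocol: the `Server` class, the `missing`/`store` calls, and the 4-byte block size are all hypothetical, and the small cost of sending the digests themselves is ignored.

```python
import hashlib


class Server:
    """Hypothetical in-memory backup server that stores blocks by SHA-256 digest."""

    def __init__(self):
        self.blocks = {}

    def missing(self, digests):
        # Tell the client which digests the server has never seen.
        return [d for d in digests if d not in self.blocks]

    def store(self, digest, block):
        self.blocks[digest] = block


def split_fixed(data, size=4):
    # Tiny fixed-size blocks keep the example readable; real blocks are far larger.
    return [data[i:i + size] for i in range(0, len(data), size)]


def client_backup(server, data, size=4):
    """Hash blocks locally, upload only what the server lacks, return bytes sent."""
    by_digest = {hashlib.sha256(b).hexdigest(): b for b in split_fixed(data, size)}
    sent = 0
    for digest in server.missing(list(by_digest)):
        server.store(digest, by_digest[digest])
        sent += len(by_digest[digest])
    return sent


server = Server()
first = client_backup(server, b"AAAABBBBCCCCAAAA")   # 3 unique blocks uploaded
second = client_backup(server, b"AAAABBBBCCCCDDDD")  # only the new block uploaded
print(first, second)  # 12 4
```

With server-side deduplication the client would have shipped all 16 bytes both times and the server would have discarded the repeats afterwards; storage would be the same, but the network would carry 32 bytes instead of 16.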

Logic: At what level is deduplication accomplished?

Deduplication can be done at the file level, at the fixed or variable block level, or in an app-aware way.
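The difference between fixed and variable blocks shows up as soon as data is inserted near the start of a file. The sketch below is a toy, assuming a 3-byte window and a byte-sum checksum as the boundary rule (both chosen only so the result can be checked by hand); production systems typically use a stronger rolling hash, and nothing here is taken from Druva's implementation.

```python
def fixed_chunks(data, size=8):
    # Cut every `size` bytes, regardless of content.
    return [data[i:i + size] for i in range(0, len(data), size)]


def variable_chunks(data, window=3, divisor=8):
    # Cut after any byte where a checksum of the last `window` bytes hits a
    # target value, so boundaries follow the content rather than the offset.
    chunks, start = [], 0
    for i in range(window - 1, len(data)):
        if sum(data[i - window + 1:i + 1]) % divisor == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def shared(old, new):
    # Number of distinct chunks that both versions have in common.
    return len(set(old) & set(new))


original = bytes(range(256))
edited = b"X" + original  # one byte inserted at the very front

fixed_shared = shared(fixed_chunks(original), fixed_chunks(edited))
variable_shared = shared(variable_chunks(original), variable_chunks(edited))
print(fixed_shared, variable_shared)  # 0 31
```

The single inserted byte shifts every fixed block, so none of the 32 original blocks is found again. With content-defined boundaries only the first chunk changes, and the other 31 of 32 chunks are recognized as duplicates.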

App-aware deduplication is the most effective dedupe methodology, because it can identify duplicate attachments across emails, and even match an attachment against the copy already backed up from the folder it came from.
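As a rough illustration of the app-aware idea, the sketch below parses messages the way a mail application would and deduplicates the attachments themselves. Python's standard-library `email` parser stands in for whatever message format the application actually uses; the helper names `make_message` and `unique_attachments` are hypothetical.

```python
import hashlib
from email import message_from_bytes, policy
from email.message import EmailMessage


def make_message(subject, body, attachment):
    # Build a raw message with one binary attachment (test fixture only).
    msg = EmailMessage()
    msg["Subject"] = subject
    msg.set_content(body)
    msg.add_attachment(attachment, maintype="application",
                       subtype="octet-stream", filename="report.bin")
    return msg.as_bytes()


def unique_attachments(raw_messages):
    """Parse each message and keep one copy of every distinct attachment."""
    store = {}
    for raw in raw_messages:
        msg = message_from_bytes(raw, policy=policy.default)
        for part in msg.iter_attachments():
            payload = part.get_payload(decode=True)  # decoded attachment bytes
            store.setdefault(hashlib.sha256(payload).hexdigest(), payload)
    return store


report = bytes(range(256)) * 4
raw = [
    make_message("Q3 report", "See attached.", report),
    make_message("Fwd: Q3 report", "Forwarding this along.", report),
]

store = unique_attachments(raw)
whole_file_match = hashlib.sha256(raw[0]).digest() == hashlib.sha256(raw[1]).digest()
print(len(store), whole_file_match)  # 1 False
```

File-level hashing sees two different messages because the subjects and bodies differ, while looking inside the messages finds that both carry the same attachment and stores it once.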

Scale: Dedupe one device or dedupe coverage for all devices?

It’s important to have deduplication at the global level because that’s when you get the full network effect. Rather than individual silos, a single user could seed other users.
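The effect of scale can be sketched by comparing one dedupe index per user with a single shared index. This is a simplified model with made-up users and block sizes, assuming whole blocks are compared by SHA-256 digest; `stored_bytes` is a hypothetical helper, not a Druva API.

```python
import hashlib


def stored_bytes(backups, global_index):
    """Total bytes kept, given {user: [blocks]} and the chosen index scope."""
    shared_seen = set()
    total = 0
    for blocks in backups.values():
        # A global index is shared by everyone; otherwise each user starts empty.
        seen = shared_seen if global_index else set()
        for block in blocks:
            digest = hashlib.sha256(block).digest()
            if digest not in seen:
                seen.add(digest)
                total += len(block)
    return total


memo = b"M" * 1000  # the same message delivered to every user
backups = {
    "alice": [memo, b"a" * 100],
    "bob": [memo, b"b" * 200],
    "carol": [memo],
}

per_user = stored_bytes(backups, global_index=False)
global_total = stored_bytes(backups, global_index=True)
print(per_user, global_total)  # 3300 1300
```

With per-user indexes the memo is stored three times; with a global index the first user to back it up seeds everyone else, and the gap widens as more users receive the same content.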

Want to learn more? Download our white paper, 8 Must-Have Features for Endpoint Backup.

Data Deduplication for Corporate Endpoints

Video Transcript

Hello, my name is Chandar, from Druva. Today I’m going to be talking about deduplication, which is essentially looking at redundant data and eliminating it. Now different vendors make different claims about their own dedupe approach. I’m going to be talking about all the various approaches there are to dedupe.

The first attribute of deduplication is location. Where exactly is dedupe performed? You could do it on the server side, for example on a Data Domain dedupe storage system, or on the client side. When it is done on the client side, in addition to getting the same storage savings you also get bandwidth savings, because you have to ship less data over the network, and that’s a great thing.

The second most important attribute of dedupe is logic. Let’s start with granularity. At what level is dedupe accomplished? Vendors could do it at a file level, where you look at two different files, identify whether they’re the same, and deduplicate them, or go to the sub-file level, typically the block level.

You could adopt a fixed block approach, or a variable block approach. The variable block is usually more effective in finding duplicate blocks, regardless of where they’re stored. How many times have you inserted a small paragraph in a big Word document, or a small slide in a big PowerPoint? The variable block is going to perform much better, but what’s more interesting is what’s called app-aware dedupe. Imagine looking inside a file just like the application that generated the file looks at it.

What if you could use technology like MAPI to look inside all your messages, so you could demarcate every message very clearly and deduplicate really effectively? If you did that, you could identify duplicate attachments across messages and even dedupe the attachments against where they were actually backed up inside the folder, and that’s very, very effective.

The last and most important attribute of dedupe is scale. Now most vendors have a scale of one, which is dedupe per device or per user, but imagine thousands and thousands of devices, where a single message from your CEO, for example, went to all of your different devices, or where, because of your sharing patterns, the same big document is shared across so many devices.

A per-user dedupe definitely has limitations. What’s most important is performing dedupe at a global level, and that’s when you get the true network effect. Imagine a global dedupe, where every single user seeds every other user. You start to get exponential savings in both storage and bandwidth, and that’s the true power of dedupe.

And the ultimate in scale is, of course, the cloud. The cloud is designed for infinite scale, and you could have a single dedupe index across all of the regions globally. The cloud is also great for a global footprint and a global reach. Now Druva intelligently combines all the different attributes of dedupe to come up with something very, very unique: client-side dedupe, app-aware dedupe, global dedupe, and we put it on the cloud. If you want to learn more, please visit druva.com. Thank you.
