Video Transcript
Hello, my name is Chandar, from Druva. Today I’m going to be talking about deduplication, which is essentially finding redundant data and eliminating it. Now, different vendors make different claims about their own dedupe approach, so I’m going to walk through the various approaches there are to dedupe.
The first attribute of deduplication is location: where exactly is dedupe performed? You could do it on the server side, for example on a Data Domain dedupe storage system, or on the client side. When it’s done on the client side, in addition to getting the same storage savings, you also get bandwidth savings, because you ship less data over the network, and that’s a great thing.
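To make the bandwidth point concrete, here is a minimal sketch of client-side dedupe, assuming the client hashes each chunk and asks the server which hashes it lacks before uploading anything. The `DedupeServer` class, its `missing` and `upload` methods, and `client_backup` are hypothetical names for illustration only; they are not Druva's API or protocol.

```python
import hashlib


class DedupeServer:
    """Hypothetical in-memory stand-in for a backup server."""

    def __init__(self):
        self.store = {}  # digest -> chunk bytes

    def missing(self, digests):
        # Report which digests the server has never seen.
        return [d for d in digests if d not in self.store]

    def upload(self, digest, chunk):
        self.store[digest] = chunk


def client_backup(server, chunks):
    """Ship only the chunks the server lacks; return the bytes sent."""
    by_digest = {hashlib.sha256(c).hexdigest(): c for c in chunks}
    sent = 0
    for digest in server.missing(list(by_digest)):
        server.upload(digest, by_digest[digest])
        sent += len(by_digest[digest])
    return sent


server = DedupeServer()
first = client_backup(server, [b"aaaa", b"bbbb", b"aaaa"])  # two unique chunks
second = client_backup(server, [b"aaaa", b"cccc"])          # only b"cccc" is new
print(first, second)  # 8 4
```

Only the digests cross the network for chunks the server already holds, which is where the bandwidth saving comes from in this sketch.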
The second attribute of dedupe is logic. Let’s start with granularity: at what level is dedupe accomplished? Vendors could do it at the file level, where you look at two different files, identify whether they’re the same, and de-duplicate them, or they could go to the sub-file level, typically the block level.
You could adopt a fixed-block approach or a variable-block approach. The variable-block approach is usually more effective at finding duplicate blocks, regardless of where they sit in the file. How many times have you inserted a small paragraph into a big Word document, or a small slide into a big PowerPoint? In those cases variable block is going to perform much better. But what’s more interesting is what’s called app-aware dedupe. Imagine looking inside a file just the way the application that generated the file looks at it.
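The fixed-versus-variable difference can be sketched in a few lines. This is an illustrative example, not any vendor's algorithm: it assumes a simple polynomial rolling hash over a 16-byte window and cuts a chunk wherever the low bits of the hash are zero, so boundaries follow the content rather than the byte offset. The constants and the names `fixed_chunks`, `variable_chunks`, and `shared` are all made up for the example.

```python
import hashlib

WINDOW = 16                       # bytes covered by the rolling hash
BASE = 257
MOD = (1 << 31) - 1
MASK = 0x3F                       # cut roughly once per 64 candidate positions
POW = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window


def fixed_chunks(data, size=64):
    return [data[i:i + size] for i in range(0, len(data), size)]


def variable_chunks(data):
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        if i >= WINDOW:
            # Drop the byte that just slid out of the window.
            h = (h - data[i - WINDOW] * POW) % MOD
        h = (h * BASE + byte) % MOD
        # Cut where the content says so, with a minimum chunk length.
        if i + 1 - start >= WINDOW and (h & MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def shared(chunker, old, new):
    """Count chunks of `new` that already exist among the chunks of `old`."""
    old_ids = {hashlib.sha256(c).digest() for c in chunker(old)}
    return sum(hashlib.sha256(c).digest() in old_ids for c in chunker(new))


# A 4 KB pseudo-random "document", then the same document with a small insert.
doc = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(128))
edited = doc[:100] + b"INSERT!" + doc[100:]

print("fixed   :", shared(fixed_chunks, doc, edited))
print("variable:", shared(variable_chunks, doc, edited))
```

With fixed blocks, every block after the insert shifts by seven bytes and stops matching. With content-defined boundaries, the chunker falls back into step shortly after the edit, so most chunks are found again.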
What if you could use a technology like MAPI to look inside all your messages, so you could demarcate every message very clearly and de-duplicate really effectively? If you did that, you could identify duplicate attachments across messages, and even dedupe those attachments against the copies that were already backed up from inside a folder, and that’s very, very effective.
The last and most important attribute of dedupe is scale. Now, most vendors have a scale of one, which is dedupe per device or per user. But imagine thousands and thousands of devices, where a single message from your CEO, for example, went to all of your different devices, or where, because of your sharing patterns, the same big document is shared across many devices.
Per-user dedupe definitely has limitations. What’s most important is performing dedupe at a global level, and that’s when you get the true network effect. Imagine a global dedupe, where every single user seeds every other user. You start to get exponential savings in both storage and bandwidth, and that’s the true power of dedupe.
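As a rough illustration of that scale point, the sketch below compares a separate dedupe index per user with one shared index across all users. The `stored_bytes` function and the sample data are hypothetical and only count bytes; a real system would also have to handle chunking, encryption, and index lookups.

```python
import hashlib


def stored_bytes(backups, global_index):
    """backups maps a user name to a list of chunks (bytes)."""
    shared_seen = set()
    total = 0
    for user, chunks in backups.items():
        # One index for everyone, or a fresh index for each user.
        seen = shared_seen if global_index else set()
        for chunk in chunks:
            digest = hashlib.sha256(chunk).digest()
            if digest not in seen:
                seen.add(digest)
                total += len(chunk)
    return total


memo = b"CEO memo " * 100  # the same 900-byte message on every device
backups = {
    "alice": [memo, b"alice notes"],
    "bob": [memo, b"bob notes"],
    "carol": [memo, memo],
}

print(stored_bytes(backups, global_index=False))  # 2720
print(stored_bytes(backups, global_index=True))   # 920
```

With per-user indexes the memo is stored once per user. With a single index it is stored once in total, and each additional user who holds it adds nothing.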
And the ultimate in scale is, of course, the cloud. The cloud is designed for infinite scale, and you could have a single dedupe index across all of the regions globally; the cloud is great for a global footprint and a global reach. Now, Druva intelligently combines all the different attributes of dedupe to come up with something very, very unique: client-side dedupe, app-aware dedupe, global dedupe, and we put it in the cloud. If you want to learn more, please visit druva.com. Thank you.