Tech/Engineering

Efficient Concurrent Reads with Read-Through LRU Cache

Aditya Supugade, Sr. Staff Software Engineer

Introduction

The Druva Daintree storage SDK (Photon) provides a cached read API for targeted (point) reads of a file's data within a particular snapshot. This cached read API uses an LRU (Least Recently Used) cache optimized for efficient data retrieval.

As we set out to build cached read APIs, there were a few essential requirements to address:

  1. Low response times

  2. Minimize network requests

To meet these requirements, an LRU cache is used. The cache helps reduce latency, optimize performance for read-intensive tasks, and save costs.

Additionally, the cached read API can be called by any number of threads in parallel.

If you have used an LRU cache in a setting where parallel requests are frequent, you have likely run into the issue of multiple concurrent requests for the same data all missing the cache, leading to every thread making its own network call to retrieve it.

In a typical scenario using an LRU cache, data retrieval proceeds as follows:

[Figure: Cache Workflow]

When a cache get request is initiated and the requested data is present in the cache, it is returned immediately, which keeps response times low. If the data is not present in the cache, a function call is made to fetch it from the server. Let's look at how this is managed efficiently using inflight requests and careful orchestration of data flow.
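A minimal sketch of this read-through path is shown below. It assumes the community lru crate for the cache and a hypothetical fetch_from_server function standing in for the actual network call; neither is part of Photon's API.

    use std::num::NonZeroUsize;
    use lru::LruCache;

    // Placeholder for the actual network call made on a cache miss.
    fn fetch_from_server(key: &str) -> Result<Vec<u8>, String> {
        Ok(format!("data for {key}").into_bytes())
    }

    // Read-through get: serve from the cache on a hit; otherwise fetch from
    // the server, populate the cache, and return the data.
    fn cached_get(cache: &mut LruCache<String, Vec<u8>>, key: &str) -> Result<Vec<u8>, String> {
        if let Some(data) = cache.get(key) {
            return Ok(data.clone()); // cache hit: no network call
        }
        let data = fetch_from_server(key)?; // cache miss: go to the server
        cache.put(key.to_string(), data.clone());
        Ok(data)
    }

    fn main() {
        let mut cache = LruCache::new(NonZeroUsize::new(128).unwrap());
        let first = cached_get(&mut cache, "/snapshot-1/file.bin").unwrap();  // miss
        let second = cached_get(&mut cache, "/snapshot-1/file.bin").unwrap(); // hit
        assert_eq!(first, second);
    }

This naive version works for a single caller; the sections below describe what happens when many threads miss on the same key at once.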

Managing Inflight Requests

The inflight requests dictionary serves as a tracking mechanism. When a cache miss happens, this dictionary is consulted. If another worker is already fetching the requested data, the new request is not immediately routed to the server. Instead, the requesting worker is placed in a waiting state, expecting the data to arrive shortly.

This setup relies on Rust synchronization primitives: once the data is available, the fetching worker communicates it back to all waiting workers simultaneously.
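A minimal sketch of such an inflight map using standard-library primitives is shown below. The type names (Inflight, InflightMap) and the shape of the entries are illustrative, not Photon's actual types: each in-flight key gets a slot for the eventual result plus a Condvar so the fetching worker can wake every waiter at once.

    use std::collections::HashMap;
    use std::sync::{Arc, Condvar, Mutex};

    type Key = String;
    type Value = Vec<u8>;
    type FetchResult = Result<Value, String>;

    // One entry per key currently being fetched: a slot for the result plus a
    // Condvar that the fetching worker uses to wake all waiters at once.
    struct Inflight {
        slot: Mutex<Option<FetchResult>>,
        ready: Condvar,
    }

    // The inflight requests dictionary shared by all workers.
    type InflightMap = Mutex<HashMap<Key, Arc<Inflight>>>;

    // Consulted on a cache miss. Returns the entry for the key plus a flag that
    // is true if this caller just registered it (and so must do the fetch), or
    // false if another worker is already fetching and the caller should wait.
    fn join_or_register(map: &InflightMap, key: &Key) -> (Arc<Inflight>, bool) {
        let mut guard = map.lock().unwrap();
        if let Some(existing) = guard.get(key) {
            return (existing.clone(), false); // someone else is fetching; just wait
        }
        let entry = Arc::new(Inflight {
            slot: Mutex::new(None),
            ready: Condvar::new(),
        });
        guard.insert(key.clone(), entry.clone());
        (entry, true) // the caller became the fetcher
    }

    // What a waiting worker does: block until the fetcher publishes a result.
    fn wait_for(entry: &Inflight) -> FetchResult {
        let mut slot = entry.slot.lock().unwrap();
        while slot.is_none() {
            slot = entry.ready.wait(slot).unwrap();
        }
        (*slot).clone().unwrap()
    }

An async implementation could use a per-key channel instead of a Condvar; the idea is the same.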

Efficient Orchestration

Once an entry is made in the inflight requests dictionary, the fetching process begins. The first worker initiates a call to the server to retrieve the requested data. Meanwhile, any other workers requesting the same data remain blocked, waiting for the first worker to make it available.

Upon successful retrieval of the data, the first worker will (a sketch of this completion path follows the list):

  1. Populate the cache, ensuring that subsequent requests for the same data can be handled from the cache.

  2. Remove the entry from the inflight requests dictionary and publish the data so that all waiting workers receive it.

  3. Return the data to the request that initiated the fetch.
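Continuing the sketch above (same illustrative types, with fetch_from_server again standing in for the server call), the completion path of the fetching worker might look like this:

    // Path taken by the worker that registered the inflight entry.
    fn complete_fetch(
        cache: &Mutex<HashMap<Key, Value>>, // plain map as a stand-in for the LRU cache
        map: &InflightMap,
        entry: &Inflight,
        key: &Key,
    ) -> FetchResult {
        let result = fetch_from_server(key);

        // 1. On success, populate the cache so subsequent requests for the
        //    same data are served locally.
        if let Ok(data) = &result {
            cache.lock().unwrap().insert(key.clone(), data.clone());
        }

        // 2. Remove the inflight entry and publish the result (data or error)
        //    so that every waiting worker wakes up with it at once.
        map.lock().unwrap().remove(key);
        *entry.slot.lock().unwrap() = Some(result.clone());
        entry.ready.notify_all();

        // 3. Return the data to the request that initiated the fetch.
        result
    }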

Handling Failures

In the event of a failure during the data-fetching process, a fail-safe mechanism comes into action: if the function call to the server encounters an error, all waiting workers are promptly notified with the error information rather than being left blocked. This ensures that errors are handled gracefully.
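Putting the pieces of the sketch together, the whole miss path might look like this. Because the fetcher publishes an error through the same slot it would have used for the data, a failed server call unblocks every waiter promptly instead of leaving them hanging or triggering duplicate calls:

    // The miss path as seen by any worker: whichever worker registers the
    // inflight entry becomes the fetcher; everyone else waits on its result.
    fn read_on_miss(
        cache: &Mutex<HashMap<Key, Value>>,
        map: &InflightMap,
        key: &Key,
    ) -> FetchResult {
        let (entry, is_fetcher) = join_or_register(map, key);
        if is_fetcher {
            complete_fetch(cache, map, &entry, key) // fetch, populate, notify waiters
        } else {
            wait_for(&entry) // receives Ok(data) or Err(error), whichever was published
        }
    }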

[Figure: Cached Read Workflow]

File Stats Cache

File stats are required to determine the storage block size for a file. Since many requests come in for a single file, its stats are cached.

File Stats Cache Design

The key for the file stats cache is the file path, and the value is the stats for that file, which include the block size.
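A small sketch of the shape of this cache is shown below; the FileStats type is illustrative, since the real stats structure carries more than the block size.

    use std::num::NonZeroUsize;
    use lru::LruCache;

    // Illustrative value type: the actual file stats carry more fields.
    struct FileStats {
        block_size: u64,
    }

    // File stats cache: key is the file path, value is the stats for that file.
    fn new_file_stats_cache(capacity: usize) -> LruCache<String, FileStats> {
        LruCache::new(NonZeroUsize::new(capacity).expect("capacity must be non-zero"))
    }

    // Look up the block size for a path, if its stats are cached.
    fn cached_block_size(cache: &mut LruCache<String, FileStats>, path: &str) -> Option<u64> {
        cache.get(path).map(|stats| stats.block_size)
    }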

Meta Cache

Metadata can be cached to take advantage of the principle of locality. The request for metadata can be extended (balancing API response time against the cache benefit) to fetch the blocks of metadata that immediately follow. This avoids some API calls and improves API latency.

Meta Cache Design

Fetching metadata from the server is handled in batches.

The meta cache key is a combination of a path and an offset, where the offset is a multiple of the meta batch size. The value is the metadata of the chunks covering one meta batch starting at that offset.

When a read request is initiated for a particular offset and size and its metadata is not present in the cache, the system aligns the storage request offset down to the nearest multiple of the meta batch size and sets the request size to the meta batch size, ensuring efficient and streamlined metadata retrieval from the server.
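A sketch of this alignment is shown below; the batch size constant and the key shape are assumptions for illustration only.

    // Illustrative batch size: metadata covering META_BATCH_SIZE bytes of file
    // data is fetched from the server and cached as a single entry.
    const META_BATCH_SIZE: u64 = 8 * 1024 * 1024;

    // Meta cache key: the file path plus the read offset aligned down to the
    // nearest multiple of the batch size.
    fn meta_cache_key(path: &str, read_offset: u64) -> (String, u64) {
        let aligned_offset = read_offset - (read_offset % META_BATCH_SIZE);
        (path.to_string(), aligned_offset)
    }

    // On a miss, the server request covers the whole batch starting at the
    // aligned offset, so reads at neighboring offsets hit the cache afterwards.
    fn meta_fetch_range(read_offset: u64) -> (u64, u64) {
        let aligned_offset = read_offset - (read_offset % META_BATCH_SIZE);
        (aligned_offset, META_BATCH_SIZE)
    }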

Block Cache

Block caching is needed to handle the discrepancy in block sizes. Linux operates with 4 KB blocks, and IOs will be issued with this block size, while Druva storage operates with a larger block size (for example, 1 MB). It is beneficial to cache these large blocks as they are downloaded, since subsequent IO read requests (for 4 KB blocks) will potentially find a cache hit. Another use case for the block cache is read-ahead. When an application intends to consume multiple blocks serially, it can request read-ahead. These pre-fetched blocks stay in the cache, and requests for them are served quickly because the cache is populated up-front.

Block Cache Design

The block cache key is the block id, which is unique for each block, and the value is the block data.
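A sketch of how a small application read maps onto a larger cached storage block is shown below. The block size, the plain map standing in for the LRU cache, and the way the block id is derived from the offset are all illustrative; in practice the block id comes from the file's metadata.

    use std::collections::HashMap;

    // Illustrative storage block size: Druva storage uses a larger block (for
    // example, 1 MB) than the 4 KB blocks in which the OS issues IO.
    const STORAGE_BLOCK_SIZE: u64 = 1024 * 1024;

    type BlockId = u64;

    // Block cache: key is the block id, value is the block data.
    type BlockCache = HashMap<BlockId, Vec<u8>>;

    // Serve a small (e.g. 4 KB) read out of a cached storage block if present.
    // The read is assumed to fit within a single storage block.
    fn read_from_block_cache(cache: &BlockCache, file_offset: u64, len: usize) -> Option<Vec<u8>> {
        let block_id: BlockId = file_offset / STORAGE_BLOCK_SIZE;
        let within = (file_offset % STORAGE_BLOCK_SIZE) as usize;
        cache
            .get(&block_id)
            .map(|block| block[within..within + len].to_vec())
    }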


Use Cases

The cached read API is used in the following workloads in Druva:

  1. File level restore for Hybrid workloads (AHV, VMware)

  2. Data restore for stream workloads (SAP HANA)

Conclusion

The Druva Daintree storage SDK (Photon) leverages advanced caching techniques, including the Least Recently Used (LRU) cache, to optimize data retrieval processes effectively. By integrating robust mechanisms such as inflight request management and synchronized data handling, the system ensures high performance, reduced latency, and minimized network requests, even in high-concurrency scenarios.

The cached read API's architecture embodies Druva's commitment to innovative solutions that prioritize reliability and performance, making it a critical component for delivering seamless and cost-effective data management.