Caching – no web application should be without it, but there are many ways to achieve it, and everyone seems to have an opinion on what works and what doesn’t. The introduction of cloud computing platforms such as Windows Azure brings with it new and interesting caching options, so a map to guide you through the new territory is important. Read on to navigate Windows Azure caching for your platform, application and use cases.
Cache Terminology
Before diving in, let’s cover some terminology definitions to clarify the conversation: differences between caching options can be reduced to fine details, but small differences in definition can mean big differences in behaviour.
Cache: For the purpose of this article, a cache is a temporary store of data which is volatile but fast to access. For example, reading a table of data from a database and storing it in a cache allows an application to run faster, because reading from a cache is quicker than querying a relational database and assembling the data from different entities. The data in the cache, however, is not guaranteed to exist.
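As a minimal sketch of this read-through pattern in C# (assuming .NET's built-in `MemoryCache`; the cache key and the data-loading method are hypothetical placeholders):

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.Caching;

public static class ProductCache
{
    // Hypothetical read-through helper: try the cache first, fall back to the database.
    public static List<string> GetProducts()
    {
        var cache = MemoryCache.Default;
        var products = cache.Get("Products") as List<string>;
        if (products == null)
        {
            products = LoadProductsFromDatabase();   // slow, authoritative source
            cache.Set("Products", products, new CacheItemPolicy
            {
                AbsoluteExpiration = DateTimeOffset.UtcNow.AddMinutes(10)
            });
        }
        // The cached copy may be evicted at any time, so callers must tolerate a reload.
        return products;
    }

    private static List<string> LoadProductsFromDatabase()
    {
        // Placeholder for a real database query.
        return new List<string> { "Widget", "Gadget" };
    }
}
```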
In-Process vs Out-of-Process : In-process simply means that the data stored in the cache stays in the same memory space as the application itself. Out-of-process means that the data is stored in a different process from the application. That process could be on the same server, or it could be on another server completely. In-process vs out-of-process is the dividing line that determines whether cached items add to the amount of memory the application process itself is using.
In-Memory vs Out-of-Memory Cache : An in-memory cache is one where the cached data is kept within the running memory of a server (either in-process or out-of-process). In-memory caching is always faster, because retrieving data directly from memory is the fastest way for an application to get it, but memory is also the most expensive place to store data. Out-of-memory caches are those where the data is persisted to some other type of storage – usually disk storage in the form of a file, or a storage table.
Application Cache vs Page Output Cache : As web applications run and serve up web pages to visitors, caching will generally be used heavily. There are two styles of caching within an application – caching data specific to running the application (such as a group of application settings), and caching the output of a web page, where either chunks of a page or its entire HTML output are saved. For a page cache, application logic will generally read the HTML from the cache and send it straight back to the browser. For data stored in the application cache, there is typically more processing to be done before the data is transformed into HTML for the browser to load.
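The contrast can be sketched in a single ASP.NET MVC controller (a hedged example: the `SiteSettings` type and cache key are hypothetical, and Web Forms offers an equivalent `OutputCache` page directive):

```csharp
using System;
using System.Web.Caching;
using System.Web.Mvc;

public class SiteSettings
{
    public string SiteName { get; set; }
}

public class HomeController : Controller
{
    // Page output cache: the rendered HTML of this action is kept for 60 seconds,
    // so repeat requests are served without running the action at all.
    [OutputCache(Duration = 60, VaryByParam = "none")]
    public ActionResult Index()
    {
        // Application cache: raw data is cached, but still has to be turned
        // into HTML by the view on every request that misses the output cache.
        var settings = HttpContext.Cache["SiteSettings"] as SiteSettings;
        if (settings == null)
        {
            settings = new SiteSettings { SiteName = "Example" };   // placeholder for a real load
            HttpContext.Cache.Insert("SiteSettings", settings, null,
                DateTime.UtcNow.AddMinutes(5), Cache.NoSlidingExpiration);
        }
        return View(settings);
    }
}
```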
Serialization/Deserialization : A technical term which refers to transforming a piece of data from its representation in the memory of an application into a format suitable for storage in a cache. Serialization can be done by the programming language/platform, or can be custom-written by a developer. When you write down some notes on a to-do list, you serialize your ‘to-do’ data onto a notepad; when you read it back later, you deserialize that to-do data back into your memory. It’s the same concept for web applications – there are many different serialization formats.
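For example, JSON is one common serialization format; a sketch using the Json.NET library (an assumption made for illustration: any serializer the chosen cache supports will do):

```csharp
using Newtonsoft.Json;   // Json.NET, assumed here purely for illustration

public class TodoItem
{
    public string Task { get; set; }
    public bool Done { get; set; }
}

public static class TodoSerialization
{
    public static string Serialize(TodoItem item)
    {
        // Serialize: turn the in-memory object into text a cache can store,
        // e.g. {"Task":"Write article","Done":false}
        return JsonConvert.SerializeObject(item);
    }

    public static TodoItem Deserialize(string cachedValue)
    {
        // Deserialize: rebuild the object when the value is read back from the cache.
        return JsonConvert.DeserializeObject<TodoItem>(cachedValue);
    }
}
```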
Distributed vs Single Locale : A distributed cache is one where the cache access is possible from different processes running on different servers. The alternative to a distributed cache is one where only a single process on a single server can access the cache. Distributed caches are important when building scalable, flexible Cloud applications.
Other caching options are available by installing caching products on Windows Azure VMs in IaaS, but these are beyond the scope of this article.
Overview : Azure Caching Models
The Azure Cache Service is a distributed, in-memory cache that can easily scale and provide fast access to data for applications running within Azure. A Cache Service is created separately from any specific Azure platform implementation and can be written to and read from different application platforms within Azure. The Cache Service has the advantage of being available to all of the different computing platforms within the Azure environment, so it is ideal for data sharing where an application utilizes a range of Azure technologies. As an example, you may design an application that has Worker Roles processing items from a queue, and an Azure Website that reads results from a database for display to visitors. In this scenario, data stored in the Azure Cache Service would be available to both the Azure Website(s) and the Worker Role(s). An Azure Cache Service is created through the Azure management portal, or via the Azure API, and lives beyond application/server rebuilds and restarts.
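A sketch of what reading and writing the Cache Service looks like from .NET, assuming the Azure Caching client library is referenced and the cache endpoint and access key are configured in web.config (the key naming and trial-status scenario here are purely illustrative):

```csharp
using System;
using Microsoft.ApplicationServer.Caching;   // Azure Caching client library

public static class SharedCache
{
    // The endpoint and access key are assumed to be configured in the
    // <dataCacheClients> section of web.config / app.config.
    private static readonly DataCacheFactory Factory = new DataCacheFactory();
    private static readonly DataCache Cache = Factory.GetDefaultCache();

    public static void SaveStatus(string trialId, string status)
    {
        // Put makes the item visible to every Website, Web Role and Worker Role
        // that shares this cache, with a 30-minute time-to-live.
        Cache.Put("trial:" + trialId, status, TimeSpan.FromMinutes(30));
    }

    public static string GetStatus(string trialId)
    {
        // Get returns null when the item has expired or been evicted.
        return Cache.Get("trial:" + trialId) as string;
    }
}
```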
In the Azure Cloud Services environment, there are two types of application platform – Web Roles and Worker Roles. Web Roles are typically used for delivering websites, and Worker Roles are used for background processing. In-Role caching is caching delivered within this environment, either in the memory space of a web or worker role (co-located role caching), or by creating a worker role whose sole purpose is to hold cached items (dedicated role caching). In-Role caching is automatically distributed, as the cached items are available across the multiple role instances in the same deployment. This means that the cache automatically expands to be available to new role instances when an application is scaled up.
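By way of illustration, the consuming role typically just points its cache client at the role that hosts the cache. A configuration sketch (the role name is a placeholder, and the exact schema should be checked against the Azure Caching documentation):

```xml
<!-- web.config of the role that uses the cache (sketch only) -->
<dataCacheClients>
  <dataCacheClient name="default">
    <!-- Point the client at the role hosting the in-role cache:
         the same role for co-located caching, or the dedicated cache worker role. -->
    <autoDiscover isEnabled="true" identifier="CacheWorkerRole" />
  </dataCacheClient>
</dataCacheClients>
```

From code, the same `DataCacheFactory`/`DataCache` API shown above is then used to read and write items.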
ASP.NET/IIS Cache
Using the ASP.NET/IIS Cache in Azure : If you’re running IIS/ASP.NET, you have access to the in-process cache that is native to ASP.NET. However, if you’re using either Cloud Services Web/Worker Roles or IaaS VMs to host your application, you must run at least two instances to qualify for an uptime SLA. This means that any application built on these platforms is naturally a distributed application, and so you must implement some type of cache (or application) synchronization so that the two or more instances communicate data changes to each other. This is because the native Windows Azure load balancing does not have server affinity – there is no guarantee that each subsequent page requested by a visitor will be processed and returned by the same server each time.
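One minimal synchronization approach is to stamp the shared data with a version number that every instance can read (from a database or Azure Storage, for example) and reload the local in-process copy whenever the stamp changes. A hedged sketch, in which the version lookup and settings loader are hypothetical placeholders:

```csharp
using System;
using System.Web;
using System.Web.Caching;

public class AppSettings { public string SiteName { get; set; } }

public static class SettingsCache
{
    public static AppSettings Get()
    {
        var cache = HttpRuntime.Cache;
        long sharedVersion = ReadSharedVersion();               // hypothetical shared-store lookup
        var entry = cache["Settings"] as Tuple<long, AppSettings>;

        // Reload when this instance has no copy, or its copy belongs to an older version.
        if (entry == null || entry.Item1 != sharedVersion)
        {
            entry = Tuple.Create(sharedVersion, LoadSettingsFromDatabase());
            cache.Insert("Settings", entry, null,
                         DateTime.UtcNow.AddMinutes(5), Cache.NoSlidingExpiration);
        }
        return entry.Item2;
    }

    private static long ReadSharedVersion() { return 1; }                                 // placeholder
    private static AppSettings LoadSettingsFromDatabase() { return new AppSettings(); }   // placeholder
}
```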
Out-of-memory custom caching
Azure provides multiple types of persistent storage which can be leveraged for caching. Azure Table Storage is a simple key/value store that can hold very large volumes of data with fast access speeds. In addition, Azure SQL Database can be used, but the performance and size limits are more restrictive than Table Storage, and there is a higher likelihood of being throttled if excessive requests are made during periods of high usage. Finally, there is the option of writing cache files out to Windows Azure Blob Storage, or to the temporary disk locations available in Windows Azure VMs. Using Azure Storage has the extra capability of being geographically distributed, but this is only suitable for cached data where changes are infrequent and cache freshness is not an issue. In all of these cases the caching solution must be coded manually for each custom project.
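As one example of such hand-rolled caching, here is a sketch of a simple Table Storage-backed cache using the Azure Storage client library (the table name, entity shape and expiry scheme are all choices made for this illustration, since Table Storage has no built-in expiration):

```csharp
using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

// A cache entry stored in Azure Table Storage; the entity carries its own expiry stamp.
public class CacheEntry : TableEntity
{
    public CacheEntry() { }
    public CacheEntry(string key, string value, DateTime expiresUtc)
    {
        PartitionKey = "cache";
        RowKey = key;
        Value = value;
        ExpiresUtc = expiresUtc;
    }
    public string Value { get; set; }
    public DateTime ExpiresUtc { get; set; }
}

public class TableStorageCache
{
    private readonly CloudTable _table;

    public TableStorageCache(string connectionString)
    {
        var account = CloudStorageAccount.Parse(connectionString);
        _table = account.CreateCloudTableClient().GetTableReference("cache");
        _table.CreateIfNotExists();
    }

    public void Set(string key, string value, TimeSpan timeToLive)
    {
        var entry = new CacheEntry(key, value, DateTime.UtcNow.Add(timeToLive));
        _table.Execute(TableOperation.InsertOrReplace(entry));
    }

    public string Get(string key)
    {
        var result = _table.Execute(TableOperation.Retrieve<CacheEntry>("cache", key));
        var entry = result.Result as CacheEntry;
        // Expired or missing entries look the same to the caller: a cache miss.
        return (entry != null && entry.ExpiresUtc > DateTime.UtcNow) ? entry.Value : null;
    }
}
```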
| Caching Approach | Characteristics | Advantages | Disadvantages | Comments |
| --- | --- | --- | --- | --- |
| Azure Cache Service | Out-of-process, in-memory, distributed | Accessible by all types of Azure technologies. Scalable. | Extra cost for the service. | For building high-performance, scalable cloud systems. |
| Azure In-Role Caching | Out-of-process, in-memory, distributed | Can run in the memory space of existing Roles. | Only available to Web/Worker Roles within Cloud Services. | Suitable for adaptation to existing applications migrated to Azure. |
| ASP.NET/IIS Cache | In-process, in-memory, single-locale | Runs in the memory space of the existing application. | Requires cache synchronization to work with scalable Azure applications. | Will require adaptation to use distributed caching when used with Azure load balancing. |
| Azure SQL Database | Out-of-process, out-of-memory, distributed | Data is persistently stored in a database table. | Requires custom coding to serialize data, and for management such as expiration. | Slower than in-memory caches; may be throttled in periods of high requests. |
| Azure Table Storage | Out-of-process, out-of-memory, distributed | Data is persistently stored in Azure Table Storage. High performance for very large datasets; cost-effective. | Requires custom code. | Slower than in-memory, but faster than SQL. May be throttled in periods of high requests. |
| Azure Blob Storage | Out-of-process, out-of-memory, distributed | Data is persistently stored in Azure Blob Storage. Can leverage geo-redundant persistent storage. | Requires custom code. | Slower than in-memory, but easy to make highly distributed. |
Real-life implementations of different Azure caching models
At DNN, our Cloud Services environment runs on Windows Azure and leverages many different types of caching to deliver a range of products and internal services.
Azure Cache Service
DNN uses this as part of the provisioning process for delivering Evoq trials. Free trials are available for Evoq Social and Evoq Content, two of the commercial solutions offered by DNN. The free trials are built with Windows Azure Pack and interface with Worker Roles in the Cloud Services environment for provisioning. The provisioning service collects customer details (name, email, product type) from the signup pages, and then allocates a pre-provisioned Windows Azure Website with Evoq Social or Evoq Content already installed and ready to go. The Azure Cache Service is leveraged to provide high-performance provisioning between the different Azure environments. This allows DNN to deliver a new, personalized trial to a customer in less than 30 seconds. You can try this process for yourself by signing up for an Evoq Social trial or Evoq Content trial – you’ll be running various Azure technologies together to go from ‘Go’ to a new website in 30 seconds or less.
Azure In-Role Caching
The Evoq in the Cloud product runs on the Azure Cloud Services platform, and delivers high-performance Content and/or Social websites with dedicated Web Roles. These are scalable, high-availability, customizable websites built on the DNN Platform. The DNN Platform comes with a standard caching provider that uses file-based storage, but this is not distributed. For the Windows Azure implementation, In-Role caching has been leveraged by building a caching provider that snaps into the DNN modular architecture. This gives high-performance, out-of-process, distributed caching between scaled Web Roles running DNN.
IIS/ASP.NET caching
The Evoq trials previously mentioned run on Azure Websites technology. This means they use a single-instance model of deployment – trials are expected to be fast but do not need to be scalable. For this reason, they can use the standard in-process caching provider delivered with the DNN Platform. In-process caching works well for this type of Windows Azure Website, because there is no scaling involved. Deploying DNN on a scaled-out version of Windows Azure Websites would require use of the Cache Service or another custom solution.
Windows Azure Caching - Summary
This article covered the different types of application caching available on the Windows Azure platform. When building applications on a cloud platform such as Windows Azure, it is important to take advantage of the scalability and flexibility of the underlying technologies. Obtaining the best results from the cloud involves building distributed, scalable applications. Building these types of applications always necessitates a careful caching strategy. The information in this article will help with the decision process when planning or reviewing a Windows Azure based application.
This article was originally posted at CodeProject: Understanding Windows Azure Caching