This document discusses data caching and its evolution. It covers the reasons for application caching, such as improving response times and offloading read load from databases. It describes the evolution from do-it-yourself (DIY) local and distributed caching using key-value stores, through DIY data grids, to automated dynamic caching solutions. Automated dynamic caching solutions cache query results, keep cached data from going stale through real-time invalidation, and manage the cache so that hot data stays in memory. These solutions require minimal configuration and automatically recognize query patterns and cache dependencies.
Data Caching Evolution - the SafePeak deck from webcast 2014-04-24
2. Reasons for Data Caching
Data caching is a key component of a high-performance, scalable application-database architecture.
Reasons for application caching:
• Improve response times: reduce read latency, because results are already in the in-memory cache
• Offload the "read load" from database servers (CPU & I/O)
• Improve overall system scalability and operational efficiency
3. Evolution of Data Caching
1. DIY Local Caching
2. DIY Distributed Caching: Key-Value (KV)
3. DIY Data Grids: Distributed KV + complex computing
4. Automated Dynamic Caching: automated performance with full data coherency
5. DIY Distributed Caching
1. A big hash table:
• A key/value store, living entirely in RAM on multiple servers
• Accessed from any server as if the item is in the local RAM
2. Keys and values are arbitrary binary (serializable) entities: query record-sets, file blobs, application variables
3. Basic operations are put(K,V), get(K), replace(K,V), remove(K)
4. Non-transactional, not a database, no interface with the real database
5. Cache can go stale: keeping it in step with database changes takes extra effort
6. Development oriented (no automation), medium to long dev project
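To make the "big hash table on multiple servers" idea concrete, here is a minimal single-process sketch in Python. It is an illustration only, not Memcached's implementation: the class name `ShardedCache` is hypothetical, each server is simulated by a plain dict, and key placement uses a simple hash-modulo rule chosen for brevity.

```python
import hashlib
import pickle


class ShardedCache:
    """Toy stand-in for a distributed KV cache: each dict plays one server's RAM."""

    def __init__(self, server_names):
        self.servers = {name: {} for name in server_names}
        self.names = sorted(server_names)

    def _shard(self, key):
        # Hash the key so every client picks the same server for it.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        name = self.names[int.from_bytes(digest[:4], "big") % len(self.names)]
        return self.servers[name]

    def put(self, key, value):
        # Values are stored serialized, as they would be sent over the wire.
        self._shard(key)[key] = pickle.dumps(value)

    def get(self, key):
        blob = self._shard(key).get(key)
        return None if blob is None else pickle.loads(blob)

    def replace(self, key, value):
        # Only overwrite an existing entry; report whether anything changed.
        shard = self._shard(key)
        if key not in shard:
            return False
        shard[key] = pickle.dumps(value)
        return True

    def remove(self, key):
        return self._shard(key).pop(key, None) is not None


cache = ShardedCache(["10.0.0.10:11211", "10.0.0.11:11211"])
cache.put("user:42", {"name": "Ann"})
cache.put("flag", 0)  # falsy values are still valid cache entries
```

Nothing in this sketch knows about the database, which is why point 5 above applies: the application has to call `replace` or `remove` itself whenever the underlying data changes.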
6. DIY Distributed Caching (Memcached example)
1. Install the Memcached service on servers (or even PCs)
2. Init a Memcached client with a list of your pre-configured Memcached servers:
# perl example
use Cache::Memcached;
use Digest::MD5 qw(md5_hex);
my $memclient = Cache::Memcached->new({ servers => [ '10.0.0.10:11211', '10.0.0.11:11211' ] });
3. Wrapping an SQL query
# Assumes $dbh is an already connected DBI database handle.
sub get_user_rows {
    my ($user_id) = @_;
    # Define a query and use it as part of the Memcached key:
    my $sql = "SELECT * FROM user WHERE user_id = ?";
    my $key = 'SQL:' . $user_id . ':' . md5_hex($sql);
    # We check if the value is 'defined' (i.e. in cache), since '0' or 'FALSE' can be legitimate values!
    my $result = $memclient->get($key);
    return $result if defined $result;
    # Query not in cache: get the result set from the database server as an array of rows
    my $rows_array = $dbh->selectall_arrayref($sql, { Slice => {} }, $user_id);
    # Cache it for five minutes
    $memclient->set($key, $rows_array, 5 * 60);
    return $rows_array;
}
4. Delete from cache
sub delete_user_rows {
    my ($user_id) = @_;
    # Rebuild the same key that was used when caching, then delete it
    my $sql = "SELECT * FROM user WHERE user_id = ?";
    my $key = 'SQL:' . $user_id . ':' . md5_hex($sql);
    $memclient->delete($key);
}
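The slide shows the read path and the delete call, but not when the delete has to happen. The Python sketch below shows one possible arrangement under stated assumptions: a dict stands in for the Memcached client, an in-memory SQLite database stands in for the database server, and the helper names (`read_user`, `rename_user`, `cache_key`) and the `name` column are hypothetical.

```python
import hashlib
import sqlite3
import time

SQL = "SELECT * FROM user WHERE user_id = ?"
TTL_SECONDS = 5 * 60

cache = {}  # key -> (expires_at, rows); stand-in for a Memcached client


def cache_key(user_id):
    return "SQL:%s:%s" % (user_id, hashlib.md5(SQL.encode("utf-8")).hexdigest())


def read_user(db, user_id):
    key = cache_key(user_id)
    entry = cache.get(key)
    if entry is not None and entry[0] > time.time():
        return entry[1]  # cache hit
    rows = db.execute(SQL, (user_id,)).fetchall()
    cache[key] = (time.time() + TTL_SECONDS, rows)
    return rows


def rename_user(db, user_id, new_name):
    db.execute("UPDATE user SET name = ? WHERE user_id = ?", (new_name, user_id))
    db.commit()
    # Step 4 from the slide: drop the cached copy so the next read hits the database.
    cache.pop(cache_key(user_id), None)


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE user (user_id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO user VALUES (1, 'Ann')")
db.commit()

first = read_user(db, 1)     # miss: loads from the database
second = read_user(db, 1)    # hit: served from the cache
rename_user(db, 1, "Bob")    # the write invalidates the cached entry
third = read_user(db, 1)     # miss again: sees the new name
```

Every write path in the application needs the same treatment. If one is missed, readers see the old row until the five-minute expiry, which is the staleness problem this DIY approach leaves to the developer.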
7. DIY Data Grids: Distributed In-Memory Computing and Data Management
In-Memory Data Grids (IMDG): Manage data in distributed memory and perform complex in-memory computing tasks.
[Figure: GridGain chart example]
8. DIY Data Grids: Caching Concepts
1. Reliable in-memory data storage, live data, and job execution with load balancing among grid nodes. Implements:
• KV cache interface, plus indexed search by values
• Reliable distributed locks interface
• Event processing: data change notifications and management
• Advanced use cases: messaging, Map-Reduce calculations, cluster-wide singleton, and more
2. Very high read-write performance
3. Development oriented (no automation), a long dev project
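Two of the grid features listed above, indexed search by values and data change notifications, can be sketched in a few lines of Python. This is a single-process toy and not GridGain's API: the class `MiniGrid` and its methods are hypothetical, and distribution, locking, and Map-Reduce are left out entirely.

```python
from collections import defaultdict


class MiniGrid:
    """Single-process toy: KV store + one secondary index + change listeners."""

    def __init__(self, index_field):
        self.index_field = index_field
        self.data = {}                   # key -> record (a dict)
        self.index = defaultdict(set)    # field value -> set of keys
        self.listeners = []              # callables taking (event, key, record)

    def subscribe(self, listener):
        self.listeners.append(listener)

    def _notify(self, event, key, record):
        for listener in self.listeners:
            listener(event, key, record)

    def put(self, key, record):
        self.remove(key, notify=False)   # drop any old index entry first
        self.data[key] = record
        self.index[record[self.index_field]].add(key)
        self._notify("put", key, record)

    def remove(self, key, notify=True):
        record = self.data.pop(key, None)
        if record is None:
            return
        self.index[record[self.index_field]].discard(key)
        if notify:
            self._notify("remove", key, record)

    def find_by_value(self, value):
        # Indexed search: no scan over all records.
        return sorted(self.index.get(value, set()))


events = []
grid = MiniGrid(index_field="city")
grid.subscribe(lambda event, key, record: events.append((event, key)))
grid.put("u1", {"name": "Ann", "city": "Oslo"})
grid.put("u2", {"name": "Bob", "city": "Oslo"})
grid.put("u1", {"name": "Ann", "city": "Rome"})  # moves u1 to another index bucket
```

Even in this reduced form, the application owns the index definition and the listeners, which is the "development oriented" cost noted in point 3.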
9. Automated Dynamic Caching
1. Application agnostic: supports custom and 3rd-party applications
2. Caching of any read-based queries and stored procedures
3. Never-stale cache. Writes trigger:
• Real-time eviction/invalidation of the right cache items
• Real-time database synchronization
4. Efficient cache management
• Keeps "hot data" in memory
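Points 3 and 4 can be illustrated together: a write evicts exactly the cached results that depend on the written table, and a size bound keeps only recently used ("hot") results. The Python sketch below is an assumption-laden illustration, not the product's mechanism: the class `QueryCache` is hypothetical, the caller supplies each query's table dependencies by hand, and least-recently-used eviction is just one possible hot-data policy.

```python
from collections import OrderedDict


class QueryCache:
    """Toy result cache: evicts by table on writes, and by LRU when full."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = OrderedDict()  # sql text -> (tables read, result)

    def get(self, sql):
        if sql not in self.entries:
            return None
        self.entries.move_to_end(sql)  # mark as recently used ("hot")
        return self.entries[sql][1]

    def put(self, sql, tables, result):
        self.entries[sql] = (frozenset(tables), result)
        self.entries.move_to_end(sql)
        while len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # drop the coldest entry

    def on_write(self, table):
        # A write to `table` evicts every cached result that read from it.
        stale = [sql for sql, (tables, _) in self.entries.items() if table in tables]
        for sql in stale:
            del self.entries[sql]
        return len(stale)


cache = QueryCache(max_entries=2)
cache.put("SELECT * FROM orders", {"orders"}, ["o1"])
cache.put("SELECT * FROM users", {"users"}, ["u1"])
cache.get("SELECT * FROM orders")                    # orders is now the hottest entry
cache.put("SELECT * FROM items", {"items"}, ["i1"])  # cache is full: evicts the users result
evicted = cache.on_write("orders")                   # a write to orders evicts its result
```

Table-level invalidation is deliberately coarse here: any write to `orders` evicts every result that read from `orders`, which trades some cache hits for a simple guarantee that no evicted-too-late result is served.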
10. Automated Dynamic Caching
• Query results return in microseconds (avg: 100 µs = 0.000100 s).
• Ensures data delivery and high availability for SQL Server applications.
• Safeguards against unpredictable traffic spikes and usage surges.
11. SafePeak Software for SQL Server apps: Automated Dynamic Caching
1. Caching concept: a combination of cache and in-memory database, with smart caching of hot-data results and real-time eviction on DML statements
2. Minimal configuration and management. Self-learning and self-adaptive:
• DB schema: object definitions and a dependency map between objects
• Recognizes SQL patterns and their table/view dependencies, which are automatically activated for caching and real-time cache eviction
• Self-adaptive to schema and application changes
3. 100% ACID, 100% data integrity, 100% up-to-date database
4. Fast: µs access to in-memory hot data, 10x-1000x faster
5. Safe: embedded failover proxy & optional cluster deployment
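As a rough idea of what "recognizing SQL patterns and their table dependencies" can mean, the Python sketch below replaces literals with placeholders, so that queries differing only in their values share one pattern, and then guesses the referenced tables. It is not SafePeak's algorithm: the function names are hypothetical, and the regular expressions are a deliberate simplification that a real SQL parser would replace.

```python
import re

# String literals (with '' escapes) or standalone numbers.
LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")
# Identifier following a keyword that introduces a table name.
TABLE = re.compile(r"\b(?:FROM|JOIN|INTO|UPDATE)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def sql_pattern(sql):
    """Replace literals with '?' and collapse whitespace."""
    return " ".join(LITERAL.sub("?", sql).split())


def tables_in(sql):
    """Naive guess at referenced tables; a real parser is needed in general."""
    return {name.lower() for name in TABLE.findall(sql)}


q1 = "SELECT * FROM users WHERE id = 42"
q2 = "SELECT *  FROM users WHERE id = 7"
same_pattern = sql_pattern(q1) == sql_pattern(q2)

join_tables = tables_in("SELECT u.name FROM users u JOIN orders o ON o.user_id = u.id")
write_tables = tables_in("UPDATE Orders SET total = 5 WHERE id = 1")
```

With a pattern-to-tables map like this, a write whose tables overlap a cached read pattern's tables marks that pattern's results for eviction. Views and stored procedures, which the slide also mentions, would need the schema dependency map and are not handled here.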
12. Results: Large Mobile Telecom Customer, E-Commerce Application
• 9X faster DB
• Aberdeen Group research: effect of accelerating e-commerce response time from 6 to 5 sec. on sales conversions, customer satisfaction, and page views
• Chart values: 12.5%, 7.5%, 19%
13. Case Study: Tier 1 SW Vendor, SharePoint Application
[Chart: SQL Server CPU Time (%), with CPU overload marked]
• 1/10 CPU load
• 1/10 I/O load
• No more overload
[Chart: Application Time (sec)]
• Cache hit = 75%
• Acceleration: 70% - 350%
14. Real Time Performance Analysis & Control
• Intuitive browser-based interface
• Manage cache on database, table and query level
• Automated SQL patterns detection and analysis
• Real-time & statistical business info on production queries
• Production deployment in just HOURS
• DB speed = 98 ms
• Cache speed = 0.15 ms
• New average = 3.8 ms
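The three latency figures above can be related by simple arithmetic. The Python sketch below assumes that the new average is a plain hit-ratio-weighted mean of cache and database latency; the slide does not state how the average was computed, so the resulting hit ratio is an inference, not a reported number.

```python
def average_latency_ms(hit_ratio, cache_ms, db_ms):
    """Mean latency when a fraction `hit_ratio` of reads is served from cache."""
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * db_ms


def implied_hit_ratio(average_ms, cache_ms, db_ms):
    """Solve the formula above for the hit ratio."""
    return (db_ms - average_ms) / (db_ms - cache_ms)


ratio = implied_hit_ratio(average_ms=3.8, cache_ms=0.15, db_ms=98.0)
print(f"implied hit ratio: {ratio:.1%}")
```

Under that assumption the figures correspond to roughly 96% of reads being served from the cache. The formula also shows why the average is so sensitive to misses: each percentage point of misses adds close to a full millisecond when the database takes 98 ms.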
15. Better, Faster, Cost-Effective
| | SafePeak: Dynamic Caching | App coding, Caching and DB Tuning | Hardware Scale-Up |
| Virtualization Optimized | Yes | No | No |
| Deployment Time | Hours | Months | Weeks |
| Application Agnostic | Yes | No | Yes |
| Hidden High Cost Factors | None | Dev + DBA time; 3rd party apps limitations | Hardware; SW licenses; IT + DBA time |
| Cost | Low | High | High |
“SafePeak’s unique plug and play, automated solution, provides customers
an immediate improvement of application performance and response time. A key
differentiator vis-à-vis existing solutions.” Massimo Pezzini, VP & Fellow - Gartner