Combine BigMemory and Terracotta Server Array for Performance and Scalability
The age of Big Data is upon us. With ever-expanding data sets and the requirement to minimize latency as much as possible, you need a solution that offers the best in reliability and performance. Terracotta's BigMemory caters to the world of big data, giving your application access to literally terabytes of data, in memory, with the highest performance and controlled latency.
At Terracotta, while working with customers and testing our releases, we continuously experiment with huge amounts of data. This blog illustrates how we were blown away by the results of a test using BigMemory and the Terracotta Server Array (TSA) to cluster and distribute data over multiple computers.
Test Configuration
Our goal was to take four large servers, each with 1TB of RAM, and push them to the limit with BigMemory in terms of data set size, as well as performance while reading and writing to this data set. As shown in Figure 1, all four servers were configured using the Terracotta Server Array to act as one distributed but uniform data cache.
Figure 1 - Terracotta Server Array with a ~4TB BigMemory Cache
We then configured 16 instances of BigMemory across the four servers in the Terracotta Server Array, where each server had 952GB of BigMemory in-memory cache allocated. This left enough free RAM available to the OS on each server. With the Terracotta Server Array, you can configure large data caches with high availability, with your choice of data striping and/or mirroring across the servers in the array (for more information, read http://terracotta.org/documentation/terracotta-server-array/configuration-guide or watch this video: http://blip.tv/terracotta/terracotta-server-array-striping-2865283). The end result was 3.8 terabytes of BigMemory available to our sample application for its in-memory data needs.
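For readers who want a feel for what the application side of this looks like, here is a minimal sketch using the Ehcache API commonly used alongside BigMemory. The host names, port, cache name, and settings below are illustrative assumptions, not our actual test configuration, and the per-server off-heap (BigMemory) sizes are set in the servers' own configuration rather than in this client code.

```java
import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.config.CacheConfiguration;
import net.sf.ehcache.config.Configuration;
import net.sf.ehcache.config.TerracottaClientConfiguration;
import net.sf.ehcache.config.TerracottaConfiguration;

public class TsaClientSetup {

    public static Cache clusteredCache() {
        // Point this application node at the Terracotta Server Array.
        // Host names and the 9510 port are placeholders for the real stripe addresses.
        Configuration configuration = new Configuration()
                .terracotta(new TerracottaClientConfiguration()
                        .url("server1:9510,server2:9510,server3:9510,server4:9510"))
                .cache(new CacheConfiguration("offheap-test", 0)    // 0 = no local on-heap entry limit
                        .eternal(true)                              // entries never expire in this test
                        .terracotta(new TerracottaConfiguration())); // keep the data in the TSA

        CacheManager cacheManager = CacheManager.create(configuration);
        return cacheManager.getCache("offheap-test");
    }
}
```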
Next, we ran 16 instances of our test application, each on its own server, to load data and then perform read and write operations. Additionally, the Terracotta Developer Console (see Figure 2) makes it quick and simple to view and test your in-memory data performance while your application is running. Note that we could have configured BigMemory on these servers as well, thereby forming a Level 1 (L1) cache layer for even lower latency access to hot-set data. However, in this blog we decided to focus on the near-linear scalability of BigMemory as configured on the four stand-alone servers. We’ll cover L1 and L2 cache hierarchies and hot-set data in a future blog.
Figure 2 - The developer console helps configure your application's in-memory data topology.
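As a hedged sketch of the L1 idea mentioned above (something we deliberately did not enable for this test), the same clustered cache could be given a local off-heap tier on each application node so that hot-set reads avoid a network hop. The flag, size, and JVM setting below are illustrative assumptions.

```java
import net.sf.ehcache.config.CacheConfiguration;
import net.sf.ehcache.config.MemoryUnit;
import net.sf.ehcache.config.TerracottaConfiguration;

public class L1TierSketch {

    // The application JVM would also need enough direct memory for the local
    // off-heap store, e.g. -XX:MaxDirectMemorySize=70g (illustrative value).
    static CacheConfiguration withLocalBigMemory() {
        return new CacheConfiguration("offheap-test", 0)
                .eternal(true)
                .overflowToOffHeap(true)                          // local off-heap (L1) tier on the app node...
                .maxBytesLocalOffHeap(64, MemoryUnit.GIGABYTES)   // ...sized here as an illustrative 64 GB...
                .terracotta(new TerracottaConfiguration());       // ...still backed by the TSA (L2)
    }
}
```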
Data Set Size (2 billion entries; 4TB total)
Now that we had our 3.8 terabyte BigMemory up and running, we loaded close to 4 terabytes of data into the Terracotta Server Array at a rate of about 450 gigabytes per hour. The sample objects loaded were data arrays, each 1700 bytes in size. To be precise, we loaded 2 billion of these entries with 200GB left over for key space.
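The loading itself was done by our internal Offheap-test harness (referenced in the table below). Purely to illustrate the shape of that warm-up phase, here is a minimal multi-threaded loader sketch that puts 1700-byte arrays into the cache; the key ranges, key type, and partitioning across application nodes are assumptions.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import net.sf.ehcache.Cache;
import net.sf.ehcache.Element;

public class WarmupLoaderSketch {

    private static final int THREADS = 100;      // matches the warm-up thread count used in the test
    private static final int VALUE_SIZE = 1700;  // bytes per entry, as in the test

    // Each of the 16 application nodes would load its own disjoint key range.
    public static void load(final Cache cache, final long firstKey, final long lastKey) {
        ExecutorService pool = Executors.newFixedThreadPool(THREADS);
        final long chunk = (lastKey - firstKey + 1) / THREADS;
        for (int t = 0; t < THREADS; t++) {
            final long start = firstKey + t * chunk;
            final long end = (t == THREADS - 1) ? lastKey + 1 : start + chunk;
            pool.submit(new Runnable() {
                public void run() {
                    for (long key = start; key < end; key++) {
                        cache.put(new Element(key, new byte[VALUE_SIZE])); // simple byte[] payload
                    }
                }
            });
        }
        pool.shutdown();
    }
}
```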
The table below outlines the data set configuration, as well as the test used:
Test Configuration
Test | Offheap-test
svn url |
Element # | 2 billion
Value Size | 1700 bytes (simple byte arrays)
Terracotta Server # | 16
Stripes # | 16 (1 Terracotta server per mirror group)
Application Node # | 16
Terracotta Server Heap | 3 GB
Application Node Heap | 2 GB
Cache Warmup Threads | 100
Test Threads | 100
Write % | 10 (90% reads)
Terracotta Server BigMemory | 238 GB per stripe; 3.8 TB total
The Test Results
As mentioned above, we ran 16 instances of our test application, each loading data at a rate of 4,000 transactions per second (tps) per server, reaching a total of 64,000 tps. At this rate, we were limited mostly by our network: 64,000 tps at our sample value size translates to around 110 MB per second, which is almost 1 gigabit per second (our network's theoretical maximum). Figure 3 graphs the average latency measured while loading the data.
Figure 3 - Average latency, in milliseconds, during the load phase.
The test phase consisted of operations distributed as 10% writes and 90% reads on randomly accessed keys over our data set of 2 billion entries. The table below summarizes the incredible performance, measured in tps, of our sample application running with BigMemory (a minimal sketch of this access pattern follows the results table).
Test Results
Warmup Avg TPS | 4,000 per application node (64,000 total)
Warmup Avg Latency | 24 ms
Test Avg TPS | 4,122 per application node (~66,000 total)
Test Avg Latency | 23 ms
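To make that access pattern concrete, here is a minimal sketch (again, not the actual Offheap-test harness) of a 10% write / 90% read loop over randomly chosen keys; the key-space constant and key type are assumptions carried over from the loader sketch above.

```java
import java.util.Random;

import net.sf.ehcache.Cache;
import net.sf.ehcache.Element;

public class MixedWorkloadSketch {

    private static final long KEY_SPACE = 2000000000L; // 2 billion entries loaded during warm-up
    private static final int VALUE_SIZE = 1700;
    private static final double WRITE_RATIO = 0.10;    // 10% writes, 90% reads

    public static void run(Cache cache, long operations) {
        Random random = new Random();
        for (long i = 0; i < operations; i++) {
            long key = (long) (random.nextDouble() * KEY_SPACE); // uniformly random key over the data set
            if (random.nextDouble() < WRITE_RATIO) {
                cache.put(new Element(key, new byte[VALUE_SIZE])); // overwrite an existing entry
            } else {
                Element element = cache.get(key);                  // read; the entry lives in the TSA
                if (element != null) {
                    element.getObjectValue();                      // touch the value
                }
            }
        }
    }
}
```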
We reached this level of performance with only four physical servers, each with 1 terabyte of RAM. With more servers and more RAM, we can easily scale up to 100 terabytes of data, almost linearly. This performance and scalability simply wouldn’t be possible without BigMemory and the Terracotta Server Array.
Check out the following links for more information:
· Terracotta Server Array: http://terracotta.org/documentation/terracotta-server-array/introduction
· Terracotta BigMemory: http://terracotta.org/documentation/bigmemory/overview
6 comments:
What would happen if you increased the write/read ratio to a more real-world workload instead of 10/90?
Can the off-heap allocator keep up?
Can you elaborate on the avg latency? How much time on average does an individual *read* and *write* operation take once the caches are primed? This was not clear from the blog post.
Yes, we ran some tests with an increasing write ratio (keeping the number of keys the same), effectively increasing the number of updates. Performance stayed almost flat.
Are you talking about increasing the number of keys over time too?
Average latency is the average across all transactions, both reads and writes.
The average latency for reads was around 25 ms, while writes were around 20 ms.
Reads have higher latency because we hit the Terracotta server for each entry. This can be overcome by using BigMemory at the application level too.
Did you actually work with a project that has that much of a data requirement?
@Java Geek Yeah, we have a number of customers that need that much BigMemory. We have tested up to 7.5 TB of BigMemory for them.