Friday, February 20, 2009

Initial Lassen Performance Results

Below are the test results for the 8-node Tomcat cluster with a Terracotta server. This is the output of the masterRun.sh script.
Analyzing Jmeter logs ...
==============================
502 Error = 0
503 Error = 0
DEV-1929 = 0
//This includes network latency, etc.
Response Time
==============================
Average JMeter response time [Last Page/e1s31] = 144.109 ms
Average JMeter response time = 98.2516 ms

//This is server latency printed by the apache tomcat servlet engine
Server Latency Stats
=====================
Average = 9.90304 ms
Minimum = 0 ms
Maximum = 5851 ms
Std. Deviation = 40.5956
Total # of Requests = 2704000
Nodes = 8

L2 Configuration
==============================
TC_OPTS=-Xms4g -Xmx4g -XX:+DisableExplicitGC -XX:-TraceClassUnloading -XX:TargetSurvivorRatio=90 -Xss128k -XX:+AggressiveHeap -Dcom.tc.l2.objectmanager.fault.logging.enabled=true -Dcom.tc.l2.cachemanager.logging.enabled=true -Dcom.tc.l2.berkeleydb.je.maxMemoryPercent=15 -XX:SurvivorRatio=12

L1 Configuration
==============================
CATALINA_OPTS=-Duse.async.processing=true -Dasync.concurrency=5 -Dcom.tc.hibernate.useFineGrainedLocking=false -Dcom.tc.l1.cachemanager.enabled=false -Duse.pojoizer=true -Xms1024m -Xmx1024m -server -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -verbose:gc -XX:TargetSurvivorRatio=90 -Xss128k -XX:NewSize=512m -XX:MaxNewSize=512m

// The async TIM is installed, so the exam results are saved to the database asynchronously.
MySQL DB (perf28) Entries in EXAM_RESULT
=========================================
COUNT(*)
4547

MySQL DB (perf28) results completion time
==========================================
min(START_TIME) max(END_TIME) completionsecs
2009-02-20 04:40:12 2009-02-20 06:21:41 6089

Object DB Size
==============================
13G /export1/bench/perfTests/terracotta/server/data/objectdb

L2 VerboseGC log Analysis

GC Report
================
Avg Full GC Duration = 5.50871 secs
Avg Full GC Interval = 247.74 secs
Total Full GC Time = 137.718 secs

No. of Full GC = 25
Avg Young GC Duration = 0.230732 secs
Avg Young GC Interval = 25.8137 secs
No. of Young GC = 246
Total Young GC Time = 56.76 secs

Avg Young GC Occurence b/w Full GC = 9.84
Avg Young GC Time b/w Full GC = 2.2704 secs
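
The two "b/w Full GC" figures above appear to be simple ratios of the other numbers in the report. A minimal sketch of that arithmetic, assuming this is how the analysis script derives them (the function name gc_derived is made up for illustration):

```shell
# Recompute the per-full-GC ratios from the raw counts and totals in the report.
gc_derived() {
    # $1 = number of full GCs, $2 = number of young GCs, $3 = total young GC time (secs)
    awk -v full="$1" -v young="$2" -v ytime="$3" 'BEGIN {
        printf "young GCs per full GC = %.2f\n", young / full
        printf "young GC time per full GC = %.4f secs\n", ytime / full
    }'
}

# Values taken from the GC report above
gc_derived 25 246 56.76
```

With the values from this run it prints 9.84 and 2.2704 secs, matching the report.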

The page response time is less than 10 ms.

The response times above include servlet initialization time. All the servlets (and the application) initialize when the first request hits the Tomcat server, so the first responses can be fairly slow.

Five minutes after the test run completed:

mysql -u root -h perf28 -e 'SELECT COUNT(*) FROM exam.EXAM_RESULT'
+----------+
| COUNT(*) |
+----------+
| 9600 |
+----------+

Lassen Performance Tuning

The performance tuning parameters need to be applied on both the Tomcat servers and the HTTP server.

Tomcat Server Tuning

Jasper Configuration

Jasper, the Tomcat JSP compiler, has a recommended configuration for production (or rather, non-development) environments. The settings are described below and are made in web.xml.

http://tomcat.apache.org/tomcat-6.0-doc/jasper-howto.html#Production%20Configuration



<servlet>
    <servlet-name>jsp</servlet-name>
    <servlet-class>org.apache.jasper.servlet.JspServlet</servlet-class>
    <init-param>
        <param-name>fork</param-name>
        <param-value>false</param-value>
    </init-param>
    <init-param>
        <param-name>xpoweredBy</param-name>
        <param-value>false</param-value>
    </init-param>

    <!-- development: Is Jasper used in development mode? If true,
         the frequency at which JSPs are checked for modification may
         be specified via the modificationTestInterval parameter. [true] -->
    <init-param>
        <param-name>development</param-name>
        <param-value>false</param-value>
    </init-param>

    <!-- mappedfile: Should we generate static content with one print
         statement per input line, to ease debugging? [true] -->
    <init-param>
        <param-name>mappedfile</param-name>
        <param-value>false</param-value>
    </init-param>

    <!-- modificationTestInterval: Causes a JSP (and its dependent files)
         to not be checked for modification during the specified time
         interval (in seconds) from the last time the JSP was checked for
         modification. A value of 0 will cause the JSP to be checked on
         every access. Used in development mode only. [4] -->
    <init-param>
        <param-name>modificationTestInterval</param-name>
        <param-value>3600</param-value>
    </init-param>

    <!-- genStrAsCharArray: Should text strings be generated as char
         arrays, to improve performance in some cases? [false] -->
    <init-param>
        <param-name>genStrAsCharArray</param-name>
        <param-value>true</param-value>
    </init-param>

    <load-on-startup>3</load-on-startup>
</servlet>


Server Configuration

The following changes were made to the Tomcat connector configuration in server.xml.

Reference: http://tomcat.apache.org/tomcat-6.0-doc/config/http.html

<Connector port="8080" protocol="HTTP/1.1"
enableLookups="false"
maxThreads="500"
minSpareThreads="100"
maxKeepAliveRequests="1"
acceptCount="200"
connectionTimeout="20000"
redirectPort="8443" />

maxThreads

This is the maximum number of request processing threads to be created by this Connector, which therefore determines the maximum number of simultaneous requests that can be handled. It was set to 500 because requests were being rejected with a "connection refused" error when the load applied on the server was about 400 users. The think time applied in the test should reduce the concurrency on the server, but at certain points a few connections were still being kept alive, which happens when the keep-alive timeout is too high.

maxThreads was later increased to 1200 to match the number of users per node. This wasn't strictly necessary, but it was done to address the 40 min problem, so that connections are not refused by Tomcat. Since maxThreads was increased, minSpareThreads was also increased to 500.

maxKeepAliveRequests

The maximum number of HTTP requests which can be pipelined until the connection is closed by the server. Setting this attribute to 1 will disable HTTP/1.0 keep-alive, as well as HTTP/1.1 keep-alive and pipelining. Setting this to -1 will allow an unlimited amount of pipelined or keep-alive HTTP requests. If not specified, this attribute is set to 100. There is a redirect for each request, so keep-alive would normally be a good thing: each redirected request is handled by the same thread.

Even so, keep-alive on the Tomcat server has been disabled, which probably solves the 502 Bad Proxy Gateway error.

acceptCount

The maximum queue length for incoming connection requests when all possible request processing threads are in use. Any requests received when the queue is full will be refused. The default value is 10. It is set high here (200) so that no requests are refused by the server.

MySQL Server tuning

The limit on connections made to the MySQL server is specified by the max_connections parameter. The default value is 100, so with a connection pool size of 100 on each app server the limit is quickly exceeded and the app servers throw "com.mysql.jdbc.exceptions.MySQLNonTransientConnectionException: Too many connections".

max_connections was first set to 800 and eventually increased to 3300 (200 per node for 16 nodes, plus 100 extra).
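
The sizing rule above can be written down as a small calculation. This is only a sketch: the function name is made up, and the node count of 16 is an assumption taken from the 16-node cluster used in these tests.

```shell
# Size max_connections as (pool size per node x number of nodes) + headroom.
max_connections_for() {
    # $1 = app server nodes, $2 = JDBC pool size per node, $3 = spare connections
    echo $(( $1 * $2 + $3 ))
}

limit=$(max_connections_for 16 200 100)
# Print the statement that would apply the limit on a running server
echo "SET GLOBAL max_connections = ${limit};"
```

The printed statement can be run through the mysql client. A value set with SET GLOBAL is lost when the server restarts, so it would normally also go into the [mysqld] section of my.cnf.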

Load Balancer Tuning

The number of connections was increased from the default to handle the load.

<IfModule worker.c>
ServerLimit 50

#initial number of server processes to start
#http://httpd.apache.org/docs/2.2/mod/mpm_common.html#startservers
StartServers 3

#minimum number of worker threads which are kept spare
#http://httpd.apache.org/docs/2.2/mod/mpm_common.html#minsparethreads
MinSpareThreads 5000

#maximum number of worker threads which are kept spare
#http://httpd.apache.org/docs/2.2/mod/mpm_common.html#maxsparethreads
MaxSpareThreads 5000

#upper limit on the configurable number of threads per child process
#http://httpd.apache.org/docs/2.2/mod/mpm_common.html#threadlimit
ThreadLimit 200

#maximum number of simultaneous client connections
#http://httpd.apache.org/docs/2.2/mod/mpm_common.html#maxclients
MaxClients 10000

#number of worker threads created by each child process
#http://httpd.apache.org/docs/2.2/mod/mpm_common.html#threadsperchild
ThreadsPerChild 200

#maximum number of requests a server process serves
#http://httpd.apache.org/docs/2.2/mod/mpm_common.html#maxrequestsperchild
MaxRequestsPerChild 100000
</IfModule>
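
With the worker MPM, MaxClients cannot exceed ServerLimit x ThreadsPerChild, and ThreadsPerChild cannot exceed ThreadLimit. A small sanity check for the values above might look like this (the function name check_worker_mpm is hypothetical, and the check is an illustration rather than part of the test framework):

```shell
# Check that the worker MPM numbers are consistent with each other.
check_worker_mpm() {
    # $1 = ServerLimit, $2 = ThreadLimit, $3 = ThreadsPerChild, $4 = MaxClients
    if [ "$3" -gt "$2" ]; then
        echo "ThreadsPerChild $3 exceeds ThreadLimit $2"
        return 1
    fi
    capacity=$(( $1 * $3 ))
    if [ "$4" -gt "$capacity" ]; then
        echo "MaxClients $4 exceeds ServerLimit x ThreadsPerChild = $capacity"
        return 1
    fi
    echo "OK: MaxClients $4 fits within $1 x $3 = $capacity threads"
}

# Values from the configuration above
check_worker_mpm 50 200 200 10000
```

For this configuration the limit is hit exactly: 50 processes x 200 threads = 10000 clients.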

Linux tuning

We observed "too many open files" errors on the Linux machines. The hard limit for the "open files" parameter shown by ulimit -aH was set too low. We increased the ulimit on all the Linux machines to the max (65536). Since every socket uses a file descriptor, this number also limits the number of sockets that a process can open.

java.net.SocketException: Too many open files in system
at java.net.Socket.createImpl(Socket.java:388)
at java.net.Socket.<init>(Socket.java:362)
at java.net.Socket.<init>(Socket.java:240)
at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
at org.apache.jmeter.protocol.http.sampler.HTTPSampler2.sample(HTTPSampler2.java:838)
at org.apache.jmeter.protocol.http.sampler.HTTPSamplerBase.sample(HTTPSamplerBase.java:1021)
at org.apache.jmeter.protocol.http.sampler.HTTPSamplerBase.sample(HTTPSamplerBase.java:1007)
at org.apache.jmeter.threads.JMeterThread.run(JMeterThread.java:290)
at java.lang.Thread.run(Thread.java:619)
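
Before a run, the limits can be checked on each machine with something like the sketch below. The helper name limit_ok and the threshold handling are illustrative; only ulimit itself comes from the setup described above.

```shell
# Succeed if an open-files limit meets the required minimum.
limit_ok() {
    # $1 = limit as reported by ulimit ("unlimited" or a number), $2 = required minimum
    [ "$1" = "unlimited" ] || [ "$1" -ge "$2" ]
}

hard=$(ulimit -Hn)   # hard limit on open files
soft=$(ulimit -Sn)   # soft limit on open files (the one actually enforced)
echo "open files: soft=$soft hard=$hard"
if limit_ok "$hard" 65536; then
    echo "hard limit is sufficient"
else
    echo "hard limit is below 65536"
fi
```

On most Linux distributions a permanent change is made in /etc/security/limits.conf; the exact mechanism used on these machines is not recorded here.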

Monitoring Lassen Performance Runs

Here is a brief description of the logs.

1. server.<server_name>.log [ Dir:LassenPerfFramework/logs ]
This contains the startup logs for the Terracotta servers. If there are problems starting the Terracotta servers, check these logs first.

2. TCserver_GC.<server_name>.log [ Dir:LassenPerfFramework/logs ]
This contains the verbose GC output for the Terracotta servers. If you want to experiment with different JVM GC settings, these logs contain the GC information.

3. log*.jtl [ Dir:LassenPerfFramework/logs ]
This contains the output from the JMeter test run. Every HTTP response from the cluster is logged here. It does NOT contain the response body for successful responses (HTTP response code 302 or 200). For errors/exceptions (HTTP response code 500, 502, 503, 404, etc.), JMeter logs the response from the server, which helps in diagnosing problems.

Each line contains the response time for that sample and response code from the server.
<httpSample t="5" lt="5" ts="1184177284608" s="true" lb="/examinator/" rc="200" rm="OK" tn="Thread Group 1-1" dt="text"/>
t="5" : 5 ms is the response time for this sample
lb="/examinator/" : The URL hit by the HTTP request
rc="200" : HTTP response code from the server
tn="Thread Group 1-1" : The thread group and thread that issued the request
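
As an illustration of how these attributes can be used, the sketch below averages the t values of a log file. It assumes the XML format shown above with one httpSample element per line; the function name is made up and this is not the framework's own analysis script.

```shell
# Average the t="..." attribute over all <httpSample> lines of a JTL file.
jtl_avg_response() {
    # $1 = path to a JMeter XML log (log*.jtl)
    grep '<httpSample ' "$1" |
        sed -n 's/.* t="\([0-9][0-9]*\)".*/\1/p' |
        awk '{ sum += $1; n++ }
             END { if (n) printf "%.2f ms over %d samples\n", sum / n, n }'
}
```

A call such as jtl_avg_response LassenPerfFramework/logs/log1.jtl (file name hypothetical) prints one line with the mean response time and the sample count.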

4. GC*.log [ Dir:LassenPerfFramework/logs ]
This contains JMeter's output showing how many threads have started, plus the JVM verbose GC output if enabled.

5. responseTime.log [ Dir: ~/perfTests/tomcat/logs ]
This contains the output from the ResponseTimeTrackingValve installed on the Tomcat servers. It prints the average, maximum, and minimum server latency per 1000 requests. This doesn't include network latency.

6. catalina.out [ Dir: ~/perfTests/tomcat/logs ]
These are the logs from the Tomcat servers. Any tomcat related errors will be logged in these logs.

Monday, January 5, 2009

Lassen Performance Testing Framework

1. Overview

Currently, the framework does the following:

  1. Reads the properties file
  2. Checks the machine availability, aborts if any other java process is found running
  3. Starts the Terracotta server
  4. Starts the tomcat servers
  5. Checks that each server is started
  6. Starts the examinator load test using JMeter
  7. Collects the results

SVN URL : https://svn.terracotta.org/repo/forge/projects/exam-perf-test/LassenPerfFramework

2. lassen_perf.properties

The following properties file is used to configure the testing environment. The perf## entries are machine hostnames.

#################### Notes #########################
- Avoid using "
- keep these scripts on a shared drive
####################################################

BASE_DIR = /shares/perf/hsingh/lassen_perf
JAVA_HOME = /usr/java/default/

# apache tomcat
# CATALINA_HOME : The installation path for apache tomcat
# CATALINA_OPTS : Set JVM arguments for tomcat servers

CATALINA_HOME = /shares/perf/hsingh/apache-tomcat-6.0.18
CATALINA_OPTS = -Duse.async.processing=true -Dasync.concurrency=5 -Dcom.tc.l1.cachemanager.enabled=false -Duse.pojoizer=false -XX:NewSize=512m -XX:MaxNewSize=512m -Xms1024m -Xmx1024m -server -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -verbose:gc

# terracotta server
# TC_CONFIG_PATH : The path to tc-config.xml file for terracotta server
# TC_HOME : The installation path for Terracotta server
# TC_OPTS : Set JVM arguments for terracotta server here

TC_CONFIG_PATH = /shares/perf/hsingh/examinator/branches/tc-2.7/tc-config.xml
TC_HOME = /shares/perf/hsingh/terracotta-co/2.7/code/base/build/dist/terracotta-2.7.2-snapshot/
TC_OPTS = -Xms1g -Xmx1g -Dcom.tc.l2.objectmanager.fault.logging.enabled=true -Dcom.tc.l2.cachemanager.logging.enabled=true -Dcom.tc.l2.berkeleydb.je.maxMemoryPercent=15

# Jmeter
# JMETER_DIR : The installation path for Jmeter
# JMETER_NODES : Set machines to be used to run JMeter. Syntax: <hostname>:<number>, e.g. perf01:3. Each JMeter consumes 4 GB of RAM

JMETER_DIR = /shares/perf/hsingh/LassenPerfFramework/jakarta-jmeter-2.3.2
JMETER_NODES = perf01:3 perf37:5

# list of servers to be used to set the testing environment

L1_NODES = perf21 perf22 perf23 perf24 perf25 perf26 perf27 perf28 perf29 perf30 perf31 perf32 perf33 perf34 perf35 perf36
L2_NODE = perf02
MYSQL = perf28
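
A file in this "KEY = value" format is easy to read from a shell script. The sketch below is not the actual set-env.sh, just one possible way to do it; the function names are made up, and the meaning of the number after the colon in JMETER_NODES is not spelled out here, so it is simply passed through.

```shell
# Print the value of a key from a "KEY = value" properties file.
# Comment lines (starting with #) never match, because the key must start the line.
get_prop() {
    # $1 = properties file, $2 = key
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | head -n 1
}

# Print one "host number" pair per JMETER_NODES entry (entries look like perf01:3).
jmeter_nodes() {
    # $1 = properties file
    for entry in $(get_prop "$1" JMETER_NODES); do
        echo "${entry%%:*} ${entry##*:}"
    done
}
```

For the file above, get_prop lassen_perf.properties L2_NODE prints perf02, and jmeter_nodes lassen_perf.properties prints "perf01 3" and "perf37 5" on separate lines.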

3. Directory Listing

Script                    Directory         Comments
========================  ================  ==================================================
clean_mysql.sh            scripts/          Deletes the previous exam results from the MySQL database
kill-server.sh            scripts/          Kills all the Tomcat and Terracotta servers
lb-status.sh              scripts/          Command-line tool to check the number of concurrent requests being processed by the load balancer
machine-status.sh         scripts/          Checks for Java processes running on the machines
set-env.sh                scripts/          Reads and parses the variables used in the scripts
start-tomcat.sh           scripts/          Builds the DSO boot jar and starts a Tomcat server
startAll-tomcat.sh        scripts/          Starts all the Tomcat servers
tc-server.sh              scripts/          Starts the Terracotta server
GC                        scripts/results/  Analyzes the verbose GC output
collectAll-tomcatlogs.pl  scripts/results/  Collects all Tomcat logs
latency.sh                scripts/results/  Prints the average server latency calculated from the response time tracking valve logs
results/.sh               scripts/results/  Master script for results


4. How to run the test

  1. Add or modify the machines to be used for the test in lassen_perf.properties.
  2. Configure the load balancer to use the Tomcat machines specified in the properties file, by commenting or uncommenting the specific Tomcat servers in httpd.conf (perf.conf).
  3. Restart the load balancer (on SLES 10: /sbin/service apache2 restart). This requires root access.
  4. Clean the MySQL db using "scripts/clean_mysql.sh ."
  5. Start the test using masterRun.sh
  6. To abort the test, stop the masterRun.sh script and kill all servers using "scripts/kill-server.sh"

Thursday, December 18, 2008

Load Testing Examinator: JMeter


Yes, you can load test Examinator yourself. We tested it with 19,200 users and got really good response times.

Scenario

In this scenario, users log in to the application and choose an exam. They start taking the exam, answer some of the questions, mark a few of them for review, and can go back mid-test to any other question or to any question marked for review. The scenario simulates the real-world situation where all users take a particular exam for the full exam duration. Each user waits around 45 secs before answering the next question. All users ramp up within 2 minutes.

Overview
# of User Sessions: 19,200
Thinktime: 45 secs
User Arrival Rate: 160 users/sec
Ramp-up Time: 120 secs
Throughput: 445 pages/sec
Response Time: 5 ms

Duration of Exam: 110 mins
# of Questions in the Exam: 100
No. of Choices/Question: 5

Load Testing Tool: JMeter

# of Tomcat nodes: 16
# of Terracotta nodes: 1
# of JMeter instances: 16

Terracotta Monitoring: Terracotta Admin Console
System Monitoring: nmon
Load Balancer Monitoring: mod_status Module
Tomcat Monitoring: JConsole
GC Monitoring: verboseGC

JMeter can be a real pain if we try to load 20,000 users from a single JMeter instance or from distributed JMeter instances. Distributed JMeter uses RMI to control the remotely running instances, which creates overhead in JMeter. With a high number of users and a large test plan, JMeter goes into a series of Full GCs. The Examinator test plan includes a lot of HTTP samples, around 280 HTTP requests.

Tips for using JMeter to load test Examinator:
  • Avoid using GUI for load testing
  • Avoid using distributed JMeter instances
  • Reduce I/O by reducing data being saved under load per sample using bin/jmeter.properties
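
As a rough sketch of the first and third tips, the helper below builds a non-GUI JMeter command line. The function name and the paths in the example are hypothetical. The -J overrides shown are one way to set these save-service properties per run; the same keys can be set in bin/jmeter.properties instead.

```shell
# Build a non-GUI JMeter command line that keeps per-sample logging small.
jmeter_cmd() {
    # $1 = JMeter install dir, $2 = test plan (.jmx), $3 = result log (.jtl)
    echo "$1/bin/jmeter -n -t $2 -l $3" \
         "-Jjmeter.save.saveservice.response_data=false" \
         "-Jjmeter.save.saveservice.response_data.on_error=true" \
         "-Jjmeter.save.saveservice.samplerData=false"
}

# Example with made-up paths: print the command for one instance
jmeter_cmd /opt/jmeter testplan0.jmx log0.jtl
```

Here -n runs JMeter without the GUI, -t names the test plan, and -l names the result log. Response bodies are kept only for failed samples, which matches what the .jtl logs contain.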

To make it happen, we ran 16 separate JMeter instances and then consolidated the results from each. Shell scripts were a great help here. That leaves one problem: the JMeter instances can't all use the same datapool (user logins/passwords). We created copies of the JMeter test plan and divided the datapool of 19,200 users into 16 files, each with 1200 users.

A small shell script does this for me:

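# Creates <number_of_copies> copies of a test plan, each pointing at its own user list
if [ $# -lt 2 ]; then
echo "Usage: $0 <testplan.jmx> <number_of_copies>"
exit 1
fi
# Base name of the test plan, without the extension
name=`echo $1 | cut -d '.' -f 1`

i=0
while [ $i -lt $2 ]
do
echo "Creating ${name}${i}.jmx..."
cp "$1" "${name}${i}.jmx"
# Make copy i read userList<i>.csv instead of userList.csv
sed -i 's/userList\.csv/userList'$i'.csv/g' "${name}${i}.jmx"
i=$((i + 1))
done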


Thus, each JMeter instance uses a separate set of users and simulates 1200 users.
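
The datapool itself can be divided in a similar way. The sketch below is one possible approach, not the script actually used. It assumes one user per line in the CSV and numbers the output files from 0; the function name is made up.

```shell
# Split a datapool CSV into numbered files of a fixed number of lines each.
split_datapool() {
    # $1 = datapool CSV, $2 = users per file, $3 = output prefix
    awk -v n="$2" -v prefix="$3" '{
        file = sprintf("%s%d.csv", prefix, int((NR - 1) / n))
        # Close the previous chunk when moving on to the next one
        if (file != prev) { if (prev != "") close(prev); prev = file }
        print > file
    }' "$1"
}
```

For example, split_datapool userList.csv 1200 userList would turn a 19,200-line userList.csv into userList0.csv through userList15.csv.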

5 ms Response Time for Online tests - Isn't it awesome!!!

With Terracotta, it's all possible. We can get a response time of 5 ms for an online exam application under a high load of 20,000 users. We used the best technologies and were able to get a 5 ms response time while reducing the load on the database.
The key to making it happen is keeping all the intermediate data in memory, and we have the memory of 16 JVMs clustered with Terracotta to use.
There is no need for a slow database to cluster the session data. Learn more at http://www.terracotta.org/web/display/orgsite/Web+App+Reference+Implementation

Examinator is live!!!
http://reference.terracotta.org/examinator
The sources are available to download.

Thursday, December 4, 2008

502 Error: Bad Proxy

While performance testing Examinator, the Terracotta reference application, we ran into a lot of 502 errors: during the tests, a few of the virtual users were receiving 502 errors from the load balancer.

Load balancer: Apache HTTP server, mod_proxy_balancer module
App Server: Tomcat 6

The error received was:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>502 Proxy Error</title>
</head><body>
<h1>Proxy Error</h1>
<p>The proxy server received an invalid response from an upstream server.<br />
The proxy server could not handle the request <em><a href="/examinator/exam/list.do">GET&nbsp;/examinator/exam/list.do</a></em>.<p>
Reason: <strong>Error reading from remote server</strong></p></p>
<hr>
<address>Apache/2.2.3 (Linux/SUSE) Server at xyz.abc.lan Port 80</address>
</body></html>

By disabling keep-alive on the Tomcat server (setting maxKeepAliveRequests to 1), we were able to reduce these errors. Reference: http://tomcat.apache.org/tomcat-6.0-doc/config/http.html

<Connector port="8080" protocol="HTTP/1.1"
maxKeepAliveRequests="1"
redirectPort="8443" />
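
To compare runs before and after a change like this, the 502 responses can be counted in the JMeter logs. A minimal sketch, assuming the XML log format with an rc attribute per sample (the function name count_rc is made up):

```shell
# Count the samples with a given HTTP response code across one or more JTL files.
count_rc() {
    # $1 = response code, remaining arguments = JTL files
    code=$1
    shift
    cat "$@" | grep -c "rc=\"$code\""
}
```

For example, echo "502 Error = $(count_rc 502 logs/log*.jtl)" prints a single summary line with the total for all log files.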