
Tuesday, February 02, 2010

Tips on performance testing and optimization

Here I explain how to go about scalability testing, performance testing, and optimization in a typical Java 2 Enterprise Edition (J2EE) environment.

Definitions:

Response Time - The time between the initial request and the complete download of the response (rendering of the entire web page).

Load - A measurement of the usage of the system. A server is said to experience high load when the application it supports is being heavily trafficked.

Scalability - A scalable application has a response time that increases linearly as load increases. Such an application will be able to process more and more volume by adding more hardware resources in a linear (not exponential) fashion.

Automation testing tools - Tools (Silk from Segue Software, WebLoad, etc.) that simulate a user by requesting pages or going through a pre-programmed workflow on your site.

Load testing tools - Most automation testing tools, like WebLoad, can also be used as load testing software. These tools will simulate any number of users using your site and provide you with important data like average response times.

Profiler - A program that examines your application as it runs. It provides you with useful run time information such as time spent in particular code blocks, memory / heap utilization, number of instances of particular objects in memory, etc.

A Process for performance testing

1) Functional Testing. Most applications begin testing by first completing functional tests. That is, ensuring that all the use cases / workflows in your application work.

2) Load and Scalability Testing. Load and scalability testing has two forms:

  • Testing response time as you increase the size of your database
  • Testing response time as you increase concurrent users

3) Interpreting the results. After measuring response time at varied database sizes and loads, you can now make interpretations based on the average response time of these tests and the resource utilization of the server during the tests.

4) Optimization. After identifying problems in the last step, you now track down their causes and fix them.

Load and Scalability Testing

The purpose of load and scalability testing is to ensure that your application will have a good response time during peak usage. You can also test how your application will behave over time, as the database behind your website accumulates more and more data. To begin testing, write some scripts that populate your database with an average amount of data. Run your performance tests and measure your response time. Then populate your database with an extreme amount of data (3 to 4 times more data than you can foresee having in 3 years) and run your performance tests again. If response times are significantly longer for the second test, then something is wrong.

To run your performance tests, you will want to simulate server usage at different loads. As a rule of thumb, I simulate low load (1-5 concurrent users), medium load (10-50 concurrent users), high load (100 concurrent users) and extreme load (1000+ concurrent users). Note that these numbers are arbitrary and depend on your business needs. Also, simulating 10 concurrent users with load testing software isn't representative of 10 people, since each robot in the load test may wait just milliseconds before hitting the server again, whereas a real person pauses to read each page. Thus, using a load tester to simulate 10 users is probably more representative of the web surfing patterns of 30-40 people.
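
To make the idea of "concurrent users" concrete, here is a minimal sketch of what a load testing tool does internally. It is not a replacement for a real tool like WebLoad. The class name `LoadSketch`, its method, and the fake request are all made up for illustration; in a real test the `Runnable` would issue an HTTP request to your site.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not part of any load testing product.
final class LoadSketch {

    /**
     * Runs `request` from `users` threads, `requestsPerUser` times each,
     * pausing `thinkTimeMillis` between requests. Returns the average
     * response time in milliseconds (think time is not counted).
     */
    static double averageResponseMillis(int users, int requestsPerUser,
                                        long thinkTimeMillis, Runnable request) {
        ExecutorService pool = Executors.newFixedThreadPool(users);
        try {
            // One simulated user: time each request, then "think".
            Callable<Long> user = () -> {
                long nanos = 0;
                for (int i = 0; i < requestsPerUser; i++) {
                    long start = System.nanoTime();
                    request.run();
                    nanos += System.nanoTime() - start;
                    if (thinkTimeMillis > 0) Thread.sleep(thinkTimeMillis);
                }
                return nanos;
            };
            List<Future<Long>> perUserNanos = new ArrayList<>();
            for (int u = 0; u < users; u++) perUserNanos.add(pool.submit(user));

            long totalNanos = 0;
            for (Future<Long> f : perUserNanos) totalNanos += f.get();
            return totalNanos / 1_000_000.0 / ((long) users * requestsPerUser);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("load run failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Stand-in for a page request: just sleeps for about 5 ms.
        Runnable fakeRequest = () -> {
            try { Thread.sleep(5); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        };
        for (int users : new int[] {1, 10, 50}) {
            double avg = averageResponseMillis(users, 20, 0, fakeRequest);
            System.out.printf("%d users: %.2f ms average%n", users, avg);
        }
    }
}
```

With a think time of 0 each robot hits the server again immediately, which is why a handful of robots can stand in for a larger number of real people. Raising `thinkTimeMillis` makes each robot behave more like one person.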

Once you have tested at all of these load levels, you can compare average response times to see if your system scales, that is, if the response time increases linearly.
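
One rough way to check for linear growth is to compare how many milliseconds each extra user adds between successive load levels. The sketch below is my own illustration. The class `ScalingCheck`, the tolerance value, and the sample numbers are all invented, and it assumes response time rises with load.

```java
// Hypothetical helper for reading load test results.
final class ScalingCheck {

    /** Milliseconds of response time added per extra concurrent user. */
    static double slope(int users1, double ms1, int users2, double ms2) {
        return (ms2 - ms1) / (users2 - users1);
    }

    /**
     * Returns true if the per-user slope never grows by more than the factor
     * `tolerance` (for example 1.5) from one pair of load levels to the next.
     * Assumes at least three load levels with rising response times.
     */
    static boolean looksLinear(int[] users, double[] avgMillis, double tolerance) {
        if (users.length < 3 || users.length != avgMillis.length) {
            throw new IllegalArgumentException("need three or more matching data points");
        }
        double previous = slope(users[0], avgMillis[0], users[1], avgMillis[1]);
        for (int i = 2; i < users.length; i++) {
            double current = slope(users[i - 1], avgMillis[i - 1], users[i], avgMillis[i]);
            if (current > previous * tolerance) return false;
            previous = current;
        }
        return true;
    }

    public static void main(String[] args) {
        int[] users = {5, 50, 100, 1000};
        // Made-up measurements, in milliseconds.
        double[] steady = {100, 145, 195, 1095};
        double[] blowingUp = {100, 150, 400, 30000};
        System.out.println("steady scales linearly: " + looksLinear(users, steady, 1.5));
        System.out.println("blowingUp scales linearly: " + looksLinear(users, blowingUp, 1.5));
    }
}
```

In the first data set every extra user costs about 1 ms at every level. In the second, the cost per user jumps between levels, which is the pattern to worry about.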

Interpreting the results

The fun part of this process is interpreting the results of your load testing. Let us examine some of the different possibilities:

  1. Response time increases too much when database is over populated
    Response time should not increase much if you move from a database with 100 rows in its tables to one with 50,000. Database indexes make finding a row in a table a matter of milliseconds, even if there are hundreds of thousands of rows. Thus, if your response time increases sharply after moving from a moderately populated database to an over-populated one, then you probably haven't indexed the appropriate columns yet.
  2. Response time increases exponentially as load increases
    If your system becomes unusable as you increase concurrent users, then your system is not scalable. Interpreting these results is difficult, as the problem could be with hardware, deployment configuration, architecture, etc. Make sure you watch the server resources during the tests:
    1. Watch memory requirements
    2. Watch CPU usage
      If the CPU is overused, you need a faster processor or more processors. If the CPU is underused, then the problem is probably input/output (I/O) related. Check your database connections, your running thread count, and the network configuration of your test boxes.
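
Normally you would watch memory and CPU with your operating system's tools, but the JVM can also report some of these numbers about itself. Here is a minimal sketch using the standard `java.lang.management` API. The class `ResourceSnapshot` is my own invention, and it only sees the JVM it runs in, so it would have to run inside the server process (for example from a diagnostic page) to be useful.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper: a point-in-time view of the JVM it runs in.
final class ResourceSnapshot {
    final long usedHeapBytes;
    final long maxHeapBytes;
    final int liveThreads;
    final int processors;
    final double loadAverage; // negative when the platform does not report it

    private ResourceSnapshot(long used, long max, int threads, int cpus, double load) {
        this.usedHeapBytes = used;
        this.maxHeapBytes = max;
        this.liveThreads = threads;
        this.processors = cpus;
        this.loadAverage = load;
    }

    static ResourceSnapshot take() {
        Runtime rt = Runtime.getRuntime();
        return new ResourceSnapshot(
                rt.totalMemory() - rt.freeMemory(),
                rt.maxMemory(),
                ManagementFactory.getThreadMXBean().getThreadCount(),
                rt.availableProcessors(),
                ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage());
    }

    @Override
    public String toString() {
        return String.format("heap %d/%d MB, %d threads, %d CPUs, load average %.2f",
                usedHeapBytes >> 20, maxHeapBytes >> 20,
                liveThreads, processors, loadAverage);
    }

    public static void main(String[] args) {
        System.out.println(take());
    }
}
```

Taking a snapshot every few seconds during a load test gives you a simple record of heap use and thread count to set beside the response times.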

If, after checking your configuration, verifying that the slowdown is not a hardware bottleneck, and looking over your architecture for code to optimize, the problem is still there, it's time to run a code profiler.

Optimization

The database, your architecture, configuration and hardware will need to be optimized. As mentioned in the previous section, the easiest way not to scale is to have a database that isn't tuned. A database administrator (DBA) is always a vital person to have on any dev team, but if you don't have one, here is what you can do:

Look through your EJBs and verify that your database isn't doing linear searches for any of the SQL queries that you have coded. To do this, copy the SQL from your code and, in your database's SQL window, run it as an EXPLAIN statement:

Explain select * from table where tablefield = somevalue

Although the EXPLAIN syntax differs from database to database, there is always something similar. After running this statement, your DB will tell you whether it is using an index or doing a linear search (a full table scan). Make sure you verify that every piece of SQL in your application is using your DB's indexes, and if not, create the indexes.
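
To see why this matters so much, here is a small sketch that counts how many rows have to be examined to find one value, first with a linear search and then with a binary search over sorted keys. The binary search is only a stand-in for a database index (real databases typically use B-trees), and the class `ScanVersusIndex` is made up for illustration.

```java
// Hypothetical illustration: rows examined by a scan versus an index-like lookup.
final class ScanVersusIndex {

    /** Linear search: number of rows examined before `key` is found (or all rows). */
    static int linearProbes(int[] rows, int key) {
        int probes = 0;
        for (int row : rows) {
            probes++;
            if (row == key) break;
        }
        return probes;
    }

    /** Binary search over sorted keys: number of keys examined. */
    static int indexedProbes(int[] sortedKeys, int key) {
        int lo = 0, hi = sortedKeys.length - 1, probes = 0;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            probes++;
            if (sortedKeys[mid] == key) break;
            if (sortedKeys[mid] < key) lo = mid + 1; else hi = mid - 1;
        }
        return probes;
    }

    /** Builds the sorted keys 0, 1, ..., n-1. */
    static int[] table(int n) {
        int[] keys = new int[n];
        for (int i = 0; i < n; i++) keys[i] = i;
        return keys;
    }

    public static void main(String[] args) {
        for (int n : new int[] {100, 50_000}) {
            int[] keys = table(n);
            int last = n - 1; // worst case for the linear search
            System.out.printf("%d rows: linear examines %d, indexed examines %d%n",
                    n, linearProbes(keys, last), indexedProbes(keys, last));
        }
    }
}
```

The linear search examines every row, so its cost grows with the table. The indexed lookup grows only with the logarithm of the table size, which is why a well-indexed query barely slows down as the data grows.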

After optimizing the database, and optimizing your hardware configuration (as discussed in the previous section), the next step is optimizing your code, and this is done with a profiler.

A profiler is a program that analyzes your application as it runs. It provides you with information you could not otherwise get access to, such as:

  1. How many objects of each class are in memory, and how garbage collection is behaving
    • This information can help you identify classes that should be pooled.
    • It can also help you tune your Java heap.
  2. How much time your application is spending in particular classes

This is the most important feature. Your profiler will point its finger and show you which classes are the bottlenecks.
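
To give a feel for the kind of timing data a profiler collects, here is a crude hand-rolled timer that adds up elapsed time per label. This is only a sketch and no substitute for a real profiler, which gathers this without you touching your code. The class `PoorMansProfiler` and the labels are invented for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: accumulates wall-clock time per label.
final class PoorMansProfiler {
    private final Map<String, Long> nanosByLabel = new TreeMap<>();
    private final Map<String, Integer> callsByLabel = new TreeMap<>();

    /** Runs `block` and charges its elapsed time to `label`. */
    void time(String label, Runnable block) {
        long start = System.nanoTime();
        try {
            block.run();
        } finally {
            record(label, System.nanoTime() - start);
        }
    }

    private synchronized void record(String label, long nanos) {
        nanosByLabel.merge(label, nanos, Long::sum);
        callsByLabel.merge(label, 1, Integer::sum);
    }

    synchronized long totalNanos(String label) {
        return nanosByLabel.getOrDefault(label, 0L);
    }

    synchronized int calls(String label) {
        return callsByLabel.getOrDefault(label, 0);
    }

    /** Label with the largest accumulated time, or null if nothing was timed. */
    synchronized String slowest() {
        String worst = null;
        long worstNanos = -1;
        for (Map.Entry<String, Long> e : nanosByLabel.entrySet()) {
            if (e.getValue() > worstNanos) {
                worst = e.getKey();
                worstNanos = e.getValue();
            }
        }
        return worst;
    }

    /** Sleeps without forcing callers to handle InterruptedException. */
    static void pause(long millis) {
        try { Thread.sleep(millis); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }

    public static void main(String[] args) {
        PoorMansProfiler profiler = new PoorMansProfiler();
        profiler.time("OrderBean.load", () -> pause(20)); // stand-in for slow code
        profiler.time("PriceHelper.format", () -> { });   // stand-in for fast code
        System.out.println("bottleneck: " + profiler.slowest());
    }
}
```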

One such program that really helped me is called Optimize-It. Optimize-It can be used with any Java program or any Java-based application server. Configuration with WebLogic is easy, and Optimize-It can be used to profile an application on a remote server.

Optimizing your architecture is extremely project specific, but here are some tips:

  1. Make sure you have minimized your network calls, especially database calls
    • It is better to make one large database call rather than many small ones.
    • Make sure ejbStore isn't storing anything for read-only operations.
    • Use Details Objects to get entity bean state.
  2. Make sure to take advantage of caching where possible
    Your app server probably allows you to cache entity beans in memory. Make sure you take advantage of this, as it will dramatically reduce database calls and speed up data access.
  3. Make sure you are using session beans as a façade to your entity beans.
    You can encapsulate the workflow of one entire use case in one network call to one method on a session bean (and one transaction).
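
The caching tip is easy to demonstrate outside of an app server. The sketch below is a plain read-through cache that counts how often it has to go to the backing store. It is not how any particular app server caches entity beans. The class `ReadThroughCache` and the fake loader are my own, and it leaves out invalidation and eviction, which a real cache needs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical helper: loads each key from the backing store at most once.
final class ReadThroughCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;
    private final AtomicInteger loads = new AtomicInteger();

    /** `loader` stands in for the expensive call, such as a database query. */
    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, loading it first if this key is new. */
    V get(K key) {
        return cache.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    /** Number of times the backing store was actually hit. */
    int loadCount() {
        return loads.get();
    }

    public static void main(String[] args) {
        // Fake "database" lookup for illustration.
        ReadThroughCache<Integer, String> customers =
                new ReadThroughCache<>(id -> "customer-" + id);
        for (int i = 0; i < 100; i++) {
            customers.get(i % 5); // 100 reads spread over 5 distinct keys
        }
        System.out.println("reads: 100, database calls: " + customers.loadCount());
    }
}
```

The same counting trick works for the other tips: wrap the call you suspect, count the round trips for one use case, and see whether one larger call or one session bean method could replace them.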