Sunday, November 05, 2006

Centralized Analysis of Scale Data

The Grinder is a good tool for generating large amounts of load against a server. But after the run is over, the love ends. There are no post-run analysis capabilities built in: no graphs of server-side performance metrics, or of client-side data. I have begun working on an open-source data warehousing tool for test data generated by The Grinder. This tool parses the logs generated by each agent and feeds all the information into a database. Once the data has been centralized there, all sorts of post-run analysis become possible. I have two goals for this project (a rough sketch of the parse-and-feed idea follows the list):
  • Allow for simple analysis of Grinder runs: generate summary tables, graph server-side performance data such as CPU and disk activity, and graph client-side data such as response times and transactions per second.
  • Allow for long-term storage of test data. This will enable comparing performance of the server across many different builds, and longer-term, across many different server versions.
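
Here is a minimal sketch of the parse-and-feed idea, for illustration only -- this is not the actual tool. It assumes The Grinder's comma-separated per-transaction data logs; the column layout, file name, and schema shown here are simplified assumptions:

    import csv
    import sqlite3

    # Hypothetical warehouse database and schema.
    conn = sqlite3.connect('scaledata.db')
    conn.execute('''CREATE TABLE IF NOT EXISTS samples
                    (thread INTEGER, run INTEGER, test INTEGER,
                     start_ms INTEGER, test_time_ms INTEGER,
                     errors INTEGER)''')

    def feed(logfile):
        # Agent data logs contain a header row followed by one
        # comma-separated line per transaction.
        with open(logfile) as f:
            reader = csv.reader(f)
            next(reader)  # skip the header row
            for row in reader:
                conn.execute('INSERT INTO samples VALUES (?,?,?,?,?,?)',
                             [int(field) for field in row[:6]])
        conn.commit()

    feed('data_agent1-0.log')  # hypothetical agent data log name
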
Current Situation

Here is what is implemented today:
  • The database tables
  • The database SQL scripts to generate TPS and response-time graphs
  • The log parser and feeder classes, which run on each agent
  • The grapher classes
Below is an example of a transactions-per-second graph generated by the current code. In this scale run, the load was increasing over time (using the scheduling mechanism discussed at the end of my previous blog entry). Following it is an example of a response-times graph generated by the current code.

Future - Short Term

In the short term, this basic functionality must be completed:
  • SQL code to generate reports summarizing a given scale run, presenting information on response times, pass/fail rates, etc., in tabular format.
  • Feeders and graphers for server-side performance data, such as that produced by perfmon, vmstat, sar, prstat, etc.
Future - Medium Term

After the immediate needs are met, a good way of browsing the test results in the database is needed:
  • Storing the graphs themselves in the database rather than on the filesystem.
  • Implementing a Web UI for browsing scale test info in the database
Future - Long Term

Currently, running a large-scale test with many agents is a cumbersome process. Each agent process must be manually started and connected to the console. In situations with, for example, 30 agent machines, this is problematic. Shell scripts exist today that can ssh into remote agents and start them up, but they require Linux and are not a generic, cross-platform solution.
  • Extending the web UI to provide additional configuration management.
  • Extending the web UI to enable remote agent startup in a cross-platform manner, and the ability to drive a Grinder test in an AJAX environment via the console API.

Saturday, November 04, 2006

The Grinder: Addressing the Warts

Last June, I discussed the merits of The Grinder as a load-testing tool and compared it with Load Runner and JMeter. Since then, I have been using The Grinder at work to perform load tests against our server product. In the nearly five months since my original analysis, all of my serious complaints have been addressed, and for several of my minor ones, fixes or workarounds have been found as well. So I thought it would be fair to revisit my original complaints and, in the cases where solutions exist, share them.

Major Issues

Poor performance with large downloads.

This turned out to be a memory issue driven by my poor configuration. The Grinder does not have the option of discarding the bytes of an HTTP response as they come in, or of spooling them off to disk; it keeps the entire body of the response in memory until the download is complete. In my test, I was requesting a web resource that was very large (4MB - 6MB). With its default heap size and many threads running simultaneously, the agent process simply could not handle this level of traffic.

The trick was in how to give the agent more heap. The standard technique -- passing heap-related arguments (-Xms, -Xmx) to the JVM when starting the agent -- had no effect. This is because the main Java process spawns new Java instances to do the actual work, and these sub-instances had no knowledge of my settings. To pass these heap arguments to the JVM sub-processes, add a line like this to your grinder.properties file, and everything will work well:

grinder.jvm.arguments=-Xms300M -Xmx1600M

In the original post, there was some discussion about using native code such as libcurl to do the heavy lifting here, but fortunately that has proven not to be necessary.


Minor Issues

Bandwidth Throttling / Slow Sockets

Bandwidth throttling, AKA slow sockets, is a feature that allows each socket connection to the server to communicate at a specified bitrate, as opposed to downloading everything at wire speed. This is desirable because, for any given level of transactions per second, the server works harder servicing a large number of slow connections than a small number of very fast ones. Throttling thus allows the load tester to set up a far more realistic scenario, modeling the real-world use of many devices with varying network speeds connecting to the server.

And there is good news on this issue. By request, Phil Aston, the primary developer of The Grinder, has implemented this feature. I tested early Grinder builds with this feature, and am happy to say with confidence that the current implementation behaves correctly, is robust, and scales well. Awesome.

The feature is implemented as the method HTTPPluginConnection.setBandwidthLimit -- details here:

http://grinder.sourceforge.net/g3/script-javadoc/net/grinder/plugin/http/HTTPPluginConnection.html#setBandwidthLimit(int)
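
For example, a minimal sketch (the 56 kbps target is just an assumed example value; the argument is in bits per second):

    from net.grinder.plugin.http import HTTPPluginControl

    # Throttle every connection created by the HTTP plugin by applying
    # the limit to the connection defaults.
    HTTPPluginControl.getConnectionDefaults().setBandwidthLimit(56000)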

Load Scheduling

There is still no full-on load scheduler for The Grinder like there is for Load Runner; thread execution is all or nothing. However, there is a neat little workaround that takes away most of the sting. If you just want your threads to ramp in smoothly over time in a linear fashion, you can have each thread sleep for a period of time before beginning execution, with higher-numbered threads sleeping progressively longer.

Be sure the module containing your TestRunner class imports the grinder:

from net.grinder.script.Grinder import grinder
Then put this code into the __init__ method of your TestRunner class:
      # rampTime is defined outside this snippet; it is the amount
      # of time (in ms) between the starting of each thread.
      sleepTime = grinder.threadID * rampTime
      # The second argument (0) turns off the random variation that
      # grinder.sleep normally applies to sleep times.
      grinder.sleep(sleepTime, 0)
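
Putting it all together, a complete minimal script might look like this (the rampTime value of 5000 ms and the logging call are just illustrative):

    from net.grinder.script.Grinder import grinder

    # Assumed example value: start one additional thread every 5 seconds.
    rampTime = 5000

    class TestRunner:
        def __init__(self):
            # Thread 0 starts immediately; each subsequent thread waits
            # rampTime ms longer, giving a linear ramp-in of load.
            grinder.sleep(grinder.threadID * rampTime, 0)

        def __call__(self):
            # The actual test transactions go here.
            grinder.logger.output("thread %d running" % grinder.threadID)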

Results Reporting

The Grinder's results reporting and analysis features are weak. I am currently working on a data-warehousing system that stores the transaction data from each run, enabling detailed post-run processing and analysis. Details in subsequent blog entries.

Wednesday, June 14, 2006

Shootout: Load Runner vs The Grinder vs Apache JMeter

1 INTRODUCTION

I recently needed to recommend a tool to use for a scalability testing project, and I was in the fortunate situation of having some time to survey the field, and to look into the top contenders in greater depth. From an original list of over 40 candidates, I selected three finalists in the open-source and commercial categories. I then took some time to look at them in detail, to determine which tool to recommend for the ongoing scale testing effort. Since I have seen several questions about how these tools compare to each other on various mailing lists, I'm sharing my findings here in the hopes that others will find them useful.

My three finalists were Load Runner, from Mercury Interactive; JMeter, from the Apache Foundation; and The Grinder, an open-source project hosted on SourceForge.

2 SUMMARY OF RESULTS

I found that I could use any of them and get a reasonably good amount of scale test coverage. Each tool has unique things it does very well, so in that sense, there is no “wrong answer.” Conversely, each of the tools I considered has unique deficiencies that will impede or block one or more of the scenarios in our test plan. So there is no “right answer” either – any option selected will be something of a trade-off.

Based on this research, I recommended The Grinder as the tool to go forward with. It has a simple, clean UI that clearly shows what is going on without trying to do too much, and offers great power and simplicity with its unique Jython-based scripting approach. Jython allows complex scripts to be developed much more rapidly than in more formal languages like Java, yet it can access any Java library or class easily, allowing us to re-use elements of our existing work.

Mercury's Load Runner had a largely attractive feature set, but I ultimately disqualified it due to shortcomings in these make-or-break areas:

  • Very high price to license the software.

  • Generating unlimited load is not permitted. With the amount of load our license allows, I will be unable to effectively test important clustered server configurations, as well as many of our “surge” scenarios.

  • Very weak server monitoring for Solaris environments. No support for monitoring Solaris 10.

JMeter was initially seen as an attractive contender, with its easy, UI-based script development, as well as its script management and deployment features. Its UI is feature-rich, and the product has the Apache branding. It was ultimately brought down by the bugginess of its UI, though, as several of its key monitors gave incorrect information or simply didn't work at all.

3 Comparison Tables

All the items in the tables below are discussed in greater detail in the following sections; these tables give a quick overview.

3.1 Critical Items

There are several features that are key to any scale testing effort, and the items in this table are critical to ours: missing any of them will seriously impact our ability to achieve complete scale test coverage.

Item                         Load Runner   JMeter    The Grinder
Solaris monitoring           -             neutral   neutral
Unlimited load generation    -             +         +
Supports IP spoofing         +             -         +
Large download performance   +             -         - *

* Multiple workarounds are being investigated, including calling native (libcurl) code for the most intensive downloads.

3.2 Non-Critical Items

Items in this section are not make-or-break to our test effort, but will go a long way to making the test effort more effective.

3.2.1 General

Item                      Load Runner   JMeter    The Grinder
Server monitoring         mixed         neutral   neutral
Batch mode                -             +         +
Ease - installation       -             +         +
Ease - script authoring   +             +         mixed
Ease - running tests      neutral       +         neutral
Results reporting         +             -         -
Agent management          +             +         -
Cross-platform            -             +         +
Cost                      -             +         +
Technical level           +             +         -
Stability/bugginess       neutral       -         neutral

3.2.2 Agents

Item                       Load Runner   JMeter    The Grinder
Transaction power          +             neutral   +
Custom protocols           +             +         +
Out-of-the-box protocols   +             neutral   -
Transaction aggregation    +             -         +
Scalability of agent       +             neutral   neutral
Slow sockets               +             neutral   -
External libs usable       +             +         +
Load scheduling            +             +         -
Ease of porting JCS        -             neutral   +

3.2.3 Controller

Item                             Load Runner   JMeter    The Grinder
Scalability of controller        neutral       -         +
Real-time test monitoring        +             -         neutral
Real-time load adjustment        +             -         -
Script management                +             +         -
Script development environment   +             +         -

4 GENERAL

4.1 Server Monitoring -- Windows, Solaris, etc.

4.1.1 Load Runner

Mercury is extremely strong in this area for Windows testing. Unfortunately, it is very weak on Unix/Solaris. For Windows hosts, Load Runner uses the native performance counters available in perfmon. This allows monitoring a myriad of information from the OS, as well as metrics from individual applications (such as IIS) that make their information available to perfmon.

For Solaris hosts, Load Runner is restricted to the performance counters available via rpc.rstatd. This means some very basic information on CPU and memory use, but not much else. Note that Load Runner does not currently support any kind of performance monitoring on Solaris 10.

4.1.2 JMeter

JMeter has no monitoring built in, so wrapper scripts are required to synchronize test data with external performance monitoring data. This is the approach I used to great effect with our previous test harness. The advantage of this method is that I can monitor (and graph!) any information the OS makes available. Since the amount of data available is quite large, this is a powerful technique.
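
As an illustration of the wrapper idea, here is a minimal sketch; the sampling interval, file names, and the JMeter command line are placeholder assumptions:

    import datetime
    import subprocess

    def run_with_monitoring(test_cmd, interval=5):
        # Start vmstat sampling before the test and stop it afterwards,
        # so the OS metrics cover exactly the window of the test run.
        stamp = datetime.datetime.now().strftime('%Y%m%d-%H%M%S')
        with open('vmstat-%s.log' % stamp, 'w') as out:
            monitor = subprocess.Popen(['vmstat', str(interval)], stdout=out)
            try:
                subprocess.call(test_cmd)
            finally:
                monitor.terminate()

    run_with_monitoring(['jmeter', '-n', '-t', 'plan.jmx', '-l', 'results.jtl'])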

4.1.3 The Grinder

The same wrapper-based approach would be required here as I detailed above for JMeter.

4.2 Can generate unlimited load

This is a make-or-break item. There are many scenarios I just can't cover if I can only open a few thousand socket connections to the server.

4.2.1 Load Runner

Load Runner restricts the number of vusers you can run; even large amounts of money only buy a license for a modest number of users. Historically, the rate for 10,000 HTTP vusers has been $250,000. However, on a per-agent basis, load is generated very efficiently, so it may take less hardware to generate the same amount of load. (But for the money you spend on the Load Runner license, you could buy a LOT of load generation hardware!)

4.2.2 JMeter

Since this is Free/Open Source software, you may run as many agents as you have hardware to put them on, adding more and more load virtually forever. However, in specific unicast scenarios, such as repeatedly downloading very large files (like PIPEDSCHEDULE), the ability of agents to generate load falls off abruptly due to memory issues.

4.2.3 The Grinder

In this matter, The Grinder's story is the same as JMeter's: the limit is only the number of agents. The Grinder suffers the same inability to effectively download large files as JMeter. A workaround that uses native code (libcurl) to send requests is being investigated.

4.3 Can run in batch (non-interactive) mode

4.3.1 Load Runner

No. Hands-free runs can be scheduled with the scheduler, but multiple specific scenarios cannot be launched from the command line. This may be adequate for single tests; it's not clear how this would work if a series of automated tests were desired.

4.3.2 JMeter

Yes, this ability is supported out of the box. However, it can only be run from a single agent; the distributed testing mechanism requires the UI. So for automated nightly benchmarks it may be OK, but for push-to-failure testing where much load is required, the UI is needed. It would presumably be possible to have a wrapper script launch JMeter in batch mode at the same time on multiple agents (see the sketch below). This would achieve arbitrary levels of load, but would not have valid data for collective statistics like total transactions per second, total transactions, etc.
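
A rough sketch of such a wrapper, assuming passwordless ssh to the agent hosts and JMeter's standard non-GUI flags (-n, -t, -l); host and file names are placeholders:

    import subprocess

    AGENTS = ['agent1', 'agent2', 'agent3']  # hypothetical host names

    # Launch JMeter in non-GUI mode on every agent at roughly the same
    # time; each agent writes its own results file for later merging.
    procs = [subprocess.Popen(['ssh', host,
                               'jmeter -n -t plan.jmx -l results-%s.jtl' % host])
             for host in AGENTS]

    for p in procs:
        p.wait()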

4.3.3 The Grinder

As with JMeter, a single agent can be run from the command line. See JMeter comments, above.

4.4 Ease of Use

4.4.1 Load Runner

4.4.1.1 Installation

Installation takes a ton of time, a lot of disk space, and a very specific version of Windows. But it's as simple as running a Windows installer, followed by 3 or 4 product updaters.

4.4.1.2 Setting up Simple tests

For HTTP tests, Load Runner is strong in this category, with its browser recorder and icon-based scripts.

4.4.1.3 Running Tests

The UI of the controller is complex and a bit daunting. There is great power in the UI if you can find it.

4.4.2 JMeter

4.4.2.1 Installation

Be sure Sun's JRE is installed. Unpack the tar file. Simple.

4.4.2.2 Setting up Simple tests

Very quick. Start up the console, a few clicks of the mouse, and you are ready to generate load. Add thread group, add a sampler, and you have the basics. Throw in an assertion or two on your sampler to validate server responses.

4.4.2.3 Running Tests

Both distributed and local tests can be started from the UI. A menu shows the available agents and grays out the ones that are already busy. Standalone tests can be started from the command line. JMeter wins this category hands down.

4.4.3 The Grinder

4.4.3.1 Installation

Installation is as simple as installing java, and unpacking a tar file.

4.4.3.2 Setting up Simple tests

Setting up tests, even simple tests, requires writing Jython code. So developer experience is important. A proxy script recorder is included to simplify this. In addition, there are many useful example scripts included to help you get started.

4.4.3.3 Running Tests

Running a test involves configuring a grinder.properties file, manually starting an agent process, manually starting the console, and then telling the test to run from within the console UI.
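
For reference, a minimal grinder.properties might look something like this (the script name, counts, and console host are assumed example values):

    grinder.script = test.py
    grinder.processes = 1
    grinder.threads = 10
    # 0 runs means run until told to stop from the console
    grinder.runs = 0
    grinder.consoleHost = consolebox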

4.5 Results Reporting

4.5.1 Auto-generated?

Having key graphs generated at the conclusion of a scale run, such as load over time, server CPU, and transactions per second, can save a lot of tedium, since manually generating these graphs from log files is quite time-consuming.

4.5.1.1 Load Runner

Load Runner has an excellent integrated analysis tool that can dynamically generate graphs on any of the myriad performance counters available to it. The downfall of this approach is that there are only a small number of performance metrics it can gather on Solaris. And while I can gather additional server metrics using sar, vmstat, dtrace, iostat, mpstat, etc., integrating this information into the Load Runner framework will be difficult at best.

4.5.1.2 JMeter

JMeter does not gather any server-side performance metrics, but it can generate a limited number of client-side graphs while the test is in progress. These graphs can be saved after the test is over. Fortunately, all the test data is written in a standard format, so it probably makes more sense to generate all the desired graphs via shell scripts during post-processing. This is the same approach I used with our previous test harness.

4.5.1.3 The Grinder

As with JMeter, there are no graphs generated out of the box, but with the standard-format log files, scripted post-processing is reasonably straightforward, giving us a powerful and flexible view of the results.

4.5.2 Analysis tools?

4.5.2.1 Load Runner

Yes. Load Runner includes a very powerful tool for doing analysis after a run. Any number of customized graphs can be generated, and these graphs can be exported into an HTML report.

4.5.2.2 JMeter

Nothing included. I would want to transfer over some of the scripts used in our previous test harness, or write a simple tool that dumps test data into a DB for post-analysis.

4.5.2.3 The Grinder

Nothing included. See the JMeter comments, above.

4.6 Simplicity of Agent management

4.6.1 Load Runner

This works well in Load Runner; each agent can run as a service or an application, simplifying management. Test scripts are auto-deployed to agents.

4.6.2 JMeter

JMeter is good here. Each agent is a server that the controller can connect to at will in real-time. Test scripts are automatically sent to each agent, centralizing management.

4.6.3 The Grinder

The Grinder is the weakest here. The properties files that define how much load to apply must be manually deployed to all agents. A wrapper shell script like the one used by our previous test harness could address this by always deploying the Jython scripts to the agents before each run.

4.7 Tool is cross-platform

4.7.1 Load Runner

Not really. A subset of the complete agent functionality is available for agents running on Linux or Solaris, and non-Windows agents run each vuser as a process rather than a thread, reducing the amount of load an agent can produce. The controller and VUGen are both Windows-only. And Load Runner is poor at measuring non-Windows server statistics.

4.7.2 JMeter

Yes. The Java/Swing app is platform-agnostic.

4.7.3 The Grinder

Yes. This app is based on Java, Swing, and Jython. Like JMeter, it will run anywhere you can set up a JVM.

4.8 Cost

4.8.1 Load Runner

Expect to pay in the low to mid six figures for a license allowing any kind of robust load-generation capacity. And that's not all: there are high ongoing support costs as well. For the same kind of money, I could get over 100 powerful machines to use as scale agents, as well as the associated network switches, cabling, etc.

4.8.2 JMeter

Free. (Apache License)

4.8.3 The Grinder

Free. (Grinder License)

4.9 Intended audience/technical level

4.9.1 Load Runner

Load Runner has the widest audience of all these tools, perhaps not surprising given its maturity as a commercial product. Its browser recording and icon-based script development give it the lowest technical barriers to entry of the three products. A QA engineer with a modest technical background and little to no coding skill can still be productive with the tool. And its ability to load Windows .dll's and other libraries gives it a power and flexibility useful to developers and other more advanced users.

4.9.2 JMeter

JMeter does not require developer skills to perform basic tests in any of the protocols it supports out of the box. A form-driven UI allows users to design their own scenarios, which are then auto-deployed to all agents during test initialization.

4.9.3 The Grinder

While it's possible that a regular QA engineer could run the console and perform some testing, the tool is really aimed at developers. This is the only tool of the three that does not include any kind of icon-based or UI-based script development. At a minimum, users will need to know how to write Python/Jython code to create simple test scripts, and the ability to write custom Java classes may be required as well, depending on the scenario.

4.10 Stability/Bugginess

4.10.1 Load Runner

The controller crashes occasionally under heavy load, but this is infrequent and largely manageable. Other than this, the product seems robust enough.

4.10.2 JMeter

JMeter fares poorly in this area. As noted elsewhere in this comparison, several of its monitors threw null pointer exceptions or simply did not work at all.

4.10.3 The Grinder

I found no issues with The Grinder, other than the previously mentioned memory issue with large file downloads.

5 AGENTS

5.1 Power of transactions

5.1.1 How flexible are the criteria for what passes or fails?

5.1.1.1 Load Runner

Any arbitrary criteria can be set to define whether a transaction passes. This includes, but is not limited to, response time, contents of the response body, response code, or just about anything else.

5.1.1.2 JMeter

In JMeter, samplers generate your test requests. You can add a wide variety of assertion types to any of your samplers; these allow you to assert on the response code, match regular expressions against the response body, or assert on the size or md5sum of the response.

5.1.1.3 The Grinder

As with Load Runner, pass/fail criteria need merely be defined within the test script. The criteria can be whatever you want.

5.1.2 user-defined transaction/statistic types?

5.1.2.1 Load Runner

Yes – if you get away from the icon-based view in VUGen and go to the code level, you can wrap anything you want in a transaction to get timing information, pass/fail data, etc.

5.1.2.2 JMeter

Yes – done through plugins.

5.1.2.3 The Grinder

Yes – an API exists to easily wrap any Java or Jython method in a transaction.

5.2 Other Protocols

5.2.1 Which protocols are supported out of the box?

5.2.1.1 Load Runner

This varies by the type of license purchased, with each protocol having a separate cost and a separate limit for the number of allowable VUsers. The potential number of protocols is extremely high, including Java, ODBC, FTP, HTTP, and others.

5.2.1.2 JMeter

Supports several protocols out of the box:

  • JDBC

  • HTTP

  • FTP

  • JNDI

5.2.1.3 The Grinder

The Grinder only supports HTTP out of the box.
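
For flavor, here is a minimal HTTP script along the lines of the examples bundled with The Grinder (the URL is a placeholder):

    from net.grinder.script import Test
    from net.grinder.plugin.http import HTTPRequest

    # Wrapping the HTTPRequest in a Test records timing and pass/fail
    # statistics against test number 1.
    request = Test(1, "Front page").wrap(HTTPRequest())

    class TestRunner:
        def __call__(self):
            result = request.GET("http://server:8080/")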

5.2.2 Can transactions wrap custom (non-http) protocols? Can transactions wrap multiple (http or other) requests to the server?

5.2.2.1 Load Runner

Yes. There are multiple ways to do this. You can implement your own protocol handler in a .dll or in Load Runner's pseudo-C, then invoke this handler from any type of VUser that you have a license for. Alternately, unless your protocol is something uncommon, you can probably buy a pre-existing implementation of your protocol, plus licenses for VUsers to run it.

5.2.2.2 JMeter

Yes. An external Java plugin that supports your protocol must be added in to JMeter to support this.

5.2.2.3 The Grinder

Any protocol can be tested with The Grinder. An HTTP plugin is included. For other protocols, you create a separate Java class that implements a handler for your protocol, and in your test script you wrap this Java class in a Grinder test object. Your protocol is then invoked by calling any method you want on your Java class via the test wrapper. The wrapper will pass/fail the transaction based on response time.

This default behavior can be overridden with additional code in your Jython script. For example, after invoking your protocol method, you could inspect the state of your Java object and pass/fail the transaction based on information there.
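
A minimal sketch of the wrapping pattern (the protocol class and its method are hypothetical, and written in Jython here for brevity rather than as a separate Java class):

    from net.grinder.script import Test
    from net.grinder.script.Grinder import grinder

    # Hypothetical stand-in for a custom protocol handler.
    class EchoClient:
        def send(self, message):
            grinder.sleep(50)  # simulate real protocol work
            return message

    # Calls made through the wrapper are timed and recorded as test 100.
    echoTest = Test(100, "Echo transaction").wrap(EchoClient())

    class TestRunner:
        def __call__(self):
            # The wrapper passes/fails on response time by default; the
            # returned state could also be inspected for custom criteria.
            reply = echoTest.send("hello")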

5.3 Capacity of single agent to generate load, particularly in high-bandwidth scenarios

I have typically seen libraries like Apache's HTTPClient max out the CPU at 100% when conducting high-bandwidth, large file downloads. The library supports high bandwidth use and many transactions per second just fine, but has issues with repeated large file downloads.

5.3.1 Load Runner

Per-agent load generation capacity is strong. Licensing constraints may limit actual load generated.

5.3.2 JMeter

With the exception of the high-bandwidth case, per agent capacity is good.

5.3.3 The Grinder

Runs out of memory when repeatedly downloading large documents in many threads. Currently, there does not seem to be a workaround inside The Grinder itself. However, with my previous test harness I was able to work around this same issue by calling native code, and there is reason to believe that approach may work here as well.

5.4 Can support IP spoofing

Assuming a large range of valid IP addresses assigned to the agent machines, does the test harness support binding outgoing requests to arbitrary IP addresses? The ability to support this is critical for our test effort: if all broker requests come in from the same IP address, the broker thrashes unrealistically as it continually updates customer settings.

5.4.1 Load Runner

Yes. (see link in appendix 1)

5.4.2 JMeter

JMeter is weak here. There is a new mechanism (not yet released, but available in nightly builds) where outbound requests can round-robin on a predetermined list of local IPs. This is not good enough for Fat Client simulation.

5.4.3 The Grinder

The local IP address to bind the outbound request to can be specified in the Jython scripts. This is just what I need.
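
Here is a minimal sketch of how that can look (the server URL and the address pool are placeholders, and the addresses must already be configured on the agent machine):

    from net.grinder.script.Grinder import grinder
    from net.grinder.plugin.http import HTTPPluginControl

    # Hypothetical pool of local source addresses on this agent.
    SOURCE_IPS = ['192.168.1.%d' % n for n in range(100, 200)]

    class TestRunner:
        def __init__(self):
            # Bind this worker thread's connection to one of the local
            # addresses, spreading the threads across the pool.
            connection = HTTPPluginControl.getThreadConnection(
                'http://server:8080/')
            connection.setLocalAddress(
                SOURCE_IPS[grinder.threadID % len(SOURCE_IPS)])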

5.5 Can support variable connection speed/bandwidth throttling

5.5.1 Load Runner

Load Runner supports this out of the box.

5.5.2 JMeter

JMeter does not support this out of the box, but there is a slow socket implementation in the wild, written for the Apache HTTP Client (which JMeter uses), that should be possible to drop in fairly easily.

5.5.3 The Grinder

The Grinder does not support this. It may be possible with additional code hacking, but the path for this is not clear: the third-party HTTP implementation it uses means writing a custom solution may be challenging. Perhaps it would be possible using JNI and libcurl, although the author of the libcurl binding suggests there may be a memory leak in the C layer.

5.6 Can run arbitrary logic and external libraries within agent

5.6.1 Load Runner

Windows .dll's may be loaded, and home-made libraries written in Load Runner's pseudo-C work fine as well. Additionally, function libraries can be embedded directly in the virtual user script.

5.6.2 JMeter

External Java libraries can be accessed via the plugin architecture.

5.6.3 The Grinder

The Grinder offers lots of flexibility for loading and executing third-party libraries. With Jython, any Java code may be called, and most Python code may be run unchanged. And there is a decent collection of example scripts that comes with the Grinder distribution.

5.7 Scheduling

5.7.1 Load Runner

Load Runner has a powerful, UI-based scheduling tool which allows you great flexibility to schedule arbitrary amounts of load over time. Load can be incrementally stepped up and stepped down, by single threads or entire groups. There is a graphical schedule builder that can generate schedules of arbitrary complexity.

5.7.2 JMeter

JMeter has UI-based scheduling that allows per-thread startup delays, as well as runs that start in the future. JMeter tests can run forever, for a specified time interval, or for a specified number of iterations for each thread.

5.7.3 The Grinder

No per-thread ramp-in, and no generic scheduling tool. Primitive per-process (instead of per-thread) scheduling is possible (see the sketch below), but use of this feature probably reduces an agent's maximum load-generation capacity, as the overhead of running a new process is far greater than that of creating a new thread.
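
If I understand the mechanism correctly, the per-process ramp is driven by properties along these lines (the values are assumed examples):

    grinder.processes = 10
    # Start one additional worker process every 60 seconds.
    grinder.processIncrement = 1
    grinder.processIncrementInterval = 60000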

6 CONTROLLER

6.1 Ability of Controller to handle high volume of agent data

6.1.1 Load Runner

Load Runner probably handles as much real-time data as any product out there, or more, and it does so effectively. If you give the controller a beefy box to run on, you should have no problems.

6.1.2 JMeter

Limited. The number of transaction monitors you can have running is configurable. If more than one or two are going and the agents are producing a lot of transaction data, the UI takes all the CPU, bogs down, and becomes unusable.

6.1.3 The Grinder

The Grinder does very well here, probably better than Load Runner. By design, the agents only send a limited amount of real-time data back to the console during a test run, and the sampling period is adjustable with a big friendly slider. This is a handy feature I didn't fully appreciate at first: if the network bandwidth numbers are updating too fast, it's hard to see how many digits are in the number before it updates again. But with the slider, you can lock that number down for enough time to really consider it.

6.2 Real-Time Monitoring (Controller)

6.2.1 Load Runner

Load Runner features very strong real-time monitoring in the controller. Client-side graphs, such as total transactions per second and errors per second, can be displayed next to server-side graphs like CPU use and disk activity. The user can drag and drop from a list of dozens of graph types.

6.2.2 JMeter

Basic, table-based monitoring, similar to what is in our previous test harness, works properly. Other monitors threw null pointer exceptions.

6.2.3 The Grinder

The Grinder is good here. It has simple, sliding performance graphs for all transactions in one tab. These graphs are similar to what you see in the Windows Task Manager, where performance metrics older than a given amount of time slide off the left side of the graph. In addition, as in our previous test harness or JMeter, there is numeric data that periodically updates in a table.

6.3 Real-Time Load Adjustment

Sometimes while a test is in progress, you want to make adjustments. Increase the load. Decrease the load. Bring another agent online.

6.3.1 Load Runner

Load Runner wrote the book on this topic, with its highly-flexible ability to start and stop load in the middle of a test, with individual agents, groups of agents, or the entire set of agents.

6.3.2 JMeter

JMeter has the ability to interactively start and stop load on an agent-by-agent basis. It cannot interactively be done at the per-thread level, but agents and thread groups can have schedulers assigned to them.

6.3.3 The Grinder

The Grinder console does not have the ability to dynamically adjust the levels of load being generated by the agents. Coupled with its lack of a scheduler, this makes the Grinder the least flexible of the three tools when it comes to interactively setting load levels.

6.4 Controller-side script management/deployment

6.4.1 Load Runner

Yes.

6.4.2 JMeter

Yes.

6.4.3 The Grinder

Yes.

6.5 Can write simple scripts in the UI?

6.5.1 Load Runner

Load Runner comes with a powerful script-development tool, VUGen. This gives the test developer the option of developing icon-based test scripts, as well as a traditional code-view development environment. In addition, Load Runner can record web browser sessions to auto-generate scripts based on the recorded actions.

6.5.2 JMeter

Scripts are based on XML. They can be written in your preferred text editor, or created in an icon-based UI in the controller window. I found this feature to be both easy to use and surprisingly flexible. There is also a recorder feature to let you interactively create your scripts.

6.5.3 The Grinder

The Grinder is the weakest of the three here. It does have a TCP Proxy feature that can record browser sessions into Jython scripts, but there is no integrated graphical environment for script development.

7 CONCLUSION

I selected The Grinder due to several make-or-break issues. However, each tool has unique strengths and weaknesses. Which tool is ultimately best for you depends on a number of things, such as:

  • Does your budget allow for an expenditure ranging from several tens to hundreds of thousands of dollars?

  • Will you be testing in a Windows-only environment?

  • What is the technical level of your scale testers?

Both of the open source projects have merits, but neither one is ideal. My approach will be to work with the Grinder development team to resolve the most serious offenders.

8 Appendix 1 – Additional information

Load Runner system requirements (controller must be on Windows!)

http://www.mercury.com/us/products/performance-center/loadrunner/requirements.html

Linux/Solaris server monitoring (weak)

http://www.mercury.com/us/products/performance-center/loadrunner/monitors/unix.html

JMeter home page

http://jakarta.apache.org/jmeter/

JMeter Manual

http://jakarta.apache.org/jmeter/usermanual/index.html

The Grinder home page

http://grinder.sourceforge.net/

The Grinder Manual

http://grinder.sourceforge.net/g3/getting-started.html

Windows IP address multi homing

http://support.microsoft.com/kb/q149426/

9 Appendix 2 – Distinguishing features

These are some of the distinguishing features of each product:

Cool with Load Runner

  • highly developed, mature product

  • strong support

  • It is complex, but feature-rich

Problems w/ Load Runner

  • Extreme cost, both up front and ongoing

  • Limited load generation capacity based on license/key.

  • Limited ability to monitor server stats outside of windows.

Cool w/ Grinder

  • Jython scripting means rapid script development

  • Jython simplifies coding complex tasks

  • Good real-time feedback in the UI in most tabs.

  • Sockets based agent/controller communications. Trouble-free in our testing.

Problems w/ Grinder:

(Since this original article was posted, many of these issues have been addressed. See the blog entry titled "The Grinder: Addressing the Warts.")
  • no scheduling; load is all-or-nothing

  • no slow sockets, no prospects for easily fixing this
  • Memory failures in a few large file download scenarios.

Cool w/ JMeter:

  • Less technical expertise required
  • Overall more “slick” or “polished” feel – availability of startup scripts, more utility in the UI.

Problems w/ JMeter:
  • Limited feedback in the UI when the test is running
  • Memory and CPU issues when downloading very large files
  • The UI is buggy. Big pieces, including monitors, just don't work. Many Null Pointer Exceptions in the log, etc.


Tuesday, May 02, 2006

it's all about scalability

This blog will cover issues in scalability testing. We will review free and open-source test tools, and discuss techniques for generating your test model, executing tests, interpreting your results, and other discussion-driven topics. Enjoy!