An evidence-based approach to Java performance tuning

Donald Knuth is often quoted as saying this:

“Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.”

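Source: Donald Knuth, “Structured Programming with go to Statements”, ACM Computing Surveys, December 1974.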

Bearing that sage advice in mind, here is the recommended procedure for optimizing programs:

1. First of all, design and code your program or library with a focus on simplicity and correctness. To start with, don’t spend much effort on performance.
2. Get it to a working state, and (ideally) develop unit tests for the key parts of the codebase.
3. Develop an application-level performance benchmark. The benchmark should cover the performance-critical aspects of your application, and should perform a range of tasks that are typical of how the application will be used in production.
4. Measure the performance.
5. Compare the measured performance against your criteria for how fast the application needs to be. (Avoid unrealistic, unattainable or unquantifiable criteria such as “as fast as possible”.)
6. If you have met the criteria, STOP. Your job is done. (Any further effort is probably a waste of time.)
7. Profile the application while it is running your performance benchmark.
8. Examine the profiling results and pick the biggest (unoptimized) “performance hotspots”; i.e. the sections of the code where the application seems to be spending the most time.
9. Analyse the hotspot code section to try to understand why it is a bottleneck, and think of a way to make it faster.
10. Implement that as a proposed code change, then test and debug it.
11. Rerun the benchmark to see if the code change has improved the performance:

    - If Yes, then return to step 4.
    - If No, then abandon the change and return to step 9. If you are making no progress, pick a different hotspot for your attention.
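
To make the benchmark, measure and compare steps more concrete, here is a minimal hand-rolled sketch. Everything in it is an assumption made for illustration: the class name `SimpleBenchmark`, the stand-in workload (sorting 100,000 random ints) and the 50 ms target are all hypothetical. For serious measurements a dedicated harness such as JMH is likely to give more trustworthy numbers, because it deals with JIT warm-up and dead-code elimination more carefully than a loop like this can.

```java
import java.util.Arrays;
import java.util.Random;

/** Minimal hand-rolled benchmark loop. All names here are hypothetical. */
final class SimpleBenchmark {

    /**
     * Runs the task untimed for warmupRuns iterations, then times timedRuns
     * iterations and returns the middle sample in nanoseconds (the upper
     * median when timedRuns is even). timedRuns must be at least 1.
     */
    static long medianNanos(Runnable task, int warmupRuns, int timedRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            task.run(); // give the JIT compiler a chance to compile the hot paths
        }
        long[] samples = new long[timedRuns];
        for (int i = 0; i < timedRuns; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[timedRuns / 2];
    }

    /** Compares a measurement against a quantified performance criterion. */
    static boolean meetsTarget(long measuredNanos, long targetNanos) {
        return measuredNanos <= targetNanos;
    }

    public static void main(String[] args) {
        // Stand-in workload (an assumption): sort a copy of 100,000 random ints.
        int[] data = new Random(42).ints(100_000).toArray();
        long[] sink = new long[1];
        Runnable task = () -> {
            int[] copy = data.clone();
            Arrays.sort(copy);
            sink[0] += copy[0]; // use the result so the work is not optimized away
        };

        long median = medianNanos(task, 20, 50);
        long targetNanos = 50_000_000L; // hypothetical criterion: 50 ms per run
        System.out.printf("median = %.3f ms, target met: %b%n",
                median / 1e6, meetsTarget(median, targetNanos));
    }
}
```

If `meetsTarget` returns true, the procedure says to stop. Otherwise the same benchmark becomes the workload to run under a profiler, and the number to beat after each proposed change.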

Eventually you will get to a point where the application is either fast enough, or you have considered all of the significant hotspots. At this point you need to stop this approach. If a section of code is consuming (say) 1% of the overall time, then even a 50% improvement is only going to make the application 0.5% faster overall.
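
The arithmetic behind that 0.5% figure can be sketched as follows. The class and method names are hypothetical, and the sketch assumes that a "50% improvement" means the section takes half as long as before. The 40% hotspot in the second call is an invented figure for contrast.

```java
/** Back-of-envelope payoff of optimizing one section. Names are hypothetical. */
final class HotspotPayoff {

    /**
     * @param fraction       share of total run time spent in the section (0.0 to 1.0)
     * @param sectionSpeedup how many times faster the section becomes (2.0 = half the time)
     * @return the share of the original total run time that is saved
     */
    static double timeSavedFraction(double fraction, double sectionSpeedup) {
        return fraction - fraction / sectionSpeedup;
    }

    /** Overall speedup factor for the whole program (Amdahl's law). */
    static double overallSpeedup(double fraction, double sectionSpeedup) {
        return 1.0 / ((1.0 - fraction) + fraction / sectionSpeedup);
    }

    public static void main(String[] args) {
        // A section using 1% of the time, made twice as fast: saves about 0.5% overall.
        System.out.printf("1%% hotspot halved:  %.1f%% less total time%n",
                100 * timeSavedFraction(0.01, 2.0));
        // A section using 40% of the time, made twice as fast: saves about 20% overall.
        System.out.printf("40%% hotspot halved: %.1f%% less total time%n",
                100 * timeSavedFraction(0.40, 2.0));
    }
}
```

The same change applied to a hotspot that accounts for 40% of the run time pays off forty times more, which is why the profile, not intuition, should decide where the effort goes.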

Clearly, there is a point beyond which hotspot optimization is a waste of effort. If you get to that point, you need to take a more radical approach. For example:

- Look at the algorithmic complexity of your core algorithms.
- If the application is spending a lot of time on garbage collection, look for ways to reduce the rate of object creation.
- If key parts of the application are CPU intensive and single-threaded, look for opportunities for parallelism.
- If the application is already multi-threaded, look for concurrency bottlenecks.
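
As an illustration of the first two suggestions only, here is a small sketch. The class `RadicalFixes` and its methods are hypothetical examples, not code from any particular application. The first pair of methods swaps a quadratic duplicate check for a hash-based one; the second pair removes a boxed accumulator that may allocate a new `Long` on most iterations.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative before/after pairs. All names are hypothetical. */
final class RadicalFixes {

    /** O(n^2): compares every pair of elements. */
    static boolean hasDuplicateQuadratic(List<String> items) {
        for (int i = 0; i < items.size(); i++) {
            for (int j = i + 1; j < items.size(); j++) {
                if (items.get(i).equals(items.get(j))) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Expected O(n): a HashSet remembers which elements have been seen. */
    static boolean hasDuplicateLinear(List<String> items) {
        Set<String> seen = new HashSet<>();
        for (String item : items) {
            if (!seen.add(item)) { // add returns false if the item was already present
                return true;
            }
        }
        return false;
    }

    /** Allocation-heavy: each += unboxes, adds, and boxes the result again. */
    static long sumBoxed(int n) {
        Long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    /** Same result with a primitive accumulator and no boxing. */
    static long sumPrimitive(int n) {
        long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> names = List.of("ann", "bob", "cid", "bob");
        System.out.println(hasDuplicateQuadratic(names) == hasDuplicateLinear(names));
        System.out.println(sumBoxed(1_000) == sumPrimitive(1_000));
    }
}
```

Whether either rewrite matters in a given application is something only the benchmark can say: the JIT compiler can sometimes remove boxing on its own, and for small lists the quadratic version may be perfectly adequate.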

But wherever possible, rely on tools and measurement rather than instinct to direct your optimization effort.
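
As one example of measuring instead of guessing, the sketch below uses the standard `java.lang.management` API to estimate how much of a run was spent in garbage collection. The class name `GcTimeProbe`, the helper methods and the stand-in workload are hypothetical. The figure should be treated as approximate, since what a collector reports as collection time varies between collectors, and a collector may report the value as undefined.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Rough estimate of time spent in garbage collection. Names are hypothetical. */
final class GcTimeProbe {

    /** Sums the accumulated collection time (ms) over all collectors, skipping undefined (-1) values. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Share of wall-clock time attributed to GC; 0.0 if no time has elapsed. */
    static double gcShare(long gcMillis, long wallMillis) {
        return wallMillis <= 0 ? 0.0 : (double) gcMillis / wallMillis;
    }

    public static void main(String[] args) {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();

        // Stand-in workload (an assumption): create many short-lived strings.
        long checksum = 0;
        for (int i = 0; i < 2_000_000; i++) {
            checksum += String.valueOf(i).length();
        }

        long wallMillis = (System.nanoTime() - start) / 1_000_000;
        long gcMillis = totalGcMillis() - gcBefore;
        System.out.printf("checksum=%d wall=%d ms gc=%d ms (%.1f%% of wall time)%n",
                checksum, wallMillis, gcMillis, 100 * gcShare(gcMillis, wallMillis));
    }
}
```

A high GC share on the real benchmark would support looking at object creation rates; a low one suggests the time is going elsewhere and a profiler is the better next step.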
