JVM Tuning – Garbage Collection

As new objects are created, they are placed in the Java heap, and as far as a Java application programmer is concerned, that’s the end of the story. When objects are no longer in use, they appear to die off by themselves. This paradigm is enabled by automated garbage collection, which insulates Java programmers from the manual memory management they would otherwise have to perform, as in C or C++.
Automated garbage collection is a mechanism provided by the JVM (Java Virtual Machine) to reclaim heap space by removing objects which are eligible for garbage collection. An object becomes eligible for garbage collection (GC) when it is no longer reachable from any live thread or static reference. Setting every reference to it to null is one way for that to happen, but not the only one. Reference cycles do not keep objects alive: even if Object A holds a reference to Object B, and Object B holds a reference back to Object A, both are eligible for garbage collection as long as nothing reachable refers to either of them. There are corner cases such as weak references which we’ll ignore for simplicity.
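
The reachability rule can be illustrated with a small sketch. This is not how the JVM is implemented; it models objects as names in a hypothetical reference graph (the names `root`, `A`, `B` and `C` are made up for the example) and walks the graph from the roots.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReachabilityDemo {
    /** Returns every node reachable from the roots by following references. */
    static Set<String> reachable(Map<String, List<String>> refs, List<String> roots) {
        Set<String> seen = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String node = pending.pop();
            if (seen.add(node)) {
                // First visit: queue everything this node refers to.
                pending.addAll(refs.getOrDefault(node, List.of()));
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Hypothetical graph: A and B refer to each other, but no root reaches them.
        Map<String, List<String>> refs = Map.of(
                "root", List.of("C"),
                "A", List.of("B"),
                "B", List.of("A"),
                "C", List.of());
        Set<String> live = reachable(refs, List.of("root"));
        System.out.println("A live? " + live.contains("A")); // prints "A live? false"
        System.out.println("C live? " + live.contains("C")); // prints "C live? true"
    }
}
```

A and B still hold non-null references to each other, yet neither is reachable, so both would be eligible for collection.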

In any application heap, some objects become garbage shortly after their creation, some survive for a long time and then become garbage, and others can remain live for the entirety of the program’s run.

A generational garbage collector exploits this by dividing the heap into multiple generations. Objects are created in the young generation, and objects that meet some promotion criteria, such as having survived a certain number of collections, are then promoted to the next older generation. A generational collector is free to use a different collection strategy for different generations and perform garbage collection on the generations separately.
One of the advantages of a generational approach is that garbage collection pauses can be made shorter by not collecting all generations at once. When the allocator is unable to fulfil an allocation request, it first triggers a minor collection, which only collects the young generation. Since many of the objects in the young generation will already be dead, minor collection pauses can be quite short and can often reclaim significant heap space. If the minor collection frees enough heap space, the user program can resume immediately. If it does not free enough heap space, the collector proceeds to collect older generations until enough memory has been reclaimed. (In the event the garbage collector cannot reclaim enough memory after a full collection, it will either expand the heap or throw an OutOfMemoryError.)
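
The promotion idea can be sketched as a toy model. Everything here is an assumption made for illustration: the class and field names are invented, liveness is a flag set by hand rather than computed by reachability, and the tenuring threshold of 2 is arbitrary.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class GenerationalSketch {
    static final int TENURING_THRESHOLD = 2; // assumed value, for illustration only

    static final class Obj {
        final String name;
        boolean live = true; // set by hand here; a real collector derives this from reachability
        int age = 0;         // number of minor collections survived
        Obj(String name) { this.name = name; }
    }

    final List<Obj> young = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    Obj allocate(String name) {
        Obj o = new Obj(name);
        young.add(o); // new objects always start in the young generation
        return o;
    }

    /** Minor collection: drop dead young objects, age the survivors, promote old enough ones. */
    void minorCollection() {
        Iterator<Obj> it = young.iterator();
        while (it.hasNext()) {
            Obj o = it.next();
            if (!o.live) {
                it.remove();
            } else if (++o.age >= TENURING_THRESHOLD) {
                it.remove();
                old.add(o);
            }
        }
    }

    /** Full collection: a minor collection plus a sweep of the old generation. */
    void fullCollection() {
        minorCollection();
        old.removeIf(o -> !o.live);
    }

    public static void main(String[] args) {
        GenerationalSketch heap = new GenerationalSketch();
        heap.allocate("a");
        heap.allocate("b").live = false; // b dies young
        heap.minorCollection();          // b reclaimed, a survives with age 1
        heap.minorCollection();          // a reaches the threshold and is promoted
        System.out.println("young=" + heap.young.size() + " old=" + heap.old.size()); // prints "young=0 old=1"
    }
}
```

The short-lived object never reaches the old generation, which is why minor collections can stay cheap.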

The Java Heap

The JVM divides the heap into these generations:

  • Young generation (Nursery/New Generation)
    • Eden space (Creation Space)
      • Newly created objects
    • Survivor spaces (2 semi-spaces)
      • Objects surviving minor collections
  • Tenured generation (Old Generation)
    • Objects promoted after surviving enough minor collections
  • Permanent generation
    • Stores loaded classes and method metadata (replaced by Metaspace, held in native memory, from Java 8 onwards)
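
One way to see how a running JVM actually lays out these spaces is to ask the standard `java.lang.management` API for its memory pools. This is a minimal sketch; the pool names it prints depend on the JVM version and the collector in use, so no particular output is assumed.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

class MemoryPools {
    /** Returns one line per memory pool, e.g. an eden, survivor or old generation pool. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // getType() distinguishes heap pools from non-heap pools.
            lines.add(pool.getName() + " [" + pool.getType() + "]");
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```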

Heap Sizing

  • Rule of thumb: a Java heap of 1-2GB is usually optimal
  • Rule of thumb: treat 10-15GB as the absolute maximum Java heap
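
The heap limits are normally set with the `-Xms` (initial size) and `-Xmx` (maximum size) options. As a quick check of what the JVM ended up with, the sketch below reads the values back through `Runtime`; the class name and the megabyte helper are just for this example.

```java
class HeapSize {
    static long toMegabytes(long bytes) {
        return bytes / (1024 * 1024);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory: the most the heap may grow to (roughly the -Xmx value).
        System.out.println("max heap:       " + toMegabytes(rt.maxMemory()) + " MB");
        // totalMemory: heap currently reserved by the JVM.
        System.out.println("committed heap: " + toMegabytes(rt.totalMemory()) + " MB");
        // freeMemory: unused part of the committed heap.
        System.out.println("free heap:      " + toMegabytes(rt.freeMemory()) + " MB");
    }
}
```

Running it with, for example, `-Xms1g -Xmx2g` should report a maximum close to 2048 MB, although the exact figure can differ slightly between JVMs.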

Garbage Collectors

Mark-sweep collectors

The most basic form of garbage collector is the mark-sweep collector, in which the world is stopped and the collector visits each live node, starting from the roots, and marks each node it visits. When there are no more references to follow, marking is complete. The heap is then swept (that is, every object in the heap is examined), and any object not marked is reclaimed as garbage and returned to the free list.

The big problem with mark-sweep is that every active (that is, allocated) object, whether reachable or not, is visited during the sweep phase. Because a significant percentage of objects are likely to be garbage, this means that the collector is spending considerable effort examining and handling garbage. Mark-sweep collectors also tend to leave the heap fragmented, which can cause locality issues and can also cause allocation failures even when sufficient free memory appears to be available.
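
The two phases can be sketched on a toy heap. This is an illustrative model only: the `Node` type and the list standing in for the heap are assumptions, and a real collector works on raw memory rather than Java collections.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

class MarkSweepSketch {
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        boolean marked;
        Node(String name) { this.name = name; }
    }

    /** Mark phase: visit everything reachable from the root and set its mark bit. */
    static void mark(Node root) {
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            if (!n.marked) {
                n.marked = true;
                n.refs.forEach(stack::push);
            }
        }
    }

    /** Sweep phase: examine every allocated node, reclaim the unmarked ones, reset marks. */
    static List<String> sweep(List<Node> heap) {
        List<String> reclaimed = new ArrayList<>();
        Iterator<Node> it = heap.iterator();
        while (it.hasNext()) {
            Node n = it.next();
            if (n.marked) {
                n.marked = false; // ready for the next collection cycle
            } else {
                reclaimed.add(n.name);
                it.remove();
            }
        }
        return reclaimed;
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c");
        a.refs.add(b); // a -> b; nothing refers to c
        List<Node> heap = new ArrayList<>(List.of(a, b, c));
        mark(a);
        System.out.println("reclaimed: " + sweep(heap)); // prints "reclaimed: [c]"
    }
}
```

Note that the sweep loop touches all three nodes, garbage included, which is the cost the paragraph above describes.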

Mark-compact collectors

Like mark-sweep, mark-compact is a two-phase process, where each live object is visited and marked in the marking phase. Marked objects are then copied to ensure that the live objects are compacted at the bottom of the heap. Long-lived objects tend to accumulate at the bottom of the heap, making object location easier (and faster).
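
Compaction itself can be sketched as sliding the marked objects towards index 0 of an array that stands in for the heap. This is a simplified model under stated assumptions: the `live` flags are taken as the result of an earlier marking phase, and the pointer updates a real collector must perform are left out.

```java
import java.util.Arrays;

class CompactSketch {
    /**
     * Slides every live slot to the low end of the heap array, in order,
     * and clears the rest. Returns the number of live objects, which is
     * also the index where the next allocation would go.
     */
    static int compact(String[] heap, boolean[] live) {
        int free = 0; // next slot to fill at the bottom of the heap
        for (int i = 0; i < heap.length; i++) {
            if (live[i]) {
                heap[free] = heap[i];
                free++;
            }
        }
        Arrays.fill(heap, free, heap.length, null); // everything above is now free space
        return free;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "x", "B", "y", "C"};
        boolean[] live = {true, false, true, false, true};
        int next = compact(heap, live);
        System.out.println(Arrays.toString(heap) + " next free slot: " + next);
        // prints "[A, B, C, null, null] next free slot: 3"
    }
}
```

After compaction the free space is one contiguous block, so allocation reduces to bumping the `next` index.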

Serial Mark-Sweep-Compact Collector (Tenured/PSOldGen)

The serial mark-sweep-compact collector is useful for small (< 256MB) heaps, providing reasonable performance at the cost of occasional latency. It combines the mark-sweep and mark-compact operations.

  • Operation
    • Relocates live objects to the start of the heap
    • Updates pointers
  • The Bad
    • O(objects in heap)
    • Stop-The-World
  • The Good
    • High throughput
    • Reduced fragmentation
    • Allocation is cheap
    • Object location cheap

Parallel Mark-Compact Collector (Throughput)

The parallel mark-compact collector is useful for medium-sized (1GB – 4GB) heaps where throughput is the greatest concern.

  • Operation
    • Relocates live objects to the start of the heap
    • Updates pointers
  • The Good
    • High throughput
    • Reduced fragmentation
    • Object location cheap
    • Allocation is cheap
  • The Bad
    • O(objects in heap)
    • Stop-The-World (Although reduced wall-time of stop-the-world compared to serial)

Concurrent-Mark-Sweep (CMS) (Concurrent Low Pause Collector)

The concurrent mark-sweep collector is useful for medium-sized (1GB – 4GB) heaps where low latency is the greatest concern.

  • Operation
    • Concurrent mark
    • Concurrent sweep
  • The Good
    • Low pause
  • The Bad
    • Lower throughput
    • No relocation of live objects
    • Fragmentation
    • Object location expensive
    • Allocation expensive

Garbage First Collector (G1GC)

The garbage first collector is useful for medium to large (4GB – 16GB) heaps where low latency is the greatest concern, while still providing reasonable throughput.

  • Operation
    • Heap divided into regions
    • Regions subdivided into cards
    • Monitors garbage per region
    • Monitors current allocation rate (predicting demand)
    • Collects only as many regions as demand requires, those with the most garbage first
  • The Good
    • Low pause
  • The Bad
    • Reference assignment is slower (each write must update cards)

Parallel Scavenge (PSYoungGen) (Young Generation only)

The parallel scavenge collector is a parallel young generation collector that may be used in combination with the Serial Mark-Sweep-Compact Collector or the Parallel Mark-Compact Collector.

Serial Copy (DefNew) (Young Generation only)

The serial copy collector is a serial young generation collector that may be used in combination with the Serial Mark-Sweep-Compact Collector or the Concurrent Mark-Sweep Collector.

Switching between HotSpot collectors

The HotSpot JVM can use one of the 6 combinations of garbage collection algorithms listed below.

Young collector                | Tenured collector                    | JVM options
Serial (DefNew)                | Serial Mark-Sweep-Compact (Tenured)  | -XX:+UseSerialGC
Parallel scavenge (PSYoungGen) | Serial Mark-Sweep-Compact (PSOldGen) | -XX:+UseParallelGC
Parallel scavenge (PSYoungGen) | Parallel Mark-Compact (ParOldGen)    | -XX:+UseParallelGC -XX:+UseParallelOldGC
Serial (DefNew)                | Concurrent Mark Sweep                | -XX:+UseConcMarkSweepGC -XX:-UseParNewGC
Parallel (ParNew)              | Concurrent Mark Sweep                | -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
Garbage First (G1)             | Garbage First (G1)                   | -XX:+UseG1GC
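
After passing one of these options, it can be worth confirming which collectors the JVM actually selected. The sketch below uses the standard `GarbageCollectorMXBean` API to list them; the class name is made up for the example, and the collector names reported differ between JVM versions, so none are assumed here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectors {
    /** Names of the collectors the running JVM has registered (typically one young, one old). */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and accumulated time of collections since JVM start.
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}
```

Running the same class once with `-XX:+UseSerialGC` and once with `-XX:+UseG1GC` should show different collector names, provided the JDK in use still accepts the flag.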

Tuning the Heap

Unfortunately, there is no one-size-fits-all combination of garbage collector and generation sizes. The only way to determine the ideal settings is system profiling, gathering statistics and trial-and-error.

Here are some rules of thumb to follow when tuning the heap, but remember there is no silver bullet…

Young generation

  • Ensure the eden space is large enough to hold the working set of a burst operation, that is, many new objects created at a high rate
  • Ensure the survivor spaces can cope with long-lived objects that need to be tenured fast – tweak the tenuring threshold
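
In HotSpot the young generation is commonly shaped with `-Xmn` (young generation size), `-XX:SurvivorRatio` (size of eden relative to one survivor space) and `-XX:MaxTenuringThreshold`. The arithmetic behind `SurvivorRatio` can be sketched as follows. The helper names are invented, and the result is approximate because the JVM aligns the real sizes and some collectors resize the spaces adaptively.

```java
class YoungGenSizing {
    /** Size of ONE survivor space: the young generation is eden plus two survivor spaces. */
    static long survivorBytes(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    /** Eden gets whatever is left after the two survivor spaces. */
    static long edenBytes(long youngBytes, int survivorRatio) {
        return youngBytes - 2 * survivorBytes(youngBytes, survivorRatio);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long young = 512 * mb; // e.g. -Xmn512m
        int ratio = 6;         // e.g. -XX:SurvivorRatio=6
        System.out.println("eden=" + edenBytes(young, ratio) / mb + " MB, survivor="
                + survivorBytes(young, ratio) / mb + " MB each");
        // prints "eden=384 MB, survivor=64 MB each"
    }
}
```

A lower ratio gives the survivor spaces more room at the expense of eden, which is the trade-off the two bullets above describe.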

Tenured generation

  • Find the maximum working set size (system under load)
  • Over-provision the maximum working set by 25-30%
  • Merely using the largest tenured generation size you can afford is counter-productive
    • More system resources are consumed for no benefit
    • Larger heaps result in longer garbage collections – hurting performance
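
As a rough starting point for the first two bullets, heap usage measured right after a collection can serve as a proxy for the working set, with the 25-30% headroom applied on top. The sketch below is one possible approach, not an established recipe: it sums `getCollectionUsage()` over the heap pools, which is only meaningful once collections have run under realistic load, and the helper names are invented.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

class TenuredSizing {
    /** Over-provisions a measured working set by the given fraction (0.25 means 25%). */
    static long provisioned(long workingSetBytes, double headroom) {
        return Math.round(workingSetBytes * (1.0 + headroom));
    }

    /** Sums heap usage as of each pool's most recent collection; a rough proxy for the live set. */
    static long liveAfterLastGc() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage afterGc = pool.getCollectionUsage(); // null if the pool does not support it
            if (pool.getType() == MemoryType.HEAP && afterGc != null) {
                total += afterGc.getUsed();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long live = liveAfterLastGc();
        System.out.println("live after last GC: " + live + " bytes");
        System.out.println("with 25% headroom:  " + provisioned(live, 0.25) + " bytes");
        System.out.println("with 30% headroom:  " + provisioned(live, 0.30) + " bytes");
    }
}
```

Sampling this value repeatedly while the system is under load and taking the maximum gives a figure to size the tenured generation against.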