What happens when you only limit the maximum heap size?

#java Nov 15, 2022 13 min Mike Kowalski

The JVM has spoiled us with its cleverness. It makes so many decisions behind the scenes that many of us have given up on looking at what’s inside. Memory-related discussions are probably more likely to appear at a conference or during a job interview than at “real” work (depending, of course, on what you work on).

Java apps are often run in containers these days. Built-in container awareness makes the JVM respect various container-specific limits (e.g. CPU, memory). This means that even when running an app with a dummy java -jar app.jar (which is usually not the best idea), everything should just work. That’s probably why the only memory-related option provided is often the -Xmx flag (or any of its equivalents). In other words, we tend to only limit the maximum heap size, like this:

java -Xmx256m <hopefully few other options here> -jar app.jar 
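Built-in container awareness can be verified by checking the UseContainerSupport flag (available and enabled by default on Linux since JDK 10). A quick check, using the same base image as in the experiment below, could look like this:

❯ docker run --rm amazoncorretto:17.0.5-al2 \
    java -XX:+PrintFlagsFinal -version | grep UseContainerSupport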

Looking at apps like this made me wonder: what happens when we omit other heap-related flags? Is there any performance penalty, especially when running with a small heap? What happens under the hood? Do containers make any difference here?

The short answer is: the JVM will pick the heap configuration for us, but its choice may not be the coolest beer in the fridge. Even with a small heap, the performance implications may be noticeable, especially when combined with the sometimes-default Serial GC. But of course, it all depends…

Let’s try to explain what it depends on, starting with an experiment.

Note: In this article, the terms ‘Java’ and ‘JVM’ refer to the most popular HotSpot virtual machine, known from the OpenJDK project. Other Java VM implementations like Eclipse OpenJ9 may behave differently. All the tests have been performed using the Amazon Corretto OpenJDK 17 distribution.

The experiment

I’ve created an extremely simple Spring Boot 2.7 app exposing a single reactive REST endpoint, together with the default set of Actuator endpoints. These dependencies were selected only to keep the app busy on startup. The application itself has been instructed to stop (by calling System.exit(0);) right after it’s fully initialized. It has also been dockerized with the following configuration:

FROM amazoncorretto:17.0.5-al2
COPY target/experiment-0.0.1-SNAPSHOT.jar app.jar
CMD ["java", "-Xmx128m", "-XX:+UseSerialGC", "-Xlog:gc", "-jar", "app.jar"]

Then, I kept running the app, changing only the container’s memory limit. The rest of the parameters (single CPU, maximum heap size of 128m, Serial GC, and GC logs enabled) remained the same:

❯ docker build -t heap-experiment:latest . >/dev/null 2>&1
❯ docker run --cpus=1 --memory=512m  heap-experiment:latest > logs-512m.txt
❯ docker run --cpus=1 --memory=1024m heap-experiment:latest > logs-1024m.txt
❯ docker run --cpus=1 --memory=1536m heap-experiment:latest > logs-1536m.txt
❯ docker run --cpus=1 --memory=2048m heap-experiment:latest > logs-2048m.txt
❯ docker run --cpus=1 --memory=4096m heap-experiment:latest > logs-4096m.txt
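Each pause reported with -Xlog:gc ends with its duration in milliseconds, which makes the logs easy to aggregate. A minimal sketch (assuming the unified logging format of JDK 17) for summing up the young collection pauses from a single run could look like this:

❯ grep 'Pause Young' logs-512m.txt | \
    awk '{ sub(/ms$/, "", $NF); sum += $NF; n++ } END { printf "%d events, %.3f ms total\n", n, sum }'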

Using the GC logs produced by each run, I was able to calculate basic statistics about the time spent on garbage collection. A single CPU combined with the Serial GC ensured that each GC pause actually stopped the application. The results were as follows:

Container memory   Pause Young events   Pause Young total time   Pause Full events   Pause Full total time   Total GC time
512m               103                  69.627 ms                2                   23.625 ms               93.252 ms
1024m              68                   60.613 ms                1                   15.540 ms               76.153 ms
1536m              49                   54.170 ms                1                   16.479 ms               70.649 ms
2048m              38                   55.748 ms                1                   14.935 ms               70.683 ms
4096m              18                   40.504 ms                1                   15.231 ms               55.735 ms

The biggest difference in total GC time between the two extreme configurations (512m vs 4096m) was close to 37.5 ms. The JVM spent this time on additional garbage collection that apparently could have been avoided. In certain use cases, such a difference on startup could actually be significant (or even affect reliability)!

Should we just blindly increase the memory limits of our containers then? Not really. Instead, let’s see where such a difference comes from.

For the impatient: if you want to skip the part about the JVM internals, you can jump directly to the “Experiment revisited” section, discussing the results once again.

JVM Ergonomics

The “magic” part of the JVM responsible for many of the default configuration choices is called Ergonomics.

Ergonomics is the process by which the Java Virtual Machine (JVM) and garbage collection heuristics, such as behavior-based heuristics, improve application performance.

The JVM provides platform-dependent default selections for the garbage collector, heap size, and runtime compiler. (…) In addition, behavior-based tuning dynamically optimizes the sizes of the heap to meet a specified behavior of the application.

HotSpot Virtual Machine Garbage Collection Tuning Guide

Decisions made by the Ergonomics process depend on the target environment (platform). Things like the number of CPUs or the amount of available memory really count. Ergonomics behavior can differ across computers and containers, which makes it not so straightforward to predict.

Heap size is one of the aspects controlled by JVM Ergonomics unless configured directly. Quick recap: the heap is the place where all objects and arrays instantiated by our app are stored. It’s also what we most often look at when talking about memory consumption (although it’s more complicated than that). In short, the JVM allocates a certain amount of memory for us, so that we can keep our app’s data there.

To see some of the heap-related options controlled by the Ergonomics, we can set the -XX:+PrintFlagsFinal option:

❯ java -XX:+PrintFlagsFinal -version 2>&1 | grep ergonomic | grep Heap | tr -s ' '
 size_t G1HeapRegionSize = 4194304 {product} {ergonomic}
 size_t InitialHeapSize = 536870912 {product} {ergonomic}
 size_t MaxHeapSize = 8589934592 {product} {ergonomic}
 size_t MinHeapDeltaBytes = 4194304 {product} {ergonomic}
 size_t MinHeapSize = 8388608 {product} {ergonomic}
 uintx NonNMethodCodeHeapSize = 5839564 {pd product} {ergonomic}
 uintx NonProfiledCodeHeapSize = 122909338 {pd product} {ergonomic}
 uintx ProfiledCodeHeapSize = 122909338 {pd product} {ergonomic}
 size_t SoftMaxHeapSize = 8589934592 {manageable} {ergonomic}

Covering all these options would be far too much for this article. Luckily, only three of them are really relevant to our example: MinHeapSize, InitialHeapSize, and MaxHeapSize.

Heap size configuration

There’s an alternative way of obtaining the current heap size configuration at runtime that I’m going to use here. With -Xlog:gc+init set, the JVM will log some of its heap configuration parameters on startup.

❯ java '-Xlog:gc+init' \
    -XX:MinHeapSize=16m \
    -XX:InitialHeapSize=32m \
    -XX:MaxHeapSize=100m \
    -jar app.jar 2>&1 | grep Capacity
    
[0.003s][info][gc,init] Heap Min Capacity: 16M
[0.003s][info][gc,init] Heap Initial Capacity: 32M
[0.003s][info][gc,init] Heap Max Capacity: 104M

Notice that the reported Max Capacity (104M) is slightly larger than the requested 100M, most likely because the JVM rounds the heap (and its generations) up to its internal alignment. These configuration values map to specific options of the java command:

  • Min Capacity (-XX:MinHeapSize=size): the minimum size (in bytes) of the memory allocation pool
  • Initial Capacity (-XX:InitialHeapSize=size): the initial size (in bytes) of the memory allocation pool
  • Max Capacity (-XX:MaxHeapSize=size or -Xmx for short): the maximum size (in bytes) of the memory allocation pool

Hold on a second… Where is the famous -Xms flag? Although often confused with setting only the initial size, -Xms actually sets both the minimum and the initial size of the heap to the same value. Let’s have an example to illustrate that:

❯ java '-Xlog:gc+init' -Xms32m -Xmx100m \
    -jar app.jar 2>&1 | grep Capacity
    
[0.003s][info][gc,init] Heap Min Capacity: 32M
[0.003s][info][gc,init] Heap Initial Capacity: 32M
[0.003s][info][gc,init] Heap Max Capacity: 104M

Ok, but what if we don’t set these values explicitly? This is where JVM Ergonomics kicks in. According to the HotSpot Virtual Machine Garbage Collection Tuning Guide, the defaults are:

  • Initial heap size of 1/64 of physical memory
  • Maximum heap size of 1/4 of physical memory

Unfortunately, the Guide says nothing about the minimum heap size. The java command reference states that:

The default value is selected at run time based on the system configuration.

According to some tests I ran with Docker, the default minimum heap size is most likely 8M, no matter how much memory is available. Yet, I can’t promise you it’s always like this. There are many great things about JVM Ergonomics, but predictability is certainly not one of them…
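These defaults are easy to check empirically. For example, the ergonomic values picked for a container with 1024m of memory (using the same image as before) can be inspected like this:

❯ docker run --rm --memory=1024m amazoncorretto:17.0.5-al2 \
    java -XX:+PrintFlagsFinal -version 2>&1 | grep -E ' (Min|Initial|Max)HeapSize '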

Dynamic heap sizing

On startup, the JVM allocates a certain amount of memory (Initial Capacity) for the heap. During the application lifecycle, the Ergonomics process can decide to shrink or enlarge the heap based on the application’s observed needs. Nevertheless, the size of the heap must always fit between the min and max capacity, giving us this simple formula:

Min Capacity <= Initial Capacity <= Max Capacity

How does the JVM make such decisions? Once more, we can find some hints in the HotSpot Virtual Machine Garbage Collection Tuning Guide:

By default, the virtual machine grows or shrinks the heap at each collection to try to keep the proportion of free space to live objects at each collection within a specific range.

By default, the JVM aims to keep the free space in a generation between 40% and 70%. The respective configuration options are -XX:MinHeapFreeRatio and -XX:MaxHeapFreeRatio.

Sounds straightforward? Let me disappoint you. In the same Guide, we can read that the JVM can also try to “preferentially meet” one of two goals: maximum pause time (-XX:MaxGCPauseMillis) and throughput (understood as the percentage of time not spent on garbage collection, -XX:GCTimeRatio). When the preferred goal is not met, the JVM will try to meet the other. If that fails as well, the heap may be resized.
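For illustration, all the goals and ratios mentioned above can also be set explicitly (the values below are arbitrary examples, not recommendations):

❯ java -XX:MinHeapFreeRatio=40 -XX:MaxHeapFreeRatio=70 \
    -XX:MaxGCPauseMillis=200 -XX:GCTimeRatio=99 \
    -Xmx128m -jar app.jar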

To make it even less clear, the selected garbage collector may also affect this heap resizing strategy. According to the -XX:MaxGCPauseMillis option documentation:

The other generational collectors do not use a pause time goal by default.

According to the Guide, we may expect the heap size to be changing even in some fairly stable conditions:

It’s typical that the size of the heap oscillates as the garbage collector tries to satisfy competing goals. This is true even if the application has reached a steady state. The pressure to achieve a throughput goal (which may require a larger heap) competes with the goals for a maximum pause-time and a minimum footprint (which both may require a small heap).

The minimum heap size limits GC aggressiveness. Even if, according to Ergonomics, the heap should be shrunk further, it can’t fall below this value. Selecting the wrong value ourselves may prevent the JVM from meeting the goals described above. Leaving it at the default (most likely 8MB) allows the Ergonomics process to experiment.

Every sub-optimal JVM decision can slow your app down. If the selected value turns out to be too small, GC pressure may increase. If it’s too large, GC pause times may be longer than they need to be. Yet, for many of our apps, that’s probably good enough. It’s also definitely better than guessing. However, if you fight for every millisecond, you may want to limit JVM Ergonomics’ freedom.

Also, the “stable state” identified by the JVM can differ depending on the starting point. Increasing only the initial heap size for one of my real-life applications significantly reduced both the average GC time and the garbage collection frequency. Importantly, this comparison was performed under the same, controlled load.

Observing JVM Ergonomics in action can tell you a lot about your app. Picking heap configuration options based on the stable state identified by the JVM feels like a really good starting point. This way, we can take a “snapshot” of the current setup that can then be turned into configuration params like -Xms and -Xmx.
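One way of taking such a “snapshot” of a running JVM is the jcmd tool shipped with the JDK (assuming the app was started from app.jar, as in the examples above):

❯ jcmd $(pgrep -f app.jar) GC.heap_info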

Experiment revisited

Since we now know more about automated heap sizing, it’s fairly easy to explain the difference identified in our experiment. The only memory limit-sensitive default is the initial heap size, which differed across application runs. Let’s update the results table to better illustrate that.

Initial heap size   Container memory equivalent   Pause Young events   Pause Young total time   Pause Full events   Pause Full total time   Total GC time
8m                  512m                          103                  69.627 ms                2                   23.625 ms               93.252 ms
16m                 1024m                         68                   60.613 ms                1                   15.540 ms               76.153 ms
24m                 1536m                         49                   54.170 ms                1                   16.479 ms               70.649 ms
32m                 2048m                         38                   55.748 ms                1                   14.935 ms               70.683 ms
64m                 4096m                         18                   40.504 ms                1                   15.231 ms               55.735 ms

The smaller the initial heap size, the more GC pauses were observed. Theoretically, if all objects ever produced by the app fit within the initial heap (assuming a proper free-space buffer), we wouldn’t need a single GC pause on startup.

Interestingly, the first two runs ended up with a similar final heap usage (measured just before stopping the app), close to 22m. In the 8m/512m case, the heap was resized 3 times before getting there. In the 16m/1024m one, only a single resize was required. This explains the significant difference in GC time between the two. It also proves that dynamic resizing comes at a cost.

For bigger (busier) apps, I expect the startup differences to be even more significant. Since their init procedure is far more complicated, it also creates a lot more work for the GC. That’s why picking the right initial heap size may be so important.

On startup, the initial heap size seems to matter more than the minimum one. As the app usually generates lots of objects at that point, the likelihood of the heap shrinking is relatively low. The minimum heap size could have more of an impact later, once memory pressure drops.

Even with a multi-threaded GC and more CPU “cores” available, Full GC pauses can be painful. As they are the most expensive GC operations, we should limit their number as much as possible. According to the results, a too-small initial heap size can make Full GC pauses more frequent during application startup.
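As a quick sanity check, counting the Full GC events in the collected logs boils down to:

❯ grep -c 'Pause Full' logs-512m.txt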

The reason I chose to observe application startup rather than long-running operation was repeatability. The latter would depend heavily on the generated (artificial) load, which could be very different from the “real” one. Starting a Spring Boot application felt like a typical, real-life use case.

Yet, your mileage may vary depending on the application and the traffic characteristics. That’s why I encourage you to experiment (and measure!) on your own.

Summary

When you limit only the maximum heap size, both the minimum and the initial size will be picked by JVM Ergonomics. The initial heap size defaults to 1/64 of the available memory. Therefore, when running in a container, it’s probably better to set it explicitly.

According to my experiment, a too-small initial heap size may increase GC pressure or even affect application startup time. The more you care about overall latency and throughput, the more likely you are to need to step in. Defining heap size-related limits on your own could make a substantial impact here.

JVM Ergonomics is a true work of art, but it’s also quite unpredictable. The JVM will try its best to tune the settings at runtime, but this does not guarantee that its choices will be optimal. Even so, the path toward these choices may sometimes be unacceptable from a performance point of view. Yet, the values it arrives at can be a good starting point for more advanced tuning.

Side notes

  • The java command options mentioned before are not the only way of tuning the heap size. -XX:MinRAMPercentage, -XX:InitialRAMPercentage, and -XX:MaxRAMPercentage can be used as an alternative. Yet, they don’t always behave as we might expect. Some people advertise these flags as the better option, since they allow the heap to scale together with the container’s memory. However, in specific crisis situations, blindly increasing both could make things even worse. Call me old-fashioned, but my personal preference is to set the sizes explicitly (both approaches are sketched after this list).
  • Whichever maximum heap size flag you choose, remember: there’s no wrong way of limiting the max heap size. In the majority of cases I can think of, it’s a safety net worth having. Just pick one of them and understand how it works.
  • In certain scenarios, it might be worth tuning more complex aspects like the sizes of the heap’s Young and Old generations (when using a generational collector like G1). If your app produces significantly more short-lived objects than long-lived ones (or the other way around), you may want to try it. Yet, for the majority of apps, it might be just a bit too much.
  • Probably the most well-known way of improving auto-tuning predictability is to disable automated heap sizing:

Setting -Xms and -Xmx to the same value increases predictability by removing the most important sizing decision from the virtual machine. However, the virtual machine is then unable to compensate if you make a poor choice.
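A minimal sketch of both approaches mentioned in the side notes (the values are illustrative, not recommendations):

# fixed heap: minimum = initial = maximum
❯ java -Xms256m -Xmx256m -jar app.jar

# percentage-based alternative, scaling with the available (container) memory
❯ java -XX:InitialRAMPercentage=50 -XX:MaxRAMPercentage=50 -jar app.jar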

Mike Kowalski

Software engineer believing in craftsmanship and the power of fresh espresso. Writing in & about Java, distributed systems, and beyond. Mikes his own opinions and bytes.