Automatic Memory Fit for Java Apps in Azure Container Apps
Introduction
Containerization has become the prevailing choice for hosting and managing Java applications, due to its unmatched portability, consistency, and scalability benefits. However, by default, when running a Java app inside containers, the JVM assumes that the OS memory is shared among multiple applications, so it conservatively claims memory for the Java application.
With Azure Container Apps, we can optimize memory allocation by dedicating more memory to the Java app, since a container typically runs a single Java application. This feature, called automatic memory fit, is enabled by default when you deploy from Java source code or an artifact.
In this blog we’ll cover the following:
Highlight Java memory management terms and concepts
Discuss how automatic memory fit works
Provide a benchmark comparison from before and after with the Java app memory fit
Java memory regions and JVM default settings
Heap memory
The Java heap is the working memory for dynamically allocating and managing objects while an application runs. By default, when running an app in a container, the JVM reserves only a small portion of the available memory for the heap (in most configurations, a maximum of one quarter). This results in both low resource utilization and low application performance, because the JVM is busy with frequent garbage collection events.
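To see how much heap a JVM has actually claimed, a minimal sketch like the following can be run inside the container. It uses only the standard `Runtime` API; the class name `HeapReport` and the helper `toMiB` are ours, chosen for illustration.

```java
// Prints the maximum heap the running JVM is willing to use, plus current usage.
class HeapReport {
    // Converts a byte count to whole mebibytes (integer division, rounds down).
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reflects -Xmx if set, otherwise the JVM's own default heap sizing.
        System.out.println("Max heap:    " + toMiB(rt.maxMemory()) + " MiB");
        // totalMemory() - freeMemory() approximates the heap currently in use.
        System.out.println("Heap in use: " + toMiB(rt.totalMemory() - rt.freeMemory()) + " MiB");
    }
}
```

Comparing the "Max heap" value against the container's memory limit gives a quick impression of how conservatively the heap was sized.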
Non-heap memory
Non-heap memory refers to memory that the JVM allocates for purposes other than running user apps. It consists of the metaspace, code cache, direct memory, and thread stacks.
Metaspace
Metaspace is a non-heap memory region for class metadata, constant pool information, and method bytecode. Unlike heap memory, it has no limit by default and can grow dynamically, which means if you don’t reserve enough memory for metaspace, it competes with other regions while running in an environment that has limited resources.
Code cache
As the Java application runs, the code (or more precisely, the bytecode) is interpreted or compiled to native code based on the compiler’s optimization level. A CPU can directly execute the cached native code, so it doesn’t need to interpret or compile again. By default, this space has a soft limit of 240 MB.
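The metaspace and code cache can be inspected from inside a running app through the standard `java.lang.management` API. The sketch below lists every non-heap memory pool the JVM reports. The class name `NonHeapPools` and the helper `describeMax` are illustrative, and the pool names printed depend on the JVM vendor and version.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;

// Lists the JVM's non-heap memory pools with their current usage and limits.
class NonHeapPools {
    // Formats a pool limit in MiB; the JVM reports -1 when no maximum is defined.
    static String describeMax(long maxBytes) {
        return maxBytes < 0 ? "undefined" : (maxBytes / (1024L * 1024L)) + " MiB";
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.NON_HEAP) {
                long usedKiB = pool.getUsage().getUsed() / 1024L;
                long max = pool.getUsage().getMax();
                System.out.println(pool.getName() + ": used " + usedKiB
                        + " KiB, max " + describeMax(max));
            }
        }
    }
}
```

A pool whose maximum prints as "undefined" has no explicit cap, which is the situation described above for metaspace when no limit is set.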
Thread stack
Stack memory is allocated during thread execution for short-lived values and references to objects in the heap. For each thread, this space is 1 MB in size by default, and it is released after the thread ends.
Direct memory
Direct buffer memory is allocated outside the Java heap. It represents the operating system's native memory used by the JVM process. Direct buffers are used by Java I/O frameworks, such as Netty, for processing data from network or disk I/O. Without an explicit -XX:MaxDirectMemorySize, the JVM caps direct buffers at roughly the maximum heap size; automatic memory fit sets this limit to 10 MB by default.
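To make direct memory concrete, the following sketch allocates a direct buffer and reads the JVM's buffer pool statistics before and after. It relies on the standard `BufferPoolMXBean`; the pool name "direct" is what HotSpot-based JVMs report, which we treat as an assumption here. The class and method names are illustrative.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Shows that direct buffers consume memory outside the Java heap.
class DirectBufferDemo {
    // Returns the bytes held by the "direct" buffer pool, or -1 if no such pool is reported.
    static long directBytesUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long before = directBytesUsed();
        // Allocates 4 MiB of native memory; this counts against the direct memory limit.
        ByteBuffer buffer = ByteBuffer.allocateDirect(4 * 1024 * 1024);
        long after = directBytesUsed();
        System.out.println("Buffer capacity: " + buffer.capacity() + " bytes");
        System.out.println("Direct pool grew by " + (after - before) + " bytes");
    }
}
```

If an allocation would exceed the direct memory limit, `allocateDirect` fails with an `OutOfMemoryError`, so apps that rely heavily on direct buffers may need a larger limit than the default.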
Automatic memory fitting
Automatic memory fitting tries to make the best use of the container memory available for Java applications and enhances their performance by minimizing garbage collection. Here are the principles we apply to accomplish these objectives:
Within the container, reserve as much memory as possible for the JVM.
Within the JVM, reserve enough memory for non-heap regions, and give the rest to the heap.
We use the following methods to determine the amount of memory to allocate for the non-heap memory:
Memory Region | JVM Flag | Value
Metaspace | -XX:MaxMetaspaceSize | (JVM class count + application class count + agent class count + adjustment count (0 by default)) * class load factor * average class file size + overhead
CodeCache | -XX:ReservedCodeCacheSize | 240M
Direct Memory | -XX:MaxDirectMemorySize | 10M
Stack | -Xss | 1M * Thread Count (250 by default)
Because non-heap memory = Metaspace + CodeCache + Direct Memory + Stack size x Thread Count, this leaves the remaining memory for user apps (heap memory).
Heap memory = container memory limit – non-heap memory – headroom
You can change the size of the headroom, which is 0 by default, to leave more memory for the system when needed. Set the BPL_JVM_HEAD_ROOM environment variable to define the headroom as a percentage of total memory.
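The arithmetic above can be sketched in a few lines of Java. This is an illustration of the formulas in this post, not the platform's actual implementation. The class and method names are ours, and the metaspace inputs used in `main` (class count, load factor, average class size, overhead) are hypothetical placeholders rather than the values the platform uses.

```java
// Illustrative calculation of heap size from the formulas in this post.
class MemoryFitSketch {
    static final long MIB = 1024L * 1024L;

    // Metaspace estimate: total class count * load factor * average class size + overhead.
    // totalClassCount stands for JVM + application + agent + adjustment class counts.
    static long metaspaceBytes(long totalClassCount, double loadFactor,
                               long avgClassBytes, long overheadBytes) {
        return (long) (totalClassCount * loadFactor * avgClassBytes) + overheadBytes;
    }

    // Non-heap = metaspace + code cache + direct memory + stack size x thread count.
    static long nonHeapBytes(long metaspace, long codeCache, long direct,
                             long stackPerThread, int threadCount) {
        return metaspace + codeCache + direct + stackPerThread * threadCount;
    }

    // Heap = container limit - non-heap - headroom (headroom as a percentage of the limit).
    static long heapBytes(long containerLimit, long nonHeap, int headroomPercent) {
        long headroom = containerLimit * headroomPercent / 100;
        return containerLimit - nonHeap - headroom;
    }

    public static void main(String[] args) {
        // Hypothetical metaspace inputs, for illustration only.
        long metaspace = metaspaceBytes(20_000, 0.35, 5_800, 14_000_000);
        // Defaults from the table: 240M code cache, 10M direct memory, 1M stack, 250 threads.
        long nonHeap = nonHeapBytes(metaspace, 240 * MIB, 10 * MIB, MIB, 250);
        long heap = heapBytes(2048 * MIB, nonHeap, 0);
        System.out.println("Non-heap: " + nonHeap / MIB + " MiB");
        System.out.println("Heap:     " + heap / MIB + " MiB");
    }
}
```

For example, with a 2048 MiB container, a non-heap total of 600 MiB, and no headroom, the heap comes out at 1448 MiB; a headroom of 25 percent would reduce it to 936 MiB.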
To observe how automatic memory fit works, open the log stream from Azure Container Apps and look at the application startup section.
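As a complement to the log stream, an app can also print the memory-related flags its own JVM received at startup. The sketch below uses the standard `RuntimeMXBean.getInputArguments()` call and filters for the flags from the table, plus `-Xmx`. Whether the heap limit appears as `-Xmx` in a given deployment is an assumption on our part, and the class and method names are illustrative.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Prints the memory-related JVM flags that the current process was started with.
class StartupFlags {
    // Returns true for the heap and non-heap sizing flags discussed in this post.
    static boolean isMemoryFlag(String arg) {
        return arg.startsWith("-Xmx")
                || arg.startsWith("-Xss")
                || arg.startsWith("-XX:MaxMetaspaceSize")
                || arg.startsWith("-XX:ReservedCodeCacheSize")
                || arg.startsWith("-XX:MaxDirectMemorySize");
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        jvmArgs.stream()
                .filter(StartupFlags::isMemoryFlag)
                .forEach(System.out::println);
    }
}
```

An empty output means none of these flags were passed explicitly, in which case the JVM is running with its built-in defaults.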
Benchmark comparison
To check the results of memory fitting, we performed load tests on the Spring petclinic project. We modified petclinic so that it returns random values and adds them to the petclinic data. This lets us create unique values and trigger garbage collection.
The baseline for comparison is a container app with 1 CPU and 2 GB of memory. Its image was built with Maven and a handwritten Dockerfile, without any JVM memory options specified:
FROM mcr.microsoft.com/openjdk/jdk:17-ubuntu
WORKDIR /app
COPY target/spring-petclinic-*.jar spring-petclinic.jar
ENTRYPOINT ["java", "-jar", "/app/spring-petclinic.jar"]
The test subject is an app that also has 1 CPU and 2 GB of memory. We deployed it to Azure Container Apps from a JAR file, without any JVM memory options specified.
Results
In short, with automatic memory fit we saw a 2.4x lower GC count, 18% faster response time, and 30% higher throughput than without it. Here are some detailed performance metrics.
Working set memory
GC count
Response time and throughput
To enable or disable the feature, or to change its runtime settings, refer to automatic memory fitting in the documentation.
To report a problem, ask a question, or share your feedback, please open an issue on the Azure Container Apps GitHub repository.