
Under The Hood With the JVM's Automatic Resource Management


The deprecation of Object::finalize is an unusual step for the Java ecosystem. We dive deep into the HotSpot JVM to see how it works. We also compare it to RAII and the Java 7 try-with-resources syntax. The article contrasts these very…
Material in this article has been adapted with permission from the forthcoming book “Optimizing Java” by Ben Evans and James Gough. The book is published by O’Reilly and is available now in Early Release from O’Reilly and from Amazon.
InfoQ recently reported on the proposed deprecation of the method finalize() on the Object type. This method has been present since Java 1.0, but is widely regarded as a misfeature and a significant piece of legacy cruft in the platform. Nevertheless, the deprecation of a method present on Java’s Object type would be a highly unusual step.
The finalize() mechanism is an attempt to provide automatic resource management, in a similar way to the RAII (Resource Acquisition Is Initialisation) pattern from C++ and similar languages. In that pattern, a destructor method (known as finalize() in Java) is provided to enable automatic cleanup and release of resources when the object is destroyed.
The basic use case for this is fairly simple – when an object is created, it takes ownership of some resource, and the object’s ownership of that resource persists for the lifetime of the object. Then, when the object dies, the ownership of the resource is automatically relinquished.
Let’s look at a quick, simple C++ example that shows how to put an RAII wrapper around C-style file I/O. The core of this technique is that the object destructor method (denoted with a ~ at the start of a method named the same as the class) is used for cleanup:
The standard rationale for this approach is the observation that when the programmer opens a file handle, it is all too easy to forget to call the close() function when the handle is no longer required, and so tying the resource ownership to the object lifetime makes sense. Getting rid of the object’s resources automatically then becomes the responsibility of the platform, not the programmer.
This promotes good design, especially when the only reason for a type to exist is to act as a “holder” of a resource such as a file or network socket.
In the Java world, this is implemented by using the JVM’s garbage collector as the subsystem that can definitively say that an object has died. If a type overrides finalize(), then all objects of that type receive special treatment from the garbage collector.
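As a rough illustration of what such a type looks like, here is a minimal sketch of a class that overrides finalize(). The class name FinalizableHandle and its methods are hypothetical, chosen only for this example, and the boolean flag stands in for a real resource such as a file handle.

```java
// Hypothetical resource holder; the names are illustrative, not from the JDK.
class FinalizableHandle {
    // Stands in for a real resource such as a file descriptor.
    private boolean open = true;

    boolean isOpen() {
        return open;
    }

    // Explicit, idempotent cleanup that callers can invoke directly.
    void release() {
        open = false;
    }

    // Overriding finalize() is what opts instances into finalization.
    // The method is deprecated in recent Java versions, hence the suppression.
    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() throws Throwable {
        try {
            release(); // last-chance cleanup, run at some unknown later time, if at all
        } finally {
            super.finalize();
        }
    }
}
```

Nothing in this sketch controls when, or whether, finalize() runs; only the explicit release() call has a predictable effect.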
One detail of HotSpot that we need to be aware of is that the VM has some special, implementation-specific bytecodes in addition to the standard Java instructions. These specialist bytecodes are used to rewrite the standard ones in order to cope with certain special circumstances.
A complete list of the bytecode definitions, both standard Java and HotSpot special-case, can be found in the HotSpot source code.
For our purposes, we care about the special-case return_register_finalizer instruction. This is needed because it is possible for JVMTI to rewrite the bytecode for Object.<init>. To precisely obey the standard, and register the finalizer at the correct time, it is necessary to identify the point at which Object.<init> completes without the rewriting, and the special-case bytecode is used to mark this point.
The code for actually registering the object as needing finalization can be seen in the HotSpot interpreter. The file hotspot/src/cpu/x86/vm/c1_Runtime1_x86.cpp contains the core of the x86-specific port of the HotSpot interpreter. This has to be processor-specific because HotSpot makes heavy use of low-level assembly / machine code. The case register_finalizer_id contains the registration code.
Once the object has been registered as needing finalization, then instead of being immediately reclaimed during the garbage collection (GC) cycle, the object undergoes the following extended lifecycle:
Overall, this means that all objects to be finalized must first be recognized as unreachable via a GC mark, then finalized, and then GC must run again in order for the data to be collected. Finalizable objects therefore persist for at least one extra GC cycle. In the case of objects that have become tenured, this can be a significant amount of time.
The mechanism has some extra complexity, more than we would like, as the queue-draining threads have to start secondary finalization threads that actually run the finalize() method. This is necessary to guard against the possibility that finalize() will block.
If finalize() were run on the queue-draining threads, then a badly written finalize() could prevent the entire mechanism from working. To prevent this, we are forced to create a brand new thread for each object instance that requires finalization.
Not only that, but finalization threads must also ignore any exceptions that are thrown. This seems strange at first, but the finalization thread has no real way to handle the exception, and the original context that created the finalizable object is long gone. There is no meaningful way to supply user code that could be aware of, or recover from, the exception.
To clarify this, recall that an exception in Java provides a way to unwind the stack to find a method within the current execution thread that can recover from a non-fatal error. Seen in this light, the restriction that finalization ignores exceptions is more understandable: the finalize() call happens on a totally different thread than the one that created or used the object.
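The effect of ignoring exceptions can be imitated with an ordinary queue of cleanup actions. The sketch below is a simulation only: CleanupQueue and its methods are hypothetical names, it runs on the calling thread, and it is not how the JVM implements finalization. It shows a draining loop that discards whatever a cleanup action throws, because no caller further up the stack could handle it.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical stand-in for a finalization queue; not the JVM's real mechanism.
class CleanupQueue {
    private final Queue<Runnable> pending = new ArrayDeque<>();
    private int completed = 0;
    private int failed = 0;

    void enqueue(Runnable cleanup) {
        pending.add(cleanup);
    }

    // Drains the queue, running each cleanup action in turn. Anything an action
    // throws is caught and discarded; the failure count exists only so that
    // this demo can observe what happened.
    void drain() {
        Runnable next;
        while ((next = pending.poll()) != null) {
            try {
                next.run();
                completed++;
            } catch (Throwable t) {
                failed++; // nobody up the stack could recover, so carry on
            }
        }
    }

    int completed() {
        return completed;
    }

    int failed() {
        return failed;
    }
}
```

With this sketch, an action that throws does not stop later actions from running, and the code that originally created the failing object never sees the exception.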
The majority of the finalization implementation is actually written in Java. The JVM has separate threads to perform finalization that run at the same time as application threads for the majority of the required work. The core functionality is contained in the class java.lang.ref.Finalizer, a package-private class that is fairly simple to read.
The Finalizer class also provides some insight into how the runtime grants additional privileges to certain classes. For example, it contains code like this:
Of course, in regular application code, this code would be nonsensical, as it creates an unused object. Unless the constructor has side-effects (usually considered a bad design decision in Java), this would do nothing. In this case, the intent is to “hook” a new finalizable object.
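The idiom of creating an object purely for its constructor’s side effect can be imitated in ordinary Java. The following is a sketch of the general pattern, not the JDK’s code: the Hook class, its registry and its method names are all hypothetical, and the sketch makes no attempt at thread safety.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical analogue of the "hook" idiom; these are not the JDK's classes.
class Hook {
    // Registry that keeps every Hook reachable after construction.
    private static final List<Hook> REGISTRY = new ArrayList<>();

    private final Object target;

    private Hook(Object target) {
        this.target = target;
        REGISTRY.add(this); // side effect: the constructor links the new object in
    }

    // The result of `new` is deliberately discarded; the constructor's
    // side effect is the whole point of the call.
    static void register(Object target) {
        new Hook(target);
    }

    static int registeredCount() {
        return REGISTRY.size();
    }

    // Identity comparison: we care about this exact object, not an equal one.
    static boolean isRegistered(Object target) {
        for (Hook hook : REGISTRY) {
            if (hook.target == target) {
                return true;
            }
        }
        return false;
    }
}
```

After a call such as Hook.register(someObject), the otherwise unused Hook instance stays reachable through the registry, which is the sense in which the object has been hooked.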
The implementation of finalization also relies heavily on the FinalReference class. This is a subclass of java.lang.ref.Reference, a class that the runtime and VM handle specially. Like the more well-known soft and weak references, FinalReference objects get special treatment by the GC subsystem, comprising a mechanism that provides an interesting interaction between the VM and Java code (both platform and user).
For all its technical interest, the Java finalization implementation is fatally flawed, due to a mismatch with the memory management scheme of the platform.
In the C++ case, memory is handled manually, with explicit lifetime management of objects under the explicit control of the programmer. This means that destruction can happen as the object is deleted, and so the acquisition and release of resources is directly tied to the lifetime of the object.
Java’s memory management subsystem is a garbage collector that runs as needed, in response to running out of available memory to allocate. It therefore runs at non-deterministic intervals (if at all), and so the finalize() method is run only when the object is collected, at some unknown time.
If the finalize() mechanism is used to automatically release resources (such as file handles), then there is no guarantee as to when (if ever) those resources will actually become available. This makes the finalize() mechanism fundamentally unsuitable for its stated purpose: automatic resource management.
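For contrast, Java 7’s try-with-resources syntax ties the release of a resource to a lexical scope rather than to garbage collection. The sketch below uses hypothetical names (ManagedHandle, TryWithResourcesDemo), and a boolean flag again stands in for a real resource.

```java
// Hypothetical resource holder; the names are illustrative only.
class ManagedHandle implements AutoCloseable {
    // Stands in for a real resource such as a file descriptor.
    private boolean open = true;

    boolean isOpen() {
        return open;
    }

    // Declared without a throws clause, so callers need no catch block.
    @Override
    public void close() {
        open = false;
    }
}

class TryWithResourcesDemo {
    // Returns true if the handle was closed by the time the try block exited.
    static boolean closedAfterBlock() {
        ManagedHandle handle = new ManagedHandle();
        try (ManagedHandle h = handle) {
            // ... work with the resource while it is open ...
            h.isOpen();
        }
        // close() has already run here, regardless of any GC activity.
        return !handle.isOpen();
    }
}
```

Here close() runs as soon as control leaves the try block, whether normally or via an exception, so the point of release is known from the source code alone.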
