
Did You Know the Fastest Way of Serializing a Java Field Is Not Serializing It at All?


Learn how to apply C++’s trivially copyable scheme in Java and get blazing serialization speed by using Unsafe and memcpy to copy the fields directly, in one single sweep, to memory or to a memory-mapped file.

In a previous article about open-source Chronicle Queue, benchmarking and method profiling indicated that the speed of serialization had a significant impact on execution performance. This is only to be expected, as Chronicle Queue (and other persisted queue libraries) must convert Java objects located on the heap to binary data which is subsequently stored in files. Even for the most internally efficient libraries, this inevitable serialization procedure will largely dictate performance.

In this article, we will use a Data Transfer Object (hereafter DTO) named MarketData, which contains financial information with a relatively large number of fields. The same principles apply to DTOs in any other business area.

Java’s Serializable marker interface provides a default way to serialize Java objects to and from a binary format, usually via the ObjectOutputStream and ObjectInputStream classes. The default way (whereby the magic writeObject() and readObject() methods are not explicitly declared) entails reflecting over an object’s non-transient fields and reading or writing them one by one, which can be a relatively costly operation.

Chronicle Queue can work with Serializable objects but also provides a similar, yet faster and more space-efficient, way to serialize data via the abstract class SelfDescribingMarshallable. Akin to Serializable objects, this relies on reflection but comes with substantially less overhead in terms of payload, CPU cycles, and garbage.

Default serialization often comprises several steps, starting with the identification of the object’s non-transient fields via reflection. The result of this identification can be cached, eliminating that step on subsequent calls to improve performance.
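To make the default mechanism concrete, here is a minimal sketch of plain JDK default serialization. It does not use Chronicle Queue, and the SimpleMarketData class with its field names is a hypothetical, cut-down stand-in for the MarketData DTO, not the class from the article. Because the class declares neither writeObject() nor readObject(), ObjectOutputStream reflects over its non-transient fields.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class DefaultSerializationDemo {

    // Hypothetical, cut-down DTO; the field names are illustrative assumptions.
    static final class SimpleMarketData implements Serializable {
        private static final long serialVersionUID = 1L;

        long securityId;
        double bidPrice;
        double askPrice;
        // Transient fields are skipped by default serialization.
        transient String cachedLabel;
    }

    // The DTO declares no writeObject()/readObject(), so the stream reflects
    // over the non-transient fields and writes them one by one.
    static byte[] serialize(Serializable object) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SimpleMarketData original = new SimpleMarketData();
        original.securityId = 42L;
        original.bidPrice = 100.25;
        original.askPrice = 100.5;
        original.cachedLabel = "not persisted";

        byte[] data = serialize(original);
        SimpleMarketData copy = (SimpleMarketData) deserialize(data);

        System.out.println("serialized size: " + data.length + " bytes");
        System.out.println("bid after round trip: " + copy.bidPrice);
        System.out.println("transient after round trip: " + copy.cachedLabel);
    }
}
```

The transient field comes back as null after the round trip, which shows that only non-transient fields take part in default serialization.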
A class using default serialization only needs to extend SelfDescribingMarshallable and declare its fields. Such a class adds no serialization logic over its base class, so it uses default serialization as transitively provided by SelfDescribingMarshallable.

Classes implementing Serializable can elect to implement two magic private (sic!) methods, writeObject() and readObject(). When present, these methods are invoked instead of default serialization. This provides full control of the serialization process and allows fields to be read and written using custom code rather than via reflection, which improves performance.
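The two magic methods can be sketched with the plain JDK as follows. The Quote class and its fields are hypothetical and chosen only for illustration. In this sketch every field is marked transient, so the default mechanism has no field data to handle, and the values are written and read explicitly, in a fixed order, by the private writeObject() and readObject() methods.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class CustomSerializationDemo {

    // Hypothetical DTO; names are illustrative assumptions.
    static final class Quote implements Serializable {
        private static final long serialVersionUID = 1L;

        // transient: excluded from the reflective default handling.
        transient long securityId;
        transient double bidPrice;
        transient double askPrice;

        Quote(long securityId, double bidPrice, double askPrice) {
            this.securityId = securityId;
            this.bidPrice = bidPrice;
            this.askPrice = askPrice;
        }

        // Invoked by ObjectOutputStream even though it is private.
        private void writeObject(ObjectOutputStream out) throws IOException {
            // Called once before custom data, as the serialization spec asks;
            // it writes no field values here because every field is transient.
            out.defaultWriteObject();
            out.writeLong(securityId);
            out.writeDouble(bidPrice);
            out.writeDouble(askPrice);
        }

        // Must read the values in exactly the order they were written.
        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            securityId = in.readLong();
            bidPrice = in.readDouble();
            askPrice = in.readDouble();
        }
    }

    // Serializes the quote to a byte array and reads it back.
    static Quote roundTrip(Quote quote) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(quote);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Quote) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Quote copy = roundTrip(new Quote(42L, 100.25, 100.5));
        System.out.println(copy.securityId + " " + copy.bidPrice + " " + copy.askPrice);
    }
}
```

The trade-off of this style is that the write and read code must be kept in sync by hand whenever a field is added or removed.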
