Byron Kiourtzoglou

About Byron Kiourtzoglou

Byron is a master software engineer working in the IT and Telecom domains. He is always fascinated by SOA, middleware services and mobile development. Byron is co-founder and Executive Editor at Java Code Geeks.

Java Best Practices – High performance Serialization

Continuing our series of articles concerning proposed practices while working with the Java programming language, we are going to discuss and demonstrate how to utilize Object Serialization for high performance applications.

All discussed topics are based on use cases derived from the development of mission critical, ultra high performance production systems for the telecommunication industry.

Prior to reading each section of this article, it is highly recommended that you consult the relevant Java API documentation for detailed information and code samples.

All tests are performed against a Sony Vaio with the following characteristics :

  • System : openSUSE 11.1 (x86_64)
  • Processor (CPU) : Intel(R) Core(TM)2 Duo CPU T6670 @ 2.20GHz
  • Processor Speed : 1,200.00 MHz
  • Total memory (RAM) : 2.8 GB
  • Java : OpenJDK 1.6.0_0 64-Bit

The following test configuration is applied :

  • Concurrent worker Threads : 200
  • Test repeats per worker Thread : 1000
  • Overall test runs : 100

High performance Serialization

Serialization is the process of converting an object into a stream of bytes. That stream can then be sent through a socket, stored to a file and/or database or simply manipulated as is. With this article we do not intend to present an in-depth description of the serialization mechanism; there are numerous articles out there that provide this kind of information. What will be discussed here is our proposition for utilizing serialization in order to achieve high performance results.
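As a minimal sketch of the definition above, the following converts an object into a byte array and back using the standard object streams. The class and method names (RoundTripDemo, toBytes, fromBytes) are made up for illustration, and the code assumes Java 8 or later for try-with-resources and UncheckedIOException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;

class RoundTripDemo {

    // Converts a Serializable object into its byte-stream form.
    static byte[] toBytes(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The object stream is closed (and flushed) before the bytes are copied.
        return buffer.toByteArray();
    }

    // Rebuilds an object from bytes produced by toBytes.
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> original = new ArrayList<>();
        original.add("alpha");
        original.add("beta");
        byte[] bytes = toBytes(original); // could now go to a socket, a file or a database
        Object copy = fromBytes(bytes);
        System.out.println(original.equals(copy));
    }
}
```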

The three main performance problems with serialization are :

  • Serialization is a recursive algorithm. Starting from a single object, all the objects that can be reached from that object by following instance variables, are also serialized. The default behavior can easily lead to unnecessary Serialization overheads
  • Both serializing and deserializing require the serialization mechanism to discover information about the instance it is serializing. The default serialization mechanism uses reflection to discover all the field values. Furthermore, if you don’t explicitly set a “serialVersionUID” class attribute, the serialization mechanism has to compute it. This involves going through all the fields and methods to generate a hash. The aforementioned procedure can be quite slow
  • Using the default serialization mechanism, all the serializing class description information is included in the stream, such as :
    • The description of all the serializable superclasses
    • The description of the class itself
    • The instance data associated with the specific instance of the class
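The first two points in the list above can be illustrated with a small sketch. It assumes a made-up Node class (not part of the original test code) that forms a linked chain: serializing the head pulls in every reachable node, while a field marked transient is skipped, and the explicitly declared serialVersionUID spares the runtime from computing one. Java 8 or later is assumed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class GraphDemo {

    static class Node implements Serializable {
        // Declared explicitly so the runtime does not have to compute it.
        private static final long serialVersionUID = 1L;

        final String name;
        Node next;               // reachable, so it is serialized recursively
        transient Node shortcut; // transient, so default serialization skips it

        Node(String name) {
            this.name = name;
        }
    }

    static byte[] serialize(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Node single = new Node("a");
        Node head = new Node("a");
        head.next = new Node("b");
        head.next.next = new Node("c");
        head.shortcut = head.next.next;

        // The chain produces a larger stream than a single node with the same name.
        System.out.println(serialize(single).length + " vs " + serialize(head).length);

        Node copy = (Node) deserialize(serialize(head));
        System.out.println(copy.next.next.name + ", shortcut restored: " + (copy.shortcut != null));
    }
}
```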

To solve the aforementioned performance problems you can use Externalization instead. The major difference between these two methods is that Serialization writes out class descriptions of all the serializable superclasses along with the information associated with the instance when viewed as an instance of each individual superclass. Externalization, on the other hand, writes out the identity of the class (the name of the class and the appropriate “serialVersionUID” class attribute) along with the superclass structure and all the information about the class hierarchy. In other words, it stores all the metadata, but writes out only the local instance information. In short, Externalization eliminates almost all the reflective calls used by the serialization mechanism and gives you complete control over the marshalling and demarshalling algorithms, resulting in dramatic performance improvements.
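A minimal sketch of the Externalizable contract, assuming a made-up Point class: the class writes and reads its own fields in a fixed order and must offer a public no-arg constructor, which the runtime uses to create the instance before calling readExternal. The names below are illustrative only, and Java 8 or later is assumed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

class ExternalizableDemo {

    public static class Point implements Externalizable {
        private int x;
        private int y;

        // Externalizable requires a public no-arg constructor.
        public Point() {
        }

        public Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        public int getX() { return x; }
        public int getY() { return y; }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            // The class decides exactly what goes on the wire, and in which order.
            out.writeInt(x);
            out.writeInt(y);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            // Fields must be read back in the same order they were written.
            x = in.readInt();
            y = in.readInt();
        }
    }

    static Point roundTrip(Point point) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(point);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return (Point) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Point copy = roundTrip(new Point(3, 4));
        System.out.println(copy.getX() + "," + copy.getY());
    }
}
```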

Of course, Externalization efficiency comes at a price. The default serialization mechanism adapts to application changes due to the fact that metadata is automatically extracted from the class definitions. Externalization on the other hand isn’t very flexible and requires you to rewrite your marshalling and demarshalling code whenever you change your class definitions.
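One common way to soften this rigidity, which is not part of the original test code, is to write a format version tag first and branch on it when reading. The sketch below assumes a hypothetical Account class whose balanceCents field was added in a second format version; the helper encodeVersion1 only exists to simulate an old stream. Java 8 or later is assumed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

class VersionedDemo {

    public static class Account implements Externalizable {
        private static final byte CURRENT_VERSION = 2;

        private String owner;
        private long balanceCents; // hypothetical field added in format version 2

        public Account() {
        }

        public Account(String owner, long balanceCents) {
            this.owner = owner;
            this.balanceCents = balanceCents;
        }

        public String getOwner() { return owner; }
        public long getBalanceCents() { return balanceCents; }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeByte(CURRENT_VERSION); // the version tag always goes first
            out.writeUTF(owner);
            out.writeLong(balanceCents);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            byte version = in.readByte();
            owner = in.readUTF();
            // Version 1 streams end here; only newer streams carry the balance.
            balanceCents = (version >= 2) ? in.readLong() : 0L;
        }
    }

    // Simulates a stream written by the hypothetical version 1 layout: tag, then owner.
    static byte[] encodeVersion1(String owner) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeByte(1);
            out.writeUTF(owner);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    static byte[] encode(Account account) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            account.writeExternal(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    static Account decode(byte[] raw) {
        Account account = new Account();
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            account.readExternal(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return account;
    }

    public static void main(String[] args) {
        Account current = decode(encode(new Account("ann", 1250L)));
        Account legacy = decode(encodeVersion1("bob"));
        System.out.println(current.getOwner() + " " + current.getBalanceCents());
        System.out.println(legacy.getOwner() + " " + legacy.getBalanceCents());
    }
}
```

The marshalling code still has to be edited by hand on every change; the tag only keeps older streams readable.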

What follows is a short demonstration on how to utilize Externalization for high performance applications. We will start by providing the “Employee” object to perform serialization and deserialization operations. Two flavors of the “Employee” object will be used: one suitable for standard serialization operations and another that is modified so that it can be externalized.

Below is the first flavor of the “Employee” object :

package com.javacodegeeks.test;

import java.io.Serializable;
import java.util.Date;
import java.util.List;

public class Employee implements Serializable {

 private static final long serialVersionUID = 3657773293974543890L;
 
 private String firstName;
 private String lastName;
 private String socialSecurityNumber;
 private String department;
 private String position;
 private Date hireDate;
 private Double salary;
 private Employee supervisor;
 private List<String> phoneNumbers;
 
 public Employee() {
 }
 
 public Employee(String firstName, String lastName,
   String socialSecurityNumber, String department, String position,
   Date hireDate, Double salary) {
  this.firstName = firstName;
  this.lastName = lastName;
  this.socialSecurityNumber = socialSecurityNumber;
  this.department = department;
  this.position = position;
  this.hireDate = hireDate;
  this.salary = salary;
 }

 public String getFirstName() {
  return firstName;
 }

 public void setFirstName(String firstName) {
  this.firstName = firstName;
 }

 public String getLastName() {
  return lastName;
 }

 public void setLastName(String lastName) {
  this.lastName = lastName;
 }

 public String getSocialSecurityNumber() {
  return socialSecurityNumber;
 }

 public void setSocialSecurityNumber(String socialSecurityNumber) {
  this.socialSecurityNumber = socialSecurityNumber;
 }

 public String getDepartment() {
  return department;
 }

 public void setDepartment(String department) {
  this.department = department;
 }

 public String getPosition() {
  return position;
 }

 public void setPosition(String position) {
  this.position = position;
 }

 public Date getHireDate() {
  return hireDate;
 }

 public void setHireDate(Date hireDate) {
  this.hireDate = hireDate;
 }

 public Double getSalary() {
  return salary;
 }

 public void setSalary(Double salary) {
  this.salary = salary;
 }

 public Employee getSupervisor() {
  return supervisor;
 }

 public void setSupervisor(Employee supervisor) {
  this.supervisor = supervisor;
 }

 public List<String> getPhoneNumbers() {
  return phoneNumbers;
 }

 public void setPhoneNumbers(List<String> phoneNumbers) {
  this.phoneNumbers = phoneNumbers;
 }

}

Things to notice here :

  • We assume that the following fields are mandatory :
    • “firstName”
    • “lastName”
    • “socialSecurityNumber”
    • “department”
    • “position”
    • “hireDate”
    • “salary”

Following is the second flavor of the “Employee” object :

package com.javacodegeeks.test;

import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.util.Arrays;
import java.util.Date;
import java.util.List;

public class Employee implements Externalizable {

 private String firstName;
 private String lastName;
 private String socialSecurityNumber;
 private String department;
 private String position;
 private Date hireDate;
 private Double salary;
 private Employee supervisor;
 private List<String> phoneNumbers;
 
 public Employee() {
 }
 
 public Employee(String firstName, String lastName,
   String socialSecurityNumber, String department, String position,
   Date hireDate, Double salary) {
  this.firstName = firstName;
  this.lastName = lastName;
  this.socialSecurityNumber = socialSecurityNumber;
  this.department = department;
  this.position = position;
  this.hireDate = hireDate;
  this.salary = salary;
 }

 public String getFirstName() {
  return firstName;
 }

 public void setFirstName(String firstName) {
  this.firstName = firstName;
 }

 public String getLastName() {
  return lastName;
 }

 public void setLastName(String lastName) {
  this.lastName = lastName;
 }

 public String getSocialSecurityNumber() {
  return socialSecurityNumber;
 }

 public void setSocialSecurityNumber(String socialSecurityNumber) {
  this.socialSecurityNumber = socialSecurityNumber;
 }

 public String getDepartment() {
  return department;
 }

 public void setDepartment(String department) {
  this.department = department;
 }

 public String getPosition() {
  return position;
 }

 public void setPosition(String position) {
  this.position = position;
 }

 public Date getHireDate() {
  return hireDate;
 }

 public void setHireDate(Date hireDate) {
  this.hireDate = hireDate;
 }

 public Double getSalary() {
  return salary;
 }

 public void setSalary(Double salary) {
  this.salary = salary;
 }

 public Employee getSupervisor() {
  return supervisor;
 }

 public void setSupervisor(Employee supervisor) {
  this.supervisor = supervisor;
 }

 public List<String> getPhoneNumbers() {
  return phoneNumbers;
 }

 public void setPhoneNumbers(List<String> phoneNumbers) {
  this.phoneNumbers = phoneNumbers;
 }

 public void readExternal(ObjectInput objectInput) throws IOException,
   ClassNotFoundException {
  
  this.firstName = objectInput.readUTF();
  this.lastName = objectInput.readUTF();
  this.socialSecurityNumber = objectInput.readUTF();
  this.department = objectInput.readUTF();
  this.position = objectInput.readUTF();
  this.hireDate = new Date(objectInput.readLong());
  this.salary = objectInput.readDouble();
  
  int attributeCount = objectInput.read();

  byte[] attributes = new byte[attributeCount];

  objectInput.readFully(attributes);
  
  for (int i = 0; i < attributeCount; i++) {
   byte attribute = attributes[i];

   switch (attribute) {
   case (byte) 0:
    this.supervisor = (Employee) objectInput.readObject();
    break;
   case (byte) 1:
    this.phoneNumbers = Arrays.asList(objectInput.readUTF().split(";"));
    break;
   }
  }
  
 }

 public void writeExternal(ObjectOutput objectOutput) throws IOException {
  
  objectOutput.writeUTF(firstName);
  objectOutput.writeUTF(lastName);
  objectOutput.writeUTF(socialSecurityNumber);
  objectOutput.writeUTF(department);
  objectOutput.writeUTF(position);
  objectOutput.writeLong(hireDate.getTime());
  objectOutput.writeDouble(salary);
  
  byte[] attributeFlags = new byte[2];
  
  int attributeCount = 0;
  
  if (supervisor != null) {
   attributeFlags[0] = (byte) 1;
   attributeCount++;
  }
  if (phoneNumbers != null && !phoneNumbers.isEmpty()) {
   attributeFlags[1] = (byte) 1;
   attributeCount++;
  }
  
  objectOutput.write(attributeCount);
  
  byte[] attributes = new byte[attributeCount];

  int j = attributeCount;

  for (int i = 0; i < 2; i++)
   if (attributeFlags[i] == (byte) 1) {
    j--;
    attributes[j] = (byte) i;
   }

  objectOutput.write(attributes);
  
  for (int i = 0; i < attributeCount; i++) {
   byte attribute = attributes[i];

   switch (attribute) {
   case (byte) 0:
    objectOutput.writeObject(supervisor);
    break;
   case (byte) 1:
    StringBuilder rowPhoneNumbers = new StringBuilder();
    for(int k = 0; k < phoneNumbers.size(); k++)
     rowPhoneNumbers.append(phoneNumbers.get(k) + ";");
    rowPhoneNumbers.deleteCharAt(rowPhoneNumbers.lastIndexOf(";"));
    objectOutput.writeUTF(rowPhoneNumbers.toString());
    break;
   }
  }
  
 }
}

Things to notice here :

  • We implement the “writeExternal” method for marshalling the “Employee” object. All mandatory fields are written to the stream
  • For the “hireDate” field we write only the number of milliseconds represented by this Date object. That value counts milliseconds since the epoch (January 1, 1970, 00:00:00 GMT) and does not depend on the timezone of either side, so it is all the information we need to properly deserialize the “hireDate” field. Keep in mind that we could serialize the entire “hireDate” object by using the “objectOutput.writeObject(hireDate)” operation. In that case the default serialization mechanism would kick in, resulting in speed degradation and size increment for the resulting stream
  • All the non mandatory fields (“supervisor” and “phoneNumbers”) are written to the stream only when they have actual (not null) values. To implement this functionality we use the “attributeFlags” and “attributes” byte arrays. Each position of the “attributeFlags” array represents a non mandatory field and holds a “marker” indicating whether the specific field has a value. We check each non mandatory field and populate the “attributeFlags” byte array with the corresponding markers. The “attributes” byte array indicates the actual non mandatory fields that must be written to the stream by means of “position”. For example, if both the “supervisor” and “phoneNumbers” non mandatory fields have actual values then the “attributeFlags” byte array is [1,1] and the “attributes” byte array is [1,0], because the code above fills “attributes” starting from its last position. In case only the “phoneNumbers” non mandatory field has a non null value, the “attributeFlags” byte array is [0,1] and the “attributes” byte array is [1]. By using the aforementioned algorithm we can achieve minimal size footprint for the resulting stream. To properly deserialize the “Employee” object non mandatory parameters we must write to the stream only the following information :
    • The overall number of non mandatory parameters that will be written (aka the “attributes” byte array size – for the demarshaller to parse)
    • The “attributes” byte array (for the demarshaller to properly assign field values)
    • The actual non mandatory parameter values
  • For the “phoneNumbers” field we construct and write to the stream a String representation of its contents. Alternatively we could serialize the entire “phoneNumbers” object by using the “objectOutput.writeObject(phoneNumbers)” operation. In that case the default serialization mechanism would kick in resulting in speed degradation and size increment for the resulting stream
  • We implement the “readExternal” method for demarshalling the “Employee” object. All mandatory fields are read from the stream in the order in which they were written. For the non mandatory fields the demarshaller assigns the appropriate field values according to the protocol described above
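The optional-field bookkeeping described in the notes above can be isolated in a few lines. This sketch copies the fill-from-the-end loop of “writeExternal” into a stand-alone helper (the class name AttributeProtocolDemo and the method name presentAttributes are made up for illustration) so the resulting “attributes” arrays can be inspected directly.

```java
import java.util.Arrays;

class AttributeProtocolDemo {

    // Mirrors the loop in writeExternal: collects the positions of the set
    // flags, filling the result array from its last slot backwards.
    static byte[] presentAttributes(byte[] attributeFlags) {
        int attributeCount = 0;
        for (byte flag : attributeFlags) {
            if (flag == (byte) 1) {
                attributeCount++;
            }
        }
        byte[] attributes = new byte[attributeCount];
        int j = attributeCount;
        for (int i = 0; i < attributeFlags.length; i++) {
            if (attributeFlags[i] == (byte) 1) {
                j--;
                attributes[j] = (byte) i;
            }
        }
        return attributes;
    }

    public static void main(String[] args) {
        // Position 0 stands for "supervisor", position 1 for "phoneNumbers".
        System.out.println(Arrays.toString(presentAttributes(new byte[] {1, 1}))); // [1, 0]
        System.out.println(Arrays.toString(presentAttributes(new byte[] {0, 1}))); // [1]
        System.out.println(Arrays.toString(presentAttributes(new byte[] {1, 0}))); // [0]
        System.out.println(Arrays.toString(presentAttributes(new byte[] {0, 0}))); // []
    }
}
```

Because the array is filled backwards, the optional values end up on the wire in reverse field order when both are present; the reader does not care, since it dispatches on each position value.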

For the serialization and deserialization processes we used the following four functions. These functions come in two flavors. The first pair is suitable for serializing and deserializing Externalizable object instances, whereas the second pair is suitable for serializing and deserializing Serializable object instances.

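import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public static byte[][] serializeObject(Externalizable object) throws Exception {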
  ByteArrayOutputStream baos = null;
  ObjectOutputStream oos = null;
  byte[][] res = new byte[2][];
  
  try {
   baos = new ByteArrayOutputStream();
   oos = new ObjectOutputStream(baos);
   
   object.writeExternal(oos);
   oos.flush();
   
   res[0] = object.getClass().getName().getBytes();
   res[1] = baos.toByteArray();
  
  } catch (Exception ex) {
   throw ex;
  } finally {
   try {
    if(oos != null)
     oos.close();
   } catch (Exception e) {
    e.printStackTrace();
   }
  }
  
  return res;
 }
 
public static Externalizable deserializeObject(byte[][] rowObject) throws Exception {
  ObjectInputStream ois = null;
  String objectClassName = null;
  Externalizable res = null;
  
  try {
   
   objectClassName = new String(rowObject[0]);
   byte[] objectBytes = rowObject[1];
   
   ois = new ObjectInputStream(new ByteArrayInputStream(objectBytes));
   
   Class<?> objectClass = Class.forName(objectClassName);
   res = (Externalizable) objectClass.getDeclaredConstructor().newInstance();
   res.readExternal(ois);
  
  } catch (Exception ex) {
   throw ex;
  } finally {
   try {
    if(ois != null)
     ois.close();
   } catch (Exception e) {
    e.printStackTrace();
   }
   
  }
  
  return res;
  
 }
 
public static byte[] serializeObject(Serializable object) throws Exception {
  ByteArrayOutputStream baos = null;
  ObjectOutputStream oos = null;
  byte[] res = null;
  
  try {
   baos = new ByteArrayOutputStream();
   oos = new ObjectOutputStream(baos);
   
   oos.writeObject(object);
   oos.flush();
   
   res = baos.toByteArray();
  
  } catch (Exception ex) {
   throw ex;
  } finally {
   try {
    if(oos != null)
     oos.close();
   } catch (Exception e) {
    e.printStackTrace();
   }
  }
  
  return res;
 }
 
public static Serializable deserializeObject(byte[] rowObject) throws Exception {
  ObjectInputStream ois = null;
  Serializable res = null;
  
  try {
   
   ois = new ObjectInputStream(new ByteArrayInputStream(rowObject));
   res = (Serializable) ois.readObject();
  
  } catch (Exception ex) {
   throw ex;
  } finally {
   try {
    if(ois != null)
     ois.close();
   } catch (Exception e) {
    e.printStackTrace();
   }
   
  }
  
  return res;
  
 }
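To get a feel for why the two pairs of functions behave differently, the sketch below compares stream sizes for three encodings of the same data: plain Serializable through “writeObject”, Externalizable through “writeObject”, and Externalizable through a direct “writeExternal” call as in the first pair above. The SerPerson and ExtPerson classes are made-up stand-ins for the two “Employee” flavors, and Java 8 or later is assumed.

```java
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SizeDemo {

    public static class SerPerson implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String name;
        private final int age;

        public SerPerson(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    public static class ExtPerson implements Externalizable {
        private static final long serialVersionUID = 1L;
        private String name;
        private int age;

        public ExtPerson() {
        }

        public ExtPerson(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeUTF(name);
            out.writeInt(age);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            name = in.readUTF();
            age = in.readInt();
        }
    }

    // Full object protocol: class descriptor plus instance data.
    static int sizeViaWriteObject(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    // Direct call: only the stream header and the fields written by writeExternal.
    static int sizeViaDirectCall(Externalizable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            value.writeExternal(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        System.out.println("Serializable, writeObject:   " + sizeViaWriteObject(new SerPerson("Ann", 30)));
        System.out.println("Externalizable, writeObject: " + sizeViaWriteObject(new ExtPerson("Ann", 30)));
        System.out.println("Externalizable, direct call: " + sizeViaDirectCall(new ExtPerson("Ann", 30)));
    }
}
```

The direct call produces the smallest stream because no class descriptor is written, which is also why the first pair of functions has to carry the class name separately in the returned array.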

Below we present a performance comparison chart between the two aforementioned approaches

The horizontal axis represents the number of test runs and the vertical axis the average transactions per second (TPS) for each test run, so higher values are better. As you can see, the Externalizable approach achieves superior performance when serializing and deserializing compared to the plain Serializable approach.
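The original test harness is not shown in this article. As a rough sketch of how the TPS figure for a single test run could be obtained, the following assumes a made-up helper, measureTps, that starts a fixed number of worker threads, lets each repeat a given transaction, and divides the total number of transactions by the elapsed wall-clock time. Java 8 or later is assumed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

class TpsDemo {

    // Runs `repeats` transactions on each of `workers` threads and returns
    // the average number of transactions completed per second.
    static double measureTps(int workers, int repeats, Runnable transaction) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        CountDownLatch finished = new CountDownLatch(workers);
        long start = System.nanoTime();
        for (int w = 0; w < workers; w++) {
            pool.execute(() -> {
                try {
                    for (int r = 0; r < repeats; r++) {
                        transaction.run();
                    }
                } finally {
                    finished.countDown();
                }
            });
        }
        try {
            finished.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } finally {
            pool.shutdown();
        }
        // Guard against a zero reading on very coarse clocks.
        long elapsedNanos = Math.max(1L, System.nanoTime() - start);
        return (workers * (double) repeats) / (elapsedNanos / 1_000_000_000.0);
    }

    public static void main(String[] args) {
        AtomicLong counter = new AtomicLong();
        // A serialize/deserialize round trip would replace the counter increment here.
        double tps = measureTps(4, 1000, counter::incrementAndGet);
        System.out.println(counter.get() + " transactions, " + tps + " TPS");
    }
}
```

Note that this simple version includes thread start-up time in the measurement and does no warm-up, so the absolute numbers it reports should be treated with caution.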

Lastly, we must point out that we performed our tests providing values for all non mandatory fields of the “Employee” object. You should expect even higher performance gains if you do not use all the non mandatory parameters for your tests, both when comparing runs of the same approach and, most importantly, when cross comparing the Externalizable and Serializable approaches.

Happy coding!

Justin


7 Responses to "Java Best Practices – High performance Serialization"

  1. Proagent says:

    thanks for this nice article,
    can you tell us what the tool you used for performance comparison ?

  2. Andre says:

    Tweaked your code a bit, this is 4 times faster roughly …

    package com.javacodegeeks.test;

    import java.io.ByteArrayInputStream;

    import java.io.ByteArrayOutputStream;

    import java.io.DataInputStream;

    import java.io.DataOutputStream;

    import java.io.IOException;

    public class DataMessageTransmission_test {

    private DataMessageTransmission_test employee = this;

    private String firstName;

    private String lastName;

    private String socialSecurityNumber;

    private String department;

    private String position;

    private long hireDate;

    private Double salary;

    private String supervisor;

    private String[] phoneNumbers;

    private static byte[][]serial;

    public DataMessageTransmission_test() {}

    public DataMessageTransmission_test(String firstName, String lastName,String socialSecurityNumber, String department, String position, long hireDate, Double salary) {

    employee.firstName = firstName;

    employee.lastName = lastName;

    employee.socialSecurityNumber = socialSecurityNumber;

    employee.department = department;

    employee.position = position;

    employee.hireDate = hireDate;

    employee.salary = salary;

    }

    public String getFirstName() {

    return firstName;

    }

    public void setFirstName(String firstName) {

    employee.firstName = firstName;

    }

    public String getLastName() {

    return lastName;

    }

    public void setLastName(String lastName) {

    employee.lastName = lastName;

    }

    public String getSocialSecurityNumber() {

    return socialSecurityNumber;

    }

    public void setSocialSecurityNumber(String socialSecurityNumber) {

    employee.socialSecurityNumber = socialSecurityNumber;

    }

    public String getDepartment() {

    return department;

    }

    public void setDepartment(String department) {

    employee.department = department;

    }

    public String getPosition() {

    return position;

    }

    public void setPosition(String position) {

    employee.position = position;

    }

    public long getHireDate() {

    return hireDate;

    }

    public void setHireDate(long hireDate) {

    employee.hireDate = hireDate;

    }

    public Double getSalary() {

    return salary;

    }

    public void setSalary(Double salary) {

    employee.salary = salary;

    }

    public String getSupervisor() {

    return supervisor;

    }

    public void setSupervisor(String supervisor) {

    employee.supervisor = supervisor;

    }

    public String[] getPhoneNumbers() {

    return phoneNumbers;

    }

    public void setPhoneNumbers(String[] phoneNumbers) {

    employee.phoneNumbers = phoneNumbers;

    }

    public byte[] readExternal(DataInputStream ois) throws IOException, ClassNotFoundException {

    DataMessageTransmission_test employee = new DataMessageTransmission_test();

    employee.firstName = ois.readUTF();

    employee.lastName = ois.readUTF();

    employee.socialSecurityNumber = ois.readUTF();

    employee.department = ois.readUTF();

    employee.position = ois.readUTF();

    employee.hireDate = ois.readLong();

    employee.salary = ois.readDouble();

    int attributeCount = ois.read();

    byte[] attributes = new byte[attributeCount];

    ois.readFully(attributes);

    for (int i = 0; i < attributeCount; i++) {

    if(i == 0){employee.supervisor = ois.readUTF();}

    if(i == 1){employee.phoneNumbers = ois.readUTF().split(";");}

    }

    return null;

    }

    public void writeExternal(DataOutputStream objectOutput) throws IOException {

    objectOutput.writeUTF(firstName);

    objectOutput.writeUTF(lastName);

    objectOutput.writeUTF(socialSecurityNumber);

    objectOutput.writeUTF(department);

    objectOutput.writeUTF(position);

    objectOutput.writeLong(hireDate);

    objectOutput.writeDouble(salary);

    byte[] attributeFlags = new byte[2];

    int attributeCount = 0;

    if (supervisor != null) {attributeFlags[0] = 1;attributeCount++;}

    if (phoneNumbers != null) {attributeFlags[1] = 1;attributeCount++;}

    objectOutput.write(attributeCount);

    byte[] attributes = new byte[attributeCount];

    int j = attributeCount;

    for (int i = 0; i < 2; i++)

    if (attributeFlags[i] == 1) {j--;attributes[j] = (byte) i;}

    objectOutput.write(attributes);

    for (int i = 0; i < attributeCount; i++) {

    if(i == 0){objectOutput.writeChars(supervisor);}

    if(i == 1){

    String rowPhoneNumbers = new String();

    for(int k = 0; k “+rowObject[1]);

    ois = new DataInputStream(new ByteArrayInputStream(rowObject[1]));

    res[1] = readExternal(ois);

    }catch (Exception ex) {

    throw ex;

    } finally {

    try {

    if(ois != null)

    ois.close();

    } catch (Exception e) {

    e.printStackTrace();

    }

    }

    return res;

    }

    public static void main(String args[]){

    DataMessageTransmission_test employee = new DataMessageTransmission_test("Herman","Klaasen","1258964587A","Main office","vice Director",System.currentTimeMillis(),2050.45);

    long time1 = 0, time2 = 0;

    try {

    time1 = System.nanoTime();

    serial = DataMessageTransmission_test. writeObjectToDataStream(employee);

    } catch (Exception e) {

    // TODO Auto-generated catch block

    e.printStackTrace();

    }

    try {

    employee.readObjectFromDataStream(serial);

    System.out.println(employee.getFirstName());

    time2 = System.nanoTime();

    } catch (Exception e) {

    // TODO Auto-generated catch block

    e.printStackTrace();

    }

    System.out.println(""+(time2-time1));

    }

    }

    • Charles says:

      Hello Andre,

      Could you please email your code to play with? The one posted seems having some problem….
      Thanks in advance.

      Charles_L_chan (at) me (dot) com

  3. Rüdiger Möller says:

    You should checkout https://code.google.com/p/fast-serialization/ . This library outperforms manual serialization in many cases.


Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.
