Comparing JARs with Groovy

It can sometimes be useful to compare the contents of two JARs. In this blog post, I demonstrate a Groovy script that acts like a simple “diff” tool for comparing two JAR files. The Groovy script shown here, jarDiff.groovy, can undoubtedly be improved upon, but it does what I wanted it to do. The script compares two provided JARs in the following ways:

  • Shows path, name, and size of both JARs regardless of whether they are identical or different.
  • Shows entries in each JAR that do not exist in the other JAR.
  • Shows entries that are in common (by name) in each JAR but have different attributes (CRC, size, or modification date).

Given these characteristics, the script displays only the path/file name and the size of each JAR when the two JARs are identical. For JARs that differ, it displays those same attributes along with the entries that exist in only one JAR and the entries common to both JARs whose CRC, size, or modification date differ. An important distinction is that this script is mostly useful for comparing metadata in two JARs; it does not provide differencing at the level of methods/APIs (as would be provided by a tool such as javap) or at the source code level (which would require a decompiler). This script identifies that differences exist, and those other tools can then be used to investigate the details of the differences.
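The comparison described above boils down to set operations on entry names. Here is a minimal sketch of that idea using plain maps; the entry names and attribute values are made up for illustration and are not read from any real JAR.

```groovy
// Hypothetical entry metadata keyed by entry name; the values stand in for
// the CRC/size attributes the real script reads from each JAR.
def first  = ['a/A.class': [crc: 1L, size: 100L], 'a/B.class': [crc: 2L, size: 200L]]
def second = ['a/B.class': [crc: 9L, size: 210L], 'a/C.class': [crc: 3L, size: 300L]]

// Entries present on only one side: plain set difference on the key sets.
def onlyInFirst  = first.keySet() - second.keySet()
def onlyInSecond = second.keySet() - first.keySet()

// Entries present on both sides (matched by name) whose attributes differ.
def changed = first.keySet().intersect(second.keySet()).findAll { name ->
   first[name] != second[name]
}

println "Only in first:  ${onlyInFirst}"
println "Only in second: ${onlyInSecond}"
println "Changed:        ${changed}"
```

The full script that follows applies the same set difference to the key sets of two TreeMap instances built from the JARs.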

#!/usr/bin/env groovy

/**
 * jarDiff.groovy
 *
 * jarDiff.groovy <first_jar_file> <second_jar_file>
 *
 * Script that compares two JAR files, reporting basic characteristics of each
 * along with differences between the two JARs.
 */

if (args.length < 2)
{
   println "\nUSAGE: jarDiff.groovy <first_jar_file> <second_jar_file>\n"
   System.exit(-1)
}

TOTAL_WIDTH = 180
COLUMN_WIDTH = TOTAL_WIDTH / 2 - 3
ROW_SEPARATOR = "-".multiply(TOTAL_WIDTH)

import java.util.jar.JarFile

def file1Name = args[0]
def jar1File = new JarFile(file1Name)
def jar1 = extractJarContents(jar1File)
def file2Name = args[1]
def jar2File = new JarFile(file2Name)
def jar2 = extractJarContents(jar2File)

def entriesInJar1ButNotInJar2 = jar1.keySet() - jar2.keySet()
def entriesInJar2ButNotInJar1 = jar2.keySet() - jar1.keySet()

println ROW_SEPARATOR
println "| ${file1Name.center(COLUMN_WIDTH)} |${file2Name.center(COLUMN_WIDTH)} |"
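// JarFile.size() returns the number of entries, so use File.length() for the size in bytes.
print "| ${(Long.toString(new File(file1Name).length()) + " bytes").center(COLUMN_WIDTH)} |"
println "${(Long.toString(new File(file2Name).length()) + " bytes").center(COLUMN_WIDTH)} |"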
println ROW_SEPARATOR

if (jar1File.manifest != jar2File.manifest)
{
   // A JAR without a manifest yields null, so navigate safely and report zero entries.
   def manifestPreStr = "# Manifest Entries: "
   def manifest1Str = manifestPreStr + Integer.toString(jar1File.manifest?.mainAttributes?.size() ?: 0)
   print "| ${manifest1Str.center(COLUMN_WIDTH)} |"
   def manifest2Str = manifestPreStr + Integer.toString(jar2File.manifest?.mainAttributes?.size() ?: 0)
   println "${manifest2Str.center(COLUMN_WIDTH)} |"
   println ROW_SEPARATOR
}

entriesInJar1ButNotInJar2.each
{ entry1 ->
   print "| ${entry1.center(COLUMN_WIDTH)} |"
   println "${" ".center(entry1.size() > COLUMN_WIDTH ? 2 * COLUMN_WIDTH - entry1.size() : COLUMN_WIDTH)} |"
   println ROW_SEPARATOR
}
entriesInJar2ButNotInJar1.each
{ entry2 ->
   print "| ${" ".center(entry2.size() > COLUMN_WIDTH ? 2 * COLUMN_WIDTH - entry2.size() : COLUMN_WIDTH)}"
   println "| ${entry2.center(COLUMN_WIDTH)} |"
   println ROW_SEPARATOR
}

jar1.each 
{ key, value ->
   if (!entriesInJar1ButNotInJar2.contains(key))
   {
      def jar2Entry = jar2.get(key)
      if (value != jar2Entry)
      {
         println "| ${key.center(COLUMN_WIDTH)} |${jar2Entry.name.center(COLUMN_WIDTH)} |"
         if (value.crc != jar2Entry.crc)
         {
            def crc1Str = "CRC: ${value.crc}"
            def crc2Str = "CRC: ${jar2Entry.crc}"
            print "| ${crc1Str.center(COLUMN_WIDTH)} |"
            println "${crc2Str.center(COLUMN_WIDTH)} |"
         }
         if (value.size != jar2Entry.size)
         {
            def size1Str = "${value.size} bytes"
            def size2Str = "${jar2Entry.size} bytes"
            print "| ${size1Str.center(COLUMN_WIDTH)} |"
            println "${size2Str.center(COLUMN_WIDTH)} |"
         }
         if (value.time != jar2Entry.time)
         {
            def time1Str = "${new Date(value.time)}"
            def time2Str = "${new Date(jar2Entry.time)}"
            print "| ${time1Str.center(COLUMN_WIDTH)} |"
            println "${time2Str.center(COLUMN_WIDTH)} |"
         }
         println ROW_SEPARATOR
      }
   }
}

/**
 * Provide mapping of JAR entry names to characteristics about that JAR entry
 * for the JAR indicated by the provided JAR file name.
 *
 * @param jarFile JAR file from which to extract contents.
 * @return JAR entries and their characteristics.
 */
TreeMap<String, JarCharacteristics> extractJarContents(JarFile jarFile)
{
   def jarContents = new TreeMap<String, JarCharacteristics>()
   // Record name, CRC, size, and modification time for every entry in the JAR.
   def entries = jarFile.entries()
   entries.each
   { entry ->
      jarContents.put(entry.name, new JarCharacteristics(entry.name, entry.crc, entry.size, entry.time))
   }
   return jarContents
}
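
The script references a JarCharacteristics class that does not appear in the listing above. A minimal sketch of such a class might look like the following. It assumes the class only needs to hold the four values passed to its constructor and to support the equality check used when comparing entries. The field names are taken from the property accesses in the script (name, crc, size, time); the use of @Canonical is my assumption and not necessarily how the original class was written.

```groovy
// Hypothetical holder for the per-entry attributes the script compares.
// @Canonical generates a positional constructor plus equals(), hashCode(),
// and toString(), so two instances are equal when all four fields match.
@groovy.transform.Canonical
class JarCharacteristics
{
   String name
   long crc
   long size
   long time
}
```

A class like this could be appended to jarDiff.groovy or saved as JarCharacteristics.groovy in the same directory as the script.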

Like all Groovy scripts, the above could be written in Java, but Groovy is better suited to script writing than Java. The above Groovy script makes use of Groovy features that I have covered in previous blog posts such as Scripted Reports with Groovy (for formatting output of differences) and Searching JAR Files with Groovy (for perusing and reading JAR files).

There are several potential enhancements for this script. One is to have the script compare the contents of the two MANIFEST.MF files directly, going beyond the entry-level differences (CRC, size, and modification date) that it already detects for every file in the JARs. Another is to use reflection to compare the methods defined on the classes, interfaces, and enums contained in the JARs. For now, however, I am content to use javap or javac -Xprint to see the method changes once the above script identifies differences in a particular class, enum, or interface.
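
As a rough illustration of the manifest enhancement, here is a sketch that compares the main attributes of two java.util.jar.Manifest objects. The helper name diffMainAttributes and the sample attribute values are hypothetical; in the script, the two manifests would come from jar1File.manifest and jar2File.manifest. This sketch ignores per-entry manifest sections.

```groovy
import java.util.jar.Manifest

/**
 * Hypothetical helper that reports main-attribute differences between two manifests.
 *
 * @return Map of attribute name to [valueInFirst, valueInSecond]; a null element
 *         means the attribute is absent on that side.
 */
Map<String, List<String>> diffMainAttributes(Manifest first, Manifest second)
{
   // Attributes is a Map keyed by Attributes.Name; convert to String keys and values.
   // A null manifest (JAR without MANIFEST.MF) is treated as having no attributes.
   def toStringMap = { Manifest m ->
      (m?.mainAttributes ?: [:]).collectEntries { k, v -> [(k.toString()): v.toString()] }
   }
   def attrs1 = toStringMap(first)
   def attrs2 = toStringMap(second)

   def differences = new TreeMap<String, List<String>>()
   (attrs1.keySet() + attrs2.keySet()).each { name ->
      if (attrs1[name] != attrs2[name])
      {
         differences[name] = [attrs1[name], attrs2[name]]
      }
   }
   return differences
}

// Two in-memory manifests with made-up values, standing in for real JAR manifests.
def m1 = new Manifest()
m1.mainAttributes.putValue('Manifest-Version', '1.0')
m1.mainAttributes.putValue('Built-By', 'alice')

def m2 = new Manifest()
m2.mainAttributes.putValue('Manifest-Version', '1.0')
m2.mainAttributes.putValue('Built-By', 'bob')
m2.mainAttributes.putValue('Main-Class', 'example.Main')

diffMainAttributes(m1, m2).each { name, values ->
   println "${name}: ${values[0]} -> ${values[1]}"
}
```

Returning a sorted map keeps the report order stable between runs, which matches the script's use of TreeMap for JAR entries.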

Being able to quickly identify differences between two JARs can be beneficial in a variety of circumstances, such as comparing versions of one’s own generated JARs for changes or comparing JARs of provided libraries and frameworks whose names do not make their differences obvious. The Groovy script demonstrated in this post identifies high-level differences between two JARs and at the same time shows off some nice Groovy features.
 

Reference: Comparing JARs with Groovy from our JCG partner Dustin Marx at the Inspired by Actual Events blog.