Creating Vulnerability Assessment Artifacts Using Maven Assembly

This article will discuss using Maven Assembly to create artifacts that can be provided to third-party vulnerability assessment sites (e.g., Veracode) for review.

Static Analysis for Bugs vs. Vulnerability Assessments

At this point everyone is aware of FindBugs and uses it religiously, right?

Right?

FindBugs uses static analysis to find bugs. More precisely, it uses static analysis to find bugs that can be found by static analysis. For instance, I’ve seen this common pattern:

public void foo(Object obj) {
    if (obj != null) {
        obj.doSomething();
    }

    // lots of obscuring code

    // not guarded: throws NullPointerException if obj is null
    obj.doSomethingElse();
}

Should we check for null a second time? Did we need to check the first time? Should we have returned from the ‘if’ clause?

Why do we also need vulnerability assessment?

What is vulnerability assessment? How is it different from bugs?

The key concept is that vulnerable code is superficially bug-free but can still be misused to attack the site or its users.

An example of vulnerable code is using unsanitized user-provided values. Anyone working on the front-end should know the importance of sanitizing these values.

But what happens when user-provided data is passed out of the front-end, e.g., when it’s written to the database? Will everyone who pulls data from the database know that it might contain unsanitized user-provided data? What about malicious data put into the database via SQL injection?

Static analysis for vulnerability assessment looks a lot like static analysis to find bugs, just a lot more thorough. Whereas FindBugs may take 5 minutes to run, Veracode may take a few hours!

(Dynamic analysis takes this a step further and runs the tests against a live system. You can do a light version of this using integration tests.)

Artifacts for Vulnerability Assessment

What do we need to provide for vulnerability assessments? The short answer is three things:

  • our compiled code (e.g., java or scala)
  • our scripted code (e.g., jsp)
  • every jar file we depend on, recursively

We don’t need to provide our source code or resources. The compiled code does need to include debug information so the assessment can give meaningful error messages. Knowing only that there are 19 defects in a library containing 79 classes isn’t very helpful!
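
Debug information is controlled by the compiler, not by the assembly. As a minimal sketch, assuming the Java code is built with the standard maven-compiler-plugin (Scala modules use a different compiler plugin and would need their own setting), the following configuration asks for full debug information. Whether this belongs in a module pom or the parent pom is an assumption; the setup described here does not say.

```xml
<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <configuration>
                <!-- emit debug information in the compiled classes -->
                <debug>true</debug>
                <!-- line numbers, local variables and source file names -->
                <debuglevel>lines,vars,source</debuglevel>
            </configuration>
        </plugin>
    </plugins>
</build>
```

The plugin normally compiles with debug information already, so this mostly guards against a parent pom or build setting that turns it off.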

A good format is a tarball containing:

  • our jar and wars at the top level, sans version number
  • our dependencies under “/lib”, with version number

Version numbers are stripped from our own artifacts and retained on our dependencies for tracking purposes. Our code has continuity across multiple runs, so its file names should stay the same from one scan to the next. Our dependencies can change at any time and have no continuity beyond what the version numbers explicitly indicate.

Our war files should be stripped of embedded jars since they’ll be present under the ‘lib’ directory. “Thick” war files just increase the size of the uploaded artifact.

We can build this with two maven assembly descriptors.

va-war.xml (vulnerability assessment skinny war)

The first assembly creates a stripped down .war file. I don’t want to call it a skinny war since the intended purpose is different but they have a lot in common.

<assembly
    xmlns="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.2"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation=
       "http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.2
        http://maven.apache.org/xsd/assembly-1.1.2.xsd">

    <id>va-war</id>
    <formats>
        <format>war</format>
    </formats>

    <includeBaseDirectory>false</includeBaseDirectory>

    <fileSets>
        <!-- grab everything except any jars -->
        <fileSet>
            <directory>target/${project.artifactId}-${project.version}</directory>
            <outputDirectory>/</outputDirectory>
            <includes />
            <excludes>
                <exclude>**/*.jar</exclude>
            </excludes>
        </fileSet>
    </fileSets>
</assembly>

You can exclude additional files if you have sensitive information or a lot of large artifacts:

<excludes>
    <exclude>**/*.jar</exclude>
    <exclude>**/*.jks</exclude>
    <exclude>**/*.p12</exclude>
    <exclude>**/*.jpg</exclude>
    <exclude>**/db.properties</exclude>
</excludes>

Be careful, though: the artifact must still include anything that’s scripted, e.g., JSP files or Velocity templates.

va-artifact.xml (vulnerability assessment artifact)

The second assembly collects all of the dependencies and stripped-down wars into a single tarball. Our jars and wars are at the top level of the tarball and all dependencies are in a ‘lib’ directory. This makes it easy to distinguish between our artifacts and our dependencies.

<assembly
    xmlns="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.2"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation=
       "http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.2
        http://maven.apache.org/xsd/assembly-1.1.2.xsd">

    <id>va-artifact</id>
    <formats>
        <format>tar.gz</format>
    </formats>

    <includeBaseDirectory>false</includeBaseDirectory>
    <dependencySets>

        <!-- ******************************************* -->
        <!-- Our code should not include version numbers -->
        <!-- ******************************************* -->
        <dependencySet>
            <includes>
                <include>${project.groupId}:*:jar</include>
                <!-- pattern is groupId:artifactId:type:classifier -->
                <include>${project.groupId}:*:war:va-war</include>

                <!-- we could also include subprojects -->
                <include>${project.groupId}.**:*:jar</include>
            </includes>

            <!-- we might have sensitive resources -->
            <excludes>
                <exclude>${project.groupId}:*-properties</exclude>
            </excludes>
            <outputFileNameMapping>${artifact.artifactId}${dashClassifier?}.${artifact.extension}</outputFileNameMapping>
        </dependencySet>

        <!-- *********************************************** -->
        <!-- Our dependencies should include version numbers -->
        <!-- *********************************************** -->
        <dependencySet>
            <outputDirectory>lib</outputDirectory>
            <includes />

            <excludes>
                <exclude>${project.groupId}:*</exclude>
                <exclude>*.pom</exclude>

                <!-- exclude standard APIs -->
                <exclude>javax.*:*</exclude>
                <exclude>dom4j:*</exclude>
                <exclude>jaxen:*</exclude>
                <exclude>jdom:*</exclude>
                <exclude>xml-apis:*</exclude>
            </excludes>
        </dependencySet>
    </dependencySets>
</assembly>
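
The second dependencySet keeps version numbers simply because it does not override the output file name. As a sketch, here is the same set with the mapping written out, assuming the plugin's documented default mapping of artifactId, version, optional classifier and extension:

```xml
<dependencySet>
    <outputDirectory>lib</outputDirectory>
    <!-- assumed to match the plugin default; spelled out only for clarity -->
    <outputFileNameMapping>${artifact.artifactId}-${artifact.version}${dashClassifier?}.${artifact.extension}</outputFileNameMapping>
    <excludes>
        <exclude>${project.groupId}:*</exclude>
    </excludes>
</dependencySet>
```

Comparing this with the mapping in the first dependencySet shows that the only difference is the `-${artifact.version}` segment.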

Building the Artifacts

The assembly descriptors are only half of the story. We still need to call maven assembly and we do not want to do it for every build.

This is an ideal time for profiles – we will only build artifacts when a specific profile is specified.

pom.xml for war modules

The necessary addition to the pom.xml file for war modules is modest. We need to call our assembly descriptor but we don’t need to explicitly add dependencies.

<profiles>
    <profile>
        <id>vulnerability-assessment</id>
        <build>
            <plugins>
                <plugin>
                    <artifactId>maven-assembly-plugin</artifactId>
                    <configuration>
                        <descriptors>
                            <descriptor>src/main/assembly/va-war.xml</descriptor>
                        </descriptors>
                    </configuration>
                    <executions>
                        <execution>
                            <id>make-assembly</id>
                            <phase>package</phase>
                            <goals>
                                <goal>single</goal>
                            </goals>
                        </execution>
                    </executions>
                </plugin>
            </plugins>
        </build>
    </profile>
</profiles>
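
A profile defined this way is selected explicitly on the command line, e.g. `mvn -Pvulnerability-assessment package`. As an alternative sketch, a profile can also be activated by a system property. The property name `va` below is purely hypothetical:

```xml
<profile>
    <id>vulnerability-assessment</id>
    <activation>
        <!-- hypothetical property: profile is active whenever -Dva is given -->
        <property>
            <name>va</name>
        </property>
    </activation>
    <!-- <build> section as in the profile above -->
</profile>
```

With this in place `mvn -Dva package` builds the assessment artifacts, while an ordinary `mvn package` skips them.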

pom.xml for top-level modules

The necessary addition to the pom.xml file for the top-level module is more complex, especially when the distribution assembly is created in a submodule instead of the root module. In this case we need to explicitly add dependencies on both the pom files and the stripped-down war files. If we don’t specify the former we’ll lose most dependencies; if we don’t specify the latter we’ll lose the .war files.

<profiles>
    <profile>
        <id>vulnerability-assessment</id>
        <build>
            <plugins>
                <plugin>
                    <artifactId>maven-assembly-plugin</artifactId>
                    <configuration>
                        <descriptors>
                            <descriptor>src/main/assembly/va-artifact.xml</descriptor>
                        </descriptors>
                    </configuration>
                    <executions>
                        <execution>
                            <id>make-assembly</id>
                            <phase>package</phase>
                            <goals>
                                <goal>single</goal>
                            </goals>
                        </execution>
                    </executions>
                </plugin>
            </plugins>
        </build>

        <dependencies>
            <!-- specify parent pom -->
            <dependency>
                <groupId>${project.groupId}</groupId>
                <artifactId>parent</artifactId> <!-- FIXME -->
                <version>${project.version}</version>
                <type>pom</type>
            </dependency>

            <!-- specify each war file and corresponding pom file -->
            <dependency>
                <groupId>${project.groupId}</groupId>
                <artifactId>webapp-1</artifactId> <!-- FIXME -->
                <version>${project.version}</version>
                <type>war</type>
                <classifier>va-war</classifier>
            </dependency>
            <dependency>
                <groupId>${project.groupId}</groupId>
                <artifactId>webapp-1</artifactId> <!-- FIXME -->
                <version>${project.version}</version>
                <type>pom</type>
            </dependency>

            <!-- second... -->
            <dependency>
                <groupId>${project.groupId}</groupId>
                <artifactId>webapp-2</artifactId> <!-- FIXME -->
                <version>${project.version}</version>
                <type>war</type>
                <classifier>va-war</classifier>
            </dependency>
            <dependency>
                <groupId>${project.groupId}</groupId>
                <artifactId>webapp-2</artifactId> <!-- FIXME -->
                <version>${project.version}</version>
                <type>pom</type>
            </dependency>

            <!-- and so on... -->

        </dependencies>
    </profile>
</profiles>

One small gotcha!

There is one small gotcha in this specific approach. It is possible that the individual web modules will have dependencies on different versions of common libraries. Nobody wants this but once projects reach a certain size you can’t afford the time and effort required to keep all of the modules in sync.

This information will be lost when we do dependency resolution at a common location.

I don’t consider this a problem for two reasons. First, we can perform vulnerability assessments at a finer granularity, essentially performing the analysis at the .war level instead of the .ear level. This guarantees the libraries will match but will tremendously increase our workload if we have a large number of web modules.

Second, our primary focus is the vulnerabilities in our code, not in specific versions of third-party libraries. Those libraries provide important hints to the assessment tools but we only want a full analysis of our code. We can always run separate assessments of the libraries we depend upon if it’s necessary.
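
Where the version of a particular library does matter, one option is to pin it in the parent pom so that every web module and the common assembly resolve the same version. A minimal sketch, using hypothetical coordinates that are not taken from any real project:

```xml
<dependencyManagement>
    <dependencies>
        <!-- hypothetical shared library; substitute real coordinates -->
        <dependency>
            <groupId>com.example</groupId>
            <artifactId>common-lib</artifactId>
            <version>1.0.0</version>
        </dependency>
    </dependencies>
</dependencyManagement>
```

Modules that declare this dependency without a version inherit the managed one, and the managed version also applies when the library is pulled in transitively. A module that declares its own explicit version still overrides it.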

Jenkins Veracode Plugin

Finally I want to point out that there’s a Jenkins plugin for Veracode analysis: Veracode Scanner Plugin. It can be used to schedule scans on a regular basis so you don’t find hundreds of defects when you finally remember to run a scan just days before a release.
 
Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use