Enterprise class Java code

David GreenJune 15th, 2012Last Updated: October 22nd, 2012

0 108 4 minutes read

There’s a natural instinct to assume that everybody else’s code is an untidy, undisciplined mess. But, if we look objectively, some people genuinely are able to write well crafted code. Recently, I’ve come across a different approach to clean code that is unlike the code I’ve spent most of my career working with (and writing).

Enterprise-class Code

There’s a common way of writing code, perhaps particularly common in Java, but happens in C#, too – that encourages the developer to write as many classes as possible. In large enterprises this way of building code is endemic. To paraphrase: every problem in an enterprise code base can be solved by the addition of another class, except too many classes.

Why does this way of writing code happen? Is it a reaction to too much complexity? There is a certain logic in writing a small class that can be easily tested in isolation. But when everyone takes this approach, you end up with millions of classes. Trying to figure out how to break up complex systems is hard, but if we don’t and just keep on adding more classes we’re making the problem worse not better.

However, it goes deeper than that. Many people (myself included, until recently) think the best way to write well-crafted, maintainable code is to write lots of small classes. After all, a simple class is easy to explain and is more likely to have a single responsibility. But how do you go about explaining a system consisting of a hundred classes to a new developer? I bet you end up scribbling on a whiteboard with boxes and lines everywhere. I’m sure your design is simple and elegant, but when I have to get my head around 100 classes just to know what the landscape looks like – it’s going to take me a little while to understand it. Scale that up to an enterprise with thousands upon thousands of classes and it gets really complex: your average developer may never understand it.

An Example

Perhaps an example would help? Imagine I’m working on some trading middleware. We receive messages representing trades that we need to store and pass on to systems further down the line. Right now we receive these trades in a CSV feed. I start by creating a TradeMessage class.

TradeMessage

private long id;

private Date timestamp;

private BigDecimal amount;

private TradeType type;

private Asset asset;

I’m a good little functional developer so this class is immutable. Now I have two choices: i) I write a big constructor that takes a hundred parameters or ii) I create a builder to bring some sanity to the exercise. I go for option ii).

TradeMessageBuilder

public TradeMessageBuilder onDate(Date timestamp)

public TradeMessageBuilder forAmount(BigDecimal amount)

public TradeMessageBuilder ofType(TradeType type)

public TradeMessageBuilder inAsset(Asset asset)

public TradeMessage build()

Now I have a builder from which I can create TradeMessage classes. However, the builder requires the strings to have been parsed into dates, decimals etc. I also need to worry about looking up Assets, since the TradeMessage uses the Asset class, but the incoming message only has the name of the asset.

We now test-drive outside-in like good little GOOS developers. We start from a CSVTradeMessageParser (I’m ignoring the networking or whatever else feeds our parser).

We need to parse a single line of CSV, split it into its component parts from which we’ll build the TradeMessage. Now we have a few things we need to do first:

Parse the timestamp
Parse the amount
Parse the trade type
Lookup the asset in the database

Now in the most extreme enterprise-y madness, we could write one class for each of those “responsibilities”. However, that’s plainly ridiculous in this case (although add in error handling or some attempts at code reuse and you’d be amazed how quickly all those extra classes start to look like a good idea).

Instead, the only extra concern we really have here is the asset lookup. The date, amount and type parsing I can add to the parser class itself – it’s all about the single responsibility of parsing the message so it makes sense.

CSVTradeMessageParser

public TradeMessage parse(String csvline)

private Date parseTimestamp(String timestamp)

private BigDecimal parseAmount(String amount)

private TradeType parseType(String amount)

Now – there’s an issue test driving this class – how do I test all these private methods? I could make them package visible and put my tests in the same package, but that’s nasty. Or I’m forced to test from the public method, mock the builder and verify the correctly parsed values are passed to the builder. This isn’t ideal as I can’t test each parse method in isolation. Suddenly making them separate classes seems like a better idea…

Finally I need to create an AssetRepository:

AssetRepository

public Asset lookupByName(String name)

The parser uses this and passes the retrieved Asset to the TradeMessageBuilder.

And we’re done! Simple, no? So, if I’ve test driven this with interfaces for my mocked dependencies, how many classes have I had to write?

TradeMessage
TradeType
TradeMessageBuilder
ITradeMessageBuilder
CSVTradeMessageParser
Asset
AssetRepository
IAssetRepository
TradeMessageBuilderTest
CSVTradeMessageParserTest
AssetRepositoryTest

Oh, and since this is only unit tests, I probably need some end-to-end tests to check the whole shooting match works together:

CSVTradeMessageParserIntegrationTest

12 classes! Mmm enterprise-y. This is just a toy example. In the real world, we’d have FactoryFactories and BuilderVisitors to really add to the mess.

Another Way

Is there another way? Well, let’s consider TradeMessage is an API that I want human beings to use. What are the important things about this API?

TradeMessage

public Date getTimestamp()

public BigDecimal getAmount()

public TradeType getType()

public Asset getAsset()

public void parse(String csvline)

That’s really all callers care about – getting values and parsing from CSV. That’s enough for me to use in tests and production code. Here we’ve created a nice, clean, simple API that is dead easy to explain. No need for a whiteboard, or boxes and lines and long protracted explanations.

But what about our parse() method? Hasn’t this become too complex? Afterall it has to decompose the string, parse dates, amounts and trade types. That’s a lot of responsibilities for one method. But how bad does it actually look? Here’s mine, in full:

public void parse(String csvline) throws ParseException
{
    String[] parts = csvline.split(',');

    setTimestamp(fmt.parse(parts[0]));
    setTradeType(TradeType.valueOf(parts[1]));
    setAmount(new BigDecimal(parts[2]));
    setAsset(Asset.withName(parts[3]));
}

Now of course, by the time you’ve added in some real world complexity and better error handling it’s probably going to be more like 20 lines.

But, let me ask you which would you rather have: 12 tiny classes, or 4 small classes? Is it better for complex operations to be smeared across dozens of classes, or nicely fenced into a single method?

Reference: Enterprise class code from our JCG partner David Green at the Actively Lazy blog.