Keep As Much Stuff As Possible In The Application Itself

Bozhidar Bozhanov, April 26th, 2012 (Last Updated: October 21st, 2012)

There’s a lot of Ops work to every project. Setting up server machines, and clusters of them, managing the cloud instances, setting up the application servers, HAProxy, load balancers, database clusters, message queues, search engine, DNS, alerts, and whatnot. That’s why the Devops movement is popular – there’s a lot happening outside the application that is vital to its success. But unix/linux is tedious. I hate it, to be honest. Shell script is ugly and I would rather invent a new language and write a compiler for it than write a shell script. I know many “hackers” will gasp at this statement, but let’s face it – shell scripting should be used only as a last resort, because the script will most likely stay out of the application’s repository, it is not developer-friendly, and it’s ugly (yes, you can version it, you can write it with good comments, and still…)

But enough about my hatred of shell scripting (and command-line executed perl scripts, for that matter). That’s not the only thing that should be kept to a minimum. (Btw, this is the ‘whining’ paragraph, you can probably skip it.) The “Getting Started” guide of every database, message queue, search engine, server, etc. says “easy to install”. Sure, you just apt-get install it, then go to /usr/lib/foo/bar and change a configuration file, then give permissions to a newly-created user that runs it, oh, and you customize the shell script to do something, and you’re there. Oh, and /usr/lib/foo/bar – that’s different depending on how you install it and who has installed it. I’ve seen tomcat installed in at least 5 different ways. One time all of its folders (bin, lib, conf, webapps, logs, temp) were in a completely different location on the server. And of course somebody decided to use the built-in connection pool, so the configuration has to be done in the servlet container itself. Use the defaults. Put that application server there and leave it alone. But we need a message queue. And a NoSQL database in addition to MySQL. And our architects say “no, this should not be run in embedded mode, it will couple the components”. So there is a whole new slew of configurations and installations for stuff that could very easily run inside our main application’s virtual machine/process. And when you think the external variables are already too many – then comes URL rewriting. “Yes, that’s easy, we will just add another rewrite rule”. 6 months later some unlucky developer will be browsing through the code, wondering for hours why the hell a URL doesn’t open. And then he finally realizes the cause is outside the application, opens the apache configuration file, and sees wicked signs all over.

To summarize the previous paragraph – there’s just too much to do on the operations side, and it is (obviously) not programming. Ops people should be really strict about versioning configuration and even versioning whole environment setups (Amazon’s cloud gives a nice option to make snapshots and then deploy them on new instances). But then, when something “doesn’t work”, it’s back to the developers to find the problem in the code. And it’s just not there.

That’s why I have always strived to keep as much stuff as possible in the application itself. NoSQL store? Embedded, please. Message queue? Again, embedded. URL rewrites – your web framework does that. Application server configurations? None, if possible; you can do them per-application. Modifications of the application server startup script? No, thanks. Caching? It’s in-memory anyway, so why would you need a separate process? Every external configuration needed goes to a single file that resides outside the application, and Ops (or devs, or devops) can change that configuration. No more hidden stones to find in /usr/appconf, apache or whatever. Consolidate as much as possible in the piece that you are familiar and experienced with – the code.
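To make the “single file outside the application” idea concrete, here is a minimal Java sketch. Everything named in it is an assumption made for illustration: the class name `ExternalConfig`, the `app.config` system property, and the example keys are hypothetical, not taken from any particular project.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

/** Holds every externally tunable setting, read from one properties file. */
final class ExternalConfig {
    // Hypothetical system property that names the file,
    // e.g. -Dapp.config=/etc/myapp/app.properties
    static final String LOCATION_PROPERTY = "app.config";

    private final Properties values;

    private ExternalConfig(Properties values) {
        this.values = values;
    }

    /** Reads the given properties file; the stream is closed automatically. */
    static ExternalConfig load(Path file) throws IOException {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        }
        return new ExternalConfig(props);
    }

    /** Resolves the file location from the system property and loads it. */
    static ExternalConfig loadFromSystemProperty() throws IOException {
        String location = System.getProperty(LOCATION_PROPERTY);
        if (location == null) {
            throw new IllegalStateException(
                    "Missing -D" + LOCATION_PROPERTY + "=<path to properties file>");
        }
        return load(Paths.get(location));
    }

    /** Returns the value for the key, or the default if the key is absent. */
    String get(String key, String defaultValue) {
        return values.getProperty(key, defaultValue);
    }

    /** Returns the value parsed as an int, or the default if the key is absent. */
    int getInt(String key, int defaultValue) {
        String raw = values.getProperty(key);
        return raw == null ? defaultValue : Integer.parseInt(raw.trim());
    }
}
```

With something like this, the only thing Ops has to know is where that one file lives; the application would call `ExternalConfig.loadFromSystemProperty()` once at startup and read values such as `getInt("cache.size", 100)` from it, with the defaults kept in the code.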

Obviously, not everything can be there. Some databases you can’t run embedded, or you really want to have separate machines. You need a load balancer, and it has to be in front of the application. You need to pass initialization parameters for the virtual machine / process in the startup script. But stick to the bare minimum. If you need to make something transparent to the application, do it with a layer of code, not with scripts and configurations. I don’t know if that aligns somehow with the devops philosophy, because it is more “dev” and less “ops”, but it actually allows developers to do the ops part, because it is kept down to a minimum. And it does not involve ugly scripting languages and two-line-long shell commands.
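As an illustration of “a layer of code” replacing server-side configuration, URL rewriting can be sketched in plain Java with `java.util.regex`. This is only a sketch under assumptions: the `UrlRewriter` class and the example rules are hypothetical, and in a real project the web framework’s own routing or a servlet filter would normally play this role.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Keeps rewrite rules in the code base instead of the web server config. */
final class UrlRewriter {

    /** One rule: a pattern for the whole path and its replacement. */
    private static final class Rule {
        final Pattern pattern;
        final String replacement;

        Rule(Pattern pattern, String replacement) {
            this.pattern = pattern;
            this.replacement = replacement;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    /** Adds a rule; the regex is anchored so it must match the entire path. */
    UrlRewriter rule(String regex, String replacement) {
        rules.add(new Rule(Pattern.compile("^(?:" + regex + ")$"), replacement));
        return this;
    }

    /** Applies the first matching rule, or returns the path unchanged. */
    String rewrite(String path) {
        for (Rule rule : rules) {
            Matcher matcher = rule.pattern.matcher(path);
            if (matcher.matches()) {
                // $1, $2, ... in the replacement refer to captured groups
                return matcher.replaceFirst(rule.replacement);
            }
        }
        return path;
    }
}
```

A rule such as `new UrlRewriter().rule("/articles/(\\d+)", "/content?type=article&id=$1")` then lives in the repository next to the code that serves it, so the developer hunting for a broken URL finds it with an ordinary code search.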

I know I sound like a big *nix noob. And I truly am. But since most of these hacks can be moved into the application, where they are more predictable and easier to read and maintain, I prefer to stay that way. If that is not possible – let them live outside it, but really version them, even in the same repository as the code, and document them.

The main purpose of all that is to improve maintainability and manageability. You have a lot of tools, infrastructure and processes around your code, so make use of them for as much as possible.

Reference: Keep As Much Stuff As Possible In The Application Itself from our JCG partner Bozhidar Bozhanov at the Bozho’s tech blog.