But enough for my hate for shell scripting (and command-line executed perl scripts for that matter). That’s not the only thing that should be kept to minimum. (Btw, this is the ‘whining’ paragraph’, you can probably skip it). The “Getting Started” guide of all databases, message queues, search engines, servers, etc. says “easy to install”. Sure, you just apt-get install it, then go to /usr/lib/foo/bar and change a configuration file, then give permissions to a newly-created user that runs it, oh, and you customize the shell-script to do something, and you’re there. Oh, and /usr/lib/foo/bar – that’s different depending on how you install it and who has installed it. I’ve seen tomcat installed in at least 5 different ways. One time all of its folders (bin, lib, conf, webapps, logs, temp) were in a completely different location on the server. And of course somebody decided to use the built-in connection pool, so the configuration has to be done in the servlet container itself. Use the defaults. Put that application server there and leave it alone. But we need a message queue. And a NoSQL database in addition to MySQL. And our architects say “no, this should not be run in embedded mode, it will couple the components”. So a whole new slew of configurations and installations for stuff that can very easily be run inside our main application virtual machine/process. And when you think the external variables are just too many – then comes URL rewriting. “Yes, that’s easy, we will just add another rewrite rule”. 6 months later some unlucky developer will be browsing through the code wondering for hours why the hell this doesn’t open. And then he finally realizes it is outside the application, opens the apache configuration file, and he sees wicked signs all over.
To summarize the previous paragraph – there’s just too much to do on the operations side, and it is (obviously) not programming. Ops people should be really strict about versioning configuration and even versioning whole environment setups (Amazon’s cloud gives a nice option to make snapshots and then deploy them on new instances). But then, when somethings “doesn’t work”, it’s back to the developers to find the problem in the code. And it’s just not there.
That’s why I have always strived to keep as much stuff as possible in the application itself. NoSQL store? Embedded, please. Message queue? Again. URL rewrites – your web framework does that. Application server configurations? None, if possible, you can do them per-application. Modifications of the application server startup script? No, thanks. Caching? It’s in-memory anyway, why would you need a separate process. Every external configuration needed goes to a single file that resides outside the application, and Ops (or devs, or devops) can change that configuration. No more hidden stones to find in /usr/appconf, apache or whatever. Consolidate as much as possible in the piece that you are familiar and experienced with – the code.
Obviously, not everything can be there. Some databases you can’t run embedded, or you really want to have separate machines. You need a load balancer, and it has to be in front of the application. You need to pass initialization parameters for the virtual machine / process, in the startup script. But stick to the bare minimum. If you need ti make something transparent to the application, do it with a layer of code, not with scripts and configurations. I don’t know if that aligns somehow with the devops philosophy, because it is more “dev” and less “ops”, but it actually allows developers to do the ops part, because it is kept down to a minimum. And it does not involve ugly scripting languages and two-line long shell commands.
I know I sound like a big *nix noob. And I truly am. But as most of these hacks can be put up in the application and be more predictable and easy to read and maintain – I prefer to stay that way. If it is not possible – let them be outside it, but really version them, even in the same repository as the code, and document them.
The main purpose of all that is to improve maintainability and manageability. You have a lot of tools, infrastructure and processes around your code, so make use of them for as much as possible.
Reference: Keep As Much Stuff As Possible In The Application Itself from our JCG partner Bozhidar Bozhanov at the Bozho’s tech blog blog.
A hands-on guide to leveraging NoSQL databases!
NoSQL databases are an efficient and powerful tool for storing and manipulating vast quantities of data. Most NoSQL databases scale well as data grows. In addition, they are often malleable and flexible enough to accommodate semi-structured and sparse data sets. This comprehensive hands-on guide presents fundamental concepts and practical solutions for getting you ready to use NoSQL databases. Expert author Shashank Tiwari begins with a helpful introduction on the subject of NoSQL, explains its characteristics and typical uses, and looks at where it fits in the application stack. Unique insights help you choose which NoSQL solutions are best for solving your specific data storage needs.