

Student Questions about Scala, Part 2

Preface This is the second post answering questions from students in my course on Applied Text Analysis. You can see the first one here. This post generally covers higher level questions, starting off with one basic question that didn’t make it into the first post. Basic Question Q. When I was working with Maps for the homework and tried to turn a List[List[Int]] into a map, I often got the error message that Scala “cannot prove that Int<:<(T,U)”. What does that mean? A. So, you were trying to do the following. scala> val foo = List(List(1,2),List(3,4)) foo: List[List[Int]] = List(List(1, 2), List(3, 4))scala> foo.toMap <console>:9: error: Cannot prove that List[Int] <:< (T, U). foo.toMap ^This happens because you are trying to do the following at the level of a single two-element list, which can be more easily seen in the following. scala> List(1,2).toMap <console>:8: error: Cannot prove that Int <:< (T, U). List(1,2).toMap ^So, you need to convert each two-element list to a tuple, and then you can call toMap on the list of tuples. scala> foo.map{case List(a,b)=>(a,b)}.toMap <console>:9: warning: match is not exhaustive! missing combination Nilfoo.map{case List(a,b)=>(a,b)}.toMap ^ res3: scala.collection.immutable.Map[Int,Int] = Map(1 -> 2, 3 -> 4)You can avoid the warning messages by flatMapping (which is safer anyway). scala> foo.flatMap{case List(a,b)=>Some(a,b); case _ => None}.toMap res4: scala.collection.immutable.Map[Int,Int] = Map(1 -> 2, 3 -> 4)If you need to do this sort of thing a lot, you could use implicits to make the conversion from two-element Lists into Tuples, as discussed in the previous post about student questions. File system access Q. How can I make a script or program pull in every file (or every file in a certain format) from a directory that is given as a command line argument and perform operations on it? A. Easy. Let’s say you have a directory example_dir with the following files. $ ls example_dir/ file1.txt file2.txt file3.txt program1.scala program2.scala program3.py program4.pyI created these with some simple contents. Here’s a bash command that will print out each file and its contents so you can recreate them (and also see a handy command line for loop). $ for i in `ls example_dir`; do echo "File: $i"; cat example_dir/$i; echo; done File: file1.txt Hello.File: file2.txt Nice to meet you.File: file3.txt Goodbye.File: program1.scala println("Hello.")File: program2.scala println("Goodbye.")File: program3.py print("Hello.")File: program4.py print("Goodbye.")So, here’s how we can do the same using Scala. In the same directory that contains example_dir, save the following as ListDir.scala. val mydir = new java.io.File(args(0)) val allfiles = mydir.listFiles val contents = allfiles.map { file => io.Source.fromFile(file).mkString }allfiles.zip(contents).foreach { case(file,content) => println("File: " + file.getName) println(content) }You can now run it as scala ListDir.scala example_dir. If you want to look at only files of a particular type, use filter on the list of files returned by mydir.listfiles. For example, the following gets the Scala files and prints their names. val scalaFiles = mydir.listFiles.filter(_.getName.endsWith(".scala")) println(scalaFiles.mkString("\n"))As an exercise, now consider what you would need to do to recursively explore a directory that has directories and list the contents of all the files that are in it. Tip: you’ll need to use the isDirectory() method of java.io.File. Q. Is it possible to run an R program within a Scala program? 
Like write a Scala program that performs R operations using R. If so, how? Are there directory requirement of some sort? A.Though I haven’t used them, you could look at the JRI (Java-R Interface) or RCaller. For some simple things, you can always take the strategy of saving some data to a file, calling an R program that processes that file and produces some output in one or more files, which you then read back into Scala. This is useful for other things you might want to do, including invoking arbitrary applications to compute and output some values based on data created by your program. Here’s an example of doing something like this. Save the following as something like CallR.scala, and then run scala CallR.scala. It assumes you have R installed. import java.io._val data = List((4,1000), (3,1500), (2,1500), (2,6000), (1,14000), (0,18000))val outputFilename = "vague.dat" val bwriter = new BufferedWriter(new FileWriter(outputFilename))val dataLine = data.map { case(numAdjectives, price) => "c("+numAdjectives+","+price+")" }.mkString(",")bwriter.write( """data = rbind(""" + dataLine + ")" + "\n" + """pdf("out.pdf")""" + "\n" + """plot(data)"""+ "\n" + """data.lm = lm(data[,2] ~ data[,1])""" + "\n" + """abline(data.lm)""" + "\n" + """dev.off()""" + "\n") bwriter.flush bwriter.closeval command = List("R", "-f", outputFilename) scala.sys.process.stringSeqToProcess(command).lines.foreach(println)It takes a set of points as a Scala List[(Int,Int)] and creates a set of R commands to plot the points, fit a linear regression model to the points, plot the regression line, and then output a PDF. I took the particular set of points used here from the example in Jurafsky and Martin in the chapter on maximum entropy (multinomial logistic regression), which is based on a study of how vague adjectives in a house listing affect its purchase price. For example, houses that had four vague adjectives in their listing sold for $1000 over their list price, while ones with one vague adjective sold for $14,000 over list price (read the book Freakonomics for some fascinating discussion of this). Here’s the R code that is produced. data = rbind(c(4,1000),c(3,1500),c(2,1500),c(2,6000),c(1,14000),c(0,18000)) pdf("vague_lm.pdf") plot(data) data.lm = lm(data[,2] ~ data[,1]) abline(data.lm) dev.off()Here is the image produced in vague_lm.pdf.To recap, the basic logic of this process is the following.Have or create some set of points in Scala (which, to be useful, would be based on some computation you ran and now need to go to R for to complete). Use this data to create an R script programatically using Scala code. Run the R script using scala.sys.process.You could also have the R script output text information to a file which you could then read back into Scala and parse to get your results. Note that this is not necessarily the most robust way to do this in general, but it does demonstrate a way to do things like calling system commands from within a Scala program. Another alternative is to look at frameworks like ScalaLab, which aims to support a Matlab-like environment for Scala. It’s on my stack of things to look at, and it would allow one to use Scala to directly do much of what one would want to call out to R and other such languages for. High level questions Q. Since Scala runs over JVM, can we conclude that anything that was written in Scala, can be written in Java? (with loss of performance and may be with lengthly code). A. 
For any two sufficiently expressive languages X and Y, one can write anything in X using Y and vice versa. So, yes. However, in terms of the ease of doing this, it is very easy to translate Java code to Scala, since the latter supports mutable, imperative programming of the kind usually done in Java. If you have Scala code that is functional in nature, it will be much harder to translate easily to Java (though it can of course be done). Efficiency is a different question. Sometimes the functional style can be less efficient (especially if you are limiting yourself to a single machine), so at times it can be advantageous to use while loops and the like. However, for most cases, efficiency of programmer time matters more than efficiency of running time, so quickly putting together a solution using functional means and then optimizing it later — even at the “cost” of being less functional — is, in my mind, the right way to go. Josh Suereth has a nice blog post about this, Macro vs Micro Optimization, highlighting his experiences at Google. Compared to Scala, the amount of code written will almost always be longer in Java, due both to the large amount of boilerplate code and to the higher-level nature of functional programming. I find that Scala programs (written in idiomatic, functional style) converted from Java are generally 1/4th to 1/3rd the number of characters of their Java counterparts. Going from Python to Scala also tends to produce less lengthy code, perhaps 3/4ths to 5/6ths or so in my experience. (Though this depends a great deal on what kind of Scala style you are using, functional or imperative or a mix). Q. Scala seems to be relatively new — so, does it have supporting libraries for common tasks in NLP, like good JSON/XML parsers that you know of? A. Sure. Basically anything that has been written for the JVM is quite straightforward to use with Scala. For natural language processing, we’ll be using the Apache OpenNLP library (which I and Gann Bierner began in 1999 while at the University of Edinburgh), but you can also use other toolkits like the Stanford NLP software, Mallet, Weka, and others. In fact, using Scala often makes it much easier to use these toolkits. There are also Scala specific toolkits that are beginning to appear, including Factorie, ScalaNLP, and Scalabha (which we are using in the class). Scala has native XML support that I find pretty handy, though others wish it weren’t in the language. It is covered in most of the books on Scala, and Dan Spiewak has a nice blog post on it: Working with Scala’s XML Support. The native JSON support isn’t great, but Java libraries for JSON work just fine. Q. General question/comment: Scala lies in the region between object-oriented and functional programming language. My question is — Why? Is it because it makes coding a lot simpler and reduces the number of lines? In that case, I guess python achieves this goal reasonably well, and it has a rich library for processing strings. I am able to appreciate certain things, and ease of getting things done in Scala, but I am not exactly sure why this was even introduced, that too in a somewhat non-standard way (such a mixture of OOP and functional programming paradigm is the first that I have heard of). I’ll defer to Odersky, the creator of Scala. 
This is from his blog post “Why Scala?”: Scala took a risk in that, before it came out, the object-oriented and functional approaches to programming were largely disjoint; even today the two communities are still sometimes antagonistic to each other. But what the team and I have learned in our daily programming practice since then has fully confirmed our initial hopes. Objects and functions play extremely well together; they enable new, expressive programming styles which lend themselves to high-level domain modeling and embedded domain-specific languages. Whether it’s log-analysis at NASA, contract modelling at EDF, or risk analysis at many of the largest financial institutions, Scala-based DSLs seem to spring up everywhere these days. Here are some other interesting reads that touch on these questions: Twitter on Scala; the InfoQ question “Why Scala?”; and Bruce Eckel’s “Scala: The Static Language that Feels Dynamic” (a viewpoint from a Pythonista). Q. Do you see any distinct advantage of using Scala for NLP-related stuff? I know this is not a very specific question, but it would be great if you continue highlighting the difference between Scala and other languages (like Java, Python) so that our understanding becomes clearer and clearer with more examples. A. In many ways, such questions are a matter of personal taste. I used Python and Java before I switched to primarily using Scala. I liked Python for rapid prototyping, and Java for large-scale system development. I find Scala to be as good, or better, for prototyping than Python, and every bit as good as, or better than, Java for large-scale development. Now, I can use a single language — Scala — for most development. The exception is that I still use R for plotting data sets and also doing certain statistical analyses. The transition from Java to Scala was straightforward, and I went from writing Java-as-Scala to a more and more functional style as I got more comfortable with the language. The resulting code is far better designed, making it more robust, more extensible, and more fun. Specifically with respect to NLP, a definite advantage of Scala is that, as mentioned previously, it is really easy to use existing Java libraries (or any JVM library, for that matter). Another is that using a more functional style makes it easier to transition (in terms of both thinking and actual coding) to certain kinds of distributed computing architectures, such as MapReduce. As a really interesting example of Scala and distributed computing, check out Spark. With so much of text analytics being performed on massive datasets, this capability has become increasingly important. Another thing is that the actor-based computing model supported by the Akka library (which is closely tied to the core Scala libraries) also holds many attractions for building language processing systems that need to deal with asynchronous information flows and data processing (FWIW, Akka can be used from Java, though it is far less enjoyable than from Scala). It is also quite handy for creating distributed versions of many classes of machine learning algorithms that can take better advantage of the structure of the solution than the one-size-fits-all MapReduce strategy can. For examples, you can check out the Akka version of Modified Adsorption and the Hadoop version of the same algorithm in the Junto toolkit. At the end of the day, though, whether one language is “better” than another will depend on a given programmer’s preferences and abilities.
For example, a great alternative to Scala is Clojure, which is dynamically typed, also JVM-based, and also functional — even more so than Scala. So, when evaluating this or that language, ask whether you can get more done more quickly and more maintainably. The outcome will be a function of the capabilities of the language and your skill as a programmer. Q. In C++ a class is just a blueprint of an object and it has a size of 1 no matter how many members it has. Does the size of a Scala class depend on its members? Also, is there anything corresponding to “sizeof” operator in Scala? A. I don’t know the answer to this. Any useful responses from readers would be welcome, and I’ll add them to this answer if and when they come in. Reference: Student Questions about Scala, Part 2 from our JCG partner Jason Baldridge at the Bcomposes blog....

Best Of The Week – 2012 – W09

Hello guys, Time for the “Best Of The Week” links for the week that just passed. Here are some links that drew Java Code Geeks attention: * The 10 commandments of good source control management: An amazing collection of source control management best practices, don’t miss it. The examples are for .NET and Subversion, but the principles described are totally generic. Also check out Git Tutorial – Getting Started and Git DVCS – Getting started. * Spring Data – Part 5: Neo4j: This tutorial shows how to get started with Neo4j (install and run a standalone Neo4j server instance) and how to configure a Maven based Spring Data Neo4j project for high-level manipulation of Neo4j graphs. Also see Domain modeling with Spring Data Neo4j and The Persistence Layer with Spring Data JPA. * Hibernate 4.1 Released With Improved Auditing Support: JBoss has released version 4.1 of Hibernate which includes several fixes and improvements. There is also a very interesting new feature in the audit module of Hibernate (previously known as Envers), having an additional history table for each audited Entity. * Cloud Computing Basics: the CAP Theorem: An introduction to CAP theorem (Consistency, Availability and Partition Tolerance) and how that applies to Cloud computing. The theorem states that a distributed system can satisfy any two of these guarantees at the same time, but not all three. * Communicate Business Value to Your Stakeholders: This article states that stakeholders in an organization need to understand what it is you will do for them in a language that creates meaning for them and suggests that this can be achieved by communicating in terms of benefits, not features. * From Java code to Java heap: This article provides insight into the memory usage of Java code, covering the memory overhead of putting an int value into an Integer object, the cost of object delegation, and the memory efficiency of the different collection types. Additionally, it shows how inefficiencies occur in your application and how to choose the right collections to improve your code.* Developer Productivity Report – Part 4: Deployment Pipeline: This research shows that most Production Update processes are done manually, something that was a bit expected (sadly). More specifically, the majority of those surveyed said that their processes for deploying applications to the production environment are generally not automated to any significant degree. * VMware Introduces Spring Hadoop: VMware have announced the availability of Spring Hadoop, which integrates Spring and Hadoop. The project provides a convenient mechanism for the configuration, creation, and execution of various services and utilities such as MapReduce, Hive, Pig, and Cascading jobs via the Spring container. Also see Big Data analytics with Hive and iReport. * Integration of a transaction manager in Tomcat for Spring and OpenJPA: This tutorial shows how to add a JTA transaction manager to Tomcat where a web application should be deployable on a full fledged application server without modification. The open source version of the Atomikos transaction manager was used. Also see GWT Spring and Hibernate enter the world of Data Grids. * Spring 3.1 and MVC Testing Support: In this presentation the TestContext Framework is introduced and is shown how to use @Configuration and environment profiles for testing with Spring 3.1, and the testing support available in Spring MVC. 
Also see Spring MVC Development – Quick Tutorial and Spring Pitfalls: Transactional tests considered harmful. That’s all for this week. Stay tuned for more, here at Java Code Geeks. Cheers, Ilias Tsagklis...

RESTful Web Applications with Jersey and Spring

A couple of months ago, we were tasked with creating an API to expose some functions in our system to third party developers. We chose to expose these functions as a series of REST web services. I got to play with Jersey, the reference implementation of JSR 311 (Java API for Restful Services); this turned out to be a nice surprise, as it proved to be extremely powerful and elegant. In this post, we’ll create a very simple REST web service using Jersey. The sample code used in this post can be found here. REST IN SHORT REST (Representation State Transfer) is not new – it was first proposed in 2000 (Fielding, Architectural Styles and the Design of Network-based Software Architectures) – but it’s still quite underused today, having only really come into fashion these last couple of years. It is used to describe resources through a URL, and allows the manipulation of such resources. The idea is to leverage the HTTP Protocol to create a platform-agnostic, stateless, cache-friendly interface between client and server. While REST can be applied to other protocols, we are only concerned with HTTP at this time. In simpler terms, a URL like “http://www.myserver.com/files/text.txt” describes a resource which is a file called text.txt, and lives in the myserver.com domain. Nothing fancy there; you can point your browser to that file, and your browser will send a GET request to the server to fetch it. You don’t even need to write any application to do that; any client and server will communicate in that manner. It gets more interesting with the other request methods; everyone reading this should be familiar with the POST method (typically used for forms). In a REST application, POSTing to a URL means you want to edit the resource at that URL. The less common PUT and DELETE methods are used to create and delete resources, respectively; for example, PUT http://www.myserver.com/files/text.txt should create a text file, usually with the contents of the body of the request. It is worth noting that in some systems, particularly ones which are meant to interact directly with a browser, the POST method is sometimes hijacked for these purposes, since some browsers don’t deal with these two very well. REST also makes use of headers, to control caching, or to determine which content types or languages the client is expecting; a request is, after all, a plain old HTTP request. It’s nice, it’s clean, it’s flexible, and it won’t do your head in with the amount of plumbing you’d need to provide the same functionality with, say, SOAP. The reason we went with it should be clear enough at this point. ANATOMY OF A REST RESOURCE CLASS While there’s plenty going on behind the scenes, Jersey does a very good job of hiding the complexity behind its nice clean annotations. Consider the following: @Path("/people/{code}") public class Individual {@GET @Produces({"application/json", "application/xml"}) public Object find(@PathParam("code") String code) { ... }@DELETE public void remove(@PathParam("code") String code) { ... } }This is a simple class which can look up or remove the entry for a person, based on a unique code. The first annotation, Path, specifies which URL this class (or method – you can override the path at method level, should you wish to do so) maps to. In this case, we’re saying that we want this class to handle requests made to “[whatever domain]/people”; we will also be expecting a value after “people”, which we shall consider to be the unique code for the person we want – that’s the value in braces, there. 
We can use multiple variables in the path; we could, for example, say “/team/{team_id}/{position}” or even “/team/{team_id}/staff/{position}” to get the details of the persons filling a given position in a team, depending on how verbose we want to be. We can also impose restrictions on the parameters; if we wanted code to be a numeric value, for example, we can define it as “{code: [0-9]*}”; the definition accepts regular expression patterns. The GET and DELETE annotations specify which Java method should handle which HTTP method. There are also POST and PUT annotations for those methods. The PathParam annotation grabs parameters from the request URL and passes them on to the method – in this case, it grabs the code parameter. Self-explanatory so far – there are even FormParam and HeaderParam versions to grab values from POSTed form fields or request headers, respectively. I found the Produces annotation quite interesting. The parameter to the annotation takes a collection of MIME types, which declare the return types the method is capable of generating. In both cases above, we can serve JSON or XML responses – the one that gets returned is picked based on the value of the request’s ACCEPT header – if the request accepts more than one of the return types provided by the method, the first one listed in the ACCEPT header is preferred.

Returning whatever
When returning an instance of a class which is annotated with XmlRootElement, Jersey takes care of determining the return type and converting the object into the required representation. No fuss, no muss. If you need to do some fancier formatting – converting to a PDF or an HTML page, for example – it should be as simple as writing and registering a marshaller for the given type, though I haven’t delved this deep yet.

WIRING IT ALL UP WITH SPRING
Of course, that resource class is just sitting pretty right now. To put it up in a web application, we’ll need to set it up, and Spring is what we’ll use for this. Bear with me; it’s not very verbose, for once. First, we need to tell Spring that our resource is a configurable component. To do this, we can just plonk the Component annotation on the class. Then, we need to define the scope; since REST is meant to be stateless, we’ll just go ahead and declare a request scope, using the Scope annotation with a value of “request”. No surprises yet! Our class declaration now looks like this:

@Component
@Scope("request")
@Path("/people/{code}")
public class Individual { ... }

Finally, just tell Spring where to look for your components:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:context="http://www.springframework.org/schema/context"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd
                           http://www.springframework.org/schema/context
                           http://www.springframework.org/schema/context/spring-context.xsd">
  <context:component-scan base-package="your.namespace.here" />
</beans>

Resources that depend on other beans can then use the – surprise – Inject annotation to populate their fields. That’s all… you’ve got a REST resource ready to go.

IT’S NOT ALL WALKING
That’s really all there is to it. Apart from a few problems I’m still looking into – JAXB totally loses its marbles when it finds a circular reference, making some object models difficult to marshal – JSR 311 provides a really clean way to put this all together.
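To make the “returning whatever” part concrete, here is a minimal, hypothetical sketch of the kind of JAXB-annotated classes Jersey can marshal on its own; the Person and PersonList names and fields are illustrative only, not taken from the sample application. The wrapper class is the sort of container that the collection gripe just below refers to.

// Person.java - a representation class Jersey/JAXB can convert to XML or JSON
import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement
public class Person {
    private String code;
    private String name;

    public Person() {}  // JAXB needs a no-argument constructor

    public String getCode() { return code; }
    public void setCode(String code) { this.code = code; }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

// PersonList.java - a simple wrapper, since returning a raw List<Person> is awkward
import java.util.ArrayList;
import java.util.List;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement
public class PersonList {
    private List<Person> people = new ArrayList<Person>();

    @XmlElement(name = "person")
    public List<Person> getPeople() { return people; }
    public void setPeople(List<Person> people) { this.people = people; }
}

With classes like these in place, the find method in the resource above could simply return a Person (or a PersonList) and let the Produces annotation decide whether the response goes out as JSON or XML.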
There’s one gripe; collection return types seem to be a problem. This can be bypassed by wrapping collections in a container, but it does seem like an unnecessary step. THE SAMPLE APPLICATION The sample application can list, load or delete individual entries from an in-memory map on the server via jQuery ajax calls. It has been packaged in two WAR files (server and client). Due to browser sandboxing, make sure that both client and server packages are on the same domain; this restriction does not exist if you connect to the server programmatically. I suppose I could have shown an example of POST or PUT, but really, it’s straightforward enough and I really hate writing forms. Reference: RESTful Web Applications with Jersey and Spring from our JCG partner Karl Agius at the The Simple Part blog....

Ubuntu: Installing Apache Portable Runtime (APR) for Tomcat

After reading “Introducing Apache Tomcat 6? presentation by Mladen Turk I decided to enable Apache Portable Runtime (APR) native library for Tomcat. It was supposed to be as easy as sudo ./configure sudo make sudo make installbut as you may guess, it was a little bit more than that. 1. Installing Apache APR. “Most Linux distributions will ship packages for APR” – those of Linode don’t, I had a barebone Ubuntu 10.10 box without even “gcc” and “make”, let alone Apache APR. Thanks God, networking was not an issue, unlike last time. wget http://apache.spd.co.il/apr/apr-1.4.5.tar.gz tar -xzf apr-1.4.5.tar.gz rm apr-1.4.5.tar.gz cd apr-1.4.5/ sudo apt-get install make sudo ./configure sudo make sudo make install2. Installing Tomcat Native. wget http://off.co.il/apache//tomcat/tomcat-connectors/native/1.1.20/source/tomcat-native-1.1.20-src.tar.gz tar -xzf tomcat-native-1.1.20-src.tar.gz rm tomcat-native-1.1.20-src.tar.gz cd tomcat-native-1.1.20-src/jni/native sudo ./configure --with-apr=/usr/local/aprThe result was checking build system type... x86_64-unknown-linux-gnu .. checking for APR... yes .. checking for JDK location (please wait)... checking Try to guess JDK location... configure: error: can't locate a valid JDK locationOuch! “Can’t locate a valid JDK location” ? On my machine? $ which java /home/user/java/jdk/bin/java $ echo $JAVA_HOME /home/user/java/jdk $ java -version java version "1.6.0_24" Java(TM) SE Runtime Environment (build 1.6.0_24-b07) Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)But for some reason “tomcat-native-1.1.20-src/jni/native/configure” script didn’t see my “JAVA_HOME” variable no matter what and even installing “sun-java6-jdk” didn’t help much. After patching the “configure” script to dump locations it was looking for “valid JDK” I had: .. configure: [/usr/local/1.6.1] configure: [/usr/local/IBMJava2-1.6.0] configure: [/usr/local/java1.6.0] configure: [/usr/local/java-1.6.0] configure: [/usr/local/jdk1.6.0] configure: [/usr/local/jdk-1.6.0] configure: [/usr/local/1.6.0] configure: [/usr/local/IBMJava2-1.6] configure: [/usr/local/java1.6] configure: [/usr/local/java-1.6] configure: [/usr/local/jdk1.6] configure: [/usr/local/jdk-1.6] ..Ok then, here you have it now: sudo ln -s ~/java/jdk/ /usr/local/jdk-1.6 sudo ./configure --with-apr=/usr/local/apr sudo make sudo make installAnd with .. export LD_LIBRARY_PATH='$LD_LIBRARY_PATH:/usr/local/apr/lib' ..I now had a beautiful log message in “catalina.out“: .. Mar 7, 2011 11:51:02 PM org.apache.catalina.core.AprLifecycleListener init INFO: Loaded APR based Apache Tomcat Native library 1.1.20. Mar 7, 2011 11:51:02 PM org.apache.catalina.core.AprLifecycleListener init INFO: APR capabilities: IPv6 [true], sendfile [true], accept filters [false], random [true]. Mar 7, 2011 11:51:03 PM org.apache.coyote.AbstractProtocolHandler init ..As soon as “evgeny-goldin.org” moves to its new location on the brand-new Linode box it will benefit from this performance optimization. I’ll describe the migration process and reasons for it a bit later, once it is done. Reference: Ubuntu: Installing Apache Portable Runtime (APR) for Tomcat from our JCG partner Evgeny Goldin at the Goldin++ blog....

Open Source Rules of Engagement

The Eclipse Development Process (EDP) defines–in its section on principles–three open source rules of engagement: Openness, Transparency, and Meritocracy:

Open – Eclipse is open to all; Eclipse provides the same opportunity to all. Everyone participates with the same rules; there are no rules to exclude any potential contributors, which include, of course, direct competitors in the marketplace.
Transparent – Project discussions, minutes, deliberations, project plans, plans for new features, and other artifacts are open, public, and easily accessible.
Meritocracy – Eclipse is a meritocracy. The more you contribute, the more responsibility you will earn. Leadership roles in Eclipse are also merit-based and earned by peer acclaim.

In more concise terms, transparency is about inviting participation; openness is about actually accepting it; and meritocracy is a means of limiting participation to those individuals who have demonstrated the desire and means to actually participate.

Transparency is one of those things that I believe most people understand in principle. Do everything in public: bug reports are public, along with all discussion; mailing lists are public; team meetings are public and minutes are captured (and disseminated). By operating transparently, the community around the project can gain insight into the direction that the project is moving and adjust their plans accordingly.

In practice, however, transparency is difficult to motivate in the absence of openness. What is the value to the community of discussing every little detail of an implementation in public? Does anybody really care? The fact of the matter is that a lot of people really don’t care. Most users of Eclipse are blissfully unaware that we even have bug tracking software and mailing lists. But some people do care, and transparency is a great way to hook those people who do care and get them to participate.

A lot of open source projects understand transparency. A lot, however, don’t understand openness. They’re not the same thing. To be “open” means that a project is “open” to participation. More than that, an “open” project invites and actively courts participation. Participation in an open source project takes many forms. It starts by creating bug reports and providing patches, tests, and other bits of code. Over time, contribution increases, and–eventually–some contributors become full-blown members of the project. Courting the community for contributors should be one of the first-class goals of every open source project.

But openness isn’t just about getting more help to implement your evil plans for world domination. It’s also about allowing participants to change your evil plans for world domination. Openness is about being open to new ideas, even–as the EDP states–if those new ideas come from your direct competitors in the marketplace. A truly open project actively courts diversity. Having different interests working together is generally good for the overall health of an open source project and the community that forms around it. Reference: Open Source Rules of Engagement from our JCG partner Wayne Beaton at the Eclipse Hints, Tips, and Random Musings blog....

Groovy 1.8.0 – meet JsonBuilder!

Groovy 1.8.0 released in April brought a lot of new features to the language, one of them is native JSON support through JsonSlurper for reading JSON and JsonBuilder for writing JSON. I recently used JsonBuilder in one of my projects and initially experienced some difficulties in understanding how it operates. My assumption was that JsonBuilder works similarly to MarkupBuilder but as I have quickly found out, it really doesn’t. Let’s take a simple example. Assume we have a class Message that we would like to serialize to XML markup and JSON. @groovy.transform.Canonical class Message { long id String sender String text } assert 'Message(23, me, some text)' == new Message( 23, 'me', 'some text' ).toString()Here I used Groovy 1.8.0 @Canonical annotation providing automatic toString(), equals() and hashCode() and a tuple (ordered) constructor. Let’s serialize a number of messages to XML. def messages = [ new Message( 23, 'me', 'some text' ), new Message( 24, 'me', 'some other text' ), new Message( 25, 'me', 'same text' )] def writer = new StringWriter() def xml = new groovy.xml.MarkupBuilder( writer ) xml.messages() { messages.each { Message m -> message( id : m.id, sender : m.sender, text : m.text )} } assert writer.toString() == """ <messages> <message id='23' sender='me' text='some text' /> <message id='24' sender='me' text='some other text' /> <message id='25' sender='me' text='same text' /> </messages>""".trim()Well, that was pretty straightforward. Let’s try to do the same with JSON. def json = new groovy.json.JsonBuilder() json.messages() { messages.each { Message m -> message( id : m.id, sender : m.sender, text : m.text )} } assert json.toString() == '{"messages":{"message":{"id":25,"sender":"me","text":"same text"}}}'Wow, where did all other messages go? Why only one last message in the list was serialized? How about this: json = new groovy.json.JsonBuilder() json.messages() { message { id 23 sender 'me' text 'some text' } message { id 24 sender 'me' text 'some other text' } } assert json.toString() == '{"messages":{"message":{"id":24,"sender":"me","text":"some other text"}}}'Same story. Initially I was puzzled, but then JsonBuilder source code showed that every invocation overrides the previous content: JsonBuilder(content = null) { this.content = content } def call(Map m) { this.content = m return content } def call(List l) { this.content = l return content } def call(Object... args) { this.content = args.toList() return this.content } def call(Closure c) { this.content = JsonDelegate.cloneDelegateAndGetContent(c) return content }As you see, one should invoke JsonBuilder exactly once, passing it a Map, List, varargs or Closure. This makes JsonBuilder very different from MarkupBuilder which can be updated as many times as needed. It could be caused by the JSON itself, whose format is stricter than free-form XML markup: something that started as a JSON map with a single Message, can not be made into array of Messages out of sudden. The argument passed to JsonBuilder (Map, List, varargs or Closure) can also be specified in constructor so there’s no need to invoke a builder at all. You can simply initialize it with the corresponding data structure and call toString() right away. Let’s try this! 
def listOfMaps = messages.collect{ Message m -> [ id : m.id, sender : m.sender, text : m.text ]} assert new groovy.json.JsonBuilder( listOfMaps ).toString() == '''[{"id":23,"sender":"me","text":"some text"}, {"id":24,"sender":"me","text":"some other text"}, {"id":25,"sender":"me","text":"same text"}]'''. readLines()*.trim().join()Now it works :) After converting the list of messages to the list of Maps and sending them to the JsonBuilder in one go, the String generated contains all messages from the list. All code above is available in Groovy web console so you are welcome to try it out. Btw, for viewing JSON online I recommend an excellent “JSON Visualization” application made by Chris Nielsen. “Online JSON Viewer” is another popular option, but I much prefer the first one. And for offline use “JSON Viewer” makes a good Fiddler plugin.P.S. If you need to read this JSON on the client side by sending, say, Ajax GET request, this can be easily done with jQuery.get(): <script type="text/javascript"> var j = jQuery; j( function() { j.get( 'url', { timestamp: new Date().getTime() }, function ( messages ){ j.each( messages, function( index, m ) { alert( "[" + m.id + "][" + m.sender + "][" + m.text + "]" ); }); }, 'json' ); }); </script>Here I use a neat trick of a j shortcut to avoid typing jQuery too many times when using $ is not an option. Reference: Groovy 1.8.0 – meet JsonBuilder! from our JCG partner Evgeny Goldin at the Goldin++ blog....

Best Practices for JavaFX Mobile Applications, Part 1

As everybody who is interested in JavaFX will know by now, JavaFX Mobile was released a short while ago. It was a hell of a ride, that’s for sure. I felt so exhausted, I did not even have the energy to blog during the release… But by now I feel recovered and want to start a little series about lessons we have learned while preparing the release and give some hints how to improve the performance of JavaFX Mobile applications. WARNING: The tips I am giving here are true for the current version of JavaFX Mobile, which is part of the JavaFX 1.1 SDK. In future versions the behavior will change, the current bad performance of the mentioned artifacts will be optimized away or at least significantly improved. Everything I am writing about here is a snap-shot, nothing should be understood as final! Item 1: Avoid unnecessary bindings Bindings are very convenient, without any doubt one of the most valuable innovations in JavaFX Script. Unfortunately they come with a price. The generated boiler-plate code is usually not as small and fast as a manual implementation would be. Especially complex dependency-structures tend to impose a severe penalty on performance and footprint. For this reason it is recommended to avoid bindings as much as possible. Often the same functionality can be implemented with triggers. One should not use bindings to avoid the hassle of dealing with the initialization order. And it certainly makes no sense to bind to a constant value. Lazy bindings are most of the time (but not always!) faster if a bound variable is updated more often then read, but they are still not as fast as manual implementations. Example A common use-case is a number of nodes which positions and sizes depend on the stage-size. A typical implementation uses bindings to achieve that. Here we will look at a simple example, which resembles such a situation. The scene consists of three rectangles which are laid out diagonally from the top-left to the bottom-right. The size of the rectangle is a quarter of the screen-size. Code Sample 1 shows an implementation with bindings. def rectangleWidth: Number = bind stage.width * 0.25; def rectangleHeight: Number = bind stage.height * 0.25;def stage: Stage = Stage { scene: Scene { content: for (i in [0..2]) Rectangle { x: bind stage.width * (0.125 + 0.25*i) y: bind stage.height * (0.125 + 0.25*i) width: bind rectangleWidth height: bind rectangleHeight } } }Code Sample 1: Layout calculated with bindings The first question one should think about is wether the bindings are really necessary. On a real device the screen-size changes only when the screen orientation is switched (provided that the device supports this functionality). If our application does not support screen rotation, the layout can be defined constant. One possible solution to reduce the number of bindings is shown in Code Sample 2. Two variables width and height are introduced and bound to stage.width and stage.height respectively. Their only purpose is to provide triggers for stage.width and stage.height, since we do not want to override the original triggers. Position and size of the rectangles are calculated manually in the triggers. 
def r = for (i in [0..2]) Rectangle {}def stage = Stage { scene: Scene {content: r} }def height = bind stage.height on replace { def rectangleHeight = height * 0.25; for (i in [0..2]) { r[i].height = rectangleHeight; r[i].y = height * (0.125 + 0.25*i) } }def width = bind stage.width on replace { def rectangleWidth = width * 0.25; for (i in [0..2]) { r[i].width = rectangleWidth; r[i].x = width * (0.125 + 0.25*i) } }Code Sample 2: Layout calculated in trigger Without any doubt, the code in Code Sample 1 is more elegant. But measuring the performance of both snippets in the emulator, it turned out the code in Code Sample 2 is almost twice as fast. Further below we are going to see about the second tip to increase performance of JavaFX Mobile applications. I think this and the previous one are the most important ones. WARNING: The tips I am giving here are true for the current version of JavaFX Mobile, which is part of the JavaFX 1.1 SDK. In future versions the behavior will change, the current bad performance of the mentioned artifacts will be optimized away or at least significantly improved. Everything I am writing about here is a snap-shot, nothing should be understood as final! Item 2: Keep the scenegraph as small as possible Behind the scenes of the runtime a lot of communication takes place to update the variables of the nodes in a scenegraph. The more elements a scenegraph has, the more communication is required. Therefore it is critical to keep the scenegraph as small as possible. Especially animations tend to suffer from a large scenegraph. It is bad practice to keep a node in the scenegraph at all times and control its visibility via the visible-flag or its opacity. Invisible nodes in the scenegraph are still part of the communication-circus in the background. Instead one should remove nodes from the scenegraph and add them only when required. This approach has one drawback though. Adding or removing nodes takes longer than setting the visibility. Therefore it might not be appropriate in situations were immediate responses are critical. Example 1 Often one has a set of nodes of which only one is visible. These can be for example different pages, or nodes to visualize different states of an element. One might be tempted to add all nodes to the scenegraph and set only the current as visible. Code Sample 1 shows a simplified version of this approach. Three colored circles are created to visualize some kind of state (red, yellow, green). Only one node is visible at any time. (Let’s ignore for a second that this could simply be achieved by changing the fill-color of a single circle. In real life applications one would probably have images or more complex shapes for visualizations and simply changing the color would not work.) def colors = [Color.GREEN, Color.YELLOW, Color.RED];var state: Integer;Stage { scene: Scene { content: for (i in [0..2]) Circle { centerX: 10 centerY: 10 radius: 10 fill: colors[i] visible: bind state == i } } }Code Sample 1: Using visibility to switch between nodes This results in three nodes in the scenegraph although only one is shown. This should be refactored to ensure that only the visible node is in the scenegraph. Code Sample 2 shows one possible implementation. 
def colors = [Color.GREEN, Color.YELLOW, Color.RED];var state: Integer on replace oldValue { insert nodes[state] into stage.scene.content; delete nodes[oldValue] from stage.scene.content; }def nodes = for (i in [0..2]) Circle { centerX: 10 centerY: 10 radius: 10 fill: colors[i] }def stage = Stage {scene: Scene{}}Code Sample 2: Adding and removing nodes when required The code in Code Sample 1 is more compact, but Code Sample 2 reduced the number of nodes in the scenegraph from three to one. While tuning some of the demos for the JavaFX Mobile release, we were able to reduce the number of nodes in the scenegraph by 50% and more, simply by ensuring that only visible nodes are part of it. Example 2 If nodes are shown and hidden with some kind of animation, adding and removing the node to the scenegraph becomes extremely simple. One only needs to implement an action at the beginning of the fadeIn-animation and at the end of the fadeOut-animation to add respectively remove the node. Code Sample 3 shows such a usage where a simple message-box is shown and hidden by changing the opacity. def msgBox = Group { opacity: 0.0 content: [ Rectangle {width: 150, height: 40, fill: Color.GREY}, Text {x: 20, y: 20, content: "Hello World!"} ] }def fadeIn = Timeline { keyFrames: [ KeyFrame { action: function() {insert msgBox into stage.scene.content} }, at (1s) {msgBox.opacity => 1.0 tween Interpolator.LINEAR} ] }def fadeOut = Timeline { keyFrames: KeyFrame { time: 1s values: msgBox.opacity => 0.0 tween Interpolator.LINEAR action: function() {delete msgBox from stage.scene.content} } }def stage = Stage {scene: Scene{}}Code Sample 3: Using fadeIn- and fadeOut-animations to add and remove nodes. Reference: Best Practices for JavaFX Mobile Applications & Best Practices for JavaFX Mobile Applications 2 from our JCG partner Michael Heinrichs at the Mike’s Blog....

Simple but powerful DSL using Groovy

In one of my projects we had a very complicated domain model, which included more than a hundred different domain object types. It was a pure Java project and, honestly, Java is very verbose with respect to object instantiation, initialization and setting properties. Suddenly, a new requirement came up: allow users to define and use their own object models. So … the journey began. We ended up with the idea that some kind of domain language for describing all those object types and relations was required. Here Groovy came to the rescue. In this post I would like to demonstrate how powerful and expressive a simple DSL written using Groovy builders can be. As always, let’s start with the POM file for our sample project:

<project>
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.example</groupId>
  <artifactId>dsl</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.10</version>
    </dependency>
    <dependency>
      <groupId>org.codehaus.groovy</groupId>
      <artifactId>groovy-all</artifactId>
      <version>1.8.4</version>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>org.codehaus.gmaven</groupId>
        <artifactId>gmaven-plugin</artifactId>
        <version>1.4</version>
        <configuration>
          <providerSelection>1.8</providerSelection>
        </configuration>
        <executions>
          <execution>
            <goals>
              <goal>compile</goal>
              <goal>testCompile</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>2.3.1</version>
        <configuration>
          <source>1.6</source>
          <target>1.6</target>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

I will use the latest Groovy version, 1.8.4. Our domain model will include three classes: Organization, User and Group. Each Organization has a mandatory name, some users and some groups. Each group can have some users as members. Pretty simple, so here are our Java classes.

Organization.java

package com.example;

import java.util.ArrayList;
import java.util.Collection;

public class Organization {
    private String name;
    private Collection<User> users = new ArrayList<User>();
    private Collection<Group> groups = new ArrayList<Group>();

    public String getName() { return name; }
    public void setName( final String name ) { this.name = name; }

    public Collection<Group> getGroups() { return groups; }
    public void setGroups( final Collection<Group> groups ) { this.groups = groups; }

    public Collection<User> getUsers() { return users; }
    public void setUsers( final Collection<User> users ) { this.users = users; }
}

User.java

package com.example;

public class User {
    private String name;

    public String getName() { return name; }
    public void setName( final String name ) { this.name = name; }
}

Group.java

package com.example;

import java.util.ArrayList;
import java.util.Collection;

public class Group {
    private String name;
    private Collection<User> users = new ArrayList<User>();

    public void setName( final String name ) { this.name = name; }
    public String getName() { return name; }

    public Collection<User> getUsers() { return users; }
    public void setUsers( final Collection<User> users ) { this.users = users; }
}

Now we have our domain model. Let’s think about the way a regular user can describe their own organization with its users, groups and the relations between all these objects. Primarily, we are talking about some kind of human-readable language simple enough for a regular user to understand. Meet Groovy builders.
package com.example.dsl.samples

class SampleOrganization {
    def build() {
        def builder = new ObjectGraphBuilder(
            classLoader: SampleOrganization.class.classLoader,
            classNameResolver: "com.example" )

        return builder.organization( name: "Sample Organization" ) {
            users = [
                user( id: "john", name: "John" ),
                user( id: "samanta", name: "Samanta" ),
                user( id: "tom", name: "Tom" )
            ]

            groups = [
                group( id: "administrators", name: "administrators", users: [ john, tom ] ),
                group( id: "managers", name: "managers", users: [ samanta ] )
            ]
        }
    }
}

And here is a small test case which verifies that our domain model is as expected:

package com.example.dsl

import static org.junit.Assert.assertEquals
import static org.junit.Assert.assertNotNull

import org.junit.Test

import com.example.dsl.samples.SampleOrganization

class BuilderTestCase {
    @Test
    void 'build organization and verify users, groups' () {
        def organization = new SampleOrganization().build()

        assertEquals 3, organization.users.size()
        assertEquals 2, organization.groups.size()
        assertEquals "Sample Organization", organization.name
    }
}

I am using this simple DSL again and again across many projects. It really simplifies the creation of complex object models. Reference: Simple but powerful DSL using Groovy from our JCG partner Andrey Redko at the Andriy Redko {devmind} blog...

The problems in Hadoop – When does it fail to deliver?

Hadoop is a great piece of software. It is not original but that certainly does not take away its glory. It builds on parallel processing, a concept that’s been around for decades. Although conceptually unoriginal, Hadoop shows the power of being free and open (as in beer!) and most of all shows about what usability is all about. It succeeded where most other parallel processing frameworks failed. So, now you know that I’m not a hater. On the contrary, I think Hadoop is amazing. But, it does not justify some blatant failures on the part of Hadoop, may it be architectural, conceptual or even documentation wise. Hadoop’s popularity should not shield it from the need to re-enginer and re-work problems in the Hadoop implementation. The point below are based on months of exploring and hacking around Hadoop. Do dig in.Did I hear someone say “Data Locality”? Hadoop harps over and over again on data locality. In some workshops conducted by Hadoop milkers, they just went on and on about this. They say whenever possible, Hadoop will attempt to start a task on a block of data that is stored locally on that node via HDFS. This sounds like a super feature, doesn’t it? It saves so much of bandwidth without having to transfer TBs of data, right? Hellll, no. It does not. What this means is that first you have to figure out a way of getting data into HDFS, the Hadoop Distributed File System. This is non trivial, unless you live in the last decade and all your data exists as files. Assuming that you do, let’s transfer the TBs of data over to HDFS. Now, it will start doing it’s whole “data locality” thing. Ermm, OK. Am I hit by a wave of brilliance or isn’t it what’s is supposed to do anyway? Let’s get our facts straight. To use Hadoop, our problem should be able to execute in parallel. If the problem or a at least a sub-problem can’t be parallelized it won’t gain much out of Hadoop. This means the task algorithm is independent of any specific part of the data it processes. Further simplifying this would be saying, any task can process any section of the data. So, doesn’t that mean the “data locality” thing is the obvious thing to do? Why, would the Hadoop developers even write some code that would make a task process data in another node unless something goes horribly wrong. The feature would be if it was doing otherwise! If a task has finished operating on the node’s local data and then would transfer data from another node and process this data, that would be a worthy feature of the conundrum. At least that would be worthy of noise. Would you please put everything back into files Do you have nicely structured data in databases? Maybe, you became a bit fancy and used the latest and greatest NoSQL data store? Now let me write down what you are thinking. “OK, let’s get some Hadoop jobs to run on this, cause I want to find all this hidden gold mines in my data, that will get me a front page of Forbes.” I hear you. Let’s get some Hadoop jobs rolling. But wait! What the …..? Why are all the samples in text files. A plethora of examples using CSV files, tab delimited files, space delimited files, and all other kind of neat files. Why is everyone going back a few decades and using files again? Haven’t all these guys heard of DBs and all that fancy stuff. It seems that you were too early an adopter of Data Stores. Files are the heroes of the Hadoop world. 
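For readers who have not met them, the “snazzy word count samples” mentioned just below look roughly like this. This is a rough sketch of the standard mapper in its usual textbook shape (not code from the original post), using the org.apache.hadoop.mapreduce API; note that the input arrives as byte offsets and lines of text, which is exactly the file-centric assumption being criticized here.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// The canonical getting-started example: input is a line of text from a file in HDFS,
// output is (word, 1) pairs. Everything assumes text files from the very first step.
public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        StringTokenizer tokens = new StringTokenizer(line.toString());
        while (tokens.hasMoreTokens()) {
            word.set(tokens.nextToken());
            context.write(word, ONE);   // emit (word, 1) for the reducer to sum
        }
    }
}

A matching reducer then simply sums the counts for each word.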
If you want to use Hadoop quickly and easily, the best path for you right is to export your data neatly into files and run all those snazzy word count samples (Pun intended!). Because without files Hadoop can’t do all that cool “data locality” shit. Everything has to be in HDFS first. So, what would you do to analyze your data in the hypothetical FUHadoopDB? First of all, implement about 10+ classes necessary to split and transfer data into the Hadoop nodes and run your tasks. Hadoop needs to know how to get data from FUHadoopDB, so let’s assume this is acceptable. Now, if you don’t store it in HDFS, you won’t get the data locality shit. If this is the case, when the task runs, they themselves will have to pull data from the FUHadoopDB to process the data. But, if you want the snazzy data locality shit you need to pull data from FUHadoopDB and store it in HDFS. You will not incur the penalty of pulling data while the tasks are running, but you pay it at the preparation stage of the job, in the form of transferring the data into HDFS. Oh and did I mention the additional disk space you would need to store the same data in HDFS. I wanted to save that disk space, so I chose to make my tasks pull data while running the tasks. The choice is yours. Java is OS independent, isn’t it? Java has its flaws but for the most part it runs smoothly on most OSs. Even if there are some OS issues, it can be ironed out easily. The Hadoop folks have issued document mostly based on Linux environments. They say Windows is supported, but ignored those ignorant people by not providing adequate documentation. Windows didn’t even make it to the recommended production environments. It can be used as a development platform, but then you will have to deploy it on Linux. I’m certainly not a windows fan. But if I write a Java program, I’d bother to make it run on Windows. If not, why the hell are you using Java? Why the trouble of coming up with freaking bytecode? Oh, the sleepless nights of all those good people who came up with byte code and JVMs and what not have gone to waste. CS 201: Object Oriented Programming If you are trying to integrate Hadoop into your platform, think again. Let me take the liberty of typing your thoughts. “Let’s just extend a few interfaces and plugin my authentication mechanism. It should be easy enough. I mean these guys designed the world’s greatest software that will end world hunger.”. I hear you again. If you are planning to do this, don’t. It’s like OOP anti patterns 101 in there. So many places that would say “if (kerberos)” and execute some security specific function. One of my colleagues went through this pain, and finally decided to that it’s easier to write keberos based authentication for his software and then make it work with Hadoop. With great power comes great responsibility. Hadoop fails to fulfil this responsibility. Even with these issues, Hadoop’s popularity seems to be catching significant attention, and its rightfully deserved. Its ability to commodotize big data analytics should be exalted. But it’s my opinion that it got way too popular way too fast. The Hadoop community needs to have another go at revamping this great piece of software. Reference: The problems in Hadoop – When does it fail to deliver? from our JCG partner Mackie Mathew at the dev_religion blog...

15 Tenets For The Software Engineer

Many people talk about the things a software engineer needs to know in order to be successful in their job. Other people talk about the traits needed to be successful. Typically, these posts may read differently, but there are many similarities between the two. In reality, a software engineer can never really be successful without looking at both types of posts. The list of 15 tenets below is my attempt to consolidate the ideas into one handy list for your review.

Remember the basics. If you forget the basics of a programming language, you lose your foundational knowledge. That is never a good thing.

Always assume the worst case. If you had a formal computer science education, you learned about big-O notation. Knowing why an algorithm has no chance of performing well is a good thing. Figuring out why a particular use case seems much slower than others is how you stay successful.

Test your code. Ensure you have tests for your code, whether you follow TDD or any other method. Depending on the type of test, you may want to target a different level of coverage, but you should still write as many tests as you can.

Do not employ new technologies because they are new, use them because they solve a problem. As technologists, we tend to follow the hot new tools in the hope of finding a silver bullet. Utility is the key, not coolness.

Read, a lot. If you are not reading about our industry, you will fall behind, and that could have career-threatening complications.

Try new techniques and technologies, a lot. Yes, I said not to use new technologies just because they are new, but you do need to try new things in order to determine if something new is useful. Also, trying new things helps you learn and keep current in your industry.

Fail, you will learn something. At the minimum, you will learn what does not work and you can refine your solutions. In some cases, you can even consider the failure a small success.

Ship the damn software. Sometimes you just need to get the job done, but you must be aware of technical debt. If you continuously just ship software without removing technical debt, you are well on your way to creating a nightmare when a major production issue arises.

Do it the “right way”. Most developers have an idea of the “right way” to implement a design, but that may not always be what project management wants. This is almost a contradiction to the previous “ship the damn software” rule, but there is a balance that needs to be met.

Leave the code better than how you found it. Instead of preaching the benefits of refactoring, think of whether you want to maintain the pile of code that keeps getting worse. If you clean it up a little each time you modify it, then it will not be a terrible mess.

Think about concurrent access. If you are building a web application, and I don’t mean the scale of Facebook, weird issues may arise under load. Even an application with 100 concurrent users can start to see weird issues when there are concurrent reads and writes on things like HashMaps. This is just the start of the problems as well.
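To make the concurrent-access tenet concrete, here is a minimal, hypothetical sketch (not from the original list): several threads writing to a plain HashMap can lose entries or corrupt the map, while ConcurrentHashMap is designed for exactly this access pattern.

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class MapContentionDemo {
    public static void main(String[] args) throws InterruptedException {
        // Not safe for concurrent writes: entries can be lost or the map corrupted.
        final Map<String, Integer> plainMap = new HashMap<String, Integer>();
        // Designed for concurrent reads and writes.
        final Map<String, Integer> concurrentMap = new ConcurrentHashMap<String, Integer>();

        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 0; i < 10000; i++) {
            final String key = "user-" + i;
            pool.execute(new Runnable() {
                public void run() {
                    plainMap.put(key, 1);      // races with other writers
                    concurrentMap.put(key, 1); // thread-safe
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);

        // The concurrent map always ends up with 10000 entries; the plain HashMap
        // often does not, and under heavier mixed load it can misbehave far worse.
        System.out.println("HashMap size:           " + plainMap.size());
        System.out.println("ConcurrentHashMap size: " + concurrentMap.size());
    }
}

Note that even ConcurrentHashMap only makes individual operations atomic; compound read-modify-write updates still need putIfAbsent/replace loops or explicit synchronization.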
Storage may be free, but I/O sucks. You may think that writing everything to disk is a great way to persist data. Generally it is, but if you use disk storage as a temporary storage area, your application could quickly grind to a slow crawl. Physical storage should be limited to data that needs to persist for long periods of time, or to cases where the data cannot reside in memory.

Memory does not go as far as you may think. To start, many people will have their application and database residing on the same server. This is perfectly acceptable until both require a lot of RAM. As an example, you can easily run a Java application in Tomcat in 528MB. However, once you have to deal with scale of any sort and you add in the RAM required by the persistent storage (RDBMS, NoSQL, etc.), you can quickly jump to 8GB. Obviously, this is highly dependent upon the number of users hitting the system and how much data you store in memory.

Caching fixes everything until it crashes the server. If you are looking for ways to avoid a lot of database queries, you end up using some form of caching. The problem is that caching requires much more memory than your typical application usage, especially when dealing with data that scales with the number of users (see the previous point on memory). The worst problem with caching is that it can chew up so much memory that you run into an OutOfMemory error in Java or similar errors in other languages. At that point, your server will either crash or become unresponsive, and caching no longer helps because it has become part of the problem.

Think like a consultant. As an employee, there tends to be an unwritten rule that the company can do things they would not do with consultants. Deadlines may be moved, scope may be increased, and the developer needs to find a way to meet these new constraints. As an employee, you need to use your power to state that the deadline cannot move due to the amount of work required, or that scope cannot be increased without increasing the number of resources. Consultants tend to be allowed to manage a project differently than employees, and it is our job to change that.

I know there are a bunch of other ideas that keep running through my head, but this is the best list I can create for now. What other rules would you include for software engineers? Reference: 15 Tenets For The Software Engineer from our JCG partner Rob Diana at the Regular Geek blog...