


Test-Driven Development (TDD)

What is Test-Driven Development (TDD)? Test-Driven Development is a process that relies on the repetition of a very short development cycle. It is based on the test-first concept of Extreme Programming (XP), which encourages simple design with a high level of confidence. The TDD procedure is as follows:

1. Write a test
2. Run all tests
3. Write the implementation code
4. Run all tests
5. Refactor

This procedure is often called Red-Green-Refactor.

While writing tests we are in the red state. Since the test is written before the actual implementation, it is supposed to fail. If it doesn't, the test is wrong: it either describes something that already exists or it was written incorrectly. Being in green while writing tests is a sign of a false positive. Tests like that should be removed or refactored.

Next comes the green state. When the implementation of the last test is finished, all tests should pass. If they don't, the implementation is wrong and should be corrected. The idea is not to make the implementation final, but to provide just enough code for the tests to pass.

Once everything is green we can proceed to refactor the existing code. That means improving the code without introducing new features. While refactoring, all tests should be passing all the time. If one of them fails, the refactoring broke existing functionality. Refactoring should not include new tests.

Speed is the key

I tend to see TDD as a game of ping pong (or table tennis). The game is very fast. The same holds true for TDD. I tend not to spend more than a minute on either side of the table (test and implementation). Write a short test and run it (ping), write the implementation and run all tests (pong), write another test (ping), write the implementation of that test (pong), refactor and confirm that all tests are passing (score), repeat. Ping, pong, ping, pong, ping, pong, score, serve again. Do not try to make the perfect code.
Instead, try to keep the ball rolling until you think that the time is right to score (refactor).

It's not about testing

The T in TDD is often misunderstood. TDD is the way we approach the design. It is a way to force us to think about the implementation before writing the code. It is a way to better structure the code. That does not mean that tests resulting from TDD are useless. Far from it. They are very useful and allow us to develop with great speed without being afraid that something will be broken. That is especially true when refactoring takes place. Being able to reorganize the code while having the confidence that no functionality is broken is a huge boost to the quality of the code. The main objective of TDD is code design, with tests as a very useful side product.

Mocking

In order for tests to run fast and thus provide constant feedback, code needs to be organized in a way that methods, functions and classes can be easily mocked and stubbed. Speed of execution will be severely affected if, for example, our tests need to communicate with the database. By mocking external dependencies we are able to increase that speed drastically. Execution of the whole unit test suite should be measured in minutes, if not seconds. More important than speed, designing the code so that it can be easily mocked and stubbed forces us to structure that code better by applying separation of concerns. With or without mocks, the code should be written in a way that we can, for example, easily replace one database with another. That "another" can be, for example, a mocked or in-memory database. An example of mocking in Scala can be found in the Scala Test-Driven Development (TDD): Unit Testing File Operations with Specs2 and Mockito article. If your programming language of choice is not Scala, the article can still be very useful for seeing a pattern that can be applied to any language.

Watchers

Watchers are very useful tools when working in the TDD fashion.
They are frameworks or tools that are started before we begin working and that watch for any change in the code. When such a change is detected, all the tests are run. In the case of JavaScript, almost all build systems and task runners allow this. Gulp (my favorite) and Grunt are two out of many examples. Scala has sbt-revolver (among others). Most other programming languages have similar tools that recompile (if needed) and run all (or affected) tests when the code changes. I always end up having my screen split into two windows: one with the code I'm working on and the other with the results of tests that are being executed continually. All I have to do is check that the output of those watchers corresponds to the phase I'm in (red or green).

Documentation

Another very useful side effect of TDD (and well structured tests in general) is documentation. In most cases, it is much easier to find out what the code does by looking at the tests than at the implementation itself. An additional benefit that other types of documentation cannot provide is that tests are never outdated. If there is any discrepancy between the tests and the implementation code, the tests fail. Failed tests mean inaccurate documentation. The Tests as documentation article provides deeper reasoning behind the usage of tests instead of traditional documentation.

Summary

In my experience, TDD is probably the most important tool we have in our software toolbox. It takes a lot of time and patience to become proficient in TDD, but once that art is mastered, productivity and quality increase drastically. The best way to both learn and practice TDD is in combination with pair programming. As in a game of ping pong that requires two participants, TDD can be done in pairs, where one coder writes tests and the other writes the implementation of those tests. Roles can switch after every test (as is often done in coding dojos). Give it a try and don't give up when faced with obstacles, since there will be many.
Test Driven Development (TDD): Best Practices Using Java Examples is a good starting point. Even though it uses Java examples, most, if not all, practices can be applied to any programming language. For an example in Java (as in the previous case, it is easily applicable to other languages) please take a look at the TDD Example Walkthrough article. Another great way to perfect TDD skills is to practice code katas (there are many on this site).

What is your experience with TDD? There are as many variations as there are teams practicing it and I'd like to hear about your experience.

Reference: Test-Driven Development (TDD) from our JCG partner Viktor Farcic at the Technology conversations blog.
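As a rough illustration of the Red-Green-Refactor cycle and of designing code so that dependencies can be stubbed, here is a minimal sketch in plain Java. It deliberately avoids any test framework so that it stays self-contained. The names `UserStore`, `GreetingService` and `TddSketch` are hypothetical, invented for this example, and are not taken from the linked articles.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class TddSketch {
    // In-memory stub standing in for a real store, so the checks run in milliseconds.
    static UserStore stubStore() {
        Map<Integer, String> users = new HashMap<>();
        users.put(1, "Ada");
        return id -> Optional.ofNullable(users.get(id));
    }

    // Tiny hand-rolled assertion helper (a test framework would normally do this).
    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError("FAILED: " + description);
        }
        System.out.println("passed: " + description);
    }

    public static void main(String[] args) {
        // These checks were conceptually written first (red),
        // then greet() was implemented until they passed (green).
        GreetingService service = new GreetingService(stubStore());
        check(service.greet(1).equals("Hello, Ada!"), "greets a known user by name");
        check(service.greet(2).equals("Hello, stranger!"), "falls back for an unknown user");
    }
}

// Hypothetical dependency: in production this might be backed by a database.
interface UserStore {
    Optional<String> findName(int id);
}

// Code under test. It only knows the interface, so the store is easy to replace.
class GreetingService {
    private final UserStore store;

    GreetingService(UserStore store) {
        this.store = store;
    }

    // Just enough code to satisfy the two checks above; refactor only while they stay green.
    String greet(int id) {
        return store.findName(id)
                .map(name -> "Hello, " + name + "!")
                .orElse("Hello, stranger!");
    }
}
```

Because `GreetingService` depends only on the `UserStore` interface, swapping the stub for a database-backed implementation would not require touching the checks, which is the separation of concerns the Mocking section argues for.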

Creating a Succession Plan for Your Technical Team

We often think about a succession plan for managers. But, if you're not thinking about a succession plan for your technical team, you're falling prey to local shortages, and hiring the same old kinds of people. You're not getting diverse people. That means you may not be able to create innovative, great products. It also means your people might be stuck. As soon as they can, they might leave.

Sometimes, when I coach people on their hiring process, I discover that they have all one kind of person. Everyone has five years of experience in one domain. Or, everyone has fifteen years. Or, everyone has the same background. Everyone looks alike. Everyone, even though they were hired at different times, has exactly the same demographics. This is not good. You want a mixture of experience on your team. You want some people with less experience and some people with more.

I once had a client who, through their hiring practices and attrition, ended up with people who had no less than 25 years of experience. Every single person had at least 25 years of experience in this particular domain. It was very interesting introducing change to that organization, especially to the managers. The technical staff had no problem with change. But the managers? Oh boy. They had worked in a particular way for so long they had problems thinking in any other way. That was a problem.

It's not that less or more experience leads to easier or more difficult change. It's that heterogeneity in a team tends to lead to more innovation and more acceptance of change.

So, what can you do to create a succession plan for your team?

Assess the number of entry-level, mid-level, senior, and principal technical staff you have. I think of entry-level as 0-2 years, mid-level as about 2-10 years, senior as about 10-20 years, principal as about 20 years and on. Your ranges may vary. If you have narrower ranges, ask yourself why.
If you start senior engineers at 5 years of experience, I want to know how the heck you can. You can call them anything you want. Are they really senior? Or, do you have title inflation?

If you don't already have one, create an expertise criteria chart. That's a chart that shows what the criteria are for each level. Because your people might just have a year of experience every year, and not really have acquired any valuable experience. You and I both know people like that, right? Take the qualities, preferences and non-technical skills that you value the most when you hire. Explain what you want in each level, and that's how you create an expertise criteria chart for your team. Resolve the criteria across the organization, so that your team is on par with the rest of the organization.

In your one-on-ones, have a conversation with each person about their career goals and how you see their career over time. Provide feedback. If they want coaching, provide that.

Now, you have data. You have information about how people are performing against what you need. You have information about how you could "slot" people into the HR ranges, if you need to do so. And, if you need to hire people, you have the opportunity to hire people where you need to do so.

I did this when I was a manager. I needed the data to bring one person to parity. I needed the data later to bring an entire testing team to parity with the developers. This is a ton of work. You can do it. It's worth it.

Reference: Creating a Succession Plan for Your Technical Team from our JCG partner Johanna Rothman at the Managing Product Development blog.

Beating The ARC

For the uninitiated, ARC is Apple's term for Automatic Reference Counting. Objective-C uses a reference counting scheme to collect objects, which is pretty painful to work with. Personally I preferred C/C++'s manual delete/free to the Objective-C semantics. But a couple of years ago Apple introduced ARC, in which the compiler implicitly inserts the retain/release reference counting logic. While it's a big improvement, it's still a reference counter with many of the implied limitations. It solves 95% of your memory handling logic, leaving the hardest 5% for you to deal with manually, but it does have one advantage over a GC: determinism. Since memory is deallocated immediately, it provides consistent performance with no GC stalls.

We briefly covered the garbage collection approach we took with the new iOS VM; this time we'll go more in-depth into the implementation details. Our goal with the garbage collection was not to create the fastest GC possible but the one that stalls the least. Up to now pretty much all open source iOS VMs used Boehm GC, which is a conservative GC designed for C. It is state of the art and pretty cool, but it stalls. Boehm can't really avoid stalling since it needs to stop all executing threads so it can traverse their stacks, and this takes time.

Unlike C, we can make a lot of assumptions in a Java application thanks to the type safety and clearly defined VM. This makes the process of collecting comparatively easy and makes it possible to collect without stopping the world. We do however need threads to yield the CPU every so often, otherwise the GC will be blocked. This is generally a good practice and the EDT makes sure to follow it. However, if you do something like this:

    while (true) {
        System.out.println("Wheee");
    }

it would block our new GC from running unless you add a Thread.yield/sleep or wait() call (besides draining the CPU/battery).
This might be considered a flaw, but we mitigated it to some degree by incorporating a reference counting collector as well (similar to ARC), which deals with the "low hanging garbage". That makes the actual GC process far less important, so our GC sweeps don't need to be very fast.

But this post is titled "Beating the ARC". How can we be faster than ARC? Simple: we don't de-allocate on the application thread. All objects that our reference counter deems to be garbage are sent to the garbage heap and finalized/deleted on the GC thread (as is custom in Java), hence we get the benefit of multi-core parallel cleanup logic on top of the fast performance.

Our GC never actually pauses the world. It uses a simple mark-sweep cycle where we iterate the thread stacks and mark all objects in use; we then iterate all the objects in the world and delete the ones that are still allocated but unmarked. Since deletion of GC'd and reference counted objects is always done in the GC thread, this is pretty easy and thread safe on the VM part. The architecture is actually rather simple and conservative.

The benefit of the reference counting approach becomes very clear with the non-pausing GC: since the reference counting system still kicks objects out of RAM, the GC serves only for the heavy lifting. So it can be executed less frequently, and it's OK for it to miss quite a few objects as the reference counting implementation will pick up the slack.

We are still working on getting this into users' hands, ideally within the next couple of weeks (albeit in alpha state), and eventually open sourcing all of that code.

Reference: Beating The ARC from our JCG partner Shai Almog at the Codename One blog.
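To make the "we don't de-allocate" idea more concrete, here is a toy Java model of deferred release: application threads only decrement a counter and enqueue dead objects, while a separate thread does the cleanup. This is only a sketch under assumptions. The class `DeferredReleaseSketch`, its `Obj` type and the `retain`/`release`/`drain` methods are hypothetical names made up for illustration, and the real VM described in the post operates on native memory rather than Java objects.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

class DeferredReleaseSketch {
    // Toy stand-in for a VM object that carries a reference count.
    static final class Obj {
        final String name;
        final AtomicInteger refCount = new AtomicInteger(1); // the creator holds one reference
        volatile boolean freed = false;

        Obj(String name) {
            this.name = name;
        }
    }

    // "Garbage heap": objects whose count reached zero, waiting for the GC thread.
    private final ConcurrentLinkedQueue<Obj> garbage = new ConcurrentLinkedQueue<>();

    void retain(Obj o) {
        o.refCount.incrementAndGet();
    }

    // Called on application threads: never frees anything, only enqueues.
    void release(Obj o) {
        if (o.refCount.decrementAndGet() == 0) {
            garbage.add(o);
        }
    }

    // Called on the GC thread: performs the actual cleanup and returns how many objects it freed.
    int drain() {
        int count = 0;
        Obj o;
        while ((o = garbage.poll()) != null) {
            o.freed = true; // a real VM would finalize and free memory here
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        DeferredReleaseSketch vm = new DeferredReleaseSketch();
        Obj a = new Obj("a");
        Obj b = new Obj("b");
        vm.retain(a);   // a now has 2 references
        vm.release(a);  // a back to 1 reference, still alive
        vm.release(b);  // b has 0 references: queued, but not freed yet
        System.out.println("b freed on app thread? " + b.freed);

        Thread gc = new Thread(() ->
                System.out.println("GC thread freed " + vm.drain() + " object(s)"));
        gc.start();
        gc.join();
        System.out.println("a freed? " + a.freed + ", b freed? " + b.freed);
    }
}
```

The point of the design is that `release` stays cheap on the application thread, while the cost of finalizing and freeing moves to another core. The mark-sweep pass described above is not modeled here.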

The Caveats of Dual-Licensing

We've been in business for more than one year now with our dual-licensing strategy for jOOQ. While this strategy has worked very well for us, it has also been a bit of a challenge for some of our customers. Today, we're going to show you what caveats of dual-licensing we've run into.

Our dual-licensing strategy

For those of you not acquainted with our license model, just a brief reminder to get you into the subject. We mainly consider ourselves a vendor of Open Source software. However, contrary to a variety of other companies like Gradleware or Red Hat, we don't want to build our business model on support, which is a much tougher business than licensing. Why?

- Support contracts need a lot more long-term trust by customers, and more outbound sales. There's only little inbound interest for such contracts, as people don't acquire support until they need it.
- Vendor-supplied support competes with third-party support (as provided by UWS, for instance), which we want to actively encourage. We'd love to generate business for an entirely new market, not compete with our allies.

So we were looking for a solution involving commercial licensing. We wanted to keep an Open Source version of our product because:

- We'll get traction with Open Source licensing much, much more quickly than with commercial licensing.
- While Open Source is a very tough competitor for vendors, it is also a great enabler for consumers. For instance, we're using Java, Eclipse, H2, and much more. Great software for free!

It wouldn't be honest to say that we truly believe in "free as in freedom" (libre), but we certainly believe in "free as in beer" (gratis), because who doesn't? So, one very simple solution to meet the above goals was to offer jOOQ as Open Source with Open Source databases, and to offer jOOQ under a commercial license with commercial databases.

The Caveat

This was generally well received by our user base as a credible and viable dual-licensing model. But there were also caveats.
All of a sudden, we didn't have access to these distribution channels any more:

- Maven Central
- GitHub

… and our paying customers didn't have access to these very useful OSS rights any more:

- Source Code
- Modifications

Solution 1: Ship Source Code

Well, we actually ship our source code with the commercial binaries. At first, this was done merely for documentation purposes. Regardless of the actual license constraints, when you're in trouble, e.g. when your production system is down and you have to urgently fix a bug, doesn't it just suck if you don't have access to the source code of third-party dependencies? You will just have to guess what it does. Or illegally de-compile it. We don't want to be that company. We trust our customers to deal responsibly with our source code.

Solution 2: Allow Modifications

Our commercial licenses come in two flavours: Yearly and Perpetual. We quickly realised that some of our customers do not want to be dependent on us as a vendor. Perpetual licenses obviously help make customers more independent. But the disadvantage of perpetual licenses is the fact that vendors will not support old versions forever, and customers won't have the right to upgrade to the next major release. While they are probably fine with not having access to new features, they would still like to receive an occasional bugfix.

The solution we've come to adopt is a very pragmatic one: Customers already have the source code (see above), so why not allow customers to also apply urgent fixes themselves? Obviously, such modifications will void the warranty offered by us, but if you buy jOOQ today and 5 years down the line you discover a very subtle bug in what will then be an unsupported version of jOOQ… don't you just want to fix it?

Conclusion

Dual-licensing is a tricky business.
You partition your user-base into two:

- The paying / premium customers
- The "freemium" customers

By all means, you must prevent your premium customers from being at a disadvantage compared to your "freemium" customers. There are certain rights that are probably OK to remove (such as the right of free distribution). But there are rights that are just annoying not to have. And those rights are the rights that matter the most to the everyday work of an engineer: to fix that bloody bug!

We're very curious: What are your opinions on dual-licensing?

Reference: The Caveats of Dual-Licensing from our JCG partner Lukas Eder at the JAVA, SQL, AND JOOQ blog.

20 (Or So) Things Managers Should Stop Saying To Engineers

This post is a direct reply to an article I recently read, titled "20 things engineers should stop saying". I was so frustrated and irritated when I finished reading it that I couldn't believe my eyes. I still wonder what kind of manager suggests these ideas and how their engineers would react after reading that post. None of these 20 "evil" engineer phrases is "technobabble"; they are so clear that even a first-year student would easily understand their meaning. Which part of this sentence don't you understand: "What's the ROI of that feature?" To be honest, this is a question that managers should ask before dropping ridiculous requirements on the dev team. You should feel really lucky that you have engineers asking such things! Moreover, some of them look so weird to me, and I've never heard them in the 16 years I've been working with software development teams, so I'll try to figure out what they mean.

But the most important question for me is when and why an engineer is forced to say those "terrible" and "pissing" things. Let's take them one by one, each side by side with its root cause (the quoted manager request) and some comments of mine:

1. We don't work against dates: "I've promised the customer that everything will be ready by the 25th of December. Now we need to talk about their requirements." Of course we don't work against dates when we don't have a fixed scope.
2. We need more resources: "You two guys will build, test, deploy and support our new ERP system."
3. Quality, speed, cost - pick two. It's Friday afternoon: "I want all these features the customer asked for by Monday, but we can't afford to pay overtime. Oh, I almost forgot. Please keep the quality of the project (unit tests, complexity, coding rules) at the highest level. Besides, it's your responsibility!"
4. What's the ROI of that feature: "One customer out of thousands wants a 'tiny' and 'cool' new feature that will send him a notification when it's time to pick up the kids from school." …Meh…
5. We don't need reporting: "I want to get an Excel file in my inbox every week containing a detailed report about your activities." Wake up! There are plenty of ways to track developers' work and "reporting" is definitely not the right one.
6. The customer really doesn't mean that: "The customer wants the system to pop up an alert every time they press the letter 'A' instead of the letter 'S'."
7. They can use the command line: "I 'know' that you have more important things to do, but can we implement by tomorrow a screen with some nice buttons and instructions so that the customer can trigger the weekly report in PDF format?"
8. They can use the API: See #7.
9. You wouldn't understand. Actually the correct phrase is "You don't even try to understand" or "You don't want to understand". "It's the 3rd time I've asked you why it's so hard to write bug-free software according to what the customer wants!" Note: Nobody knows what the customer wants, not even the customer!
10. That's a nice-to-have: See #4.
11. We tried that before: "Can you please integrate their 'legacy' system with our ERP? I know they don't provide any way of accessing their data, but you're the experts, right?"
12. I don't understand the requirements (have you read them? no): "Why is it so hard to write code based on customer requirements? You can find them in my hand-written notes taken when interviewing some end-users, in the last 6 emails I sent you and in a Google document I shared with you!" 99% of these "requirements" are controversial and lack clarity.
13. Technical debt: "I don't want to hear anything about code quality. Just throw some code here and there to make this work!" Seriously? You don't want engineers to talk about technical debt? And even worse, you don't understand what technical debt is?
14. Can you QA this? OK, maybe this one is somewhat unclear. Does "Can you reproduce that?" make any sense? "Customers claim that the system, randomly, for no obvious reason, doesn't allow them to delete an order!"
15. It's not a bug, it's a feature: "I know we haven't discussed it yet, but can you please add a double confirmation dialog box when users delete a customer record? We need to fix this bug A.S.A.P."
16. That violates the CAP theorem. OK, this one and the following are the only sentences that can be considered "obfuscating": "Our new enterprise platform is going to be used by thousands of clients via web service invocation. So it's really important to achieve 100% availability and data consistency, and at the same time it must continue operating even with arbitrary message loss."
17. Rube Goldberg: "Let's build a calendar application that uses a distributed NoSQL database, SOAP and REST web services, Elasticsearch, HTML5 responsive design, Node.js, Spring MVC and many more." Does KISS (Keep It Stupid Simple) make any sense to you?
18. That's the platform team's responsibility: "Can we add a caching mechanism when fetching data from the DB?" This is a totally valid reply when there are in-house developed libraries and frameworks that make up the "platform". Usually in such cases companies have a dedicated team to maintain the platform.
19. It will take 30 points: "How long will it take to implement this feature?" Why is "It will take one week" more meaningful to you? Because you will pull the trigger if it's not ready at the time promised? What about technical difficulties that might occur? Random tasks that will delay me? Or any other external factor that will make me lose some time? 30 points is a very decent answer. Would you prefer something like "This feature is 30 pounds heavy"? What I'm telling you is that I can estimate the size of this feature based on my previous experience but I can't guarantee the delivery time. Is this so hard to follow?
20. Why would we do that? (See #4 or #6)

Now that you've read a developer's point of view, who's the winner? Managers or engineers? Nobody! Both sides should try to understand each other and find a common way of communicating. They are not enemies. They share a common aim, which is (or should be) providing working and useful software for end users and customers.

Reference: 20 (Or So) Things Managers Should Stop Saying To Engineers from our JCG partner Patroklos Papapetrou at the Only Software matters blog.

Gradle Goodness: Running Groovy Scripts as Application

In a previous post we learned how to run a Java application in a Gradle project. The Java source file with a main method is part of the project and we use the JavaExec task to run the Java code. We can use the same JavaExec task to run a Groovy script file. A Groovy script file doesn't have an explicit main method, but one is added when we compile the script file. The name of the script file is also the name of the generated class, so we use that name for the main property of the JavaExec task. Let's first create a simple Groovy script file to display the current date. We can pass an extra argument with the date format we want to use.

    // File: src/main/groovy/com/mrhaki/CurrentDate.groovy
    package com.mrhaki

    // If an argument is passed we assume it is the
    // date format we want to use.
    // Default format is dd-MM-yyyy.
    final String dateFormat = args ? args[0] : 'dd-MM-yyyy'

    // Output formatted current date and time.
    println "Current date and time: ${new Date().format(dateFormat)}"

Our Gradle build file contains the task runScript of type JavaExec. We rely on the Groovy libraries included with Gradle, because we use localGroovy() as a compile dependency. Of course we can change this to refer to another Groovy version if we want to, using the group, name and version notation together with a valid repository.

    // File: build.gradle
    apply plugin: 'groovy'

    dependencies {
        compile localGroovy()
    }

    task runScript(type: JavaExec) {
        description 'Run Groovy script'

        // Set main property to name of Groovy script class.
        main = 'com.mrhaki.CurrentDate'

        // Set classpath for running the Groovy script.
        classpath = sourceSets.main.runtimeClasspath

        if (project.hasProperty('custom')) {
            // Pass command-line argument to script.
            args project.getProperty('custom')
        }
    }

    defaultTasks 'runScript'

We can run the script with or without the project property custom and we see the changes in the output:

    $ gradle -q
    Current date and time: 29-09-2014
    $ gradle -q -Pcustom=yyyyMMdd
    Current date and time: 20140929
    $ gradle -q -Pcustom=yyyy
    Current date and time: 2014

Code written with Gradle 2.1.

Reference: Gradle Goodness: Running Groovy Scripts as Application from our JCG partner Hubert Ikkink at the JDriven blog.
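For comparison, here is a rough hand-written Java counterpart of the script's logic, with the explicit main method that the Groovy script does not need. It is only a sketch: the class name `CurrentDateJava` and the `describe` helper are hypothetical and not part of the original post, and `SimpleDateFormat` is used here as a stand-in for Groovy's `Date.format`.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

class CurrentDateJava {
    // Same default pattern as the Groovy script.
    static final String DEFAULT_FORMAT = "dd-MM-yyyy";

    // If an argument is passed we assume it is the date format to use.
    static String describe(Date date, String[] args) {
        String dateFormat = args.length > 0 ? args[0] : DEFAULT_FORMAT;
        return "Current date and time: " + new SimpleDateFormat(dateFormat).format(date);
    }

    public static void main(String[] args) {
        System.out.println(describe(new Date(), args));
    }
}
```

A class like this has a main method of its own, so a JavaExec task could point its main property at the class name in the same way the build file above points at the compiled script class.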

Don’t just randomize, truly randomize!

The state of web application cryptography has changed, and each development language provides its own way of working with it. I will touch on the current state of random number generation and the differences found in it between the Java and JavaScript development languages.

When designing and building web applications, security concerns obviously play a crucial role. The term security is pretty broad, covering numerous areas including, but not limited to:

- input validation
- authentication
- session management
- parameter manipulation protection
- cryptography

In a previous DOD/government project, I was fortunate enough to work on a Java security component that dealt directly with the last of those areas: cryptography. Specifically, random number generation. Coming from a pure business Java development background, initially I had to take a step back from the technical design document and figure out how we wanted this web application to enforce confidentiality and integrity. More specifically, how should the application keep secrets? How should we provide seeds for cryptographically strong random values? Should we allow the browser to handle the random number generation or keep that on the back end? Lastly, what is the best way to create a random key for encryption?

Encryption and Randomness

In secure web application development, that last question plays a big role in the security benefit of using randomness. Some may say, "Hey, in regards to encryption, what about just using Base64 or UTF8 encoding?" To put it bluntly… those are encodings, not encryption, and they provide no confidentiality. Other examples for the use of randomness could be the generation of random challenges upon logging in, creation of session IDs, or using secret keys for various encryption purposes. Regarding the last example, in order to generate such secret keys, an entropy pool is needed. An entropy pool that is unpredictable.
This can be accomplished by verifying non-repeatable output of a cryptographically strong pseudo-random number generator (CSPRNG). A rule of thumb I learned from various government security analysts I had the pleasure of working with was to steer clear of random numbers supplied by a third party. Some may ask, "well, what's wrong with http://random.org?" Nothing really… it does a great job of being a 'true' random number generator. Especially since it claims to generate randomness via atmospheric noise. However, if patterns are indeed detectable in this atmospheric noise, that sort of debunks its truly random claim. From a theoretical perspective, it's a little difficult to find an unbiased measurement of physical sources of randomness. But I digress! Can't someone just use Random.org over HTTPS? Yes, but that's not a good idea if you're implementing this inside a crypto-secure system. Even the http://random.org site itself says not to use it for cryptography. Then again, outsourcing your random number generation kind of defeats the purpose of a secure system.

Java

Let's take a look at what the Java language has to offer. From a Java server-side perspective, the Java language comes standard with the traditional Random class. This works fine for conventional, non-secure applications you may want randomized, like a coin flipping game or an interactive card shuffler. This class has a fairly short period (2^48) because its state is updated as seed = (seed * multiplier + 0xbL) & ((1L << 48) - 1). With only 48 bits of state, it can generate only a small range of possible sequences. To take a step back: a seed is the starting value of the generator, and feeding the same seed to the same algorithm reproduces the same sequence of random numbers. Another limitation, at least for Java versions prior to 1.4, is that when using the default Random constructor to generate random numbers, the seed defaults to the current system time since January 1, 1970 measured in milliseconds.
Thus, an outside user can easily figure out the random numbers generated if they happen to know the running time of the application.

SecureRandom to the rescue!

SecureRandom, on the other hand, produces cryptographically strong random numbers. CSPRNGs are seeded from a true random source: entropy. For example, a user's mouse clicks, mouse movements, keyboard inputs, a specific timing event, or any other OS-related random data. So it comes close to being a TRUE random number generator. I say close because at least the seed is not produced by a deterministic algorithm. But then again, theoretically, is anything ever truly random? Okay, never mind that question.

Keep in mind that SecureRandom uses PRNG implementations from other classes that are part of Java cryptographic service providers. For example, in Windows, the SUN CSP uses the SHA1PRNG by default. This is solid because, utilizing the default SecureRandom() constructor, a user can retrieve a 128-bit seed. So based on its seed source, the chances of repeating are far lower than with the original Java Random class.

A simple, straightforward implementation of a self-seeding SecureRandom generator:

    // Very nice...
    import java.security.SecureRandom;

    // Instantiate secure random generator using the SHA1PRNG algorithm
    // (getInstance throws NoSuchAlgorithmException if the algorithm is unavailable)
    SecureRandom randomizer = SecureRandom.getInstance("SHA1PRNG");

    // Provide a call to the nextBytes method. This will force the SecureRandom object to seed itself.
    byte[] randomBytes = new byte[128];
    randomizer.nextBytes(randomBytes);

On the other hand… The famous 'Guarantee to be Random' standard method:

    // No bueno
    public int getRandomNumber() {
        return 7; // shhhh no one will know... teehee
    }

Bottom line: the most important thing to take into consideration is… the seed! All pseudo-random number generators are deterministic if one knows the seed. Thus, the fact that SecureRandom is much better suited for working with a strong entropy source than Random is gives it the advantage in a cryptography environment.
The downside is, the bigger the entropy needed, the worse the performance. Something to consider.

JavaScript

So what about JavaScript? Can the client side be trustworthy enough to handle cryptography? Overall, how truly secure is the JavaScript language? Now that JavaScript and single page applications are becoming more and more popular, these are valid and important questions. One of the most common insecurities on the client side is HTML injection, whereby an application may unknowingly allow third parties to inject JavaScript into its security context. Today, websites and many web applications need some sort of client-side encryption, especially since browsers remain the tool of choice when interacting with remote servers. Fortunately for us JavaScripters, the most current browsers come packaged with sophisticated measures to combat these insecurities. When collecting entropy, it’s obvious that JavaScript can very easily collect keyboard input, mouse clicks, etc., as well. However, it is difficult to provide entropy for a strong random number via a browser without encountering a usability drawback, such as having to ask for some user interaction to assist in seeding a pseudo-random number generator. That shouldn’t matter anyway, as JavaScript only comes with a Math.random() function that takes no arguments. This Math.random() function comes automatically seeded, similar to Java’s Random class, but what can one do to seed it manually? Nothing really. JavaScript doesn’t come with the ability to manually seed Math.random(). There are a few drop-ins, one of which is a popular RC4-based one available on GitHub. It even has support for Node.js and Require.js. Let’s keep in mind that browsers come packaged with different random number generators. Chrome uses the ‘Multiply-with-carry’ generator.
Firefox and IE use linear congruential generators (the Java Random class’s seed is modified using a linear congruential formula as well), so it would not be wise to use Math.random() alone as the only source of entropy. Let’s face it, this function is not cryptographically strong. Some JavaScript crypto libraries do provide some help, one of which is CryptoJS. CryptoJS includes many secure cryptographic algorithms written in JavaScript. But let’s skip over the third-party libraries and look at what JavaScript has to offer. Lately MDN has not disappointed. They are making strides in the cryptography world. For instance, the window.crypto.getRandomValues() function does pretty well. All that is needed with this entropy collection function is to pass in an integer-based TypedArray, and the function will populate the array with cryptographically random numbers. Keep in mind, it is an experimental API, meaning it is still in ‘Working Draft’ status, and it appears to work only in a few of the latest browsers that use strong (pseudo) random number generators. Here is a great link that shows browser compatibility with this particular function.

Conclusion

Until insecurities like code injection, side-channel attacks, and cross-site scripting are no longer a threat, it’s difficult to rely on JavaScript as a serious crypto environment. Hold on, though! I’m not proposing relying on full-blown in-browser cryptography. One could open up a whole can of worms if the client side were to handle the majority of application security features. However, this doesn’t mean that things won’t change in the future. Perhaps one day there will be a full cryptographic stack built into HTML/JavaScript. In addition, given the popularity of SPAs and tools like Angular, I have a feeling window.crypto.getRandomValues() will be supported by all browsers one day as well. Until then, nothing beats generating random numbers on the back end.
I’m looking on the bright side, however, and I’m always keeping an eye out to see what’s in store for future SPA secure web development!

Reference: Don’t just randomize, truly randomize! from our JCG partner Vince Pendergrass at the Keyhole Software blog.

Estimates or #NoEstimates? that is the question

To estimate or not to estimate, to join the #NoEstimates bandwagon or not, that is the question. Maybe it is a navel-gazing exercise for agile folk, but it does seem to be the recurring theme. And I can’t get over this feeling that some of my peers think I’m a bit stupid for continuing to support estimates. Complicating matters, I’m finding my own work and research is starting to be cited in support of #NoEstimates – Dear Customer (PDF version), my publicity for Vierordt’s Law, Notes on Estimation and More notes on estimation. Add that my own #NoProjects / #BeyondProjects logic isn’t far removed from the whole estimates discussion. At Lean Agile Scotland a few weeks ago Seb Rose and I were discussing the subject, and in the end Seb said something to the effect of: “How can you continue to believe in estimation when your own arguments show it is pointless?” (I’m sure Seb will correct me if my memory is faulty.) My reply to Seb was something along the lines of: I continue to believe that estimation can be both useful and accurate, however I increasingly believe the conditions under which this holds are beyond most organizations. To which Seb challenged me to list those conditions. Well, here is that list. I’ve blogged about this before (well, I’ve mentioned it in lots of blogs, see this one: Conclusions and Hypothesis about estimates) and I’ve devoted a large section of the Xanpan book to talking about how I see estimates working, but I think it’s worth revisiting the subject. Before continuing I should say: I’m talking about Effort Estimates specifically. There is another discussion which needs to be had around business value/benefit estimation.
Here is a, probably incomplete, list of conditions I think are required in order for effort estimates to be accurate:

1. The people and teams who will undertake the work need to do the estimates
2. Estimates go off if not used: estimates only remain valid for a short period of time (days), the longer the elapsed time between making the estimate and doing the work the less accurate they will prove
3. Estimates will only be accurate if the teams are stable
4. Estimates must be calibrated against past performance, i.e. data is needed
5. Together #3 and #4 imply that only teams which have been working in this fashion for a while can produce accurate estimates
6. Teams must have a record of delivering and must be largely able to undertake the work needed, i.e. there are few dependencies on other teams and few “call outs” to elsewhere in the organization
7. Estimates must be used as is, they cannot be adjusted
8. Estimates cannot be used as targets (Goodhart’s Law will cut in if they are)
9. Estimates made in units of time (hours, days, etc.) are not reliable
10. The tracking and measurement process must measure all work done, not just “project” work
11. Financial bonus should not be tied to estimates or work
12. People outside the team should not coerce the team in any way

There are probably some other conditions which need to be on this list but which I haven’t realised. I’m happy to have additional suggestions. Perhaps this list is already long enough to be unachievable for most teams. Perhaps meeting these conditions is unachievable for many, even most, organizations. In which case the #NoEstimates’ers are right. So… I believe estimation can be useful, I also believe it can be accurate, but I believe there are a lot of factors that can cause effort estimates to go wrong. In fact, I know one team, possibly two, who claim their estimates and planning processes are very accurate. Perhaps I cling to my belief in estimates because I know these teams.
When estimates do work I don’t believe they can work far into the future. They are primarily a tool for teams to decide how much work to take on for the next couple of weeks. I don’t expect estimates further out will prove reliable. Estimates for 2 to 12 weeks have some value but beyond the 3 month mark I don’t believe they will prove accurate. So my advice: don’t estimate anything that isn’t likely to happen in the next 3 months, and don’t plan any work based on estimates which extend more than 3 months into the future. Which means that even if you accept my argument that estimates work, they may not tell you what you want to know; they may not have much value to you under these conditions. And to further complicate matters, I suspect that for mature teams estimation becomes pointless too. As implied by the list above, I would not expect a team new to this (agile) way of working to produce reliable estimates. With experience, and the conditions above, I think they can. One of the ways I think it works is by helping teams break down work into small pieces which flow. As a team gets better I would expect the effort estimation to exhibit a very tight distribution. When this happens, simply counting the number of cards (tasks, stories, whatever the thing you are estimating is) will have about the same information value for a fraction of the cost. (For example, suppose a team normally does 45 points of work per iteration; if the team’s average size estimate is 5 with a standard deviation of 0.5 then they would be expected to accept 9 pieces of work per iteration. If these statistics are stable then estimation works. But at this point simply taking in 9 pieces of work would also prove a reliable guide.)
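The arithmetic in that example can be sketched in a few lines of Java. This is only an illustration: the class and method names are hypothetical, and the spread calculation rests on an assumption of mine that is not in the original post, namely that individual estimates vary independently of one another.

```java
public class IterationCapacity {

    // Expected number of work items per iteration: velocity divided by the mean estimate.
    static long expectedItems(double velocityPoints, double meanEstimate) {
        return Math.round(velocityPoints / meanEstimate);
    }

    // Standard deviation of the total for `items` pieces of work.
    // ASSUMPTION (not from the post): item sizes are independent, so variances add.
    static double totalStdDev(long items, double perItemStdDev) {
        return perItemStdDev * Math.sqrt(items);
    }

    public static void main(String[] args) {
        long items = expectedItems(45.0, 5.0);   // 45 points / 5 points per item
        double spread = totalStdDev(items, 0.5); // 0.5 * sqrt(9)
        System.out.println(items + " items, total spread about " + spread + " points");
    }
}
```

With the figures from the example this gives 9 items and, under the independence assumption, a spread of roughly 1.5 points on a 45 point iteration. That is small enough that counting cards carries nearly the same information as summing the estimates.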
So:

1. Effort estimation doesn’t work for immature teams because they don’t exhibit the conditions above
2. Effort estimation does work for mature teams, but
3. Effort estimation is pointless for very mature teams

Even given all this I think estimation is a worthwhile activity for teams of type 1 and 2 because it has other benefits. One benefit is that it promotes discussion – or at least it should. Another is that it forms part of a design activity that helps teams make pieces of work smaller. But there is another reason I want teams to do it: Credibility. Estimation is so enshrined in the way many businesses work that teams and those trying to introduce change/agile risk undermining their own credibility if they remove estimation early. And I don’t just mean credibility with “the business”; I think many developers also expect estimation and, if asked to adopt a process without it, will be skeptical. So it’s just possible that estimation as we know it – planning poker, velocity and such – is a placebo. It doesn’t actually help many teams but it helps people feel they are doing something right. In time they may find the placebo actually works or they may find they don’t need it. Another reason why I like developers to think about “how long will this take” is that I believe it helps them set their own deadline. It helps them focus their own work. Thus I keep advocating estimates because I think they are useful to the team; the fact that you might be able to tell when something might be “done” is a side effect. Since I find long range estimates questionable I advocate a cheap approach which might be useful or might just be a placebo. However, I do believe that, given the right conditions, teams can estimate accurately, and can deliver to those estimates. Increasingly I believe very few organizations can provide those conditions to their teams.

Reference: Estimates or #NoEstimates? that is the question from our JCG partner Allan Kelly at the Agile, Lean, Patterns blog.

From Legacy Code To Testable Code–Introduction

The word “legacy” has a lot of connotations. Mostly bad ones. We seem to forget that our beautiful code gets to “legacy” status three days after writing it. Michael Feathers, in his wonderful book “Working Effectively With Legacy Code”, defined legacy code as code that doesn’t have tests, and there is truth in that, although it took me a while to fully understand it. Code that doesn’t have tests rots. It rots because we don’t feel confident to touch it; we’re afraid to break the “working” parts. Code rotting means that it doesn’t change, staying the way we first wrote it. I’ll be the first to admit that whenever I write code, it comes in its ugly form. It may not look ugly immediately after I wrote it, but if I wait a couple of days (or a couple of hours), I know I will find many ways to improve it. Without tests I can rely either on the automatic capabilities of refactoring tools, or pure guts (read: stupidity). Most code doesn’t look nice after writing it. But nice doesn’t matter. Because code costs, we’d like it to help us understand it, and minimize debugging time. Refactoring is essential to lower maintenance costs, and therefore tests are essential.

And this is where you start paying

The problem, of course, is that writing tests for legacy code is hard. Code is convoluted, full of dependencies both near and far, and without proper tests it’s risky to modify. On the other hand, legacy code is the one that needs tests the most. It is also the most common code out there – most of the time we don’t write new code, we add it to an existing code base. We will need to change the code to test it, in most cases. Here are some examples why:

- We can’t create an instance of the tested object
- We can’t decouple it from its dependencies
- Singletons that are created once, and impact the different scenarios
- Algorithms that are not exposed through a public interface
- Dependencies in base classes of the tested code

Some tools, like PowerMockito in Java, or Typemock Isolator for C#, allow us to bypass some of these problems, although they too come with a price: lower speed and code lock-down. The lower speed comes from the way these tools work, which makes them slower compared to other mocking tools. The code lock-down comes as a side effect of extra coupling to the tested code – the more we use the power tools’ capabilities, the more they know about the implementation. This leads to coupling between the tests and the code, and therefore makes the tests more fragile. Fragile tests carry a bigger maintenance cost, and therefore people try not to change them, and the code. While this looks like a technology barrier, it manifests itself, and therefore can be overcome, by procedure and leadership (e.g., once we have the tests, encourage the developers to improve the code and the tests). Even with the power tools, we’ll be left with some work. We might even want to do some work up front. Some tweaks to the tested code before we write the tests (as long as they are not risky) can simplify the tests. Unless the code was written simple and readable the first time. Yeah, right. We’ll need to do some of the following:

- Expose interfaces
- Derive and implement classes
- Change accessibility
- Use dependency injection
- Add accessors
- Renaming
- Extract method
- Extract class

Some of these changes introduce “seams” into the code. Through these seams, we can insert probes to check the impact of the code changes. Other changes just help us make sense of it. We can argue whether these things are refactoring patterns or not. If we apply them wisely, and more importantly SAFELY, we can prepare the code to be tested more easily and make the tests more robust.
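As a rough illustration of what exposing an interface and using dependency injection can look like, here is a small Java sketch. Everything in it is hypothetical and invented for the example (SeamDemo, DateSource, InvoiceChecker); it is not code from the book or from any particular legacy system. The idea is that a class which used to call LocalDate.now() directly now receives its date source from outside, which gives a test a seam to plug into.

```java
import java.time.LocalDate;

public class SeamDemo {

    // Exposed interface: the seam through which a test can substitute the date source.
    interface DateSource {
        LocalDate today();
    }

    // Production implementation that still reads the real system clock.
    static class SystemDateSource implements DateSource {
        @Override
        public LocalDate today() {
            return LocalDate.now();
        }
    }

    // Imagine this class formerly called LocalDate.now() inline; the dependency is now injected.
    static class InvoiceChecker {
        private final DateSource dates;

        InvoiceChecker(DateSource dates) {
            this.dates = dates;
        }

        boolean isOverdue(LocalDate dueDate) {
            return dates.today().isAfter(dueDate);
        }
    }

    public static void main(String[] args) {
        // A test pins "today" to a fixed date; DateSource has one method, so a lambda is enough.
        InvoiceChecker checker = new InvoiceChecker(() -> LocalDate.of(2014, 10, 15));
        System.out.println(checker.isOverdue(LocalDate.of(2014, 10, 1)));  // due date before the fixed "today"
        System.out.println(checker.isOverdue(LocalDate.of(2014, 10, 31))); // due date after the fixed "today"
    }
}
```

Production code would pass a SystemDateSource instead of the lambda. No power tool is needed to control the date, so the test stays coupled to the small interface and not to the implementation details.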
In the upcoming posts I’ll look into these in more detail.

Reference: From Legacy Code To Testable Code–Introduction from our JCG partner Gil Zilberfeld at the Geek Out of Water blog.

Neo4j: COLLECTing multiple values

One of my favourite functions in Neo4j’s cypher query language is COLLECT, which allows us to group items into an array for later consumption. However, I’ve noticed that people sometimes have trouble working out how to collect multiple items with COLLECT and struggle to find a way to do so. Consider the following data set:

    create (p:Person {name: "Mark"})
    create (e1:Event {name: "Event1", timestamp: 1234})
    create (e2:Event {name: "Event2", timestamp: 4567})

    create (p)-[:EVENT]->(e1)
    create (p)-[:EVENT]->(e2)

If we wanted to return each person along with a collection of the event names they’d participated in we could write the following:

    $ MATCH (p:Person)-[:EVENT]->(e)
    > RETURN p, COLLECT(e.name);
    +--------------------------------------------+
    | p                    | COLLECT(e.name)     |
    +--------------------------------------------+
    | Node[0]{name:"Mark"} | ["Event1","Event2"] |
    +--------------------------------------------+
    1 row

That works nicely, but what about if we want to collect the event name and the timestamp but don’t want to return the entire event node? An approach I’ve seen a few people try during workshops is the following:

    MATCH (p:Person)-[:EVENT]->(e)
    RETURN p, COLLECT(e.name, e.timestamp)

Unfortunately this doesn’t compile:

    SyntaxException: Too many parameters for function 'collect' (line 2, column 11)
    "RETURN p, COLLECT(e.name, e.timestamp)"
               ^

As the error message suggests, the COLLECT function only takes one argument so we need to find another way to solve our problem.
One way is to put the two values into a literal array, which will result in an array of arrays as our return result:

    $ MATCH (p:Person)-[:EVENT]->(e)
    > RETURN p, COLLECT([e.name, e.timestamp]);
    +----------------------------------------------------------+
    | p                    | COLLECT([e.name, e.timestamp])    |
    +----------------------------------------------------------+
    | Node[0]{name:"Mark"} | [["Event1",1234],["Event2",4567]] |
    +----------------------------------------------------------+
    1 row

The annoying thing about this approach is that as you add more items you’ll forget in which position you’ve put each bit of data, so I think a preferable approach is to collect a map of items instead:

    $ MATCH (p:Person)-[:EVENT]->(e)
    > RETURN p, COLLECT({eventName: e.name, eventTimestamp: e.timestamp});
    +--------------------------------------------------------------------------------------------------------------------------+
    | p                    | COLLECT({eventName: e.name, eventTimestamp: e.timestamp})                                         |
    +--------------------------------------------------------------------------------------------------------------------------+
    | Node[0]{name:"Mark"} | [{eventName -> "Event1", eventTimestamp -> 1234},{eventName -> "Event2", eventTimestamp -> 4567}] |
    +--------------------------------------------------------------------------------------------------------------------------+
    1 row

During the Clojure Neo4j Hackathon that we ran earlier this week this proved to be a particularly pleasing approach, as we could easily destructure the collection of maps in our Clojure code.

Reference: Neo4j: COLLECTing multiple values from our JCG partner Mark Needham at the Mark Needham Blog blog.
Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.