
Fixing Bugs – there’s no substitute for experience

We’ve all heard that the only way to get good at fixing bugs is through experience – the school of hard knocks. Experienced programmers aren’t afraid, because they’ve worked on hard problems before, and they know what to try when they run into another one: what has worked for them in the past, what hasn’t, what they’ve seen other programmers try, and what they learned from them. They’ve built up their own list of bug patterns and debugging patterns, and checklists and tools and techniques to follow. They know when to try a quick-and-dirty approach and use their gut, and when to be methodical and patient and scientific. They understand how to do binary slicing to reduce the size of the problem set. They know how to read traces and dump files. And they know the language and tools that they are working with.

It takes time and experience to know where to start looking and how to narrow in on a problem; what information is useful, what isn’t, and how to tell the difference. And how to do all of this fast. We’re back to knowing where to tap the hammer again.

But how much of a difference does experience really make?

Steve McConnell’s Code Complete is about programmer productivity: what makes some programmers better than others, and what all programmers can do to get better. His research shows that there can be as much as a 10x productivity difference in the quality, amount and speed of work that top programmers can do compared to programmers who don’t know what they are doing.

Debugging is one of the areas that really shows this difference, that separates the men from the boys and the women from the girls. Studies have found a 20-to-1 or even 25-to-1 difference in the time it takes experienced programmers to find the same set of defects found by inexperienced programmers. That’s not all. The best programmers also find significantly more defects and make far fewer mistakes when putting in fixes.

What’s more important: experience or good tools?
In Applied Software Measurement, Capers Jones looks at 4 different factors that affect the productivity of programmers finding and fixing bugs:

- Experience in debugging and maintenance
- How good – or bad – the code structure is
- The language and platform
- Whether the programmers have good code management and debugging tools – and know how to use them

Jones measures the debugging and bug fixing ability of a programmer by measuring assignment scope – the average amount of code that one programmer can maintain in a year. He says that the average programmer can maintain somewhere around 1,000 function points per year – about 50,000 lines of Java code.

Let’s look at some of this data to understand how much of a difference experience makes in fixing bugs.

Assignment scope: poor structure, high-level language, no maintenance tools

Staff            Worst   Average    Best
Inexperienced      150       300     500
Experienced      1,150     1,850   2,800

This data shows a roughly 20:1 difference between the best experienced programmers and the worst inexperienced programmers (2,800 versus 150), and about 6:1 on average (1,850 versus 300), on teams working with badly structured code and without good maintenance tools.

Now let’s look at the difference good tools can make:

Assignment scope: poor structure, high-level language, good tools

Staff            Worst   Average    Best
Inexperienced      900     1,400   2,100
Experienced      2,100     2,800   4,500

Using good tools for code navigation and refactoring, reverse engineering, profiling and debugging can help to level the playing field between novice programmers and experts. You’d have to be an idiot to ignore your tools (debuggers are for losers? Seriously?).

But even with today’s good tools, an experienced programmer will still win out: 2x more efficient on average (2,800 versus 1,400), and 5x from best to worst case (4,500 versus 900).

The difference can be effectively infinite in some cases. There are some bugs that an inexperienced programmer can’t solve at all – they have no idea where to look or what to do. They just don’t understand the language or the platform or the code or the problem well enough to be of any use. And they are more likely to make things worse by introducing new bugs while trying to fix something than they are to fix the bug in the first place. There’s no point in even asking them to try.

You can learn a lot about debugging from a good book like Debug It! or Code Complete. But when it comes to fixing bugs, there’s no substitute for experience.

Reference: Fixing Bugs – there’s no substitute for experience from our JCG partner Jim Bird at the Building Real Software blog.

25 things you’ve said in your career as a software engineer. Admit it!

This article is inspired by an older blog post. I’ve updated it to reflect modern languages and technologies.

1. “It works fine on MY computer. Come and see it in action if you don’t believe me.”
2. “Who did you log in as? Are you an administrator?”
3. “It’s not a bug, it’s a feature.”
4. “That’s weird…”
5. “It’s never done that before.”
6. “It worked yesterday.”
7. “How is that possible?”
8. “Have you checked your network connection/settings?” (Especially when the application is too sloooow)
9. “You must have entered wrong data and crashed it.”
10. “There is something funky in your data.”
11. “I haven’t touched that part of the code for weeks!”
12. “You must have the wrong library version.”
13. “It’s just some unlucky coincidence, so don’t bother.”
14. “I can’t unit test everything!”
15. “It’s not my fault. It must be that open source library I’ve used.”
16. “It works, but I didn’t write any unit tests.”
17. “Somebody must have changed my code.”
18. “Did you check for a virus on your system?”
19. “Even though it doesn’t work, how does it feel?”
20. “You can’t use that version on your operating system.”
21. “Why do you want to do it that way?”
22. “Where were you when the program blew up?”
23. “I’m pretty sure I’ve already fixed that.”
24. “Have you restarted your Application Server/DB Server/Machine after upgrading?”
25. “Which version of JRE / JDK / JVM have you installed?”

Feel free to add your own. I’m sure there are plenty!

Reference: 25 things you’ve said in your career as a software engineer. Admit it! from our JCG partner Papapetrou P. Patroklos at the Only Software matters blog.

Build Flow Jenkins Plugin

Most of us use Jenkins/Hudson to implement Continuous Integration/Delivery, and we manage job orchestration by combining Jenkins plugins such as build pipeline, parameterized-build, join or downstream-ext. All of them have to be configured, which spreads the orchestration logic across the configuration of multiple jobs and makes the overall system configuration very complex to maintain.

Build Flow enables us to define an upper-level flow item to manage job orchestration and link-up rules, using a dedicated DSL. Let’s see a very simple example.

The first step is installing the plugin. Go to Jenkins -> Manage Jenkins -> Plugin Manager -> Available and search for the CloudBees Build Flow plugin.

Then go to Jenkins -> New Job and you will see a new kind of job called Build Flow. In this example we are going to name it build-all-yy.

Now you only have to describe, using the flow DSL, how this job should orchestrate the other jobs. In the ‘Define build flow using flow DSL’ text area you can specify the sequence of commands to execute.

For this example I have already created two jobs, one executing the clean compile goals (job name yy-compile) and the other executing the javadoc goal (job name yy-javadoc). I know that this is not a realistic deployment pipeline, but for now it is enough.

We want the javadoc job to run after the project is compiled. To configure this we don’t have to create any upstream or downstream actions; we simply add the following lines to the DSL text area:

    build('yy-compile');
    build('yy-javadoc');

Save and execute the build-all-yy job, and both projects will be built sequentially.

Now suppose that we add a third job called yy-sonar, which runs the sonar goal to generate a code quality report. In this case it seems obvious that, once the project is compiled, the javadoc and code quality jobs can run in parallel. So the script is changed to:

    build('yy-compile')
    parallel (
        { build('yy-javadoc') },
        { build('yy-sonar') }
    )

The plugin also supports other operations such as retry (similar in behaviour to the retry-failed-job plugin) and guard-rescue, which works mostly like a try+finally block. You can also create parameterized builds, access the build execution, or print to the Jenkins console. The next example prints the build number of the yy-compile job execution:

    b = build('yy-compile')
    out.println b.build.number

Finally, you can get a quick graphical overview of the execution in the Status section. It could certainly be improved, but for now it is acceptable and can be used without any problem.

The Build Flow plugin is in its early stages; in fact it is only at version 0.4. But it will be a plugin to consider in the future, and I think it is good to know that it exists. Moreover, it is being developed by the CloudBees folks, which is a good sign that it will be fully supported by Jenkins.

Reference: Build Flow Jenkins Plugin from our JCG partner Alex Soto at the One Jar To Rule Them All blog.
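For readers more at home in java.util.concurrent than in the flow DSL, the ordering expressed above (compile first, then javadoc and sonar side by side) can be modelled in plain Java. This is only an analogy under stated assumptions: it does not call Jenkins or the Build Flow plugin, the runJob helper is a hypothetical stand-in that merely records a job name, and CompletableFuture requires Java 8 or later.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

public class FlowSketch {
    // Records the order in which the (simulated) jobs finished.
    static final List<String> finished = new CopyOnWriteArrayList<>();

    // Hypothetical stand-in for triggering a Jenkins job.
    static void runJob(String name) {
        finished.add(name);
    }

    public static void main(String[] args) {
        finished.clear();

        // Equivalent of build('yy-compile'): run it and wait for it to finish.
        CompletableFuture.runAsync(() -> runJob("yy-compile")).join();

        // Equivalent of parallel(...): both jobs start only after compile is done,
        // and the flow continues once both have finished.
        CompletableFuture<Void> javadoc = CompletableFuture.runAsync(() -> runJob("yy-javadoc"));
        CompletableFuture<Void> sonar = CompletableFuture.runAsync(() -> runJob("yy-sonar"));
        CompletableFuture.allOf(javadoc, sonar).join();

        System.out.println(finished);
    }
}
```

The compile step is always first in the recorded list; the relative order of the two parallel jobs is not fixed, which is exactly what running them in parallel implies.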

Java Executor Service Types

ExecutorService was introduced in Java 5. It extends the Executor interface and provides a thread pool facility for executing short asynchronous tasks. The Executors factory class offers several ways to create an ExecutorService; this article, based on the Java 6 API, looks at three of them.

    ExecutorService execService = Executors.newCachedThreadPool();

This creates a thread pool that creates new threads as needed, but reuses previously constructed threads when they are available. These pools typically improve the performance of programs that execute many short-lived asynchronous tasks. If no existing thread is available, a new thread is created and added to the pool. Threads that have not been used for 60 seconds are terminated and removed from the cache.

    ExecutorService execService = Executors.newFixedThreadPool(10);

This creates a thread pool that reuses a fixed number of threads. At most nThreads threads (here 10) are active at runtime. If additional tasks are submitted when all threads are active, they wait in the queue until a thread is available.

    ExecutorService execService = Executors.newSingleThreadExecutor();

This creates an Executor that uses a single worker thread operating off an unbounded queue. Tasks are guaranteed to execute sequentially, and no more than one task will be active at any given time.

Methods of the ExecutorService:

- execute(Runnable): Executes the given command at some time in the future.
- submit(Runnable): Returns a Future object which represents the submitted task. The Future’s get() method returns null once the task has completed successfully.
- shutdown(): Initiates an orderly shutdown in which previously submitted tasks are executed, but no new tasks will be accepted. Invocation has no additional effect if already shut down.
- shutdownNow(): Attempts to stop all actively executing tasks, halts the processing of waiting tasks, and returns a list of the tasks that were awaiting execution. There are no guarantees beyond best-effort attempts to stop processing actively executing tasks. For example, typical implementations will cancel via Thread.interrupt, so any task that fails to respond to interrupts may never terminate.

A sample application is below.

STEP 1 : CREATE MAVEN PROJECT

A Maven project is created. (It can be created by using Maven directly or an IDE plug-in.)

STEP 2 : CREATE A NEW TASK

A new task is created by implementing the Runnable interface. The TestTask class specifies the business logic which will be executed.

    package com.otv.task;

    import org.apache.log4j.Logger;

    /**
     * @author onlinetechvision.com
     * @since 24 Sept 2011
     * @version 1.0.0
     *
     */
    public class TestTask implements Runnable {
        private static Logger log = Logger.getLogger(TestTask.class);
        private String taskName;

        public TestTask(String taskName) {
            this.taskName = taskName;
        }

        public void run() {
            try {
                log.debug(this.taskName + " is sleeping...");
                Thread.sleep(3000);
                log.debug(this.taskName + " is running...");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the interrupt flag rather than swallowing it
            }
        }

        public String getTaskName() {
            return taskName;
        }

        public void setTaskName(String taskName) {
            this.taskName = taskName;
        }

    }

STEP 3 : CREATE TestExecutorService by using newCachedThreadPool

TestExecutorService is created by using the method newCachedThreadPool. In this case, threads are created on demand at runtime.

    package com.otv;

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    import com.otv.task.TestTask;

    /**
     * @author onlinetechvision.com
     * @since 24 Sept 2011
     * @version 1.0.0
     *
     */
    public class TestExecutorService {

        public static void main(String[] args) {
            ExecutorService execService = Executors.newCachedThreadPool();
            execService.execute(new TestTask("FirstTestTask"));
            execService.execute(new TestTask("SecondTestTask"));
            execService.execute(new TestTask("ThirdTestTask"));

            execService.shutdown();
        }
    }

When TestExecutorService is run, the output will be seen as below:

    24.09.2011 17:30:47 DEBUG (TestTask.java:21) - SecondTestTask is sleeping...
    24.09.2011 17:30:47 DEBUG (TestTask.java:21) - ThirdTestTask is sleeping...
    24.09.2011 17:30:47 DEBUG (TestTask.java:21) - FirstTestTask is sleeping...
    24.09.2011 17:30:50 DEBUG (TestTask.java:23) - ThirdTestTask is running...
    24.09.2011 17:30:50 DEBUG (TestTask.java:23) - FirstTestTask is running...
    24.09.2011 17:30:50 DEBUG (TestTask.java:23) - SecondTestTask is running...

STEP 4 : CREATE TestExecutorService by using newFixedThreadPool

TestExecutorService is created by using the method newFixedThreadPool. In this case, the thread count (2) is fixed when the pool is created.

    package com.otv;

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    import com.otv.task.TestTask;

    /**
     * @author onlinetechvision.com
     * @since 24 Sept 2011
     * @version 1.0.0
     *
     */
    public class TestExecutorService {

        public static void main(String[] args) {
            ExecutorService execService = Executors.newFixedThreadPool(2);
            execService.execute(new TestTask("FirstTestTask"));
            execService.execute(new TestTask("SecondTestTask"));
            execService.execute(new TestTask("ThirdTestTask"));

            execService.shutdown();
        }
    }

When TestExecutorService is run, ThirdTestTask is executed after the executions of FirstTestTask and SecondTestTask are completed. The output will be seen as below:

    24.09.2011 17:33:38 DEBUG (TestTask.java:21) - FirstTestTask is sleeping...
    24.09.2011 17:33:38 DEBUG (TestTask.java:21) - SecondTestTask is sleeping...
    24.09.2011 17:33:41 DEBUG (TestTask.java:23) - FirstTestTask is running...
    24.09.2011 17:33:41 DEBUG (TestTask.java:23) - SecondTestTask is running...
    24.09.2011 17:33:41 DEBUG (TestTask.java:21) - ThirdTestTask is sleeping...
    24.09.2011 17:33:44 DEBUG (TestTask.java:23) - ThirdTestTask is running...

STEP 5 : CREATE TestExecutorService by using newSingleThreadExecutor

TestExecutorService is created by using the method newSingleThreadExecutor. In this case, only one thread is created and tasks are executed sequentially.

    package com.otv;

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    import com.otv.task.TestTask;

    /**
     * @author onlinetechvision.com
     * @since 24 Sept 2011
     * @version 1.0.0
     *
     */
    public class TestExecutorService {

        public static void main(String[] args) {
            ExecutorService execService = Executors.newSingleThreadExecutor();
            execService.execute(new TestTask("FirstTestTask"));
            execService.execute(new TestTask("SecondTestTask"));
            execService.execute(new TestTask("ThirdTestTask"));

            execService.shutdown();
        }
    }

When TestExecutorService is run, SecondTestTask and ThirdTestTask are executed after the execution of FirstTestTask is completed. The output will be seen as below:

    24.09.2011 17:38:21 DEBUG (TestTask.java:21) - FirstTestTask is sleeping...
    24.09.2011 17:38:24 DEBUG (TestTask.java:23) - FirstTestTask is running...
    24.09.2011 17:38:24 DEBUG (TestTask.java:21) - SecondTestTask is sleeping...
    24.09.2011 17:38:27 DEBUG (TestTask.java:23) - SecondTestTask is running...
    24.09.2011 17:38:27 DEBUG (TestTask.java:21) - ThirdTestTask is sleeping...
    24.09.2011 17:38:30 DEBUG (TestTask.java:23) - ThirdTestTask is running...

STEP 6 : REFERENCES

http://download.oracle.com/javase/6/docs/api/java/util/concurrent/ExecutorService.html
http://tutorials.jenkov.com/java-util-concurrent/executorservice.html

Reference: Java Executor Service Types from our JCG partner Eren Avsarogullari at the Online Technology Vision blog.
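The sample application only uses execute(Runnable), so the submit and shutdown methods described earlier are not exercised. The sketch below is one possible way to see them in action. It is not part of the original sample: the class name SubmitSketch and the helper runOnce are made up for illustration, and it additionally uses submit(Callable) and awaitTermination, which exist on ExecutorService but are not discussed in the article.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class SubmitSketch {

    // Runs two tasks and returns what each Future produced, so the behaviour can be inspected.
    static Object[] runOnce() throws Exception {
        ExecutorService execService = Executors.newFixedThreadPool(2);
        try {
            // submit(Runnable): the Future carries no result, so get() yields null on success.
            Future<?> runnableFuture = execService.submit(new Runnable() {
                public void run() {
                    // placeholder for real work
                }
            });

            // submit(Callable): the Future carries the value returned by call().
            Future<Integer> callableFuture = execService.submit(new Callable<Integer>() {
                public Integer call() {
                    return 6 * 7;
                }
            });

            // get() blocks until the corresponding task has completed.
            return new Object[] { runnableFuture.get(), callableFuture.get() };
        } finally {
            // Orderly shutdown first; fall back to shutdownNow() if tasks do not finish in time.
            execService.shutdown();
            if (!execService.awaitTermination(5, TimeUnit.SECONDS)) {
                execService.shutdownNow();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Object[] results = runOnce();
        System.out.println("Runnable result: " + results[0]);
        System.out.println("Callable result: " + results[1]);
    }
}
```

Calling shutdown() in a finally block matters in practice: the worker threads of a fixed pool are not daemon threads by default, so a pool that is never shut down can keep the JVM alive.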

Build documentation to last – choose the agile way

Lately I have wondered what the best way to document a project is.

My documentation experience varies across different tools and methodologies. I would like to share some observations, and a conclusion about the best way to document a project.

Documentation can be classified into the following categories.

Documentation place:

- Inline in the code / version control system (via commit descriptions)
- On a separate server, linked to the code
- On a separate server, decoupled from the code (no direct linkage)

Documentation done by:

- Developers
- Product team / design team / architects
- Technical writers

Documentation tool:

- IDE
- Design documents
- Modeling tool, process flow
- Wiki
- Version control (e.g. git, svn) commits
- Interface documentation

Not surprisingly, there is a direct correlation between the tool used, the person who documents the code, the amount of documentation, the “distance” of the documentation from the code, and the accuracy of that documentation.

Given the categories above, documentation could be organized in the following manner:

Developers
    Place: inline in the code / version control system (via commit descriptions)
    Tool: IDE / version control

Product team / design team / architects
    Place: separate server, linked to the code
    Tool: design documents, interface documentation

Technical writers
    Place: separate server, decoupled from the code (no direct linkage)
    Tool: modeling tool, process flow, wiki

Developers tend to write short inline documentation in the IDE, complemented by well-defined interface semantics and well-written commit messages. The further the person documenting the functionality is from the code, the more decoupled from the code the documentation usually is, and the more comprehensive.

From my experience, even a good design tends to change a bit, and when the documentation is good but decoupled from the code, chances are that it won’t keep up with code changes.

In real life, when requirements keep coming from the business into development, they sometimes bring not only additional code to directly support the functionality, but often also a need for structural or infrastructure changes and refactoring.

Inline code documentation is agile and changes with minimum effort as the functionality changes. If developers commit code grouped by functionality and provide a good explanation of the changes that were made, this becomes the most up-to-date and accurate documentation.

I know that some of you might wonder about documenting heavy-duty design or complex functionality. I would recommend tackling these issues inline in the code as much as possible. For example, if you found a pattern or a bug solution on the web, put a link to it near the method or class that implements the solution. Try to model your code on known patterns so that it needs less documentation. Try to use conventions so as to reduce the amount of configuration and make your code flow more predictable and discoverable.

This approach is even more important when managing a project with an agile methodology. Such methodologies usually prefer direct communication with product/business to understand requirements over documented PRDs. This makes it even more important for the code to be self-explanatory and easy to find your way around. Moreover, frequent changes in design and business requirements would cause decoupled documentation to become obsolete quickly (or would make it hard to maintain).

Although it is easier said than done, and it is not a silver bullet for every project, writing documentation as close as possible to the code itself should be taken as a guideline and philosophy when developing a project.

Reference: Build documentation to last – choose the agile way from our JCG partner Gal Levinsky at the Gal Levinsky’s blog blog.
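As a small illustration of the advice above (name the pattern, keep the explanation next to the code, and link to the source of a solution), here is a hedged sketch. The class, its parameters and the URL are all hypothetical and exist only to show the shape of such inline documentation.

```java
/**
 * Computes retry delays using exponential backoff. The class is named after the
 * well-known pattern, so the name itself documents the approach.
 *
 * Background on why the delay is capped (hypothetical link, for illustration only):
 * https://example.com/blog/retry-storms
 */
public class ExponentialBackoff {
    private final long baseMillis;
    private final long maxMillis;

    public ExponentialBackoff(long baseMillis, long maxMillis) {
        this.baseMillis = baseMillis;
        this.maxMillis = maxMillis;
    }

    /**
     * Delay before the given attempt (0-based): base * 2^attempt, capped at maxMillis.
     */
    public long delayMillis(int attempt) {
        // The exponent is bounded so that typical base values cannot overflow a long.
        long delay = baseMillis * (1L << Math.min(attempt, 20));
        return Math.min(delay, maxMillis);
    }

    public static void main(String[] args) {
        ExponentialBackoff backoff = new ExponentialBackoff(100, 5000);
        for (int attempt = 0; attempt < 8; attempt++) {
            System.out.println("attempt " + attempt + ": " + backoff.delayMillis(attempt) + " ms");
        }
    }
}
```

Because the comment lives beside the code it describes, a change to the capping rule and a change to its explanation naturally land in the same commit.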

The Demise of IT Business Analysts

IT departments typically staff themselves with a group of folks known as business analysts.

These folks have the responsibility of talking to people who work in the business and translating their needs into a format that more technical folks can understand, because for some reason or other, most managers think that developers are unable to speak the same English that everyone else does.

IT is consistently cursed with getting the requirements wrong. Users, it seems, are horrible at actually reading the requirements documents they are given; worse, they always seem to hate the great software that IT gives them. IT management typically has two solutions to the problem:

1 – put a change management system in place that prevents those customers from actually changing their minds
2 – invest in better capability in requirements facilitation and documentation

Neither of these solutions addresses the underlying cause of churn in IT requirements, which is that today’s customer economy has created an environment where the benefit of many IT requirements is based entirely on assumptions. When these assumptions turn out to be wrong (and they often do) the business has no choice but to change direction, and fast.

Faced with this challenge, IT either jumps into heroics mode or shuts down into change request mode, or worse, some weird mixture of both.

No amount of requirements gathering expertise can fix this problem. For the IT Analyst function to be relevant in today’s world requires a different set of skills. What is required is an IT Analyst who is well versed in how to reverse engineer a business plan and associated IT solution into a set of testable assumptions that can be implemented and validated in order of highest risk. This analyst needs to be able to combine business knowledge with enough technical savvy to walk business customers along a journey of validated learning, guiding stakeholders through pivot-or-pursue decisions. They need to employ techniques like Business Model Generation, Service Design Thinking, and Customer Development to provide the business with a richer ecosystem of tools to brainstorm what the future should look like.

In short, Business Analysts need to become Business Technology Innovation Experts if they want to be more than mere order takers.

Reference: The Demise of IT Business Analysts from our JCG partner Alexis Hui at the Lean Transformation blog.

What’s Your Geek Number? My Points System To Rate Software Engineers

The ability to quickly and accurately rate the ability of software engineers after reading a résumé and conducting a relatively brief screening (as compared to full interviews) is very useful for those of us who recruit technologists. This skill is one of the elements that separates great recruiters from the rest, and it is admittedly not an exact science. As an experienced recruiter, I often work with candidates that I have represented in the past, so I will have the benefit of historical data to assess how strong a candidate’s skills are and how well he/she will perform. This data typically will include feedback from past job interviews, referrals from other engineers, discretionary ‘word on the street’ reference checks, and knowledge of a candidate’s job history. But how do I make a judgment on a new candidate that has no experience or history with me? I receive a résumé, we have a conversation, and I have to make the call as to whether this person is a top candidate. Without going into a deep technical interview, how do I surmise how strong this engineer’s technical skills are? This is one of the reasons clients hire me, as my screening should save them time to weed out those that are not qualified. After fourteen years in the business, I have noticed some distinct patterns regarding behaviors, traits, and small details that are shared by many of the top software engineers. Conversely, there are some noticeable trends that are indicative of engineers that are probably not in the top tier. For the sake of clarity, my definition of ‘top’ engineers is based on successful job interviews with clients, on the job performance while working for clients, and substantial anecdotal evidence from an engineer’s peers (the peer piece often being the most substantial factor). Below I have compiled a ‘points system’ for software engineers, with an assigned value for each characteristic. As you can see, some items are worth positive points, while other traits take points away. 
Again, this is by no means an exact science, and I imagine we will get some false positives (i.e. it is possible that some average engineers could score high), but I sincerely doubt many great engineers would score low. Although I have used some of these characteristics to help loosely evaluate engineering talent in the past, I have never assigned any specific points values until now.

What does my points score mean?

A score in the high 20s should be where we start seeing engineers that probably take their career pretty seriously and are willing and able to invest some time in bettering themselves. A score in the 30s would certainly be indicative of a candidate I would expect to perform well more often than not, and above 40 I would expect all candidates to be in the top tier. I know a few engineers who score in the 50 range and higher, and I don’t think you will find any individuals that are not widely admired and respected at this point level.

To be clear, I’m certainly not saying that engineers with blogs are always technically superior, or that having a five page résumé somehow makes you a bad engineer. The assigned points values are not how much that characteristic should be ‘valued’ by an employer, but rather the strength of the indicator. (Note the highest point value is assigned to Linux kernel work, which I have found in my experience to be the strongest indicator of good engineers – not all good engineers hack the kernel, but all engineers who hack the kernel are good.) These are simply patterns that I have observed while talking to thousands of engineers, and I’m sure many of you will disagree. Critiques are welcome – even encouraged (just be polite). If you have found indicators of great (or poor) engineers, I’d love to hear them – even if they are odd. Let’s debate!

(As my business is focused on engineers working within a specific set of languages primarily geared towards the open source world, this list is focused on fit for my purposes.
Those who work in Microsoft shops or develop embedded software, for example, might not score as well due to the open source leanings here.)

POSITIVE POINTS

- Linux kernel hacking, just for fun: +7
- Committer/contributor to open source project(s): +6
- Technical patent: +6
- Speaker at conferences/group events: +5
- Blog about tech (even if only occasionally): +4
- Attend user groups/meetups at least once a month: +4
- Regular reader of tech books/blogs: +4
- Speaker at in-house company trainings: +4
- Master’s in Computer Science: +3
- Tech hobby with mobile platforms, robotics, Arduino: +3
- Run Linux or other Unix flavor at home: +3
- Published author/editor of book or web material: +3 for articles, +8 for books
- Start-up/entrepreneurial experience: +3
- Active projects on GitHub, Sourceforge, Heroku, similar: +2 each
- In addition to your main ‘paid’ language/platform, some level of fluency in any of the following: Java, C++, Scala, Clojure, Lisp, Ruby, Erlang, Python, Haskell, iPhone/Android platform: +5 per language

QUICK ASIDE: Another element I tend to include in the ‘plus’ category is work experience for (+5) or a job offer (+2) from a certain subset of employers. Why? Because I know firsthand that these employers have very high standards in hiring and rarely make bad hires, so odds are if you have worked for or have been offered employment by one of these companies you are very likely good at what you do. I can’t make this list public, but I share this just to let you know another contributing factor.

NEGATIVE POINTS

- Worked over 12 years for current employer (few exceptions): -10
- Worked over 6 years for current employer (few exceptions): -5
- Only run Windows at home: -5
- 40+ tech buzzwords or acronyms on your résumé: -5
- AOL email address: -4
- Résumé is over three pages: -4
- Having three or more permanent jobs in a given two year period: -4
- GPA/SAT score on résumé and 10+ years experience: -2

Reference: What’s Your Geek Number? My Points System To Rate Software Engineers (without a full technical interview) from our JCG partner Dave Fecak at the Job Tips For Geeks blog.

Single Writer Principle

When trying to build a highly scalable system, the single biggest limitation on scalability is having multiple writers contend for any item of data or resource. Sure, algorithms can be bad, but let’s assume they have a reasonable Big O notation so we’ll focus on the scalability limitations of the systems design.

I keep seeing people just accept having multiple writers as the norm. There is a lot of research in computer science for managing this contention that boils down to 2 basic approaches. One is to provide mutual exclusion to the contended resource while the mutation takes place; the other is to take an optimistic strategy and swap in the changes if the underlying resource has not changed while you created the new copy.

Mutual Exclusion

Mutual exclusion is the means by which only one writer can have access to a protected resource at a time, and is usually implemented with a locking strategy. Locking strategies require an arbitrator, usually the operating system kernel, to get involved when the contention occurs to decide who gains access and in what order. This can be a very expensive process, often requiring many more CPU cycles than the actual transaction to be applied to the business logic would use. Those waiting to enter the critical section, in advance of performing the mutation, must queue, and this queuing effect (Little’s Law) causes latency to become unpredictable and ultimately restricts throughput.

Optimistic Concurrency Control

Optimistic strategies involve taking a copy of the data, modifying it, then copying back the changes if the data has not mutated in the meantime. If a change has happened in the meantime you repeat the process until successful. This repeating of the process increases with contention and therefore causes a queuing effect just like with mutual exclusion. If you work with a source code control system, such as Subversion or CVS, then you are using this algorithm every day.
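The copy, modify and swap-if-unchanged cycle can be sketched in Java with AtomicLong.compareAndSet. This is a minimal illustration rather than code from the article: the class OptimisticCounter and its method names are assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicLong;

public class OptimisticCounter {
    private final AtomicLong value = new AtomicLong();

    // Copy, modify, then swap in the new value only if nothing changed in the meantime.
    public long addOptimistically(long delta) {
        while (true) {
            long current = value.get();        // take a copy
            long updated = current + delta;    // modify the copy
            if (value.compareAndSet(current, updated)) {
                return updated;                // nobody else changed it: the swap succeeded
            }
            // Another writer got in first: loop and retry with a fresh copy.
        }
    }

    public long get() {
        return value.get();
    }

    public static void main(String[] args) throws InterruptedException {
        final OptimisticCounter counter = new OptimisticCounter();
        Runnable work = new Runnable() {
            public void run() {
                for (int i = 0; i < 100000; i++) {
                    counter.addOptimistically(1);
                }
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(counter.get()); // no increment is lost, whatever the interleaving
    }
}
```

The retry loop is where the queuing effect shows up: the more writers contend, the more often compareAndSet fails and the work has to be repeated.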
Optimistic strategies can work with data but do not work so well with resources such as hardware, because you cannot take a copy of the hardware! The ability to perform the changes to data atomically is made possible by CAS (compare-and-swap) instructions offered by the hardware.

Most locking strategies are composed from optimistic strategies for changing the lock state or mutual exclusion primitive.

Managing Contention vs. Doing Real Work

CPUs can typically process one or more instructions per cycle. For example, modern Intel CPU cores each have 6 execution units that can be doing a combination of arithmetic, branch logic, word manipulation and memory loads/stores in parallel. If, while doing work, the CPU core incurs a cache miss and has to go to main memory, it will stall for hundreds of cycles until the result of that memory request returns. To try and improve things, the CPU will make some speculative guesses as to what a memory request will return so that it can continue processing. If a second miss occurs the CPU will no longer speculate and will simply wait for the memory request to return, because it cannot typically keep the state for speculative execution beyond 2 cache misses. Managing cache misses is the single largest limitation to scaling the performance of our current generation of CPUs.

Now what does this have to do with managing contention? Well, if two or more threads are using locks to provide mutual exclusion, at best they will be going to the L3 cache, or over a socket interconnect, to access the shared state of the lock using CAS operations. These lock/CAS instructions cost tens of cycles in the best case when un-contended, plus they cause out-of-order execution for the CPU to be suspended and load/store buffers to be flushed. At worst, collisions occur and the kernel will need to get involved and put one or more of the threads to sleep until the lock is released. This rescheduling of the blocked thread will result in cache pollution.
The situation can be even worse when the thread is re-scheduled on another core with a cold cache resulting in many cache misses. For highly contended data it is very easy to get into a situation whereby the system spends significantly more time managing contention than doing real work. The table below gives an idea of basic costs for managing contention when the program state is very small and easy to reload from the L2/L3 cache, never mind main memory.

    Method                            Time (ms)
    One Thread                              300
    One Thread with Memory Barrier        4,700
    One Thread with CAS                   5,700
    Two Threads with CAS                 18,000
    One Thread with Lock                 10,000
    Two Threads with Lock               118,000

This table illustrates the costs of incrementing a 64-bit counter 500 million times using a variety of techniques on a 2.4GHz Westmere processor. I can hear people coming back with “but this is a trivial example and real-world applications are not that contended”. This is true but remember real-world applications have way more state, and what do you think happens to all that state which is warm in cache when the context switch occurs? By measuring the basic cost of contention it is possible to extrapolate the scalability limits of a system which has contention points. As multi-core becomes ever more significant another approach is required. My last post illustrates the micro level effects of CAS operations on modern CPUs, whereby Sandybridge can be worse for CAS and locks.

Single Writer Designs

Now, what if you could design a system whereby any item of data, or resource, is only mutated by a single writer/thread? It is actually easier than you think in my experience. It is OK if multiple threads, or other execution contexts, read the same data. CPUs can broadcast read only copies of data to other cores via the cache coherency sub-system. This has a cost but it scales very well.
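A minimal sketch of that idea in Java, assuming a counter that is owned by exactly one writer thread and read by any number of others. The class name is hypothetical, and the choice of `AtomicLong.lazySet` (an ordered store without a full fence) is an assumption made for this illustration rather than something taken from the post.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration: one thread owns all mutations, any thread may read.
class SingleWriterCounter {
    private final AtomicLong value = new AtomicLong(0);

    // Must only ever be called from the single owning writer thread.
    // Because there is no competing writer, no lock and no CAS retry loop are needed.
    void increment() {
        value.lazySet(value.get() + 1);
    }

    // Safe to call from any thread; readers never mutate the value.
    long read() {
        return value.get();
    }

    public static void main(String[] args) throws InterruptedException {
        SingleWriterCounter counter = new SingleWriterCounter();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 1_000_000; i++) {
                counter.increment();
            }
        });
        Thread reader = new Thread(() -> {
            long last = 0;
            while (last < 1_000_000) {
                last = counter.read(); // observes a monotonically growing value
            }
        });
        writer.start();
        reader.start();
        writer.join();
        reader.join();
        System.out.println(counter.read()); // prints 1000000
    }
}
```

The contract lives in the design rather than in the code: nothing stops a second thread from calling `increment()`, so ownership of the writer role has to be enforced by how the system is put together.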
If you have a system that can honour this single writer principle then each execution context can spend all its time and resources processing the logic for its purpose, and not be wasting cycles and resources on dealing with the contention problem. You can also scale up without limitation until the hardware is saturated. There is also a really nice benefit when working on architectures such as x86/x64, whose hardware memory model preserves the order of load/store memory operations: memory barriers are not required if you adhere strictly to the single writer principle. On x86/x64 ‘loads can be re-ordered with older stores’ according to the memory model, so memory barriers are required when multiple threads mutate the same data across cores. The single writer principle avoids this issue because it never has to deal with writing the latest version of a data item that may have been written by another thread and currently be in the store buffer of another core.

So how can we drive towards single writer designs? I’ve found it is a very natural thing. Consider how humans, or any other autonomous creatures of nature, operate with their model of the world. We all have our own model of the world contained in our own heads, i.e. we have a copy of the world state for our own use. We mutate the state in our heads based on inputs (events/messages) we receive via our senses. As we process these inputs and apply them to our model we may take action that produces outputs, which others can take as their own inputs. None of us reach directly into each other’s heads and mess with the neurons. If we did this it would be a serious breach of encapsulation! Originally, Object Oriented (OO) design was all about message passing, and somehow along the way we bastardised the message passing to be method calls and even allowed direct field manipulation – Yuk! Whose bright idea was it to allow public access to fields of an object? You deserve your own special hell.

At university I studied transputers and interesting languages like Occam. I thought very elegant designs appeared by having the nodes collaborate via message passing rather than mutating shared state. I’m sure some of this has inspired the Disruptor. My experience with the Disruptor has shown that it is possible to build systems with one or more orders of magnitude better throughput than locking or contended state based approaches. It also gives much more predictable latency that stays constant until the hardware is saturated rather than the traditional J-curve latency profile.

It is interesting to see the emergence of numerous approaches that lend themselves to single writer solutions such as Node.js, Erlang, Actor patterns, and SEDA to name a few. Unfortunately most use queue based implementations underneath, which breaks the single writer principle, whereas the Disruptor strives to separate the concerns so that the single writer principle can be preserved for the common cases.

Now I’m not saying locks and optimistic strategies are bad and should not be used. They are excellent for many problems, for example bootstrapping a concurrent system or making major state changes in configuration or reference data. However if the main flow of transactions acts on contended data, and locks or optimistic strategies have to be employed, then the scalability is fundamentally limited.

The Principle at Scale

This principle works at all levels of scale. Mandelbrot got this so right. CPU cores are just nodes of execution and the cache system provides message passing for communication. The same patterns apply if the processing node is a server and the communication system is a local network. If a service, in SOA architecture parlance, is the only service that can write to its data store it can be made to scale and perform much better.
Let’s say that the underlying data is stored in a database and other services can go directly to that data, without sending a message to the service that owns the data. The data is then contended and requires the database to manage the contention and coherence of that data. This prevents the service from caching copies of the data for faster response to the clients and restricts how the data can be sharded. Encapsulation has just been broken at a more macro level when multiple different services write to the same data store.

Summary

If a system is decomposed into components that keep their own relevant state model, without a central shared model, and all communication is achieved via message passing then you have a system without contention naturally. This type of system obeys the single writer principle if the message passing sub-system is not implemented as queues. If you cannot move straight to a model like this, but are finding scalability issues related to contention, then start by asking the question, “How do I change this code to preserve the Single Writer Principle and thus avoid the contention?”

The Single Writer Principle is that for any item of data, or resource, that item of data should be owned by a single execution context for all mutations.

Reference: Single Writer Principle from our JCG partner Martin Thompson at the Mechanical Sympathy blog.
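As a rough illustration of message passing that keeps the principle intact, the sketch below shows a single-producer, single-consumer ring buffer in Java. It is not the Disruptor and makes no claim to match its implementation. The class name, the power-of-two capacity and the sentinel-based `poll` are assumptions made for this example. The point is that every mutable item has exactly one writer: only the producer advances `tail`, only the consumer advances `head`, and each slot is written by the producer alone.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: valid for exactly one producer thread and one consumer thread.
class SpscRing {
    private final long[] buffer;
    private final int mask;
    private final AtomicLong tail = new AtomicLong(0); // written only by the producer
    private final AtomicLong head = new AtomicLong(0); // written only by the consumer

    SpscRing(int capacity) {
        if (Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        buffer = new long[capacity];
        mask = capacity - 1;
    }

    // Producer side: returns false when the ring is full.
    boolean offer(long value) {
        long t = tail.get();
        if (t - head.get() == buffer.length) {
            return false;
        }
        buffer[(int) (t & mask)] = value;
        tail.lazySet(t + 1); // ordered store publishes the slot to the consumer
        return true;
    }

    // Consumer side: returns emptyValue when nothing is available.
    long poll(long emptyValue) {
        long h = head.get();
        if (h == tail.get()) {
            return emptyValue;
        }
        long value = buffer[(int) (h & mask)];
        head.lazySet(h + 1); // ordered store frees the slot for the producer
        return value;
    }

    public static void main(String[] args) throws InterruptedException {
        final SpscRing ring = new SpscRing(1024);
        final int messages = 1_000_000;
        final long[] sum = new long[1];
        Thread consumer = new Thread(() -> {
            int received = 0;
            while (received < messages) {
                long v = ring.poll(Long.MIN_VALUE);
                if (v == Long.MIN_VALUE) {
                    Thread.yield();
                    continue;
                }
                sum[0] += v;
                received++;
            }
        });
        consumer.start();
        for (int i = 1; i <= messages; i++) {
            while (!ring.offer(i)) {
                Thread.yield();
            }
        }
        consumer.join();
        System.out.println(sum[0]); // prints 500000500000
    }
}
```

Adding a second producer would put two writers on `tail` and bring back exactly the contention the article warns about.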

ZK in Action: Styling and Layout

In the previous ZK in Action posts we went through implementing a CRUD feature using ZK MVVM. We also quickly went through some styling code that may deserve more explanation. In this post, we’ll go over how to append new CSS styling rules onto ZK widgets and how to override the existing styling. We’ll also introduce some basics of UI layout in ZK.

Objective

- Use ZK’s layout and container widgets to host the inventory CRUD feature we built in the previous posts.
- Style the ZK widgets

ZK Features in Action

- Borderlayout
- Hlayout
- Tabbox
- Include
- sclass
- zclass

Using Layouts and Containers

The Borderlayout and Hlayout

The Borderlayout divides the window into 5 sections: north, south, east, west and center. Without further ado let’s dissect the markup and see how it works:

    <window ...>
      <borderlayout width='100%' height='100%'>
        <north size='15%'>
          <hlayout width='100%' height='100%'>
            <label hflex='9' value='Alpha Dental' />
            <label hflex='1' value='Sign Out' ></label>
          </hlayout>
        </north>
        <east size='10%'></east>
        <center>
          <tabbox width='100%' height='100%' orient='vertical'>
            <tabs width='15%'>
              <tab label='Inventory' />
              <tab label='TBD' />
              <tab label='TBD'/>
            </tabs>
            <tabpanels>
              <tabpanel>
                <include src='inventory.zul'/>
              </tabpanel>
              <tabpanel></tabpanel>
              <tabpanel></tabpanel>
            </tabpanels>
          </tabbox>
        </center>
        <west size='10%' ></west>
        <south size='10%'></south>
      </borderlayout>
    </window>

- lines 3 and 27, the north and south widgets can be adjusted for height but not width
- lines 9 and 26, the east and west widgets can be adjusted for width but not height
- line 10, the center widget’s dimensions are dependent on those entered for the north, west, south, and east widgets
- lines 4 through 7, we wrap the two labels with an Hlayout so they’ll be displayed side by side proportionally with respect to the ‘hflex’ attribute we specified. That is, the Label assigned with hflex='9' has 9 times the width of the Label assigned with hflex='1'.
- each inner widget (north, west, etc.) can accept only a single child component, hence multiple widgets must be wrapped by a single container widget such as Hlayout before being placed inside the Borderlayout inner widgets
- line 11, we place a Tabbox element and set its orientation as vertical in anticipation of embedding our inventory CRUD feature inside it
- lines 12 to 16, we put the heading for each tab
- line 18, a Tabpanel is a container that holds a tab’s content
- line 19, we embed our inventory CRUD feature inside an Include tag. The widgets on inventory.zul will be attached to this page

Overriding the Existing ZK Styling Rules

The ZK default font properties and the background colours were modified so headings would be presented more prominently. Let’s quickly explain how this is accomplished. Using Chrome Developer Tools or the Firebug extension, we can inspect the source of our Borderlayout and find the ZK styling class for each ZK widget. From there we learned that the styling class for the body of the north region is z-north-body. Similarly, we could do the same for all the markup of interest and go ahead overriding their CSS styling rules:

    <zk>
      <style>
        .z-tab-ver .z-tab-ver-text { font-size: 18px; }
        .z-north-body, .z-south-body { background:#A3D1F0 }
        .z-east-body, .z-west-body { background:#F8F9FB }
      </style>
      <window border='none' width='100%' height='100%'>
        <borderlayout width='100%' height='100%'>
          <north size='15%'>...</north>
          <east size='10%'></east>
          <center>...</center>
          <west size='10%'></west>
          <south size='10%'></south>
        </borderlayout>
      </window>
    </zk>

Appending Additional Styling Rules via Style Attribute

Here we’re modifying the styling of the Labels contained in the North widget. Since we only want these two Labels, not all of them, to be affected by our new styling, it does not make sense for us to override the original styling as we did before.
For these isolated modifications, it suffices to simply assign the styling rules to the ‘style’ attribute that comes with the ZK widgets:

    <north size='15%'>
      <hlayout width='100%' height='100%'>
        <label value='Alpha Dental' style='font-size: 32px; font-style: italic; font-weight:bold; color:white; margin-left:8px;'/>
        <label value='Sign Out' style='font-size: 14px; font-weight:bold; color:grey; line-height:26px'></label>
      </hlayout>
    </north>
    ...

Appending Additional Styling Rules via Sclass

An alternative to assigning styling rules directly in the markup, which pollutes the code, is to declare a styling class, abbreviated as ‘sclass’, and assign the class name to the ‘sclass’ attribute as shown here:

    <zk>
      <style>
        .company-heading { font-size: 32px; font-style: italic; font-weight:bold; color:white; margin-left:8px; }
      </style>
      <window ...>
        <borderlayout ...>
          <north ...>
            <label value='Alpha Dental' sclass='company-heading'/>
          </north>
          ...
        </borderlayout>
      </window>
    </zk>

In a Nutshell

- Three ways to modify the default ZK styling are covered in this post: override the existing ZK styling class, assign styling rules directly to a widget’s style attribute, or define a CSS class in a CSS file or inside a Style tag then assign the class to the widget’s sclass attribute
- Use a developer tool (such as Firebug) to inspect the ZK widgets and find out which ZK style class to override
- The hflex attribute allows developers to define widgets’ width proportionally with respect to each other
- Layout widgets help developers to divide the presentation window into sections

Related links: ZK Styling Guide, Borderlayout, Hlayout, Hflex

Reference: ZK in Action [4] : Styling and Layout from our JCG partner Lance Lu at the Tech Dojo blog.
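To make the hflex proportions concrete, here is a small worked calculation in plain Java. It is a simplified model, not ZK’s actual layout code: it assumes the available width is split strictly in proportion to the flex values and ignores spacing, margins and rounding effects. The class and method names are made up for this example.

```java
// Hypothetical helper: splits a container width in proportion to hflex-style values.
class HflexShares {

    // Returns one width per child, proportional to its flex value (integer division).
    static int[] widths(int totalWidth, int... flex) {
        int sum = 0;
        for (int f : flex) {
            sum += f;
        }
        int[] result = new int[flex.length];
        for (int i = 0; i < flex.length; i++) {
            result[i] = totalWidth * flex[i] / sum;
        }
        return result;
    }

    public static void main(String[] args) {
        // Two labels with hflex='9' and hflex='1' inside a 1000px wide Hlayout.
        int[] w = widths(1000, 9, 1);
        System.out.println(w[0] + " " + w[1]); // prints 900 100
    }
}
```

Under this model the ‘Alpha Dental’ label gets nine tenths of the row and ‘Sign Out’ the remaining tenth, which matches the 9-to-1 ratio described for the north region.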

Oracle Service Bus – Stuck Thread Case Study

This case study describes the complete root cause analysis process of a stuck thread problem experienced with Oracle Service Bus 11g running on AIX 6.1 and IBM Java VM 1.6. This article is also a great opportunity for you to improve your thread dump analysis skills and I highly recommend that you study and properly understand the following analysis approach. It will also demonstrate the importance of proper data gathering as opposed to premature middleware (Weblogic) restarts.

Environment specifications

- Java EE server: Oracle Service Bus 11g
- OS: AIX 6.1
- JDK: IBM JRE 1.6.0 @64-bit
- RDBMS: Oracle 10g
- Platform type: Enterprise Service Bus

Troubleshooting tools

- Quest Software Foglight for Java (monitoring and alerting)
- Java VM Thread Dump (IBM JRE javacore format)

Problem overview

Major performance degradation was observed in our Oracle Service Bus Weblogic environment. Alerts were also sent from the Foglight agents indicating a significant surge in Weblogic threads utilization.

Gathering and validation of facts

As usual, a Java EE problem investigation requires gathering of technical and non-technical facts so we can either derive other facts and/or conclude on the root cause. Before applying a corrective measure, the facts below were verified in order to conclude on the root cause:

- What is the client impact? HIGH
- Recent change of the affected platform? Yes, logging level changed in OSB console for a few business services prior to outage report
- Any recent traffic increase to the affected platform? No
- How long has this problem been observed? New problem observed following logging level changes
- Did a restart of the Weblogic server resolve the problem? Yes

Conclusion #1: The logging level changes applied earlier on some OSB business services appear to have triggered this stuck thread problem. However, the root cause remains unknown at this point.
Weblogic threads monitoring: Foglight for Java

Foglight for Java (from Quest Software) is a great monitoring tool allowing you to completely monitor any Java EE environment along with full alerting capabilities. This tool is used in our production environment to monitor the middleware (Weblogic) data, including threads, for each of the Oracle Service Bus managed servers. The monitoring data showed a consistent increase of the threads along with a pending request queue.

For your reference, Weblogic slow running threads are identified as “Hogging Threads” and can eventually be promoted to “STUCK” status if running for several minutes (as per your configured threshold). Now what should be your next course of action? Weblogic restart? Definitely not… Your first “reflex” for this type of problem is to capture a JVM Thread Dump. Such data is critical for you to perform proper root cause analysis and understand the potential hanging condition. Once such data is captured, you can then proceed with Weblogic server recovery actions such as a full managed server (JVM) restart.

Stuck Threads: Thread Dump to the rescue!

The next course of action in this outage scenario was to quickly generate a few thread dump snapshots from the IBM JVM before attempting to recover the affected Weblogic instances. The thread dumps were generated using kill -3 <Java PID>, which produced a few javacore files at the root of the Weblogic domain, for example:

    javacore.20120610.122028.15149052.0001.txt

Once the production environment was back up and running, the team quickly proceeded with the analysis of the captured thread dump files as per the steps below.

Thread Dump analysis step #1 – identify a thread execution pattern

The first analysis step is to quickly go through all the Weblogic threads and attempt to identify a common problematic pattern such as threads waiting on remote external systems, threads in deadlock state, threads waiting for other threads to complete their tasks etc.
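The pattern hunt described in step #1 can also be done from inside a running JVM. The sketch below is not what was used in this case study, where javacore files were produced with kill -3. It is a hedged illustration using the standard `java.lang.management` API to list threads in BLOCKED state together with the lock they wait for and the thread holding it. The class name and the small `demo()` scenario (a "holder" and a "waiter" thread) are made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical helper: reports threads blocked on an object monitor and who owns it.
class BlockedThreadReport {

    static List<String> blockedThreads() {
        List<String> report = new ArrayList<>();
        ThreadInfo[] infos = ManagementFactory.getThreadMXBean().dumpAllThreads(false, false);
        for (ThreadInfo info : infos) {
            if (info.getThreadState() == Thread.State.BLOCKED) {
                report.add(info.getThreadName() + " waits for " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return report;
    }

    // Builds a tiny blocking scenario: "holder" keeps a monitor, "waiter" queues on it.
    static List<String> demo() {
        final Object monitor = new Object();
        final CountDownLatch held = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            synchronized (monitor) {
                held.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "holder");
        Thread waiter = new Thread(() -> {
            synchronized (monitor) {
                // nothing to do; we only want to queue on the monitor
            }
        }, "waiter");
        try {
            holder.start();
            held.await();
            waiter.start();
            while (waiter.getState() != Thread.State.BLOCKED) {
                Thread.sleep(5);
            }
            return blockedThreads();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown(); // let both threads finish
        }
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Many lines naming the same lock owner are the in-process equivalent of the blocking pattern this analysis looks for in the javacore files.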
The analysis did quickly reveal many threads involved in the same blocking situation as per below. In this sample, we can see an Oracle Service Bus thread in blocked state within the TransactionManager Java class (OSB kernel code).

    "[ACTIVE] ExecuteThread: '292' for queue: 'weblogic.kernel.Default (self-tuning)'" J9VMThread:0x0000000139B76B00, j9thread_t:0x000000013971C9A0, java/lang/Thread:0x07000000F9D80630, state:B, prio=5
    (native thread ID:0x2C700D1, native priority:0x5, native policy:UNKNOWN)
    Java callstack:
        at com/bea/wli/config/transaction/TransactionManager._beginTransaction(TransactionManager.java:547(Compiled Code))
        at com/bea/wli/config/transaction/TransactionManager.beginTransaction(TransactionManager.java:409(Compiled Code))
        at com/bea/wli/config/derivedcache/DerivedResourceManager.getDerivedValueInfo(DerivedResourceManager.java:339(Compiled Code))
        at com/bea/wli/config/derivedcache/DerivedResourceManager.get(DerivedResourceManager.java:386(Compiled Code))
        at com/bea/wli/sb/resources/cache/DefaultDerivedTypeDef.getDerivedValue(DefaultDerivedTypeDef.java:106(Compiled Code))
        at com/bea/wli/sb/pipeline/RouterRuntimeCache.getRuntime(RouterRuntimeCache.java(Compiled Code))
        at com/bea/wli/sb/pipeline/RouterManager.getRouterRuntime(RouterManager.java:640(Compiled Code))
        at com/bea/wli/sb/pipeline/RouterContext.getInstance(RouterContext.java:172(Compiled Code))
        at com/bea/wli/sb/pipeline/RouterManager.processMessage(RouterManager.java:579(Compiled Code))
        at com/bea/wli/sb/transports/TransportManagerImpl.receiveMessage(TransportManagerImpl.java:375(Compiled Code))
        at com/bea/wli/sb/transports/local/LocalMessageContext$1.run(LocalMessageContext.java:179(Compiled Code))
        at weblogic/security/acl/internal/AuthenticatedSubject.doAs(AuthenticatedSubject.java:363(Compiled Code))
        at weblogic/security/service/SecurityManager.runAs(SecurityManager.java:146(Compiled Code))
        at weblogic/security/Security.runAs(Security.java:61(Compiled Code))
        at com/bea/wli/sb/transports/local/LocalMessageContext.send(LocalMessageContext.java:144(Compiled Code))
        at com/bea/wli/sb/transports/local/LocalTransportProvider.sendMessageAsync(LocalTransportProvider.java:322(Compiled Code))
        at sun/reflect/GeneratedMethodAccessor980.invoke(Bytecode PC:58(Compiled Code))
        at sun/reflect/DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37(Compiled Code))
        at java/lang/reflect/Method.invoke(Method.java:589(Compiled Code))
        at com/bea/wli/sb/transports/Util$1.invoke(Util.java:83(Compiled Code))
        at $Proxy111.sendMessageAsync(Bytecode PC:26(Compiled Code))
        ……………………………

Thread Dump analysis step #2 – review the blocked threads chain

The next step was to review the affected and blocked threads chain involved in our identified pattern. As we saw in the thread dump analysis part 4, the IBM JVM thread dump format contains a separate section that provides a full breakdown of all thread blocked chains e.g. the Java object monitor pool locks. A quick analysis did reveal the following thread culprit as per below. As you can see, Weblogic thread #16 is the actual culprit with 300+ threads waiting to acquire a lock on a shared object monitor TransactionManager@0x0700000001A51610/0x0700000001A51628.
    2LKMONINUSE sys_mon_t:0x000000012CCE2688 infl_mon_t: 0x000000012CCE26C8:
    3LKMONOBJECT com/bea/wli/config/transaction/TransactionManager@0x0700000001A51610/0x0700000001A51628: Flat locked by "[ACTIVE] ExecuteThread: '16' for queue: 'weblogic.kernel.Default (self-tuning)'" (0x000000012CA3C800), entry count 1
    3LKWAITERQ Waiting to enter:
    3LKWAITER "[STUCK] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'" (0x000000011C785C00)
    3LKWAITER "[STUCK] ExecuteThread: '1' for queue: 'weblogic.kernel.Default (self-tuning)'" (0x000000011CA93200)
    3LKWAITER "[STUCK] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)'" (0x000000011B3F2B00)
    3LKWAITER "[STUCK] ExecuteThread: '3' for queue: 'weblogic.kernel.Default (self-tuning)'" (0x000000011619B300)
    3LKWAITER "[STUCK] ExecuteThread: '4' for queue: 'weblogic.kernel.Default (self-tuning)'" (0x000000012CBE8000)
    3LKWAITER "[STUCK] ExecuteThread: '21' for queue: 'weblogic.kernel.Default (self-tuning)'" (0x000000012BE91200)
    ..................

Thread Dump analysis step #3 – thread culprit deeper analysis

Once you identify a primary culprit thread, the next step is to perform a deeper review of the computing task this thread is currently performing. Simply go back to the raw thread dump data and start analyzing the culprit thread stack trace from the bottom up. As you can see below, the thread stack trace for our problem case was quite revealing. It shows that thread #16 is currently attempting to commit a change made at the Weblogic / Oracle Service Bus level. The problem is that the commit operation is hanging (the top frames show the thread parked while waiting for a ReentrantReadWriteLock write lock inside DerivedCache$Purger.changesCommitted) and taking too much time, causing thread #16 to retain the shared object monitor lock from TransactionManager for too long and “starving” the other Oracle Service Bus Weblogic threads.

    "[ACTIVE] ExecuteThread: '16' for queue: 'weblogic.kernel.Default (self-tuning)'" J9VMThread:0x000000012CA3C800, j9thread_t:0x000000012C9F0F40, java/lang/Thread:0x0700000026FCE120, state:P, prio=5
    (native thread ID:0x35B0097, native priority:0x5, native policy:UNKNOWN)
    Java callstack:
        at sun/misc/Unsafe.park(Native Method)
        at java/util/concurrent/locks/LockSupport.park(LockSupport.java:184(Compiled Code))
        at java/util/concurrent/locks/AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:822)
        at java/util/concurrent/locks/AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:853(Compiled Code))
        at java/util/concurrent/locks/AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1189(Compiled Code))
        at java/util/concurrent/locks/ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:911(Compiled Code))
        at com/bea/wli/config/derivedcache/DerivedCache$Purger.changesCommitted(DerivedCache.java:80)
        at com/bea/wli/config/impl/ResourceListenerNotifier.afterEnd(ResourceListenerNotifier.java:120)
        at com/bea/wli/config/transaction/TransactionListenerWrapper.afterEnd(TransactionListenerWrapper.java:90)
        at com/bea/wli/config/transaction/TransactionManager.notifyAfterEnd(TransactionManager.java:1154(Compiled Code))
        at com/bea/wli/config/transaction/TransactionManager.commit(TransactionManager.java:1519(Compiled Code))
        at com/bea/wli/config/transaction/TransactionManager._endTransaction(TransactionManager.java:842(Compiled Code))
        at com/bea/wli/config/transaction/TransactionManager.endTransaction(TransactionManager.java:783(Compiled Code))
        at com/bea/wli/config/deployment/server/ServerDeploymentReceiver$2.run(ServerDeploymentReceiver.java:275)
        at weblogic/security/acl/internal/AuthenticatedSubject.doAs(AuthenticatedSubject.java:321(Compiled Code))
        at weblogic/security/service/SecurityManager.runAs(SecurityManager.java:120(Compiled Code))
        at com/bea/wli/config/deployment/server/ServerDeploymentReceiver.commit(ServerDeploymentReceiver.java:260)
        at weblogic/deploy/service/internal/targetserver/DeploymentReceiverCallbackDeliverer.doCommitCallback(DeploymentReceiverCallbackDeliverer.java:195)
        at weblogic/deploy/service/internal/targetserver/DeploymentReceiverCallbackDeliverer.access$100(DeploymentReceiverCallbackDeliverer.java:13)
        at weblogic/deploy/service/internal/targetserver/DeploymentReceiverCallbackDeliverer$2.run(DeploymentReceiverCallbackDeliverer.java:68)
        at weblogic/work/SelfTuningWorkManagerImpl$WorkAdapterImpl.run(SelfTuningWorkManagerImpl.java:528(Compiled Code))
        at weblogic/work/ExecuteThread.execute(ExecuteThread.java:203(Compiled Code))
        at weblogic/work/ExecuteThread.run(ExecuteThread.java:171(Compiled Code))

Root cause: connecting the dots

At this point the collection of the facts and thread dump analysis did allow us to determine the chain of events as per below:

- Logging level change applied by the production Oracle Service Bus administrator
- Weblogic deployment thread #16 was unable to commit the change properly
- Weblogic runtime threads executing client requests quickly started to queue up and wait for a lock on the shared object monitor (TransactionManager)
- The Weblogic instances ran out of threads, generating alerts and forcing the production support team to shut down and restart the affected JVM processes

Our team is planning to open an Oracle SR shortly to share this OSB deployment behaviour along with the hard dependency between the client requests (threads) and the OSB logging layer. In the meantime, no OSB logging level change will be attempted outside the maintenance window period until further notice.
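The blocked chain review from step #2 can be partly automated when a javacore contains hundreds of waiters. The sketch below is a hypothetical helper, not a tool used in this case study. It assumes the lock section has one record per line with the 3LKMONOBJECT and 3LKWAITER tags shown in the javacore extract above, and it simply counts how many waiters queue behind each owning thread.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: counts "3LKWAITER" records per monitor owner in javacore lock data.
class JavacoreLockSummary {
    private static final String OWNER_MARKER = "locked by \"";

    static Map<String, Integer> waitersByOwner(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        String owner = null;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.startsWith("3LKMONOBJECT")) {
                // Owner name sits between: locked by "  and the closing  " (0x...)
                int start = line.indexOf(OWNER_MARKER);
                int end = line.lastIndexOf("\" (");
                owner = (start >= 0 && end > start)
                        ? line.substring(start + OWNER_MARKER.length(), end)
                        : null; // monitor without an owner; ignore its waiters
                if (owner != null) {
                    counts.putIfAbsent(owner, 0);
                }
            } else if (line.startsWith("3LKWAITER") && !line.startsWith("3LKWAITERQ")
                    && owner != null) {
                counts.merge(owner, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Shortened sample in the same shape as the extract shown in this article.
        List<String> sample = Arrays.asList(
            "3LKMONOBJECT com/bea/wli/config/transaction/TransactionManager@0x0700000001A51610/0x0700000001A51628: "
                + "Flat locked by \"[ACTIVE] ExecuteThread: '16' for queue: 'weblogic.kernel.Default (self-tuning)'\" "
                + "(0x000000012CA3C800), entry count 1",
            "3LKWAITERQ Waiting to enter:",
            "3LKWAITER \"[STUCK] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'\" (0x000000011C785C00)",
            "3LKWAITER \"[STUCK] ExecuteThread: '1' for queue: 'weblogic.kernel.Default (self-tuning)'\" (0x000000011CA93200)");
        waitersByOwner(sample).forEach((owner, n) -> System.out.println(n + " waiter(s) behind " + owner));
    }
}
```

Sorting the result by waiter count points straight at the culprit thread, which is then worth the deeper stack trace review described in step #3.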
Conclusion

I hope this article has helped you understand and appreciate how powerful thread dump analysis can be to pinpoint the root cause of stuck thread problems, and the importance for any Java EE production support team of capturing such crucial data in order to prevent future re-occurrences. Please do not hesitate to post any comment or question.

Reference: Oracle Service Bus – Stuck Thread Case Study from our JCG partner Pierre-Hugues Charbonneau at the Java EE Support Patterns & Java Tutorial blog.
Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.