What's New Here?


Working with Google Analytics API v4 for Android

For v4 of the Google Analytics API for Android, Google has moved the implementation into Google Play Services. As part of the move the EasyTracker class has been removed, but it is still possible to get a fairly simple ‘automatic’ Tracker up and running with little effort. In this post I’ll show you how.

Assumptions:

- You’re already using the Google Analytics v3 API EasyTracker class and just want to do a basic migration to v4, or you just want to set up a basic analytics Tracker that sends a hit when the user starts an activity.
- You already have the latest Google Play Services up and running in your Android app.

Let’s get started. Because you already have the Google Play Services library in your build, all the necessary helper classes will already be available to your code (if not, see here). The v4 Google Analytics API has a number of helper classes and configuration options which can make getting up and running fairly straightforward, but I found the documentation to be a little unclear, so here’s what to do…

Step 1. Create the following global_tracker.xml config file and add it to your Android application’s res/xml folder. This will be used by the GoogleAnalytics class as its basic global config. You’ll need to customise the screen names for your app. Note that there is no ‘Tracking ID’ in this file – that comes later. Of note here is the ga_dryRun element, which is used to switch the sending of tracking reports to Google Analytics on or off. You can use this setting in debug builds to prevent live and debug data getting mixed up.
    <?xml version="1.0" encoding="utf-8"?>
    <resources xmlns:tools="http://schemas.android.com/tools"
               tools:ignore="TypographyDashes">

        <!-- the Local LogLevel for Analytics -->
        <string name="ga_logLevel">verbose</string>

        <!-- how often the dispatcher should fire -->
        <integer name="ga_dispatchPeriod">30</integer>

        <!-- Treat events as test events and don't send to google -->
        <bool name="ga_dryRun">false</bool>

        <!-- The screen names that will appear in reports -->
        <string name="com.mycompany.MyActivity">My Activity</string>

    </resources>

Step 2. Now add a second file, “app_tracker.xml”, to the same folder location (res/xml). There are a few things of note in this file. You should change the ga_trackingId to the Google Analytics Tracking ID for your app (you get this from the Analytics console). Setting ga_autoActivityTracking to ‘true’ is important for this tutorial – this makes setting up and sending tracking hits from your code much simpler. Finally, be sure to customise your screen names, adding one for each activity where you’ll be adding tracking code.
    <?xml version="1.0" encoding="utf-8"?>
    <resources xmlns:tools="http://schemas.android.com/tools"
               tools:ignore="TypographyDashes">

        <!-- The app's Analytics Tracking Id -->
        <string name="ga_trackingId">UX-XXXXXXXX-X</string>

        <!-- Percentage of events to include in reports -->
        <string name="ga_sampleFrequency">100.0</string>

        <!-- Enable automatic Activity measurement -->
        <bool name="ga_autoActivityTracking">true</bool>

        <!-- catch and report uncaught exceptions from the app -->
        <bool name="ga_reportUncaughtExceptions">true</bool>

        <!-- How long a session exists before giving up -->
        <integer name="ga_sessionTimeout">-1</integer>

        <!-- If ga_autoActivityTracking is enabled, an alternate screen name can
             be specified to substitute for the full-length canonical Activity
             name in screen view hits. To specify an alternate screen name use a
             <screenName> element, with the name attribute specifying the
             canonical name, and the value the alias to use instead. -->
        <screenName name="com.mycompany.MyActivity">My Activity</screenName>

    </resources>

Step 3. Last in terms of config, modify your AndroidManifest.xml by adding the following line within the ‘application’ element. This configures the GoogleAnalytics class (a singleton which controls the creation of Tracker instances) with the basic configuration in the res/xml/global_tracker.xml file.

    <!-- Google Analytics Version v4 needs this value for easy tracking -->
    <meta-data android:name="com.google.android.gms.analytics.globalConfigResource"
               android:resource="@xml/global_tracker" />

That’s all the basic XML configuration done.

Step 4. We can now add (or modify) your application’s ‘Application’ class so it contains some Trackers that we can reference from our activities.
    package com.mycompany;

    import android.app.Application;

    import com.google.android.gms.analytics.GoogleAnalytics;
    import com.google.android.gms.analytics.Tracker;

    import java.util.HashMap;

    public class MyApplication extends Application {

        // The following line should be changed to include the correct property id.
        private static final String PROPERTY_ID = "UX-XXXXXXXX-X";

        // Logging TAG
        private static final String TAG = "MyApp";

        public static int GENERAL_TRACKER = 0;

        public enum TrackerName {
            APP_TRACKER,       // Tracker used only in this app.
            GLOBAL_TRACKER,    // Tracker used by all the apps from a company. eg: roll-up tracking.
            ECOMMERCE_TRACKER, // Tracker used by all ecommerce transactions from a company.
        }

        HashMap<TrackerName, Tracker> mTrackers = new HashMap<TrackerName, Tracker>();

        public MyApplication() {
            super();
        }

        synchronized Tracker getTracker(TrackerName trackerId) {
            if (!mTrackers.containsKey(trackerId)) {
                GoogleAnalytics analytics = GoogleAnalytics.getInstance(this);
                Tracker t = (trackerId == TrackerName.APP_TRACKER)
                        ? analytics.newTracker(R.xml.app_tracker)
                        : (trackerId == TrackerName.GLOBAL_TRACKER)
                            ? analytics.newTracker(PROPERTY_ID)
                            : analytics.newTracker(R.xml.ecommerce_tracker);
                mTrackers.put(trackerId, t);
            }
            return mTrackers.get(trackerId);
        }
    }

Step 5. At last we can now add some actual hit-tracking code to our activity. First, import the class com.google.android.gms.analytics.GoogleAnalytics and initialise the application-level tracker in your activity’s onCreate() method. Do this in each activity you want to track.

    // Get a Tracker (should auto-report)
    ((MyApplication) getApplication()).getTracker(MyApplication.TrackerName.APP_TRACKER);

Then, in onStart(), record a user start ‘hit’ with Analytics when the activity starts up. Do this in each activity you want to track.

    // Get an Analytics tracker to report app starts & uncaught exceptions etc.
    GoogleAnalytics.getInstance(this).reportActivityStart(this);

Finally, record the end of the user’s activity by sending a stop hit to Analytics in the onStop() method of your Activity. Do this in each activity you want to track.

    // Stop the analytics tracking
    GoogleAnalytics.getInstance(this).reportActivityStop(this);

And finally… If you now compile and install your app on your device and start it up – assuming you set ga_logLevel to verbose and ga_dryRun to false – in logCat you should see some of the following log lines confirming your hits being sent to Google Analytics.

    04-11 13:25:05.026 13287-13304/com.mycompany.myapp V/GAV3? Thread[GAThread,5,main]: connecting to Analytics service
    04-11 13:25:05.030 13287-13304/com.mycompany.myapp V/GAV3? Thread[GAThread,5,main]: connect: bindService returned false for Intent { act=com.google.android.gms.analytics.service.START cmp=com.google.android.gms/.analytics.service.AnalyticsService (has extras) }
    04-11 13:25:05.030 13287-13304/com.mycompany.myapp W/GAV3? Thread[GAThread,5,main]: Service unavailable (code=1), will retry.
    04-11 13:25:05.043 13287-13304/com.mycompany.myapp V/GAV3? Thread[GAThread,5,main]: Loaded clientId
    04-11 13:25:05.045 13287-13304/com.mycompany.myapp I/GAV3? Thread[GAThread,5,main]: No campaign data found.
    04-11 13:25:05.045 13287-13304/com.mycompany.myapp V/GAV3? Thread[GAThread,5,main]: Initialized GA Thread
    04-11 13:25:05.067 13287-13304/com.mycompany.myapp V/GAV3? Thread[GAThread,5,main]: putHit called
    ...
    04-11 13:25:10.106 13287-13304/com.mycompany.myapp V/GAV3? Thread[GAThread,5,main]: Dispatch running...
    04-11 13:25:10.623 13287-13304/com.mycompany.myapp V/GAV3? Thread[GAThread,5,main]: sent 1 of 1 hits

Even better, if you’re logged into the Google Analytics console’s reporting dashboard, on the ‘Real Time – Overview’ page, you may even notice the following…

Reference: Working with Google Analytics API v4 for Android from our JCG partner Ben Wilcock at Ben Wilcock’s blog....
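As an aside, the synchronized lazy-initialisation pattern used by getTracker() in Step 4 is worth seeing in isolation. Below is a minimal plain-Java sketch of that pattern, using a stand-in Tracker class because the real Play Services classes only exist on an Android device – the names in this sketch are illustrative, not part of the Analytics API:

```java
import java.util.EnumMap;
import java.util.Map;

public class TrackerCache {

    // Stand-in for com.google.android.gms.analytics.Tracker (illustrative only).
    public static class Tracker {
        private final String name;
        public Tracker(String name) { this.name = name; }
        public String getName() { return name; }
    }

    public enum TrackerName { APP_TRACKER, GLOBAL_TRACKER, ECOMMERCE_TRACKER }

    // EnumMap is a natural fit when every key is an enum constant.
    private final Map<TrackerName, Tracker> trackers =
            new EnumMap<>(TrackerName.class);

    // Create each tracker at most once; synchronized so that two threads
    // asking for the same tracker never build two instances.
    public synchronized Tracker getTracker(TrackerName id) {
        return trackers.computeIfAbsent(id, key -> new Tracker(key.name()));
    }

    public static void main(String[] args) {
        TrackerCache cache = new TrackerCache();
        Tracker first = cache.getTracker(TrackerName.APP_TRACKER);
        Tracker second = cache.getTracker(TrackerName.APP_TRACKER);
        System.out.println(first == second); // prints "true": same cached instance
    }
}
```

The same shape works unchanged inside the MyApplication class above; only the Tracker construction call differs.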

Design Your Agile Project, Part 3

What do you do for geographically distributed teams, if you want to move to agile? First question: does the team want to move to agile? Or does the management want to move to agile? I am serious. I might take the same actions, but for different reasons. In either case, the team needs to learn what agile and lean mean, and how to do agile. In both cases, the team needs help and protection from management.

Why Does a Geographically Distributed Team Need Help and Protection from Management?

Managers create geographically distributed teams for many reasons. One reason is that there are smart people all over the world; in that case, managers create feature teams. When managers create dispersed teams – teams with one or two people in each of several far-flung locations (more than 60 meters apart) – they are under the impression that “experts” can perform jobs better than teams can. When managers think that “experts” can perform jobs better, they create bottlenecks in the work. Bottlenecks prevent flow. Bottlenecks prevent agile or lean from working. It does not matter if you want agile or lean; you won’t get either with a bottleneck. You have to overcome the bottleneck.

Can you make a geographically distributed team work in an agile way? Yes. Let’s go back to the principles. Our principles of agile or lean:

- You need to see all the work in progress.
- You want to flow work through the entire team.
- You want to see working software, sooner rather than later.

If you, like me, don’t care how we get there, we have options.

How Could a Team Take These Principles and Create Their Own Agile Approach?

Let’s take one thing at a time. Let’s assume the team is not accustomed to working on small items of value. If you are like many of my clients, this is the case. What would it take for the team to start working on small features that deliver value? Let’s think about the product again: what kind of potential for release frequency does the team have?
That colors the team’s thinking. The more potential for continuous deployment, the easier it is to work on small items of value. This is where I like to introduce the notion of spikes and timeboxes. I ask the team to take their smallest feature, and work together on it. They don’t always have any notion of “together,” either. They say things such as, “We need …” and list the impediments to their ability to work together. Now we see the need for management support. Project Management is Not a Dirty Word; Neither is Management I know that some of you dislike the idea of agile project managers. I know some of you positively hate the idea of management in agile organizations. Let me suggest that for teams transitioning to agile, there is a place for management. That place is creating an environment in which the team learns how to self-manage. There is no place for command-and-control project managers—never has been, never will be. Unless it’s time for lunch. Sometimes, teams need people to say, “Lunch-time!” But even that isn’t command-and-control. That’s someone saying, “I’m taking care of my need to eat. Want to come along?” It’s the same thing for a team with a lot of risk and a lot of unknowns. A team where the normal, out-of-the-box agile solutions don’t work. Why would you let a team like that flounder? That team needs everyone to lead. And, it often, but not always, needs someone to represent that team to management. Why? Because management doesn’t understand agile yet. That part might come now, and it might come later. But in an agile transition with many unknowns, it almost never happens at the beginning, even if management is saying, “Please go agile.” A geographically distributed team needs management support. It does not need command-and-control. That team does need support. That’s when I ask the person working as the project manager to start removing impediments. The team creates their own visual board. (Yes, a distributed team almost always needs a tool. 
I like to start with cards on a wall first, take pictures of it. Once a team knows how they like to work, then they choose a tool.) The team decides what the length of their timebox is for this feature, if they want to use iterations. They decide how to spike it. They start making decisions. That team especially needs to understand the problem of bottlenecks, experts, and how to stop needing experts. After they practice this a couple of times, they have the confidence they need to do this more times on their project. They can call this agile, but it might not have a real name. It’s a mishmash of timeboxes and kanban, but it works for them. Does it matter what it’s called? The Team Needs Management to Remove Obstacles Teams might need management support. For example, I often discover geographically distributed teams don’t have enough testers. Well, you say, that team flunks the “we have all the cross-functional roles to do the work” part of agile. Yes, and what if they don’t know that? What if they want to try agile? What if they want to work through their obstacles and impediments? They need someone to represent them to their management, while they try to test as they proceed, right? You could substitute “database admin” or “GUI designer” or whatever it is you need for tester in the above paragraph. Whenever you need someone to advocate on behalf of the team to management, you might want an agile project manager. Not someone to say, “When is the project going to be done?” Nope, not that kind of a PM. Someone to say, “What does the team need? I can help acquire that.” PMs who provide servant leadership to the team, and represent what the team has accomplished to the rest of management can be invaluable. They can help the team understand its process and facilitate what the team can do if the team gets stuck. These are agile project management skills. 
At this point, the team can try several approaches I suggested in these posts:

- Agile Lifecycles for Geographically Distributed Teams, Part 1 is for iterations and silo’d teams and a project manager.
- Agile Lifecycles for Geographically Distributed Teams, Part 2 is for kanban and silo’d teams and a project manager.
- Agile Lifecycles for Geographically Distributed Teams, Part 3 is for iterations and kanban and silo’d teams and a project manager.

You might have an even better alternative than the ones I suggested. Do you need a project manager? No. Do you need a servant leader? In my experience, yes. Maybe in your experience, no. I would love to hear from you, if you have a geographically distributed team that does not have a servant leader.

How Does This Team Evolve?

Some of my clients who are committed to agile have evolved their dispersed teams to be feature teams in multiple places. That has worked very well for them. That has allowed each team to transition from the Complex to the Complicated. They now have collocated agile or lean teams. They can design their agile projects, as in Part 1. They retain the value of smart people all over the world. They don’t have the aggravation of trying to meet in different time zones for a project. They still have it for the program. Some of my clients are still trying to make the dispersed teams work. It’s really hard. You might want to read my paper about the common problems on these teams.

Where are we now?

In Design Your Agile Project, Part 1, we discussed a straight-on approach to using whatever approach to agile, because it was obvious where you were. In Design Your Agile Project, Part 2, we discussed looking at your system of work, and thinking about your unknowns. You need to think about your risks, and consider what you need to experiment with, to design your own agile project. This part, part 3, is a continuation of part 2. It matters because you might need a servant leader who might be called a project manager.
The title matters, because especially on a geographically distributed team, you are bridging the gap and the culture between the old way of working and the new way of working. I still think it’s Complex. Some of my clients think it’s Chaotic because they have so many impediments. Whatever it is, you need data to take your next step.

Reference: Design Your Agile Project, Part 3 from our JCG partner Johanna Rothman at the Managing Product Development blog....

How to Level Up

I regularly hear from and read about technologists in a career rut. Unless one is both lucky and adept at predicting the future, experiencing some temporary stall can happen to professionals at any career stage. It may be the feeling of being stuck in an unchallenging role, feeling burdened by an undesirable skill set, or trapped in a company that seems difficult to escape. Career stagnation in technology could be defined as a prolonged period characterized by limited project variety, no advancement or even lateral movement, few tangible accomplishments, and little exposure to any emerging trends. Some managers are aware that workers in these situations generally leave, so the managers may proactively try to satisfy staff by shuffling teams and providing more interesting tasks. Many managers have to focus on deliverables and may give little thought to the career development of their charges, perhaps throwing money at retention problems instead of providing challenges.To “level up” could mean a promotion into management or technical leadership, a new start at a firm with increased opportunity, a role with autonomy and decision-making responsibility, or the ability to make significant improvements to skills and marketability. People that think about the leveling up concept often know what they want (or sometimes what they don’t want), but don’t necessarily see the best paths to get there. Leverage the skills you have to get the skills you want Most professionals view their current skills as a means to getting new jobs, but it’s useful to also think about skills as the key to acquiring other new skills. This tactic is most relevant when a skill set is dated and a previously strong marketability level is now questionable. Some will attempt to make a clean and immediate break from their old technologies or responsibilities into the new, usually with mixed results. As an example, many COBOL programmers tried to enter the stronger Java job market following Y2K. 
Some applied to jobs with no Java experience hoping their COBOL years would be deemed transferrable, while others pursued certifications and self-study to ideally be viewed as a somewhat “experienced” hire. One overlooked strategy was to approach companies that were using both COBOL and Java in some capacity, with the understanding that the candidate was willing to write COBOL if provided the ability to also work with Java. Most job seeking technologists have at least one ability that will help them contribute immediately to any other team or organization. It could be an obscure technical skill, leadership experience, or domain knowledge. Even if the skill is not something the person wants to use forever, it could be a key component to getting hired. Try to identify companies that may be looking for some specific expertise you can provide, even if it isn’t the most attractive tool in your bag, and be transparent about your willingness to do that less desirable work in exchange for exposure to skills that are in demand. DIY For those in the most stagnant of technical environments, taking on independent projects or open source may be the best way to gain experience and increase marketability. It’s usually preferable to learn new things on the job (because money), but being proactive about your career and keeping abreast of current marketable technologies will also show initiative to potential employers. The level up from personal projects almost always comes from an employment change. Sometimes to level up you need to take a step back – or sideways Careers aren’t always linear, and the expectation that trajectory needs to follow a strict continuous and incremental level of responsibilities is perhaps naive and potentially dangerous. Job seekers are often prone to placing too much weight on a position’s salary or (gasp) title without fully considering the position’s potential opportunity as it relates to future marketability and career development. 
Somewhat frequent movement between jobs is accepted in our industry, so positions can be viewed as stepping stones to future opportunities. When evaluating new roles, whether with a current employer or another firm, imagine what a three or four year tenure in that role at that company will look like on future résumés. Will the skills likely to be acquired or improved in that role be marketable and transferrable to other firms? Accepting positions that come with lateral responsibility and compensation is usually a wise long-term decision when provided a more favorable learning and growth environment.Reference: How to Level Up from our JCG partner Dave Fecak at the Job Tips For Geeks blog....

Open Source Completely Underestimates Contributor License Agreements

Reddit’s /r/ProgrammerHumor has recently treated us to this politically incorrect and quite childish little Open Source rant. Obviously, like most “discussions” on reddit, and specifically those about Open Source, things quickly got very serious, with people referring to Richard Stallman and how these critiques are childish and immature and what’s-wrong-with-our-industry™, etc. Let’s not delve into useless polemics, but let’s have another look at a real problem in Open Source.

Two types of Open Source

There are essentially two types of Open Source:

- Hobbyist’s Open Source
- Professional Open Source

Hobbyist’s Open Source

Hobbyist’s Open Source projects are geeky side-projects by some engineers / hackers / script kiddies / etc. who have fun experimenting with 1-2 things and who are hoping to have “the public” comment on / use / profit from their work. They most often have no interest in money / fame / rewards. They’re just doing it for the fun. Most often, they also choose funny licenses, like the Beer License. There’s nothing wrong with that.

Professional Open Source

Professional Open Source may evolve from the above (as in our case), or it may be conceived as professional Open Source from the beginning (as most Apache, Red Hat, or Oracle projects are). When doing professional Open Source, choosing the right license is of the essence, as it is almost impossible to change that license later. Why? Because all contributors effectively retain the copyright on their contributions, under the terms of the original license or, worse, under their own terms. This is less of a problem if you’re choosing a license like the ASL 2.0, which also manages contributions (and any trademarks or patents accompanying a contribution) in section 5:

5. Submission of Contributions.
Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions.

However, you may still not change the license from ASL 2.0 to anything else (like GPL, or commercial) without the express permission of all your contributors. At least not for the contributed code.

Managing contribution restrictions with CLAs

At Data Geekery, we want to stay in full control of both source code maintenance and copyright. We take copyright very seriously, which is why we have bought / internalised all essential contributions by our contributors through a CLA (Contributor License Agreement). This allows us to continue publishing our jOOQ code under the terms of both the ASL 2.0 and our more restrictive commercial license. Few Open Source projects / platforms actually do this. Among the most famous ones are:

- The Free Software Foundation
- The Eclipse Foundation
- The jQuery Foundation
- Joyent (Node.js)
- The Apache Foundation
- Our own (jooq.org)

If you’re serious about your Open Source project, do take due diligence seriously as well. Copyright is not an option; you have to get this right, also in the best interest of your customers / users. If you do not agree on contributor license terms with your contributors (e.g.
by blindly merging all sorts of pull requests in your GitHub repository), you will:

- Restrict yourself to the current license forever
- … this includes not being able to dual-license your software
- … this includes not being able to commercially license your software
- … this may include the risk of having to pay royalties later on (not with ASL 2.0)
- … this may include the risk of your users having to pay royalties later on (not with ASL 2.0)

CLAs on GitHub

GitHub has become the number one hosting service for Open Source projects, world-wide. Yet, many projects are not getting licensing right. While GitHub offers a simple way to specify a popular license for your repository, there is currently no easy way to have your contributors sign CLAs. I’ve recently had a short discussion on Twitter with Assaf Arkin and Stefan Tilkov:

If you fork a repository, submit a pull request that doesn’t change the license, do you still need a CLA? — Assaf Arkin (@assaf) March 22, 2014

We all agreed that this is currently not handled correctly, in most cases. I have thus sent off a feature request to GitHub:

I would like to enable a checkbox which all contributors have to check (after reading a document), to confirm that they’re complying with the contribution terms. You already have a similar feature with licenses when creating a new repository, which is great. But many people inevitably run into due diligence cases, because they just blindly merge all pull requests offered by anyone. So it would be useful to have a couple of default texts to choose from, and a possibility to create our own. Right now, I’m sending off PDFs by e-mail for a signature. I hadn’t thought of Google Docs yet, good idea. One source of inspiration could be the Eclipse Foundation, which has a fully automated CLA process, integrated into Bugzilla. If a user submits a patch, you can immediately see if they have already signed the Eclipse Foundation CLA.
I think that this would be a killer feature for GitHub, at least for the more professional OSS repositories. This feature request was well received by GitHub support. If you think that this is a good idea, send them some love as well. It would really be great to finally get this right.

Conclusion

If you’re a user of Open Source software, beware of the above. Don’t just integrate any geeky script / tool that you happen to have found on the internet into your corporate, enterprise software. You are putting your employer at great legal risk if you do so. Open Source is not an excuse to pretend everything is free (of charge and of obligations). It is as well-defined a business as anything else. We cannot say this enough: choose wisely when integrating third-party software, even if it is Open Source.

Reference: Open Source Completely Underestimates Contributor License Agreements from our JCG partner Lukas Eder at the JAVA, SQL, AND JOOQ blog....

Services, Microservices, Nanoservices – oh my!

Apparently there’s this new distributed architecture thing called microservices out and about – so last week I went ahead and read Martin Fowler’s & James Lewis’s extensive article on the subject. My reaction to this was basically:

I guess it is easier to use a new name (Microservices) rather than say that this is what SOA actually meant – re http://t.co/gvhxDfDWLG — Arnon Rotem-Gal-Oz (@arnonrgo) March 16, 2014

Similar arguments (nothing new here) were also expressed after Martin’s tweet of his article, e.g. Clemens Vasters’ comment:

@martinfowler @boicy but these are the very principles of SOA before vendors does pushed the hub in the middle, i.e. ESB — Clemens Vasters (@clemensv) March 16, 2014

Or Steve Jones’ post “Microservices is SOA, for those who know what SOA is.”

Autonomy, smart endpoints, events etc. that the article talks about are all SOA concepts – if you happen to have read my book and you read this article, you’ve probably identified these as patterns like inversion of communication, Service Host, Service Instance, Service Watchdog and others. So microservices, as they appear from Martin’s & James’s article, is pretty much service orientation without some of the bad misconceptions that got tied into the SOA moniker, like WS*, ESBs as a must, etc. Perhaps that’s a good enough reason for a new name, but personally I doubt it.

However, the story doesn’t end here; there are various other opinions as to what microservices are, such as Chris Ford’s view, who says that “My own opinion is that microservice architectures can be understood through a single abstract architectural constraint which can be interpreted along many different degrees of freedom. X can be varied independently of the rest of the system.” The idea that something is separate and can be varied independently from the rest of the system is good, but I would hardly say it is a definition of anything, or at least of anything new.
CSCI (Computer Software Configuration Item), which I first heard of as part of DOD-STD-2167A (published in 1988), essentially means the same thing: a CSCI is a component that can be varied independently from the rest of the system. In 2167A’s eyes it also means a very detailed, waterfall, documentation-laden process, which isn’t what anyone thinks services or microservices should entail – but it does demonstrate that “being varied independently” doesn’t mean much.

I am sure some reader goes something like “but wait, we’re talking about micro-services here – so they should also be small”. Indeed, there are posts like James Hughes’s on microservices, with a quote like “First things first what actually is a micro service? Well there really isn’t a hard and fast definition but from conversations with various people there seems to be a consensus that a micro service is a simple application that sits around the 10-100 LOC mark.” (Truth be told, in the next sentence James says that LOC is an atrocious way to compare implementations, but I thought it was worth repeating due to the use of the words “there seems to be a consensus”.)

So how can you have 100 LOC services? You can get there if you rely on frameworks (like Finagle or Sinatra, which James mentions) and generate serialization/deserialization code (protobuf, thrift, avro etc.) – this is essentially building on a smart service host. Another example would be developing in Erlang with its supervisor hierarchies, which also brings us to another way to reduce LOC: using languages that are less verbose (like the aforementioned Erlang, python or scala vs., say, Java). I would say, however, that if you find you have a 10-lines-of-code service, you are more likely than not implementing a function as a service, and you don’t have a real service, micro or not – e.g.
I can’t see you having decentralized storage (and autonomy) as mentioned in Martin’s and James’s article above, or having the monitoring and instrumentation that Hughes mentions. You should also keep in mind that while better and cheaper networks allow us to push the limits, the fallacies of distributed computing still exist. Furthermore, having a lot of small services that you need to manage, along with the performance hits for serializations and deserializations, security etc., may very well mean that you have moved from valid smaller “micro” services into the realm of headache which I called “nanoservices”:

“Nanoservice is an antipattern where a service is too fine-grained. A nanoservice is a service whose overhead (communications, maintenance, and so on) outweighs its utility.”

So there we have it: for the most part, microservices is just another name for the principles of SOA. Another name might have been appropriate in the hype days of SOA, but I think these days most of that vapor has cleared and people understand better what’s needed. Furthermore, if we do want to call proper SOA by a new name, I think microservices is a poor term, as it leads toward the slippery slope into nanoservices and 10 lines of code, which are just your old web-service method executed by a fancy chic host using a hot serialization format. Micro or not, services should be more useful than the overhead they incur.

Reference: Services, Microservices, Nanoservices – oh my! from our JCG partner Arnon Rotem Gal Oz at the Cirrus Minor blog....
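To make the “smart service host” point above concrete: Hughes’s 10-100 LOC figure is only reachable because the host does nearly all the work. Even without Finagle or Sinatra, an endpoint in plain Java is a handful of lines using the JDK’s built-in com.sun.net.httpserver package – a sketch (the class name and /ping path are made up for illustration), not an endorsement of nano-sized services:

```java
import com.sun.net.httpserver.HttpServer;

import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class TinyService {

    // Start an HTTP "service" on the given port (0 picks a free port).
    public static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/ping", exchange -> {
            byte[] body = "pong".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = start(8080);
        System.out.println("listening on http://localhost:"
                + server.getAddress().getPort() + "/ping");
    }
}
```

Which is exactly the point made above: the hard parts – decentralized storage, monitoring, instrumentation, the fallacies of distributed computing – live outside those few lines, and multiplying such endpoints without accounting for that overhead is how you end up with nanoservices.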

Oracle Drops Collection Literals in JDK 8

In a posting on OpenJDK JEP 186, Oracle's Brian Goetz announced that Oracle will not be pursuing collection literals as a language feature in JDK 8. A collection literal is a syntactic expression form that evaluates to an aggregate type, such as an array, List or Map. Project Coin proposed collection literals, which also complement the library additions in Java SE 8. The assumption was that collection literals would increase productivity, code readability, and code safety. As an alternative, Oracle suggests a library-based proposal built on the concept of static methods on interfaces. The implementation would ideally be via new dedicated immutable classes. The following are the major points behind this library-based approach: The basic solution of this feature works only for Sets, Lists and Maps, so it is not very satisfying or popular. The advanced solution, covering an extensible set of other collection types, is open-ended, messy, and virtually guaranteed to way overrun its design budget. The library-based changes would remove much of the requirement for the "collection literals" change discussed in Project Coin. The library-based approach gives X% of the benefit for 1% of the cost, where X >> 1. Value types are coming, and the behavior of collection literals with value types is not known; it is better not to attempt collection literals before value types arrive. It is better to focus Oracle's language-design bandwidth on addressing the foundational issues underlying a library-based version. This includes more efficient varargs, array constants in the constant pool, immutable arrays, and support for caching (and reclaiming under pressure) intermediate immutable results. According to Oracle's Brian Goetz, the real pain is in Maps, not Lists, Sets or Arrays. Library-based solutions are more acceptable for Lists, Sets and Arrays, but this approach still lacks a reasonable way to describe pair literals for Maps.
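For a sense of what the library-based proposal looks like in practice: static factory methods of exactly this shape eventually shipped later, in JDK 9, as List.of, Set.of and Map.of (they are not part of JDK 8, and are shown here only to illustrate the approach, assuming a JDK 9+ runtime):

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

public class CollectionFactories {
    public static void main(String[] args) {
        // Library-based "literals": static factory methods on the collection
        // interfaces, returning dedicated immutable implementations.
        List<String> langs = List.of("Java", "Scala", "Groovy");
        Set<Integer> primes = Set.of(2, 3, 5, 7);

        // The Map case Goetz calls the real pain: key/value pairs are
        // expressed as alternating arguments rather than true pair literals.
        Map<String, Integer> releases = Map.of("JDK 8", 2014, "JDK 9", 2017);

        System.out.println(langs.size());          // 3
        System.out.println(primes.contains(5));    // true
        System.out.println(releases.get("JDK 9")); // 2017

        // The returned collections are immutable; mutation fails fast.
        try {
            langs.add("Kotlin");
        } catch (UnsupportedOperationException expected) {
            System.out.println("immutable");
        }
    }
}
```

This also shows why the proposal leans on dedicated immutable classes: the factories can pick a compact, unmodifiable representation instead of allocating a general-purpose ArrayList or HashMap.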
Static methods on interfaces make the library-based solution more practical, and value types would make a library-based solution for Map far more practical too. A proof-of-concept patch for the library-based solution is also available. Reference: Oracle Drops Collection Literals in JDK 8 from our JCG partner Kaushik Pal at the TechAlpine - The Technology world blog....

Testing Lucene’s index durability after crash or power loss

One of Lucene's useful transactional features is index durability, which ensures that once you successfully call IndexWriter.commit, even if the OS or JVM crashes, power is lost, or you kill -KILL your JVM process, after rebooting the index will be intact (not corrupt) and will reflect the last successful commit before the crash. Of course, this only works if your hardware is healthy and your IO devices implement fsync properly (flush their write caches when asked by the OS). If you have data-loss issues, such as a silent bit-flipper in your memory, IO or CPU paths, then thanks to the new end-to-end checksum feature (LUCENE-2446), available as of Lucene 4.8.0, Lucene will now detect that as well during indexing or CheckIndex. This is similar to the ZFS file system's block-level checksums, but not everyone uses ZFS yet (heh), and so Lucene now does its own checksum verification on top of the file system. Be sure to enable checksum verification during merge by calling IndexWriterConfig.setCheckIntegrityAtMerge. In the future we'd like to remove that option and always validate checksums on merge; we've already done so for the default stored fields format in LUCENE-5580 and (soon) the term vectors format in LUCENE-5602, and have set up the low-level IO APIs so other codec components can do so as well, with LUCENE-5583, for Lucene 4.8.0.

FileDescriptor.sync and fsync

Under the hood, when you call IndexWriter.commit, Lucene gathers up all newly written filenames since the last commit and invokes FileDescriptor.sync on each one to ensure all changes are moved to stable storage. At its heart, fsync is a complex operation: the OS must flush any dirty pages associated with the specified file from its IO buffer cache, work with the underlying IO device(s) to ensure their write caches are also flushed, and also work with the file system to ensure its integrity is preserved.
You can separately fsync the bytes or metadata for a file, and also the directory(ies) containing the file. This blog post is a good description of the challenges. Recently we've been scrutinizing these parts of Lucene, and all this attention has uncovered some exciting issues! In LUCENE-5570, to be fixed in Lucene 4.7.2, we discovered that the fsync implementation in our FSDirectory implementations is able to bring new 0-byte files into existence. This normally isn't a problem by itself, because IndexWriter shouldn't fsync a file that it didn't create. However, it exacerbates debugging when there is a bug in IndexWriter or in the application using Lucene (e.g., directly deleting index files that it shouldn't). In these cases it's confusing to discover these 0-byte files so much later, versus hitting a FileNotFoundException at the point when IndexWriter tried to fsync them. In LUCENE-5588, to be fixed in Lucene 4.8.0, we realized we must also fsync the directory holding the index; otherwise it's possible on an OS crash or power loss that the directory won't link to the newly created files, or that you won't be able to find your file by its name. This is clearly important because Lucene lists the directory to locate all the commit points (segments_N files), and of course also opens files by their names. Since Lucene does not rely on file metadata like access time and modify time, it is tempting to use fdatasync (or FileChannel.force(false) from Java) to fsync just the file's bytes. However, this is an optimization, and at this point we're focusing on bugs. Furthermore, it's likely this won't be any faster, since the metadata must still be sync'd by fdatasync if the file length has changed, which is always the case in Lucene since we only append to files when writing (we removed IndexOutput.seek in LUCENE-4399).
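The two-step commit discipline described above (fsync the file's bytes, then fsync the parent directory so the new filename itself survives a crash) can be sketched with plain java.nio; this is not Lucene's actual code, just the idea, and note that force(true) on a directory channel works on Linux but may fail on other platforms, so it is guarded here:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class FsyncSketch {
    // Durably write a file: write the bytes, fsync the file, then fsync its
    // parent directory so the directory entry survives an OS crash or power
    // loss (the LUCENE-5588 lesson).
    static void durableWrite(Path file, byte[] bytes) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            ch.write(ByteBuffer.wrap(bytes));
            ch.force(true); // flush data and metadata to stable storage
        }
        // Directory fsync: open the directory read-only and force it.
        // Works on Linux; may throw on platforms where it is unsupported.
        try (FileChannel dir = FileChannel.open(file.getParent(),
                StandardOpenOption.READ)) {
            dir.force(true);
        } catch (IOException unsupported) {
            // this platform does not allow fsync on a directory
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("fsync-demo");
        Path f = tmp.resolve("segments_1");
        durableWrite(f, "commit data".getBytes());
        System.out.println(Files.readAllBytes(f).length); // 11
    }
}
```

Lucene itself goes through FileDescriptor.sync on files it re-opens at commit time, as the article explains; the sketch only illustrates why both the file and the directory need syncing.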
In LUCENE-5574, to be fixed as of Lucene 4.7.2, we found that a near-real-time reader, on closing, could delete files even if the writer it was opened from has been closed. This is normally not a problem by itself, because Lucene is write-once (never writes to the same file name more than once), as long as you use Lucene’s APIs and don’t modify the index files yourself. However, if you implement your own index replication by copying files into the index, and if you don’t first close your near-real-time readers, then it is possible closing them would remove the files you had just copied. During any given indexing session, Lucene writes many files and closes them, many files are deleted after being merged, etc., and only later, when the application finally calls IndexWriter.commit, will IndexWriter then re-open the newly created files in order to obtain a FileDescriptor so we can fsync them. This approach (closing the original file, and then opening it again later in order to sync), versus never closing the original file and syncing that same file handle you used for writing, is perhaps risky: the javadocs for FileDescriptor.sync are somewhat vague as to whether this approach is safe. However, when we check the documentation for fsync on Unix/Posix and FlushFileBuffers on Windows, they make it clear that this practice is fine, in that the open file descriptor is really only necessary to identify which file’s buffers need to be sync’d. It’s also hard to imagine an OS that would separately track which open file descriptors had made which changes to the file. Nevertheless, out of paranoia or an abundance of caution, we are also exploring a possible patch on LUCENE-3237 to fsync only the originally opened files. 
Testing that fsync really works

With all these complex layers in between your application's call to IndexWriter.commit and the laws of physics ensuring little magnets were flipped or a few electrons were moved into a tiny floating gate in a NAND cell, how can we reliably test that the whole series of abstractions is actually working? In Lucene's randomized testing framework we have a nice evil Directory implementation called MockDirectoryWrapper. It can do all sorts of nasty things, like throw random exceptions, sometimes slow down opening, closing and writing of some files, refuse to delete still-open files (like Windows), refuse to close when there are still open files, etc. This has helped us find all sorts of fun bugs over time. Another thing it does on close is to simulate an OS crash or power loss by randomly corrupting any un-sync'd files and then confirming the index is not corrupt. This is useful for catching Lucene bugs where we are failing to call fsync when we should, but it won't catch bugs in our implementation of sync in our FSDirectory classes, such as the frustrating LUCENE-3418 (first appeared in Lucene 3.1 and finally fixed in Lucene 3.4). So, to catch such bugs, I've created a basic test setup, making use of a simple Insteon on/off device, along with custom Python bindings I created long ago to interact with Insteon devices. I already use these devices all over my home for controlling lights and appliances, so also using this for Lucene is a nice intersection of two of my passions! The script loops forever: first updating the sources, compiling, checking the index for corruption, then kicking off an indexing run with some randomization in the settings, and finally waiting a few minutes and then cutting power to the box. Then it restores power, waits for the machine to be responsive again, and starts again. So far it's done 80 power cycles and no corruption yet. Good news!
To “test the tester”, I tried temporarily changing fsync to do nothing, and indeed after a couple iterations, the index became corrupt. So indeed the test setup seems to “work”. Currently the test uses Linux on a spinning magnets hard drive with the ext4 file system. This is just a start, but it’s better than no proper testing for Lucene’s fsync. Over time I hope to test different combinations of OS’s, file systems, IO hardware, etc.Reference: Testing Lucene’s index durability after crash or power loss from our JCG partner Michael Mc Candless at the Changing Bits blog....

Attempt to map WCF to Java terms

By writing this post I'm taking a huge risk of being rejected by both the .NET and Java communities. This is an attempt to explain, in Java terms, what WCF (Windows Communication Foundation) is. WCF-to-Java mapping is not really trivial. I lack understanding of the extent to which a WCF consumer should be aware of the type of communication with the service: request/response or asynchronous messaging. I have difficulty imagining this is completely transparent for the consumer… unless the WCF framework "removes" the asynchrony of messaging and takes care of waiting for a response message(s). If the latter happens, then there is actually no asynchronous messaging! As usual with Java (and I truly missed this while working with .NET), there are Specifications of technologies and there are various Implementations of these specifications. Although applications are normally tested with, and therefore claim to support, explicit Implementations of the Specifications they use, in theory the final selection is done during deployment or just before the application starts. Whenever we talk about a service, we have the actual service and its consumers. Let's start with consumers. For sending asynchronous messages, they had better be written against JMS, the Java Message Service specification. Consumers of JMS just need to know the logical name of the target queue or topic. For request/response communication, consumers should be written against a plain Interface of the service. This Interface is agnostic to the technologies used on the service side and in the transportation layer. To obtain an explicit implementation of the Interface at run-time, the consumer uses an externally configurable Factory. This factory will use something like JAX-WS for Web Services, JAX-RS for RESTful services, RMI for remote EJBs (Enterprise JavaBeans), or a plain object (POJO) for in-process services. Are you still here? Then let's move to the service side.
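The consumer-side idea above (code against a plain Interface, resolve the implementation through a configurable Factory) fits in a few lines of Java; the names here (GreetingService, ServiceLocator) are made up for illustration:

```java
// A transport-agnostic service contract: consumers only ever see this.
interface GreetingService {
    String greet(String name);
}

// An in-process POJO implementation; a JAX-WS or RMI proxy could equally
// stand behind the same interface without consumers noticing.
class LocalGreetingService implements GreetingService {
    public String greet(String name) {
        return "Hello, " + name;
    }
}

// The externally configurable factory: driven here by a system property,
// in a real system by container configuration or a JNDI lookup.
class ServiceLocator {
    static GreetingService greetingService() {
        String impl = System.getProperty("greeting.impl", "local");
        if ("local".equals(impl)) {
            return new LocalGreetingService();
        }
        throw new IllegalStateException("no binding for " + impl);
    }
}

public class ConsumerDemo {
    public static void main(String[] args) {
        // The consumer never names the implementation class.
        GreetingService service = ServiceLocator.greetingService();
        System.out.println(service.greet("WCF")); // Hello, WCF
    }
}
```

Swapping the transport (in-process, SOAP, REST, RMI) then becomes a configuration change rather than a code change, which is the point of the mapping.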
If the service consumes messages, it can be implemented using JMS directly or as a Message-Driven Bean (a flavor of EJB). The latter option provides you with all the transactionality and scalability of the Application Server (something like IIS). If the service should provide responses (including failures), the golden rule is to let it implement a plain Interface - the one which will be used by the service consumer. Then, either by adding annotations to the Interface implementation code or by using external configuration in the Application Server, your implementation becomes accessible as a Web Service or Session EJB. Actually, nowadays most Application Servers are capable of exposing Session EJBs as Web Services. If you use the Proxy pattern, you also have a clean, unspoiled implementation of the Interface, which can be used by in-process consumers. This is a very lengthy explanation. A shorter translation of "All cross-layer entities are WCF services" would be: "All entities are defined by their Interfaces and written against Interfaces of other entities. Implementations of the entities are Plain Old Java Objects (POJOs), possibly wrapped by EJB Proxies". Reference: Attempt to map WCF to Java terms from our JCG partner Viktor Sadovnikov at the jv-ration blog....

MongoDB 2.6 is out

Introduction

MongoDB is evolving rapidly. The 2.2 version introduced the aggregation framework as an alternative to the Map-Reduce query model. Generating aggregated reports is a recurrent requirement for enterprise systems and MongoDB shines in this regard. If you're new to it you might want to check this aggregation framework introduction or the performance tuning and the data modelling guides. Let's reuse the data model I first introduced while demonstrating the blazing fast MongoDB insert capabilities: { "_id" : ObjectId("5298a5a03b3f4220588fe57c"), "created_on" : ISODate("2012-04-22T01:09:53Z"), "value" : 0.1647851116706831 }

MongoDB 2.6 Aggregation enhancements

In the 2.4 version, if I run the following aggregation query: db.randomData.aggregate( [ { $match: { "created_on" : { $gte : new Date(Date.UTC(2012, 0, 1)), $lte : new Date(Date.UTC(2012, 0, 10)) } } }, { $group: { _id : { "minute" : { $minute : "$created_on" } }, "values": { $addToSet: "$value" } } }]); I hit the 16MB aggregation result limitation: { "errmsg" : "exception: aggregation result exceeds maximum document size (16MB)", "code" : 16389, "ok" : 0 } MongoDB documents are limited to 16MB, and prior to the 2.6 version, the aggregation result was a BSON document. The 2.6 version replaced it with a cursor instead. Running the same query on 2.6 yields the following result: db.randomData.aggregate( [ { $match: { "created_on" : { $gte : new Date(Date.UTC(2012, 0, 1)), $lte : new Date(Date.UTC(2012, 0, 10)) } } }, { $group: { _id : { "minute" : { $minute : "$created_on" } }, "values": { $addToSet: "$value" } } }]) .objsLeftInBatch(); 14 I used the cursor-based objsLeftInBatch method to test the aggregation result type, and the 16MB limitation no longer applies to the overall result. The cursor's inner results are regular BSON documents, hence still limited to 16MB, but this is far more manageable than the previous overall result limit.
The 2.6 version also addresses the aggregation memory restrictions. A full collection scan such as: db.randomData.aggregate( [ { $group: { _id : { "minute" : { $minute : "$created_on" } }, "values": { $addToSet: "$value" } } }]) .objsLeftInBatch(); can end up with the following error: { "errmsg" : "exception: Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in.", "code" : 16945, "ok" : 0 } So, we can now perform large sort operations using the allowDiskUse parameter: db.randomData.aggregate( [ { $group: { _id : { "minute" : { $minute : "$created_on" } }, "values": { $addToSet: "$value" } } }] , { allowDiskUse : true }) .objsLeftInBatch(); The 2.6 version allows us to save the aggregation result to a different collection using the newly added $out stage. db.randomData.aggregate( [ { $match: { "created_on" : { $gte : new Date(Date.UTC(2012, 0, 1)), $lte : new Date(Date.UTC(2012, 0, 10)) } } }, { $group: { _id : { "minute" : { $minute : "$created_on" } }, "values": { $addToSet: "$value" } } }, { $out : "randomAggregates" } ]); db.randomAggregates.count(); 60 New operators have been added such as let, map, cond, to name a few. The next example will append AM or PM to the time info of each specific event entry. 
var dataSet = db.randomData.aggregate( [ { $match: { "created_on" : { $gte : new Date(Date.UTC(2012, 0, 1)), $lte : new Date(Date.UTC(2012, 0, 2)) } } }, { $project: { "clock" : { $let: { vars: { "hour": { $substr: ["$created_on", 11, -1] }, "am_pm": { $cond: { if: { $lt: [ {$hour : "$created_on" }, 12 ] } , then: 'AM', else: 'PM'} } }, in: { $concat: [ "$$hour", " ", "$$am_pm"] } } } } }, { $limit : 10 } ]); dataSet.forEach(function(document) { printjson(document); }); Resulting in: "clock" : "16:07:14 PM" "clock" : "22:14:42 PM" "clock" : "21:46:12 PM" "clock" : "03:35:00 AM" "clock" : "04:14:20 AM" "clock" : "03:41:39 AM" "clock" : "17:08:35 PM" "clock" : "18:44:02 PM" "clock" : "19:36:07 PM" "clock" : "07:37:55 AM"

Conclusion

The MongoDB 2.6 version comes with a lot of other enhancements, such as bulk operations and index intersection. MongoDB is constantly evolving, offering a viable alternative for document-based storage. At such a development rate, it's no wonder it was named the 2013 database of the year. Reference: MongoDB 2.6 is out from our JCG partner Vlad Mihalcea at the Vlad Mihalcea's Blog blog....

Yet another way to handle exceptions in JUnit: catch-exception

There are many ways of handling exceptions in JUnit (3 ways of handling exceptions in JUnit. Which one to choose?, JUnit ExpectedException rule: beyond basics). In this post I will introduce the catch-exception library that I was recommended to try. In short, catch-exception is a library that catches exceptions in a single line of code and makes them available for further analysis.

Install via Maven

In order to get started quickly, I used my Unit Testing Demo project with a set of test dependencies (JUnit, Mockito, Hamcrest, AssertJ) and added catch-exception: <dependency> <groupId>com.googlecode.catch-exception</groupId> <artifactId>catch-exception</artifactId> <version>1.2.0</version> <scope>test</scope> </dependency> So the dependency tree looks as follows: [INFO] --- maven-dependency-plugin:2.1:tree @ unit-testing-demo --- [INFO] com.github.kolorobot:unit-testing-demo:jar:1.0.0-SNAPSHOT [INFO] +- org.slf4j:slf4j-api:jar:1.5.10:compile [INFO] +- org.slf4j:jcl-over-slf4j:jar:1.5.10:runtime [INFO] +- org.slf4j:slf4j-log4j12:jar:1.5.10:runtime [INFO] +- log4j:log4j:jar:1.2.15:runtime [INFO] +- junit:junit:jar:4.11:test [INFO] +- org.mockito:mockito-core:jar:1.9.5:test [INFO] +- org.assertj:assertj-core:jar:1.5.0:test [INFO] +- org.hamcrest:hamcrest-core:jar:1.3:test [INFO] +- org.hamcrest:hamcrest-library:jar:1.3:test [INFO] +- org.objenesis:objenesis:jar:1.3:test [INFO] \- com.googlecode.catch-exception:catch-exception:jar:1.2.0:test

Getting started

System under test (SUT): class ExceptionThrower { void someMethod() { throw new RuntimeException("Runtime exception occurred"); } void someOtherMethod() { throw new RuntimeException("Runtime exception occurred", new IllegalStateException("Illegal state")); } void yetAnotherMethod(int code) { throw new CustomException(code); } } The basic catch-exception BDD-style approach example with AssertJ assertions: import org.junit.Test; import static com.googlecode.catchexception.CatchException.*; import static
com.googlecode.catchexception.apis.CatchExceptionAssertJ.*; public class CatchExceptionsTest { @Test public void verifiesTypeAndMessage() { when(new ExceptionThrower()).someMethod(); then(caughtException()) .isInstanceOf(RuntimeException.class) .hasMessage("Runtime exception occurred") .hasMessageStartingWith("Runtime") .hasMessageEndingWith("occurred") .hasMessageContaining("exception") .hasNoCause(); } } Looks good. Concise, readable. No JUnit runners. Please note that I specified which method of ExceptionThrower I expect to throw an exception. As you can imagine, I can check multiple exceptions in one test, although I would not recommend this approach, as it may feel like violating the single responsibility of a test. By the way, if you are working with Eclipse this may be handy for you: Improve content assist for types with static members while creating JUnit tests in Eclipse

Verify the cause

I think no comment is needed for the code below: import org.junit.Test; import static com.googlecode.catchexception.CatchException.*; import static com.googlecode.catchexception.apis.CatchExceptionAssertJ.*; public class CatchExceptionsTest { @Test public void verifiesCauseType() { when(new ExceptionThrower()).someOtherMethod(); then(caughtException()) .isInstanceOf(RuntimeException.class) .hasMessage("Runtime exception occurred") .hasCauseExactlyInstanceOf(IllegalStateException.class) .hasRootCauseExactlyInstanceOf(IllegalStateException.class); } }

Verify custom exception with Hamcrest

To verify a custom exception I used the Hamcrest matcher code from my previous post: class CustomException extends RuntimeException { private final int code; public CustomException(int code) { this.code = code; } public int getCode() { return code; } } class ExceptionCodeMatches extends TypeSafeMatcher<CustomException> { private int expectedCode; public ExceptionCodeMatches(int expectedCode) { this.expectedCode = expectedCode; } @Override protected boolean matchesSafely(CustomException item) { return
item.getCode() == expectedCode; } @Override public void describeTo(Description description) { description.appendText("expects code ") .appendValue(expectedCode); } @Override protected void describeMismatchSafely(CustomException item, Description mismatchDescription) { mismatchDescription.appendText("was ") .appendValue(item.getCode()); } } And the test: import org.junit.Test; import static com.googlecode.catchexception.CatchException.*; import static org.junit.Assert.*; public class CatchExceptionsTest { @Test public void verifiesCustomException() { catchException(new ExceptionThrower(), CustomException.class).yetAnotherMethod(500); assertThat((CustomException) caughtException(), new ExceptionCodeMatches(500)); } }

Summary

catch-exception looks really good. It is easy to get started with quickly, and I see some advantages over the ExpectedException method rule in JUnit. If I have a chance, I will investigate the library more thoroughly, hopefully in a real-world project. The source code of this article can be found here: Unit Testing Demo. In case you are interested, please have a look at my other posts: 3 ways of handling exceptions in JUnit. Which one to choose? JUnit ExpectedException rule: beyond basics HOW-TO: Test dependencies in a Maven project (JUnit, Mockito, Hamcrest, AssertJ) Improve content assist for types with static members while creating JUnit tests in Eclipse. Reference: Yet another way to handle exceptions in JUnit: catch-exception from our JCG partner Rafal Borowiec at the Codeleak.pl blog....
Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.
