How to connect to MongoDB from a Java EE stateless application

In this post I will present how to connect to MongoDB from a stateless Java EE application, to take advantage of the built-in pool of connections to the database offered by the MongoDB Java Driver. This might be the case if you develop a REST API that executes operations against MongoDB.

Get the Java MongoDB Driver

To connect from Java to MongoDB, you can use the Java MongoDB Driver. If you are building your application with Maven, you can add the dependency to the pom.xml file:

<!-- MongoDB Java driver dependency -->
<dependency>
    <groupId>org.mongodb</groupId>
    <artifactId>mongo-java-driver</artifactId>
    <version>2.12.3</version>
</dependency>

The driver provides a MongoDB client (com.mongodb.MongoClient) with internal pooling. The MongoClient class is designed to be thread safe and shared among threads. For most applications, you should have one MongoClient instance for the entire JVM. Because of that, you wouldn't want to create a new MongoClient instance with each request in your Java EE stateless application.

Implement a @Singleton EJB

A simple solution is to use a @Singleton EJB to hold the MongoClient:

package org.codingpedia.demo.mongoconnection;

import java.net.UnknownHostException;

import javax.annotation.PostConstruct;
import javax.ejb.ConcurrencyManagement;
import javax.ejb.ConcurrencyManagementType;
import javax.ejb.Lock;
import javax.ejb.LockType;
import javax.ejb.Singleton;

import com.mongodb.MongoClient;

@Singleton
@ConcurrencyManagement(ConcurrencyManagementType.CONTAINER)
public class MongoClientProvider {

    private MongoClient mongoClient = null;

    @Lock(LockType.READ)
    public MongoClient getMongoClient() {
        return mongoClient;
    }

    @PostConstruct
    public void init() {
        String mongoIpAddress = "x.x.x.x";
        Integer mongoPort = 11000;
        try {
            mongoClient = new MongoClient(mongoIpAddress, mongoPort);
        } catch (UnknownHostException e) {
            e.printStackTrace();
        }
    }
}

Note:

- @Singleton – probably the most important line of code in this class. This annotation specifies that there will be exactly one singleton of this type of bean in the application. This bean can be invoked concurrently by multiple threads.
- @PostConstruct – this annotation is used on a method that needs to be executed after dependency injection is done to perform any initialization; in our case, to initialize the MongoClient.
- @ConcurrencyManagement(ConcurrencyManagementType.CONTAINER) declares a singleton session bean's concurrency management type. By default it is set to Container; I use it here only to highlight its existence. The other option, ConcurrencyManagementType.BEAN, specifies that the bean developer is responsible for managing concurrent access to the bean instance.
- @Lock(LockType.READ) specifies the concurrency lock type for singleton beans with container-managed concurrency. When set to LockType.READ, it permits full concurrent access to the method (assuming no write locks are held). This allows several threads to access the same MongoClient instance and take advantage of the internal pool of connections to the database. This is VERY IMPORTANT, because the other, more conservative option, @Lock(LockType.WRITE), is the DEFAULT and enforces exclusive access to the bean instance.
That would make the method slower in a highly concurrent environment.

Use the @Singleton EJB

Now that you have the MongoClient "persisted" in the application, you can inject the MongoClientProvider to access MongoDB (to get the collection names, for example):

package org.codingpedia.demo.mongoconnection;

import java.util.Set;

import javax.ejb.EJB;
import javax.ejb.Stateless;

import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;
import com.mongodb.util.JSON;

@Stateless
public class TestMongoClientProvider {

    @EJB
    MongoClientProvider mongoClientProvider;

    public Set<String> getCollectionNames() {
        MongoClient mongoClient = mongoClientProvider.getMongoClient();
        DB db = mongoClient.getDB("myMongoDB");
        Set<String> colls = db.getCollectionNames();
        for (String s : colls) {
            System.out.println(s);
        }
        return colls;
    }
}

Note: The db object will be a connection to a MongoDB server for the specified database. With it, you can do further operations. I encourage you to read Getting Started with Java Driver for more on that.

Be aware

One aspect to bear in mind: "For every request to the DB (find, insert, etc.) the Java thread will obtain a connection from the pool, execute the operation, and release the connection. This means the connection (socket) used may be different each time. Additionally, in the case of a replica set with the slaveOk option turned on, read operations will be distributed evenly across all slaves. This means that within the same thread, a write followed by a read may be sent to different servers (master then slave). In turn the read operation may not see the data just written, since replication is asynchronous. If you want to ensure complete consistency in a "session" (maybe an HTTP request), you would want the driver to use the same socket, which you can achieve by using a "consistent request". Call requestStart() before your operations and requestDone() to release the connection back to the pool:

DB db = ...;
db.requestStart();
try {
    db.requestEnsureConnection();

    // code....
} finally {
    db.requestDone();
}

DB and DBCollection are completely thread safe. In fact, they are cached so you get the same instance no matter what." [3]

Resources

- Java MongoDB Driver
- Getting Started with Java Driver
- Java Driver Concurrency
- GitHub – mongodb / mongo-java-driver examples

Reference: How to connect to MongoDB from a Java EE stateless application from our JCG partner Adrian Matei at the Codingpedia.org blog....
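Following up on the MongoDB article above: since its motivating use case was a REST API, here is a minimal, hedged sketch of how the stateless bean might be exposed through JAX-RS. The resource path, class name and media type are illustrative assumptions, not part of the original post, and a JSON provider is assumed to be available on the classpath:

package org.codingpedia.demo.mongoconnection;

import java.util.Set;

import javax.ejb.EJB;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

// Hypothetical JAX-RS resource delegating to the stateless EJB shown above
@Path("/collections")
public class CollectionNamesResource {

    @EJB
    TestMongoClientProvider testMongoClientProvider;

    @GET
    @Produces(MediaType.APPLICATION_JSON)
    public Set<String> listCollectionNames() {
        // every request reuses the pooled MongoClient held by the @Singleton bean
        return testMongoClientProvider.getCollectionNames();
    }
}

Because getMongoClient() is annotated with @Lock(LockType.READ), concurrent requests hitting such a resource share the same MongoClient and its internal connection pool instead of opening new connections.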

Reducing the frequency of major GC pauses

This post will discuss a technique to reduce the burden garbage collection pauses put on the latency of your application. As I wrote a couple of years ago, disabling garbage collection is not possible in the JVM. But there is a clever trick that can be used to significantly reduce the length and frequency of the long pauses.

As you are aware, there are two different GC events taking place within the JVM, called minor and major collections. There is a lot of material available about what takes place during those collections, so I will not focus on describing the mechanics in detail. I will just remind you that in the HotSpot JVM, during a minor collection the eden and survivor spaces are collected; in a major collection the tenured space also gets cleaned and (possibly) compacted.

If you turn on GC logging (-XX:+PrintGCDetails, for example) then you immediately notice that the major collections are the ones you should focus on. The length of a major garbage collection is typically several times larger than one cleaning young space. During a major GC there are two aspects requiring more time to complete. First and foremost, the survivors from young space are copied to old. Next, besides cleaning the unused references from the old generation, most of the GC algorithms also compact the old space, again requiring precious CPU cycles to be burnt. Having lots of objects in old space also increases the likelihood of having more references from old space to young space. This results in larger card tables keeping track of the references, and increases the length of the minor GC pauses, when these tables are checked to decide whether objects in young space are eligible for GC.

So, if we cannot turn off garbage collection, can we make sure these lengthy major GCs run less often and the reference count from tenured space to young stays low? The answer is yes. There are even some crazy configurations which have managed to get rid of major GC altogether. Getting rid of major GC events is truly a complex exercise, but reducing the frequency of those long pauses is something every deployment can achieve.

The strategy we are looking at is limiting the number of objects which get tenured. In a typical web application, for example, most of the objects created are useful only during the HttpRequest. There is and always will be shared state with a longer life span, but the key is in the fact that there is a very high ratio of short-lived objects versus long-lived shared state. The tricky part for any deployment out there is to understand how much elbow room to give to the short-lived objects, so that:

- You can guarantee that the short-lived objects do not get promoted to tenured space
- You are not over-provisioning, increasing the cost of your infrastructure

On a conceptual level, achieving this is easy. You just need to measure the amount of memory allocated for short-lived objects during the requests and multiply it by the peak load. What you will end up with is the amount of memory you would want to fit either into eden or into a single survivor space. This will allow the GC to run truly efficiently without any accidental promotions to tenured.
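To make that back-of-the-envelope exercise concrete, here is a hedged illustration. Suppose profiling shows roughly 1 MB of short-lived allocation per request and the application peaks at around 2,000 requests in flight; that suggests on the order of 2 GB of request-scoped garbage that you would like to die in eden. The figures and flags below are assumptions for illustration only, not recommendations from the original post:

# Illustrative HotSpot sizing, assuming ~1 MB of short-lived allocation per request
# and ~2,000 requests in flight at peak (~2 GB of request-scoped garbage).
# -Xmn sizes the young generation; SurvivorRatio splits it between eden and survivors
# (here eden ends up around 2.25 GB); GC logging stays on to verify promotion stays near zero.
java -Xms6g -Xmx6g -Xmn3g -XX:SurvivorRatio=6 \
     -XX:+PrintGCDetails -Xloggc:gc.log \
     -jar my-web-app.jar

After such a change, the GC log should show minor collections reclaiming almost everything, with tenured occupancy growing only through genuinely long-lived state.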
Zooming in from the conceptual level surfaces several complex technical issues, which I will open up in forthcoming posts.

So what to conclude from here? First and foremost, determining the perfect GC configuration for your application is a complex exercise. This is both bad and good news. Bad in the sense that it needs a lot of experiments on your side. Good in the sense that we like difficult problems and we are currently crafting experiments to investigate the domain further. Some day, not too far in the future, Plumbr will be able to do it for you, saving you from the boring plumbing job and allowing you to focus on the actual problem at hand.

Reference: Reducing the frequency of major GC pauses from our JCG partner Nikita Salnikov-Tarnovski at the Plumbr Blog blog....

Dead simple configuration

Whole frameworks have been written with the purpose of handling the configuration of your application. I prefer a simpler way. If by configuration we mean "everything that is likely to vary between deploys", it follows that we should try and keep configuration simple. In Java, the simplest option is the humble properties file. The downside of a properties file is that you have to restart your application when you want it to pick up changes. Or do you? Here's a simple method I've used on several projects:

public class MyAppConfig extends AppConfiguration {

    private static MyAppConfig instance = new MyAppConfig();

    public static MyAppConfig instance() {
        return instance;
    }

    private MyAppConfig() {
        super("myapp.properties");
    }

    public String getServiceUrl() {
        return getRequiredProperty("service.url");
    }

    public boolean getShouldStartSlow() {
        return getFlag("start-slow", false);
    }

    public int getHttpPort(int defaultPort) {
        return getIntProperty("myapp.http.port", defaultPort);
    }
}

The AppConfiguration class looks like this:

public abstract class AppConfiguration {

    private static Logger log = LoggerFactory.getLogger(AppConfiguration.class);

    private long nextCheckTime = 0;
    private long lastLoadTime = 0;
    private Properties properties = new Properties();
    private final File configFile;

    protected AppConfiguration(String filename) {
        this.configFile = new File(filename);
    }

    public String getProperty(String propertyName, String defaultValue) {
        String result = getProperty(propertyName);
        if (result == null) {
            log.trace("Missing property {} in {}", propertyName, properties.keySet());
            return defaultValue;
        }
        return result;
    }

    public String getRequiredProperty(String propertyName) {
        String result = getProperty(propertyName);
        if (result == null) {
            throw new RuntimeException("Missing property " + propertyName);
        }
        return result;
    }

    private String getProperty(String propertyName) {
        if (System.getProperty(propertyName) != null) {
            log.trace("Reading {} from system properties", propertyName);
            return System.getProperty(propertyName);
        }
        if (System.getenv(propertyName.replace('.', '_')) != null) {
            log.trace("Reading {} from environment", propertyName);
            return System.getenv(propertyName.replace('.', '_'));
        }

        ensureConfigurationIsFresh();
        return properties.getProperty(propertyName);
    }

    private synchronized void ensureConfigurationIsFresh() {
        if (System.currentTimeMillis() < nextCheckTime) return;
        nextCheckTime = System.currentTimeMillis() + 10000;
        log.trace("Rechecking {}", configFile);

        if (!configFile.exists()) {
            log.error("Missing configuration file {}", configFile);
        }

        if (lastLoadTime >= configFile.lastModified()) return;
        lastLoadTime = configFile.lastModified();
        log.debug("Reloading {}", configFile);

        try (FileInputStream inputStream = new FileInputStream(configFile)) {
            properties.clear();
            properties.load(inputStream);
        } catch (IOException e) {
            throw new RuntimeException("Failed to load " + configFile, e);
        }
    }
}

This reads the configuration file in an efficient way and updates the settings as needed. It also lets system properties and environment variables override the values in the file. And it even gives a pretty good log of what's going on.

For the full source code and a magic DataSource which updates automatically, see this gist: https://gist.github.com/jhannes/b8b143e0e5b287d73038

Enjoy!

Reference: Dead simple configuration from our JCG partner Johannes Brodwall at the Thinking Inside a Bigger Box blog....
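Following up on the configuration classes above: for illustration, a myapp.properties file read by MyAppConfig might look like this (the values are made-up assumptions):

# myapp.properties - everything that is likely to vary between deploys
service.url=https://api.example.com/service
start-slow=false
myapp.http.port=8080

Editing this file on a running server is picked up within roughly ten seconds, because ensureConfigurationIsFresh() rechecks the file's lastModified timestamp at most every 10 seconds. Alternatively, passing -Dservice.url=... on the command line or setting a service_url environment variable takes precedence over the file, since getProperty() consults system properties and the environment first.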

Akka Notes – Actor Logging and Testing

In the first two parts (one, two), we briefly talked about Actors and how messaging works. In this part, let's look at fixing up logging and testing our TeacherActor.

Recap

This is how our Actor from the previous part looked:

class TeacherActor extends Actor {

  val quotes = List(
    "Moderation is for cowards",
    "Anything worth doing is worth overdoing",
    "The trouble is you think you have time",
    "You never gonna know if you never even try")

  def receive = {

    case QuoteRequest => {

      import util.Random

      //Get a random Quote from the list and construct a response
      val quoteResponse = QuoteResponse(quotes(Random.nextInt(quotes.size)))

      println(quoteResponse)
    }
  }
}

Logging Akka with SLF4J

You notice that in the code we are printing the quoteResponse to the standard output, which you would obviously agree is a bad idea. Let's fix that up by enabling logging via the SLF4J facade.

1. Fix the class to use logging

Akka provides a nice little trait called ActorLogging to achieve it. Let's mix that in:

class TeacherLogActor extends Actor with ActorLogging {

  val quotes = List(
    "Moderation is for cowards",
    "Anything worth doing is worth overdoing",
    "The trouble is you think you have time",
    "You never gonna know if you never even try")

  def receive = {

    case QuoteRequest => {

      import util.Random

      //get a random element (for now)
      val quoteResponse = QuoteResponse(quotes(Random.nextInt(quotes.size)))
      log.info(quoteResponse.toString())
    }
  }

  //We'll cover the purpose of this method in the Testing section
  def quoteList = quotes
}

A small detour here: internally, when we log a message, the logging methods in ActorLogging (eventually) publish the log message to an EventStream. Yes, I did say publish. So, what actually is an EventStream?

EventStream and Logging

The EventStream behaves just like a message broker to which we can publish and receive messages. One subtle distinction from a regular MOM is that the subscribers of the EventStream can only be Actors. In the case of logging, all log messages are published to the EventStream. By default, the Actor that subscribes to these messages is the DefaultLogger, which simply prints the message to the standard output.

class DefaultLogger extends Actor with StdOutLogger {
  override def receive: Receive = {
    ...
    case event: LogEvent ⇒ print(event)
  }
}

So, that's the reason why, when we kick off the StudentSimulatorApp, we see the log message written to the console. That said, the EventStream isn't suited only for logging. It is a general purpose publish-subscribe mechanism available inside the Actor world within a VM (more on that later). Back to the SLF4J setup:

2. Configure Akka to use SLF4J

akka {
  loggers = ["akka.event.slf4j.Slf4jLogger"]
  loglevel = "DEBUG"
  logging-filter = "akka.event.slf4j.Slf4jLoggingFilter"
}

We store this information in a file called application.conf, which should be on your classpath. In our sbt folder structure, we would throw this into the main/resources directory. From the configuration, we can derive that:

- the loggers property indicates the Actor that is going to subscribe to the log events. What Slf4jLogger does is simply consume the log messages and delegate them to the SLF4J logger facade.
- the loglevel property simply indicates the minimum level that should be considered for logging.
- the logging-filter compares the currently configured loglevel with the incoming log message level and chucks out any log message below the configured loglevel before publishing to the EventStream.

But why didn't we have an application.conf for the previous example?
Simply because Akka provides some sane defaults so that we needn't build a configuration file before we start playing with it. We'll revisit this file often from here on to customize various things. There are a whole bunch of awesome parameters that you can use inside application.conf for logging alone. They are explained in detail here.

3. Throw in a logback.xml

We'll be configuring an SLF4J logger backed by Logback now.

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <encoder>
      <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
    </encoder>
    <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
      <fileNamePattern>logs\akka.%d{yyyy-MM-dd}.%i.log</fileNamePattern>
      <timeBasedFileNamingAndTriggeringPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP">
        <maxFileSize>50MB</maxFileSize>
      </timeBasedFileNamingAndTriggeringPolicy>
    </rollingPolicy>
  </appender>

  <root level="DEBUG">
    <appender-ref ref="FILE" />
  </root>
</configuration>

I threw this inside the main/resources folder too, along with application.conf. Please ensure that main/resources is now in your Eclipse or other IDE's classpath. Also include logback and slf4j-api in your build.sbt. And when we kick off our StudentSimulatorApp and send a message to our new TeacherLogActor, the akkaxxxxx.log file that we configured looks like this.

Testing Akka

Please note that this is by no means an exhaustive coverage of testing Akka. We will be building our tests on more features of testing in the following parts under their respective topic headers. These testcases are aimed at covering the Actors we wrote earlier. While the StudentSimulatorApp does what we need, you would agree that it should be driven out of testcases. To ease the testing pain, Akka came up with an amazing testing toolkit with which we can do some magical stuff like probing directly into the Actor implementation's internals. Enough talk, let's see the testcases.

Let's first try to map the StudentSimulatorApp to a testcase, and look at the declaration alone for now:

class TeacherPreTest extends TestKit(ActorSystem("UniversityMessageSystem"))
  with WordSpecLike
  with MustMatchers
  with BeforeAndAfterAll {

So, from the definition of the test class we see that:

- The TestKit trait accepts an ActorSystem through which we will be creating Actors. Internally, the TestKit decorates the ActorSystem and replaces the default dispatcher too.
- We use WordSpec, which is one of the many fun ways to write testcases with ScalaTest.
- The MustMatchers provide convenient methods to make the testcase read like natural language.
- We mix in BeforeAndAfterAll to shut down the ActorSystem after the testcases are complete. The afterAll method that the trait provides is more like our tearDown in JUnit.

1, 2 – Sending messages to Actors

- The first testcase just sends a message to the Print Actor. It doesn't assert anything!
- The second testcase sends a message to the Log Actor, which uses the log field of ActorLogging to publish the message to the EventStream. This doesn't assert anything either!

//1. Sends message to the Print Actor. Not even a testcase actually
"A teacher" must {

  "print a quote when a QuoteRequest message is sent" in {

    val teacherRef = TestActorRef[TeacherActor]
    teacherRef ! QuoteRequest
  }
}

//2. Sends message to the Log Actor.
//Again, not a testcase per se
"A teacher with ActorLogging" must {

  "log a quote when a QuoteRequest message is sent" in {

    val teacherRef = TestActorRef[TeacherLogActor]
    teacherRef ! QuoteRequest
  }
}

3 – Asserting internal state of Actors

The third case uses the underlyingActor method of the TestActorRef and calls upon the quoteList method of the TeacherActor. The quoteList method returns the list of quotes back. We use this list to assert its size. If the reference to quoteList throws you back, refer to the TeacherLogActor code listed above and look for:

//From TeacherLogActor
//We'll cover the purpose of this method in the Testing section
def quoteList = quotes

//3. Asserts the internal state of the Log Actor.
"have a quote list of size 4" in {

  val teacherRef = TestActorRef[TeacherLogActor]
  teacherRef.underlyingActor.quoteList must have size (4)
}

4 – Asserting log messages

As we discussed earlier in the EventStream and Logging section (above), all log messages go to the EventStream and the SLF4JLogger subscribes to it and uses its appenders to write to the log file/console etc. Wouldn't it be nice to subscribe to the EventStream directly in our testcase and assert the presence of the log message itself? Looks like we can do that too. This involves two steps:

1. You need to add an extra configuration to your TestKit, like so:

class TeacherTest extends TestKit(ActorSystem("UniversityMessageSystem",
    ConfigFactory.parseString("""akka.loggers = ["akka.testkit.TestEventListener"]""")))
  with WordSpecLike
  with MustMatchers
  with BeforeAndAfterAll {

2. Now that we have a subscription to the EventStream, we can assert it from our testcase as:

//4. Verifying log messages from eventStream
"be verifiable via EventFilter in response to a QuoteRequest that is sent" in {

  val teacherRef = TestActorRef[TeacherLogActor]
  EventFilter.info(pattern = "QuoteResponse*", occurrences = 1) intercept {
    teacherRef ! QuoteRequest
  }
}

The EventFilter.info block just intercepts one log message which starts with QuoteResponse (pattern = "QuoteResponse*"). (You could also achieve this by using start = "QuoteResponse".) If there is no log message as a result of sending a message to the TeacherLogActor, the testcase will fail.

5 – Testing Actors with constructor parameters

Please note that the way we create Actors in the testcase is via TestActorRef[TeacherLogActor] and not via system.actorOf. This is just so that we can get access to the Actor's internals through the underlyingActor method on the TestActorRef. We wouldn't be able to achieve this via the ActorRef that we have access to during regular runtime. (That doesn't give us any excuse to use TestActorRef in production. You'll be hunted down.)

If the Actor accepts parameters, then the way we create the TestActorRef would be:

val teacherRef = TestActorRef(new TeacherLogParameterActor(quotes))

The entire testcase would then look something like:
//5. have a quote list of the same size as the input parameter
"have a quote list of the same size as the input parameter" in {

  val quotes = List(
    "Moderation is for cowards",
    "Anything worth doing is worth overdoing",
    "The trouble is you think you have time",
    "You never gonna know if you never even try")

  val teacherRef = TestActorRef(new TeacherLogParameterActor(quotes))
  //val teacherRef = TestActorRef(Props(new TeacherLogParameterActor(quotes)))

  teacherRef.underlyingActor.quoteList must have size (4)
  EventFilter.info(pattern = "QuoteResponse*", occurrences = 1) intercept {
    teacherRef ! QuoteRequest
  }
}

Shutting down the ActorSystem

And finally, the afterAll lifecycle method:

override def afterAll() {
  super.afterAll()
  system.shutdown()
}

Code

As always, the entire project can be downloaded from GitHub here.

Reference: Akka Notes – Actor Logging and Testing from our JCG partner Arun Manivannan at the Rerun.me blog....
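A hedged aside on the build setup mentioned in the Akka article above: the post says to include logback and slf4j-api in build.sbt but does not show the snippet. A dependency block along these lines would be typical; the version numbers are assumptions from the Akka 2.3.x era and should be aligned with whatever Akka version you actually use:

// build.sbt - illustrative dependencies for SLF4J-backed Akka logging and the TestKit
libraryDependencies ++= Seq(
  "com.typesafe.akka" %% "akka-actor"      % "2.3.6",
  "com.typesafe.akka" %% "akka-slf4j"      % "2.3.6",          // provides akka.event.slf4j.Slf4jLogger
  "com.typesafe.akka" %% "akka-testkit"    % "2.3.6" % "test",
  "org.scalatest"     %% "scalatest"       % "2.2.1" % "test",
  "org.slf4j"          % "slf4j-api"       % "1.7.7",
  "ch.qos.logback"     % "logback-classic" % "1.1.2"
)

Without the akka-slf4j module on the classpath, the akka.event.slf4j.Slf4jLogger referenced from application.conf cannot be loaded and the ActorSystem will complain at startup.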

Visualizing engineering fields

Civil engineering.

Mechanical engineering.

Electronic engineering.

Software engineering.

Summary. Seriously, software engineering?

[The original post makes its point with a series of images for each field; only the captions and credits survive here.]

Photo credit attribution: images Clemuel Ricketts House drawing 1, Bequet-Ribault House Transverse Section with Details, Grand Central Terminal, Jackhammer_blow, OMS Pod schematic, Nikon f-mount, Kidule 2550, AT89C55WD and WPEVCContactorCharge2B courtesy of Wikimedia.

Reference: Visualizing engineering fields from our JCG partner Edmund Kirwan at the A blog about software. blog....

R: A first attempt at linear regression

I've been working through the videos that accompany the Introduction to Statistical Learning with Applications in R book and thought it'd be interesting to try out the linear regression algorithm against my meetup data set. I wanted to see how well a linear regression algorithm could predict how many people were likely to RSVP to a particular event. I started with the following code to build a data frame containing some potential predictors:

library(RNeo4j)
officeEventsQuery = "MATCH (g:Group {name: \"Neo4j - London User Group\"})-[:HOSTED_EVENT]->(event)<-[:TO]-({response: 'yes'})<-[:RSVPD]-(),
                           (event)-[:HELD_AT]->(venue)
                     WHERE (event.time + event.utc_offset) < timestamp() AND venue.name IN [\"Neo Technology\", \"OpenCredo\"]
                     RETURN event.time + event.utc_offset AS eventTime, event.announced_at AS announcedAt, event.name, COUNT(*) AS rsvps"

events = subset(cypher(graph, officeEventsQuery), !is.na(announcedAt))
events$eventTime <- timestampToDate(events$eventTime)
events$day <- format(events$eventTime, "%A")
events$monthYear <- format(events$eventTime, "%m-%Y")
events$month <- format(events$eventTime, "%m")
events$year <- format(events$eventTime, "%Y")
events$announcedAt <- timestampToDate(events$announcedAt)
events$timeDiff = as.numeric(events$eventTime - events$announcedAt, units = "days")

If we preview 'events' it contains the following columns:

> head(events)
            eventTime         announcedAt                                         event.name rsvps       day monthYear month year  timeDiff
1 2013-01-29 18:00:00 2012-11-30 11:30:57                                    Intro to Graphs    24   Tuesday   01-2013    01 2013 60.270174
2 2014-06-24 18:30:00 2014-06-18 19:11:19                                    Intro to Graphs    43   Tuesday   06-2014    06 2014  5.971308
3 2014-06-18 18:30:00 2014-06-08 07:03:13                          Neo4j World Cup Hackathon    24 Wednesday   06-2014    06 2014 10.476933
4 2014-05-20 18:30:00 2014-05-14 18:56:06                                    Intro to Graphs    53   Tuesday   05-2014    05 2014  5.981875
5 2014-02-11 18:00:00 2014-02-05 19:11:03                                    Intro to Graphs    35   Tuesday   02-2014    02 2014  5.950660
6 2014-09-04 18:30:00 2014-08-26 06:34:01 Hands On Intro to Cypher - Neo4j's Query Language    20  Thursday   09-2014    09 2014  9.497211

We want to predict 'rsvps' from the other columns, so I started off by creating a linear model which took all the other columns into account:

> summary(lm(rsvps ~., data = events))

Call:
lm(formula = rsvps ~ ., data = events)

Residuals:
    Min      1Q  Median      3Q     Max
-8.2582 -1.1538  0.0000  0.4158 10.5803

Coefficients: (14 not defined because of singularities)
                                                                     Estimate Std. Error t value Pr(>|t|)
(Intercept)                                                        -9.365e+03 3.009e+03 -3.113 0.00897 **
eventTime                                                           3.609e-06 2.951e-06  1.223 0.24479
announcedAt                                                         3.278e-06 2.553e-06  1.284 0.22339
event.nameGraph Modelling - Do's and Don'ts                         4.884e+01 1.140e+01  4.286 0.00106 **
event.nameHands on build your first Neo4j app for Java developers  3.735e+01 1.048e+01  3.562 0.00391 **
event.nameHands On Intro to Cypher - Neo4j's Query Language         2.560e+01 9.713e+00  2.635 0.02177 *
event.nameIntro to Graphs                                           2.238e+01 8.726e+00  2.564 0.02480 *
event.nameIntroduction to Graph Database Modeling                  -1.304e+02 4.835e+01 -2.696 0.01946 *
event.nameLunch with Neo4j's CEO, Emil Eifrem                       3.920e+01 1.113e+01  3.523 0.00420 **
event.nameNeo4j Clojure Hackathon                                  -3.063e+00 1.195e+01 -0.256 0.80203
event.nameNeo4j Python Hackathon with py2neo's Nigel Small          2.128e+01 1.070e+01  1.989 0.06998 .
event.nameNeo4j World Cup Hackathon   5.004e+00 9.622e+00  0.520 0.61251
dayTuesday                            2.068e+01 5.626e+00  3.676 0.00317 **
dayWednesday                          2.300e+01 5.522e+00  4.165 0.00131 **
monthYear01-2014                     -2.350e+02 7.377e+01 -3.185 0.00784 **
monthYear02-2013                     -2.526e+01 1.376e+01 -1.836 0.09130 .
monthYear02-2014                     -2.325e+02 7.763e+01 -2.995 0.01118 *
monthYear03-2013                     -4.605e+01 1.683e+01 -2.736 0.01805 *
monthYear03-2014                     -2.371e+02 8.324e+01 -2.848 0.01468 *
monthYear04-2013                     -6.570e+01 2.309e+01 -2.845 0.01477 *
monthYear04-2014                     -2.535e+02 8.746e+01 -2.899 0.01336 *
monthYear05-2013                     -8.672e+01 2.845e+01 -3.049 0.01011 *
monthYear05-2014                     -2.802e+02 9.420e+01 -2.975 0.01160 *
monthYear06-2013                     -1.022e+02 3.283e+01 -3.113 0.00897 **
monthYear06-2014                     -2.996e+02 1.003e+02 -2.988 0.01132 *
monthYear07-2014                     -3.123e+02 1.054e+02 -2.965 0.01182 *
monthYear08-2013                     -1.326e+02 4.323e+01 -3.067 0.00976 **
monthYear08-2014                     -3.060e+02 1.107e+02 -2.763 0.01718 *
monthYear09-2013                             NA        NA     NA      NA
monthYear09-2014                     -3.465e+02 1.164e+02 -2.976 0.01158 *
monthYear10-2012                      2.602e+01 1.959e+01  1.328 0.20886
monthYear10-2013                     -1.728e+02 5.678e+01 -3.044 0.01020 *
monthYear11-2012                      2.717e+01 1.509e+01  1.800 0.09704 .
month02 to month11, year2013, year2014, timeDiff     NA        NA     NA      NA
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 5.287 on 12 degrees of freedom
Multiple R-squared: 0.9585, Adjusted R-squared: 0.8512
F-statistic: 8.934 on 31 and 12 DF, p-value: 0.0001399

As I understand it, we can look at the R-squared value to understand how much of the variance in the data has been explained by the model – in this case it's 85%. A lot of the coefficients seem to be based around specific event names, which seems a bit too specific to me, so I wanted to see what would happen if I derived a feature which indicated whether a session was practical:

events$practical = grepl("Hackathon|Hands on|Hands On", events$event.name)

We can now run the model again with the new column, having excluded the 'event.name' field:

> summary(lm(rsvps ~., data = subset(events, select = -c(event.name))))

Call:
lm(formula = rsvps ~ ., data = subset(events, select = -c(event.name)))

Residuals:
    Min      1Q  Median      3Q     Max
-18.647  -2.311   0.000   2.908  23.218

Coefficients: (13 not defined because of singularities)
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)      -3.980e+03 4.752e+03 -0.838 0.4127
eventTime         2.907e-06 3.873e-06  0.751 0.4621
announcedAt       3.336e-08 3.559e-06  0.009 0.9926
dayTuesday        7.547e+00 6.080e+00  1.241 0.2296
dayWednesday      2.442e+00 7.046e+00  0.347 0.7327
monthYear01-2014 -9.562e+01 1.187e+02 -0.806 0.4303
monthYear02-2013 -4.230e+00 2.289e+01 -0.185 0.8553
monthYear02-2014 -9.156e+01 1.254e+02 -0.730 0.4742
monthYear03-2013 -1.633e+01 2.808e+01 -0.582 0.5676
monthYear03-2014 -8.094e+01 1.329e+02 -0.609 0.5496
monthYear04-2013 -2.249e+01 3.785e+01 -0.594 0.5595
monthYear04-2014 -9.230e+01 1.401e+02 -0.659 0.5180
monthYear05-2013 -3.237e+01 4.654e+01 -0.696 0.4952
monthYear05-2014 -1.015e+02 1.509e+02 -0.673 0.5092
monthYear06-2013 -3.947e+01 5.355e+01 -0.737 0.4701
monthYear06-2014 -1.081e+02 1.604e+02 -0.674 0.5084
monthYear07-2014 -1.110e+02 1.678e+02 -0.661 0.5163
monthYear08-2013 -5.144e+01 6.988e+01 -0.736 0.4706
monthYear08-2014 -1.023e+02 1.784e+02 -0.573 0.5731
monthYear09-2013 -6.057e+01 7.893e+01 -0.767 0.4523
monthYear09-2014 -1.260e+02 1.874e+02 -0.672 0.5094
monthYear10-2012  9.557e+00 2.873e+01  0.333 0.7430
monthYear10-2013 -6.450e+01 9.169e+01 -0.703 0.4903
monthYear11-2012  1.689e+01 2.316e+01  0.729 0.4748
month02 to month11, year2013, year2014, timeDiff  NA  NA  NA  NA
practicalTRUE    -9.388e+00 5.289e+00 -1.775 0.0919 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 10.21 on 19 degrees of freedom
Multiple R-squared: 0.7546, Adjusted R-squared: 0.4446
F-statistic: 2.434 on 24 and 19 DF, p-value: 0.02592

Now we're only accounting for 44% of the variance and none of our coefficients are significant, so this wasn't such a good change. I also noticed that we've got a bit of overlap in the date-related features – we've got one column for monthYear and then separate ones for month and year. Let's strip out the combined one:

> summary(lm(rsvps ~., data = subset(events, select = -c(event.name, monthYear))))

Call:
lm(formula = rsvps ~ ., data = subset(events, select = -c(event.name, monthYear)))

Residuals:
     Min       1Q   Median       3Q      Max
-16.5745  -4.0507  -0.1042   3.6586  24.4715

Coefficients: (1 not defined because of singularities)
               Estimate Std. Error t value Pr(>|t|)
(Intercept)   -1.573e+03 4.315e+03 -0.364 0.7185
eventTime      3.320e-06 3.434e-06  0.967 0.3425
announcedAt   -2.149e-06 2.201e-06 -0.976 0.3379
dayTuesday     4.713e+00 5.871e+00  0.803 0.4294
dayWednesday  -2.253e-01 6.685e+00 -0.034 0.9734
month02        3.164e+00 1.285e+01  0.246 0.8075
month03        1.127e+01 1.858e+01  0.607 0.5494
month04        4.148e+00 2.581e+01  0.161 0.8736
month05        1.979e+00 3.425e+01  0.058 0.9544
month06       -1.220e-01 4.271e+01 -0.003 0.9977
month07        1.671e+00 4.955e+01  0.034 0.9734
month08        8.849e+00 5.940e+01  0.149 0.8827
month09       -5.496e+00 6.782e+01 -0.081 0.9360
month10       -5.066e+00 7.893e+01 -0.064 0.9493
month11        4.255e+00 8.697e+01  0.049 0.9614
year2013      -1.799e+01 1.032e+02 -0.174 0.8629
year2014      -3.281e+01 2.045e+02 -0.160 0.8738
timeDiff              NA        NA     NA     NA
practicalTRUE -9.816e+00 5.084e+00 -1.931 0.0645 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 10.19 on 26 degrees of freedom
Multiple R-squared: 0.666, Adjusted R-squared: 0.4476
F-statistic: 3.049 on 17 and 26 DF, p-value: 0.005187

Again, none of the coefficients are statistically significant, which is disappointing. I think the main problem may be that I have very few data points (only 42), making it difficult to come up with a general model. I think my next step is to look for some other features that could impact the number of RSVPs, e.g. other events on that day, or the weather. I'm a novice at this but trying to learn more, so if you have any ideas of what I should do next please let me know.

Reference: R: A first attempt at linear regression from our JCG partner Mark Needham at the Mark Needham Blog blog....

How To Get a Job in a Different City

There are subtle nuances to job searches outside of the local area. Unless a candidate is considered superlative, non-local applicants are not always given the same level of attention as locals when employers have healthy candidate pools with local applicants. Why might remoteness impact interview decisions (even in a tight market), and how can the potential for negative bias be minimized? We'll get to that in a minute. Before we can apply for a job, we need to find it.

Finding jobs

- Job sites – The usual suspects are where some people start, and those jobs will have multiple applicants. Googling to find regional job sites may help find companies that fly under the radar.
- LinkedIn – The Jobs tab can create a search for new posts, but everybody may use that strategy. Try an Advanced People Search using one or more of the technologies or skills (in the keywords box) that might be used by an attractive employer, and enter a zip code and mileage range for the desired location. Note both the current and past employers on the profiles, then research those firms.
- Remote networking – Reaching out directly to some of the profiles found during the LinkedIn search will produce leads. Many fellow technologists will respond to messages stating a desire to move to their area. Finding a local recruiter on LinkedIn or via web search may bring several opportunities.
- User groups and meetups – Some user group sites have job ads, and sponsoring firms usually have a local presence. Speakers from past meetings often live locally. User group leaders are often contacted by recruiters and hiring companies that are looking for talent, so contacting group leaders directly and asking "Who is hiring?" should be helpful.
- …or let the jobs find you – Change the location field on a LinkedIn profile to the desired location and add language indicating an interest in new opportunities, and companies and agencies from that location may start knocking.

Applying for jobs

Now that the jobs are identified, initial contact must be made. This is where things can get complicated. Recruiters and HR professionals are tasked with looking at résumés and any accompanying material in order to make a reasonably quick yes/no decision on an initial interview. Screeners know an interview process is time consuming, and the decision to start that process will usually take valuable time from several employees of the organization. There are several factors that go into this decision, with the candidate's qualifications being the most important and obvious. Another factor is the recruiter's assessment of the likelihood that a candidate would accept the job if offered, which is based on any obvious or assumed barriers. Details such as the candidate's compensation requirements in relation to company pay ranges, or current job title in relation to the vacant job title, may play a role in the decision. Is someone making 150K likely to accept our job paying 110K? Is a Chief Architect likely to accept our Intermediate Developer position? And generally speaking, is this person likely to accept a job in another location? For exceptional candidates these questions are irrelevant, as they will be screened. But if a candidate barely meets the minimum requirements, has a couple of additional flags, and happens to be non-local, will the employer even bother screening the candidate? Should they?
Without additional context, it may be assumed that a recent graduate in the midwest who applies to a job in New York City is probably also shipping résumés to Silicon Valley, Chicago, or Seattle. The HR rep could believe that they are competing with many companies across several markets, each with its own reputation, characteristics, and cost of living. How likely is it that this candidate will not only choose our market, but also choose our company? How can we lessen the impact of these assumptions and potential biases?

- Mention location – When location isn't mentioned by non-local applicants and no other information is given, the screener is likely to get the impression that this candidate is indiscriminately applying to positions. An applicant's non-local address is the elephant in the room, so it is vital to reference it immediately in a cover letter. If a future address is known, it should be listed on the résumé along with the current address. Keep in mind that the screener may open a résumé before reading any accompanying material. When there is a specific reason for relocating to this location, such as a family situation or a spouse's job relocation, that information will be additional evidence of intent.
- Availability for interviews – Listing available dates for on-site interviews demonstrates at least some level of commitment to the new location. Screeners interpret this as a buying sign.
- Availability for start – Candidates that relocate for positions may have to sell a home, sublet an apartment, or have children in the middle of a school year. A mention of start date helps to set expectations early.

Additional considerations

- Cost of living and salary – Some ads request salary history and compensation expectations. Be sure to research salaries and market values in the new city, and state that committing to a future salary figure is difficult until all of the data is collected.
- Relocation assistance – Companies may be willing to provide some relocation assistance even for candidates who are already planning a move. Requesting a relo package in an application adds a potential reason for rejection, but negotiating relo money during the offer process is common. Since it is a one-time cost, companies may be more willing to provide relo if negotiations on salary or benefits become sticky.
- Consider the overall market – Before committing to an opportunity in another city, research employment prospects beyond the target company. How healthy is the job market, and how many other local companies have specific demand for the same skills? A strong local tech market does not always indicate a strong market for certain specialties.

Reference: How To Get a Job in a Different City from our JCG partner Dave Fecak at the Job Tips For Geeks blog....

Neo4j: Generic/Vague relationship names

An approach to modelling that I often see while working with Neo4j users is creating very generic relationships (e.g. HAS, CONTAINS, IS) and filtering on a relationship property or on a property/label at the end node. Intuitively this doesn't seem to make best use of the graph model, as it means that you have to evaluate many relationships and nodes that you're not interested in. However, I've never actually tested the performance differences between the approaches, so I thought I'd try it out.

I created 4 different databases which had one node with 60,000 outgoing relationships – 10,000 which we wanted to retrieve and 50,000 that were irrelevant. I modelled the 'relationship' in 4 different ways:

- Using a specific relationship type: (node)-[:HAS_ADDRESS]->(address)
- Using a generic relationship type and then filtering by end node label: (node)-[:HAS]->(address:Address)
- Using a generic relationship type and then filtering by relationship property: (node)-[:HAS {type: "address"}]->(address)
- Using a generic relationship type and then filtering by end node property: (node)-[:HAS]->(address {type: "address"})

...and then measured how long it took to retrieve the 'has address' relationships. The code is on github if you want to take a look. Although it's obviously not as precise as a JMH micro benchmark, I think it's good enough to get a feel for the difference between the approaches. I ran a query against each database 100 times and then took the 50th, 75th and 99th percentiles (times are in ms):

Using a generic relationship type and then filtering by end node label
50%ile: 6.0   75%ile: 6.0   99%ile: 402.60999999999825

Using a generic relationship type and then filtering by relationship property
50%ile: 21.0   75%ile: 22.0   99%ile: 504.85999999999785

Using a generic relationship type and then filtering by end node property
50%ile: 4.0   75%ile: 4.0   99%ile: 145.65999999999931

Using a specific relationship type
50%ile: 0.0   75%ile: 1.0   99%ile: 25.749999999999872

We can drill further into why there's a difference in the times for each of the approaches by profiling the equivalent cypher query. We'll start with the one which uses a specific relationship name:

Using a specific relationship type

neo4j-sh (?)$ profile match (n) where id(n) = 0 match (n)-[:HAS_ADDRESS]->() return count(n);
+----------+
| count(n) |
+----------+
| 10000    |
+----------+
1 row

ColumnFilter
|
+EagerAggregation
  |
  +SimplePatternMatcher
    |
    +NodeByIdOrEmpty

+----------------------+-------+--------+-----------------------------+-----------------------+
| Operator             | Rows  | DbHits | Identifiers                 | Other                 |
+----------------------+-------+--------+-----------------------------+-----------------------+
| ColumnFilter         | 1     | 0      |                             | keep columns count(n) |
| EagerAggregation     | 1     | 0      |                             |                       |
| SimplePatternMatcher | 10000 | 10000  | n, UNNAMED53, UNNAMED35     |                       |
| NodeByIdOrEmpty      | 1     | 1      | n, n                        | { AUTOINT0}           |
+----------------------+-------+--------+-----------------------------+-----------------------+

Total database accesses: 10001

Here we can see that there were 10,001 database accesses in order to get a count of our 10,000 HAS_ADDRESS relationships. We get a database access each time we load a node, relationship or property.
By contrast, the other approaches have to load in a lot more data only to then filter it out:

Using a generic relationship type and then filtering by end node label

neo4j-sh (?)$ profile match (n) where id(n) = 0 match (n)-[:HAS]->(:Address) return count(n);
+----------+
| count(n) |
+----------+
| 10000    |
+----------+
1 row

ColumnFilter
|
+EagerAggregation
  |
  +Filter
    |
    +SimplePatternMatcher
      |
      +NodeByIdOrEmpty

+----------------------+-------+--------+-----------------------------+----------------------------------+
| Operator             | Rows  | DbHits | Identifiers                 | Other                            |
+----------------------+-------+--------+-----------------------------+----------------------------------+
| ColumnFilter         | 1     | 0      |                             | keep columns count(n)            |
| EagerAggregation     | 1     | 0      |                             |                                  |
| Filter               | 10000 | 10000  |                             | hasLabel( UNNAMED45:Address(0))  |
| SimplePatternMatcher | 10000 | 60000  | n, UNNAMED45, UNNAMED35     |                                  |
| NodeByIdOrEmpty      | 1     | 1      | n, n                        | { AUTOINT0}                      |
+----------------------+-------+--------+-----------------------------+----------------------------------+

Total database accesses: 70001

Using a generic relationship type and then filtering by relationship property

neo4j-sh (?)$ profile match (n) where id(n) = 0 match (n)-[:HAS {type: "address"}]->() return count(n);
+----------+
| count(n) |
+----------+
| 10000    |
+----------+
1 row

ColumnFilter
|
+EagerAggregation
  |
  +Filter
    |
    +SimplePatternMatcher
      |
      +NodeByIdOrEmpty

+----------------------+-------+--------+-----------------------------+--------------------------------------------------+
| Operator             | Rows  | DbHits | Identifiers                 | Other                                            |
+----------------------+-------+--------+-----------------------------+--------------------------------------------------+
| ColumnFilter         | 1     | 0      |                             | keep columns count(n)                            |
| EagerAggregation     | 1     | 0      |                             |                                                  |
| Filter               | 10000 | 20000  |                             | Property( UNNAMED35,type(0)) == { AUTOSTRING1}   |
| SimplePatternMatcher | 10000 | 120000 | n, UNNAMED63, UNNAMED35     |                                                  |
| NodeByIdOrEmpty      | 1     | 1      | n, n                        | { AUTOINT0}                                      |
+----------------------+-------+--------+-----------------------------+--------------------------------------------------+

Total database accesses: 140001

Using a generic relationship type and then filtering by end node property

neo4j-sh (?)$ profile match (n) where id(n) = 0 match (n)-[:HAS]->({type: "address"}) return count(n);
+----------+
| count(n) |
+----------+
| 10000    |
+----------+
1 row

ColumnFilter
|
+EagerAggregation
  |
  +Filter
    |
    +SimplePatternMatcher
      |
      +NodeByIdOrEmpty

+----------------------+-------+--------+-----------------------------+--------------------------------------------------+
| Operator             | Rows  | DbHits | Identifiers                 | Other                                            |
+----------------------+-------+--------+-----------------------------+--------------------------------------------------+
| ColumnFilter         | 1     | 0      |                             | keep columns count(n)                            |
| EagerAggregation     | 1     | 0      |                             |                                                  |
| Filter               | 10000 | 20000  |                             | Property( UNNAMED45,type(0)) == { AUTOSTRING1}   |
| SimplePatternMatcher | 10000 | 120000 | n, UNNAMED45, UNNAMED35     |                                                  |
| NodeByIdOrEmpty      | 1     | 1      | n, n                        | { AUTOINT0}                                      |
+----------------------+-------+--------+-----------------------------+--------------------------------------------------+

Total database accesses: 140001

So in summary… specific relationships #ftw!

Reference: Neo4j: Generic/Vague relationship names from our JCG partner Mark Needham at the Mark Needham Blog blog....

Test-Driven Development (TDD)

What is Test-Driven Development (TDD)?

Test-Driven Development is a process that relies on the repetition of a very short development cycle. It is based on the test-first concept of Extreme Programming (XP) that encourages simple design with a high level of confidence. The procedure of doing TDD is the following:

1. Write a test
2. Run all tests
3. Write the implementation code
4. Run all tests
5. Refactor

This procedure is often called red-green-refactor. While writing tests we are in the red state. Since the test is written before the actual implementation, it is supposed to fail. If it doesn't, the test is wrong. It describes something that already exists or it was written incorrectly. Being in green while writing tests is a sign of a false positive. Tests like that should be removed or refactored.

Next comes the green state. When the implementation of the last test is finished, all tests should pass. If they don't, the implementation is wrong and should be corrected. The idea is not to make the implementation final, but to provide just enough code for the tests to pass. Once everything is green we can proceed to refactor the existing code. That means that we are making the code more optimal without introducing new features. While the refactoring is in place, all tests should be passing all the time. If one of them fails, the refactoring broke existing functionality. Refactoring should not include new tests.

Speed is the key

I tend to see TDD as a game of ping pong (or table tennis). The game is very fast. The same holds true for TDD. I tend not to spend more than a minute on either side of the table (test and implementation). Write a short test and run it (ping), write the implementation and run all tests (pong), write another test (ping), write the implementation of that test (pong), refactor and confirm that all tests are passing (score), repeat. Ping, pong, ping, pong, ping, pong, score, serve again. Do not try to make the perfect code. Instead, try to keep the ball rolling until you think that the time is right to score (refactor).

It's not about testing

The T in TDD is often misunderstood. TDD is the way we approach the design. It is the way to force us to think about the implementation before writing the code. It is the way to better structure the code. That does not mean that tests resulting from using TDD are useless. Far from it. They are very useful and allow us to develop with great speed without being afraid that something will be broken. That is especially true when refactoring takes place. Being able to reorganize the code while having the confidence that no functionality is broken is a huge boost to the quality of the code. The main objective of TDD is code design, with tests as a very useful side product.

Mocking

In order for tests to run fast, thus providing constant feedback, code needs to be organized in a way that methods, functions and classes can be easily mocked and stubbed. The speed of the execution will be severely affected if, for example, our tests need to communicate with the database. By mocking external dependencies we are able to increase that speed drastically. More important than speed, designing the code in a way that it can be easily mocked and stubbed forces us to better structure that code by applying separation of concerns. The whole unit test suite execution should be measured in minutes, if not seconds. With or without mocks, the code should be written in a way that we can, for example, easily replace one database for another. That "another" can be, for example, a mocked or in-memory database.
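To illustrate that point in Java (the interface and class names below are hypothetical, not from the article): by depending on an abstraction rather than a concrete database class, a test written first can swap in a fake and stay fast.

import org.junit.Assert;
import org.junit.Test;

// Production code depends on an abstraction, not on a concrete database
interface UserRepository {
    String findNameById(long id);
}

class GreetingService {
    private final UserRepository repository;

    GreetingService(UserRepository repository) {
        this.repository = repository;
    }

    String greet(long userId) {
        return "Hello, " + repository.findNameById(userId) + "!";
    }
}

// The test drives the design: a hand-rolled stub keeps it fast and database-free
public class GreetingServiceTest {

    @Test
    public void greetsUserByName() {
        UserRepository fake = id -> "Ada";                 // stubbed lookup, no real database
        GreetingService service = new GreetingService(fake);

        Assert.assertEquals("Hello, Ada!", service.greet(42L));
    }
}

Swapping the fake for a real JDBC- or MongoDB-backed implementation later requires no change to GreetingService or to its tests.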
An example of mocking in Scala can be found in the Scala Test-Driven Development (TDD): Unit Testing File Operations with Specs2 and Mockito article. If your programming language of choice is not Scala, the article can still be very useful for seeing a pattern that can be applied to any language.

Watchers

A very useful tool when working in the TDD fashion is a watcher. Watchers are frameworks or tools that are started before we begin working and watch for any change in the code. When such a change is detected, all the tests are run. In the case of JavaScript, almost all build systems and task runners allow this. Gulp (my favorite) and Grunt are two out of many examples. Scala has sbt-revolver (among others). Most other programming languages have similar tools that recompile (if needed) and run all (or the affected) tests when the code changes. I always end up having my screen split into two windows: one with the code I'm working on, and the other with the results of the tests that are being executed continually. All I have to do is pay attention that the output of those watchers corresponds with the phase I'm in (red or green).

Documentation

Another very useful side effect of TDD (and well structured tests in general) is documentation. In most cases, it is much easier to find out what the code does by looking at the tests than at the implementation itself. An additional benefit that other types of documentation cannot provide is that tests are never outdated. If there is any discrepancy between the tests and the implementation code, tests fail. Failed tests mean inaccurate documentation. The Tests as documentation article provides a bit deeper reasoning behind the usage of tests instead of traditional documentation.

Summary

In my experience, TDD is probably the most important tool we have in our software toolbox. It takes a lot of time and patience to become proficient in TDD, but once that art is mastered, productivity and quality increase drastically. The best way to both learn and practice TDD is in combination with pair programming. As in a game of ping pong that requires two participants, TDD can be done in pairs, where one coder writes tests and the other writes the implementation of those tests. Roles can switch after every test (as is often done in coding dojos). Give it a try and don't give up when faced with obstacles, since there will be many. Test Driven Development (TDD): Best Practices Using Java Examples is a good starting point. Even though it uses Java examples, the same practices can be applied to any programming language. For an example in Java (as in the previous case, easily applicable to other languages) please take a look at the TDD Example Walkthrough article. Another great way to perfect TDD skills is code katas (there are many on this site). What is your experience with TDD? There are as many variations as there are teams practicing it and I'd like to hear about your experience.

Reference: Test-Driven Development (TDD) from our JCG partner Viktor Farcic at the Technology conversations blog....

Creating a Succession Plan for Your Technical Team

We often think about a succession plan for managers. But if you're not thinking about a succession plan for your technical team, you're falling prey to local shortages and hiring the same old kinds of people. You're not getting diverse people. That means you may not be able to create innovative, great products. It also means your people might be stuck. As soon as they can, they might leave.

Sometimes, when I coach people on their hiring process, I discover that they have all one kind of person. Everyone has five years of experience in one domain. Or, everyone has fifteen years. Or, everyone has the same background. Everyone looks alike. Everyone—even though they were hired at different times—has exactly the same demographics. This is not good. You want a mixture of experience on your team. You want some people with less experience and some people with more.

I once had a client who, through their hiring practices and attrition, ended up with people who had no less than 25 years of experience. Every single person had at least 25 years of experience in this particular domain. It was very interesting introducing change to that organization, especially to the managers. The technical staff had no problem with change. But the managers? Oh boy. They had worked in a particular way for so long they had problems thinking in any other way. That was a problem.

It's not that less or more experience leads to easier or more difficult change. It's that heterogeneity in a team tends to lead to more innovation and more acceptance of change. So, what can you do to create a succession plan for your team?

1. Assess the number of entry-level, mid-level, senior, and principal technical staff you have. I think of entry-level as 0-2 years, mid-level as about 2-10 years, senior as about 10-20 years, and principal as about 20 years and on. Your ranges may vary. If you have narrower ranges, ask yourself why. If you start senior engineers at 5 years of experience, I want to know how the heck you can. You can call them anything you want. Are they really senior? Or do you have title inflation?
2. If you don't already have one, create an expertise criteria chart. That's a chart that shows what the criteria are for each level. Because your people might just have a year of experience every year, and not really have acquired any valuable experience. You and I both know people like that, right? Take the qualities, preferences and non-technical skills that you value the most when you hire. Explain what you want in each level, and that's how you create an expertise criteria chart for your team.
3. Resolve the criteria across the organization, so that your team is on par with the rest of the organization.
4. In your one-on-ones, have a conversation with each person about their career goals and how you see their career over time. Provide feedback. If they want coaching, provide that.

Now you have data. You have information about how people are performing against what you need. You have information about how you could "slot" people into the HR ranges, if you need to do so. And, if you need to hire people, you have the opportunity to hire people where you need to do so. I did this when I was a manager. I needed the data to bring one person to parity. I needed the data later to bring an entire testing team to parity with the developers. This is a ton of work. You can do it. It's worth it.

Reference: Creating a Succession Plan for Your Technical Team from our JCG partner Johanna Rothman at the Managing Product Development blog....