Author Archives: Brian ONeill

Integrating Syslog w/ Kinesis : Anticipating use of the Firehose

On the heels of the Kinesis Firehose announcement, more people are going to be looking to integrate Kinesis with logging systems (to expedite/simplify the ingestion of logs into S3 and Redshift). Here is one take on solving that problem that integrates syslog-ng with Kinesis. First, let’s have a look at the syslog-ng configuration. In the syslog-ng configuration, you wire sources ...

Read More »

Streaming data into HPCC using Java

High Performance Computing Cluster (HPCC) is a distributed processing framework akin to Hadoop, except that it runs programs written in its own Domain Specific Language (DSL) called Enterprise Control Language (ECL).   ECL is great, but occasionally you will want to call out to perform heavy lifting in other languages.  For example, you may want to leverage an NLP library ...

Read More »

Tuning Hadoop & Cassandra : Beware of vNodes, Splits and Pages

When running Hadoop jobs against Cassandra, you will want to be careful about a few parameters. Specifically, pay special attention to vNodes, Splits and Page Sizes. vNodes were introduced in Cassandra 1.2. vNodes allow a host to have multiple portions of the token range.  This allows for more evenly distributed data, which means nodes can share the burden of a ...

Read More »

High-Performance Computing Clusters (HPCC) and Cassandra on OS X

Our new parent company, LexisNexis, has one of the world’s largest public records databases: “…our comprehensive collection of more than 46 billion records from more than 10,000 diverse sources—including public, private, regulated, and derived data. You get comprehensive information on approximately 269 million individuals and 277 million unique businesses.” http://www.lexisnexis.com/en-us/products/public-records.page And they’ve been managing, analyzing and searching this database for ...

Read More »

Delta Architectures: Unifying the Lambda Architecture and leveraging Storm from Hadoop/REST

Recently, I’ve been asked by a bunch of people to go into more detail on the Druid/Storm integration that I wrote for our book: Storm Blueprints for Distributed Real-time Computation.  Druid is great. Storm is great. And the two together appear to solve the real-time dimensional query/aggregations problem. In fact, it looks like people are taking it mainstream, calling it ...

Read More »

Diction in Software Development (i.e. Don’t be a d1ck!)

Over the years, I’ve come to realize how important diction is in software development (and life in general). It may mean the difference between a 15-minute meeting where everyone nods their heads and a day-long battle of egos (especially when you have a room full of passionate people). Here are a couple of key words and phrases I’ve incorporated into ...

Read More »

The Life(Cycles) of UX/UI Development

It recently occurred to me that not one of the dozens and dozens of user interfaces I’ve worked on over the years had the same methodology/lifecycle. Many of those were results of the environments under which they were constructed: startup, BIG company, government contract, side-project, open-source, freelance, etc. But the technology also played a part in choosing the methodology we ...

Read More »

Applied Big Data : The Freakonomics of Healthcare

I went with a less provocative title this time because my last blog post (http://brianoneill.blogspot.com/2014/04/big-data-fixes-obamacare.html) evidently incited political flame wars. In this post, I hope to avoid that by detailing exactly how Big Data can help our healthcare system in a nonpartisan way. First, let’s decompose the problem a bit. Economics: Our healthcare system is still (mostly) based on capitalism: ...

Read More »
