Home » Author Archives: Bill Bejeck

Author Archives: Bill Bejeck

Husband, father of 3, passionate about software development.

What’s new in Java 8 – Date API

java-logo

With the final release of Java 8 around the corner, one of the new features I’m excited about is the new Date API, a result of the work on JSR 310. While Lambda expressions are certainly the big draw of Java 8, having a better way to work with dates is a decidedly welcome addition. This is a quick post ...

Read More »

MapReduce Algorithms – Understanding Data Joins Part II

apache-hadoop-logo

It’s been awhile since I last posted, and like last time I took a big break, I was taking some classes on Coursera. This time it was Functional Programming Principals in Scala and Principles of Reactive Programming. I found both of them to be great courses and would recommend taking either one if you have the time. In this post ...

Read More »

Fine-Grained Concurrency with the Guava Striped Class

java-logo

This post is going to cover how to use the Striped class from Guava to achieve finer-grained concurrency. The ConcurrentHashMap uses a striped locked approach to increase concurrency and the Striped class extends this principal by giving us the ability to have striped Locks, ReadWriteLocks and Semaphores. When accessing an object or data-structure such as an Array or HashMap typically ...

Read More »

Configuring Hadoop with Guava MapSplitters

apache-hadoop-logo

In this post we are going to provide a new twist on passing configuration parameters to a Hadoop Mapper via the Context object. Typically, we set configuration parameters as key/value pairs on the Context object when starting a map-reduce job. Then in the Mapper we use the key(s) to retrieve the value(s) to use for our configuration needs. The twist ...

Read More »

MapReduce Algorithms – Understanding Data Joins Part 1

apache-hadoop-mapreduce-logo

In this post we continue with our series of implementing the algorithms found in the Data-Intensive Text Processing with MapReduce book, this time discussing data joins. While we are going to discuss the techniques for joining data in Hadoop and provide sample code, in most cases you probably won’t be writing code to perform joins yourself. Instead, joining data is ...

Read More »

MapReduce Algorithms – Secondary Sorting

apache-hadoop-mapreduce-logo

We continue with our series on implementing MapReduce algorithms found in Data-Intensive Text Processing with MapReduce book. Other posts in this series: Working Through Data-Intensive Text Processing with MapReduce Working Through Data-Intensive Text Processing with MapReduce – Local Aggregation Part II Calculating A Co-Occurrence Matrix with Hadoop MapReduce Algorithms – Order Inversion       This post covers the pattern ...

Read More »

MapReduce Algorithms – Order Inversion

apache-hadoop-logo

This post is another segment in the series presenting MapReduce algorithms as found in the Data-Intensive Text Processing with MapReduce book. Previous installments are Local Aggregation, Local Aggregation PartII and Creating a Co-Occurrence Matrix. This time we will discuss the order inversion pattern. The order inversion pattern exploits the sorting phase of MapReduce to push data needed for calculations to ...

Read More »

Google Guava BloomFilter

java-logo

When the Guava project released version 11.0, one of the new additions was the BloomFilter class. A BloomFilter is a unique data-structure used to indicate if an element is contained in a set. What makes a BloomFilter interesting is it will indicate if an element is absolutely not contained, or may be contained in a set. This property of never ...

Read More »

Google Guava EventBus and Java 7 WatchService for Event Programming

java-interview-questions-answers

This post is going to cover using the Guava EventBus to publish changes to a directory or sub-directories detected by the Java 7 WatchService. The Guava EventBus is a great way to add publish/subscribe communication to an application. The WatchService, new in the Java 7 java.nio.file package, is used to monitor a directory for changes. Since the EventBus and WatchService ...

Read More »

Calculating A Co-Occurrence Matrix with Hadoop

apache-hadoop-mapreduce-logo

This post continues with our series of implementing the MapReduce algorithms found in the Data-Intensive Text Processing with MapReduce book. This time we will be creating a word co-occurrence matrix from a corpus of text. Previous posts in this series are: Working Through Data-Intensive Text Processing with MapReduce Working Through Data-Intensive Text Processing with MapReduce – Local Aggregation Part II ...

Read More »
Want to take your Java Skills to the next level?
Grab our programming books for FREE!
  • Save time by leveraging our field-tested solutions to common problems.
  • The books cover a wide range of topics, from JPA and JUnit, to JMeter and Android.
  • Each book comes as a standalone guide (with source code provided), so that you use it as reference.
Last Step ...

Where should we send the free eBooks?

Good Work!
To download the books, please verify your email address by following the instructions found on the email we just sent you.