Big Data
-
Software Development
Key Steps for Removing the Hive Metastore Password from the Hive Configuration
In a typical Hive installation with metadata in a MySQL configuration, a password is configured in a configuration file in…
-
Software Development
Spark Data Source API: Extending Our Spark SQL Query Engine
In my last post, Apache Spark as a Distributed SQL Engine, we explained how we could use SQL to query…
-
Software Development
Achieving Sub-Second SQL JOINs and Building a Data Warehouse Using Spark, Cassandra, and FiloDB
Evan loves to design, build, and improve bleeding edge distributed data and backend systems using the latest in open source…
-
Software Development
The Method Behind March Madness
There are 150 quintillion (i.e. the one after quadrillion) permutations to consider when completing your NCAA bracket. Some of us…
-
Software Development
Getting Started with MapR Streams
MapR Streams is a new distributed messaging system for streaming event data at scale, and it’s integrated into the MapR…
-
Software Development
Apache Flink GA – Planning for the Future
The distributed computation world has seen a massive shift in the last decade. Apache Hadoop showed up on the scene…
-
Software Development
Gartner 2016 Magic Quadrant for Data Warehouse and Database Management Solutions for Analytics
We are excited to share with you that Gartner has named MapR a Visionary in the Gartner 2016 Magic Quadrant…
-
Software Development
The most important thing to know in Cassandra data modeling: The primary key
Patrick McFadin, Chief Evangelist for Apache Cassandra, DataStax. Patrick is regarded as one of the foremost experts of Apache Cassandra…
-
Software Development
Decentralized Analytics for a Complex World
In 2015, General Stan McChrystal published Team of Teams: New Rules of Engagement for a Complex World. It was the…