Tag Archives: Big Data

Big Data SQL: Overview of Apache Drill Query Execution Capabilities – Whiteboard Walkthrough

In this week’s Whiteboard Walkthrough, Neeraja Rentachintala, Senior Director of Product Management at MapR Technologies, gives an overview of how open source Apache Drill achieves low latency for interactive SQL queries carried out on large datasets. With Drill, you can use familiar ANSI SQL BI tools, such as Tableau or MicroStrategy, plus do exploration directly on big data. For additional ...

Read More »

How Apache Kafka and MapR Streams Handle Topic Partitions

Streaming data can be used as a long-term auditable history when you choose a messaging system with persistence, but is this approach practical in terms of the cost of storing years of data at scale? The answer is “yes,” particularly because of the way topic partitions are handled in MapR Streams. Here’s how it works. Streaming Data as a Long ...

Read More »

The Changing Economics of Big Data

Perhaps you’re old enough to remember when the library was the place we went to learn. We foraged through card catalogs, encyclopedias and the Reader’s Guide to Periodical Literature in hopes that we’d be able to understand what was going on in other people’s minds when they decided what went where. The process was time-consuming, frustrating and often futile. We ...

Read More »

Distributed Deep Learning with Caffe Using a MapR Cluster

We have experimented with CaffeOnSpark on a 5-node MapR 5.1 cluster running Spark 1.5.2 and will share our experience, difficulties, and solutions in this blog post. Deep Learning and Caffe Deep learning has been getting a lot of attention recently, with AlphaGo beating one of the world’s top players at a game that was thought so complicated as to be out of reach of ...

Read More »

Spark Streaming and Twitter Sentiment Analysis

This blog post is the result of my efforts to show a coworker how to get the insights he needed by using the streaming capabilities and concise API of Apache Spark. In this blog post, you’ll learn how to do some simple yet very interesting analytics that will help you solve real problems by analyzing specific areas of a ...

Read More »

Key Steps for Removing the Hive Metastore Password from the Hive Configuration

In a typical Hive installation with metadata in a MySQL configuration, a password is stored in clear text in a configuration file. This presents a few risks: 1) Unauthorized access could destroy or modify Hive metadata and disrupt workflows. A malicious user could alter Hive permissions or damage metadata. 2) This password permits hiveserver2-thrift-MySQL communication. To avoid this problem, you should use ...

Read More »
