
Author Archives: Carol Mcdonald

Event Driven Microservices Patterns

In this blog we will discuss some patterns which are often used in microservices applications which need to scale: Event Stream, Event Sourcing, Polyglot Persistence, Memory Image, and Command Query Responsibility Separation. The Motivation: Uber, Gilt and others have moved from a monolithic to a microservices architecture because they needed to scale. A monolithic application puts all of its functionality into a ...


Monitoring Real-Time Uber Data Using Spark Machine Learning, Streaming, and the Kafka API (Part 2)

This post is the second part in a series where we will build a real-time example for analysis and monitoring of Uber car GPS trip data. If you have not already read the first part of this series, you should read that first. The first post discussed creating a machine learning model using Apache Spark’s K-means algorithm to cluster Uber data based ...


Predicting Breast Cancer Using Apache Spark Machine Learning Logistic Regression

In this blog post, I’ll help you get started using Apache Spark’s spark.ml Logistic Regression for predicting cancer malignancy. The goal of Spark’s spark.ml library is to provide a set of APIs on top of DataFrames that help users create and tune machine learning workflows or pipelines. Using spark.ml with DataFrames improves performance through intelligent optimizations. Classification: Classification is a family of ...


How to Get Started with Spark Streaming and MapR Streams Using the Kafka API

This post will help you get started using Apache Spark Streaming for consuming and publishing messages with MapR Streams and the Kafka API. Spark Streaming is an extension of the core Spark API that enables continuous data stream processing. MapR Streams is a distributed messaging system for streaming event data at scale. MapR Streams enables producers and consumers to exchange events in real time via ...


Fast, Scalable, Streaming Applications with MapR Streams, Spark Streaming, and MapR-DB

Many of the systems we want to monitor produce data as a stream of events. Examples include event data from web or mobile applications, sensors, or medical devices. Real-time analysis examples include: website monitoring, network monitoring, fraud detection, web clicks, advertising, and Internet of Things sensors. Batch processing can give great insights into things that happened in the past, but it ...


How to Get Started Using Apache Spark GraphX with Scala

This post will help you get started using Apache Spark GraphX with Scala on the MapR Sandbox. GraphX is the Apache Spark component for graph-parallel computations, built upon a branch of mathematics called graph theory. It ...
