Search Results for: spark
-
Python
PySpark – Create Empty Dataframe and RDD
DataFrames and RDDs (Resilient Distributed Datasets) are fundamental abstractions in Apache Spark, a powerful distributed computing framework. Let us delve…
-
Software Development
Apache Spark: Unleashing Big Data Power
Apache Spark is a powerful open-source, distributed computing system that has become a cornerstone in the world of…
-
Apache Spark Cheatsheet
This cheatsheet is designed to provide quick access to the most commonly used Spark components, methods, and practices. Whether you’re…
-
Software Development
Apache Spark Cheatsheet
Apache Spark is an open-source, distributed computing system designed for…
-
Core Java
Java Spark RDD reduce() Examples – sum, min and max operations
A quick guide to the Spark RDD reduce() method in Java, used to find sum, min and max values…
-
Software Development
Where is Apache Spark heading?
I watched (the COVID-19-era version of “attended”) the latest Spark Summit, and in one of the keynotes Reynold Xin from Databricks,…
-
Enterprise Java
Recommendation System Using Spark ML Akka and Cassandra
Building a recommendation system with Spark is a simple task. Spark’s machine learning library already does all the hard work…
-
Enterprise Java
The Kubernetes Spark operator in OpenShift Origin (Part 1)
This series is about the Kubernetes Spark operator by Radanalytics.io on OpenShift Origin. It is an open-source operator to manage Apache…
-
Enterprise Java
Sparklens: a tool for Spark applications optimization
Sparklens is a profiling tool for Spark with a built-in Spark Scheduler simulator: it makes it easier to understand the scalability…