
Spark Run local design pattern

Many Spark applications have become legacy applications that are very hard to enhance, test, and run locally.

Spark has very good testing support, yet many Spark applications are still not testable.

I will share one common error that appears when you try to run some older Spark applications locally.

  
Exception in thread "main" org.apache.spark.SparkException: A master URL must be set in your configuration
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:376)
	at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2509)
	at org.apache.spark.sql.SparkSession$Builder$anonfun$6.apply(SparkSession.scala:909)
	at org.apache.spark.sql.SparkSession$Builder$anonfun$6.apply(SparkSession.scala:901)
	at scala.Option.getOrElse(Option.scala:121)

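This error typically means the application relies on spark-submit to supply the master URL, so nothing sets it when you start main directly. When you see it, you have two options:

 – Accept that it can’t run locally and keep working with that frustration

 – Fix it so it runs locally and show your team The Boy Scout Rule in action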

I will show a very simple pattern that will save you from such frustration.

  
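 def main(args: Array[String]): Unit = {

    // Decide once, up front, whether this run is local or on a cluster.
    val localRun = SparkContextBuilder.isLocalSpark
    // Build the session based on that decision.
    val sparkSession = SparkContextBuilder.newSparkSession(localRun, "Happy Local Spark")

    // Distribute the numbers 1 to 999 and add them up.
    val numbers = sparkSession.sparkContext.parallelize(1 until 1000)
    val total = numbers.sum()

    println(s"Total Value $total")

    // Release Spark resources once the job is done.
    sparkSession.stop()
  }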

This code uses the isLocalSpark function to decide whether to run in local mode. You can make that decision any way you like: an environment variable, a command-line parameter, or anything else.
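As a rough illustration of that decision, here is a minimal sketch that checks both a command-line flag and an environment variable. It is not the implementation from the repo: the object name LocalRunCheck, the flag --local and the variable SPARK_RUN_LOCAL are all made up for this example, and unlike the isLocalSpark call above it takes the arguments explicitly so it can be tried without Spark on the classpath.

```scala
object LocalRunCheck {
  // Hypothetical names chosen for this sketch; neither is defined by Spark
  // or by the original code.
  val EnvVar = "SPARK_RUN_LOCAL"
  val CliFlag = "--local"

  // Returns true when the flag is present in args, or when the environment
  // variable is set to "true" (case-insensitive).
  def isLocalSpark(args: Array[String], env: Map[String, String] = sys.env): Boolean = {
    val fromArgs = args.contains(CliFlag)
    val fromEnv = env.get(EnvVar).exists(_.equalsIgnoreCase("true"))
    fromArgs || fromEnv
  }
}
```

Passing the environment in as a parameter keeps the function easy to test; in main it could be called as LocalRunCheck.isLocalSpark(args).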

Once you know whether the run is local, create the Spark session accordingly.
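One way newSparkSession could look is sketched below. This is an assumption about its shape, not the code from the repo: it sets the master to local[*] only for a local run and otherwise leaves the master unset, so that spark-submit can provide it. The isLocalSpark method is left out of this sketch.

```scala
import org.apache.spark.sql.SparkSession

object SparkContextBuilder {
  // Sketch only: the real builder in the repo may differ.
  def newSparkSession(localRun: Boolean, appName: String): SparkSession = {
    val builder = SparkSession.builder().appName(appName)
    // For a local run, use all local cores; otherwise leave the master
    // unset and let spark-submit (--master) supply it.
    val configured = if (localRun) builder.master("local[*]") else builder
    configured.getOrCreate()
  }
}
```

Keeping the master out of the code for non-local runs means the cluster settings stay in the submit command rather than in the application.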

The same code can now run locally or through spark-submit.

Happy Spark Testing.


The code used in this post is available in the runlocal repo.

Published on Java Code Geeks with permission by Ashkrit Sharma, partner at our JCG program. See the original article here: Spark Run local design pattern


Ashkrit Sharma

Pragmatic software developer who loves practice that makes software development fun and likes to develop high performance & low latency system.