
I realize there are related questions to this one, but I just can't get my code to work.

I am running a Spark Streaming application in standalone mode, with the master node on my Windows host and a worker in an Ubuntu virtual machine. Here is the problem: when I run spark-submit, this is what shows up:

 >spark-submit --master spark://192.168.56.1:7077 --class spark.example.Main  C:/Users/Manuel Mourato/xxx/target/ParkMonitor-1.0-SNAPSHOT.jar
Warning: Skip remote jar C:/Users/Manuel.
java.lang.ClassNotFoundException: spark.example.Main
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.spark.util.Utils$.classForName(Utils.scala:175)
    at   org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:689)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

I created this jar file with Maven, using "package" in IntelliJ. This is how I am setting up my Spark Streaming context:

 SparkConf sparkConfiguration = new SparkConf().setAppName("ParkingDataAnalysis").setMaster("spark://192.168.56.1:7077");
 JavaStreamingContext sparkStrContext = new JavaStreamingContext(sparkConfiguration, Durations.seconds(1));
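
For completeness, here is a trimmed-down sketch of the class (the package and class name are assumed to match the --class argument, spark.example.Main; the actual streaming logic is omitted):

    package spark.example;

    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;

    public class Main {
        public static void main(String[] args) throws InterruptedException {
            // Same configuration as in the snippet above
            SparkConf sparkConfiguration = new SparkConf()
                    .setAppName("ParkingDataAnalysis")
                    .setMaster("spark://192.168.56.1:7077");
            JavaStreamingContext sparkStrContext =
                    new JavaStreamingContext(sparkConfiguration, Durations.seconds(1));

            // ... define input DStreams and output operations here (required before start()) ...

            sparkStrContext.start();
            sparkStrContext.awaitTermination();
        }
    }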

Can anyone help me? Thank you so much.

1 Answer


You've got a space in the folder name; please add quotes (") and try again:

spark-submit --master spark://192.168.56.1:7077 --class spark.example.Main "C:/Users/Manuel Mourato/xxx/target/ParkMonitor-1.0-SNAPSHOT.jar"

One more thing, from the docs: application-jar: Path to a bundled jar including your application and all dependencies. The URL must be globally visible inside of your cluster, for instance, an hdfs:// path or a file:// path that is present on all nodes.

So please copy your file to HDFS or to the same location on all nodes. That would be difficult with a mix of Linux and Windows ;) I strongly recommend setting up HDFS.
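
For example, assuming HDFS is up and that /user/spark/jars is a directory you create for this (the directory and the unqualified hdfs:/// URL are just placeholders, adjust them to your cluster and its fs.defaultFS setting), the steps would look roughly like this:

    hdfs dfs -mkdir -p /user/spark/jars
    hdfs dfs -put ParkMonitor-1.0-SNAPSHOT.jar /user/spark/jars/

    spark-submit --master spark://192.168.56.1:7077 --class spark.example.Main hdfs:///user/spark/jars/ParkMonitor-1.0-SNAPSHOT.jar

That way the jar URL is visible from every node, as the docs require.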

  • @manuelmourato please see the update; I forgot about one thing, but an important one
    – T. Gawęda
    Commented Oct 5, 2016 at 12:44
