Here is your very first Apache Spark program using Java: the equivalent of Kernighan and Ritchie's "Hello, World".
package net.jgp.labs.spark;

import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;

public class HelloSpark {
  public static void main(String[] args) {
    // Configure the application: give it a name and run it locally.
    SparkConf conf = new SparkConf()
        .setAppName("Hello Spark")
        .setMaster("local");

    // The SparkContext is the entry point to everything Spark does.
    SparkContext sc = new SparkContext(conf);

    System.out.println("Hello, Spark v." + sc.version());
  }
}
You can download it from GitHub:
https://github.com/JGPnet/net.jgp.labs.spark.git
Basically, the key is to create a local configuration (our conf object), then a context built from it; the context is where everything happens, including displaying the version number.
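One thing the example relies on (you can see it in the log output further down) is Spark's shutdown hook calling stop() for us. If you prefer to release resources explicitly, a minimal variant could look like this; the try/finally structure is my addition, not part of the original program:

SparkContext sc = new SparkContext(conf);
try {
  System.out.println("Hello, Spark v." + sc.version());
} finally {
  // Stop the context explicitly: shuts down the web UI and cleans up
  // temporary directories, instead of leaving it to the JVM shutdown hook.
  sc.stop();
}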
I used Maven, and I simply added the following to my dependencies:
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <version>1.6.1</version>
</dependency>
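Note that the _2.10 suffix in the artifactId is the Scala version these Spark binaries were built against, not part of the library name. Spark 1.6.1 is also published for Scala 2.11, so if the rest of your stack runs on Scala 2.11, the matching dependency would be:

<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.11</artifactId>
  <version>1.6.1</version>
</dependency>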
The output is pretty rough, as I left logging on:
16/06/26 20:00:54 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 52832.
16/06/26 20:00:54 INFO NettyBlockTransferService: Server created on 52832
16/06/26 20:00:54 INFO BlockManagerMaster: Trying to register BlockManager
16/06/26 20:00:54 INFO BlockManagerMasterEndpoint: Registering block manager localhost:52832 with 1140.4 MB RAM, BlockManagerId(driver, localhost, 52832)
16/06/26 20:00:54 INFO BlockManagerMaster: Registered BlockManager
Hello, Spark v.1.6.1
16/06/26 20:00:54 INFO SparkContext: Invoking stop() from shutdown hook
16/06/26 20:00:54 INFO SparkUI: Stopped Spark web UI at http://10.0.100.100:4040
16/06/26 20:00:54 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/06/26 20:00:54 INFO MemoryStore: MemoryStore cleared
16/06/26 20:00:54 INFO BlockManager: BlockManager stopped
16/06/26 20:00:55 INFO BlockManagerMaster: BlockManagerMaster stopped
16/06/26 20:00:55 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/06/26 20:00:55 INFO SparkContext: Successfully stopped SparkContext
16/06/26 20:00:55 INFO ShutdownHookManager: Shutdown hook called
16/06/26 20:00:55 INFO ShutdownHookManager: Deleting directory /private/var/folders/vs/kl6qlcvx30707d07txrm_xnw0000gn/T/spark-c3f8f992-b75a-4d69-944b-e851658c75a2
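If you want a quieter console, one common approach (my addition; the original leaves logging as-is) is to raise the log level at the top of main(). Spark 1.6 ships with Log4j 1.x, so no extra dependency is needed:

import org.apache.log4j.Level;
import org.apache.log4j.Logger;

// Silence Spark's INFO chatter: only warnings and errors from
// Spark internals (package "org") and Akka reach the console.
Logger.getLogger("org").setLevel(Level.WARN);
Logger.getLogger("akka").setLevel(Level.WARN);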
Note: if you get the following error:
16/06/26 19:59:08 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: A master URL must be set in your configuration
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:401)
	at net.jgp.labs.spark.HelloSpark.main(HelloSpark.java:10)
It means that you forgot to specify where the master is. You can resolve it by calling setMaster("local") on your SparkConf.
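If you would rather not hard-code the master in your source, SparkConf also picks up any spark.* Java system property, so you can supply it at launch time instead. A sketch (this is standard Spark configuration behavior, not something specific to this example):

// Equivalent to .setMaster("local"), but supplied from the outside:
// launch the JVM with -Dspark.master=local
// (or -Dspark.master=local[*] to use all available cores).
SparkConf conf = new SparkConf().setAppName("Hello Spark");
SparkContext sc = new SparkContext(conf);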