Here is your very first Apache Spark program using Java: the equivalent of Kernighan and Ritchie’s “Hello, World”.
```java
package net.jgp.labs.spark;

import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;

public class HelloSpark {
    public static void main(String[] args) {
        // Configure the application: a name and a local (in-process) master.
        SparkConf conf = new SparkConf()
                .setAppName("Hello Spark")
                .setMaster("local");

        // The SparkContext is the entry point to everything Spark does.
        SparkContext sc = new SparkContext(conf);
        System.out.println("Hello, Spark v." + sc.version());
    }
}
```
You can download it from GitHub.
The key is to create a local configuration (our conf object), then a context, from which you do everything, including displaying the version number.
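If you prefer Spark's Java-friendly API, the same program can also be written with JavaSparkContext, the wrapper Spark ships for Java applications. Here is a minimal sketch (the class name HelloSparkJava is just my choice for this example); note the explicit stop() call, which shuts the context down cleanly instead of relying on the shutdown hook:

```java
package net.jgp.labs.spark;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class HelloSparkJava {
    public static void main(String[] args) {
        // Same configuration as above: an application name and a local master.
        SparkConf conf = new SparkConf()
                .setAppName("Hello Spark")
                .setMaster("local");

        // JavaSparkContext wraps SparkContext with a Java-oriented API.
        JavaSparkContext sc = new JavaSparkContext(conf);
        System.out.println("Hello, Spark v." + sc.version());

        // Stop the context explicitly rather than relying on the shutdown hook.
        sc.stop();
    }
}
```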
I used Maven, and I simply added the Spark core artifact to my dependencies:
```xml
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.6.1</version>
</dependency>
```
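With that in place, you can compile and run the class straight from Maven. This assumes the standard exec-maven-plugin (which Maven resolves by default for the exec:java goal):

```
mvn compile exec:java -Dexec.mainClass=net.jgp.labs.spark.HelloSpark
```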
The output is pretty rough, as I left logging on:
```
16/06/26 20:00:54 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 52832.
16/06/26 20:00:54 INFO NettyBlockTransferService: Server created on 52832
16/06/26 20:00:54 INFO BlockManagerMaster: Trying to register BlockManager
16/06/26 20:00:54 INFO BlockManagerMasterEndpoint: Registering block manager localhost:52832 with 1140.4 MB RAM, BlockManagerId(driver, localhost, 52832)
16/06/26 20:00:54 INFO BlockManagerMaster: Registered BlockManager
Hello, Spark v.1.6.1
16/06/26 20:00:54 INFO SparkContext: Invoking stop() from shutdown hook
16/06/26 20:00:54 INFO SparkUI: Stopped Spark web UI at http://10.0.100.100:4040
16/06/26 20:00:54 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/06/26 20:00:54 INFO MemoryStore: MemoryStore cleared
16/06/26 20:00:54 INFO BlockManager: BlockManager stopped
16/06/26 20:00:55 INFO BlockManagerMaster: BlockManagerMaster stopped
16/06/26 20:00:55 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/06/26 20:00:55 INFO SparkContext: Successfully stopped SparkContext
16/06/26 20:00:55 INFO ShutdownHookManager: Shutdown hook called
16/06/26 20:00:55 INFO ShutdownHookManager: Deleting directory /private/var/folders/vs/kl6qlcvx30707d07txrm_xnw0000gn/T/spark-c3f8f992-b75a-4d69-944b-e851658c75a2
```
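If you want a quieter console, Spark 1.x logs through log4j 1.2, so you can drop a log4j.properties file on the classpath (for example in src/main/resources). A minimal sketch, modeled on Spark's own conf/log4j.properties.template, that keeps only warnings and errors:

```
# Raise the root logger from INFO to WARN to silence most of Spark's chatter.
log4j.rootCategory=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```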
Note:
If you see the following error:
```
16/06/26 19:59:08 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: A master URL must be set in your configuration
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:401)
	at net.jgp.labs.spark.HelloSpark.main(HelloSpark.java:10)
```
it means that you forgot to specify where the master is. You can resolve this by calling setMaster("local") on your SparkConf, as in the example above.
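Alternatively, you can leave the master out of the code entirely and pass it on the command line with spark-submit. Something like this, assuming you packaged the project as a jar (the jar name here is only an example):

```
spark-submit \
  --class net.jgp.labs.spark.HelloSpark \
  --master local \
  target/labs-spark-1.0.0.jar
```

If you go this route, remove the setMaster() call from the code, because a master set programmatically in the SparkConf takes precedence over the one passed to spark-submit.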