[root@node1 spark]# bin/spark-shell --master spark://localhost:7077
2018-08-07 11:02:04 WARN  Utils:66 - Your hostname, hidden.zzh.com resolves to a loopback address: 127.0.0.1; using 10.xxx.xxx.xxx instead (on interface eth0)
2018-08-07 11:02:04 WARN  Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
2018-08-07 11:02:04 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://10.xxx.xxx.xxx:4040
Spark context available as 'sc' (master = spark://localhost:7077, app id = app-20180807110212-0000).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.1
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_102)
Type in expressions to have them evaluated.
Type :help for more information.
scala> val rdd = sc.textFile("/opt/spark/bin/spark-shell")
rdd: org.apache.spark.rdd.RDD[String] = /opt/spark/bin/spark-shell MapPartitionsRDD[3] at textFile at <console>:24
scala> val wordmap = rdd.flatMap(_.split(" ")).map(x => (x, 1))
wordmap: org.apache.spark.rdd.RDD[(String, Int)] = MapPartitionsRDD[5] at map at <console>:25
scala> val wordreduce = wordmap.reduceByKey(_ + _)
wordreduce: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[6] at reduceByKey at <console>:25

scala> val wordsort = wordreduce.map(x => (x._2, x._1)).sortByKey(false).map(x => (x._2, x._1))
wordsort: org.apache.spark.rdd.RDD[(String, Int)] = MapPartitionsRDD[11] at map at <console>:25
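The pipeline above is the classic word count: split lines into words, pair each word with 1, sum the 1s per word, then swap (word, count) to (count, word) so sortByKey(false) can order by count descending, and swap back. A minimal sketch of the same logic on plain Scala collections, checkable without a Spark cluster (the sample `lines` input and the `WordCountSketch` name are hypothetical, not taken from the transcript; `groupBy` plus a per-group sum stands in for `reduceByKey`):

```scala
// Plain-collections sketch of the spark-shell word-count pipeline above.
object WordCountSketch {
  def wordCount(lines: Seq[String]): Seq[(String, Int)] = {
    lines
      .flatMap(_.split(" "))              // rdd.flatMap(_.split(" "))
      .map(x => (x, 1))                   // .map(x => (x, 1))
      .groupBy(_._1)                      // stands in for reduceByKey(_ + _)
      .map { case (w, ps) => (w, ps.map(_._2).sum) }
      .toSeq
      .sortBy(-_._2)                      // stands in for the swap / sortByKey(false) / swap trick
  }

  def main(args: Array[String]): Unit = {
    val lines = Seq("a b a", "b a")       // hypothetical sample input
    println(wordCount(lines))             // highest count first: (a,3) before (b,2)
  }
}
```

On an RDD, the swap is needed because sortByKey only sorts by the key of a pair RDD; on local collections `sortBy` takes the ordering function directly.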