Scala - standalone Spark: worker didn't show up
I have two questions I want to ask.

This is my code:
import org.apache.spark.{SparkConf, SparkContext}

object Hi {
  def main(args: Array[String]): Unit = {
    println("sucess")
    val conf = new SparkConf().setAppName("Hi").setMaster("local")
    val sc = new SparkContext(conf)
    val textFile = sc.textFile("src/main/scala/source.txt")
    val rows = textFile.map { line =>
      val fields = line.split("::")
      (fields(0), fields(1).toInt)
    }
    val x = rows.map { case (range, ratedNum) => range }.collect.mkString("::")
    val y = rows.map { case (range, ratedNum) => ratedNum }.collect.mkString("::")
    println(x)
    println(y)
    println("sucess2")
  }
}
Here is the result:
15/04/26 16:49:57 INFO Utils: Started service 'SparkUI' on port 4040.
15/04/26 16:49:57 INFO SparkUI: Started SparkUI at http://192.168.1.105:4040
15/04/26 16:49:57 INFO Executor: Starting executor ID <driver> on host localhost
15/04/26 16:49:57 INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@192.168.1.105:64952/user/HeartbeatReceiver
15/04/26 16:49:57 INFO NettyBlockTransferService: Server created on 64954
15/04/26 16:49:57 INFO BlockManagerMaster: Trying to register BlockManager
15/04/26 16:49:57 INFO BlockManagerMasterActor: Registering block manager localhost:64954 with 983.1 MB RAM, BlockManagerId(<driver>, localhost, 64954)
.....
15/04/26 16:49:59 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:839
15/04/26 16:49:59 INFO DAGScheduler: Submitting 1 missing tasks from Stage 1 (MapPartitionsRDD[4] at map at hi.scala:25)
15/04/26 16:49:59 INFO TaskSchedulerImpl: Adding task set 1.0 with 1 tasks
15/04/26 16:49:59 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, localhost, PROCESS_LOCAL, 1331 bytes)
15/04/26 16:49:59 INFO Executor: Running task 0.0 in stage 1.0 (TID 1)
15/04/26 16:49:59 INFO HadoopRDD: Input split: file:/Users/winsome/IdeaProjects/untitled/src/main/scala/source.txt:0+23
15/04/26 16:49:59 INFO Executor: Finished task 0.0 in stage 1.0 (TID 1). 1787 bytes result sent to driver
15/04/26 16:49:59 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 13 ms on localhost (1/1)
15/04/26 16:49:59 INFO DAGScheduler: Stage 1 (collect at hi.scala:25) finished in 0.013 s
15/04/26 16:49:59 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
15/04/26 16:49:59 INFO DAGScheduler: Job 1 finished: collect at hi.scala:25, took 0.027784 s
1~1::2~2::3~3
10::20::30
sucess2
My first question: when I check http://localhost:8080/, there is no worker, and I can't open http://192.168.1.105:4040 either.
Is this because I'm using Spark standalone? How can I fix it?
(My environment: Mac, with the IntelliJ IDE.)
My second question is about this code:
val x = rows.map { case (range, ratedNum) => range }.collect.mkString("::")
val y = rows.map { case (range, ratedNum) => ratedNum }.collect.mkString("::")
println(x)
println(y)
I think there must be a simpler way to get x and y (something like rows[range] and rows[ratedNum]), but I'm not familiar with Scala. Can you give me some advice?
I'm not sure about the first question, but reading the log I see that the task lasted only 13 ms, which may be the reason why you haven't seen anything. Run a longer job and you may see the workers.
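For what it's worth: with setMaster("local") everything runs inside a single JVM, so the standalone master UI at http://localhost:8080 has no workers to show, and the application UI on port 4040 disappears as soon as the job finishes. Here is a minimal sketch of connecting to a standalone cluster instead, assuming a master and a worker have already been started with the sbin/ scripts; the master URL below is a hypothetical example:

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical master URL; replace it with the URL shown at http://localhost:8080.
val conf = new SparkConf()
  .setAppName("Hi")
  .setMaster("spark://192.168.1.105:7077") // standalone master instead of "local"
val sc = new SparkContext(conf)

With that in place, the application should appear under "Running Applications" on the master UI while it runs.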
About the second question: yes, there is a simpler way to write it:
val x = rows.map { tuple => tuple._1 }.collect.mkString("::")
This works because the RDD is made of Tuple2 Scala objects, which have two fields that you can access as _1 and _2 respectively.
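As a sketch of both styles, assuming rows is the RDD[(String, Int)] built in your question (keys and values are Spark's pair-RDD shortcuts that do the same thing):

val x = rows.map(_._1).collect.mkString("::") // first field of each tuple
val y = rows.map(_._2).collect.mkString("::") // second field of each tuple

// Since rows is an RDD of pairs, these shortcuts are equivalent:
val x2 = rows.keys.collect.mkString("::")
val y2 = rows.values.collect.mkString("::")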