batch file - Slow performance in spark streaming -


i using spark streaming 1.1.0 locally (not in cluster). created simple app parses data (about 10.000 entries), stores in stream , makes transformations on it. here code:

def main(args : array[string]){      val master = "local[8]"     val conf = new sparkconf().setappname("tester").setmaster(master)     val sc = new streamingcontext(conf, milliseconds(110000))      val stream = sc.receiverstream(new myreceiver("localhost", 9999))      val parsedstream = parse(stream)      parsedstream.foreachrdd(rdd =>          println(rdd.first()+"\nrule starts "+system.currenttimemillis()))      val result1 = parsedstream        .filter(entry => entry.symbol.contains("walking")         && entry.symbol.contains("true") && entry.symbol.contains("id0"))        .map(_.time)      val result2 = parsedstream        .filter(entry =>         entry.symbol == "disappear" && entry.symbol.contains("id0"))        .map(_.time)      val result3 = result1       .transformwith(result2, (rdd1, rdd2: rdd[int]) => rdd1.subtract(rdd2))      result3.foreachrdd(rdd =>      println(rdd.first()+"\nrule ends "+system.currenttimemillis()))     sc.start()    sc.awaittermination() }  def parse(stream: dstream[string]) = {      stream.flatmap { line =>         val entries = line.split("assert").filter(entry => !entry.isempty)         entries.map { tuple =>              val pattern = """\s*[(](.+)[,]\s*([0-9]+)+\s*[)]\s*[)]\s*[,|\.]\s*""".r              tuple match {               case pattern(symbol, time) =>               new data(symbol, time.toint)             }          }     } }  case class data (symbol: string, time: int) 

i have batch duration of 110.000 milliseconds in order receive data in 1 batch. believed that, locally, spark fast. in case, takes 3.5sec execute rule (between "rule starts" , "rule ends"). doing wrong or expected time? advise

so using case matching in allot of jobs , killed performance, more when introduced json parser. try tweaking batch time on streamingcontext. made quite bit of difference me. how many local workers have?


Comments

Popular posts from this blog

java - Oracle EBS .ClassNotFoundException: oracle.apps.fnd.formsClient.FormsLauncher.class ERROR -

c# - how to use buttonedit in devexpress gridcontrol -

nvd3.js - angularjs-nvd3-directives setting color in legend as well as in chart elements -