html - How to parse JavaScript from a web page


I need to get some links from the JavaScript on a page. I tried using jsoup, but it didn't work.

The screenshot shows the links I need in the page source. Can anyone tell me how to do it?

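    import org.jsoup.Jsoup;
    import org.jsoup.nodes.DataNode;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    String url = "http://www.cda.pl/video/149016ec/rybki-z-ferajny-2004-1080p-dubbing-pl";
    Document doc = Jsoup.connect(url).get();
    Elements scriptElements = doc.getElementsByTag("script");

    // Print the raw contents of every script tag on the page
    for (Element element : scriptElements) {
        for (DataNode node : element.dataNodes()) {
            System.out.println(node.getWholeData());
        }
        System.out.println("-------------------");
    }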

I marked the URLs I want to get in the screenshot.

You can use this code:

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;

    String url = "http://www.cda.pl/video/149016ec/rybki-z-ferajny-2004-1080p-dubbing-pl";
    Document doc = Jsoup.connect(url).get();

    // We pick the script node
    Element script = doc.select("#player > script").get(0);
    String text = script.html();

    // Then we parse the script for the desired URI
    final String prefix = "l='";
    int p1 = text.indexOf(prefix) + prefix.length();
    int p2 = text.indexOf("'", p1);
    String uri = text.substring(p1, p2);

    System.out.println(uri);

It gives the desired output:

http://vgra001.cda.pl/lqcc6f8b3c8f76d1b58c1234813fcf67c7.mp4?st=sjoq8ddcnh7pw8_xnnka3w&e=1416438406 

Please note that this is only an example; you need to add error checking.

Now the explanation:
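As a rough idea of what that error checking could look like, here is a minimal sketch of the string-parsing step alone. It assumes the script text has already been obtained (for example from script.html()). The class name ScriptUriExtractor, the method extractUri, and the sample string are hypothetical and used only for illustration.

```java
import java.util.Optional;

class ScriptUriExtractor {

    // Marker that precedes the URI in the script text (same prefix as in the answer).
    private static final String PREFIX = "l='";

    /**
     * Returns the text between PREFIX and the next single quote,
     * or Optional.empty() if the input is null or a delimiter is missing.
     */
    static Optional<String> extractUri(String scriptText) {
        if (scriptText == null) {
            return Optional.empty();
        }
        int prefixAt = scriptText.indexOf(PREFIX);
        if (prefixAt < 0) {
            return Optional.empty(); // prefix not found
        }
        int start = prefixAt + PREFIX.length();
        int end = scriptText.indexOf('\'', start);
        if (end < 0) {
            return Optional.empty(); // closing quote not found
        }
        return Optional.of(scriptText.substring(start, end));
    }

    public static void main(String[] args) {
        // Hypothetical script content, not the real page source.
        String sample = "var l='http://example.com/video.mp4?st=abc&e=123';";
        System.out.println(extractUri(sample).orElse("URI not found"));
    }
}
```

In the jsoup part you would similarly want to check that doc.select("#player > script") is not empty before calling get(0), since the page layout may change.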

You had already done most of the work: you had located the code containing the URI, so it was easy to find the important script node. You can see a <div class="wrapqualitybtn"> near the script tag, so you can find the div that contains both the script tag and that div (the <div id="player" ... >, which is the script tag's parent node).

Once you have the script node, you need some string parsing. Parsing JavaScript code this way is risky because a small change in the code can break your parser, but I think that in this case looking for l=' is a solid bet.

A few tips:

  • When the page uses jQuery, you can use jQuery in the browser console too! If you put $('#player > script')[0] in the browser console, you will see the script tag.

  • You can search through the DOM of the page in your browser's developer tools (F12), right-click the node, and click Copy CSS Path (in Chrome; Firefox is similar) to obtain a selector usable in jsoup.

  • For more resilient script parsing, use regular expressions instead of a plain indexOf search.

I hope this helps, and excuse my verbosity.
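The regular-expression tip could be sketched roughly as follows. This is only an illustration under assumptions: the script is assumed to assign the URI to a variable named l (as in the answer's prefix), while the tolerance for extra whitespace and double quotes is a guess about how the page might change. The class name RegexUriExtractor and the sample strings are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexUriExtractor {

    // Matches l='...' or l = "..." with optional whitespace around '='.
    // \b stops the match from starting inside a longer identifier such as "url".
    // Group 1 is the opening quote, group 2 is the quoted value.
    private static final Pattern URI_PATTERN =
            Pattern.compile("\\bl\\s*=\\s*(['\"])(.*?)\\1");

    /** Returns the first quoted value assigned to l, if any. */
    static Optional<String> extractUri(String scriptText) {
        if (scriptText == null) {
            return Optional.empty();
        }
        Matcher matcher = URI_PATTERN.matcher(scriptText);
        return matcher.find() ? Optional.of(matcher.group(2)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical script content, not the real page source.
        String sample = "var l = \"http://example.com/video.mp4\";";
        System.out.println(extractUri(sample).orElse("URI not found"));
    }
}
```

Compared with the indexOf version, this still breaks if the variable is renamed, but it survives small formatting changes such as added spaces or a switch from single to double quotes.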

