html - How to parse JavaScript from a web page
I need to get links that are inside JavaScript. I tried using jsoup, but it didn't work.
On the screenshot I show the link I need from the source of the page. Can someone tell me how to do it?
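    import org.jsoup.Jsoup;
    import org.jsoup.nodes.DataNode;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    String url = "http://www.cda.pl/video/149016ec/rybki-z-ferajny-2004-1080p-dubbing-pl";
    Document doc = Jsoup.connect(url).get();
    Elements scriptElements = doc.getElementsByTag("script");
    // Print the raw contents of every script tag on the page
    for (Element element : scriptElements) {
        for (DataNode node : element.dataNodes()) {
            System.out.println(node.getWholeData());
        }
        System.out.println("-------------------");
    }

I marked the URLs I want to get on the screenshot.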
You can use this code:

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;

    String url = "http://www.cda.pl/video/149016ec/rybki-z-ferajny-2004-1080p-dubbing-pl";
    Document doc = Jsoup.connect(url).get();
    // We pick the script node
    Element script = doc.select("#player > script").get(0);
    String text = script.html();
    // Then we parse the script to get the desired URI
    final String prefix = "l='";
    int p1 = text.indexOf(prefix) + prefix.length();
    int p2 = text.indexOf("'", p1);
    String uri = text.substring(p1, p2);
    System.out.println(uri);

It gives the desired output:
    http://vgra001.cda.pl/lqcc6f8b3c8f76d1b58c1234813fcf67c7.mp4?st=sjoq8ddcnh7pw8_xnnka3w&e=1416438406

Please note this is only an example; it needs error checking.
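To illustrate the kind of error checking meant here, below is a minimal sketch of the same indexOf approach that returns an empty result instead of throwing when a marker is missing. The class name ScriptUriExtractor, the method extractQuoted, and the sample script text are made up for this example; in the real code the input would be script.html(). On the jsoup side, doc.select(...) returns an empty list when nothing matches, so get(0) would throw; Elements.first() returns null in that case and can be checked instead.

```java
import java.util.Optional;

final class ScriptUriExtractor {

    // Returns the text between the first occurrence of prefix and the next
    // single quote, or Optional.empty() if either marker is missing.
    static Optional<String> extractQuoted(String text, String prefix) {
        if (text == null || prefix == null || prefix.isEmpty()) {
            return Optional.empty();
        }
        int start = text.indexOf(prefix);
        if (start < 0) {
            return Optional.empty(); // prefix not present in the script
        }
        start += prefix.length();
        int end = text.indexOf('\'', start);
        if (end < 0) {
            return Optional.empty(); // no closing quote after the prefix
        }
        return Optional.of(text.substring(start, end));
    }

    public static void main(String[] args) {
        // Hypothetical script text, standing in for script.html()
        String sample = "var player = {l='http://example.com/video.mp4?st=abc&e=1', w=640};";
        System.out.println(extractQuoted(sample, "l='").orElse("URI not found"));
    }
}
```

With the original indexOf code, a missing prefix would silently produce a wrong substring (indexOf returns -1), and a missing closing quote would throw a StringIndexOutOfBoundsException; here both cases end up as an empty Optional that the caller has to handle.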
Now the explanation:
As you had already done, you had found the location of the code with the URI, so it was easy to find the important script node: you can see a <div class="wrapqualitybtn"> near the script tag, so you can look for the div that contains both the script tag and that div tag (the <div id="player" ... >, which is the script tag's parent node).
Once you have the script node, you need to do some string parsing. Parsing JavaScript code this way is risky, because a small change in the code can break the parser, but I think in this case looking for l=' is a solid bet.
A few pieces of advice:
- When the page uses jQuery, you can use jQuery in the browser console too! If you put $('#player > script')[0] in the browser console, you will see the script tag.
- You can search through the DOM of the page in the developer tools of your browser (F12), right-click the node, and click "Copy CSS path" (in Chrome; it is similar in Firefox) to obtain a selector usable in jsoup.
- For more resilient script parsing, use regular expressions instead of a plain indexOf search.
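The regular-expression suggestion could look roughly like this. It is only a sketch: the class name ScriptUriRegex, the method findUri, and the sample string are hypothetical, and the pattern assumes the URI is an http(s) URL assigned to a name ending in l, in single or double quotes, which is a guess about how the page script might vary.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ScriptUriRegex {

    // Assumed shape: l = 'http://...'  (optional whitespace, either quote type).
    // Group 1 is the opening quote, group 2 is the URL, \1 requires the same closing quote.
    private static final Pattern URI_PATTERN =
            Pattern.compile("l\\s*=\\s*(['\"])(https?://[^'\"]+)\\1");

    // Returns the first URL matching the pattern, or Optional.empty() if none is found.
    static Optional<String> findUri(String scriptText) {
        if (scriptText == null) {
            return Optional.empty();
        }
        Matcher m = URI_PATTERN.matcher(scriptText);
        return m.find() ? Optional.of(m.group(2)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical script text, standing in for script.html()
        String sample = "player.setup({ l = \"http://example.com/video.mp4?st=abc&e=1\" });";
        System.out.println(findUri(sample).orElse("URI not found"));
    }
}
```

Compared with the plain indexOf search, this tolerates extra whitespace and a change of quote style, and it rejects values that do not look like an http(s) URL. Like the l=' prefix, it will still match any assignment whose name merely ends in l, so the pattern may need tightening for a given page.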
I hope this helps; please excuse my verbosity.