html - How to parse JavaScript from a web page
I need to get links that are inside the JavaScript of a page. I tried using jsoup, but it didn't work.
The screenshot shows the links I need in the page source. Can someone show me how to do it?
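    import org.jsoup.Jsoup;
    import org.jsoup.nodes.DataNode;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    String url = "http://www.cda.pl/video/149016ec/rybki-z-ferajny-2004-1080p-dubbing-pl";
    Document doc = Jsoup.connect(url).get();
    Elements scriptElements = doc.getElementsByTag("script");
    // Print the raw content of every script tag on the page
    for (Element element : scriptElements) {
        for (DataNode node : element.dataNodes()) {
            System.out.println(node.getWholeData());
        }
        System.out.println("-------------------");
    }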
I marked the URLs I want to get in the screenshot.
You can use this code:
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;

    String url = "http://www.cda.pl/video/149016ec/rybki-z-ferajny-2004-1080p-dubbing-pl";
    Document doc = Jsoup.connect(url).get();
    // First we pick the script node
    Element script = doc.select("#player > script").get(0);
    String text = script.html();
    // Then we parse the script text to get the desired URI
    final String prefix = "l='";
    int p1 = text.indexOf(prefix) + prefix.length();
    int p2 = text.indexOf("'", p1);
    String uri = text.substring(p1, p2);
    System.out.println(uri);
It gives the desired output:
http://vgra001.cda.pl/lqcc6f8b3c8f76d1b58c1234813fcf67c7.mp4?st=sjoq8ddcnh7pw8_xnnka3w&e=1416438406
Please note that this is only an example; it needs error checking.
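As a rough idea of what that error checking could look like, here is a minimal sketch of the string-parsing step that returns an empty result instead of a wrong substring or an exception when a marker is missing. The class name `ScriptValueExtractor`, the method `extractBetween`, and the sample script text are made up for illustration and are not taken from the real page. It uses only the standard library, so it can be tried without jsoup.

```java
import java.util.Optional;

public class ScriptValueExtractor {

    // Returns the text between the first occurrence of `prefix` and the next
    // `terminator` after it, or Optional.empty() if either marker is missing.
    public static Optional<String> extractBetween(String text, String prefix, String terminator) {
        if (text == null) {
            return Optional.empty();
        }
        int prefixPos = text.indexOf(prefix);
        if (prefixPos < 0) {
            return Optional.empty(); // prefix not present in the script
        }
        int start = prefixPos + prefix.length();
        int end = text.indexOf(terminator, start);
        if (end < 0) {
            return Optional.empty(); // value is not terminated
        }
        return Optional.of(text.substring(start, end));
    }

    public static void main(String[] args) {
        // Hypothetical script body standing in for the real page's script text.
        String script = "var l='http://example.com/video.mp4?st=abc&e=123'; play(l);";
        System.out.println(extractBetween(script, "l='", "'").orElse("(not found)"));
        System.out.println(extractBetween("var x = 1;", "l='", "'").orElse("(not found)"));
    }
}
```

The selector lookup can fail in the same way: `doc.select(...).get(0)` throws an `IndexOutOfBoundsException` when nothing matches, whereas `Elements.first()` returns `null`, which you can check before calling `html()`.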
Now the explanation:
As you had already done, you had the location of the code that contains the URI, so it is easy to find the important script node: you can see a <div class="wrapqualitybtn"> near the script tag, so you can find the div that contains both the script tag and that div tag (the <div id="player" ... >, which is the script tag's parent node).
Once you have the script node, you need some string parsing. Parsing JavaScript code this way is risky, because a small change in the code can break your parser, but I think that in this case looking for l=' is a solid bet.
A few pieces of advice:
- When the page uses jQuery, you can use jQuery in the browser console too! If you put $('#player > script')[0] in the browser console, you will see the script tag.
- You can search through the DOM of the page in the developer tools of the browser (F12), right click the node, and click "Copy CSS path" (in Chrome; it is similar in Firefox) to obtain a selector usable in jsoup.
- For more resilient script parsing, use regular expressions instead of a plain indexOf search.
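The regular-expression idea could look roughly like the sketch below. The pattern is an assumption about how the assignment is written in the script (a variable named `l` assigned a single-quoted value); the class name `ScriptRegexExtractor`, the method `findUri`, and the sample input are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ScriptRegexExtractor {

    // Assumed shape of the assignment: l = '...'
    // \b stops names that merely end in "l" (such as "url") from matching,
    // and \s* tolerates optional whitespace around the equals sign.
    private static final Pattern L_ASSIGNMENT = Pattern.compile("\\bl\\s*=\\s*'([^']*)'");

    // Returns the quoted value of the first matching assignment, if any.
    public static Optional<String> findUri(String scriptText) {
        Matcher m = L_ASSIGNMENT.matcher(scriptText);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical script text, not copied from the real page.
        String script = "var url='ignored'; var l = 'http://example.com/video.mp4?st=abc';";
        System.out.println(findUri(script).orElse("(not found)"));
    }
}
```

Compared with the plain indexOf search, this version does not break if whitespace is added around the equals sign, and it will not pick up another variable whose name happens to end in "l".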
I hope this helps, and please excuse my verbosity.