I want to parse the <dt>seeders:</dt> and <dt>leechers:</dt> values
from the HTML below using jsoup. See the full code below.
<div id="details"> <dl class="col1"> <dt>type:</dt> <dd><a href="/browse/101" title="more category">audio > music</a></dd> <dt>files:</dt> <dd><a href="/torrent/8682317/" title="files" onclick=" if (filelist < 1) { new ajax.updater('filelistcontainer', '/ajax_details_filelist.php', {method: 'get', parameters: 'id=8682317'}); filelist=1; }; togglefilelist(); return false;">28</a></dd> <dt>size:</dt> <dd>222.65 mib (233468815 bytes)</dd> <br /> <dt>tag(s):</dt> <dd><a href="/tag/markus">markus</a> <a href="/tag/schulz">schulz</a> <a href="/tag/dakota">dakota</a> <a href="/tag/things">things</a> <a href="/tag/trance">trance</a> <a href="/tag/armada">armada</a> <a href="/tag/2011">2011</a> <a href="/tag/inspiron">inspiron</a> </dd> <br /> <dt>uploaded:</dt> <dd>2013-07-13 15:30:25 gmt</dd> <dt>by:</dt> <dd> <a href="/user/-inspiron-/" title="browse -inspiron-">-inspiron-</a> <img src="/static/img/vip.gif" alt="vip" title="vip" style="width:11px;" border='0' /></dd> <br /> <dt>seeders:</dt> <dd>16</dd> <dt>leechers:</dt> <dd>1</dd> <dt>comments</dt> <dd><span id="numcomments">0</span> </dd> <br /> <dt>info hash:</dt><dd> </dd> 01dd6b7325c3db5f0df5bbe510fd3fd9738d1c88 </dl> <div class="torpicture"> <img src="//image.bayimg.com/345b5b11734bb9973863359cc52929f3ddc45205.jpg" title="picture" alt="picture" /> </div> <dl class="col2"> </dl> <div id="commentdiv" style="display:none;"> <form method="post" id="commentsform" name="commentsform" onsubmit="new ajax.updater('numcomments', '/ajax_post_comment.php', {evalscripts:true, asynchronous:true, parameters:form.serialize(this)}); return false;" action="/ajax_post_comment.php"> <p class="info"> <textarea name="add_comment" id="add_comment" rows="8" cols="50"></textarea><br/> <input type="hidden" name="id" value="8682317"/> <input type="submit" value="submit" /><input type="button" value="hide" onclick="document.getelementbyid('commentdiv').style.display = 'none'" /> </p> </form> </div> <br/> <br/> <div id="social"> </div> 
<iframe src="http://cdn1.adexprt.com/dl/dl.php?b=bar&r=75&n=markus_schulz_-_global_dj_broadcast_%282013-07-11%29_%28inspiron%29&m=magnet%3a%3fxt%3durn%3abtih%3a01dd6b7325c3db5f0df5bbe510fd3fd9738d1c88%26dn%3dmarkus%2bschulz%2b-%2bglobal%2bdj%2bbroadcast%2b%25282013-07-11%2529%2b%2528inspiron%2529%26tr%3dudp%253a%252f%252ftracker.openbittorrent.com%253a80%26tr%3dudp%253a%252f%252ftracker.publicbt.com%253a80%26tr%3dudp%253a%252f%252ftracker.istole.it%253a6969%26tr%3dudp%253a%252f%252ftracker.ccc.de%253a80%26tr%3dudp%253a%252f%252fopen.demonii.com%253a1337" width="622" height="51" frameborder="0" scrolling="no"></iframe> <br /><br /> <div class="download"> <a style='background-image: url("/static/img/icons/icon-magnet.gif");' href="magnet:?xt=urn:btih:01dd6b7325c3db5f0df5bbe510fd3fd9738d1c88&dn=markus+schulz+-+global+dj+broadcast+%282013-07-11%29+%28inspiron%29&tr=udp%3a%2f%2ftracker.openbittorrent.com%3a80&tr=udp%3a%2f%2ftracker.publicbt.com%3a80&tr=udp%3a%2f%2ftracker.istole.it%3a6969&tr=udp%3a%2f%2ftracker.ccc.de%3a80&tr=udp%3a%2f%2fopen.demonii.com%3a1337" title="get torrent"> get torrent</a> <a style='background-image: url("/static/img/icon-https.gif");' href="http://adexprt.me/get/markus_schulz_-_global_dj_broadcast_%282013-07-11%29_%28inspiron%29?tag=bal" title="anonymous download"> anonymous download</a> </div> <div>(problems magnets links fixed upgrading <a href="http://www.bitlordapp.com/d/btl1/?sr=irm&chnl=details" target="_blank">torrent client</a>!)</div> <div class="nfo"> <pre>======================================================= site: http://www.inspirontrance.com/ ======================================================= ======================================================= f b page: inspiron trance ======================================================= ======================================================= twitter : inspiron22 ======================================================= markus schulz 01. mobil - 1 morning (aleksey sladkov remix) 02. 
store n forward - nuts 03. alter future vs. holbrook & skykeeper - megapolis 04. danilo ercole - cruzer 05. aaron camz - emission 06. markus schulz featuring sarah howells - tempted 07. m.i.k.e. presents caromax - inner thoughts 08. ruffault - progressive dream 09. styller - left behind 10. meridian - exit 11. lange - different shade of crazy 12. tucandeo featuring natalie gioia - disappear (xtigma remix) 13. sebastian weikum - sky limit 14. markus schulz - don't leave until sunrise guy j 01. roger martinez & secret cinema - menthol raga (guy j remix) 02. ambassador - fade (guy j remix) 03. guy j - 7 04. echomen – perpetual (guy j remix) markus schulz 15. mauro picotto & riccardo ferri - new time, new place (new world punx remix) 16. grube & hovsepian - trickster 17. nifra - waves 18. markus schulz featuring dauby - perfect (digital x remix) [global selection] 19. basil o'glue - gilgamesh 20. skytech - other side 21. id enjoy (inspiron) </pre> </div>
I've used the code below, but it parses the whole details block instead of just 'seeders' and 'leechers':
try {
    document = Jsoup.connect(BLOG_URL).get();
    title = document.title();
} catch (IOException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}
// selector query
Elements nodeBlogStats = document.select("div#details");
// check results
if (nodeBlogStats.size() > 0) {
    // get the value
    result = nodeBlogStats.get(0).text();
}
According to http://jsoup.org/apidocs/org/jsoup/select/selector.html, you are looking for
E ~ F: an F element preceded by sibling E
and
:contains(text): elements that contain the specified text.
I would try:
Element seeders = document.select("dt:contains(seeders) ~ dd").get(0);
Element leechers = document.select("dt:contains(leechers) ~ dd").get(0);
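To show the two selectors working end to end, here is a minimal sketch that runs them against a trimmed copy of the markup from the question. The class name `TorrentStats`, the helper `readStat`, and the `SAMPLE_HTML` constant are made up for illustration. The sketch assumes the real page keeps the same `<dt>`/`<dd>` layout inside `div#details`. It uses the adjacent-sibling combinator `+` rather than `~`, so only the `<dd>` directly after the matching `<dt>` is selected. jsoup must be on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class TorrentStats {
    // Trimmed copy of the <dl> from the question (assumption: the live page has the same shape).
    static final String SAMPLE_HTML =
        "<div id=\"details\"><dl class=\"col1\">"
        + "<dt>seeders:</dt> <dd>16</dd> "
        + "<dt>leechers:</dt> <dd>1</dd> "
        + "<dt>comments</dt> <dd><span id=\"numcomments\">0</span> </dd>"
        + "</dl></div>";

    // Hypothetical helper: returns the number in the <dd> that directly follows
    // the <dt> whose text contains the label, or -1 if it is missing or not numeric.
    static int readStat(Document doc, String label) {
        Element dd = doc.select("div#details dt:contains(" + label + ") + dd").first();
        if (dd == null) {
            return -1;
        }
        try {
            return Integer.parseInt(dd.text().trim());
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        // Parse the inline sample; with a live page, use Jsoup.connect(url).get() instead.
        Document doc = Jsoup.parse(SAMPLE_HTML);
        System.out.println("seeders = " + readStat(doc, "seeders"));
        System.out.println("leechers = " + readStat(doc, "leechers"));
    }
}
```

With the sample fragment this prints seeders = 16 and leechers = 1. Using first() with a null check, rather than get(0), avoids an exception when the page layout changes and nothing matches.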