javascript - Trying to scrape data generated by an AJAX request using Scrapy, but the AJAX request redirects to the home page
I am new to scraping and not sure why I am getting this problem. I am trying to scrape the customer and vendor chats on anyvan.com. A normal job page of the site looks like this. Clicking on the pink View button in the bids section sends an AJAX request that loads the chats. The XHR request can be seen in the developer tools -> Network -> filter by XHR.
I am using the following simple spider to simulate the request using Scrapy, but it seems to be getting redirected to anyvan.com:
    class AVSpider(Spider):
        name = "anyvanscraper"
        allowed_domains = ["anyvan.com"]
        # The start URL is the job URL.
        start_urls = ["http://www.anyvan.com/view-listing/1935650"]

        def parse(self, response):
            # Receives the response from the start URL. We don't use it.
            url = 'http://www.anyvan.com/ajax-bid-comment/bid/14916780'
            return Request('http://www.anyvan.com/ajax-bid-comment/bid/14916780', callback=self.parse_stores)

        def parse_stores(self, response):
            y = response.body
            f = open('html.txt', 'w')
            f.write(BeautifulSoup(y).prettify().encode('utf-8'))
Thanks in advance, Ellen
Add a header. You can add it to the request:
    "X-Requested-With": "XMLHttpRequest"
Something like this should work:
    return Request('http://www.anyvan.com/ajax-bid-comment/bid/14916780', callback=self.parse_stores, headers={"X-Requested-With": "XMLHttpRequest"})
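To check outside Scrapy whether this header really is what the server looks at, one option is to build the same request with and without it using only the standard library and compare where each one ends up. This is a minimal sketch, not something run against the live site: the helper names `build_request` and `final_url` are made up for illustration, the URL is the one from the question, and the site may also depend on cookies or other headers.

```python
import urllib.request

# URL taken from the question; it may no longer exist on the live site.
AJAX_URL = "http://www.anyvan.com/ajax-bid-comment/bid/14916780"


def build_request(url, ajax=True):
    """Build a GET request, optionally marked as an XHR call.

    Hypothetical helper: when ajax is True it adds the
    X-Requested-With header that browsers' AJAX libraries commonly send.
    """
    headers = {"X-Requested-With": "XMLHttpRequest"} if ajax else {}
    return urllib.request.Request(url, headers=headers)


def final_url(request, timeout=10):
    """Send the request and return the URL reached after any redirects."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.geturl()
```

Calling `final_url(build_request(AJAX_URL, ajax=False))` and then the same with `ajax=True` shows whether the first lands on the home page while the second stays on the AJAX URL. If both are redirected, the header alone is probably not the cause.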