New example Scrapy project available

Scrapy users have complained in the past about the lack of a pre-built example project, such as one containing the dmoz spider described in the tutorial.

Complain no more! We're happy to let you know that there is now a functional Scrapy project available on GitHub, which contains the old Google Directory spider and the Dmoz spider described in the tutorial.

The project is called "dirbot", and it's available at https://github.com/scrapy/dirbot

The documentation of Scrapy 0.13 (which will become the next stable release, Scrapy 0.14) has been updated to point to this new example project.

About

Scrapy is a fast high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
