Web Crawling / Scraping
Scrapy, a fast high-level web crawling & scraping framework for Python.
A powerful spider (web crawler) system in Python.
Ruby gem for web scraping. It scrapes a given URL and returns its title, meta description, meta keywords, links, images, and more.
Lightweight Ruby web crawler/scraper with an elegant DSL which extracts structured data from pages.
A task-based API for taking screenshots and scraping text from websites.
Anemone web-spider framework
A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
A batteries-included framework for easy web-scraping. Just add CSS! (Or do more.)
Ruby gem to inspect a web page completely. It scrapes a given URL and returns its meta, links, images, and more.
A data scraping framework based on Open Civic Data's Pupa
Spider is a web spidering library for Ruby. It handles robots.txt, scraping, collecting, and looping so that you can just handle the data.
A DSL for writing web spiders. Depends on capybara and capybara-webkit.
A simple Ruby web spider that uses Anemone to crawl every page of a site looking for email addresses. Stores the results in SQLite3 using DataMapper.
Ronin Web is a Ruby library for Ronin that provides web scraping and spidering functionality.
ScrApify is a library for building APIs by scraping static sites and using the data as models or JSON APIs. It powers APIfy, which is used to create JSON APIs from any HTML or Wikipedia page.
