compass.web.website_crawl

Custom COMPASS website crawler

Much simpler than the Crawl4AI crawler, but designed to reach some links that Crawl4AI cannot (such as those behind a button interface).

Module attributes

DOC_THRESHOLD

Default maximum number of documents to collect before terminating a COMPASS crawl
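
As a rough illustration of how a document cap of this kind can end a crawl, the sketch below stops a breadth-first traversal once a maximum number of documents has been collected. The names `crawl_until_threshold`, `fetch_links`, `is_document`, and `EXAMPLE_DOC_THRESHOLD` (with its value of 5) are hypothetical placeholders, not taken from the COMPASS source; the real value of DOC_THRESHOLD and the real crawl strategy are not stated on this page.

```python
from collections import deque

# Hypothetical value for illustration only; not the real DOC_THRESHOLD.
EXAMPLE_DOC_THRESHOLD = 5


def crawl_until_threshold(start_url, fetch_links, is_document,
                          max_docs=EXAMPLE_DOC_THRESHOLD):
    """Breadth-first traversal that stops once `max_docs` documents are found.

    `fetch_links(url)` returns an iterable of outgoing URLs and
    `is_document(url)` returns True for pages worth keeping. Both are
    caller-supplied stand-ins for real network and validation logic.
    """
    queue = deque([start_url])
    seen = {start_url}
    docs = []
    while queue and len(docs) < max_docs:
        url = queue.popleft()
        if is_document(url):
            docs.append(url)
            continue  # documents are leaves in this sketch
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return docs


# Tiny in-memory "site" standing in for real pages.
SITE = {
    "root": ["a", "b"],
    "a": ["doc1.pdf", "doc2.pdf"],
    "b": ["doc3.pdf", "doc4.pdf"],
}
found = crawl_until_threshold(
    "root",
    fetch_links=lambda url: SITE.get(url, []),
    is_document=lambda url: url.endswith(".pdf"),
    max_docs=3,
)
print(found)
```

With `max_docs=3`, the traversal stops before `doc4.pdf` is examined, even though it is already queued.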

Classes

COMPASSCrawler(validator, url_scorer[, ...])

A simple website crawler to search for ordinance documents

COMPASSLinkScorer([keyword_points])

Custom URL scorer for COMPASS website crawling
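
To make the idea of keyword-based URL scoring concrete, here is a minimal, self-contained sketch. It does not use or reproduce the COMPASSLinkScorer API: the function `score_url`, the `EXAMPLE_KEYWORD_POINTS` mapping, its weights, and the example URLs are all assumptions made for illustration. The only detail borrowed from this page is the notion of a `keyword_points` mapping.

```python
from urllib.parse import urlparse

# Hypothetical keyword weights, for illustration only.
EXAMPLE_KEYWORD_POINTS = {"ordinance": 10, "zoning": 5, "code": 2}


def score_url(url, link_text="", keyword_points=None):
    """Sum the points of every keyword found in the URL path, query, or link text."""
    points = EXAMPLE_KEYWORD_POINTS if keyword_points is None else keyword_points
    parsed = urlparse(url)
    # Ignore the host name so a keyword in the domain does not boost every link.
    haystack = f"{parsed.path} {parsed.query} {link_text}".lower()
    return sum(pts for keyword, pts in points.items() if keyword in haystack)


links = [
    ("https://example.com/contact", "Contact us"),
    ("https://example.com/planning/zoning-ordinance.pdf", "Zoning Ordinance"),
    ("https://example.com/departments/code-enforcement", "Code Enforcement"),
]

# Highest-scoring links first, so a crawler would visit them earliest.
ranked = sorted(links, key=lambda item: score_url(*item), reverse=True)
for url, text in ranked:
    print(score_url(url, text), url)
```

Plain substring matching keeps the sketch short, but it also matches keywords inside longer words; a stricter scorer might tokenize the path instead.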