compass.web.website_crawl
Custom COMPASS website crawler
Much simpler than the Crawl4AI crawler, but designed to reach some links that Crawl4AI cannot (such as those behind a button interface).
Module attributes
Default maximum number of documents to collect before terminating a COMPASS crawl
Classes
A simple website crawler to search for ordinance documents
Custom URL scorer for COMPASS website crawling
Crawl4AI Link subclass with a few utilities
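
The entries above describe a crawler, a URL scorer, and a document cap. As a rough illustration of how such pieces can fit together, here is a minimal sketch of a score-ordered crawl that stops once a maximum number of documents has been collected. It does not use or reproduce the COMPASS or Crawl4AI APIs: the names `score_url`, `crawl`, `KEYWORD_WEIGHTS`, and `MAX_DOCS`, the keyword weights, and the in-memory `SITE` link graph are all assumptions made for this example.

```python
import heapq

# Hypothetical keyword weights; not taken from COMPASS.
KEYWORD_WEIGHTS = {"ordinance": 3, "zoning": 2, "code": 1}
MAX_DOCS = 2  # illustrative cap, not the library's default


def score_url(url):
    """Sum the weights of the keywords that appear in the URL."""
    lowered = url.lower()
    return sum(w for kw, w in KEYWORD_WEIGHTS.items() if kw in lowered)


def crawl(start, links, is_document, max_docs=MAX_DOCS):
    """Best-first crawl over an in-memory link graph.

    links: dict mapping a URL to the URLs found on that page.
    is_document: callable returning True for URLs that are documents.
    """
    # heapq is a min-heap, so scores are negated to pop the best first.
    queue = [(-score_url(start), start)]
    seen = {start}
    found = []
    while queue and len(found) < max_docs:
        _, url = heapq.heappop(queue)
        if is_document(url):
            found.append(url)
            continue
        for nxt in links.get(url, []):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(queue, (-score_url(nxt), nxt))
    return found


# Toy site used in place of real HTTP requests.
SITE = {
    "https://example.gov/": [
        "https://example.gov/news",
        "https://example.gov/zoning",
    ],
    "https://example.gov/zoning": [
        "https://example.gov/zoning/ordinance.pdf",
        "https://example.gov/zoning/map.pdf",
    ],
    "https://example.gov/news": ["https://example.gov/news/picnic.pdf"],
}

docs = crawl("https://example.gov/", SITE, lambda u: u.endswith(".pdf"))
print(docs)
```

In this toy run the zoning page is visited before the news page because its URL scores higher, and the crawl stops after two documents, so the unrelated news PDF is never reached. A real crawler would fetch pages over the network and extract links from them, which this sketch deliberately leaves out.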