compass.scripts.download.download_jurisdiction_ordinance_using_search_engine

async download_jurisdiction_ordinance_using_search_engine(question_templates, jurisdiction, num_urls=5, file_loader_kwargs=None, search_semaphore=None, browser_semaphore=None, url_ignore_substrings=None, **kwargs)

Download the ordinance document(s) for a single jurisdiction.

Parameters:
  • jurisdiction (Jurisdiction) – Location object representing the jurisdiction.

  • model_configs (dict) – Dictionary of LLMConfig instances. Should have at minimum a “default” key that is used as a fallback for all tasks.

  • num_urls (int, optional) – Number of unique Google search result URLs to check for an ordinance document. By default, 5.

  • file_loader_kwargs (dict, optional) – Dictionary of keyword-argument pairs used to initialize elm.web.file_loader.AsyncFileLoader. If present, the “pw_launch_kwargs” key in this dictionary is also used to initialize the elm.web.search.google.PlaywrightGoogleLinkSearch instance used for the Google URL search. By default, None.

  • search_semaphore (asyncio.Semaphore, optional) – Semaphore instance that can be used to limit the number of Playwright browsers open concurrently to submit search engine queries. If None, browser_semaphore is used in its place (i.e. the searches and file downloads are limited by the same semaphore). By default, None.

  • browser_semaphore (asyncio.Semaphore, optional) – Semaphore instance that can be used to limit the number of Playwright browsers open concurrently to download content from the web. If None, no limits are applied. By default, None.

  • usage_tracker (compass.services.usage.UsageTracker, optional) – Optional tracker instance to monitor token usage during LLM calls. By default, None.
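
The effect of passing an asyncio.Semaphore as browser_semaphore can be illustrated with a minimal, self-contained sketch. It does not call this function or any compass/elm code: fake_download, ConcurrencyProbe, and run_downloads are hypothetical stand-ins, and the sleep only simulates browser work. It shows how a shared semaphore caps the number of tasks running at once, and that passing None applies no limit.

```python
import asyncio


class ConcurrencyProbe:
    """Records the peak number of tasks running at the same time."""

    def __init__(self):
        self.active = 0
        self.peak = 0

    def enter(self):
        self.active += 1
        self.peak = max(self.peak, self.active)

    def exit(self):
        self.active -= 1


async def fake_download(probe, browser_semaphore=None):
    """Hypothetical stand-in for one browser-based download."""
    if browser_semaphore is None:
        # No semaphore given: nothing limits how many tasks run at once
        probe.enter()
        await asyncio.sleep(0.01)  # simulated browser work
        probe.exit()
        return

    # Only as many tasks as the semaphore allows get past this point
    async with browser_semaphore:
        probe.enter()
        await asyncio.sleep(0.01)  # simulated browser work
        probe.exit()


async def run_downloads(num_tasks, max_browsers=None):
    """Run num_tasks fake downloads and return the peak concurrency."""
    probe = ConcurrencyProbe()
    semaphore = asyncio.Semaphore(max_browsers) if max_browsers else None
    tasks = (fake_download(probe, semaphore) for _ in range(num_tasks))
    await asyncio.gather(*tasks)
    return probe.peak


if __name__ == "__main__":
    print(asyncio.run(run_downloads(6, max_browsers=2)))
    print(asyncio.run(run_downloads(6)))
```

In real use, the same semaphore instance would presumably be shared across the calls made for different jurisdictions, so that the limit applies to the whole run rather than to each call separately.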

Returns:

list or None – List of BaseDocument instances possibly containing ordinance information, or None if no ordinance document was found.
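
Because the return value is either a list or None, callers need to check for None before iterating. A minimal sketch of that check follows; summarize_result is a hypothetical helper, not part of compass, and plain strings stand in for BaseDocument instances.

```python
def summarize_result(docs):
    """Describe a 'list or None' download result."""
    if docs is None:
        # Documented signal that no ordinance document was found
        return "no ordinance document found"
    return f"{len(docs)} candidate document(s) found"


if __name__ == "__main__":
    print(summarize_result(None))
    print(summarize_result(["doc_a", "doc_b"]))
```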

Notes

Requires the TempFileCachePB service to be running.