compass.validation.location.JurisdictionValidator

class JurisdictionValidator(score_thresh=0.8, text_splitter=None, **kwargs)

Bases: object

COMPASS Ordinance Jurisdiction validator

Combines the logic of several validators into a single class.

Purpose:

Determine whether a document pertains to a specific county.

Responsibilities:
  1. Use a combination of heuristics and LLM queries to determine whether or not a document pertains to a particular county.

Key Relationships:

Uses a StructuredLLMCaller for LLM queries and delegates sub-validation to DTreeJurisdictionValidator and DTreeURLJurisdictionValidator.

Parameters:
  • score_thresh (float, optional) – Score threshold to exceed when voting on content from raw pages. By default, 0.8.

  • text_splitter (langchain.text_splitter.TextSplitter, optional) – Optional text splitter instance to attach to doc (used for splitting out pages in an HTML document). By default, None.

  • **kwargs – Additional keyword arguments to pass to the BaseLLMCaller instance.
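This section does not say how the score compared against score_thresh is computed. As a minimal sketch, assuming each raw page yields a yes/no verdict and that verdicts are weighted by page text length (both assumptions, as are the helper names weighted_vote and passes_threshold), the vote could look like this:

```python
def weighted_vote(page_verdicts, page_texts):
    """Return the length-weighted fraction of pages judged to match.

    Hypothetical helper: the weighting scheme is an assumption, not
    something stated in the JurisdictionValidator documentation.
    """
    weights = [len(text) for text in page_texts]
    total = sum(weights)
    if total == 0:
        # No text at all means there is nothing to vote on.
        return 0.0
    matched = sum(w for w, ok in zip(weights, page_verdicts) if ok)
    return matched / total


def passes_threshold(page_verdicts, page_texts, score_thresh=0.8):
    """True only if the weighted score strictly exceeds score_thresh."""
    return weighted_vote(page_verdicts, page_texts) > score_thresh


# A long matching page outweighs a short non-matching one.
pages = ["a" * 900, "b" * 100]
long_page_matches = passes_threshold([True, False], pages)
short_page_matches = passes_threshold([False, True], pages)
```

The strict comparison mirrors the wording "threshold to exceed": under this sketch, a score exactly equal to score_thresh would not pass.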

Methods

check(doc, jurisdiction)

Check if the document belongs to the county

async check(doc, jurisdiction)

Check if the document belongs to the county

Parameters:

  • doc (elm.web.document.BaseDocument) – Document instance. Its attrs should contain a “source” key holding a URL (used for the URL validation check). The raw content is parsed for the county name and the correct jurisdiction.

  • jurisdiction – Jurisdiction (county) that the document is checked against.

Returns:

bool – True if the doc contents pertain to the input county, False otherwise.
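Because check is a coroutine, it has to be awaited from async code or driven with asyncio.run. The sketch below shows only that calling pattern. StubDocument and StubJurisdictionValidator are hypothetical stand-ins, not part of COMPASS or elm; the substring test is placeholder logic, and passing the jurisdiction as a plain string is an assumption made for illustration.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class StubDocument:
    """Hypothetical stand-in for a document with text and attrs."""

    text: str
    attrs: dict = field(default_factory=dict)


class StubJurisdictionValidator:
    """Hypothetical stand-in exposing the same async check() shape."""

    async def check(self, doc, jurisdiction):
        # Placeholder logic only: a naive substring test on text and URL.
        # The documented validator uses heuristics and LLM queries instead.
        name = jurisdiction.lower()
        source = doc.attrs.get("source", "").lower()
        return name in doc.text.lower() or name.replace(" ", "") in source


async def main():
    # The "source" attr carries the URL used for the URL-based check.
    doc = StubDocument(
        text="Zoning ordinance of Decatur County ...",
        attrs={"source": "https://example.com/decaturcounty/ordinance.pdf"},
    )
    validator = StubJurisdictionValidator()
    return await validator.check(doc, "Decatur County")


if __name__ == "__main__":
    print(asyncio.run(main()))
```

With the real class, the same pattern would apply: construct the validator once, then await check for each document and jurisdiction pair.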