elm.ords.extraction.ordinance.OrdinanceValidator

class OrdinanceValidator(structured_llm_caller, text_chunks, num_to_recall=2)[source]

Bases: ValidationWithMemory

Check document text for wind ordinances.

Parameters:
  • structured_llm_caller (elm.ords.llm.StructuredLLMCaller) – StructuredLLMCaller instance. Used for structured validation queries.

  • text_chunks (list of str) – List of strings, each of which represents a chunk of text. The strings should be ordered as the chunks appear in the document. This validator may refer to previous text chunks to answer validation questions.

  • num_to_recall (int, optional) – Number of chunks to check for each validation call. This includes the original chunk! For example, if num_to_recall=2, the validator will first check the chunk at the requested index, and then the previous chunk as well. By default, 2.
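The recall window described for num_to_recall can be sketched as follows. This is an illustrative stand-in, not the class internals: chunks_checked is a hypothetical helper showing which chunk indices a single validation call would inspect.

```python
def chunks_checked(ind, num_to_recall):
    """Return the chunk indices inspected for a validation call at ``ind``.

    The original chunk counts toward ``num_to_recall``, so with
    ``num_to_recall=2`` the chunk at ``ind`` and the one before it
    are checked (clipped at the start of the document).
    """
    return [i for i in range(ind, ind - num_to_recall, -1) if i >= 0]

print(chunks_checked(5, 2))  # [5, 4] -- the requested chunk, then the previous one
print(chunks_checked(0, 2))  # [0] -- no earlier chunk exists to recall
```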

Methods

parse([min_chunks_to_process])

Parse text chunks and look for ordinance text.

parse_from_ind(ind, prompt, key)

Validate a chunk of text.

Attributes

CONTAINS_ORD_PROMPT

IS_LEGAL_TEXT_PROMPT

IS_UTILITY_SCALE_PROMPT

is_legal_text

True if text was found to be from a legal source.

ordinance_text

Combined ordinance text from the individual chunks.

property is_legal_text

True if text was found to be from a legal source.

Type:

bool

property ordinance_text

Combined ordinance text from the individual chunks.

Type:

str

async parse(min_chunks_to_process=3)[source]

Parse text chunks and look for ordinance text.

Parameters:

min_chunks_to_process (int, optional) – Minimum number of chunks to process before the validator checks whether the document resembles legal text and begins ignoring chunks that don’t pass the wind heuristic. By default, 3.

Returns:

bool – True if any ordinance text was found in the chunks.
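The min_chunks_to_process gating can be sketched like this. All names here are illustrative (chunks_to_validate and passes_wind_heuristic are hypothetical, not part of the elm API); the sketch only shows which chunks survive the heuristic filter, assuming the gating works as described above.

```python
def chunks_to_validate(chunks, passes_wind_heuristic, min_chunks_to_process=3):
    """Sketch: indices of chunks that would be sent for validation.

    The first ``min_chunks_to_process`` chunks are always processed;
    after that, chunks failing the wind heuristic are skipped.
    """
    keep = []
    for i, chunk in enumerate(chunks):
        if i >= min_chunks_to_process and not passes_wind_heuristic(chunk):
            continue  # heuristic-failing chunk ignored after the minimum
        keep.append(i)
    return keep

heuristic = lambda c: "wind" in c
print(chunks_to_validate(["a", "b", "c", "d", "wind e"], heuristic))
# [0, 1, 2, 4] -- "d" fails the heuristic and is skipped
```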

async parse_from_ind(ind, prompt, key)

Validate a chunk of text.

Validation occurs by querying the LLM using the input prompt and parsing the key from the response JSON. The prompt should request that the key be a boolean output. If the key retrieved from the LLM response is False, a number of previous text chunks are checked as well, using the same prompt. This can be helpful in cases where the answer to the validation prompt (e.g. does this text pertain to a large WECS?) is only found in a previous text chunk.

Parameters:
  • ind (int) – Positive integer corresponding to the chunk index. Must be less than len(text_chunks).

  • prompt (str) – Input LLM system prompt that describes the validation question. This should request a JSON output from the LLM. It should also take key as a formatting input.

  • key (str) – A key expected in the JSON output of the LLM containing the response for the validation question. This string will also be used to format the system prompt before it is passed to the LLM.

Returns:

bool – True if the LLM returned True for this text chunk or for one of the num_to_recall-1 text chunks before it; False otherwise.
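The fallback behavior described above can be sketched with a stubbed LLM. This is a minimal illustration, not the real implementation: llm_answers stands in for the StructuredLLMCaller query (the real method formats the system prompt with key and parses key out of the LLM's JSON response).

```python
import asyncio


async def parse_from_ind_sketch(ind, key, text_chunks, llm_answers,
                                num_to_recall=2):
    """Sketch: ask about chunk ``ind``; if the answer is False,
    re-ask with up to ``num_to_recall - 1`` earlier chunks."""
    for step in range(num_to_recall):
        check_ind = ind - step
        if check_ind < 0:
            break  # ran off the start of the document
        response = llm_answers(text_chunks[check_ind])  # stubbed LLM call
        if response.get(key):
            return True
    return False


chunks = ["intro text", "large WECS setbacks...", "noise limits"]
# Stub: only the second chunk answers the validation question affirmatively.
stub = lambda chunk: {"wind": "WECS" in chunk}

# Chunk 2 alone answers False, but the recalled chunk 1 answers True.
print(asyncio.run(parse_from_ind_sketch(2, "wind", chunks, stub)))  # True
```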