elm.ords.extraction.ordinance.OrdinanceValidator

class OrdinanceValidator(structured_llm_caller, text_chunks, num_to_recall=2)[source]

Bases: ValidationWithMemory

Check document text for wind ordinances.

Parameters:
  • structured_llm_caller (elm.ords.llm.StructuredLLMCaller) – StructuredLLMCaller instance. Used for structured validation queries.

  • text_chunks (list of str) – List of strings, each of which represents a chunk of text. The strings should be ordered as the chunks appear in the document. This validator may refer to previous text chunks to answer validation questions.

  • num_to_recall (int, optional) – Number of chunks to check for each validation call. This includes the original chunk! For example, if num_to_recall=2, the validator will first check the chunk at the requested index, and then the previous chunk as well. By default, 2.
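The recall window described for num_to_recall can be sketched as follows. This is an illustrative stand-in, not the class internals: chunks_checked is a hypothetical helper showing which chunk indices a single validation call would inspect.

```python
def chunks_checked(ind, num_to_recall):
    """Return the chunk indices inspected for a validation call at ``ind``.

    The original chunk counts toward ``num_to_recall``, so with
    ``num_to_recall=2`` the chunk at ``ind`` and the one before it
    are checked (clipped at the start of the document).
    """
    return [i for i in range(ind, ind - num_to_recall, -1) if i >= 0]

print(chunks_checked(5, 2))  # [5, 4] -- the requested chunk, then the previous one
print(chunks_checked(0, 2))  # [0] -- no earlier chunk exists to recall
```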

Methods

parse([min_chunks_to_process])

Parse text chunks and look for ordinance text.

parse_from_ind(ind, prompt, key)

Validate a chunk of text.

Attributes

CONTAINS_ORD_PROMPT

IS_LEGAL_TEXT_PROMPT

IS_UTILITY_SCALE_PROMPT

is_legal_text

True if text was found to be from a legal source.

ordinance_text

Combined ordinance text from the individual chunks.

property is_legal_text

True if text was found to be from a legal source.

Type:

bool

property ordinance_text

Combined ordinance text from the individual chunks.

Type:

str

async parse(min_chunks_to_process=3)[source]

Parse text chunks and look for ordinance text.

Parameters:

min_chunks_to_process (int, optional) – Minimum number of chunks to process before the validator checks whether the document resembles legal text and begins ignoring chunks that don’t pass the wind heuristic. By default, 3.

Returns:

bool – True if any ordinance text was found in the chunks.
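The min_chunks_to_process gating can be sketched like this. All names here are illustrative (chunks_to_validate and passes_wind_heuristic are hypothetical, not part of the elm API); the sketch only shows which chunks survive the heuristic filter, assuming the gating works as described above.

```python
def chunks_to_validate(chunks, passes_wind_heuristic, min_chunks_to_process=3):
    """Sketch: indices of chunks that would be sent for validation.

    The first ``min_chunks_to_process`` chunks are always processed;
    after that, chunks failing the wind heuristic are skipped.
    """
    keep = []
    for i, chunk in enumerate(chunks):
        if i >= min_chunks_to_process and not passes_wind_heuristic(chunk):
            continue  # heuristic-failing chunk ignored after the minimum
        keep.append(i)
    return keep

heuristic = lambda c: "wind" in c
print(chunks_to_validate(["a", "b", "c", "d", "wind e"], heuristic))
# [0, 1, 2, 4] -- "d" fails the heuristic and is skipped
```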

async parse_from_ind(ind, prompt, key)

Validate a chunk of text.

Validation occurs by querying the LLM using the input prompt and parsing the key from the response JSON. The prompt should request that the key be a boolean output. If the key retrieved from the LLM response is False, a number of previous text chunks are checked as well, using the same prompt. This can be helpful in cases where the answer to the validation prompt (e.g. does this text pertain to a large WECS?) is only found in a previous text chunk.

Parameters:
  • ind (int) – Positive integer corresponding to the chunk index. Must be less than len(text_chunks).

  • prompt (str) – Input LLM system prompt that describes the validation question. This should request a JSON output from the LLM. It should also take key as a formatting input.

  • key (str) – A key expected in the JSON output of the LLM containing the response for the validation question. This string will also be used to format the system prompt before it is passed to the LLM.

Returns:

bool – True if the LLM returned True for this text chunk or for one of the num_to_recall-1 text chunks before it; False otherwise.
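The fallback behavior described above can be sketched with a stubbed LLM. This is a minimal illustration, not the real implementation: llm_answers stands in for the StructuredLLMCaller query (the real method formats the system prompt with key and parses key out of the LLM's JSON response).

```python
import asyncio


async def parse_from_ind_sketch(ind, key, text_chunks, llm_answers,
                                num_to_recall=2):
    """Sketch: ask about chunk ``ind``; if the answer is False,
    re-ask with up to ``num_to_recall - 1`` earlier chunks."""
    for step in range(num_to_recall):
        check_ind = ind - step
        if check_ind < 0:
            break  # ran off the start of the document
        response = llm_answers(text_chunks[check_ind])  # stubbed LLM call
        if response.get(key):
            return True
    return False


chunks = ["intro text", "large WECS setbacks...", "noise limits"]
# Stub: only the second chunk answers the validation question affirmatively.
stub = lambda chunk: {"wind": "WECS" in chunk}

# Chunk 2 alone answers False, but the recalled chunk 1 answers True.
print(asyncio.run(parse_from_ind_sketch(2, "wind", chunks, stub)))  # True
```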