elm.ords.extraction.ordinance.OrdinanceValidator
- class OrdinanceValidator(structured_llm_caller, text_chunks, num_to_recall=2)
Bases: ValidationWithMemory
Check document text for wind ordinances
- Purpose:
Determine whether a document contains relevant ordinance information.
- Responsibilities:
1. Determine whether a document contains relevant (e.g. utility-scale wind zoning) ordinance information by splitting the text into chunks and parsing each chunk individually with an LLM.
- Key Relationships:
Child class of ValidationWithMemory, which allows the validation to look at neighboring chunks of text.
- Parameters:
structured_llm_caller (elm.ords.llm.StructuredLLMCaller) – StructuredLLMCaller instance. Used for structured validation queries.
text_chunks (list of str) – List of strings, each of which represents a chunk of text. The strings should be in the same order as the text chunks appear in the document. This validator may refer to previous text chunks to answer validation questions.
num_to_recall (int, optional) – Number of chunks to check for each validation call. This includes the original chunk! For example, if num_to_recall=2, the validator will first check the chunk at the requested index, and then the previous chunk as well. By default, 2.
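As a rough illustration of the num_to_recall semantics described above, the following sketch lists the chunk indices that would be visited for a given request. The helper name chunk_indices_to_check is hypothetical and not part of elm, and stopping at index 0 when there are fewer previous chunks than num_to_recall-1 is an assumption.

```python
def chunk_indices_to_check(ind, num_to_recall=2):
    """Hypothetical helper: indices visited for one validation call.

    The requested chunk comes first, followed by up to
    ``num_to_recall - 1`` chunks before it.
    """
    # Step backwards from the requested index; dropping negative
    # indices (i.e. stopping at the first chunk) is an assumption.
    return [i for i in range(ind, ind - num_to_recall, -1) if i >= 0]


print(chunk_indices_to_check(5))     # [5, 4]
print(chunk_indices_to_check(0))     # [0]
print(chunk_indices_to_check(3, 3))  # [3, 2, 1]
```

With the default num_to_recall=2, each call therefore looks at no more than two chunks: the requested one and the one immediately before it.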
Methods
parse([min_chunks_to_process]) – Parse text chunks and look for ordinance text.
parse_from_ind(ind, prompt, key) – Validate a chunk of text.
Attributes
CONTAINS_ORD_PROMPT
IS_LEGAL_TEXT_PROMPT
IS_UTILITY_SCALE_PROMPT
True if text was found to be from a legal source.
Combined ordinance text from the individual chunks.
- async parse(min_chunks_to_process=3)
Parse text chunks and look for ordinance text.
- Parameters:
min_chunks_to_process (int, optional) – Minimum number of chunks to process before checking if document resembles legal text and ignoring chunks that don’t pass the wind heuristic. By default, 3.
- Returns:
bool – True if any ordinance text was found in the chunks.
- async parse_from_ind(ind, prompt, key)
Validate a chunk of text.
Validation occurs by querying the LLM using the input prompt and parsing the key from the response JSON. The prompt should request that the key be a boolean output. If the key retrieved from the LLM response is False, a number of previous text chunks are checked as well, using the same prompt. This can be helpful in cases where the answer to the validation prompt (e.g. does this text pertain to a large WECS?) is only found in a previous text chunk.
- Parameters:
ind (int) – Positive integer corresponding to the chunk index. Must be less than len(text_chunks).
prompt (str) – Input LLM system prompt that describes the validation question. This should request a JSON output from the LLM. It should also take key as a formatting input.
key (str) – A key expected in the JSON output of the LLM containing the response for the validation question. This string will also be used to format the system prompt before it is passed to the LLM.
- Returns:
bool – True if the LLM returned True for this text chunk or for any of the num_to_recall-1 text chunks before it, False otherwise.
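The validation flow described for parse_from_ind can be sketched as follows. This is a minimal, self-contained illustration and not the elm implementation: fake_llm is a hypothetical stub that stands in for the structured LLM call, validate_with_recall and PROMPT are made-up names, and the keyword check inside the stub exists only so that the example runs without a model.

```python
import asyncio
import json

# Hypothetical system prompt; "{key}" is filled in before the call.
PROMPT = (
    "Does the text pertain to a large WECS? "
    'Reply in JSON with a boolean under the key "{key}".'
)


async def fake_llm(system_prompt, text, key):
    """Hypothetical stub standing in for a structured LLM call."""
    # A real call would send system_prompt and text to a model;
    # here a keyword match fakes the boolean answer.
    return json.dumps({key: "large wecs" in text.lower()})


async def validate_with_recall(text_chunks, ind, prompt, key, num_to_recall=2):
    """Check chunk ``ind``, then up to ``num_to_recall - 1`` earlier chunks."""
    system_prompt = prompt.format(key=key)
    # Walk backwards from the requested chunk, never past index 0.
    for i in range(ind, max(ind - num_to_recall, -1), -1):
        response = json.loads(await fake_llm(system_prompt, text_chunks[i], key))
        if response.get(key, False):
            return True
    return False


chunks = [
    "Section 1. This ordinance applies to large WECS.",
    "Section 2. Towers shall be set back 500 feet from property lines.",
    "Section 3. Fees are due annually.",
]
print(asyncio.run(validate_with_recall(chunks, 1, PROMPT, "large_wecs")))  # True
print(asyncio.run(validate_with_recall(chunks, 2, PROMPT, "large_wecs")))  # False
```

Chunk 1 does not mention large WECS on its own, but the previous chunk does, so the first call returns True. For chunk 2, neither it nor chunk 1 matches, and chunk 0 lies outside the num_to_recall=2 window, so the second call returns False.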