compass.extraction.wind.ordinance.WindOrdinanceTextExtractor#

class WindOrdinanceTextExtractor(llm_caller)[source]#

Bases: BaseTextExtractor

Extract succinct ordinance text from input

Purpose:

Extract relevant ordinance text from document.

Responsibilities:
  1. Extract portions from chunked document text relevant to particular ordinance type (e.g. wind zoning for utility-scale systems).

Key Relationships:

Uses a StructuredLLMCaller for LLM queries.

Parameters:

llm_caller (LLMCaller) – LLM Caller instance used to extract ordinance info with.

Methods

extract_large_wind_energy_system_section(...)

Extract large WES ordinance text from input text chunks

extract_wind_energy_system_section(text_chunks)

Extract ordinance text from input text chunks for WES

Attributes

LARGE_WIND_ENERGY_SYSTEM_SECTION_FILTER_PROMPT

Prompt to extract ordinance text for utility-scale WECS

SYSTEM_MESSAGE

System message for text extraction LLM calls

WIND_ENERGY_SYSTEM_FILTER_PROMPT

Prompt to extract ordinance text for WECS

parsers

Iterable of parsers provided by this extractor

WIND_ENERGY_SYSTEM_FILTER_PROMPT = "# CONTEXT #\nWe want to reduce the provided excerpt to only contain information about **wind energy systems**. The extracted text will be used for structured data extraction, so it must be both **comprehensive** (retaining all relevant details) and **focused** (excluding unrelated content), with **zero rewriting or paraphrasing**. Ensure that all retained information is **directly applicable to wind energy systems** while preserving full context and accuracy.\n\n# OBJECTIVE #\nExtract all text **pertaining to wind energy systems** from the provided excerpt.\n\n# RESPONSE #\nFollow these guidelines carefully:\n\n1. ## Scope of Extraction ##:\n- Include all text that pertains to **wind energy systems**.\n- Explicitly include any text related to **bans or prohibitions** on wind energy systems.\n- Explicitly include any text related to the adoption or enactment date of the ordinance (if any).\n\n2. ## Exclusions ##:\n- Do **not** include text that does not pertain to wind energy systems.\n\n3. ## Formatting & Structure ##:\n- **Preserve _all_ section titles, headers, and numberings** for reference.\n- **Maintain the original wording, formatting, and structure** to ensure accuracy.\n\n4. ## Output Handling ##:\n- This is a strict extraction task act like a text filter, **not** a summarizer or writer.\n- Do not add, explain, reword, or summarize anything.\n- The output must be a **copy-paste** of the original excerpt.\n**Absolutely no paraphrasing or rewriting.**\n- The output must consist **only** of contiguous or discontiguous verbatim blocks copied from the input.\n- If **no relevant text** is found, return the response: 'No relevant text.'"#

Prompt to extract ordinance text for WECS

LARGE_WIND_ENERGY_SYSTEM_SECTION_FILTER_PROMPT = "# CONTEXT #\nWe want to reduce the provided excerpt to only contain information about **large wind energy systems**. The extracted text will be used for structured data extraction, so it must be both **comprehensive** (retaining all relevant details) and **focused** (excluding unrelated content), with **zero rewriting or paraphrasing**. Ensure that all retained information is **directly applicable** to large wind energy systems while preserving full context and accuracy.\n\n# OBJECTIVE #\nExtract all text **pertaining to large wind energy systems** from the provided excerpt.\n\n# RESPONSE #\nFollow these guidelines carefully:\n\n1. ## Scope of Extraction ##:\n- Include all text that pertains to **large wind energy systems**, even if they are referred to by different names such as:\n\tWind turbines, wind energy conversion systems (wecs), wind energy facilities (wef), wind energy turbines (wet), large wind energy turbines (lwet), utility-scale wind energy turbines (uwet), commercial wind energy conversion systems (cwecs), alternate energy systems (aes), commercial energy production systems (cepcs), or similar.\n- Explicitly include any text related to **bans or prohibitions** on large wind energy systems.\n- Explicitly include any text related to the adoption or enactment date of the ordinance (if any).\n- **Retain all relevant technical, design, operational, safety, environmental, and infrastructure-related provisions** that apply to the topic, such as (but not limited to):\n\t- Compliance with legal or regulatory standards.\n\t- Site, structural, or design specifications.\n\t- Environmental impact considerations.\n\t- Safety and risk mitigation measures.\n\t- Infrastructure, implementation, operation, and maintenance details.\n\t- All other **closely related provisions**.\n\n2. ## Exclusions ##:\n- Do **not** include text that explicitly applies **only** to private, residential, micro, small, or medium sized wind energy systems.\n- Do **not** include text that does not pertain at all to wind energy systems.\n\n3. ## Formatting & Structure ##:\n- **Preserve _all_ section titles, headers, and numberings** for reference.\n- **Maintain the original wording, formatting, and structure** to ensure accuracy.\n\n4. ## Output Handling ##:\n- This is a strict extraction task act like a text filter, **not** a summarizer or writer.\n- Do not add, explain, reword, or summarize anything.\n- The output must be a **copy-paste** of the original excerpt.\n**Absolutely no paraphrasing or rewriting.**\n- The output must consist **only** of contiguous or discontiguous verbatim blocks copied from the input.\n- If **no relevant text** is found, return the response: 'No relevant text.'"#

Prompt to extract ordinance text for utility-scale WECS

async extract_wind_energy_system_section(text_chunks)[source]#

Extract ordinance text from input text chunks for WES

Parameters:

text_chunks (list of str) – List of strings, each of which represent a chunk of text. The order of the strings should be the order of the text chunks.

Returns:

str – Ordinance text extracted from text chunks.

async extract_large_wind_energy_system_section(text_chunks)[source]#

Extract large WES ordinance text from input text chunks

Parameters:

text_chunks (list of str) – List of strings, each of which represent a chunk of text. The order of the strings should be the order of the text chunks.

Returns:

str – Ordinance text extracted from text chunks.

property parsers#

Iterable of parsers provided by this extractor

Yields:
  • name (str) – Name describing the type of text output by the parser.

  • parser (callable()) – Async function that takes a text_chunks input and outputs parsed text.

SYSTEM_MESSAGE = 'You are a text extraction assistant. Your job is to extract only verbatim, **unmodified** excerpts from provided legal or policy documents. Do not interpret or paraphrase. Do not summarize. Only return exactly copied segments that match the specified scope. If the relevant content appears within a table, return the entire table, including headers and footers, exactly as formatted.'#

System message for text extraction LLM calls