compass.extraction.solar.ordinance.SolarOrdinanceTextExtractor#
- class SolarOrdinanceTextExtractor(llm_caller)[source]#
Bases: BaseTextExtractor
Extract succinct ordinance text from input
- Purpose:
Extract relevant ordinance text from document.
- Responsibilities:
Extract portions from chunked document text relevant to particular ordinance type (e.g. solar zoning for utility-scale systems).
- Key Relationships:
Uses a StructuredLLMCaller for LLM queries.
- Parameters:
llm_caller (LLMCaller) – LLM caller instance used to extract ordinance info.
Methods
extract_solar_energy_system_section(text_chunks): Extract ordinance text from input text chunks for SEF
Attributes
SOLAR_ENERGY_SYSTEM_FILTER_PROMPT: Prompt to extract ordinance text for SEF
SYSTEM_MESSAGE: System message for text extraction LLM calls
parsers: Iterable of parsers provided by this extractor
- SOLAR_ENERGY_SYSTEM_FILTER_PROMPT = "# CONTEXT #\nWe want to reduce the provided excerpt to only contain information about **solar energy systems**. The extracted text will be used for structured data extraction, so it must be both **comprehensive** (retaining all relevant details) and **focused** (excluding unrelated content), with **zero rewriting or paraphrasing**. Ensure that all retained information is **directly applicable to solar energy systems** while preserving full context and accuracy.\n\n# OBJECTIVE #\nExtract all text **pertaining to solar energy systems** from the provided excerpt.\n\n# RESPONSE #\nFollow these guidelines carefully:\n\n1. ## Scope of Extraction ##:\n- Include **all** text that pertains to** solar energy systems**, even if they are referred to by different names such as:\n\tSolar panels, solar energy conversion systems (secs), solar energy facilities (sef), solar energy farms (sef), solar farms (sf), utility-scale solar energy systems (uses), commercial solar energy systems (cses), ground-mounted solar energy systems (gses), alternate energy systems (aes), commercial energy production systems (cepcs), or similar.\n- Explicitly include any text related to **bans or prohibitions** on solar energy systems.\n- Explicitly include any text related to the adoption or enactment date of the ordinance (if any).\n\n2. ## Exclusions ##:\n- Do **not** include text that does not pertain to solar energy systems.\n\n3. ## Formatting & Structure ##:\n- **Preserve _all_ section titles, headers, and numberings** for reference.\n- **Maintain the original wording, formatting, and structure** to ensure accuracy.\n\n4. 
## Output Handling ##:\n- This is a strict extraction task — act like a text filter, **not** a summarizer or writer.\n- Do not add, explain, reword, or summarize anything.\n- The output must be a **copy-paste** of the original excerpt.\n**Absolutely no paraphrasing or rewriting.**\n- The output must consist **only** of contiguous or discontiguous verbatim blocks copied from the input.\n- If **no relevant text** is found, return the response: 'No relevant text.'"#
Prompt to extract ordinance text for SEF
- async extract_solar_energy_system_section(text_chunks)[source]#
Extract ordinance text from input text chunks for SEF
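The page does not show how this method is implemented, so the following is only a minimal sketch of chunk-wise extraction under stated assumptions: each chunk is sent to an LLM along with the system message and filter prompt, responses equal to the 'No relevant text.' sentinel from the prompt are dropped, and the remaining excerpts are joined. The names `fake_llm_call`, `extract_section`, and `demo_extract` are hypothetical and not part of the library; `fake_llm_call` is a keyword filter standing in for a real LLM call.

```python
import asyncio

NO_RELEVANT_TEXT = "No relevant text."  # sentinel named in the filter prompt


async def fake_llm_call(system_message, prompt, chunk):
    """Hypothetical stand-in for an LLM call: keeps lines mentioning 'solar'."""
    kept = [line for line in chunk.splitlines() if "solar" in line.lower()]
    return "\n".join(kept) if kept else NO_RELEVANT_TEXT


async def extract_section(text_chunks, system_message, prompt, llm_call=fake_llm_call):
    """Filter every chunk concurrently and join the relevant excerpts in order."""
    responses = await asyncio.gather(
        *(llm_call(system_message, prompt, chunk) for chunk in text_chunks)
    )
    # Drop chunks for which the (assumed) sentinel response came back
    relevant = [r for r in responses if r.strip() != NO_RELEVANT_TEXT]
    return "\n".join(relevant)


def demo_extract():
    chunks = [
        "Section 1. Solar energy systems require a 50 ft setback.\nSection 2. Fences.",
        "Section 3. Signage rules.",
    ]
    return asyncio.run(extract_section(chunks, "system message", "filter prompt"))
```

Calling `demo_extract()` keeps only the solar-related line from the first chunk and discards the second chunk entirely. Whether the real method runs chunks concurrently or applies further filtering is not stated on this page.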
- property parsers#
Iterable of parsers provided by this extractor
- Yields:
name (str) – Name describing the type of text output by the parser.
parser (callable) – Async function that takes a text_chunks input and outputs parsed text.
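As an illustration of the (name, parser) shape described in the Yields section, here is a small sketch of how a caller might consume such a property. `DemoExtractor`, `run_parsers`, `demo_parsers`, and the output name `demo_solar_text` are all hypothetical; the stand-in class only mimics the documented interface and does not call an LLM.

```python
import asyncio


class DemoExtractor:
    """Hypothetical stand-in mirroring the documented ``parsers`` shape."""

    async def _keep_solar_chunks(self, text_chunks):
        # Placeholder for an LLM-backed extraction step
        return "\n".join(c for c in text_chunks if "solar" in c.lower())

    @property
    def parsers(self):
        # Yield (name, async parser) pairs, as the Yields section describes
        yield "demo_solar_text", self._keep_solar_chunks


async def run_parsers(extractor, text_chunks):
    """Await each parser in turn and collect its output under its name."""
    outputs = {}
    for name, parser in extractor.parsers:
        outputs[name] = await parser(text_chunks)
    return outputs


def demo_parsers():
    chunks = ["Solar farms need a permit.", "Noise limits apply to all uses."]
    return asyncio.run(run_parsers(DemoExtractor(), chunks))
```

Keying the results by the yielded name lets downstream code tell apart the outputs of several parsers without knowing which extractor produced them.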
- SYSTEM_MESSAGE = 'You are a text extraction assistant. Your job is to extract only verbatim, **unmodified** excerpts from provided legal or policy documents. Do not interpret or paraphrase. Do not summarize. Only return exactly copied segments that match the specified scope. If the relevant content appears within a table, return the entire table, including headers and footers, exactly as formatted.'#
System message for text extraction LLM calls