compass.extraction.apply.check_for_ordinance_info

async check_for_ordinance_info(doc, model_config, heuristic, ordinance_text_collector_class, permitted_use_text_collector_class=None, usage_tracker=None)

Parse a single document for ordinance information.

Parameters:
  • doc (elm.web.document.BaseDocument) – A document potentially containing ordinance information. If the document’s attrs already contains the "contains_ord_info" key, the document is not processed. To force this function to process the document, remove that key from the document’s attrs.

  • text_splitter (obj) – An object that implements a split_text method. The method should take text (str) as input and return a list of text chunks. LangChain’s text splitters should work for this input.

  • usage_tracker (compass.services.usage.UsageTracker, optional) – Optional tracker instance to monitor token usage during LLM calls. By default, None.
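The note on doc above says a document is skipped when its attrs already holds the "contains_ord_info" key. A minimal sketch of clearing that key before calling the function again is shown below. StubDocument and force_reprocessing are hypothetical names used only for illustration. The stub models nothing beyond an attrs dictionary and is not the real elm.web.document.BaseDocument.

```python
from dataclasses import dataclass, field


@dataclass
class StubDocument:
    """Hypothetical stand-in for a document; only models the `attrs` dict."""

    text: str = ""
    attrs: dict = field(default_factory=dict)


def force_reprocessing(doc):
    """Remove the cached flag so the document is no longer skipped."""
    # pop() with a default does nothing if the key is already absent.
    doc.attrs.pop("contains_ord_info", None)
    return doc


# A document that was parsed earlier and flagged as having no ordinance info.
doc = StubDocument(text="...", attrs={"contains_ord_info": False, "source": "example"})
force_reprocessing(doc)
```

Other entries in attrs are left untouched, so any metadata collected earlier survives the reset.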

Returns:

elm.web.document.BaseDocument – Document that has been parsed for ordinance text. The results of the parsing are stored in the document’s attrs. In particular, the attrs will contain a "contains_ord_info" key that is set to True if ordinance info was found in the text, and False otherwise. If True, the attrs will also contain a "date" key holding the most recent date that the ordinance was enacted (or a tuple of None if not found), and an "ordinance_text" key holding the ordinance text snippet. The snippet may contain other information as well, but should encapsulate all of the ordinance text.
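The keys described above can be read back with a small helper, sketched here under stated assumptions. read_ordinance_result is a hypothetical name, and it works on a plain dictionary standing in for the returned document's attrs. The three-element date tuple in the example is an illustrative assumption, since the exact shape of the "date" value is not specified here.

```python
def read_ordinance_result(attrs):
    """Return (found, date, text) from a parsed document's attrs mapping."""
    # A missing or False flag means no ordinance info was found.
    if not attrs.get("contains_ord_info", False):
        return False, None, None

    date = attrs.get("date")
    # The date may be a tuple of None when no enactment date was found;
    # collapse that case to a single None for easier handling.
    if isinstance(date, tuple) and all(part is None for part in date):
        date = None

    return True, date, attrs.get("ordinance_text")


# Example attrs as they might look after parsing (values are made up).
parsed_attrs = {
    "contains_ord_info": True,
    "date": (None, None, None),  # assumed shape, for illustration only
    "ordinance_text": "Example ordinance snippet.",
}
found, date, text = read_ordinance_result(parsed_attrs)
```

Checking the flag first avoids a KeyError, since "date" and "ordinance_text" are only described as present when "contains_ord_info" is True.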