compass.extraction.apply.check_for_ordinance_info
- async check_for_ordinance_info(doc, model_config, heuristic, ordinance_text_collector_class, permitted_use_text_collector_class=None, usage_tracker=None)
Parse a single document for ordinance information
- Parameters:
doc (elm.web.document.BaseDocument) – A document potentially containing ordinance information. Note that if the document’s attrs contains the "contains_ord_info" key, it will not be processed. To force a document to be processed by this function, remove that key from the document’s attrs.
text_splitter (obj) – Instance of an object that implements a split_text method. The method should take text as input (str) and return a list of text chunks. Langchain’s text splitters should work for this input.
usage_tracker (compass.services.usage.UsageTracker, optional) – Optional tracker instance to monitor token usage during LLM calls. By default, None.
- Returns:
elm.web.document.BaseDocument – Document that has been parsed for ordinance text. The results of the parsing are stored in the document’s attrs. In particular, the attrs will contain a "contains_ord_info" key that will be set to True if ordinance info was found in the text, and False otherwise. If True, the attrs will also contain a "date" key containing the most recent date that the ordinance was enacted (or a tuple of None if not found), and an "ordinance_text" key containing the ordinance text snippet. Note that the snippet may contain other info as well, but should encapsulate all of the ordinance text.
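The attrs keys described above can be read back as in the following minimal sketch. It does not call check_for_ordinance_info itself. The helper names summarize_ordinance_result and force_reprocess are hypothetical, the SimpleNamespace object is only a stand-in for an elm.web.document.BaseDocument whose attrs behaves like a dict, and the date and text values are made up for illustration (the exact date layout is an assumption).

```python
from types import SimpleNamespace


def summarize_ordinance_result(doc):
    """Describe the parse result stored in doc.attrs (hypothetical helper)."""
    attrs = doc.attrs
    if "contains_ord_info" not in attrs:
        # Key absent: the document has not been processed yet.
        return {"status": "unprocessed"}
    if not attrs["contains_ord_info"]:
        return {"status": "no_ordinance_info"}
    # "date" and "ordinance_text" are only documented when the flag is True.
    return {
        "status": "found",
        "date": attrs.get("date"),
        "ordinance_text": attrs.get("ordinance_text"),
    }


def force_reprocess(doc):
    """Drop the marker key so the document is not skipped (hypothetical helper)."""
    doc.attrs.pop("contains_ord_info", None)
    return doc


# Stand-in for a document that has already been parsed.
# The values below are illustrative assumptions, not real output.
parsed = SimpleNamespace(
    attrs={
        "contains_ord_info": True,
        "date": (2021, 6, 15),
        "ordinance_text": "Setbacks shall be at least 500 feet.",
    }
)

summary = summarize_ordinance_result(parsed)
print(summary["status"])
```

Calling force_reprocess(parsed) afterwards removes the "contains_ord_info" key, which per the parameter description is what allows the function to process the document again.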