compass.extraction.small_wind.ordinance.SmallWindHeuristic

class SmallWindHeuristic

Bases: Heuristic

Perform a heuristic check for mention of wind turbines in text

Methods

check(text[, match_count_threshold])

Check for mention of a tech in text

Attributes

GOOD_TECH_ACRONYMS

Acronyms for WECS that we want to capture

GOOD_TECH_KEYWORDS

Words that indicate we should keep a chunk for analysis

GOOD_TECH_PHRASES

Phrases that indicate text is about WECS

NOT_TECH_WORDS

Words and phrases that indicate text is NOT about WECS

NOT_TECH_WORDS = ['wind farm', 'wind energy farm', 'utility wind energy system', 'commercial wind energy system', 'rewind', 'windbreak', 'windiest', 'winds', 'windshield', 'window', 'windy', 'wind attribute', 'wind blow', 'wind break', 'wind current', 'wind damage', 'wind data', 'wind direction', 'wind draft', 'wind erosion', 'wind energy resource atlas', 'wind load', 'wind movement', 'wind orient', 'wind resource', 'wind runway', 'prevailing wind', 'downwind']

Words and phrases that indicate text is NOT about WECS

GOOD_TECH_KEYWORDS = ['wind', 'setback']

Words that indicate we should keep a chunk for analysis

GOOD_TECH_ACRONYMS = ['wecs', 'wes', 'swet', 'pwet', 'wef']

Acronyms for WECS that we want to capture

GOOD_TECH_PHRASES = ['small wecs', 'small turbine', 'small wind', 'medium wecs', 'medium turbine', 'medium wind', 'accessory wecs', 'accessory turbine', 'accessory wind', 'on-site wecs', 'on-site turbine', 'on-site wind', 'onsite wecs', 'onsite turbine', 'onsite wind', 'on-farm wecs', 'on-farm turbine', 'on-farm wind', 'distributed wecs', 'distributed turbine', 'distributed wind', 'residential wecs', 'residential turbine', 'residential wind', 'agricultural wecs', 'agricultural turbine', 'agricultural wind', 'local wecs', 'local turbine', 'local wind', 'behind-the-meter wecs', 'behind-the-meter turbine', 'behind-the-meter wind', 'front-of-meter wecs', 'front-of-meter turbine', 'front-of-meter wind', 'pwec', 'swecs', 'wind energy conversion', 'wind turbine', 'wind tower', 'wind energy system']

Phrases that indicate text is about WECS
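The first 36 entries of GOOD_TECH_PHRASES follow a regular pattern: each of twelve size or siting qualifiers is paired with "wecs", "turbine", and "wind", and six standalone phrases follow. The sketch below rebuilds the list that way. The names QUALIFIERS, NOUNS, and EXTRA_PHRASES are hypothetical and used only for illustration; this page does not say how the library actually constructs the attribute.

```python
from itertools import product

# Hypothetical names for illustration; the values are read off the
# documented GOOD_TECH_PHRASES list.
QUALIFIERS = [
    "small", "medium", "accessory", "on-site", "onsite", "on-farm",
    "distributed", "residential", "agricultural", "local",
    "behind-the-meter", "front-of-meter",
]
NOUNS = ["wecs", "turbine", "wind"]
EXTRA_PHRASES = [
    "pwec", "swecs", "wind energy conversion",
    "wind turbine", "wind tower", "wind energy system",
]

# product() varies the noun fastest, which matches the documented order.
phrases = [f"{q} {n}" for q, n in product(QUALIFIERS, NOUNS)] + EXTRA_PHRASES
```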

check(text, match_count_threshold=1)

Check for mention of a tech in text

This check first strips the text of any tech “look-alike” words (e.g. “window”, “windshield”, etc., for “wind” technology). It then checks the text for keywords, acronyms, and phrases that pertain to the tech. If enough of these are found (as dictated by match_count_threshold), this check returns True.

Parameters:
  • text (str) – Input text that may or may not mention the technology of interest.

  • match_count_threshold (int, optional) – Number of keywords that must match for the text to pass this heuristic check. Count must be strictly greater than this value. By default, 1.

Returns:

bool – True if the number of keywords/acronyms/phrases detected exceeds the match_count_threshold.
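The two-step logic described for check (strip look-alikes, then count matches against a strict threshold) can be illustrated with a minimal, self-contained sketch. This is not the library's implementation: the function name sketch_check is hypothetical, the word lists are shortened subsets of the documented attributes, and the matching rules (substring matching for keywords and phrases, whole-word matching for acronyms) are assumptions made for this example.

```python
import re

# Shortened subsets of the documented attributes, for illustration only.
NOT_TECH_WORDS = ["window", "windshield", "wind load", "prevailing wind"]
GOOD_TECH_KEYWORDS = ["wind", "setback"]
GOOD_TECH_ACRONYMS = ["wecs", "wes"]
GOOD_TECH_PHRASES = ["small wind", "wind turbine", "wind energy system"]


def sketch_check(text, match_count_threshold=1):
    """Hypothetical stand-in for the documented check behaviour."""
    cleaned = text.lower()

    # Step 1: strip look-alike words, longest first so that a longer
    # entry is removed before any shorter entry it may contain.
    for word in sorted(NOT_TECH_WORDS, key=len, reverse=True):
        cleaned = cleaned.replace(word, " ")

    # Step 2: count how many distinct keywords, acronyms, and phrases occur.
    # Assumption: acronyms must match as whole words, so that "wes"
    # does not fire on "west".
    n_keywords = sum(kw in cleaned for kw in GOOD_TECH_KEYWORDS)
    n_acronyms = sum(
        bool(re.search(rf"\b{re.escape(acr)}\b", cleaned))
        for acr in GOOD_TECH_ACRONYMS
    )
    n_phrases = sum(phrase in cleaned for phrase in GOOD_TECH_PHRASES)

    # The count must be strictly greater than the threshold.
    return (n_keywords + n_acronyms + n_phrases) > match_count_threshold
```

In this sketch, a text whose only wind-like words are "window" and "windshield" has no matches left after stripping and fails the check. A text with a single bare "wind" yields one match, which fails at the default threshold of 1 and passes only when match_count_threshold is lowered to 0.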