compass.validation.content.Heuristic

class Heuristic

Bases: ABC

Perform a heuristic check for mention of a technology in text

Methods

check(text[, match_count_threshold])

Check for mention of a tech in text

Attributes

GOOD_TECH_ACRONYMS

Iterable of acronyms that pertain to the tech

GOOD_TECH_KEYWORDS

Iterable of keywords that pertain to the tech

GOOD_TECH_PHRASES

Iterable of phrases that pertain to the tech

NOT_TECH_WORDS

Iterable of words that don't pertain to the tech

check(text, match_count_threshold=1)

Check for mention of a tech in text

This check first strips the text of any tech “look-alike” words (e.g. “window”, “windshield”, etc. for “wind” technology). It then searches the remaining text for keywords, acronyms, and phrases that pertain to the tech. If the number of matches is strictly greater than match_count_threshold, this check returns True.

Parameters:
  • text (str) – Input text that may or may not mention the technology of interest.

  • match_count_threshold (int, optional) – Match count that the text must exceed to pass this heuristic check. The number of keyword, acronym, and phrase matches must be strictly greater than this value. By default, 1.

Returns:

bool – True if the number of keywords/acronyms/phrases detected exceeds the match_count_threshold.
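The strip-then-count behavior described for check can be sketched as a standalone function. This is an illustration only, not the library's implementation: the function name heuristic_check, the substring and whole-word matching rules, and the wind term lists are all assumptions made for the example.

```python
import re


def heuristic_check(text, not_tech_words, keywords, acronyms, phrases,
                    match_count_threshold=1):
    """Hypothetical stand-in for ``Heuristic.check``; not the library's code."""
    cleaned = text.lower()

    # Step 1: strip look-alike words so "window" cannot count as "wind".
    for word in not_tech_words:
        cleaned = cleaned.replace(word.lower(), "")

    # Step 2: count keyword hits (plain substring matching is an assumption).
    matches = sum(1 for kw in keywords if kw.lower() in cleaned)

    # Acronyms are matched as whole words to avoid hits inside longer words.
    matches += sum(
        1 for ac in acronyms
        if re.search(rf"\b{re.escape(ac.lower())}\b", cleaned)
    )

    # Phrases are matched as literal substrings.
    matches += sum(1 for ph in phrases if ph.lower() in cleaned)

    # Step 3: the count must be strictly greater than the threshold.
    return matches > match_count_threshold


# Illustrative term lists for "wind" technology (assumed, not from the library).
wind_terms = dict(
    not_tech_words=["window", "windshield"],
    keywords=["wind", "turbine"],
    acronyms=["wecs"],
    phrases=["wind energy conversion"],
)

print(heuristic_check("Replace the window and the windshield.", **wind_terms))
print(heuristic_check("Setbacks for each wind turbine in a WECS.", **wind_terms))
```

Because the comparison is strict, a text with exactly one match does not pass at the default threshold of 1; under this sketch, at least two matches are needed.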

abstract property NOT_TECH_WORDS

Iterable of words that don’t pertain to the tech

Type:

iter

abstract property GOOD_TECH_KEYWORDS

Iterable of keywords that pertain to the tech

Type:

iter

abstract property GOOD_TECH_ACRONYMS

Iterable of acronyms that pertain to the tech

Type:

iter

abstract property GOOD_TECH_PHRASES

Iterable of phrases that pertain to the tech

Type:

iter
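Since all four attributes are abstract properties, a concrete subclass has to supply every one of them before it can be instantiated. The sketch below shows that mechanism with Python's abc module. HeuristicLike is a hypothetical stand-in defined here so the example runs without the package installed, and the wind term lists are illustrative assumptions.

```python
from abc import ABC, abstractmethod


class HeuristicLike(ABC):
    """Hypothetical stand-in mirroring the four abstract properties."""

    @property
    @abstractmethod
    def NOT_TECH_WORDS(self):
        """Iterable of words that don't pertain to the tech."""

    @property
    @abstractmethod
    def GOOD_TECH_KEYWORDS(self):
        """Iterable of keywords that pertain to the tech."""

    @property
    @abstractmethod
    def GOOD_TECH_ACRONYMS(self):
        """Iterable of acronyms that pertain to the tech."""

    @property
    @abstractmethod
    def GOOD_TECH_PHRASES(self):
        """Iterable of phrases that pertain to the tech."""


class WindHeuristic(HeuristicLike):
    """Concrete subclass: plain class attributes satisfy the abstract properties."""

    NOT_TECH_WORDS = ["window", "windshield"]
    GOOD_TECH_KEYWORDS = ["wind", "turbine"]
    GOOD_TECH_ACRONYMS = ["wecs"]
    GOOD_TECH_PHRASES = ["wind energy conversion"]


class IncompleteHeuristic(HeuristicLike):
    """Defines only one of the four attributes, so it stays abstract."""

    NOT_TECH_WORDS = ["window"]


heuristic = WindHeuristic()
print(heuristic.GOOD_TECH_KEYWORDS)

try:
    IncompleteHeuristic()
except TypeError as err:
    # ABC refuses to instantiate a class with unimplemented abstract members.
    print("cannot instantiate:", err)
```

Overriding with class attributes keeps the term lists declarative. Overriding with a property works as well if the lists need to be computed.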