
COMPASS 0.1.dev1 documentation



compass.web.website_crawl.Link#

class Link(*, href: str | None = '', text: str | None = '', title: str | None = '', base_domain: str | None = '', head_data: Dict[str, Any] | None = None, head_extraction_status: str | None = None, head_extraction_error: str | None = None, intrinsic_score: float | None = None, contextual_score: float | None = None, total_score: float | None = None)[source]#

Bases: Link

Crawl4AI Link subclass with a few utility properties (consistent_domain and resembles_pdf).

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Methods

construct([_fields_set])

copy(*[, include, exclude, update, deep])

Returns a copy of the model.

dict(*[, include, exclude, by_alias, ...])

from_orm(obj)

json(*[, include, exclude, by_alias, ...])

model_construct([_fields_set])

Creates a new instance of the Model class with validated data.

model_copy(*[, update, deep])

Returns a copy of the model.

model_dump(*[, mode, include, exclude, ...])

Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.

model_dump_json(*[, indent, include, ...])

Generates a JSON representation of the model using Pydantic's to_json method.

model_json_schema([by_alias, ref_template, ...])

Generates a JSON schema for a model class.

model_parametrized_name(params)

Compute the class name for parametrizations of generic classes.

model_post_init(context, /)

Override this method to perform additional initialization after __init__ and model_construct.

model_rebuild(*[, force, raise_errors, ...])

Try to rebuild the pydantic-core schema for the model.

model_validate(obj, *[, strict, ...])

Validate a pydantic model instance.

model_validate_json(json_data, *[, strict, ...])

Validate the given JSON data against the Pydantic model.

model_validate_strings(obj, *[, strict, ...])

Validate the given object with string data against the Pydantic model.

parse_file(path, *[, content_type, ...])

parse_obj(obj)

parse_raw(b, *[, content_type, encoding, ...])

schema([by_alias, ref_template])

schema_json(*[, by_alias, ref_template])

update_forward_refs(**localns)

validate(value)

Attributes

consistent_domain

True if the link is from the base domain

model_computed_fields

model_config

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

model_extra

Get extra fields set during validation.

model_fields

model_fields_set

Returns the set of fields that have been explicitly set on this model instance.

resembles_pdf

True if the link has "pdf" in title or href

href

text

title

base_domain

head_data

head_extraction_status

head_extraction_error

intrinsic_score

contextual_score

total_score

property consistent_domain#

True if the link is from the base domain

Type:

bool

property resembles_pdf#

True if the link has “pdf” in title or href

Type:

bool
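The two properties above can be approximated without COMPASS installed. The following is a minimal sketch, not the library's implementation: `LinkSketch` is a hypothetical stand-in, and both the host comparison via `urllib.parse` and the case-insensitive "pdf" check are assumptions about how such tests might be written.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class LinkSketch:
    """Hypothetical stand-in; NOT the COMPASS ``Link`` class."""

    href: str = ""
    title: str = ""
    base_domain: str = ""

    @property
    def consistent_domain(self) -> bool:
        # Assumption: compare the host of ``href`` with ``base_domain``
        host = urlparse(self.href).netloc.lower()
        return bool(self.base_domain) and host == self.base_domain.lower()

    @property
    def resembles_pdf(self) -> bool:
        # Assumption: case-insensitive substring check on title and href
        return "pdf" in self.title.lower() or "pdf" in self.href.lower()


link = LinkSketch(
    href="https://example.gov/docs/zoning.PDF",
    title="Zoning ordinance",
    base_domain="example.gov",
)
other = LinkSketch(
    href="https://other.org/index.html",
    title="Home",
    base_domain="example.gov",
)
print(link.consistent_domain, link.resembles_pdf)
print(other.consistent_domain, other.resembles_pdf)
```

A crawler could use flags like these to stay on the jurisdiction's own site and to prioritize links that look like documents; how COMPASS actually uses them is not stated on this page.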

copy(*, include: AbstractSetIntStr | MappingIntStrAny | None = None, exclude: AbstractSetIntStr | MappingIntStrAny | None = None, update: Dict[str, Any] | None = None, deep: bool = False) → Self#

Returns a copy of the model.

Deprecated: This method is now deprecated; use model_copy instead.

If you need include or exclude, use:

    data = self.model_dump(include=include, exclude=exclude, round_trip=True)
    data = {**data, **(update or {})}
    copied = self.model_validate(data)

Args:

    include: Optional set or mapping specifying which fields to include in the copied model.
    exclude: Optional set or mapping specifying which fields to exclude in the copied model.
    update: Optional dictionary of field-value pairs to override field values in the copied model.
    deep: If True, the values of fields that are Pydantic models will be deep-copied.

Returns:

A copy of the model with included, excluded and updated fields as specified.
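As a rough illustration of migrating away from the deprecated copy(), the sketch below shows both replacements on a small model. `LinkLike` is a hypothetical stand-in that mirrors a few field names from the Link signature; it is used here only because the real class requires COMPASS and Crawl4AI.

```python
from typing import Optional

from pydantic import BaseModel


class LinkLike(BaseModel):
    """Hypothetical stand-in; not the COMPASS ``Link`` class."""

    href: Optional[str] = ""
    text: Optional[str] = ""
    total_score: Optional[float] = None


link = LinkLike(href="https://example.gov/a.pdf", text="Ordinance", total_score=0.9)

# Plain copy with an override: ``model_copy`` replaces ``copy``
rescored = link.model_copy(update={"total_score": 0.5})

# Copy restricted to some fields: dump, merge updates, then re-validate
data = link.model_dump(include={"href", "text"}, round_trip=True)
data = {**data, **{"text": "Solar ordinance"}}
trimmed = LinkLike.model_validate(data)

print(rescored.total_score, trimmed.text, trimmed.total_score)
```

Because the second path goes through model_validate, the excluded field falls back to its default instead of being carried over.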

model_config: ClassVar[ConfigDict] = {}#

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

classmethod model_construct(_fields_set: set[str] | None = None, **values: Any) → Self#

Creates a new instance of the Model class with validated data.

Creates a new model setting __dict__ and __pydantic_fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed.

Note:

model_construct() generally respects the model_config.extra setting on the provided model. That is, if model_config.extra == ‘allow’, then all extra passed values are added to the model instance’s __dict__ and __pydantic_extra__ fields. If model_config.extra == ‘ignore’ (the default), then all extra passed values are ignored. Because no validation is performed with a call to model_construct(), having model_config.extra == ‘forbid’ does not result in an error if extra values are passed, but they will be ignored.

Args:
    _fields_set: A set of field names that were originally explicitly set during instantiation. If provided, this is directly used for the model_fields_set attribute. Otherwise, the field names from the values argument will be used.
    values: Trusted or pre-validated data dictionary.

Returns:

A new instance of the Model class with validated data.
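To make the "no validation is performed" point concrete, here is a small sketch comparing normal construction with model_construct(). `LinkLike` is a hypothetical stand-in with two of the fields listed for Link, not the COMPASS class itself.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class LinkLike(BaseModel):
    """Hypothetical stand-in; not the COMPASS ``Link`` class."""

    href: Optional[str] = ""
    total_score: Optional[float] = None


# Normal construction validates: a non-numeric score is rejected
try:
    LinkLike(href="https://example.gov", total_score="not a number")
    rejected = False
except ValidationError:
    rejected = True

# ``model_construct`` trusts its input, so the same value is stored as-is
unchecked = LinkLike.model_construct(
    href="https://example.gov", total_score="not a number"
)

# Defaults are still applied to fields that were not passed
partial = LinkLike.model_construct(total_score=1.0)

print(rejected, unchecked.total_score, repr(partial.href))
print(partial.model_fields_set)
```

This is why model_construct() is best reserved for data that has already been validated elsewhere.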

model_copy(*, update: Mapping[str, Any] | None = None, deep: bool = False) → Self#

Returns a copy of the model.

Note: The underlying instance's __dict__ attribute is copied. This might have unexpected side effects if you store anything in it, on top of the model fields (e.g. the value of cached properties created with functools.cached_property).

Args:
    update: Values to change/add in the new model. Note: the data is not validated before creating the new model. You should trust this data.
    deep: Set to True to make a deep copy of the model.

Returns:

New model instance.
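The difference between shallow and deep copies, and the fact that update values skip validation, can be seen in a short sketch. `LinkLike` is a hypothetical stand-in that borrows the href and head_data field names from the Link signature.

```python
from typing import Any, Dict, Optional

from pydantic import BaseModel


class LinkLike(BaseModel):
    """Hypothetical stand-in; not the COMPASS ``Link`` class."""

    href: Optional[str] = ""
    head_data: Optional[Dict[str, Any]] = None


original = LinkLike(href="https://example.gov", head_data={"lang": "en"})

shallow = original.model_copy()
deep = original.model_copy(deep=True)

# The shallow copy shares the nested dict with the original; the deep copy does not
original.head_data["lang"] = "es"
print(shallow.head_data["lang"], deep.head_data["lang"])

# ``update`` values are not validated, so a wrong type slips through
odd = original.model_copy(update={"href": 123})
print(type(odd.href).__name__)
```

When a copy must be fully independent of the original, deep=True is the safer choice.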

model_dump(*, mode: Literal['json', 'python'] | str = 'python', include: set[int] | set[str] | Mapping[int, set[int] | set[str] | Mapping[int, IncEx | bool] | Mapping[str, IncEx | bool] | bool] | Mapping[str, set[int] | set[str] | Mapping[int, IncEx | bool] | Mapping[str, IncEx | bool] | bool] | None = None, exclude: set[int] | set[str] | Mapping[int, set[int] | set[str] | Mapping[int, IncEx | bool] | Mapping[str, IncEx | bool] | bool] | Mapping[str, set[int] | set[str] | Mapping[int, IncEx | bool] | Mapping[str, IncEx | bool] | bool] | None = None, context: Any | None = None, by_alias: bool | None = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False, round_trip: bool = False, warnings: bool | Literal['none', 'warn', 'error'] = True, fallback: Callable[[Any], Any] | None = None, serialize_as_any: bool = False) → dict[str, Any]#

Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.

Args:
    mode: The mode in which to_python should run. If mode is 'json', the output will only contain JSON serializable types. If mode is 'python', the output may contain non-JSON-serializable Python objects.
    include: A set of fields to include in the output.
    exclude: A set of fields to exclude from the output.
    context: Additional context to pass to the serializer.
    by_alias: Whether to use the field's alias in the dictionary key if defined.
    exclude_unset: Whether to exclude fields that have not been explicitly set.
    exclude_defaults: Whether to exclude fields that are set to their default value.
    exclude_none: Whether to exclude fields that have a value of None.
    round_trip: If True, dumped values should be valid as input for non-idempotent types such as Json[T].
    warnings: How to handle serialization errors. False/"none" ignores them, True/"warn" logs errors, "error" raises a pydantic_core.PydanticSerializationError.
    fallback: A function to call when an unknown value is encountered. If not provided, a pydantic_core.PydanticSerializationError error is raised.
    serialize_as_any: Whether to serialize fields with duck-typing serialization behavior.

Returns:

A dictionary representation of the model.
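A short sketch of how the mode and exclude_* arguments change the result. `DocRecord` is a hypothetical model; its `retrieved` date field is not part of Link and is included only to show the difference between 'python' and 'json' mode.

```python
from datetime import date
from typing import Optional

from pydantic import BaseModel


class DocRecord(BaseModel):
    """Hypothetical model; ``retrieved`` is not a field of ``Link``."""

    href: Optional[str] = ""
    title: Optional[str] = ""
    retrieved: Optional[date] = None
    total_score: Optional[float] = None


rec = DocRecord(href="https://example.gov/a.pdf", retrieved=date(2025, 1, 31))

as_python = rec.model_dump()                   # keeps the ``date`` object
as_json = rec.model_dump(mode="json")          # renders the date as a string
only_set = rec.model_dump(exclude_unset=True)  # only fields passed explicitly
no_none = rec.model_dump(exclude_none=True)    # drops fields whose value is None

print(as_python["retrieved"], type(as_python["retrieved"]).__name__)
print(as_json["retrieved"], type(as_json["retrieved"]).__name__)
print(sorted(only_set))
print(sorted(no_none))
```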

model_dump_json(*, indent: int | None = None, include: set[int] | set[str] | Mapping[int, set[int] | set[str] | Mapping[int, IncEx | bool] | Mapping[str, IncEx | bool] | bool] | Mapping[str, set[int] | set[str] | Mapping[int, IncEx | bool] | Mapping[str, IncEx | bool] | bool] | None = None, exclude: set[int] | set[str] | Mapping[int, set[int] | set[str] | Mapping[int, IncEx | bool] | Mapping[str, IncEx | bool] | bool] | Mapping[str, set[int] | set[str] | Mapping[int, IncEx | bool] | Mapping[str, IncEx | bool] | bool] | None = None, context: Any | None = None, by_alias: bool | None = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False, round_trip: bool = False, warnings: bool | Literal['none', 'warn', 'error'] = True, fallback: Callable[[Any], Any] | None = None, serialize_as_any: bool = False) → str#

Generates a JSON representation of the model using Pydantic’s to_json method.

Args:
    indent: Indentation to use in the JSON output. If None is passed, the output will be compact.
    include: Field(s) to include in the JSON output.
    exclude: Field(s) to exclude from the JSON output.
    context: Additional context to pass to the serializer.
    by_alias: Whether to serialize using field aliases.
    exclude_unset: Whether to exclude fields that have not been explicitly set.
    exclude_defaults: Whether to exclude fields that are set to their default value.
    exclude_none: Whether to exclude fields that have a value of None.
    round_trip: If True, dumped values should be valid as input for non-idempotent types such as Json[T].
    warnings: How to handle serialization errors. False/"none" ignores them, True/"warn" logs errors, "error" raises a pydantic_core.PydanticSerializationError.
    fallback: A function to call when an unknown value is encountered. If not provided, a pydantic_core.PydanticSerializationError error is raised.
    serialize_as_any: Whether to serialize fields with duck-typing serialization behavior.

Returns:

A JSON string representation of the model.
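The following sketch contrasts the compact default with an indented dump that also drops None values. `LinkLike` is a hypothetical stand-in with two of the fields listed for Link.

```python
import json
from typing import Optional

from pydantic import BaseModel


class LinkLike(BaseModel):
    """Hypothetical stand-in; not the COMPASS ``Link`` class."""

    href: Optional[str] = ""
    total_score: Optional[float] = None


link = LinkLike(href="https://example.gov/a.pdf")

compact = link.model_dump_json()                            # single line
pretty = link.model_dump_json(indent=2, exclude_none=True)  # multi-line, no nulls

print(compact)
print(pretty)

# Both strings are ordinary JSON and can be read back with the stdlib
print(json.loads(pretty))
```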

property model_extra: dict[str, Any] | None#

Get extra fields set during validation.

Returns:

A dictionary of extra fields, or None if config.extra is not set to “allow”.

property model_fields_set: set[str]#

Returns the set of fields that have been explicitly set on this model instance.

Returns:

A set of strings representing the fields that have been set, i.e. that were not filled from defaults.
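One subtlety is that a field counts as "set" whenever it is passed explicitly, even if the value equals the default. A minimal sketch, using a hypothetical `LinkLike` stand-in rather than the real Link class:

```python
from typing import Optional

from pydantic import BaseModel


class LinkLike(BaseModel):
    """Hypothetical stand-in; not the COMPASS ``Link`` class."""

    href: Optional[str] = ""
    text: Optional[str] = ""
    title: Optional[str] = ""


# ``text`` is passed explicitly, although its value matches the default
link = LinkLike(href="https://example.gov", text="")
empty = LinkLike()

print(sorted(link.model_fields_set))
print(empty.model_fields_set)

# ``exclude_unset`` in ``model_dump`` relies on the same bookkeeping
print(link.model_dump(exclude_unset=True))
```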

classmethod model_json_schema(by_alias: bool = True, ref_template: str = '#/$defs/{model}', schema_generator: type[~pydantic.json_schema.GenerateJsonSchema] = <class 'pydantic.json_schema.GenerateJsonSchema'>, mode: ~typing.Literal['validation', 'serialization'] = 'validation') → dict[str, Any]#

Generates a JSON schema for a model class.

Args:
    by_alias: Whether to use attribute aliases or not.
    ref_template: The reference template.
    schema_generator: To override the logic used to generate the JSON schema, as a subclass of GenerateJsonSchema with your desired modifications.
    mode: The mode in which to generate the schema.

Returns:

The JSON schema for the given model class.
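As a quick illustration of what the returned dictionary looks like, the sketch below inspects a few keys of the schema for a hypothetical `LinkLike` stand-in. The schema of the real Link class will contain more properties.

```python
from typing import Optional

from pydantic import BaseModel


class LinkLike(BaseModel):
    """Hypothetical stand-in; not the COMPASS ``Link`` class."""

    href: Optional[str] = ""
    total_score: Optional[float] = None


schema = LinkLike.model_json_schema()

print(schema["title"])                          # model class name
print(schema["type"])                           # models map to JSON objects
print(sorted(schema["properties"]))             # one entry per field
print(repr(schema["properties"]["href"]["default"]))
```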

classmethod model_parametrized_name(params: tuple[type[Any], ...]) → str#

Compute the class name for parametrizations of generic classes.

This method can be overridden to achieve a custom naming scheme for generic BaseModels.

Args:
    params: Tuple of types of the class. Given a generic class Model with 2 type variables and a concrete model Model[str, int], the value (str, int) would be passed to params.

Returns:

String representing the new class where params are passed to cls as type variables.

Raises:

TypeError: Raised when trying to generate concrete names for non-generic models.

model_post_init(context: Any, /) → None#

Override this method to perform additional initialization after __init__ and model_construct. This is useful if you want to do some validation that requires the entire model to be initialized.

classmethod model_rebuild(*, force: bool = False, raise_errors: bool = True, _parent_namespace_depth: int = 2, _types_namespace: MappingNamespace | None = None) → bool | None#

Try to rebuild the pydantic-core schema for the model.

This may be necessary when one of the annotations is a ForwardRef which could not be resolved during the initial attempt to build the schema, and automatic rebuilding fails.

Args:

    force: Whether to force the rebuilding of the model schema, defaults to False.
    raise_errors: Whether to raise errors, defaults to True.
    _parent_namespace_depth: The depth level of the parent namespace, defaults to 2.
    _types_namespace: The types namespace, defaults to None.

Returns:

Returns None if the schema is already "complete" and rebuilding was not required. If rebuilding was required, returns True if rebuilding was successful, otherwise False.

classmethod model_validate(obj: Any, *, strict: bool | None = None, from_attributes: bool | None = None, context: Any | None = None, by_alias: bool | None = None, by_name: bool | None = None) → Self#

Validate a pydantic model instance.

Args:

    obj: The object to validate.
    strict: Whether to enforce types strictly.
    from_attributes: Whether to extract data from object attributes.
    context: Additional context to pass to the validator.
    by_alias: Whether to use the field's alias when validating against the provided input data.
    by_name: Whether to use the field's name when validating against the provided input data.

Raises:

ValidationError: If the object could not be validated.

Returns:

The validated model instance.
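The effect of the strict argument is easiest to see side by side. In this sketch, `LinkLike` is a hypothetical stand-in with two of the fields listed for Link; the input dictionaries are made up for illustration.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class LinkLike(BaseModel):
    """Hypothetical stand-in; not the COMPASS ``Link`` class."""

    href: Optional[str] = ""
    total_score: Optional[float] = None


# Lax mode (the default) coerces a numeric string to float
lax = LinkLike.model_validate(
    {"href": "https://example.gov", "total_score": "0.75"}
)

# Strict mode refuses the same string input
try:
    LinkLike.model_validate({"total_score": "0.75"}, strict=True)
    strict_failed = False
except ValidationError as err:
    strict_failed = True
    print(err.error_count(), "validation error(s)")

print(lax.total_score, strict_failed)
```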

classmethod model_validate_json(json_data: str | bytes | bytearray, *, strict: bool | None = None, context: Any | None = None, by_alias: bool | None = None, by_name: bool | None = None) → Self#

Validate the given JSON data against the Pydantic model.

Args:

    json_data: The JSON data to validate.
    strict: Whether to enforce types strictly.
    context: Extra variables to pass to the validator.
    by_alias: Whether to use the field's alias when validating against the provided input data.
    by_name: Whether to use the field's name when validating against the provided input data.

Returns:

The validated Pydantic model.

Raises:

ValidationError: If json_data is not a JSON string or the object could not be validated.
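A round trip through JSON, plus the malformed-input case described under Raises, might look like the sketch below. `LinkLike` is again a hypothetical stand-in, not the COMPASS Link class.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class LinkLike(BaseModel):
    """Hypothetical stand-in; not the COMPASS ``Link`` class."""

    href: Optional[str] = ""
    total_score: Optional[float] = None


link = LinkLike(href="https://example.gov/a.pdf", total_score=0.4)

# Serialize to a JSON string, then parse and validate it in one step
payload = link.model_dump_json()
restored = LinkLike.model_validate_json(payload)

# Malformed JSON is reported as a ValidationError
try:
    LinkLike.model_validate_json("{not valid json")
    bad_json_rejected = False
except ValidationError:
    bad_json_rejected = True

print(restored == link, bad_json_rejected)
```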

classmethod model_validate_strings(obj: Any, *, strict: bool | None = None, context: Any | None = None, by_alias: bool | None = None, by_name: bool | None = None) → Self#

Validate the given object with string data against the Pydantic model.

Args:

    obj: The object containing string data to validate.
    strict: Whether to enforce types strictly.
    context: Extra variables to pass to the validator.
    by_alias: Whether to use the field's alias when validating against the provided input data.
    by_name: Whether to use the field's name when validating against the provided input data.

Returns:

The validated Pydantic model.

© Copyright 2025, Alliance for Sustainable Energy, LLC.

Created using Sphinx 8.2.3.

Built with the PyData Sphinx Theme 0.16.1.