compass.services.base.LLMService

class LLMService(model_name, rate_limit, rate_tracker, service_tag=None)

Bases: Service

Base class for LLM services

This service differs from other services in that it must be used as an object, not as a class. That is, users must initialize it and pass the instance to the functions that use it.

Parameters:
  • model_name (str) – Name of model being used.

  • rate_limit (int or float) – Max usage per duration of the rate tracker. For example, if the rate tracker is set to compute the total over minute-long intervals, this value should be the max usage per minute.

  • rate_tracker (TimeBoundedUsageTracker) – A TimeBoundedUsageTracker instance. This will be used to track usage per time interval and compare to rate_limit.

  • service_tag (str, optional) – Optional tag used to distinguish this service (i.e., make it unique from other services). This must be set if multiple models with the same name are run concurrently. By default, None.
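The relationship between rate_limit and rate_tracker can be illustrated with a minimal sketch. SlidingWindowTracker and under_rate_limit below are hypothetical stand-ins written for this example; they are not the compass TimeBoundedUsageTracker, whose actual interface may differ. The sketch assumes the tracker sums usage over a fixed window and that the service compares this sum to rate_limit.

```python
import time
from collections import deque


class SlidingWindowTracker:
    """Hypothetical stand-in for a time-bounded usage tracker."""

    def __init__(self, window_seconds=60.0, clock=time.monotonic):
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the window can be tested
        self._events = deque()  # (timestamp, amount) pairs, oldest first

    def add(self, amount):
        """Record usage (e.g. tokens) at the current time."""
        self._events.append((self._clock(), amount))

    @property
    def total(self):
        """Usage recorded within the last `window_seconds`."""
        cutoff = self._clock() - self.window_seconds
        # Drop entries that have aged out of the window.
        while self._events and self._events[0][0] <= cutoff:
            self._events.popleft()
        return sum(amount for _, amount in self._events)


def under_rate_limit(tracker, rate_limit):
    """Assumed check: usage in the current window is below the limit."""
    return tracker.total < rate_limit
```

With a 60-second window, rate_limit would be the maximum usage per minute, which matches the example given for the rate_limit parameter.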

Methods

acquire_resources()

Use this method to allocate resources, if needed

call(*args, **kwargs)

Call the service

process(*args, **kwargs)

Process a call to the service.

process_using_futures(fut, *args, **kwargs)

Process a call to the service

release_resources()

Use this method to clean up resources, if needed

Attributes

MAX_CONCURRENT_JOBS

Max number of concurrent job submissions.

can_process

Check if usage is under the rate limit

name

Unique service name used to pull the correct queue

property can_process

Check if usage is under the rate limit

Type:

bool

property name

Unique service name used to pull the correct queue

Type:

str
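The sketch below illustrates why a unique name matters when several instances share a model name. TaggedService, build_queues, and the "model-tag" naming scheme are all assumptions made for this example; the actual name format used by LLMService is not documented here and may differ.

```python
import asyncio


class TaggedService:
    """Hypothetical stand-in; the real name format may differ."""

    def __init__(self, model_name, service_tag=None):
        self.model_name = model_name
        self.service_tag = service_tag

    @property
    def name(self):
        # Assumed scheme: append the tag to the model name when one is set.
        if self.service_tag is None:
            return self.model_name
        return f"{self.model_name}-{self.service_tag}"


def build_queues(services):
    """Map each service name to its own queue, rejecting duplicates."""
    queues = {}
    for service in services:
        if service.name in queues:
            raise ValueError(f"Duplicate service name: {service.name!r}")
        queues[service.name] = asyncio.Queue()
    return queues
```

Under these assumptions, two services created with the same model name and no tag would collide on a single queue, while distinct tags keep their queues separate.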

MAX_CONCURRENT_JOBS = 10000

Max number of concurrent job submissions.

acquire_resources()

Use this method to allocate resources, if needed

async call(*args, **kwargs)

Call the service

Parameters:

*args, **kwargs – Positional and keyword arguments to be passed to the underlying service processing function.

Returns:

obj – A response object from the underlying service.

abstractmethod async process(*args, **kwargs)

Process a call to the service.

Parameters:

*args, **kwargs – Positional and keyword arguments to be passed to the underlying processing function.
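Because process is abstract, a concrete subclass has to supply it. The following is a minimal sketch of that pattern using only the standard library. BaseServiceSketch, UppercaseService, and summarize are hypothetical names invented for this example and are not part of compass. For brevity the sketch awaits process directly, whereas callers of a real service would presumably go through call.

```python
import asyncio
from abc import ABC, abstractmethod


class BaseServiceSketch(ABC):
    """Hypothetical stand-in for an abstract service base class."""

    def __init__(self, model_name):
        self.model_name = model_name

    @abstractmethod
    async def process(self, *args, **kwargs):
        """Subclasses implement the actual request here."""


class UppercaseService(BaseServiceSketch):
    """Toy subclass: 'processes' a prompt by upper-casing it."""

    async def process(self, prompt, **kwargs):
        await asyncio.sleep(0)  # placeholder for a real network call
        return prompt.upper()


async def summarize(service, text):
    """Receives an initialized service instance, not the class itself."""
    return await service.process(text)


service = UppercaseService("toy-model")
answer = asyncio.run(summarize(service, "hello"))
```

Attempting to instantiate BaseServiceSketch directly raises TypeError, which is how Python enforces that process must be overridden.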

async process_using_futures(fut, *args, **kwargs)

Process a call to the service

Parameters:
  • fut (asyncio.Future) – A future object that should get the result of the processing operation. If the processing function returns answer, this method should call fut.set_result(answer).

  • *args, **kwargs – Positional and keyword arguments to be passed to the underlying processing function.
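The future-based contract described above can be sketched as follows. EchoService is a hypothetical stand-in, not the compass implementation. Forwarding exceptions to the future with set_exception is an assumption added here so that a waiting caller is not left hanging; the documentation above only specifies the set_result behavior.

```python
import asyncio


class EchoService:
    """Hypothetical stand-in; not the compass implementation."""

    async def process(self, *args, **kwargs):
        await asyncio.sleep(0)  # placeholder for a real request
        return {"args": args, "kwargs": kwargs}

    async def process_using_futures(self, fut, *args, **kwargs):
        try:
            answer = await self.process(*args, **kwargs)
        except Exception as exc:
            # Assumption: surface failures to whoever awaits the future.
            fut.set_exception(exc)
            return
        # Documented behavior: hand the answer to the future.
        fut.set_result(answer)


async def demo():
    service = EchoService()
    fut = asyncio.get_running_loop().create_future()
    await service.process_using_futures(fut, "hello", temperature=0)
    return await fut


result = asyncio.run(demo())
```

The caller only holds the future, so the processing step can be scheduled and completed elsewhere while the caller awaits the result.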

release_resources()

Use this method to clean up resources, if needed