compass.services.base.LLMService

class LLMService(model_name, rate_limit, rate_tracker, service_tag=None)

Bases: Service

Base class for LLM services

This service differs from other services in that it must be used as an object, not as a class. That is, users must initialize it and pass the instance to the functions that use it.

Parameters:

model_name (str) – Name of model being used.
rate_limit (int or float) – Max usage per duration of the rate tracker. For example, if the rate tracker is set to compute the total over minute-long intervals, this value should be the max usage per minute.
rate_tracker (TimeBoundedUsageTracker) – A TimeBoundedUsageTracker instance. This will be used to track usage per time interval and compare to rate_limit.
service_tag (str, optional) – Optional tag used to distinguish this service (i.e., make it unique from other services). Must be set if multiple models with the same name are run concurrently. By default, None.
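
The relationship between `rate_limit` and `rate_tracker` can be illustrated with a small stand-in. The sketch below does not use the real `TimeBoundedUsageTracker`. `SlidingWindowTracker`, its `add` method, its `total` property, and the `under_rate_limit` helper are hypothetical names chosen for illustration. It assumes a tracker that sums usage over a fixed trailing window and a check that passes while that sum is strictly below the limit.

```python
import time
from collections import deque


class SlidingWindowTracker:
    """Hypothetical stand-in for a time-bounded usage tracker."""

    def __init__(self, max_seconds=60, clock=time.monotonic):
        self.max_seconds = max_seconds
        self._clock = clock  # injectable so the window can be tested
        self._entries = deque()  # (timestamp, amount) pairs, oldest first

    def add(self, amount):
        """Record `amount` units of usage at the current time."""
        self._entries.append((self._clock(), amount))

    @property
    def total(self):
        """Usage recorded within the last `max_seconds` seconds."""
        cutoff = self._clock() - self.max_seconds
        # Drop entries that have aged out of the window.
        while self._entries and self._entries[0][0] <= cutoff:
            self._entries.popleft()
        return sum(amount for _, amount in self._entries)


def under_rate_limit(tracker, rate_limit):
    """Mimic a `can_process`-style check (assumed: strictly below the limit)."""
    return tracker.total < rate_limit
```

With a 60-second window, `rate_limit` here plays the role of "max usage per minute": once recorded usage in the trailing minute reaches the limit, the check fails until older entries age out.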

Methods

`acquire_resources`()	Use this method to allocate resources, if needed
`call`(*args, **kwargs)	Call the service
`process`(*args, **kwargs)	Process a call to the service.
`process_using_futures`(fut, *args, **kwargs)	Process a call to the service
`release_resources`()	Use this method to clean up resources, if needed

Attributes

`MAX_CONCURRENT_JOBS`	Max number of concurrent job submissions.
`can_process`	Check if usage is under the rate limit
`name`	Unique service name used to pull the correct queue

property can_process

Check if usage is under the rate limit

Type: bool

property name

Unique service name used to pull the correct queue

Type: str

MAX_CONCURRENT_JOBS = 10000

Max number of concurrent job submissions.

acquire_resources()

Use this method to allocate resources, if needed

async call(*args, **kwargs)

Call the service

Parameters:

*args, **kwargs – Positional and keyword arguments to be passed to the underlying service processing function.

Returns:

obj – A response object from the underlying service.

abstractmethod async process(*args, **kwargs)

Process a call to the service.

Parameters:

*args, **kwargs – Positional and keyword arguments to be passed to the underlying processing function.
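
Because `process` is abstract, a concrete service must supply it. The following is a minimal sketch of that pattern using a hypothetical stand-in base class (`SketchService`) rather than the real `LLMService`, so that it runs without the library installed. `EchoService` and its `prompt` argument are likewise invented for illustration, and the stand-in omits rate limiting entirely.

```python
import asyncio
from abc import ABC, abstractmethod


class SketchService(ABC):
    """Hypothetical stand-in mirroring only the abstract `process` hook."""

    def __init__(self, model_name, service_tag=None):
        self.model_name = model_name
        self.service_tag = service_tag

    @abstractmethod
    async def process(self, *args, **kwargs):
        """Subclasses implement the actual work here."""


class EchoService(SketchService):
    """Toy subclass: 'processes' a prompt by echoing it back."""

    async def process(self, prompt, **kwargs):
        await asyncio.sleep(0)  # placeholder for a real network call
        return f"[{self.model_name}] {prompt}"


async def main():
    # The service is created once and used as an instance, not as a class.
    service = EchoService("demo-model")
    return await service.process("hello")


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Attempting to instantiate `SketchService` directly raises `TypeError`, which is the standard behaviour of Python abstract base classes with unimplemented abstract methods.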

async process_using_futures(fut, *args, **kwargs)

Process a call to the service

Parameters:

fut (asyncio.Future) – A future object that should get the result of the processing operation. If the processing function returns answer, this method should call fut.set_result(answer).
*args, **kwargs – Positional and keyword arguments to be passed to the underlying processing function.
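
The future-based contract described above can be sketched with plain `asyncio`. This is an illustration under stated assumptions, not the library's implementation: the module-level `process` function is a hypothetical stand-in for a service's processing function, and forwarding exceptions to the future via `set_exception` is an assumption of this sketch.

```python
import asyncio


async def process(prompt):
    """Hypothetical processing function standing in for a service's `process`."""
    await asyncio.sleep(0)  # placeholder for real asynchronous work
    return prompt.upper()


async def process_using_futures(fut, *args, **kwargs):
    """Run `process` and deliver its outcome through `fut`."""
    try:
        answer = await process(*args, **kwargs)
    except Exception as exc:  # assumption: errors are forwarded to the caller
        fut.set_exception(exc)
    else:
        fut.set_result(answer)


async def main():
    fut = asyncio.get_running_loop().create_future()
    # Schedule the work; the caller only needs to hold on to the future.
    task = asyncio.create_task(process_using_futures(fut, "hello"))
    answer = await fut
    await task  # make sure the worker task has finished cleanly
    return answer


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Handing the result back through a future lets the code that submits a request and the code that processes it run in separate tasks, with the submitter simply awaiting `fut`.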

release_resources()

Use this method to clean up resources, if needed