compass.services.base.LLMService
- class LLMService(model_name, rate_limit, rate_tracker, service_tag=None)
Bases: Service
Base class for LLM services.
This service differs from other services in that it must be used as an object, not as a class. That is, users must initialize it and pass the instance to the functions that need it.
- Parameters:
model_name (str) – Name of model being used.
rate_limit (int or float) – Max usage per duration of the rate tracker. For example, if the rate tracker is set to compute the total over minute-long intervals, this value should be the max usage per minute.
rate_tracker (TimeBoundedUsageTracker) – A TimeBoundedUsageTracker instance. This will be used to track usage per time interval and compare to rate_limit.
service_tag (str, optional) – Optional tag used to distinguish this service (i.e. make it unique from other services). Must be set if multiple models with the same name are run concurrently. By default, None.
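The relationship between rate_limit and rate_tracker can be illustrated with a small stand-alone sketch. WindowedUsageTracker and under_rate_limit below are hypothetical stand-ins written for this example, not the compass TimeBoundedUsageTracker API. The strict "less than" comparison is also an assumption.

```python
from collections import deque


class WindowedUsageTracker:
    """Hypothetical stand-in for a time-bounded usage tracker (not the compass class)."""

    def __init__(self, window_seconds=60.0):
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, amount) pairs, oldest first

    def add(self, amount, now):
        """Record `amount` units of usage at time `now` (seconds)."""
        self._events.append((now, amount))

    def total(self, now):
        """Sum the usage recorded within the last `window_seconds`."""
        # Drop entries that have aged out of the window.
        while self._events and now - self._events[0][0] >= self.window_seconds:
            self._events.popleft()
        return sum(amount for _, amount in self._events)


def under_rate_limit(tracker, rate_limit, now):
    """Assumed check: usage in the current window is strictly below the limit."""
    return tracker.total(now) < rate_limit


tracker = WindowedUsageTracker(window_seconds=60.0)
rate_limit = 1000  # e.g. max usage per minute, matching the 60 s window

tracker.add(600, now=0.0)
tracker.add(500, now=10.0)
ok_at_20 = under_rate_limit(tracker, rate_limit, now=20.0)  # 1100 units in window
ok_at_65 = under_rate_limit(tracker, rate_limit, now=65.0)  # first entry expired
```

Because the window here is one minute, rate_limit is read as "max usage per minute". A tracker with a different window would change the meaning of the same number.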
Methods
acquire_resources() – Use this method to allocate resources, if needed.
call(*args, **kwargs) – Call the service.
process(*args, **kwargs) – Process a call to the service.
process_using_futures(fut, *args, **kwargs) – Process a call to the service.
release_resources() – Use this method to clean up resources, if needed.
Attributes
MAX_CONCURRENT_JOBS – Max number of concurrent job submissions.
Check if usage is under the rate limit
Unique service name used to pull the correct queue
- MAX_CONCURRENT_JOBS = 10000
Max number of concurrent job submissions.
- acquire_resources()
Use this method to allocate resources, if needed.
- async call(*args, **kwargs)
Call the service.
- Parameters:
*args, **kwargs – Positional and keyword arguments to be passed to the underlying service processing function.
- Returns:
obj – A response object from the underlying service.
- abstractmethod async process(*args, **kwargs)
Process a call to the service.
- Parameters:
*args, **kwargs – Positional and keyword arguments to be passed to the underlying processing function.
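Since process is abstract, a concrete service has to subclass the base and implement it. The sketch below shows that pattern, along with the "initialize it and pass it around" usage described above. SketchService, EchoService and ask are hypothetical names invented for this example. The stand-in base class is not the compass implementation and omits rate limiting entirely.

```python
import asyncio
from abc import ABC, abstractmethod


class SketchService(ABC):
    """Hypothetical stand-in for the base class; NOT the compass implementation."""

    def __init__(self, model_name, service_tag=None):
        self.model_name = model_name
        self.service_tag = service_tag

    @abstractmethod
    async def process(self, *args, **kwargs):
        """Subclasses implement the actual request here."""


class EchoService(SketchService):
    """Toy subclass that 'answers' by echoing the prompt in upper case."""

    async def process(self, prompt, **kwargs):
        await asyncio.sleep(0)  # placeholder for real network I/O
        return f"{self.model_name}: {prompt.upper()}"


async def ask(service, prompt):
    """Helper that receives an initialized service instance and uses it."""
    return await service.process(prompt)


# The service is created once as an object and then handed to the helper.
service = EchoService("demo-model", service_tag="echo-1")
reply = asyncio.run(ask(service, "hello"))
```

The sketch invokes process directly to stay self-contained. With the documented class, callers would normally go through call, which forwards its arguments to the processing function.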
- async process_using_futures(fut, *args, **kwargs)
Process a call to the service.
- Parameters:
fut (asyncio.Future) – A future object that should get the result of the processing operation. If the processing function returns answer, this method should call fut.set_result(answer).
**kwargs – Keyword arguments to be passed to the underlying processing function.
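The future-based contract described above can be sketched with plain asyncio. The process coroutine and the main driver here are hypothetical stand-ins. Only the fut.set_result(answer) step is taken from the description, and the real method may do more, such as error handling.

```python
import asyncio


async def process(prompt):
    """Hypothetical processing function standing in for a real LLM request."""
    await asyncio.sleep(0)  # placeholder for real network I/O
    return prompt[::-1]


async def process_using_futures(fut, *args, **kwargs):
    """Run `process` and hand its return value to the waiting future."""
    answer = await process(*args, **kwargs)
    fut.set_result(answer)


async def main():
    loop = asyncio.get_running_loop()
    fut = loop.create_future()
    # Schedule the work; the caller only holds on to the future.
    task = asyncio.create_task(process_using_futures(fut, "abc"))
    result = await fut  # resolves once set_result has been called
    await task
    return result


result = asyncio.run(main())
```

Separating the worker from the future lets whoever submitted the request await the result without knowing when or where the processing actually runs.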
- release_resources()
Use this method to clean up resources, if needed.