elm.base.ApiQueue
- class ApiQueue(url, headers, request_jsons, ignore_error=None, rate_limit=40000.0, max_retries=10)[source]
Bases:
object
Class to manage the parallel API queue and submission
- Parameters:
url (str) – OpenAI API url, typically either:
https://api.openai.com/v1/embeddings
https://api.openai.com/v1/chat/completions
headers (dict) – OpenAI API headers, typically:
{"Content-Type": "application/json",
 "Authorization": f"Bearer {openai.api_key}"}
request_jsons (list) – List of API data input, one entry typically looks like this for chat completion:
{"model": "gpt-3.5-turbo",
 "messages": [{"role": "system", "content": "You do this…"},
              {"role": "user", "content": "Do this: {}"}],
 "temperature": 0.0}
ignore_error (None | callable) – Optional callable to parse API error string. If the callable returns True, the error will be ignored, the API call will not be tried again, and the output will be an empty string.
rate_limit (float) – OpenAI API rate limit (tokens / minute). Note that the gpt-3.5-turbo limit is 90k as of 4/2023, but we’re using a large factor of safety (~1/2) because we can only count the tokens on the input side and assume the output is about the same count.
max_retries (int) – Number of times to retry an API call with an error response before raising an error.
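As a concrete illustration of these parameters, the sketch below builds typical inputs for the chat completions endpoint. The prompts, input texts, and placeholder API key are illustrative assumptions, not part of the library.

```python
# Build the typical ApiQueue inputs described above. The system prompt,
# texts, and api_key value are illustrative placeholders.
url = "https://api.openai.com/v1/chat/completions"

api_key = "sk-..."  # placeholder; use your real OpenAI API key
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {api_key}",
}

# One request JSON per task; the user prompt template is filled per input.
texts = ["first document", "second document"]
request_jsons = [
    {
        "model": "gpt-3.5-turbo",
        "messages": [
            {"role": "system", "content": "You summarize text."},
            {"role": "user", "content": f"Summarize this: {text}"},
        ],
        "temperature": 0.0,
    }
    for text in texts
]
```

These inputs map directly onto the constructor: ApiQueue(url, headers, request_jsons).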
Methods
Collect asynchronous API calls and API outputs.
run() – Run all asynchronous API calls.
submit_jobs() – Submit a subset of jobs asynchronously and hold jobs in the api_jobs attribute.
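The retry behavior governed by the ignore_error and max_retries parameters can be sketched as below. The helper name and loop structure are hypothetical, not the actual ApiQueue internals.

```python
def call_with_retries(call, ignore_error=None, max_retries=10):
    """Sketch of the documented retry logic: retry a failing call up to
    max_retries times, returning "" for errors the ignore_error callable
    says to ignore. Hypothetical helper, not the real implementation."""
    for _ in range(max_retries):
        try:
            return call()
        except Exception as err:
            # If the error parser returns True, ignore the error: do not
            # retry, and emit an empty string as the output.
            if ignore_error is not None and ignore_error(str(err)):
                return ""
    raise RuntimeError(f"API call failed after {max_retries} retries")
```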
Attributes
- property waiting_on
Get a list of async jobs that are being waited on.
- submit_jobs()[source]
Submit a subset of jobs asynchronously and hold jobs in the api_jobs attribute. Break when the rate_limit is exceeded.
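A simplified sketch of this token-budget batching follows, assuming a rough 4-characters-per-token estimate on the serialized request. The function names and the estimate are illustrative assumptions, not the actual submit_jobs implementation.

```python
import json

def estimate_tokens(request_json):
    """Rough token count for one request payload (~4 chars per token)."""
    return len(json.dumps(request_json)) // 4

def next_batch(request_jsons, start, rate_limit):
    """Take jobs from `start` until the token budget is exceeded."""
    batch, tokens = [], 0
    for i in range(start, len(request_jsons)):
        tokens += estimate_tokens(request_jsons[i])
        # Break once the budget is exceeded, but always take at least
        # one job so a single large request cannot stall the queue.
        if batch and tokens > rate_limit:
            break
        batch.append(i)
    return batch

jobs = [{"messages": [{"role": "user", "content": "x" * 400}]}] * 10
print(next_batch(jobs, 0, rate_limit=300))  # → [0, 1]
```

Remaining jobs stay queued and are picked up by the next submission round once outstanding calls complete.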