elm.wizard.EnergyWizard
- class EnergyWizard(corpus, model=None, token_budget=3500, ref_col=None)[source]
Bases: EnergyWizardBase
Interface to ask OpenAI LLMs about energy research.
This class is for execution on a local machine with a vector database in memory.
- Parameters:
corpus (pd.DataFrame) – Corpus of text in dataframe format. Must have columns “text” and “embedding”.
model (str) – GPT model name, default is the DEFAULT_MODEL global var
token_budget (int) – Number of tokens that can be embedded in the prompt. Note that the default budget for GPT-3.5-Turbo is 4096, but you want to subtract some tokens to account for the response budget.
ref_col (None | str) – Optional column label in the corpus that provides a reference text string for each chunk of text.
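A minimal construction sketch (the corpus contents, the "ref" column, and the placeholder embedding below are illustrative only; real embeddings must come from the same embedding model used at query time):

    import pandas as pd
    from elm.wizard import EnergyWizard

    corpus = pd.DataFrame({
        "text": ["Example chunk of energy research text."],
        "embedding": [[0.01, -0.02, 0.03]],  # placeholder vector for illustration
        "ref": ["Example Report (https://example.com/report)"],
    })

    wizard = EnergyWizard(corpus, token_budget=3500, ref_col="ref")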
Methods
call_api(url, headers, request_json) – Make an asynchronous OpenAI API call.
call_api_async(url, headers, all_request_jsons) – Make parallel asynchronous calls to the OpenAI API.
chat(query[, debug, stream, temperature, ...]) – Answers a query by doing a semantic search of relevant text with embeddings and then sending the engineered query to the LLM.
clear() – Clear chat history and reduce messages to just the initial model role message.
cosine_dist(query_embedding) – Compute the cosine distance of the query embedding array vs. all embedding arrays of the full text corpus.
count_tokens(text, model[, fallback_model]) – Return the number of tokens in a string.
engineer_query(query[, token_budget, ...]) – Engineer a query for GPT using the corpus of information.
generic_async_query(queries[, model_role, ...]) – Run a number of generic single queries asynchronously (not conversational).
generic_query(query[, model_role, temperature]) – Ask a generic single query without conversation.
get_embedding(text) – Get the 1D array (list) embedding of a text string.
make_ref_list(idx) – Make a reference list.
preflight_corpus(corpus[, required]) – Run preflight checks on the text corpus.
query_vector_db(query[, limit]) – Returns a list of strings and relatednesses, sorted from most related to least.
Attributes
DEFAULT_MODEL – Default model to do pdf text cleaning.
EMBEDDING_MODEL – Default model to do text embeddings.
EMBEDDING_URL – OpenAI embedding API URL.
HEADERS – OpenAI API Headers.
MODEL_INSTRUCTION – Prefix to the engineered prompt.
MODEL_ROLE – High level model role, somewhat redundant to MODEL_INSTRUCTION.
TOKENIZER_ALIASES – Optional mappings for unusual Azure names to tiktoken/openai names.
TOKENIZER_PATTERNS – Order-prioritized list of model sub-strings to look for in model name to send to tokenizer.
URL – OpenAI API URL to be used with environment variable OPENAI_API_KEY.
all_messages_txt – Get a string printout of the full conversation with the LLM.
- static preflight_corpus(corpus, required=('text', 'embedding'))[source]
Run preflight checks on the text corpus.
- Parameters:
corpus (pd.DataFrame) – Corpus of text in dataframe format. Must have columns “text” and “embedding”.
required (list | tuple) – Column names required to be in the corpus df
- Returns:
corpus (pd.DataFrame) – Corpus of text in dataframe format. Must have columns “text” and “embedding”.
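Usage sketch (static method, so it can be called before constructing the wizard; the check fails if the required columns are absent):

    corpus = EnergyWizard.preflight_corpus(corpus, required=("text", "embedding"))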
- cosine_dist(query_embedding)[source]
Compute the cosine distance of the query embedding array vs. all of the embedding arrays of the full text corpus
- Parameters:
query_embedding (np.ndarray) – 1D array of the numerical embedding of the request query.
- Returns:
out (np.ndarray) – 1D array with length equal to the number of entries in the text corpus. Each value is a distance score where smaller is closer
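For intuition, a NumPy sketch of an equivalent computation (assuming unit-normalized embeddings; an illustration, not necessarily the exact implementation):

    import numpy as np

    emb = np.vstack(corpus["embedding"].values)             # (n_chunks, dim)
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)  # unit-normalize rows
    q = query_embedding / np.linalg.norm(query_embedding)
    out = 1 - emb.dot(q)                                    # smaller is closer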
- query_vector_db(query, limit=100)[source]
Returns a list of strings and relatednesses, sorted from most related to least.
- Parameters:
query (str) – Question being asked of GPT
limit (int) – Number of top results to return.
- Returns:
strings (np.ndarray) – 1D array of related strings
score (np.ndarray) – 1D array of float scores of strings
idx (np.ndarray) – 1D array of indices in the text corpus corresponding to the ranked strings/scores outputs.
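Usage sketch (the wizard instance and question are illustrative):

    strings, score, idx = wizard.query_vector_db(
        "How do wind turbines generate electricity?", limit=5)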
- make_ref_list(idx)[source]
Make a reference list
- Parameters:
idx (np.ndarray) – Indices of the used text from the text corpus
- Returns:
ref_list (list) – A list of references (strs) used. This takes information straight from ref_col. Ideally, this is something like: ["{ref_title} ({ref_url})"]
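Usage sketch, assuming idx comes from query_vector_db:

    strings, score, idx = wizard.query_vector_db("offshore wind costs", limit=5)
    ref_list = wizard.make_ref_list(idx)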
- DEFAULT_MODEL = 'gpt-3.5-turbo'
Default model to do pdf text cleaning.
- EMBEDDING_MODEL = 'text-embedding-ada-002'
Default model to do text embeddings.
- EMBEDDING_URL = 'https://api.openai.com/v1/embeddings'
OpenAI embedding API URL
- HEADERS = {'Authorization': 'Bearer None', 'Content-Type': 'application/json', 'api-key': 'None'}
OpenAI API Headers
- MODEL_INSTRUCTION = 'Use the information below to answer the subsequent question. If the answer cannot be found in the text, write "I could not find an answer."'
Prefix to the engineered prompt
- MODEL_ROLE = 'You parse through articles to answer questions.'
High level model role, somewhat redundant to MODEL_INSTRUCTION
- TOKENIZER_ALIASES = {'gpt-35-turbo': 'gpt-3.5-turbo', 'gpt-4-32k': 'gpt-4-32k-0314', 'llmev-gpt-4-32k': 'gpt-4-32k-0314', 'wetosa-gpt-4': 'gpt-4', 'wetosa-gpt-4-standard': 'gpt-4', 'wetosa-gpt-4o': 'gpt-4o'}
Optional mappings for unusual Azure names to tiktoken/openai names.
- TOKENIZER_PATTERNS = ('gpt-4o', 'gpt-4-32k', 'gpt-4')
Order-prioritized list of model sub-strings to look for in model name to send to tokenizer. As an alternative to alias lookup, this will use the tokenizer pattern if found in the model string
- URL = 'https://api.openai.com/v1/chat/completions'
OpenAI API URL to be used with environment variable OPENAI_API_KEY. Use an Azure API endpoint to trigger Azure usage along with environment variables AZURE_OPENAI_KEY, AZURE_OPENAI_VERSION, and AZURE_OPENAI_ENDPOINT
- property all_messages_txt
Get a string printout of the full conversation with the LLM
- Returns:
str
- async static call_api(url, headers, request_json)
Make an asynchronous OpenAI API call.
- Parameters:
url (str) – OpenAI API URL, typically either https://api.openai.com/v1/embeddings or https://api.openai.com/v1/chat/completions
headers (dict) – OpenAI API headers, typically: {"Content-Type": "application/json", "Authorization": f"Bearer {openai.api_key}"}
request_json (dict) – API data input, which typically looks like this for chat completion: {"model": "gpt-3.5-turbo", "messages": [{"role": "system", "content": "You do this…"}, {"role": "user", "content": "Do this: {}"}], "temperature": 0.0}
- Returns:
out (dict) – API response in json format
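A sketch of calling this coroutine from synchronous code (payload values are illustrative; from inside a running event loop, use await instead of asyncio.run):

    import asyncio

    out = asyncio.run(EnergyWizard.call_api(
        url="https://api.openai.com/v1/chat/completions",
        headers={"Content-Type": "application/json",
                 "Authorization": "Bearer <your-api-key>"},
        request_json={"model": "gpt-3.5-turbo",
                      "messages": [{"role": "user", "content": "Hello"}],
                      "temperature": 0.0}))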
- async call_api_async(url, headers, all_request_jsons, ignore_error=None, rate_limit=40000.0)
Make parallel asynchronous calls to the OpenAI API.
NOTE: you need to call this using the await command in ipython or jupyter, e.g.: out = await wizard.call_api_async(...)
- Parameters:
url (str) – OpenAI API URL, typically either https://api.openai.com/v1/embeddings or https://api.openai.com/v1/chat/completions
headers (dict) – OpenAI API headers, typically: {"Content-Type": "application/json", "Authorization": f"Bearer {openai.api_key}"}
all_request_jsons (list) – List of API data inputs; one entry typically looks like this for chat completion: {"model": "gpt-3.5-turbo", "messages": [{"role": "system", "content": "You do this…"}, {"role": "user", "content": "Do this: {}"}], "temperature": 0.0}
ignore_error (None | callable) – Optional callable to parse API error string. If the callable returns True, the error will be ignored, the API call will not be tried again, and the output will be an empty string.
rate_limit (float) – OpenAI API rate limit (tokens / minute). Note that the gpt-3.5-turbo limit is 90k as of 4/2023, but we’re using a large factor of safety (~1/2) because we can only count the tokens on the input side and assume the output is about the same count.
- Returns:
out (list) – List of API outputs where each list entry is a GPT answer from the corresponding message in the all_request_jsons input.
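Usage sketch in ipython/jupyter, where an event loop is already running (request payloads are illustrative; URL and HEADERS are the class attributes documented below):

    requests = [{"model": "gpt-3.5-turbo",
                 "messages": [{"role": "user", "content": q}],
                 "temperature": 0.0}
                for q in ("Define LCOE.", "Define capacity factor.")]
    out = await wizard.call_api_async(wizard.URL, wizard.HEADERS, requests)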
- chat(query, debug=True, stream=True, temperature=0, convo=False, token_budget=None, new_info_threshold=0.7, print_references=False, return_chat_obj=False)
Answers a query by doing a semantic search of relevant text with embeddings and then sending the engineered query to the LLM.
- Parameters:
query (str) – Question being asked of EnergyWizard
debug (bool) – Flag to return extra diagnostics on the engineered question.
stream (bool) – Flag to print subsequent chunks of the response in a streaming fashion
temperature (float) – GPT model temperature, a measure of response entropy from 0 to 1. 0 is more reliable and nearly deterministic; 1 gives the model more creative freedom and may return less factual results.
convo (bool) – Flag to perform semantic search with full conversation history (True) or just the single query (False). Call EnergyWizard.clear() to reset the chat history.
token_budget (int) – Option to override the class init token budget.
new_info_threshold (float) – New text added to the engineered query must contain at least this much new information. This helps prevent (for example) the table of contents being added multiple times.
print_references (bool) – Flag to print references if EnergyWizard is initialized with a valid ref_col.
return_chat_obj (bool) – Flag to only return the ChatCompletion from OpenAI API.
- Returns:
response (str) – GPT output / answer.
query (str) – If debug is True, the engineered query asked of GPT will also be returned here
references (list) – If debug is True, the list of references (strs) used in the engineered prompt is returned here
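Usage sketch (the question is illustrative; with debug=True the engineered query and reference list are returned alongside the answer):

    response, query, references = wizard.chat(
        "What materials are wind turbine blades made of?",
        debug=True, stream=False, print_references=True)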
- clear()
Clear chat history and reduce messages to just the initial model role message.
- classmethod count_tokens(text, model, fallback_model='gpt-4')
Return the number of tokens in a string.
- Parameters:
text (str) – Text string to get number of tokens for
model (str) – specification of OpenAI model to use (e.g., “gpt-3.5-turbo”)
fallback_model (str, default='gpt-4') – Model to be used for tokenizer if input model can't be found in TOKENIZER_ALIASES and doesn't have any easily noticeable patterns.
- Returns:
n (int) – Number of tokens in text
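Usage sketch (classmethod, so no instance is required):

    n = EnergyWizard.count_tokens("How many tokens is this sentence?",
                                  model="gpt-3.5-turbo")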
- engineer_query(query, token_budget=None, new_info_threshold=0.7, convo=False)
Engineer a query for GPT using the corpus of information
- Parameters:
query (str) – Question being asked of GPT
token_budget (int) – Option to override the class init token budget.
new_info_threshold (float) – New text added to the engineered query must contain at least this much new information. This helps prevent (for example) the table of contents being added multiple times.
convo (bool) – Flag to perform semantic search with full conversation history (True) or just the single query (False). Call EnergyWizard.clear() to reset the chat history.
- Returns:
message (str) – Engineered question to GPT including information from corpus and the original query
references (list) – The list of references (strs) used in the engineered prompt is returned here
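Usage sketch (the question is illustrative):

    message, references = wizard.engineer_query(
        "What are typical offshore wind capacity factors?",
        token_budget=3000)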
- async generic_async_query(queries, model_role=None, temperature=0, ignore_error=None, rate_limit=40000.0)
Run a number of generic single queries asynchronously (not conversational)
NOTE: you need to call this using the await command in ipython or jupyter, e.g.: out = await wizard.generic_async_query(...)
- Parameters:
queries (list) – Questions to ask ChatGPT (list of strings)
model_role (str | None) – Role for the model to take, e.g.: “You are a research assistant”. This defaults to self.MODEL_ROLE
temperature (float) – GPT model temperature, a measure of response entropy from 0 to 1. 0 is more reliable and nearly deterministic; 1 gives the model more creative freedom and may return less factual results.
ignore_error (None | callable) – Optional callable to parse API error string. If the callable returns True, the error will be ignored, the API call will not be tried again, and the output will be an empty string.
rate_limit (float) – OpenAI API rate limit (tokens / minute). Note that the gpt-3.5-turbo limit is 90k as of 4/2023, but we’re using a large factor of safety (~1/2) because we can only count the tokens on the input side and assume the output is about the same count.
- Returns:
response (list) – Model responses with same length as query input.
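Usage sketch in ipython/jupyter, where an event loop is already running (questions and role are illustrative):

    queries = ["What does LCOE stand for?",
               "What is a capacity factor?"]
    responses = await wizard.generic_async_query(
        queries, model_role="You are a research assistant", temperature=0)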
- generic_query(query, model_role=None, temperature=0)
Ask a generic single query without conversation
- Parameters:
query (str) – Question to ask ChatGPT
model_role (str | None) – Role for the model to take, e.g.: “You are a research assistant”. This defaults to self.MODEL_ROLE
temperature (float) – GPT model temperature, a measure of response entropy from 0 to 1. 0 is more reliable and nearly deterministic; 1 gives the model more creative freedom and may return less factual results.
- Returns:
response (str) – Model response
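Usage sketch (question and role are illustrative):

    response = wizard.generic_query(
        "What does LCOE stand for?",
        model_role="You are a research assistant")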
- classmethod get_embedding(text)
Get the 1D array (list) embedding of a text string.
- Parameters:
text (str) – Text to embed
- Returns:
embedding (list) – List of float that represents the numerical embedding of the text
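Usage sketch (classmethod, so no instance is required; input text is illustrative):

    embedding = EnergyWizard.get_embedding("wind turbine blade materials")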