AI Models

Introduction

ESBMC-AI supports a variety of LLMs. The built-in models consist of a collection of OpenAI models and open-source models. Because the open-source models are hosted for free on Hugging Face, the ones chosen for inclusion in ESBMC-AI are smaller than the proprietary models, so their performance will generally be poorer.

For this reason, ESBMC-AI also allows custom LLMs hosted on Hugging Face's Text Generation Inference server to be defined and loaded, so self-hosted large language models can be used seamlessly with ESBMC-AI.

Built-In LLMs

The following section describes the built-in models that are shipped with ESBMC-AI.

OpenAI

The following models require the OPENAI_API_KEY environment variable to be set.

  • gpt-3.5-turbo
  • gpt-3.5-turbo-16k
  • gpt-4
  • gpt-4-32k
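
For example, the key can be exported in the shell before launching ESBMC-AI (a minimal sketch; the key value is a placeholder):

export OPENAI_API_KEY="sk-..."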

HF Text Generation Inference

The following models require the HUGGINGFACE_API_KEY environment variable to be set. They use text-generation-inference on the backend.

  • falcon-7b: Made by Technology Innovation Institute (TII).
  • starchat-beta: Made by Hugging Face.
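
As described under Custom LLM below, this key is read from the .env file; the relevant line looks like this (a minimal sketch; the token value is a placeholder):

HUGGINGFACE_API_KEY=hf_...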

Custom LLM

ESBMC-AI supports custom AI models that are hosted using the Text Generation Inference server from Hugging Face. These are specified in config.json inside the ai_custom field, while the ai_model field selects which AI model to use. If the selected model is not one of the built-in models, the list inside ai_custom is checked. Consequently, all the entries inside ai_custom must have unique names that do not match any of the built-in, first-class AI models. The entry's key is the name of the AI model, and the entry takes the following fields:

  • max_tokens: the maximum number of tokens that the AI can accept.
  • url: the text-generation-inference server URL at which the LLM is hosted. The HUGGINGFACE_API_KEY in the .env file will be used for access.
  • config_message: a node containing the fields template, human, ai, and system. These fields are further explained in the Template Variables section.
  • stop_sequences: shown in the examples below, a list of strings at which the model stops generating, such as the end-of-message token of the chat format.
  • All of these values depend on the LLM that is implemented. A sketch of how the fields nest inside config.json follows this list.
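
The nesting can be sketched as follows; the field names are taken from this page, while the entry name and values are illustrative placeholders:

{
  "ai_model": "my-model",
  "ai_custom": {
    "my-model": {
      "max_tokens": 1024,
      "url": "https://example.com",
      "config_message": { ... },
      "stop_sequences": []
    }
  }
}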

Template Variables

The config_message entries contain f-string template variables that ESBMC-AI replaces with data during execution. The templating is done before every request to the LLM.

Conversation Template (config_message.template)

The conversation template is the custom configuration message that wraps the individual messages; it is used to convert a text-generation LLM into a chat LLM.

  • {history}: The previous messages of the conversation.
  • {user_prompt}: The last message from the user to the AI.

Message Templates (config_message.system, config_message.human, and config_message.ai)

Templates applied to each message before it is inserted into the conversation template.

  • {content}: The content of each message will replace the token.
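
Putting the two levels of templating together: suppose the message templates are "Human: {content}\n" and "AI: {content}\n" and the conversation template is "{history}\n\n{user_prompt}\n\n", as in the example entry in the next section. A short conversation would then be assembled roughly as follows (an illustrative sketch; the message contents are made up, and it is assumed here that the last user message is also formatted with the human template):

Human: Hello, can you help me verify a program?
AI: Of course, please paste the source code.

Human: Here is the code.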

Examples

To run a given LLM configuration, set the ai_model entry in config.json to the name of the entry. Alternatively, the argument -m <name> can be passed to ESBMC-AI when running the program.

Template

Here is a template example entry for ai_custom. The conversation template specified is simply the identity of the conversation: it inserts the previous messages, two new lines, the user's prompt, and then another two new lines so that the LLM can complete the assistant response. It does not manipulate the conversation at all.

The message templates insert the name of the entity before the content of the message, along with a new line after it. So, for example, a human message “What's wrong with this code?” will be substituted with “Human: What's wrong with this code?\n”.

In addition, the configuration entry specifies that the maximum number of tokens the model accepts is 1024, along with the URL where it is hosted and its config message.

"example-llm": {
  "max_tokens": 1024,
  "url": "www.example.com",
  "config_message": {
    "system": "System: {content}\n",
    "human": "Human: {content}\n",
    "ai": "AI: {content}\n",
    "template": "{history}\n\n{user_prompt}\n\n"
  },
  "stop_sequences": []
}

Starchat Beta

Here is an example entry for the ai_custom field in the config.json that implements the same behavior as the starchat-beta built-in model:

"my-starchat-beta": {
  "max_tokens": 1024,
  "url": "https://api-inference.huggingface.co/models/HuggingFaceH4/starchat-beta",
  "config_message": {
    "system": "<|system|>\n{content}\n<|end|>",
    "human": "<|user|>\n{content}\n<|end|>",
    "ai": "<|assistant|>{content}",
    "template": "{history}\n\n{user_prompt}\n\n<|assistant|>"
  },
  "stop_sequences": ["<|end|>"]
}
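
For reference, with this entry a short conversation would be assembled roughly like the following before being sent to the server (an illustrative sketch; the message contents are made up, and how the history messages are joined and the last user message formatted is assumed):

<|system|>
You are a helpful programming assistant.
<|end|>
<|user|>
What's wrong with this code?
<|end|>
<|assistant|>The loop bound is off by one.

<|user|>
Can you fix it?
<|end|>

<|assistant|>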

To run my-starchat-beta, the name can be assigned to the ai_model field in the config, or alternatively specified using the -m my-starchat-beta parameter. For example: ./esbmc_ai.py -m my-starchat-beta path/to/file.
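
For the config-file route, the corresponding line in config.json would look like this (a minimal sketch showing only the ai_model field):

"ai_model": "my-starchat-beta"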