New GNU ELPA package: llm - Interface to pluggable llm backends
From: Sacha Chua
Subject: New GNU ELPA package: llm - Interface to pluggable llm backends
Date: Mon, 25 Sep 2023 08:46:58 -0400
Summary: Interface to pluggable llm backends
Requires: request-0.3.3, emacs-28.1
Website: https://github.com/ahyatt/llm
Maintainer: Andrew Hyatt <ahyatt@gmail.com>
Author: Andrew Hyatt <ahyatt@gmail.com>
━━━━━━━━━━━━━━━━━━━━━━━
LLM PACKAGE FOR EMACS
━━━━━━━━━━━━━━━━━━━━━━━
1 Introduction
══════════════
This is a library for interfacing with Large Language Models. It
allows elisp code to use LLMs while giving the end-user an option to
choose which LLM they would prefer. This is especially useful for
LLMs, since there are various high-quality ones for which API access
costs money, as well as locally installed ones that are free but of
medium quality. Applications using LLMs can use this library to make
sure their application works regardless of whether the user has a
local LLM or is paying for API access.
The functionality supported by LLMs is not completely consistent, nor
are their APIs. In this library we attempt to abstract functionality
to a higher level, because sometimes those higher-level concepts are
supported directly by an API, and other times they must be expressed
in more low-level concepts. One such higher-level concept is
"examples", where the client can show example interactions to
demonstrate a pattern for the LLM. The GCloud Vertex API has an
explicit API for examples, but for Open AI's API, examples must be
specified by modifying the system prompt. Open AI has the concept of
a system prompt, whereas the Vertex API does not. These are the kinds
of API differences we attempt to hide by having higher-level concepts
in our API.
Some functionality may not be supported by LLMs. Any unsupported
functionality will signal a `not-implemented' error.
This package is simple at the moment, but will grow as both LLMs and
functionality are added.
2 Setting up providers
══════════════════════
Users of an application that uses this package should not need to
install it directly. The llm module should be installed as a
dependency when you install the package that uses it. You do need to
both require and set up the provider you will be using. Typically,
applications will have a variable you can set. For example, let's say
there's a package called "llm-refactoring", which has a variable
`llm-refactoring-provider'. You would set it up like so:
┌────
│ (use-package llm-refactoring
│   :init
│   (require 'llm-openai)
│   (setq llm-refactoring-provider (make-llm-openai :key my-openai-key)))
└────
Here `my-openai-key' would be a variable you set up before with your
Open AI key, or you can substitute the key itself as a string. It's
important never to check your key into a public repository such as
GitHub, because your key must be kept private.
Anyone with your key can use the API, and you will be charged.
2.1 Open AI
───────────
You can set up a provider with `make-llm-openai', which takes the
following parameters:
• `:key': the Open AI key that you get when you sign up to use Open
  AI's APIs. Remember to keep this private. This is required.
• `:chat-model': A model name from the [list of Open AI's model
names.] Keep in mind some of these are not available to everyone.
This is optional, and will default to a reasonable 3.5 model.
• `:embedding-model': A model name from [list of Open AI's embedding
model names.] This is optional, and will default to a reasonable
model.
[list of Open AI's model names.]
<https://platform.openai.com/docs/models/gpt-4>
[list of Open AI's embedding model names.]
<https://platform.openai.com/docs/guides/embeddings/embedding-models>
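For example, a provider with explicit models could be created like
this (a minimal sketch; the model names shown are only examples, and
`my-openai-key' is assumed to hold your key):
┌────
│ ;; A sketch of an Open AI provider with explicit models.  The model
│ ;; names are examples and may not be available to every account.
│ (require 'llm-openai)
│ (setq my-openai-provider
│       (make-llm-openai :key my-openai-key
│                        :chat-model "gpt-4"
│                        :embedding-model "text-embedding-ada-002"))
└────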
2.2 Vertex
──────────
You can set up a provider with `make-llm-vertex', which takes the
following parameters:
• `:project': Your project number from Google Cloud that has Vertex
API enabled.
• `:chat-model': A model name from the [list of Vertex's model names.]
This is optional, and will default to a reasonable model.
• `:embedding-model': A model name from the [list of Vertex's
embedding model names.] This is optional, and will default to a
reasonable model.
In addition to the provider (of which you may want several, for
example to charge against different projects), there are customizable
variables:
• `llm-vertex-gcloud-binary': The binary to use for generating the API
key.
• `llm-vertex-gcloud-region': The gcloud region to use. It's good to
set this to a region near where you are for best latency. Defaults
to "us-central1".
[list of Vertex's model names.]
<https://cloud.google.com/vertex-ai/docs/generative-ai/chat/chat-prompts#supported_model>
[list of Vertex's embedding model names.]
<https://cloud.google.com/vertex-ai/docs/generative-ai/embeddings/get-text-embeddings#supported_models>
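As a minimal sketch, a Vertex provider might be configured like this
(the project number is a placeholder, and the region is only an
example):
┌────
│ ;; A sketch of a Vertex provider.  "12345" stands in for your Google
│ ;; Cloud project number.
│ (require 'llm-vertex)
│ (setq llm-vertex-gcloud-region "us-central1")
│ (setq my-vertex-provider (make-llm-vertex :project "12345"))
└────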
2.3 Fake
────────
This is a client that makes no calls, but is just there for testing
and debugging. Mostly this is of use to programmatic clients of the
llm package, but end users can also use it to understand what will be
sent to the LLMs. It has the following parameters:
• `:output-to-buffer': if non-nil, the buffer or buffer name to append
  the request sent to the LLM to.
• `:chat-action-func': a function that will be called to provide
  either a string response, or a cons of a symbol and message which is
  used to signal an error.
• `:embedding-action-func': a function that will be called to provide
  either a vector embedding, or a cons of a symbol and message which
  is used to signal an error.
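A minimal sketch of a fake provider for debugging, assuming the action
function is called with no arguments and its return value is used as
the response:
┌────
│ ;; A sketch of a fake provider.  Requests are appended to the
│ ;; *llm-debug* buffer, and every chat gets a canned response.
│ (require 'llm-fake)
│ (setq my-fake-provider
│       (make-llm-fake :output-to-buffer "*llm-debug*"
│                      :chat-action-func (lambda () "Mock response")))
└────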
3 `llm' and the use of non-free LLMs
════════════════════════════════════
The `llm' package is part of GNU Emacs by being part of GNU ELPA.
Unfortunately, the most popular LLMs in use are non-free, which is not
what GNU software should be promoting by inclusion. On the other
hand, by use of the `llm' package, the user can make sure that any
client that codes against it will work with free models that come
along. It's likely that sophisticated free LLMs will emerge,
although it's unclear right now what free software means with respect
to LLMs. Because of this tradeoff, we have decided to warn the user
when using non-free LLMs (which is every LLM supported right now
except the fake one). You can turn this off the same way you turn off
any other warning, by clicking on the left arrow next to the warning
when it comes up. Alternatively, you can set `llm-warn-on-nonfree' to
`nil'. This can be set via customization as well.
To build upon the example from before:
┌────
│ (use-package llm-refactoring
│   :init
│   (require 'llm-openai)
│   (setq llm-refactoring-provider (make-llm-openai :key my-openai-key)
│         llm-warn-on-nonfree nil))
└────
4 Programmatic use
══════════════════
Client applications should require the `llm' package, and code against
it. Most functions are generic, and take a struct representing a
provider as the first argument. The client code, or the user
themselves can then require the specific module, such as `llm-openai',
and create a provider with a function such as `(make-llm-openai :key
user-api-key)'. The client application will use this provider to call
all the generic functions.
A list of all the functions:
• `llm-chat provider prompt': With the user-chosen `provider' and an
  `llm-chat-prompt' structure (containing context, examples,
  interactions, and parameters such as temperature and max tokens),
  send that prompt to the LLM and wait for the string output.
• `llm-chat-async provider prompt response-callback error-callback':
Same as `llm-chat', but executes in the background. Takes a
`response-callback' which will be called with the text response.
The `error-callback' will be called in case of error, with the error
symbol and an error message.
• `llm-embedding provider string': With the user-chosen `provider',
send a string and get an embedding, which is a large vector of
floating point values. The embedding represents the semantic
meaning of the string, and the vector can be compared against other
vectors, where smaller distances between the vectors represent
greater semantic similarity.
• `llm-embedding-async provider string vector-callback
error-callback': Same as `llm-embedding' but this is processed
asynchronously. `vector-callback' is called with the vector
embedding, and, in case of error, `error-callback' is called with
the same arguments as in `llm-chat-async'.
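As a minimal sketch, a client might call these functions as follows.
This assumes `my-provider' was created as in section 2, and that
`llm-make-simple-chat-prompt' is available to wrap a plain string in a
prompt structure; if your version lacks it, construct the
`llm-chat-prompt' struct directly.
┌────
│ (require 'llm)
│ ;; Synchronous chat: blocks until the LLM responds.
│ (message "Response: %s"
│          (llm-chat my-provider
│                    (llm-make-simple-chat-prompt "Say hello in French.")))
│ ;; Asynchronous chat: returns immediately, invokes a callback later.
│ (llm-chat-async my-provider
│                 (llm-make-simple-chat-prompt "Say hello in German.")
│                 (lambda (response) (message "Async: %s" response))
│                 (lambda (sym msg) (message "Error %s: %s" sym msg)))
│ ;; Embedding: a vector of floats representing semantic meaning.
│ (message "Embedding size: %d"
│          (length (llm-embedding my-provider "hello world")))
└────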
5 Contributions
═══════════════
If you are interested in creating a provider, please send a pull
request, or open a bug. This library is part of GNU ELPA, so any
major provider that we include in this module needs to be written by
someone with FSF papers. However, you can always write a module and
put it on a different package archive, such as MELPA.