From: ELPA Syncer
Subject: [elpa] externals/llm b9fc46f333 08/13: Resolved merge conflicts and merged upstream/main into ollama-chat-endpoint-support.
Date: Wed, 7 Feb 2024 18:58:11 -0500 (EST)
branch: externals/llm
commit b9fc46f3338fbfd7166cc4fbbaab3b4a397660db
Author: Thomas E. Allen <thomas@assistivemachines.com>
Commit: Thomas E. Allen <thomas@assistivemachines.com>
Resolved merge conflicts and merged upstream/main into ollama-chat-endpoint-support.
---
NEWS.org | 5 +++++
README.org | 5 ++++-
llm-gemini.el | 15 +++++++++------
llm-openai.el | 4 ++--
llm-request.el | 14 +++++++++++++-
llm-vertex.el | 55 +++++++++++++++++++++----------------------------------
llm.el | 2 +-
7 files changed, 55 insertions(+), 45 deletions(-)
diff --git a/NEWS.org b/NEWS.org
index 12d46bea89..dd514663b9 100644
--- a/NEWS.org
+++ b/NEWS.org
@@ -1,5 +1,10 @@
+* Version 0.9.1
+- Default to the new "text-embedding-3-small" model for Open AI. *Important*: Anyone who has stored embeddings should either regenerate embeddings (recommended) or hard-code the old embedding model ("text-embedding-ada-002").
+- Fix response breaking when prompts run afoul of Gemini / Vertex's safety checks.
+- Change Gemini streaming to be the correct URL. This doesn't seem to have an effect on behavior.
* Version 0.9
- Add =llm-chat-token-limit= to find the token limit based on the model.
+- Add request timeout customization.
* Version 0.8
- Allow users to change the Open AI URL, to allow for proxies and other services that re-use the API.
- Add =llm-name= and =llm-cancel-request= to the API.
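For users who would rather keep previously stored embeddings than regenerate them, the old model can be pinned on the provider. A minimal sketch, assuming ~make-llm-openai~ accepts an ~:embedding-model~ slot (not shown in this hunk) and that ~my-openai-key~ is defined as in the README:

#+begin_src emacs-lisp
;; Minimal sketch: pin the previous OpenAI embedding model so that
;; embeddings stored before this release stay comparable with new ones.
;; Assumes make-llm-openai accepts an :embedding-model slot and that
;; my-openai-key is defined elsewhere.
(require 'llm-openai)
(setq my-embedding-provider
      (make-llm-openai :key my-openai-key
                       :embedding-model "text-embedding-ada-002"))
#+end_src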
diff --git a/README.org b/README.org
index 2a2659e598..2da47be47e 100644
--- a/README.org
+++ b/README.org
@@ -9,7 +9,7 @@ Certain functionalities might not be available in some LLMs. Any such unsupporte
This package is still in its early stages but will continue to develop as LLMs and functionality are introduced.
* Setting up providers
-Users of an application that uses this package should not need to install it themselves. The llm module should be installed as a dependency when you install the package that uses it. However, you do need to require the llm module and set up the provider you will be using. Typically, applications will have a variable you can set. For example, let's say there's a package called "llm-refactoring", which has a variable ~llm-refactoring-provider~. You would set it up like so:
+Users of an application that uses this package should not need to install it themselves. The llm package should be installed as a dependency when you install the package that uses it. However, you do need to require the llm module and set up the provider you will be using. Typically, applications will have a variable you can set. For example, let's say there's a package called "llm-refactoring", which has a variable ~llm-refactoring-provider~. You would set it up like so:
#+begin_src emacs-lisp
(use-package llm-refactoring
@@ -19,6 +19,8 @@ Users of an application that uses this package should not need to install it the
#+end_src
Here ~my-openai-key~ would be a variable you set up before with your OpenAI key. Or, just substitute the key itself as a string. It's important to remember never to check your key into a public repository such as GitHub, because your key must be kept private. Anyone with your key can use the API, and you will be charged.
+
+For embedding users: if you store the embeddings, you *must* set the embedding model. Even though there's no way for the llm package to tell whether you are storing it, if the default model changes, you may find yourself storing incompatible embeddings.
** Open AI
You can set up with ~make-llm-openai~, with the following parameters:
- ~:key~, the Open AI key that you get when you sign up to use Open AI's APIs. Remember to keep this private. This is non-optional.
@@ -100,6 +102,7 @@ For all callbacks, the callback will be executed in the buffer the function was
- ~llm-count-tokens provider string~: Count how many tokens are in ~string~. This may vary by ~provider~, because some providers implement an API for this, but typically is always about the same. This gives an estimate if the provider has no API support.
- ~llm-cancel-request request~ Cancels the given request, if possible. The ~request~ object is the return value of async and streaming functions.
- ~llm-name provider~. Provides a short name of the model or provider, suitable for showing to users.
+- ~llm-chat-token-limit~. Gets the token limit for the chat model. This isn't possible for some backends like =llama.cpp=, in which the model isn't selected or known by this library.
And the following helper functions:
- ~llm-make-simple-chat-prompt text~: For the common case of just wanting a simple text prompt without the richness that ~llm-chat-prompt~ struct provides, use this to turn a string into a ~llm-chat-prompt~ that can be passed to the main functions above.
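A rough sketch of how the functions listed above fit together, mirroring the README's provider setup (~my-openai-key~ is assumed to be defined beforehand):

#+begin_src emacs-lisp
;; Rough sketch, not a definitive usage pattern: set up a provider as in
;; the README, then exercise the API and helper functions listed above.
(require 'llm)
(require 'llm-openai)
(let ((provider (make-llm-openai :key my-openai-key)))
  ;; Short provider name and the chat model's token limit.
  (message "%s limit: %d" (llm-name provider) (llm-chat-token-limit provider))
  ;; Wrap a plain string in a prompt struct and chat synchronously.
  (llm-chat provider (llm-make-simple-chat-prompt "Say hello in one word.")))
#+end_src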
diff --git a/llm-gemini.el b/llm-gemini.el
index 07b7aaa093..3c80872333 100644
--- a/llm-gemini.el
+++ b/llm-gemini.el
@@ -72,10 +72,13 @@ You can get this at https://makersuite.google.com/app/apikey."
buf error-callback
'error (llm-vertex--error-message
data))))))
-(defun llm-gemini--chat-url (provider)
- "Return the URL for the chat request, using PROVIDER."
- (format "https://generativelanguage.googleapis.com/v1beta/models/%s:generateContent?key=%s"
+;; from https://ai.google.dev/tutorials/rest_quickstart
+(defun llm-gemini--chat-url (provider streaming-p)
+ "Return the URL for the chat request, using PROVIDER.
+If STREAMING-P is non-nil, use the streaming endpoint."
+ (format "https://generativelanguage.googleapis.com/v1beta/models/%s:%s?key=%s"
(llm-gemini-chat-model provider)
+ (if streaming-p "streamGenerateContent" "generateContent")
(llm-gemini-key provider)))
(defun llm-gemini--get-chat-response (response)
@@ -85,7 +88,7 @@ You can get this at https://makersuite.google.com/app/apikey."
(cl-defmethod llm-chat ((provider llm-gemini) prompt)
(let ((response (llm-vertex--get-chat-response-streaming
- (llm-request-sync (llm-gemini--chat-url provider)
+ (llm-request-sync (llm-gemini--chat-url provider nil)
:data (llm-vertex--chat-request-streaming prompt)))))
(setf (llm-chat-prompt-interactions prompt)
(append (llm-chat-prompt-interactions prompt)
@@ -94,10 +97,10 @@ You can get this at https://makersuite.google.com/app/apikey."
(cl-defmethod llm-chat-streaming ((provider llm-gemini) prompt partial-callback response-callback error-callback)
(let ((buf (current-buffer)))
- (llm-request-async (llm-gemini--chat-url provider)
+ (llm-request-async (llm-gemini--chat-url provider t)
:data (llm-vertex--chat-request-streaming prompt)
:on-partial (lambda (partial)
- (when-let ((response (llm-vertex--get-partial-chat-ui-repsonse partial)))
+ (when-let ((response (llm-vertex--get-partial-chat-response partial)))
(llm-request-callback-in-buffer buf partial-callback response)))
:on-success (lambda (data)
(let ((response (llm-vertex--get-chat-response-streaming data)))
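For illustration, the updated helper now builds one of two endpoints from the same format string, depending on STREAMING-P. A sketch with placeholder key and model values:

#+begin_src emacs-lisp
;; Sketch: the same helper yields either the generateContent or the
;; streamGenerateContent endpoint.  Key and model are placeholders.
(require 'llm-gemini)
(let ((provider (make-llm-gemini :key "MY-KEY" :chat-model "gemini-pro")))
  (list (llm-gemini--chat-url provider nil)  ; ...gemini-pro:generateContent?key=MY-KEY
        (llm-gemini--chat-url provider t)))  ; ...gemini-pro:streamGenerateContent?key=MY-KEY
#+end_src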
diff --git a/llm-openai.el b/llm-openai.el
index 341275c9c8..fd57d0bd93 100644
--- a/llm-openai.el
+++ b/llm-openai.el
@@ -69,7 +69,7 @@ https://api.example.com/v1/chat, then URL should be
"Return the request to the server for the embedding of STRING.
MODEL is the embedding model to use, or nil to use the default.."
`(("input" . ,string)
- ("model" . ,(or model "text-embedding-ada-002"))))
+ ("model" . ,(or model "text-embedding-3-small"))))
(defun llm-openai--embedding-extract-response (response)
"Return the embedding from the server RESPONSE."
@@ -113,7 +113,7 @@ This is just the key, if it exists."
"/") command))
(cl-defmethod llm-embedding-async ((provider llm-openai) string vector-callback error-callback)
- (llm-openai--check-key provider)
+ (llm-openai--check-key provider)
(let ((buf (current-buffer)))
(llm-request-async (llm-openai--url provider "embeddings")
:headers (llm-openai--headers provider)
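Because only the default changed, embedding calls that do not name a model now use "text-embedding-3-small" automatically. A sketch, assuming ~llm-embedding~ (the synchronous counterpart of ~llm-embedding-async~) and a previously defined ~my-openai-key~:

#+begin_src emacs-lisp
;; Sketch: with no :embedding-model given, the request defaults to
;; "text-embedding-3-small".  my-openai-key is assumed to be defined.
(require 'llm-openai)
(let ((provider (make-llm-openai :key my-openai-key)))
  ;; Returns the embedding vector's dimension.
  (length (llm-embedding provider "hello world")))
#+end_src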
diff --git a/llm-request.el b/llm-request.el
index a8ee5d489b..4241793c02 100644
--- a/llm-request.el
+++ b/llm-request.el
@@ -25,6 +25,18 @@
(require 'url-http)
(require 'rx)
+(defcustom llm-request-timeout 20
+ "The number of seconds to wait for a response from a HTTP server.
+
+Request timings depend on the request. Requests that need
+more output may take more time, and there is other processing
+besides just token generation that can take a while. Sometimes
+the LLM can get stuck, and you don't want it to take too long.
+This should be balanced to be good enough for hard requests but
+not very long so that we can end stuck requests."
+ :type 'integer
+ :group 'llm)
+
(defun llm-request--content ()
"From the current buffer, return the content of the response."
(decode-coding-string
@@ -57,7 +69,7 @@ TIMEOUT is the number of seconds to wait for a response."
(url-request-extra-headers
(append headers '(("Content-Type" . "application/json"))))
(url-request-data (encode-coding-string (json-encode data) 'utf-8)))
- (let ((buf (url-retrieve-synchronously url t nil (or timeout 5))))
+ (let ((buf (url-retrieve-synchronously url t nil (or timeout llm-request-timeout))))
(if buf
(with-current-buffer buf
(url-http-parse-response)
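The new ~llm-request-timeout~ applies to any request that does not pass an explicit TIMEOUT, so users of slow models can simply raise it:

#+begin_src emacs-lisp
;; Raise the default HTTP timeout for slow or long-running completions.
;; Only requests that do not pass an explicit TIMEOUT are affected.
(require 'llm-request)
(setq llm-request-timeout 60)
;; Or interactively: M-x customize-variable RET llm-request-timeout RET
#+end_src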
diff --git a/llm-vertex.el b/llm-vertex.el
index 87e4465cab..2427fde5ef 100644
--- a/llm-vertex.el
+++ b/llm-vertex.el
@@ -151,41 +151,28 @@ This handles different kinds of models."
(pcase (type-of response)
('vector (mapconcat #'llm-vertex--get-chat-response-streaming
response ""))
- ('cons (let ((parts (assoc-default 'parts
- (assoc-default 'content
- (aref (assoc-default 'candidates response) 0)))))
- (if parts
- (assoc-default 'text (aref parts 0))
- "")))))
-
-(defun llm-vertex--get-partial-chat-ui-repsonse (response)
- "Return the partial response from as much of RESPONSE as we can parse.
-If the response is not parseable, return nil."
+ ('cons (if (assoc-default 'candidates response)
+ (let ((parts (assoc-default
+ 'parts
+ (assoc-default 'content
+ (aref (assoc-default 'candidates response) 0)))))
+ (if parts
+ (assoc-default 'text (aref parts 0))
+ ""))
+ "NOTE: No response was sent back by the LLM, the prompt may have
violated safety checks."))))
+
+(defun llm-vertex--get-partial-chat-response (response)
+ "Return the partial response from as much of RESPONSE as we can parse."
(with-temp-buffer
(insert response)
- (let ((start (point-min))
- (end-of-valid-chunk
- (save-excursion
- (goto-char (point-max))
- (search-backward "\n," nil t)
- (point))))
- (when (and start end-of-valid-chunk)
- ;; It'd be nice if our little algorithm always worked, but doesn't, so let's
- ;; just ignore when it fails. As long as it mostly succeeds, it should be fine.
- (condition-case nil
- (when-let
- ((json (ignore-errors
- (json-read-from-string
- (concat
- (buffer-substring-no-properties
- start end-of-valid-chunk)
- ;; Close off the json
- "]")))))
- (llm-vertex--get-chat-response-streaming json))
- (error (message "Unparseable buffer saved to
*llm-vertex-unparseable*")
- (with-current-buffer (get-buffer-create
"*llm-vertex-unparseable*")
- (erase-buffer)
- (insert response))))))))
+ (let ((result ""))
+ ;; We just will parse every line that is "text": "..." and concatenate them.
+ (save-excursion
+ (goto-char (point-min))
+ (while (re-search-forward (rx (seq (literal "\"text\": ")
+ (group-n 1 ?\" (* any) ?\") line-end)) nil t)
+ (setq result (concat result (json-read-from-string (match-string 1))))))
+ result)))
(defun llm-vertex--chat-request-streaming (prompt)
"Return an alist with chat input for the streaming API.
@@ -247,7 +234,7 @@ If STREAMING is non-nil, use the URL for the streaming API."
:headers `(("Authorization" . ,(format "Bearer %s"
(llm-vertex-key provider))))
:data (llm-vertex--chat-request-streaming prompt)
:on-partial (lambda (partial)
- (when-let ((response (llm-vertex--get-partial-chat-ui-repsonse partial)))
+ (when-let ((response (llm-vertex--get-partial-chat-response partial)))
(llm-request-callback-in-buffer buf partial-callback response)))
:on-success (lambda (data)
(let ((response (llm-vertex--get-chat-response-streaming data)))
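The rewritten parser simply decodes every complete "text" line in the pretty-printed stream and concatenates the results. A simplified sketch of its behavior on a hand-trimmed fragment (not a full Gemini/Vertex response):

#+begin_src emacs-lisp
;; Simplified sketch: feed the parser two already-complete "text" lines,
;; as they appear in the pretty-printed streaming output.
(require 'llm-vertex)
(llm-vertex--get-partial-chat-response
 "\"text\": \"Hello, \"\n\"text\": \"world!\"")
;; => "Hello, world!"
#+end_src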
diff --git a/llm.el b/llm.el
index ae59017bcd..7b190224b3 100644
--- a/llm.el
+++ b/llm.el
@@ -5,7 +5,7 @@
;; Author: Andrew Hyatt <ahyatt@gmail.com>
;; Homepage: https://github.com/ahyatt/llm
;; Package-Requires: ((emacs "28.1"))
-;; Package-Version: 0.8.0
+;; Package-Version: 0.9.1
;; SPDX-License-Identifier: GPL-3.0-or-later
;;
;; This program is free software; you can redistribute it and/or