Hugging Face Chat

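Hugging Face Text Generation Inference (TGI) is a specialized solution for deploying Large Language Models (LLMs) in the cloud and making them accessible through an API. TGI provides optimized performance for text generation tasks through features such as continuous batching, token streaming, and efficient memory management.

Text Generation Inference requires models to be compatible with its architecture-specific optimizations. Many popular LLMs are supported, but not every model on the Hugging Face Hub can be deployed with TGI. If you need to deploy other types of models, consider using standard Hugging Face Inference Endpoints instead.

For a complete, up-to-date list of supported models and architectures, see the Text Generation Inference supported models documentation.

Prerequisites

You need to create an Inference Endpoint on Hugging Face and an API token to access that endpoint. Further details are available in the Hugging Face Inference Endpoints documentation. The Spring AI project defines a configuration property named spring.ai.huggingface.chat.api-key, which you should set to the value of the API token obtained from Hugging Face. A second configuration property, spring.ai.huggingface.chat.url, should be set to the Inference Endpoint URL obtained when provisioning your model on Hugging Face. You can find this URL in the Inference Endpoints UI. Exporting environment variables is one way to set these configuration properties: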

export SPRING_AI_HUGGINGFACE_CHAT_API_KEY=<INSERT KEY HERE>
export SPRING_AI_HUGGINGFACE_CHAT_URL=<INSERT INFERENCE ENDPOINT URL HERE>
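
With the auto-configuration, Spring Boot binds these environment variables to the two properties, so no extra code is needed. When the model is wired by hand, a small startup check can fail fast if either value is missing. The following is a minimal sketch of such a check; the class HuggingfaceSettings and its require helper are hypothetical names for illustration and are not part of Spring AI.

```java
import java.util.Map;

// Hypothetical helper (not part of Spring AI): reads the two settings
// exported above and fails fast when one is missing or blank.
class HuggingfaceSettings {

    static final String API_KEY_VAR = "SPRING_AI_HUGGINGFACE_CHAT_API_KEY";
    static final String URL_VAR = "SPRING_AI_HUGGINGFACE_CHAT_URL";

    // Returns the trimmed value of `name` from `env`,
    // or throws if the variable is absent or blank.
    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        String apiKey = require(env, API_KEY_VAR);
        String url = require(env, URL_VAR);
        // Never print the token itself; report only that it was found.
        System.out.println("API key loaded (" + apiKey.length() + " chars), endpoint: " + url);
    }
}
```

Passing the environment as a Map keeps the check easy to test without touching the real process environment.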

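Add Repositories and BOM

Spring AI artifacts are published in the Maven Central and Spring Snapshot repositories. Refer to the Repositories section to add these repositories to your build system.

To help with dependency management, Spring AI provides a BOM (bill of materials) that ensures a consistent version of Spring AI is used throughout the project. Refer to the Dependency Management section to add the Spring AI BOM to your build system.

Auto-configuration

The artifact names of the Spring AI auto-configuration and starter modules have changed significantly. Refer to the upgrade notes for more information.

Spring AI provides Spring Boot auto-configuration for the Hugging Face Chat Client. To enable it, add the following dependency to your project's Maven pom.xml file: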

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-huggingface</artifactId>
</dependency>

or to your Gradle build.gradle build file:

dependencies {
    implementation 'org.springframework.ai:spring-ai-starter-model-huggingface'
}

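Refer to the Dependency Management section to add the Spring AI BOM to your build file.

Chat Properties

Enabling and disabling the chat auto-configuration is now configured through the top-level property spring.ai.model.chat.

To enable: spring.ai.model.chat=huggingface (enabled by default)

To disable: spring.ai.model.chat=none (or any value that does not match huggingface)

This change was made to allow multiple models to be configured.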
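
The selection rule can be modelled in a few lines. This sketch assumes the starter behaves like a Spring Boot @ConditionalOnProperty check with havingValue = "huggingface" and matchIfMissing = true, which is consistent with the behaviour described above; the class and method names are hypothetical, and the actual auto-configuration source is not reproduced here.

```java
// Simplified model of the enable/disable rule; not Spring AI source code.
class ChatModelSelector {

    static final String HUGGINGFACE = "huggingface";

    // `configured` is the value of spring.ai.model.chat, or null when unset.
    static boolean isHuggingfaceEnabled(String configured) {
        if (configured == null) {
            return true; // property absent: enabled by default
        }
        // Assumption: matching ignores case, as @ConditionalOnProperty does.
        return HUGGINGFACE.equalsIgnoreCase(configured);
    }

    public static void main(String[] args) {
        System.out.println(isHuggingfaceEnabled(null));          // true
        System.out.println(isHuggingfaceEnabled("huggingface")); // true
        System.out.println(isHuggingfaceEnabled("none"));        // false
        System.out.println(isHuggingfaceEnabled("openai"));      // false
    }
}
```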

The prefix spring.ai.huggingface is the property prefix used to configure the Hugging Face chat model implementation.

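| Property | Description | Default |
| --- | --- | --- |
| spring.ai.huggingface.chat.api-key | API key used to authenticate with the Inference Endpoint. | - |
| spring.ai.huggingface.chat.url | URL of the Inference Endpoint to connect to. | - |
| spring.ai.huggingface.chat.enabled (removed and no longer valid) | Enables the Hugging Face chat model. | true |
| spring.ai.model.chat | Enables the Hugging Face chat model when set to huggingface. | huggingface |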

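Sample Controller (Auto-configuration)

Create a new Spring Boot project and add spring-ai-starter-model-huggingface to your pom (or gradle) dependencies.

Add an application.properties file under the src/main/resources directory to enable and configure the Hugging Face chat model: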

spring.ai.huggingface.chat.api-key=YOUR_API_KEY
spring.ai.huggingface.chat.url=YOUR_INFERENCE_ENDPOINT_URL

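Replace the api-key and url values with your own Hugging Face values.

This creates a HuggingfaceChatModel implementation that you can inject into your classes. Here is an example of a simple @RestController class that uses the chat model for text generation.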

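import java.util.Map;

import org.springframework.ai.huggingface.HuggingfaceChatModel;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ChatController {

    // Injected by the Hugging Face auto-configuration.
    private final HuggingfaceChatModel chatModel;

    @Autowired
    public ChatController(HuggingfaceChatModel chatModel) {
        this.chatModel = chatModel;
    }

    // Returns the generated text under the "generation" key.
    @GetMapping("/ai/generate")
    public Map<String, String> generate(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        return Map.of("generation", this.chatModel.call(message));
    }
}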
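
Once the application is running, the endpoint can be exercised from any HTTP client. The sketch below uses the JDK's java.net.http.HttpClient; the base URL http://localhost:8080 assumes Spring Boot's default port, and the class name GenerateClient is hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical client for the /ai/generate endpoint defined above.
class GenerateClient {

    // Builds the request URI, URL-encoding the message query parameter.
    static URI generateUri(String baseUrl, String message) {
        String encoded = URLEncoder.encode(message, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/ai/generate?message=" + encoded);
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the application listens on Spring Boot's default port.
        URI uri = generateUri("http://localhost:8080", "Tell me a joke");
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // The controller returns a JSON object with a single "generation" key.
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

Encoding the message matters because prompts routinely contain spaces, ampersands, and other characters that are not valid in a raw query string.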

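Manual Configuration

HuggingfaceChatModel implements the ChatModel interface and uses a low-level API client to connect to the Hugging Face Inference Endpoints.

Add the spring-ai-huggingface dependency to your project's Maven pom.xml file: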

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-huggingface</artifactId>
</dependency>

or to your Gradle build.gradle build file:

dependencies {
    implementation 'org.springframework.ai:spring-ai-huggingface'
}

Refer to the Dependency Management section to add the Spring AI BOM to your build file.

Next, create a HuggingfaceChatModel and use it for text generation:

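// apiKey and url hold the API token and Inference Endpoint URL described above.
HuggingfaceChatModel chatModel = new HuggingfaceChatModel(apiKey, url);

ChatResponse response = chatModel.call(
    new Prompt("Generate the names of 5 famous pirates."));

// Assumes Spring AI 1.0 or later, where the message text is read with getText().
System.out.println(response.getResult().getOutput().getText());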