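VertexAI Gemini Chat

The Vertex AI Gemini API lets developers build generative AI applications with Gemini models. It accepts multimodal prompts as input and outputs text or code. A multimodal model can process information from several modalities, including images, video, and text. For example, you can send the model a photo of a plate of cookies and ask it for the recipe.

Gemini is a family of generative AI models developed by Google DeepMind and designed for multimodal use cases. The Gemini API gives you access to the Gemini 1.0 Pro Vision and Gemini 1.0 Pro models. For specifications of the Vertex AI Gemini API models, see Model information.

Gemini API Reference

Prerequisites

Install the gcloud CLI for your operating system.
Authenticate by running the following command. Replace PROJECT_ID with your Google Cloud project ID and ACCOUNT with your Google Cloud username.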

gcloud config set project <PROJECT_ID> &&
gcloud auth application-default login <ACCOUNT>

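Auto-configuration

The artifact names of the Spring AI auto-configuration and starter modules have changed significantly. See the upgrade notes for more information.

Spring AI provides Spring Boot auto-configuration for the VertexAI Gemini chat client. To enable it, add the following dependency to your project's Maven pom.xml or Gradle build.gradle build file: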

Maven

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-vertex-ai-gemini</artifactId>
</dependency>

Gradle

dependencies {
    implementation 'org.springframework.ai:spring-ai-starter-model-vertex-ai-gemini'
}

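Refer to the Dependency Management section to add the Spring AI BOM to your build file.

Chat Properties

Enabling and disabling the chat auto-configuration is now controlled by the top-level property spring.ai.model.chat.

To enable it, set spring.ai.model.chat=vertexai (enabled by default).

To disable it, set spring.ai.model.chat=none (or any value that does not match vertexai).

This change was made to allow multiple models to be configured.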
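The enable/disable rule can be sketched in plain Java. This is only an illustration of the rule as stated in this section (enabled when the property is absent or equals vertexai), not Spring Boot's actual condition handling; the class ChatToggleSketch and its methods are hypothetical names introduced for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper for illustration only; Spring AI does not expose this class.
final class ChatToggleSketch {

    static final String KEY = "spring.ai.model.chat";

    // Enabled when the property is missing (the default) or equals "vertexai".
    static boolean vertexAiChatEnabled(Properties props) {
        return "vertexai".equals(props.getProperty(KEY, "vertexai"));
    }

    // Parses application.properties-style text into a Properties object.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for an in-memory reader
        }
        return props;
    }

    public static void main(String[] args) {
        System.out.println(vertexAiChatEnabled(parse("")));                          // prints "true"
        System.out.println(vertexAiChatEnabled(parse("spring.ai.model.chat=none"))); // prints "false"
    }
}
```

Any other value, such as the name of a different model provider, also evaluates to false in this sketch, which matches the "any value that does not match vertexai" wording.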

The prefix spring.ai.vertex.ai.gemini is the property prefix that lets you connect to VertexAI.

Property	Description	Default
spring.ai.model.chat	Enables the chat model client.	vertexai
spring.ai.vertex.ai.gemini.projectId	Google Cloud Platform project ID.	-
spring.ai.vertex.ai.gemini.location	Region.	-
spring.ai.vertex.ai.gemini.credentialsUri	URI of the Vertex AI Gemini credentials. When provided, it is used to create a `GoogleCredentials` instance for authenticating the `VertexAI` client.	-
spring.ai.vertex.ai.gemini.apiEndpoint	Vertex AI Gemini API endpoint.	-
spring.ai.vertex.ai.gemini.scopes		-
spring.ai.vertex.ai.gemini.transport	API transport, either GRPC or REST.	GRPC

The prefix spring.ai.vertex.ai.gemini.chat is the property prefix that lets you configure the chat model implementation for VertexAI Gemini Chat.

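Property	Description	Default
spring.ai.vertex.ai.gemini.chat.options.model	Vertex AI Gemini chat model to use. Supported models include the (1.0) `gemini-pro` and `gemini-pro-vision` (deprecated) and the newer `gemini-1.5-pro-001` and `gemini-1.5-flash-001`.	gemini-1.5-pro-001
spring.ai.vertex.ai.gemini.chat.options.responseMimeType	Output response MIME type of the generated candidate text.	`text/plain`: (default) text output, or `application/json`: JSON response.
spring.ai.vertex.ai.gemini.chat.options.googleSearchRetrieval	Use the Google Search grounding feature.	`true` or `false`, default `false`.
spring.ai.vertex.ai.gemini.chat.options.temperature	Controls the randomness of the output. Values range over [0.0, 1.0], inclusive. A value closer to 1.0 produces more varied responses, while a value closer to 0.0 typically produces less surprising responses. This value specifies the default used by the backend when calling the generative model.	0.8
spring.ai.vertex.ai.gemini.chat.options.topK	The maximum number of tokens to consider when sampling. The generative model uses combined top-k and nucleus sampling. Top-k sampling considers the set of topK most probable tokens.	-
spring.ai.vertex.ai.gemini.chat.options.topP	The maximum cumulative probability of tokens to consider when sampling. The generative model uses combined top-k and nucleus sampling. Nucleus sampling considers the smallest set of tokens whose probability sum is at least topP.	-
spring.ai.vertex.ai.gemini.chat.options.candidateCount	The number of generated response messages to return. This value must be in [1, 8], inclusive. Defaults to 1.	-
spring.ai.vertex.ai.gemini.chat.options.maxOutputTokens	The maximum number of tokens to generate.	-
spring.ai.vertex.ai.gemini.chat.options.frequencyPenalty		-
spring.ai.vertex.ai.gemini.chat.options.presencePenalty		-
spring.ai.vertex.ai.gemini.chat.options.toolNames	List of tools, identified by name, to enable for function calling in a single prompt request. Tools with those names must exist in the ToolCallback registry.	-
(deprecated in favor of `toolNames`) spring.ai.vertex.ai.gemini.chat.options.functions	List of functions, identified by name, to enable for function calling in a single prompt request. Functions with those names must exist in the functionCallbacks registry.	-
spring.ai.vertex.ai.gemini.chat.options.proxy-tool-calls	If true, Spring AI does not handle function calls internally but proxies them to the client, which is then responsible for handling each call, dispatching it to the appropriate function, and returning the results. If false (the default), Spring AI handles function calls internally. Applies only to chat models with function calling support.	false
spring.ai.vertex.ai.gemini.chat.options.safetySettings	List of safety settings that control the safety filters, as defined by Vertex AI Safety Filters. Each safety setting can have a method, a threshold, and a category.	-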
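
To make the topK and topP descriptions more concrete, here is a minimal plain-Java sketch of combined top-k and nucleus filtering. It rests on assumptions: the table does not say how the backend orders or normalizes the two steps, so this version simply applies the top-k cut first and then keeps the smallest prefix whose cumulative probability reaches topP. SamplingSketch and candidates are hypothetical names, not part of Spring AI or the Vertex AI SDK.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: filters a token distribution the way the topK/topP rows describe.
final class SamplingSketch {

    // Keeps at most topK of the most probable tokens, stopping early once their
    // cumulative probability reaches topP. Result is in descending probability order.
    static Map<String, Double> candidates(Map<String, Double> probs, int topK, double topP) {
        List<Map.Entry<String, Double>> sorted = new ArrayList<>(probs.entrySet());
        sorted.sort(Map.Entry.<String, Double>comparingByValue().reversed());

        Map<String, Double> kept = new LinkedHashMap<>();
        double cumulative = 0.0;
        for (Map.Entry<String, Double> entry : sorted) {
            if (kept.size() >= topK || cumulative >= topP) {
                break;
            }
            kept.put(entry.getKey(), entry.getValue());
            cumulative += entry.getValue();
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Double> probs = Map.of("the", 0.5, "a", 0.3, "an", 0.15, "this", 0.05);
        // "the" (0.5) plus "a" (0.3) already reaches topP = 0.7, so "an" is dropped.
        System.out.println(candidates(probs, 3, 0.7).keySet()); // prints "[the, a]"
    }
}
```

The model then samples the next token from the surviving candidates; lowering either value narrows the candidate set and makes the output less varied.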

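All properties prefixed with spring.ai.vertex.ai.gemini.chat.options can be overridden at runtime by adding request-specific runtime options to the Prompt call.

Runtime Options

VertexAiGeminiChatOptions.java provides model configuration such as temperature, topK, and so on.

At startup, the default options can be configured with the VertexAiGeminiChatModel(api, options) constructor or the spring.ai.vertex.ai.gemini.chat.options.* properties.

At runtime, you can override the default options by adding new, request-specific options to the Prompt call. For example, to override the default temperature for a specific request: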

ChatResponse response = chatModel.call(
    new Prompt(
        "Generate the names of 5 famous pirates.",
        VertexAiGeminiChatOptions.builder()
            .temperature(0.4)
        .build()
    ));

In addition to the model-specific VertexAiGeminiChatOptions, you can use a portable ChatOptions instance, created with ChatOptionsBuilder#builder().
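
The precedence described in this section (request options win, startup defaults fill the gaps) can be sketched without any Spring AI types. The Options record and merge method below are hypothetical stand-ins written for illustration; they are not how VertexAiGeminiChatModel is implemented, and the topK default of 40 is an arbitrary example value.

```java
// Illustrative only: request-level options override startup defaults field by field.
final class OptionsOverrideSketch {

    // Hypothetical stand-in for chat options; null means "not set for this request".
    record Options(Double temperature, Integer topK) {}

    // Request-level values win; anything left null falls back to the startup default.
    static Options merge(Options defaults, Options request) {
        return new Options(
                request.temperature() != null ? request.temperature() : defaults.temperature(),
                request.topK() != null ? request.topK() : defaults.topK());
    }

    public static void main(String[] args) {
        Options defaults = new Options(0.8, 40);      // configured at startup
        Options perRequest = new Options(0.4, null);  // only temperature is overridden
        Options effective = merge(defaults, perRequest);
        System.out.println(effective.temperature() + " " + effective.topK()); // prints "0.4 40"
    }
}
```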

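Tool Calling

Vertex AI Gemini models support tool calling, which lets a model use tools during a conversation. Here is an example of defining and using @Tool-based tools: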

public class WeatherService {

    @Tool(description = "Get the weather in location")
    public String weatherByLocation(@ToolParam(description = "City or state name") String location) {
        ...
    }
}

String response = ChatClient.create(this.chatModel)
        .prompt("What's the weather like in Boston?")
        .tools(new WeatherService())
        .call()
        .content();

You can also use java.util.function beans as tools:

@Bean
@Description("Get the weather in location. Return temperature in 36°F or 36°C format.")
public Function<Request, Response> weatherFunction() {
    return new MockWeatherService();
}

String response = ChatClient.create(this.chatModel)
        .prompt("What's the weather like in Boston?")
        .tools("weatherFunction")
        .inputType(Request.class)
        .call()
        .content();
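
Referring to a tool by name, as with "weatherFunction" above, only works if something is registered under that name. The following plain-Java sketch illustrates the idea of a name-based registry and dispatch. It is an assumption-laden simplification: ToolRegistrySketch is a hypothetical class, tools take and return plain strings, and none of this reflects Spring AI's internal ToolCallback handling.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Illustrative only: a name-keyed tool registry with fail-fast lookup.
final class ToolRegistrySketch {

    // Hypothetical registry: tool name -> implementation taking and returning plain strings.
    private final Map<String, Function<String, String>> tools;

    ToolRegistrySketch(Map<String, Function<String, String>> tools) {
        this.tools = Map.copyOf(tools);
    }

    // Fails fast when a request enables a tool name that was never registered.
    void requireRegistered(List<String> toolNames) {
        for (String name : toolNames) {
            if (!tools.containsKey(name)) {
                throw new IllegalArgumentException("No tool registered under name: " + name);
            }
        }
    }

    // Dispatches a tool call requested by the model to the matching function.
    String invoke(String name, String argument) {
        requireRegistered(List.of(name));
        return tools.get(name).apply(argument);
    }

    public static void main(String[] args) {
        ToolRegistrySketch registry = new ToolRegistrySketch(
                Map.of("weatherFunction", city -> "sunny in " + city));
        System.out.println(registry.invoke("weatherFunction", "Boston")); // prints "sunny in Boston"
    }
}
```

With proxy-tool-calls set to true, a dispatch step of this kind becomes the client's responsibility rather than the framework's.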

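See the Tools documentation for more information.

Multimodality

Multimodality refers to a model's ability to simultaneously understand and process information from various sources, including text, pdf, images, audio, and other data formats.

Image, Audio, Video

Google's Gemini AI models support this capability by understanding and integrating text, code, audio, images, and video. For more details, see the blog post Introducing Gemini.

Spring AI's Message interface supports multimodal AI models by introducing the Media type. This type holds the data and details of the media attachments in a message, using Spring's org.springframework.util.MimeType and a java.lang.Object for the raw media data.

Below is a simple code example extracted from VertexAiGeminiChatModelIT.java, demonstrating the combination of user text with an image.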

// Load the image bytes from the classpath
byte[] data = new ClassPathResource("/vertex-test.png").getContentAsByteArray();

// Combine the text prompt with the PNG attachment in a single user message
var userMessage = new UserMessage("Explain what do you see on this picture?",
        List.of(new Media(MimeTypeUtils.IMAGE_PNG, data)));

ChatResponse response = chatModel.call(new Prompt(List.of(userMessage)));

PDF

Recent Vertex Gemini models support PDF input. Use the application/pdf media type to attach a PDF file to the message:

var pdfData = new ClassPathResource("/spring-ai-reference-overview.pdf");

var userMessage = new UserMessage(
        "You are a very professional document summarization specialist. Please summarize the given document.",
        List.of(new Media(new MimeType("application", "pdf"), pdfData)));

var response = this.chatModel.call(new Prompt(List.of(userMessage)));
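
Both examples in this section share one shape: a text prompt plus a list of attachments, each tagged with a MIME type. The sketch below models that shape with plain Java records so it can run without Spring AI on the classpath. Attachment and Message are hypothetical stand-ins for illustration; they are not the Spring AI Media and UserMessage types, and the byte arrays are placeholder data rather than real files.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: the "text plus typed attachments" shape of a multimodal message.
final class MultimodalMessageSketch {

    // Hypothetical stand-in for a media attachment: a MIME type plus the raw bytes.
    record Attachment(String mimeType, byte[] data) {}

    // Hypothetical stand-in for a user message: text plus zero or more attachments.
    record Message(String text, List<Attachment> attachments) {}

    // Summarizes the parts a multimodal request would carry.
    static String describe(Message message) {
        String parts = message.attachments().stream()
                .map(a -> a.mimeType() + " (" + a.data().length + " bytes)")
                .collect(Collectors.joining(", "));
        return "text + " + message.attachments().size() + " attachment(s): " + parts;
    }

    public static void main(String[] args) {
        byte[] fakePng = {(byte) 0x89, 'P', 'N', 'G'};                // placeholder bytes
        byte[] fakePdf = "%PDF-".getBytes(StandardCharsets.US_ASCII); // placeholder bytes
        Message message = new Message("Summarize the attached files.",
                List.of(new Attachment("image/png", fakePng),
                        new Attachment("application/pdf", fakePdf)));
        // prints "text + 2 attachment(s): image/png (4 bytes), application/pdf (5 bytes)"
        System.out.println(describe(message));
    }
}
```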

Sample Controller

Create a new Spring Boot project and add spring-ai-starter-model-vertex-ai-gemini to your pom (or gradle) dependencies.

Add an application.properties file under the src/main/resources directory to enable and configure the VertexAi chat model:

spring.ai.vertex.ai.gemini.project-id=PROJECT_ID
spring.ai.vertex.ai.gemini.location=LOCATION
spring.ai.vertex.ai.gemini.chat.options.model=gemini-pro-vision
spring.ai.vertex.ai.gemini.chat.options.temperature=0.5

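Replace PROJECT_ID with your Google Cloud project ID and LOCATION with a Gemini region.

This creates a VertexAiGeminiChatModel implementation that you can inject into your classes. Here is an example of a simple @RestController class that uses the chat model for text generation.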

@RestController
public class ChatController {

    private final VertexAiGeminiChatModel chatModel;

    @Autowired
    public ChatController(VertexAiGeminiChatModel chatModel) {
        this.chatModel = chatModel;
    }

    @GetMapping("/ai/generate")
    public Map<String, String> generate(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        return Map.of("generation", this.chatModel.call(message));
    }

    @GetMapping("/ai/generateStream")
    public Flux<ChatResponse> generateStream(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        Prompt prompt = new Prompt(new UserMessage(message));
        return this.chatModel.stream(prompt);
    }
}

Manual Configuration

VertexAiGeminiChatModel implements the ChatModel interface and uses VertexAI to connect to the Vertex AI Gemini service.

Add the spring-ai-vertex-ai-gemini dependency to your project's Maven pom.xml file:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-vertex-ai-gemini</artifactId>
</dependency>

or to your Gradle build.gradle build file:

dependencies {
    implementation 'org.springframework.ai:spring-ai-vertex-ai-gemini'
}

Refer to the Dependency Management section to add the Spring AI BOM to your build file.

Next, create a VertexAiGeminiChatModel and use it for text generation:

VertexAI vertexApi = new VertexAI(projectId, location);

// Build the chat model with its default options (vertexApi and chatModel are local variables)
var chatModel = new VertexAiGeminiChatModel(vertexApi,
    VertexAiGeminiChatOptions.builder()
        .model(VertexAiGeminiChatModel.ChatModel.GEMINI_1_5_PRO)
        .temperature(0.4)
    .build());

ChatResponse response = chatModel.call(
    new Prompt("Generate the names of 5 famous pirates."));

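VertexAiGeminiChatOptions provides the configuration for chat requests. VertexAiGeminiChatOptions.Builder is a fluent options builder.

Low-level Java Client

The following class diagram illustrates the Vertex AI Gemini native Java API: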