SAP HANA 云

先决条件

自动配置

Spring AI 为 SAP Hana 向量存储提供 Spring Boot 自动配置。要启用它,请将以下依赖项添加到您项目的 Maven pom.xml 文件

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-hanadb-store-spring-boot-starter</artifactId>
</dependency>

或您的 Gradle build.gradle 构建文件。

dependencies {
    implementation 'org.springframework.ai:spring-ai-hanadb-store-spring-boot-starter'
}
请参考 依赖管理 部分将 Spring AI BOM 添加到您的构建文件。

请查看 配置参数 列表,了解向量存储的默认值和配置选项。

请参考 仓库 部分将里程碑和/或快照仓库添加到您的构建文件。

此外,您还需要一个已配置的 EmbeddingModel bean。有关更多信息,请参考 EmbeddingModel 部分。

HanaCloudVectorStore 属性

您可以在 Spring Boot 配置中使用以下属性来自定义 SAP Hana 向量存储。它使用 spring.datasource. 属性来配置 Hana 数据源,并使用 spring.ai.vectorstore.hanadb. 属性来配置 Hana 向量存储。

属性 描述 默认值

spring.datasource.driver-class-name

驱动类名称

com.sap.db.jdbc.Driver

spring.datasource.url

Hana 数据源 URL

-

spring.datasource.username

Hana 数据源用户名

-

spring.datasource.password

Hana 数据源密码

-

spring.ai.vectorstore.hanadb.top-k

待办事项

-

spring.ai.vectorstore.hanadb.table-name

待办事项

-

spring.ai.vectorstore.hanadb.initialize-schema

是否初始化所需的模式

false

构建一个示例 RAG 应用程序

展示如何设置一个使用 SAP Hana Cloud 作为向量数据库并利用 OpenAI 实现 RAG 模式的项目

  • 在 SAP Hana DB 中创建一个名为 CRICKET_WORLD_CUP 的表

CREATE TABLE CRICKET_WORLD_CUP (
    _ID VARCHAR2(255) PRIMARY KEY,
    CONTENT CLOB,
    EMBEDDING REAL_VECTOR(1536)
)
  • 在您的 pom.xml 中添加以下依赖项

您可以将属性 spring-ai-version 设置为 <spring-ai-version>1.0.0-SNAPSHOT</spring-ai-version>

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>${spring-ai-version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-pdf-document-reader</artifactId>
</dependency>

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-hanadb-store-spring-boot-starter</artifactId>
</dependency>

<dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
    <version>1.18.30</version>
    <scope>provided</scope>
</dependency>
  • application.properties 文件中添加以下属性

spring.ai.openai.api-key=${OPENAI_API_KEY}
spring.ai.openai.embedding.options.model=text-embedding-ada-002

spring.datasource.driver-class-name=com.sap.db.jdbc.Driver
spring.datasource.url=${HANA_DATASOURCE_URL}
spring.datasource.username=${HANA_DATASOURCE_USERNAME}
spring.datasource.password=${HANA_DATASOURCE_PASSWORD}

spring.ai.vectorstore.hanadb.tableName=CRICKET_WORLD_CUP
spring.ai.vectorstore.hanadb.topK=3

创建一个名为 CricketWorldCupEntity 类,该类继承自 HanaVectorEntity

package com.interviewpedia.spring.ai.hana;

import jakarta.persistence.Column;
import jakarta.persistence.Entity;
import jakarta.persistence.Table;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.extern.jackson.Jacksonized;
import org.springframework.ai.vectorstore.HanaVectorEntity;

@Entity
@Table(name = "CRICKET_WORLD_CUP")
@Data
@Jacksonized
@NoArgsConstructor
public class CricketWorldCup extends HanaVectorEntity {
    @Column(name = "content")
    private String content;
}
  • 创建一个名为 CricketWorldCupRepositoryRepository,它实现 HanaVectorRepository 接口

package com.interviewpedia.spring.ai.hana;

import jakarta.persistence.EntityManager;
import jakarta.persistence.PersistenceContext;
import jakarta.transaction.Transactional;
import org.springframework.ai.vectorstore.HanaVectorRepository;
import org.springframework.stereotype.Repository;

import java.util.List;

@Repository
public class CricketWorldCupRepository implements HanaVectorRepository<CricketWorldCup> {
    @PersistenceContext
    private EntityManager entityManager;

    @Override
    @Transactional
    public void save(String tableName, String id, String embedding, String content) {
        String sql = String.format("""
                INSERT INTO %s (_ID, EMBEDDING, CONTENT)
                VALUES(:_id, TO_REAL_VECTOR(:embedding), :content)
                """, tableName);

        entityManager.createNativeQuery(sql)
                .setParameter("_id", id)
                .setParameter("embedding", embedding)
                .setParameter("content", content)
                .executeUpdate();
    }

    @Override
    @Transactional
    public int deleteEmbeddingsById(String tableName, List<String> idList) {
        String sql = String.format("""
                DELETE FROM %s WHERE _ID IN (:ids)
                """, tableName);

        return entityManager.createNativeQuery(sql)
                .setParameter("ids", idList)
                .executeUpdate();
    }

    @Override
    @Transactional
    public int deleteAllEmbeddings(String tableName) {
        String sql = String.format("""
                DELETE FROM %s
                """, tableName);

        return entityManager.createNativeQuery(sql).executeUpdate();
    }

    @Override
    public List<CricketWorldCup> cosineSimilaritySearch(String tableName, int topK, String queryEmbedding) {
        String sql = String.format("""
                SELECT TOP :topK * FROM %s
                ORDER BY COSINE_SIMILARITY(EMBEDDING, TO_REAL_VECTOR(:queryEmbedding)) DESC
                """, tableName);

        return entityManager.createNativeQuery(sql, CricketWorldCup.class)
                .setParameter("topK", topK)
                .setParameter("queryEmbedding", queryEmbedding)
                .getResultList();
    }
}
  • 现在,创建一个 REST 控制器类 CricketWorldCupHanaController,并将 ChatModelVectorStore 作为依赖项自动装配。在这个控制器类中,创建以下 REST 端点

    • /ai/hana-vector-store/cricket-world-cup/purge-embeddings - 用于从向量存储中清除所有嵌入

    • /ai/hana-vector-store/cricket-world-cup/upload - 用于上传 Cricket_World_Cup.pdf,以便其数据作为嵌入存储在 SAP Hana Cloud 向量数据库中

    • /ai/hana-vector-store/cricket-world-cup - 用于使用 SAP Hana DB 中的余弦相似度 实现 RAG

package com.interviewpedia.spring.ai.hana;

import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.chat.prompt.SystemPromptTemplate;
import org.springframework.ai.document.Document;
import org.springframework.ai.reader.pdf.PagePdfDocumentReader;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.ai.vectorstore.HanaCloudVectorStore;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.core.io.Resource;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;

import java.io.IOException;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collectors;

@RestController
@Slf4j
public class CricketWorldCupHanaController {
    private final VectorStore hanaCloudVectorStore;
    private final ChatModel chatModel;

    @Autowired
    public CricketWorldCupHanaController(ChatModel chatModel, VectorStore hanaCloudVectorStore) {
        this.chatModel = chatModel;
        this.hanaCloudVectorStore = hanaCloudVectorStore;
    }

    @PostMapping("/ai/hana-vector-store/cricket-world-cup/purge-embeddings")
    public ResponseEntity<String> purgeEmbeddings() {
        int deleteCount = ((HanaCloudVectorStore) this.hanaCloudVectorStore).purgeEmbeddings();
        log.info("{} embeddings purged from CRICKET_WORLD_CUP table in Hana DB", deleteCount);
        return ResponseEntity.ok().body(String.format("%d embeddings purged from CRICKET_WORLD_CUP table in Hana DB", deleteCount));
    }

    @PostMapping("/ai/hana-vector-store/cricket-world-cup/upload")
    public ResponseEntity<String> handleFileUpload(@RequestParam("pdf") MultipartFile file) throws IOException {
        Resource pdf = file.getResource();
        Supplier<List<Document>> reader = new PagePdfDocumentReader(pdf);
        Function<List<Document>, List<Document>> splitter = new TokenTextSplitter();
        List<Document> documents = splitter.apply(reader.get());
        log.info("{} documents created from pdf file: {}", documents.size(), pdf.getFilename());
        hanaCloudVectorStore.accept(documents);
        return ResponseEntity.ok().body(String.format("%d documents created from pdf file: %s",
                documents.size(), pdf.getFilename()));
    }

    @GetMapping("/ai/hana-vector-store/cricket-world-cup")
    public Map<String, String> hanaVectorStoreSearch(@RequestParam(value = "message") String message) {
        var documents = this.hanaCloudVectorStore.similaritySearch(message);
        var inlined = documents.stream().map(Document::getContent).collect(Collectors.joining(System.lineSeparator()));
        var similarDocsMessage = new SystemPromptTemplate("Based on the following: {documents}")
                .createMessage(Map.of("documents", inlined));

        var userMessage = new UserMessage(message);
        Prompt prompt = new Prompt(List.of(similarDocsMessage, userMessage));
        String generation = chatModel.call(prompt).getResult().getOutput().getContent();
        log.info("Generation: {}", generation);
        return Map.of("generation", generation);
    }
}
  • 使用来自维基百科的 上下文 pdf 文件

转到 维基百科下载 Cricket World Cup 页面作为 PDF 文件。

wikipedia

使用我们在上一步中创建的文件上传 REST 端点上传此 PDF 文件。