3.5 Vectorize data¶
By leveraging an embedding model, such as Azure OpenAI's text-embedding-ada-002
, you can generate vector representations of textual data and store them in a vector store like Azure Database for PostgreSQL - Flexible Server. This approach facilitates efficient and accurate similarity searches, significantly enhancing the copilot's ability to retrieve relevant information and provide contextually rich interactions.
The azure_openai
schema installed by the azure_ai
extension contains the create_embeddings()
function. This function enables you to generate embeddings for text input by invoking an embedding model deployed in Azure OpenAI directly from a query.
Function signatures for the create_embeddings() function | |
---|---|
1 2 3 4 5 |
|
Learn more about the create_embeddings()
function, its overloads, and expected arguments in the function documentation.
Generate embeddings¶
The azure_ai
extension makes calling the Azure OpenAI embedding API trivial. In its simplest form, the create_embeddings()
function can be called with two arguments, deployment_name
and input
, as shown below:
SQL | |
---|---|
1 |
|
To demonstrate how to generate vector embeddings through a SQL query, execute the following query in pgAdmin.
-
Return to the open instance of pgAdmin on your local machine and ensure it is connected to your PostgreSQL database.
-
In the pgAdmin Object Explorer, expand databases under your PostgreSQL server.
-
Right-click the contracts database and select Query Tool from the context menu.
-
Run the following query, which creates a vector embedding for the
result
field in theinvoice_validation_results
table. Thedeployment_name
parameter in the function is set toembeddings
, which is the name of the deployment of thetext-embedding-ada-002
model in your Azure OpenAI service (it was created with that name by the Bicep deployment script):Execute the SQL command below!
SQL 1 2 3 4 5
SELECT invoice_id, azure_openai.create_embeddings('embeddings', result) AS embedding FROM invoice_validation_results LIMIT 1;
You can view the deployment name of your embedding model in Azure AI Foundry.
- Open Azure AI Foundry from the landing page of your Azure OpenAI service.
- In Azure AI Foundry, select the Deployments option from the resource navigation menu.
-
Observe the Name associated with the
text-embedding-ada-002
model.
-
The results of the query will look similar to this:
SQL 1 2 3
id | embedding ------+------------------------------------------------------------------------- 1 | {-0.031766646,-0.033289704,0.0009468119,...,0.016508864,0.031440277}
The 1,536 dimensions in the vector are abbreviated in the above output for brevity.
Vectorize existing data¶
You added vector
columns to the deliverables
, invoice_line_items
, invoice_validation_results
, sow_chunks
, and sow_validation_results
tables. You will now use the azure_openai.create_embeddings()
function in SQL UPDATE
statements to generate embeddings for the text data already in each table.
Each table query may take several minutes, depending on the configured TPM limits.
Using pgAdmin, execute the SQL statement for each table.
Select the tab for each table below and execute the UPDATE
statement to create embeddings for the specified column.
Copy and paste the following SQL statement into a new query window in pgAdmin, then execute the query.
SQL | |
---|---|
1 2 3 |
|
Copy and paste the following SQL statement into a new query window in pgAdmin, then execute the query.
SQL | |
---|---|
1 2 3 |
|
Copy and paste the following SQL statement into a new query window in pgAdmin, then execute the query.
SQL | |
---|---|
1 2 3 |
|
Copy and paste the following SQL statement into a new query window in pgAdmin, then execute the query.
SQL | |
---|---|
1 2 3 |
|
Copy and paste the following SQL statement into a new query window in pgAdmin, then execute the query.
SQL | |
---|---|
1 2 3 |
|
Be sure to run the UPDATE
statements for each table to create all the embeddings before moving on to the next step.
Generate Embeddings on Database INSERT
using a Trigger
The code for this application performs the calls to the azure_openai
extension to generate the vector embedding directly within the database INSERT
command. There are times where it's more useful to use an INSERT
trigger on the table. The trigger will automatically generate the embeddings anytime a row is inserted into table.
The following is an example of what an INSERT
trigger for generating vector embedding on the sow_chunks
table might look:
Example INSERT Trigger to Generate Embeddings | |
---|---|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
|