5.4 Enable the Chat Endpoint
In this step, you will review the backend API code for the `/chat` endpoint in the `completions` router. You will then add the `completions` router to the FastAPI application to make the `/chat` endpoint available.
Review Chat Endpoint Implementation
The Woodgrove Bank API exposes its endpoints in various routers. The chat endpoint resides in the `completions` router, defined in the `src/api/app/routers/completions.py` file. Open it now in VS Code and explore the code in sections. You can also expand the section below to see the code inline and review explanations for each line of code.
Chat endpoint code (`src/api/app/routers/completions.py`)
- **Import libraries (lines 1-7):** Required classes and functions are imported from various libraries.
- **Initialize the router (lines 10-15):** This initializes the `completions` router, assigning the route prefix, dependencies, and other metadata.
- **Define the chat endpoint (lines 17-23):** The `/chat` endpoint is the entry point into the Woodgrove Bank copilot implementation. It expects a `CompletionRequest`, which contains the user query, the chat history, and the maximum number of history messages to include in the prompt, and it returns a text response. A minimal sketch of this endpoint appears after this list.
    - It accepts POST requests from clients and extracts required parameters.
    - It invokes the `get_chat_completion` function with those parameters.
    - It returns the LLM's response to the client.
- **Chat endpoint implementation (lines 24-78):** The `/completions/chat` route maps to the endpoint where we invoke the Woodgrove Bank copilot implementation.
    - **Get the system prompt (line 27):** The system prompt defines the copilot's persona, providing instructions about how the copilot should behave, respond to questions, and interact with customers. It also provides guidance about the RAG design pattern and how function calls (tools) should be used when answering questions. You will look at this in detail in the Prompt Engineering step of this section.
    - **Build messages collection (lines 30-35):** The messages collection provides the LLM with the system and user prompts and chat history messages. Each message consists of a `role` and message `content`. The role will be `system`, `assistant`, or `user`. After the `system` message, all subsequent messages must be `user`/`assistant` pairs, with a user query followed by an assistant response.
    - **Build the model prompt (lines 38-45):** The LangChain `ChatPromptTemplate` class allows you to build a model prompt from a collection of messages. A sketch of the messages collection and prompt construction appears after this list.
        - The system prompt is added to provide instructions to the model.
        - The chat history is inserted as context about previous questions and responses.
        - The user input provides the model with the current question it is attempting to answer.
        - An agent scratchpad placeholder is included to allow responses from tools assigned to the agent to augment the model with grounding data.
        - The resulting prompt provides a structured input for the conversational AI agent, helping it to generate a response based on the given context.
    - **Implement function calling (lines 48-72):**
        - Line 48 instantiates the `ChatFunctions` class, which contains the methods for interacting with the PostgreSQL database. You can review the functions in the `src/api/app/functions/chat_functions.py` file.
        - The `tools` array created in lines 51-72 is the collection of functions available to the LangChain agent for performing retrieval operations to augment the model prompt during response generation.
        - Tools are created using the `StructuredTool.from_function` method provided by LangChain (see the tool sketch after this list).

          **About the LangChain `StructuredTool` class:** The `StructuredTool` class is a wrapper that allows LangChain agents to interact with functions. The `from_function` method creates a tool from the given function, describing the function using its input parameters and docstring description. To use it with async methods, you pass the async function to the `coroutine` input parameter.

          In Python, a docstring (short for documentation string) is a special type of string used to document a function, method, class, or module. It provides a convenient way of associating documentation with Python code and is typically enclosed within triple quotes (`"""` or `'''`). Docstrings are placed immediately after the definition of the function (or method, class, or module) they document.

          Using the `StructuredTool.from_function` method automates the creation of the JSON function definitions required by Azure OpenAI function calling, simplifying function calling when using LangChain.
    - **Create a LangChain agent (lines 75-76):** The LangChain agent is responsible for interacting with the LLM to generate a response (sketched after this list).
        - Using the `create_openai_functions_agent` method, a LangChain agent is instantiated. This agent handles function calling via the `tools` provided to the agent.

          **About the `create_openai_functions_agent` function:** The `create_openai_functions_agent` function in LangChain creates an agent that can call external functions to perform tasks using a specified language model and tools. This enables the integration of various services and functionalities into the agent's workflow, providing flexibility and enhanced capabilities.

        - LangChain's `AgentExecutor` class manages the agent's execution flow. It handles the processing of inputs, the invocation of tools or models, and the handling of outputs.

          **About LangChain's `AgentExecutor` class:** The `AgentExecutor` ensures that all the steps required to generate a response are executed in the correct order. It abstracts the complexities of execution for agents, providing an additional layer of functionality and structure, making it easier to build, manage, and scale sophisticated agents.
    - **Invoke the agent (line 77):** The agent executor's async `invoke` method sends the incoming user message and chat history to the LLM.
        - The `input` and `chat_history` tokens were defined in the prompt object created using the `ChatPromptTemplate`. The `invoke` method injects these into the model prompt, allowing the LLM to use that information when generating a response.
        - The LangChain agent uses the LLM to determine whether tool calls are necessary by evaluating the user query.
        - Any tools required to answer the question are called, and the model prompt is augmented with grounding data from their results to formulate the final response.
    - **Return the response (line 78):** The agent's completion response is returned to the user.
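To make the walkthrough above concrete, the sketches below reconstruct the key pieces in simplified form. First, the router and endpoint. This is a minimal sketch, not the actual file: the `CompletionRequest` field names (`message`, `chat_history`, `max_history`) and the router metadata are illustrative assumptions.

```python
# Minimal sketch of a completions router with a /chat endpoint.
# Field names on CompletionRequest are illustrative assumptions.
from fastapi import APIRouter
from pydantic import BaseModel

router = APIRouter(
    prefix="/completions",  # routes in this router start with /completions
    tags=["completions"],
)

class CompletionRequest(BaseModel):
    message: str                   # the user's current question (assumed name)
    chat_history: list = []        # prior user/assistant messages (assumed shape)
    max_history: int = 6           # max history messages in the prompt (assumed)

@router.post("/chat", response_model=str)
async def generate_chat_completion(request: CompletionRequest):
    """Entry point for the copilot: delegates to the chat completion logic."""
    # get_chat_completion is sketched further below.
    return await get_chat_completion(
        request.message, request.chat_history, request.max_history
    )
```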
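Next, a sketch of the messages collection and model prompt, assuming LangChain's `ChatPromptTemplate` and `MessagesPlaceholder`; the system prompt text and history entries are placeholders.

```python
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# Placeholder system prompt; the real one (line 27) defines the copilot persona.
system_prompt = "You are the Woodgrove Bank copilot. Use your tools to ground answers."

# Chat history as (role, content) pairs: after the system message, entries
# alternate user -> assistant.
chat_history = [
    ("user", "Which vendors do we work with?"),
    ("assistant", "Woodgrove Bank works with the following vendors: ..."),
]

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),                # persona and instructions
        MessagesPlaceholder("chat_history"),      # prior conversation context
        ("user", "{input}"),                      # the current user question
        MessagesPlaceholder("agent_scratchpad"),  # tool results for grounding
    ]
)
```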
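A sketch of tool creation follows. The `get_vendors` function is a hypothetical stand-in for a `ChatFunctions` method; it shows how `StructuredTool.from_function` derives the tool's name and description from the function and its docstring, and how async methods are passed via the `coroutine` parameter.

```python
from langchain_core.tools import StructuredTool

async def get_vendors() -> list:
    """Retrieves a list of vendors from the database."""
    # Hypothetical stand-in for a ChatFunctions method that queries PostgreSQL.
    return [{"id": 1, "name": "Adatum Corporation"}]

tools = [
    # from_function derives the tool's name, input schema, and description from
    # the function's signature and docstring; async methods go in `coroutine`.
    StructuredTool.from_function(coroutine=get_vendors),
]
```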
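Finally, a sketch of how agent creation and invocation might fit together in `get_chat_completion`, reusing the `prompt` and `tools` objects from the sketches above; the model settings are placeholders.

```python
from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain_openai import AzureChatOpenAI

# Placeholder model settings; assumes AZURE_OPENAI_ENDPOINT and
# AZURE_OPENAI_API_KEY are set in the environment.
llm = AzureChatOpenAI(azure_deployment="gpt-4o", api_version="2024-06-01")

async def get_chat_completion(message: str, chat_history: list, max_history: int) -> str:
    # The agent decides when to call tools; the executor runs the full loop:
    # prompt -> (optional tool calls) -> final response.
    agent = create_openai_functions_agent(llm=llm, tools=tools, prompt=prompt)
    agent_executor = AgentExecutor(agent=agent, tools=tools)

    # The async invoke (ainvoke) injects `input` and `chat_history` into the
    # prompt placeholders; tool outputs fill the agent scratchpad.
    completion = await agent_executor.ainvoke(
        {"input": message, "chat_history": chat_history[-max_history:]}
    )
    return completion["output"]
```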
Enable Chat Endpoint Calls
To enable the `/chat` endpoint to be called from the Woodgrove Bank Contract Management Portal, you will add the `completions` router to the FastAPI app.
- In the VS Code Explorer, navigate to the `src/api/app` folder and open the `main.py` file.
- Locate the block of code where the API endpoint routers are added (lines 44-56).
- Insert the following code at the start of that block (just below the `# Add routers to API endpoints` comment on line 43) to add the `completions/chat` endpoint to the exposed API.

    **Insert the code below onto line 43 of `main.py`!**

    ```python
    app.include_router(completions.router)
    ```
- The updated list of routers should look like this:

    ```python
    # Add routers to API endpoints
    app.include_router(completions.router)
    app.include_router(deliverables.router)
    app.include_router(documents.router)
    app.include_router(embeddings.router)
    app.include_router(invoices.router)
    app.include_router(invoice_line_items.router)
    app.include_router(milestones.router)
    app.include_router(sows.router)
    app.include_router(status.router)
    app.include_router(statuses.router)
    app.include_router(validation.router)
    app.include_router(validation_results.router)
    app.include_router(vendors.router)
    app.include_router(webhooks.router)
    ```
- Save the `main.py` file.
Congratulations! Your API is now enabled for copilot interactions!
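With the router registered and the API running, a quick client call can confirm the endpoint responds. This is a sketch: the host, port, and request body fields are assumptions about a local development setup, not values specified in this guide.

```python
import requests

# Assumed local address and request shape; adjust to match your environment
# and the actual CompletionRequest schema.
response = requests.post(
    "http://localhost:8000/completions/chat",
    json={"message": "Hello! What can you help me with?", "chat_history": []},
)
print(response.status_code, response.json())
```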