5.4 Enable the Chat Endpoint
In this step, you will review the backend API code for the /chat endpoint in the completions router. You will then add the completions router to the FastAPI application to make the /chat endpoint available.
Review Chat Endpoint Implementation
The Woodgrove Bank API exposes its endpoints through a collection of routers. The chat endpoint resides in the completions router, defined in the `src/api/app/routers/completions.py` file. Open it now in VS Code and explore the code in sections. The walkthrough below reviews the code section by section and explains what each part does.
Chat endpoint code: the full listing (78 lines) is in `src/api/app/routers/completions.py`; the line numbers referenced below refer to that file.
- Import libraries (lines 1-7): Required classes and functions are imported from various libraries.
- Initialize the router (lines 10-15): This code creates the `completions` router, assigning the route prefix, dependencies, and other metadata, along the lines of the sketch below.
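
    The actual prefix, dependencies, and metadata are defined in the file itself; as a rough sketch, the initialization follows the standard FastAPI pattern (the `tags` value here is an illustrative assumption):

    ```python
    from fastapi import APIRouter

    # The "/completions" prefix matches the route mapping described
    # below; tags and dependencies are illustrative placeholders.
    router = APIRouter(
        prefix="/completions",
        tags=["Completions"],
    )
    ```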
- Define the chat endpoint (lines 17-23): The `/chat` endpoint is the entry point into the Woodgrove Bank copilot implementation (sketched below). It expects a `CompletionRequest`, which contains the user query, the chat history, and the maximum number of history messages to include in the prompt, and returns a text response.
    - It accepts POST requests from clients and extracts the required parameters.
    - It invokes the `get_chat_completion` function with those parameters.
    - It returns the LLM's response to the client.
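
    As a minimal sketch of such an endpoint (the field names on `CompletionRequest` and the handler name are assumptions for illustration; the actual definitions live in the API source):

    ```python
    from pydantic import BaseModel

    class CompletionRequest(BaseModel):
        """Illustrative request shape; real field names may differ."""
        message: str                   # the user query
        chat_history: list[dict] = []  # prior user/assistant messages
        max_history: int = 6           # max history messages to include

    @router.post("/chat", response_model=str)
    async def generate_chat_completion(request: CompletionRequest):
        # Delegate to the chat completion logic described below.
        return await get_chat_completion(request)
    ```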
- Chat endpoint implementation (lines 24-78): The `/completions/chat` route maps to the endpoint that invokes the Woodgrove Bank copilot implementation.
    - Get the system prompt (line 27): The system prompt defines the copilot's persona, providing instructions about how the copilot should behave, respond to questions, and interact with customers. It also provides guidance about the RAG design pattern and how function calls (tools) should be used when answering questions. You will look at this in detail in the Prompt Engineering step of this section.
    - Build messages collection (lines 30-35): The messages collection provides the LLM with system and user prompts and chat history messages. Each message consists of a `role` and message `content`. The role will be `system`, `assistant`, or `user`. After the `system` message, all subsequent messages must be `user`/`assistant` pairs, with a user query followed by an assistant response, as in the example below.
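
        For example, a messages collection might look like this (the history values are illustrative):

        ```python
        messages = [
            {"role": "system", "content": system_prompt},
            # History consists of user/assistant pairs, oldest first...
            {"role": "user", "content": "What vendors do we work with?"},
            {"role": "assistant", "content": "Woodgrove Bank works with the following vendors: ..."},
            # ...ending with the current user query.
            {"role": "user", "content": request.message},
        ]
        ```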
    - Build the model prompt (lines 38-45): The LangChain `ChatPromptTemplate` class allows you to build a model prompt from a collection of messages (see the sketch below).
        - The system prompt is added to provide instructions to the model.
        - The chat history is inserted as context about previous questions and responses.
        - The user input provides the model with the current question it is attempting to answer.
        - An agent scratchpad placeholder is included to allow responses from tools assigned to the agent to augment the model with grounding data.
        - The resulting prompt provides a structured input for the conversational AI agent, helping it to generate a response based on the given context.
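
        A common LangChain pattern matching this description (the placeholder names `chat_history`, `input`, and `agent_scratchpad` correspond to the tokens referenced later in this walkthrough):

        ```python
        from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

        prompt = ChatPromptTemplate.from_messages([
            ("system", system_prompt),                              # model instructions
            MessagesPlaceholder(variable_name="chat_history"),      # prior conversation
            ("user", "{input}"),                                    # current question
            MessagesPlaceholder(variable_name="agent_scratchpad"),  # tool call results
        ])
        ```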
    - Implement function calling (lines 48-72):
        - Line 48 instantiates the `ChatFunctions` class, which contains the methods for interacting with the PostgreSQL database. You can review the functions in the `src/api/app/functions/chat_functions.py` file.
        - The `tools` array created in lines 51-72 is the collection of functions available to the LangChain agent for performing retrieval operations to augment the model prompt during response generation.
        - Tools are created using the `StructuredTool.from_function` method provided by LangChain, as in the sketch below.

            About the LangChain `StructuredTool` class

            The `StructuredTool` class is a wrapper that allows LangChain agents to interact with functions. The `from_function` method creates a tool from the given function, describing the function using its input parameters and docstring description. To use it with async methods, you pass the function's name to the `coroutine` input parameter.

            In Python, a docstring (short for documentation string) is a special type of string used to document a function, method, class, or module. It provides a convenient way of associating documentation with Python code and is typically enclosed within triple quotes (""" or '''). Docstrings are placed immediately after the definition of the function (or method, class, or module) they document.

            Using the `StructuredTool.from_function` method automates the creation of the JSON function definitions required by Azure OpenAI function calling methods, simplifying function calling when using LangChain.
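
            As a hedged sketch, with a hypothetical async function standing in for one of the `ChatFunctions` methods:

            ```python
            from langchain_core.tools import StructuredTool

            async def get_invoices(vendor_id: int):
                """Retrieves a list of invoices for the specified vendor."""
                ...  # query PostgreSQL and return the results

            # The parameters and docstring describe the tool to the model;
            # the async function is passed via the coroutine parameter.
            tools = [
                StructuredTool.from_function(coroutine=get_invoices),
            ]
            ```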
    - Create a LangChain agent (lines 75-76): The LangChain agent is responsible for interacting with the LLM to generate a response.
        - Using the `create_openai_functions_agent` method, a LangChain agent is instantiated (see the sketch below). This agent handles function calling via the `tools` provided to the agent.

            About the `create_openai_functions_agent` function

            The `create_openai_functions_agent` function in LangChain creates an agent that can call external functions to perform tasks using a specified language model and tools. This enables the integration of various services and functionalities into the agent's workflow, providing flexibility and enhanced capabilities.
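
            A sketch of the agent creation, assuming `llm` is the chat model (for example, an `AzureChatOpenAI` instance) and `prompt` and `tools` are the objects built in the earlier steps:

            ```python
            from langchain.agents import create_openai_functions_agent

            agent = create_openai_functions_agent(llm=llm, tools=tools, prompt=prompt)
            ```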
        - LangChain's `AgentExecutor` class manages the agent's execution flow (see the sketch below). It handles the processing of inputs, the invocation of tools or models, and the handling of outputs.

            About LangChain's `AgentExecutor`

            The `AgentExecutor` ensures that all the steps required to generate a response are executed in the correct order. It abstracts the complexities of execution for agents, providing an additional layer of functionality and structure, and making it easier to build, manage, and scale sophisticated agents.
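
            Continuing the sketch:

            ```python
            from langchain.agents import AgentExecutor

            # Wraps the agent and its tools so each invocation runs the
            # full tool-calling loop.
            agent_executor = AgentExecutor(agent=agent, tools=tools)
            ```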
    - Invoke the agent (line 77): The agent executor's async `invoke` method sends the incoming user message and chat history to the LLM, as sketched below.
        - The `input` and `chat_history` tokens were defined in the prompt object created using the `ChatPromptTemplate`. The `invoke` method injects these into the model prompt, allowing the LLM to use that information when generating a response.
        - The LangChain agent uses the LLM to determine whether tool calls are necessary by evaluating the user query.
        - Any tools required to answer the question are called, and the model prompt is augmented with grounding data from their results to formulate the final response.
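
        A sketch of the invocation and response handling, assuming the async `ainvoke` variant and the names from the earlier sketches (`AgentExecutor` returns its result under the `output` key):

        ```python
        # Inject the user query and chat history into the prompt's
        # "input" and "chat_history" tokens, then run the agent.
        completion = await agent_executor.ainvoke({
            "input": request.message,
            "chat_history": request.chat_history,
        })
        return completion["output"]
        ```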
    - Return the response (line 78): The agent's completion response is returned to the user.
Enable Chat Endpoint Calls
To enable the /chat endpoint to be called from the Woodgrove Bank Contract Management Portal, you will add the completions router to the FastAPI app.
- In the VS Code Explorer, navigate to the `src/api/app` folder and open the `main.py` file.
- Locate the block of code where the API endpoint routers are added (lines 44-56).
- Insert the following code at the start of that block (just below the `# Add routers to API endpoints` comment on line 43) to add the `completions/chat` endpoint to the exposed API.

    Insert the code below onto line 43 of `main.py`!

    ```python
    app.include_router(completions.router)
    ```
- The updated list of routers should look like this:

    ```python
    # Add routers to API endpoints
    app.include_router(completions.router)
    app.include_router(deliverables.router)
    app.include_router(documents.router)
    app.include_router(embeddings.router)
    app.include_router(invoices.router)
    app.include_router(invoice_line_items.router)
    app.include_router(milestones.router)
    app.include_router(sows.router)
    app.include_router(status.router)
    app.include_router(statuses.router)
    app.include_router(validation.router)
    app.include_router(validation_results.router)
    app.include_router(vendors.router)
    app.include_router(webhooks.router)
    ```
- Save the `main.py` file.
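
Once the API is running, you can optionally verify the new route with a quick request; for example (the port and payload shape here are assumptions based on the sketches above):

```python
import requests

# Call the newly exposed /completions/chat endpoint.
response = requests.post(
    "http://localhost:8000/completions/chat",
    json={"message": "What vendors do we work with?", "chat_history": []},
)
print(response.text)
```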
Congratulations! Your API is now enabled for copilot interactions!