
5.4 Enable the Chat Endpoint

In this step, you will review the backend API code for the /chat endpoint in the completions router. You will then add the completions router to the FastAPI application to make the /chat endpoint available.

Review Chat Endpoint Implementation

The Woodgrove Bank API groups its endpoints into several routers. The chat endpoint resides in the completions router, defined in the src/api/app/routers/completions.py file. Open that file now in VS Code and explore the code in sections. You can also expand the section below to see the code inline and review explanations for each line of code.

Chat endpoint code
src/api/app/routers/completions.py
 1  from app.functions.chat_functions import ChatFunctions
 2  from app.lifespan_manager import get_chat_client, get_db_connection_pool, get_embedding_client, get_prompt_service
 3  from app.models import CompletionRequest
 4  from fastapi import APIRouter, Depends
 5  from langchain.agents import AgentExecutor, create_openai_functions_agent
 6  from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
 7  from langchain_core.tools import StructuredTool
 8
 9  # Initialize the router
10  router = APIRouter(
11      prefix = "/completions",
12      tags = ["Completions"],
13      dependencies = [Depends(get_chat_client)],
14      responses = {404: {"description": "Not found"}}
15  )
16
17  @router.post('/chat', response_model = str)
18  async def generate_chat_completion(
19      request: CompletionRequest,
20      llm = Depends(get_chat_client),
21      db_pool = Depends(get_db_connection_pool),
22      embedding_client = Depends(get_embedding_client),
23      prompt_service = Depends(get_prompt_service)):
24      """Generate a chat completion using the Azure OpenAI API."""
25
26      # Retrieve the copilot prompt
27      system_prompt = prompt_service.get_prompt("copilot")
28
29      # Provide the copilot with a persona using the system prompt.
30      messages = [{ "role": "system", "content": system_prompt }]
31
32      # Add the chat history to the messages list
33      # Chat history provides context of previous questions and responses for the copilot.
34      for message in request.chat_history[-request.max_history:]:
35          messages.append({"role": message.role, "content": message.content})
36
37      # Create a chat prompt template
38      prompt = ChatPromptTemplate.from_messages(
39          [
40              ("system", system_prompt),
41              MessagesPlaceholder("chat_history", optional=True),
42              ("user", "{input}"),
43              MessagesPlaceholder("agent_scratchpad")
44          ]
45      )
46
47      # Get the chat functions
48      cf = ChatFunctions(db_pool, embedding_client)
49
50      # Define tools for the agent to retrieve data from the database
51      tools = [
52          # Hybrid search functions
53          StructuredTool.from_function(coroutine=cf.find_invoice_line_items),
54          StructuredTool.from_function(coroutine=cf.find_invoice_validation_results),
55          StructuredTool.from_function(coroutine=cf.find_milestone_deliverables),
56          StructuredTool.from_function(coroutine=cf.find_sow_chunks),
57          StructuredTool.from_function(coroutine=cf.find_sow_validation_results),
58          # Get invoice data functions
59          StructuredTool.from_function(coroutine=cf.get_invoice_id),
60          StructuredTool.from_function(coroutine=cf.get_invoice_line_items),
61          StructuredTool.from_function(coroutine=cf.get_invoice_validation_results),
62          StructuredTool.from_function(coroutine=cf.get_invoices),
63          # Get SOW data functions
64          StructuredTool.from_function(coroutine=cf.get_sow_chunks),
65          StructuredTool.from_function(coroutine=cf.get_sow_id),
66          StructuredTool.from_function(coroutine=cf.get_sow_milestones),
67          StructuredTool.from_function(coroutine=cf.get_milestone_deliverables),
68          StructuredTool.from_function(coroutine=cf.get_sow_validation_results),
69          StructuredTool.from_function(coroutine=cf.get_sows),
70          # Get vendor data functions
71          StructuredTool.from_function(coroutine=cf.get_vendors)
72      ]
73
74      # Create an agent
75      agent = create_openai_functions_agent(llm=llm, tools=tools, prompt=prompt)
76      agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, return_intermediate_steps=True)
77      completion = await agent_executor.ainvoke({"input": request.message, "chat_history": messages})
78      return completion['output']
  1. Import libraries (lines 1-7): Required classes and functions are imported from various libraries.

  2. Initialize the router (lines 10-15): The completions router is created here, assigning the route prefix, dependencies, and other metadata.

  3. Define the chat endpoint (lines 17-23): The /chat endpoint is the entry point into the Woodgove Bank copilot implementation. It expects a CompletionRequest, which contains the user query, the chat history, and the maximum number of history messages to include in the prompt, and returns a text response.

    • It accepts POST requests from clients and extracts required parameters.
    • It builds a model prompt and invokes a LangChain agent with those parameters.
    • It returns the LLM's response to the client.
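
    The CompletionRequest model is defined in src/api/app/models. Its exact definition may differ, but based on how the endpoint uses it (request.message, request.chat_history, request.max_history), its shape is roughly the following sketch:

    Python
    from pydantic import BaseModel

    class Message(BaseModel):
        role: str      # "user" or "assistant"
        content: str

    class CompletionRequest(BaseModel):
        message: str                  # the user's current question
        chat_history: list[Message]   # prior conversation turns
        max_history: int              # number of history messages to include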
  4. Chat endpoint implementation (lines 24-78): The router's /completions prefix combines with the endpoint's /chat path, so the /completions/chat route maps to this function, which implements the Woodgrove Bank copilot.

    • Get the system prompt (line 27): The system prompt defines the copilot's persona, providing instructions about how the copilot should behave, respond to questions, and interact with customers. It also provides guidance about the RAG design pattern and how function calls (tools) should be used when answering questions. You will look at this in detail in the Prompt Engineering step of this section.

    • Build messages collection (lines 30-35): The messages collection provides the LLM with the system prompt and the chat history. Each message consists of a role and message content. The role will be system, assistant, or user. After the system message, all subsequent messages must be user/assistant pairs, with each user query followed by an assistant response.
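
      For example, with a two-message history, the resulting messages list might look like this (illustrative values only):

      Python
      messages = [
          {"role": "system", "content": "<copilot system prompt>"},
          {"role": "user", "content": "Show me the invoices for Adatum."},
          {"role": "assistant", "content": "Here are the invoices for Adatum: ..."},
          {"role": "user", "content": "Are any of them failing validation?"}
      ]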

    • Build the model prompt (lines 38-45): The LangChain ChatPromptTemplate class allows you to build a model prompt from a collection of messages.

      • The system prompt is added to provide instructions to the model.
      • The chat history is inserted as context about previous questions and responses.
      • The user input provides the model with the current question it is attempting to answer.
      • An agent scratchpad placeholder is included to allow responses from tools assigned to the agent to augment the model with grounding data.
      • The resulting prompt provides a structured input for the conversational AI agent, helping it to generate a response based on the given context.
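
      To see how the placeholders resolve, you can render the template outside the agent. The following is a minimal sketch with hypothetical sample values; exact message conversion behavior depends on your langchain_core version:

      Python
      from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

      prompt = ChatPromptTemplate.from_messages([
          ("system", "You are the Woodgrove Bank copilot."),
          MessagesPlaceholder("chat_history", optional=True),
          ("user", "{input}"),
          MessagesPlaceholder("agent_scratchpad")
      ])

      # Render with sample values to inspect the structured input sent to the model.
      rendered = prompt.format_messages(
          input="Which SOWs are missing deliverables?",
          chat_history=[("user", "Hi"), ("assistant", "Hello! How can I help?")],
          agent_scratchpad=[]  # populated by the agent with tool outputs at run time
      )
      for m in rendered:
          print(m.type, ":", m.content)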
    • Implement function calling (lines 48-72):

      • Line 48 instantiates the ChatFunctions class, which contains the methods for interacting with the PostgreSQL database. You can review the functions in the src/api/app/functions/chat_functions.py file.
      • The tools array created in lines 51-72 is the collection of functions available to the LangChain agent for performing retrieval operations to augment the model prompt during response generation.
      • Tools are created using the StructuredTool.from_function method provided by LangChain.

        About the LangChain StructuredTool class

        The StructuredTool class is a wrapper that allows LangChain agents to interact with functions. The from_function method creates a tool from the given function, describing the function using its input parameters and docstring description. To use it with async methods, you pass the async function itself to the coroutine parameter.

        In Python, a docstring (short for documentation string) is a special type of string used to document a function, method, class, or module. It provides a convenient way of associating documentation with Python code and is typically enclosed within triple quotes (""" or '''). Docstrings are placed immediately after the definition of the function (or method, class, or module) they document.

        Using the StructuredTool.from_function method automates the creation of the JSON function definitions required by Azure OpenAI function calling methods, simplifying function calling when using LangChain.
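
        As an illustration, here is how a simplified, hypothetical async function would become a tool (the real functions live in chat_functions.py):

        Python
        from langchain_core.tools import StructuredTool

        async def get_invoices(vendor_id: int) -> list[dict]:
            """Retrieves a list of invoices for the specified vendor."""
            ...  # database query elided

        tool = StructuredTool.from_function(coroutine=get_invoices)
        print(tool.name)         # "get_invoices", taken from the function name
        print(tool.description)  # taken from the docstring
        print(tool.args)         # argument schema derived from the type hints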

    • Create a LangChain agent (lines 75-76): The LangChain agent is responsible for interacting with the LLM to generate a response.

      • Using the create_openai_functions_agent method, a LangChain agent is instantiated. This agent handles function calling via the tools provided to the agent.

        About the create_openai_functions_agent function

        The create_openai_functions_agent function in LangChain creates an agent that can call external functions to perform tasks using a specified language model and tools. This enables the integration of various services and functionalities into the agent's workflow, providing flexibility and enhanced capabilities.

      • LangChain's AgentExecutor class manages the agent's execution flow. It handles the processing of inputs, the invocation of tools or models, and the handling of outputs.

        About LangChain's AgentExecutor

        The AgentExecutor ensures that all the steps required to generate a response are executed in the correct order. It abstracts the complexities of execution for agents, providing an additional layer of functionality and structure, and making it easier to build, manage, and scale sophisticated agents.
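
        Because the executor in the listing above is created with return_intermediate_steps=True, the dictionary returned by ainvoke also includes every tool call the agent made, which is handy for debugging. A minimal sketch, assuming the agent_executor from the listing and an async context:

        Python
        # Inside an async function, such as the /chat endpoint itself:
        result = await agent_executor.ainvoke({"input": "List all vendors.", "chat_history": []})
        print(result["output"])  # the final text response
        for action, observation in result["intermediate_steps"]:
            print(action.tool, "->", observation)  # each tool call and the data it returned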

    • Invoke the agent (line 77): The agent executor's asynchronous ainvoke method sends the incoming user message and chat history to the LLM.

      • The input and chat_history placeholders were defined in the prompt object created using the ChatPromptTemplate. The ainvoke method injects these values into the model prompt, allowing the LLM to use that information when generating a response.
      • The LangChain agent uses the LLM to determine if tool calls are necessary by evaluating the user query.
      • Any tools required to answer the question are called, and the model prompt is augmented with grounding data from their results to formulate the final response.
    • Return the response (line 78): The agent's completion response is returned to the user.

Enable Chat Endpoint Calls

To enable the /chat endpoint to be called from the Woodgrove Bank Contract Management Portal, you will add the completions router to the FastAPI app.

  1. In the VS Code Explorer, navigate to the src/api/app folder and open the main.py file.

  2. Locate the block of code where the API endpoint routers are added (lines 44-56).

  3. Insert the following code at the start of that block (just below the # Add routers to API endpoints comment on line 43) to add the completions/chat endpoint to the exposed API.

    Insert the code below as line 44 of main.py, directly below the comment on line 43!

    Python
    app.include_router(completions.router)
    
  4. The updated list of routers should look like this:

    Python
    # Add routers to API endpoints
    app.include_router(completions.router)
    app.include_router(deliverables.router)
    app.include_router(documents.router)
    app.include_router(embeddings.router)
    app.include_router(invoices.router)
    app.include_router(invoice_line_items.router)
    app.include_router(milestones.router)
    app.include_router(sows.router)
    app.include_router(status.router)
    app.include_router(statuses.router)
    app.include_router(validation.router)
    app.include_router(validation_results.router)
    app.include_router(vendors.router)
    app.include_router(webhooks.router)
    
  5. Save the main.py file.
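
  6. Optionally, once the API is running, verify the new endpoint with a quick client call. The sketch below is illustrative: the host and port depend on how you run the API locally, and the payload field names follow the CompletionRequest usage shown earlier.

    Python
    import requests

    payload = {
        "message": "What vendors have active SOWs?",
        "chat_history": [],  # list of {role, content} messages from prior turns
        "max_history": 6
    }
    # Assumes the API is listening on localhost:8000; adjust for your environment.
    response = requests.post("http://localhost:8000/completions/chat", json=payload)
    print(response.text)  # the endpoint returns a plain string completion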


Congratulations! Your API is now enabled for copilot interactions!