
Log prompts and chains

You can log your prompts and chains to Comet using the open-source comet-llm Python SDK.

The LLM SDK also allows you to log a user feedback score using the LLM API.

Install and configure the LLM SDK

To install and configure the LLM SDK, you can run the following:

  1. Install the SDK:
    pip install comet_llm
    
  2. Configure the SDK:
    import comet_llm
    
    comet_llm.init()
    

You can find a full guide on how to configure the SDK here.
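
If you prefer to skip the interactive prompt, you can pass your credentials to comet_llm.init() directly. The snippet below is a minimal sketch; the api_key, workspace, and project parameter names are assumptions based on the SDK's configuration documentation, so check the guide linked above for your version:

import comet_llm

# Configure the SDK non-interactively (parameter names assumed from the
# comet_llm configuration documentation).
comet_llm.init(
    api_key="YOUR-COMET-API-KEY",
    workspace="YOUR-WORKSPACE",
    project="YOUR-LLM-PROJECT",
)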

Note

If you are using OpenAI or LangChain, you don't need to log each prompt or chain manually as we have dedicated integrations with these frameworks. You can read more about the OpenAI integration here and the LangChain integration here.

Log prompts

The LLM SDK supports logging a prompt together with its response, as well as any associated metadata such as token usage. This can be achieved through the log_prompt function:

import comet_llm

comet_llm.log_prompt(
    prompt="Answer the question and if the question can't be answered, say \"I don't know\"\n\n---\n\nQuestion: What is your name?\nAnswer:",
    prompt_template="Answer the question and if the question can't be answered, say \"I don't know\"\n\n---\n\nQuestion: {{question}}?\nAnswer:",
    prompt_template_variables={"question": "What is your name?"},
    metadata={
        "usage.prompt_tokens": 7,
        "usage.completion_tokens": 5,
        "usage.total_tokens": 12,
    },
    output=" My name is Alex.",
    duration=16.598,
)
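
log_prompt also returns a handle to the logged prompt. Its id attribute can be used later to retrieve the trace, for example to attach a user feedback score as shown further down this page:

logged_prompt = comet_llm.log_prompt(
    prompt="What is your name?",
    output="My name is Alex.",
)

# The id of the logged prompt, used later to retrieve the trace.
print(logged_prompt.id)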

Log chains

The LLM SDK supports logging a chain of executions that may include more than one LLM call, context retrieval, or data pre- or post-processing.

First, start a chain with its inputs:

import comet_llm

comet_llm.start_chain({"user_question": user_question})

For each step in the chain, you can create a Span object. The Span object keeps track of the inputs, outputs, and duration of the step. You can have as many Spans as needed, and they can be nested within each other. Here is a very simple example:

with comet_llm.Span(
    category="YOUR-SPAN-CATEGORY", # You can use any string here
    inputs=INPUTS, # A dict; every value must be JSON-serializable
) as span:
    YOUR_CODE_HERE

    span.set_outputs(outputs=OUTPUTS) # A dict; every value must be JSON-serializable

Here is a more realistic example that includes nested Spans:

def retrieve_context(user_question):
    # Retrieve the context
    with comet_llm.Span(
        category="context-retrieval",
        name="Retrieve Context",
        inputs={"user_question": user_question},
    ) as context_span:
        context = get_context(user_question)

        context_span.set_outputs(outputs={"context": context})

    return context


def llm_call(user_question, context):
    prompt_template = """You are a helpful chatbot. You have access to the following context:
    {context}
    Analyze the following user question and decide if you can answer it, if the question can't be answered, say \"I don't know\":
    {user_question}
    """

    prompt = prompt_template.format(user_question=user_question, context=context)

    with comet_llm.Span(
        category="llm-call",
        inputs={"prompt_template": prompt_template, "prompt": prompt},
    ) as llm_span:
        # Call your LLM model here
        result = "Yes we are currently open"
        usage = {"prompt_tokens": 52, "completion_tokens": 12, "total_tokens": 64}

        llm_span.set_outputs(outputs={"result": result}, metadata={"usage": usage})

    return result


with comet_llm.Span(
    category="llm-reasoning",
    inputs={
        "user_question": user_question,
    },
) as span:
    context = retrieve_context(user_question)

    result = llm_call(user_question, context)

    span.set_outputs(outputs={"result": result})

Finally, end your chain and upload it with:

comet_llm.end_chain(outputs={"result": result})
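
Putting the pieces together, a minimal end-to-end chain could look like the sketch below. The get_context helper and the hard-coded LLM response are placeholders for your own retrieval and model call:

import comet_llm

comet_llm.init()


def get_context(user_question):
    # Placeholder retrieval step - replace with your own logic.
    return "The store is open from 9am to 5pm."


user_question = "Are you currently open?"

comet_llm.start_chain({"user_question": user_question})

with comet_llm.Span(
    category="context-retrieval",
    inputs={"user_question": user_question},
) as context_span:
    context = get_context(user_question)
    context_span.set_outputs(outputs={"context": context})

with comet_llm.Span(
    category="llm-call",
    inputs={"user_question": user_question, "context": context},
) as llm_span:
    result = "Yes, we are currently open."  # Replace with an actual LLM call
    llm_span.set_outputs(outputs={"result": result})

comet_llm.end_chain(outputs={"result": result})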


Log user feedback score

There are two ways to log a user feedback score to Comet:

  1. Using the log_user_feedback method
  2. By logging your own metadata attribute

The benefit of using the log_user_feedback method is that Comet will display the average of this score when you group prompts. In addition, you can update this score in the Prompt Table using the thumbs up / thumbs down feature.

However, since Comet only supports 0 and 1 as valid feedback scores, in some scenarios you might want to log the feedback score as a metadata attribute instead.

Log user feedback score with log_user_feedback

To use the log_user_feedback method, first retrieve the prompt and then log the score:

import comet_llm

comet_llm.init()

# Log a prompt
logged_prompt = comet_llm.log_prompt(
    prompt="This is a test prompt",
    output="This is a test response"
)

# Retrieve the prompt and add the user-feedback score
api = comet_llm.API()

llm_trace = api.get_llm_trace_by_key(logged_prompt.id)
llm_trace.log_user_feedback(1)

Log user feedback score as metadata

You can log a user feedback score as metadata either when you log the prompt or at a later time. The example below logs the score after the prompt has been logged:

import comet_llm

comet_llm.init()

# Log a prompt
logged_prompt = comet_llm.log_prompt(
    prompt="This is a test prompt",
    output="This is a test response",
    metadata={"key": "value"}
)

# Retrieve the prompt and add the user-feedback score
api = comet_llm.API()

llm_trace = api.get_llm_trace_by_key(logged_prompt.id)
llm_trace.log_metadata({
    "custom_feedback_score": 1
})
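
Since log_prompt accepts a metadata dictionary, you can also attach the score directly at logging time rather than updating the trace afterwards. A minimal sketch using the same custom_feedback_score key:

import comet_llm

comet_llm.init()

# Log the feedback score as part of the prompt metadata at logging time.
comet_llm.log_prompt(
    prompt="This is a test prompt",
    output="This is a test response",
    metadata={"custom_feedback_score": 1},
)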