run_simulation
==============

.. currentmodule:: opik.simulation

.. autofunction:: run_simulation

Description
-----------

The ``run_simulation`` function orchestrates multi-turn conversation simulations between a simulated user and your application. It manages the conversation flow, tracks traces, and returns comprehensive results for evaluation.

Key Features
------------

- **Multi-turn conversations**: Runs conversations for a specified number of turns
- **Automatic tracing**: Automatically decorates your app with ``@track`` if not already decorated
- **Thread management**: Groups all traces from a simulation under a single thread ID
- **Error handling**: Gracefully handles errors and continues simulation
- **Flexible configuration**: Supports custom parameters and metadata

Function Signature
------------------

.. code-block:: python

   run_simulation(
       app: Callable,
       user_simulator: SimulatedUser,
       initial_message: Optional[str] = None,
       max_turns: int = 5,
       thread_id: Optional[str] = None,
       project_name: Optional[str] = None,
       **app_kwargs: Any
   ) -> Dict[str, Any]

Parameters
----------

**app** (Callable)
   Your application function that processes messages. Must have signature:
   ``app(message: str, *, thread_id: str, **kwargs) -> Dict[str, str]``
   
   The function will be automatically decorated with ``@track`` if not already decorated.

**user_simulator** (SimulatedUser)
   Instance of ``SimulatedUser`` that generates user responses.

**initial_message** (str, optional)
   Optional initial message from the user. If ``None``, the simulator will generate one.

**max_turns** (int, optional)
   Maximum number of conversation turns. Defaults to 5.

**thread_id** (str, optional)
   Thread ID for grouping traces. If ``None``, a new ID will be generated.

**project_name** (str, optional)
   Project name for trace logging. Included in trace metadata.

**app_kwargs** (Any)
   Additional keyword arguments passed to the app function.

Returns
-------

**Dict[str, Any]**
   Dictionary containing:
   
   - **thread_id** (str): The thread ID used for this simulation
   - **conversation_history** (List[Dict[str, str]]): Complete conversation as message dictionaries
   - **project_name** (str, optional): Project name if provided

App Function Requirements
-------------------------

Your app function must follow this signature:

.. code-block:: python

   def my_app(user_message: str, *, thread_id: str, **kwargs) -> Dict[str, str]:
       # Process the user message
       # Manage conversation history internally using thread_id
       # Return assistant response as message dict
       return {"role": "assistant", "content": "Your response"}

**Key Requirements:**

1. **First parameter**: Must accept the user message as a string
2. **thread_id parameter**: Must accept thread_id as a keyword-only argument
3. **Return format**: Must return a dictionary with 'role' and 'content' keys
4. **History management**: Your app is responsible for managing conversation history internally

Examples
--------

Basic Usage
~~~~~~~~~~~

.. code-block:: python

   from opik.simulation import SimulatedUser, run_simulation
   from opik import track

   # Create a simulated user
   user_simulator = SimulatedUser(
       persona="You are a customer who wants help with a product",
       model="openai/gpt-4o-mini"
   )

   # Define your agent with conversation history management
   agent_history = {}

   @track
   def customer_service_agent(user_message: str, *, thread_id: str, **kwargs):
       if thread_id not in agent_history:
           agent_history[thread_id] = []
       
       # Add user message to history
       agent_history[thread_id].append({"role": "user", "content": user_message})
       
       # Process with full conversation context
       messages = agent_history[thread_id]
       
       # Your agent logic here (e.g., call LLM)
       response = "I can help you with that. What specific issue are you experiencing?"
       
       # Add assistant response to history
       agent_history[thread_id].append({"role": "assistant", "content": response})
       
       return {"role": "assistant", "content": response}

   # Run the simulation
   simulation = run_simulation(
       app=customer_service_agent,
       user_simulator=user_simulator,
       max_turns=5,
       project_name="customer_service_evaluation"
   )

   print(f"Thread ID: {simulation['thread_id']}")
   print(f"Conversation length: {len(simulation['conversation_history'])}")

Custom Initial Message
~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   # Start with a specific initial message
   simulation = run_simulation(
       app=customer_service_agent,
       user_simulator=user_simulator,
       initial_message="I'm having trouble with my order",
       max_turns=3
   )

Custom Thread ID
~~~~~~~~~~~~~~~~

.. code-block:: python

   # Use a custom thread ID for easier tracking
   custom_thread_id = "simulation_test_001"
   
   simulation = run_simulation(
       app=customer_service_agent,
       user_simulator=user_simulator,
       thread_id=custom_thread_id,
       max_turns=5
   )

Multiple Simulations
~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   # Run multiple simulations with different personas
   personas = [
       "You are a frustrated customer who wants a refund",
       "You are a happy customer who wants to buy more",
       "You are a confused user who needs help with setup"
   ]

   simulations = []
   for i, persona in enumerate(personas):
       simulator = SimulatedUser(persona=persona)
       simulation = run_simulation(
           app=customer_service_agent,
           user_simulator=simulator,
           max_turns=5,
           project_name="multi_persona_evaluation"
       )
       simulations.append(simulation)
       print(f"Simulation {i+1} completed: {simulation['thread_id']}")

Integration with Evaluation
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   from opik.evaluation import evaluate_threads
   from opik.evaluation.metrics import ConversationThreadMetric

   # Run simulations
   simulation = run_simulation(
       app=customer_service_agent,
       user_simulator=user_simulator,
       max_turns=5,
       project_name="evaluation_test"
   )

   # Evaluate the simulation thread
   results = evaluate_threads(
       project_name="evaluation_test",
       filter_string=f'thread_id = "{simulation["thread_id"]}"',
       metrics=[ConversationThreadMetric()]
   )

Advanced Usage with Tags
~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   # Add custom tags and metadata to traces
   simulation = run_simulation(
       app=customer_service_agent,
       user_simulator=user_simulator,
       max_turns=5,
       project_name="tagged_simulation",
       simulation_id="test_001",  # Custom parameter
       tags=["simulation", "customer_service"]  # Custom parameter
   )

   # Your app can access these parameters
   @track
   def tagged_agent(user_message: str, *, thread_id: str, simulation_id: str = None, tags: List[str] = None, **kwargs):
       # Use simulation_id and tags for custom logic
       if simulation_id:
           print(f"Running simulation: {simulation_id}")
       
       return {"role": "assistant", "content": "Response"}

Error Handling
~~~~~~~~~~~~~~

.. code-block:: python

   @track
   def error_prone_agent(user_message: str, *, thread_id: str, **kwargs):
       # This might raise an exception
       if "error" in user_message.lower():
           raise ValueError("Simulated error")
       
       return {"role": "assistant", "content": "Normal response"}

   # run_simulation handles errors gracefully
   simulation = run_simulation(
       app=error_prone_agent,
       user_simulator=user_simulator,
       max_turns=3
   )

   # Errors are captured in the conversation history
   for message in simulation['conversation_history']:
       if "Error processing message" in message.get('content', ''):
           print(f"Error occurred: {message['content']}")

Best Practices
--------------

1. **Thread Management**: Always use the provided ``thread_id`` to manage conversation history
2. **Error Handling**: Implement proper error handling in your app function
3. **Return Format**: Always return message dictionaries with 'role' and 'content' keys
4. **History Management**: Keep conversation history in a thread-safe way if running concurrent simulations
5. **Resource Management**: Be mindful of token usage with long conversations
6. **Testing**: Use fixed responses in SimulatedUser for deterministic testing

Notes
-----

- The function automatically decorates your app with ``@track`` if not already decorated
- All traces from a simulation are grouped under the same thread ID
- The function handles errors gracefully and continues the simulation
- Conversation history is returned as a list of message dictionaries
- Custom parameters passed via ``**app_kwargs`` are forwarded to your app function