Overview

The Osiris Agent is an advanced analysis and optimization co-pilot that helps developers deliver better context and tools to large language models (LLMs). Built on the Model Context Protocol (MCP) standard, Osiris Agent provides intelligent insights and automated optimizations for your LLM applications. For detailed information about the underlying MCP specification, refer to the official documentation.

Benefits of Osiris Agent

Osiris Agent acts as your intelligent co-pilot, providing comprehensive analysis and optimization capabilities for workflows and agents. It enables:

  • Smart Debugging: Automatically identifies performance bottlenecks and issues in complex workflows
  • AI-Powered Optimization: Analyzes trace data to suggest targeted code improvements
  • Rapid Experimentation: Provides automated evaluation feedback for faster iteration
  • Development Acceleration: Minimizes manual analysis of trace data across large-scale deployments

For example, when working with a complex workflow containing thousands of traces, Osiris Agent can analyze trace patterns, identify root causes of issues, and suggest specific code optimizations based on evaluation metrics.

Integration Guide

Prerequisites

  • Docker installed on your system
  • Judgment API credentials
  • LLM API key (Gemini, OpenAI, or Anthropic)

Configuration

To integrate Osiris Agent with an existing AI agent, create an MCP configuration file with the following structure:

{
    "mcpServers": {
        "mcp-server": {
            "command": "docker",
            "args": [
                "run",
                "-i",
                "--rm",
                "-e",
                "JUDGMENT_ORG_ID=<YOUR JUDGMENT ORG ID>",
                "-e",
                "JUDGMENT_API_KEY=<YOUR JUDGMENT API KEY>",
                "-e",
                "GEMINI_API_KEY=<YOUR GEMINI API KEY>",
                "public.ecr.aws/i6q0e6k6/judgment/mcp-server:latest"
            ]
        }
    }
}

The current implementation requires an LLM API key. We recommend Gemini, but OpenAI and Anthropic are also supported. To use one of them, set the OPENAI_API_KEY or ANTHROPIC_API_KEY environment variable, respectively, instead of GEMINI_API_KEY.
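As a minimal sketch, the configuration above can also be generated programmatically so that the LLM key variable matches the chosen provider. The helper name build_mcp_config and the provider labels ("gemini", "openai", "anthropic") are illustrative and not part of Osiris Agent. The sketch also assumes that only one LLM key is passed to the container.

```python
import json

# Environment variable names as given in the text above.
LLM_KEY_VARS = {
    "gemini": "GEMINI_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}

# Container image from the configuration shown above.
IMAGE = "public.ecr.aws/i6q0e6k6/judgment/mcp-server:latest"


def build_mcp_config(org_id, judgment_api_key, provider, llm_api_key):
    """Return the MCP configuration as a dict (hypothetical helper)."""
    if provider not in LLM_KEY_VARS:
        raise ValueError(f"unsupported provider: {provider!r}")
    env = {
        "JUDGMENT_ORG_ID": org_id,
        "JUDGMENT_API_KEY": judgment_api_key,
        # Assumption: exactly one LLM key is supplied to the container.
        LLM_KEY_VARS[provider]: llm_api_key,
    }
    args = ["run", "-i", "--rm"]
    for name, value in env.items():
        args += ["-e", f"{name}={value}"]
    args.append(IMAGE)
    return {"mcpServers": {"mcp-server": {"command": "docker", "args": args}}}


if __name__ == "__main__":
    config = build_mcp_config(
        "<YOUR JUDGMENT ORG ID>",
        "<YOUR JUDGMENT API KEY>",
        "anthropic",
        "<YOUR ANTHROPIC API KEY>",
    )
    print(json.dumps(config, indent=4))
```

Running the script prints a configuration with the same structure as the one above, with ANTHROPIC_API_KEY in place of GEMINI_API_KEY.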

Supported AI Agents

Osiris Agent is compatible with several AI agents, including:

  • Cursor
  • Windsurf
  • Claude Desktop

Rule File Configuration

For optimal performance, we recommend adding a rule file to your AI agent configuration. This file helps Osiris Agent understand the available tools and how to optimize your code based on trace analysis.

Available Tools

Osiris Agent provides the following tools for analyzing and optimizing your LLM applications:

Analysis Tools

Tool                             Description                                               Parameters
get_experiment_analysis          Retrieves analysis for a specific experiment              project_name, eval_name
get_trace_analysis               Retrieves analysis for a specific trace                   trace_id
get_experiment_analysis_project  Retrieves analysis for multiple experiments in a project  project_name, num_exps (default: “10”)
get_trace_analysis_project       Retrieves analysis for multiple traces in a project       project_name, num_traces (default: “10”)
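In practice the AI agent issues these tool calls for you. To illustrate what such a call might look like on the wire, the sketch below builds an MCP tools/call JSON-RPC request for one of the analysis tools. The helper build_tool_call is hypothetical. It assumes that parameters without a listed default are required and that the defaults are passed as strings, as the quoted “10” in the table suggests.

```python
import json

# Parameters per tool, copied from the table above. None marks a parameter
# with no listed default (assumed required); other values are the defaults.
ANALYSIS_TOOLS = {
    "get_experiment_analysis": {"project_name": None, "eval_name": None},
    "get_trace_analysis": {"trace_id": None},
    "get_experiment_analysis_project": {"project_name": None, "num_exps": "10"},
    "get_trace_analysis_project": {"project_name": None, "num_traces": "10"},
}


def build_tool_call(tool, request_id=1, **arguments):
    """Build an MCP tools/call request for an analysis tool (hypothetical helper)."""
    spec = ANALYSIS_TOOLS.get(tool)
    if spec is None:
        raise ValueError(f"unknown tool: {tool!r}")
    unknown = set(arguments) - set(spec)
    if unknown:
        raise ValueError(f"unexpected parameters: {sorted(unknown)}")
    merged = {}
    for name, default in spec.items():
        if name in arguments:
            merged[name] = arguments[name]
        elif default is not None:
            merged[name] = default  # fall back to the documented default
        else:
            raise ValueError(f"missing required parameter: {name}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": merged},
    }


if __name__ == "__main__":
    # "my-project" is a placeholder project name.
    request = build_tool_call("get_trace_analysis_project", project_name="my-project")
    print(json.dumps(request, indent=4))
```

Validating the arguments before sending keeps a misspelled parameter name from reaching the server.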

Evaluation Tool

Tool            Description                                           Parameters
run_evaluation  Executes an evaluation directly through Osiris Agent  project_name, evaluation_run

The evaluation_run parameter accepts a comprehensive configuration object that can include:

  • Examples with input/output pairs
  • Scorer configurations
  • Model specifications
  • Logging preferences

This tool enables rapid experimentation with different evaluation configurations without modifying your application code.
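The exact schema of evaluation_run is not given here, so the following is only a rough sketch of how the four categories above might be assembled into the arguments for run_evaluation. Every key name inside evaluation_run ("examples", "scorers", "model", "log_results") and the helper build_run_evaluation_arguments are hypothetical placeholders. Consult the Osiris Agent tool schema for the real field names.

```python
import json


def build_run_evaluation_arguments(project_name, examples, scorers, model, log_results=True):
    """Assemble arguments for run_evaluation (hypothetical key names throughout)."""
    evaluation_run = {
        # Examples with input/output pairs.
        "examples": [{"input": i, "output": o} for i, o in examples],
        # Scorer configurations; only a name is assumed here.
        "scorers": [{"name": name} for name in scorers],
        # Model specification.
        "model": model,
        # Logging preference.
        "log_results": log_results,
    }
    return {"project_name": project_name, "evaluation_run": evaluation_run}


if __name__ == "__main__":
    arguments = build_run_evaluation_arguments(
        project_name="<YOUR PROJECT NAME>",
        examples=[("<EXAMPLE INPUT>", "<EXPECTED OUTPUT>")],
        scorers=["<SCORER NAME>"],
        model="<MODEL NAME>",
    )
    print(json.dumps(arguments, indent=4))
```

Keeping the evaluation configuration in one plain object like this makes it easy to vary the examples, scorers, or model between runs.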

Learning Resources

Video Tutorials

To help you get started with Osiris Agent, we’ve created a series of video tutorials that demonstrate key concepts and practical implementations:

Topic                        Description                                                                         Duration
Getting Started with Osiris  Learn the basics of setting up and configuring Osiris Agent for your first project  7 min