Datasets

Overview

In most scenarios, you will have multiple Examples that you want to evaluate together.
In judgeval, an evaluation dataset (EvalDataset) is a collection of Examples that you can evaluate together as a batch.

Creating a Dataset

Creating an EvalDataset is as simple as supplying a list of Examples.

create_dataset.py
from judgeval.data import Example
from judgeval.data.datasets import EvalDataset

examples = [
    Example(input="...", actual_output="..."), 
    Example(input="...", actual_output="..."), 
    ...
]


dataset = EvalDataset(
    examples=examples
)
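When inputs and outputs live in two parallel lists, the list of Examples can be assembled in one pass. The sketch below is an illustration, not part of judgeval: the helper name pair_fields and the sample strings are hypothetical, and it builds plain keyword dictionaries so that it runs without the library installed. The only judgeval detail it relies on is the Example(input=..., actual_output=...) call shown above.

```python
from typing import Dict, List


def pair_fields(inputs: List[str], outputs: List[str]) -> List[Dict[str, str]]:
    """Pair each input with its output as keyword arguments for Example.

    Hypothetical helper for illustration; not a judgeval function.
    """
    if len(inputs) != len(outputs):
        raise ValueError("inputs and outputs must have the same length")
    return [
        {"input": question, "actual_output": answer}
        for question, answer in zip(inputs, outputs)
    ]


# Hypothetical sample data, for illustration only.
rows = pair_fields(
    ["What is 2 + 2?", "Name a primary color."],
    ["4", "Red"],
)

# With judgeval installed, each dictionary can be unpacked into an Example:
#   examples = [Example(**row) for row in rows]
#   dataset = EvalDataset(examples=examples)
```

Checking the lengths up front avoids silently dropping items, since zip stops at the shorter list.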

You can also add Examples to an existing EvalDataset using the add_example method.

add_to_dataset.py
...

dataset.add_example(Example(...))

Saving/Loading Datasets

judgeval supports saving and loading datasets in the following formats:

JSON
CSV
YAML

From Judgment

You can easily save an EvalDataset to, and load it from, Judgment’s cloud.

push_dataset.py
# Saving
...
from judgeval import JudgmentClient

client = JudgmentClient()
client.push_dataset(alias="my_dataset", dataset=dataset)

pull_dataset.py
# Loading
from judgeval import JudgmentClient

client = JudgmentClient()
dataset = client.pull_dataset(alias="my_dataset")

From JSON

You can save/load an EvalDataset with a JSON file. Your JSON file should have the following structure:

structure.json
{
    "examples": [
        {
            "input": "...", 
            "actual_output": "..."
        }, 
        ...
    ]
}
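A file with this structure can be produced with Python's standard json module. The following is a minimal sketch using only the standard library: the function name write_examples_json, the file name dataset.json, and the sample records are assumptions made for illustration, and only the "examples", "input", and "actual_output" keys come from the structure above.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical sample records; only the key names come from the structure above.
records = [
    {"input": "What is 2 + 2?", "actual_output": "4"},
    {"input": "Name a primary color.", "actual_output": "Red"},
]


def write_examples_json(path: Path, examples: list) -> None:
    """Write the records under a top-level "examples" key."""
    payload = {"examples": examples}
    path.write_text(json.dumps(payload, indent=4), encoding="utf-8")


# Write to a temporary directory so the sketch leaves no files behind in the project.
json_path = Path(tempfile.mkdtemp()) / "dataset.json"
write_examples_json(json_path, records)

# Read the file back to confirm the layout round-trips.
loaded = json.loads(json_path.read_text(encoding="utf-8"))
```

Reading the file back is a cheap way to catch a malformed layout before handing the path to judgeval.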

Here’s an example of how to use judgeval to save to and load from JSON.

json_dataset.py
from judgeval.data.datasets import EvalDataset

# saving
dataset = EvalDataset(...)  # filled with examples
dataset.save_as("json", "/path/to/save/dir", "save_name")

# loading
new_dataset = EvalDataset()
new_dataset.add_from_json("/path/to/your/json/file.json")

From CSV

You can save/load an EvalDataset with a .csv file. Your CSV should contain rows that can be mapped to Examples via column names. Note that the CSV format is not yet finalized, so the details in this section may change.

Here’s an example of how to use judgeval to save to and load from CSV.

csv_dataset.py
from judgeval.data.datasets import EvalDataset

# saving
dataset = EvalDataset(...)  # filled with examples
dataset.save_as("csv", "/path/to/save/dir", "save_name")

# loading
new_dataset = EvalDataset()
new_dataset.add_from_csv("/path/to/your/csv/file.csv")
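To make "mapped to Examples via column names" concrete, the sketch below reads CSV text with the standard csv module and turns each row into a dictionary keyed by its column header. It is an illustration under stated assumptions: because the CSV format is not finalized, the column names input and actual_output are a guess based on the Example fields used elsewhere on this page, and rows_to_example_kwargs is a hypothetical helper, not a judgeval function.

```python
import csv
import io

# Hypothetical CSV content; the column names are an assumption, not a documented format.
csv_text = (
    "input,actual_output\n"
    "What is 2 + 2?,4\n"
    "Name a primary color.,Red\n"
)


def rows_to_example_kwargs(text: str) -> list:
    """Map each CSV row to a dict keyed by the header row's column names."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


rows = rows_to_example_kwargs(csv_text)

# With judgeval installed, and if the columns match Example's field names,
# each row could be unpacked as Example(**row).
```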

From YAML

You can save/load an EvalDataset with a .yaml file. Your YAML should contain a top-level examples list whose entries can be mapped to Examples via field names, as in example.yaml at the end of this section.

Here’s an example of how to use judgeval to save to and load from YAML.

yaml_dataset.py
from judgeval.data.datasets import EvalDataset

# saving
dataset = EvalDataset(...)  # filled with examples
dataset.save_as("yaml", "/path/to/save/dir", "save_name")

# loading
new_dataset = EvalDataset()
new_dataset.add_from_yaml("/path/to/your/yaml/file.yaml")

example.yaml
examples:
  - input: ...
    actual_output: ...
    expected_output: ...
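A file in this layout can be generated without a YAML library. The sketch below is one possible approach, not judgeval's own writer: the function name to_examples_yaml and the sample values are hypothetical, and it assumes that field names are plain identifiers and that values are strings. It quotes each value with json.dumps, relying on YAML 1.2 accepting JSON-style double-quoted strings.

```python
import json


def to_examples_yaml(examples: list) -> str:
    """Render records as a YAML document with a top-level "examples" list."""
    lines = ["examples:"]
    for example in examples:
        prefix = "  - "  # the first field of each entry opens the list item
        for key, value in example.items():
            quoted = json.dumps(value, ensure_ascii=False)
            lines.append(f"{prefix}{key}: {quoted}")
            prefix = "    "  # later fields align under the first one
    return "\n".join(lines) + "\n"


# Hypothetical sample record, for illustration only.
yaml_text = to_examples_yaml(
    [{"input": "What is 2 + 2?", "actual_output": "4", "expected_output": "4"}]
)
```

Quoting every value keeps strings such as "4" or "yes" from being read back as a number or a boolean.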

Evaluate On Your Dataset

You can use the JudgmentClient to evaluate the Examples in your dataset using scorers.

evaluate_dataset.py
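from judgeval import JudgmentClient
from judgeval.scorers import FaithfulnessScorer

client = JudgmentClient()

# Load the dataset saved earlier, then score every Example in it
dataset = client.pull_dataset(alias="my_dataset")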
res = client.evaluate_dataset(
    dataset=dataset,
    scorers=[FaithfulnessScorer(threshold=0.9)],
    model="gpt-4o",
)

Conclusion

Congratulations! 🎉

You’ve now learned how to create, save, and evaluate an EvalDataset in judgeval.

You can also view and manage your datasets via the Judgment platform.
