Datasets
Overview
In most scenarios, you will have multiple Examples that you want to evaluate together.
In judgeval, an evaluation dataset (EvalDataset) is a collection of Examples that you can scale evaluations across.
Creating a Dataset
Creating an EvalDataset is as simple as supplying a list of Examples.
You can also add Examples to an existing EvalDataset using the add_example method.
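To make the two steps above concrete, here is a minimal sketch of the pattern. The Example and EvalDataset classes below are local stand-ins written for illustration, not judgeval's own classes: the field names input and actual_output are assumptions, and only the add_example method name comes from the text above.

```python
from dataclasses import dataclass, field


@dataclass
class Example:
    # Stand-in for judgeval's Example; both field names are assumptions.
    input: str
    actual_output: str


@dataclass
class EvalDataset:
    # Stand-in for judgeval's EvalDataset: a thin wrapper around a list.
    examples: list[Example] = field(default_factory=list)

    def add_example(self, example: Example) -> None:
        # Append one more Example to an existing dataset.
        self.examples.append(example)


# Create a dataset by supplying a list of Examples.
dataset = EvalDataset(examples=[
    Example(input="What is 2 + 2?", actual_output="4"),
    Example(input="Capital of France?", actual_output="Paris"),
])

# Grow the dataset afterwards with add_example.
dataset.add_example(Example(input="Largest planet?", actual_output="Jupiter"))

print(len(dataset.examples))
```

With the real library you would import Example and EvalDataset from judgeval instead of defining them; check the library reference for the exact import paths and constructor arguments.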
Saving/Loading Datasets
judgeval supports saving and loading datasets in the following formats:
- JSON
- CSV
- YAML
From Judgment
You can easily save an EvalDataset to Judgment’s cloud and load it back from there.
From JSON
You can save/load an EvalDataset with a JSON file. Your JSON file should have the following structure:
Here’s an example of how to use judgeval to save/load from JSON.
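As a rough illustration of a JSON round trip, the sketch below writes a list of examples to disk and reads it back using only the standard library. Everything here is an assumption made for the sketch: the top-level "examples" key, the field names input and actual_output, and the helpers save_json and load_json are hypothetical and are not judgeval's API or its required file layout.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class Example:
    # Stand-in for judgeval's Example; field names are assumptions.
    input: str
    actual_output: str


def save_json(examples: list[Example], path: Path) -> None:
    # Hypothetical helper: store examples under an assumed "examples" key.
    payload = {"examples": [asdict(e) for e in examples]}
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")


def load_json(path: Path) -> list[Example]:
    # Hypothetical helper: rebuild Examples from the assumed layout.
    payload = json.loads(path.read_text(encoding="utf-8"))
    return [Example(**item) for item in payload["examples"]]


original = [
    Example(input="What is 2 + 2?", actual_output="4"),
    Example(input="Capital of France?", actual_output="Paris"),
]

# Round-trip through a temporary file so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    json_path = Path(tmp) / "dataset.json"
    save_json(original, json_path)
    restored = load_json(json_path)

print(restored == original)
```

A round-trip check like this is a cheap way to confirm that a saved dataset loads back with the same examples before you share the file.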
From CSV
You can save/load an EvalDataset with a .csv file. Your CSV should contain rows that can be mapped to Examples via column names.
TODO: this section needs to be updated because the CSV format is not yet finalized.
Here’s an example of how to use judgeval to save/load from CSV.
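The idea of mapping rows to Examples via column names can be sketched with the standard csv module. Since the CSV format is not finalized, treat this purely as an illustration: the column names input and actual_output, the extra notes column, and the examples_from_csv helper are all hypothetical, and the Example class is a local stand-in.

```python
import csv
import io
from dataclasses import dataclass, fields


@dataclass
class Example:
    # Stand-in for judgeval's Example; field names are assumptions.
    input: str
    actual_output: str


# Sample CSV with one column ("notes") that Example does not use.
CSV_TEXT = """input,actual_output,notes
What is 2 + 2?,4,arithmetic
Capital of France?,Paris,geography
"""


def examples_from_csv(text: str) -> list[Example]:
    # Hypothetical helper: match CSV columns to Example fields by name.
    wanted = [f.name for f in fields(Example)]
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [name for name in wanted if name not in header]
    if missing:
        raise ValueError(f"CSV is missing columns: {missing}")
    # Keep only the columns Example knows about; ignore the rest.
    return [Example(**{name: row[name] for name in wanted}) for row in reader]


csv_examples = examples_from_csv(CSV_TEXT)
print(len(csv_examples))
```

Matching by name rather than by position means column order does not matter and unrelated columns are ignored, while a missing required column fails early with a clear error.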
From YAML
You can save/load an EvalDataset with a .yaml file. Your YAML should contain entries that can be mapped to Examples via key names.
Here’s an example of how to use judgeval to save/load from YAML.
Evaluate On Your Dataset
You can use the JudgmentClient to evaluate the Examples in your dataset using scorers.
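Conceptually, evaluating a dataset means applying each scorer to each Example and collecting the results. The sketch below shows that loop in plain Python. It does not use JudgmentClient or judgeval's scorers: the Example fields, the exact_match scorer, the evaluate function, and the 0.5 threshold are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Example:
    # Stand-in for judgeval's Example; all field names are assumptions.
    input: str
    actual_output: str
    expected_output: str


# A scorer here is any function that maps an Example to a score in [0, 1].
Scorer = Callable[[Example], float]


def exact_match(example: Example) -> float:
    # Hypothetical scorer: 1.0 if the outputs match ignoring case and whitespace.
    actual = example.actual_output.strip().lower()
    expected = example.expected_output.strip().lower()
    return 1.0 if actual == expected else 0.0


def evaluate(examples: list[Example], scorers: dict[str, Scorer],
             threshold: float = 0.5) -> list[dict]:
    # Run every scorer on every example; an example passes only if
    # all of its scores reach the threshold.
    results = []
    for example in examples:
        scores = {name: scorer(example) for name, scorer in scorers.items()}
        passed = all(score >= threshold for score in scores.values())
        results.append({"input": example.input, "scores": scores, "passed": passed})
    return results


examples = [
    Example(input="Capital of France?", actual_output=" paris ", expected_output="Paris"),
    Example(input="What is 2 + 2?", actual_output="5", expected_output="4"),
]

results = evaluate(examples, {"exact_match": exact_match})
for result in results:
    print(result["input"], result["passed"])
```

In judgeval itself, the JudgmentClient and its scorers take the place of evaluate and exact_match; consult the library reference for the actual method names and parameters.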
Conclusion
Congratulations! 🎉
You’ve now learned how to create, save, and evaluate an EvalDataset in judgeval.
You can also view and manage your datasets via the Judgment platform.