Contextual Relevancy
The contextual relevancy scorer is a default LLM judge scorer that measures how relevant the contexts in retrieval_context are for an input.
In practice, this scorer helps determine whether your RAG pipeline’s retriever effectively retrieves relevant contexts for a query.
Required Fields
To run the contextual relevancy scorer, you must include the following fields in your Example:
input
actual_output
retrieval_context
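For illustration, an example carrying these fields might be assembled like the sketch below. The Example class here is a hypothetical stand-in with the required field names; substitute the Example type your evaluation library actually provides.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Example:
    # Hypothetical stand-in for the library's Example type.
    input: str                    # the user query sent to the RAG pipeline
    actual_output: str            # the answer the pipeline produced
    retrieval_context: List[str]  # contexts returned by the retriever

example = Example(
    input="What is the boiling point of water at sea level?",
    actual_output="Water boils at 100 °C (212 °F) at sea level.",
    retrieval_context=[
        "At standard atmospheric pressure, water boils at 100 °C.",
        "The Eiffel Tower is located in Paris.",
    ],
)
```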
Scorer Breakdown
ContextualRelevancy scores are calculated by first extracting all statements in retrieval_context and then classifying which ones are relevant to the input.

The score is then calculated as:

Contextual Relevancy = Number of Relevant Statements / Total Number of Statements
Our contextual relevancy scorer is based on Stanford NLP’s ARES paper (Saad-Falcon et al., 2024).
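The arithmetic of the score can be sketched as follows. Statement extraction and relevance classification are stubbed out here, since in the real scorer an LLM judge performs both steps; the statements and verdicts below are hypothetical.

```python
from typing import List

def contextual_relevancy_score(statements: List[str], relevant_flags: List[bool]) -> float:
    """Score = relevant statements / total statements (0.0 when no statements)."""
    if not statements:
        return 0.0
    return sum(relevant_flags) / len(statements)

# Suppose the judge extracted three statements from retrieval_context
# and classified two of them as relevant to the input:
statements = [
    "Water boils at 100 °C at sea level.",
    "Boiling point decreases with altitude.",
    "The Eiffel Tower is in Paris.",
]
relevant_flags = [True, True, False]  # hypothetical judge verdicts
score = contextual_relevancy_score(statements, relevant_flags)
print(round(score, 3))  # 0.667
```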
Sample Implementation
The ContextualRelevancy scorer uses an LLM judge, so you’ll receive a reason for the score in the reason field of the results.
This allows you to double-check the accuracy of the evaluation and understand how the score was calculated.
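The shape of such a result can be sketched with a mock judge. The stand-in below treats each retrieved context as a single statement and calls it relevant if it shares a content word with the input; the real judge reasons semantically rather than lexically, and the ScorerResult class is an illustrative shape, not a confirmed API.

```python
import string
from dataclasses import dataclass
from typing import List, Set

@dataclass
class ScorerResult:
    # Illustrative result shape: the score plus the judge's explanation.
    score: float
    reason: str

STOPWORDS = {"what", "is", "the", "a", "an", "of", "in", "at", "to"}

def content_words(text: str) -> Set[str]:
    """Lowercase, strip ASCII punctuation, drop stopwords."""
    words = text.lower().translate(str.maketrans("", "", string.punctuation)).split()
    return {w for w in words if w not in STOPWORDS}

def score_with_mock_judge(input_text: str, retrieval_context: List[str]) -> ScorerResult:
    # Stand-in for the LLM judge: a context counts as relevant if it
    # shares a content word with the input.
    query = content_words(input_text)
    relevant = [c for c in retrieval_context if query & content_words(c)]
    score = len(relevant) / len(retrieval_context) if retrieval_context else 0.0
    reason = f"{len(relevant)} of {len(retrieval_context)} retrieved contexts mention terms from the input."
    return ScorerResult(score=score, reason=reason)

result = score_with_mock_judge(
    "What is the boiling point of water?",
    [
        "Water boils at 100 °C at standard pressure.",
        "The Eiffel Tower is located in Paris.",
    ],
)
print(result.score)   # 0.5
print(result.reason)
```

Inspecting the reason string alongside the score is what lets you audit the judge's verdicts rather than trusting the number alone.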