Correctness Evaluator
The Correctness evaluator scores the relevance and correctness of a generated answer against a reference answer.
It returns a score between 0 and 5, where 5 means the response is fully correct.
Usage
First, install the package:
Set the OpenAI API key:
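A minimal sketch, assuming the key is read from the conventional OPENAI_API_KEY environment variable. The value shown is a hypothetical placeholder, not a real key:

```shell
# Assumption: the evaluator's OpenAI client picks up the key from the
# OPENAI_API_KEY environment variable.
# Replace the placeholder with your own key, and keep real keys out of version control.
export OPENAI_API_KEY="sk-your-key-here"
```

Exporting the variable in the shell keeps the key out of source files, and it only lasts for the current session.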
Import the required modules:
Let's set up gpt-4 for better results: