Add Human Feedback (Beta)
Human feedback is a valuable metric to assess the performance of your AI models. By incorporating human feedback, you can gain deeper insights into how the model’s responses are perceived and how well it performs from a user-centric perspective. This feedback can then be used in evaluations to calculate performance metrics, driving optimization and ultimately enhancing the reliability, accuracy, and efficiency of your AI application.
Human feedback measures how well the model performed on the logs in your dataset, based on direct human input. The metric is calculated as the percentage of positive feedback (thumbs up) given on logs, which are annotated in the Logs tab of the Cloudflare dashboard. This feedback helps refine model performance by taking real-world evaluations of the model's output into account.
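The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not AI Gateway's implementation: the `"up"` and `"down"` labels and the `human_feedback_score` function are hypothetical, and the sketch assumes that logs without any feedback are left out of the percentage, which the description above does not state explicitly.

```python
from typing import Iterable, Optional


def human_feedback_score(feedback: Iterable[Optional[str]]) -> Optional[float]:
    """Return the percentage of thumbs-up ratings among annotated logs.

    Each item is "up", "down", or None (no feedback given). These labels
    are hypothetical, not AI Gateway's own representation.
    """
    # Assumption: logs without feedback do not count toward the score.
    rated = [f for f in feedback if f in ("up", "down")]
    if not rated:
        return None  # no annotated logs, so the score is undefined
    positive = sum(1 for f in rated if f == "up")
    return 100.0 * positive / len(rated)


logs = ["up", "up", "down", None, "up"]
print(human_feedback_score(logs))  # 3 of 4 rated logs are positive -> 75.0
```

Returning `None` when no log has been rated keeps "no feedback yet" distinct from a score of 0%.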
This tutorial will guide you through the process of adding human feedback to your evaluations in AI Gateway.
- Log in to the Cloudflare dashboard and select your account.
- Go to AI > AI Gateway.
- Go to Logs.
- The Logs tab displays all logs associated with your datasets. These logs show key information, including:
  - Timestamp: When the interaction occurred.
  - Status: Whether the request was successful, cached, or failed.
  - Model: The model used in the request.
  - Tokens: The number of tokens consumed by the response.
  - Cost: The cost based on token usage.
  - Duration: The time taken to complete the response.
  - Feedback: Where you can provide human feedback on each log.
- Click on the log entry you want to review. This expands the log, allowing you to see more detailed information.
- In the expanded log, you can view additional details such as:
  - The user prompt.
  - The model response.
  - HTTP response details.
  - Endpoint information.
- You will see two icons:
  - Thumbs up: Indicates positive feedback.
  - Thumbs down: Indicates negative feedback.
- Click either the thumbs up or thumbs down icon based on how you rate the model response for that particular log entry.
After you provide feedback on your logs, that feedback becomes part of the evaluation process.
When you run an evaluation (as outlined in the Set Up Evaluations guide), the human feedback metric will be calculated based on the percentage of logs that received thumbs-up feedback.
After running the evaluation, review the results on the Evaluations tab. You will be able to see the performance of the model based on cost, speed, and now human feedback, represented as the percentage of positive feedback (thumbs up).
The human feedback score is displayed as a percentage, showing the share of positively rated responses in the dataset.
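To make the three evaluation dimensions concrete, the sketch below aggregates cost, speed, and human feedback over a handful of log records. Everything here is illustrative: the `LogEntry` class, its field names, and the `summarize` function are hypothetical and only mirror the Cost, Duration, and Feedback columns of the Logs tab. The choice of total cost and average duration as summaries is also an assumption, not a description of how AI Gateway reports results.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class LogEntry:
    # Hypothetical record mirroring some of the Logs tab columns.
    cost: float                      # cost based on token usage
    duration_ms: float               # time taken to complete the response
    feedback: Optional[bool] = None  # True = thumbs up, False = thumbs down, None = unrated


def summarize(logs: List[LogEntry]) -> dict:
    """Aggregate cost, speed, and human feedback over a set of logs."""
    # Assumption: only rated logs count toward the feedback percentage.
    rated = [log.feedback for log in logs if log.feedback is not None]
    return {
        "total_cost": sum(log.cost for log in logs),
        "avg_duration_ms": mean([log.duration_ms for log in logs]) if logs else None,
        "human_feedback_pct": 100.0 * sum(rated) / len(rated) if rated else None,
    }


logs = [
    LogEntry(cost=0.5, duration_ms=100.0, feedback=True),
    LogEntry(cost=0.25, duration_ms=300.0, feedback=False),
    LogEntry(cost=0.25, duration_ms=200.0, feedback=True),
    LogEntry(cost=1.0, duration_ms=200.0, feedback=True),
]
print(summarize(logs))
```

With these sample records, three of the four rated logs are positive, so the feedback percentage comes out at 75.0.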
For more information on running evaluations, refer to the Set Up Evaluations documentation.