Expert at analyzing and comparing AI model capabilities using HELM (Holistic Evaluation of Language Models) scores and custom performance metrics. Provides data-driven insights on model performance across benchmarks.