Python SDK for running evaluations on LLM-generated responses
The LLM Evaluation Framework
OCTIS: Comparing Topic Models is Simple! A Python package to optimize and evaluate topic models (accepted at the EACL 2021 demo track)
Counting-Stars (★)
Open source Python SDK for agent monitoring, LLM cost tracking, benchmarking, and more. Integrates with most LLMs and agent frameworks like CrewAI, Langchain, and Autogen
Evaluates neuron segmentations in terms of statistics related to the number of splits and merges
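As an illustration of what such statistics look like (a generic sketch, not this package's API), splits and merges can be counted from the overlap between ground-truth and predicted segment labels:

```python
# Illustrative only: count splits and merges from label overlap.
import numpy as np

def split_merge_counts(gt: np.ndarray, pred: np.ndarray):
    pairs = set(zip(gt.ravel().tolist(), pred.ravel().tolist()))
    gt_to_pred, pred_to_gt = {}, {}
    for g, p in pairs:
        gt_to_pred.setdefault(g, set()).add(p)
        pred_to_gt.setdefault(p, set()).add(g)
    # a ground-truth segment covered by >1 predicted segments is split;
    # a predicted segment covering >1 ground-truth segments is a merge
    splits = sum(len(v) - 1 for v in gt_to_pred.values())
    merges = sum(len(v) - 1 for v in pred_to_gt.values())
    return splits, merges

gt   = np.array([[1, 1, 2, 2]])
pred = np.array([[1, 3, 3, 3]])  # segment 1 split; segments 1 and 2 merged
print(split_merge_counts(gt, pred))  # -> (1, 1)
```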
Python client for Kolena's machine learning testing platform
LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally, alongside its recently released LLM data processing library datatrove and LLM training library nanotron.
A list of works on evaluation of visual generation models, including evaluation metrics, models, and systems
Valor is a centralized evaluation store which makes it easy to measure, explore, and rank model performance.
pip-compatible CodeBLEU metric implementation, available for Linux/macOS/Windows
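Usage is typically a single call; the sketch below assumes the `codebleu` package from PyPI, and the exact signature may vary by version:

```python
# Assumed usage of the PyPI `codebleu` package; the argument order and
# names are from memory of its README and may differ across versions.
from codebleu import calc_codebleu

references  = ["def add(a, b):\n    return a + b"]
predictions = ["def add(x, y):\n    return x + y"]

result = calc_codebleu(references, predictions, lang="python")
print(result)  # dict with the CodeBLEU score and its component scores
```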
Random Forest Assisted Suggestions for Salifort Motors Employee Retention: Plan, Analyze, Construct and Execute
FAISS and Annoy indexing + search evaluation workflow
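A minimal sketch of such a workflow, assuming FAISS is installed: build an exact index as ground truth, an approximate one under test, and compare recall@k (all data here is synthetic):

```python
# Hypothetical evaluation sketch: recall@k of an approximate FAISS index
# against exact brute-force search. Dataset and parameters are made up.
import numpy as np
import faiss

d, n, nq, k = 64, 10_000, 100, 10
rng = np.random.default_rng(0)
xb = rng.standard_normal((n, d)).astype("float32")   # database vectors
xq = rng.standard_normal((nq, d)).astype("float32")  # query vectors

exact = faiss.IndexFlatL2(d)          # brute-force ground truth
exact.add(xb)
_, gt = exact.search(xq, k)

approx = faiss.IndexHNSWFlat(d, 32)   # approximate HNSW index
approx.add(xb)
_, ann = approx.search(xq, k)

# recall@k: fraction of true nearest neighbors recovered per query
recall = np.mean([len(set(a) & set(g)) / k for a, g in zip(ann, gt)])
print(f"recall@{k}: {recall:.3f}")
```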
Official repository for “PATE: Proximity-Aware Time series anomaly Evaluation”.
[NeurIPS'21 Outstanding Paper] Library for reliable evaluation on RL and ML benchmarks, even with only a handful of seeds.
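The library's headline idea is robust aggregation across runs, such as the interquartile mean (IQM); a plain-numpy sketch of that statistic (illustrative, not the library's own API):

```python
# Illustrative only: the interquartile mean (IQM) is one of the robust
# aggregate metrics this work popularized for few-seed evaluation.
import numpy as np

def iqm(scores: np.ndarray) -> float:
    """Mean of the middle 50% of scores (robust to outlier seeds)."""
    q25, q75 = np.percentile(scores, [25, 75])
    middle = scores[(scores >= q25) & (scores <= q75)]
    return float(middle.mean())

# e.g. final scores from five seeds on one task
print(iqm(np.array([0.62, 0.58, 0.91, 0.60, 0.15])))
```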
XGBoost Predictive Model for TikTok's Claim Classification: EDA, Hypothesis Testing, Logistic Regression, Tree-Based Models
Open-Source Evaluation for GenAI Application Pipelines
Metrics to evaluate the quality of responses from your Retrieval-Augmented Generation (RAG) applications.
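This description matches the ragas project; assuming that package (and its roughly 2024-era API, which may since have changed), an evaluation looks like the following. Note the metrics are themselves LLM-judged, so an LLM backend (e.g. an OpenAI API key) is required:

```python
# Assumed ragas usage; metric names and the evaluate() signature are
# from memory of the ~2024 API and may differ in current releases.
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy

ds = Dataset.from_dict({
    "question": ["What is the capital of France?"],
    "answer":   ["Paris is the capital of France."],
    "contexts": [["Paris has been the capital of France since 987."]],
})
print(evaluate(ds, metrics=[faithfulness, answer_relevancy]))
```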
📈 Implementation of eight evaluation metrics to assess the similarity between two images. The eight metrics are as follows: RMSE, PSNR, SSIM, ISSM, FSIM, SRE, SAM, and UIQ.
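For intuition, two of the listed metrics are simple enough to compute by hand; a dependency-free sketch for 8-bit images (the package itself wraps all eight):

```python
# Sketch of RMSE and PSNR for 8-bit images; synthetic data for the demo.
import numpy as np

def rmse(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.sqrt(np.mean((a.astype(float) - b.astype(float)) ** 2)))

def psnr(a: np.ndarray, b: np.ndarray, max_val: float = 255.0) -> float:
    err = rmse(a, b)
    return float("inf") if err == 0 else 20 * np.log10(max_val / err)

img1 = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
img2 = np.clip(img1 + np.random.randint(-10, 10, (64, 64)), 0, 255).astype(np.uint8)
print(rmse(img1, img2), psnr(img1, img2))
```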
Continuation of the abandoned fast-coco-eval project