benchmark-platform

Here are 2 public repositories matching this topic...

MetaEvo / MetaBox

MetaBox: Benchmarking Platform for Meta-Black-Box Optimization

deep-reinforcement-learning hyperparameter-optimization evolutionary-algorithms black-box-optimization protein-protein-docking learning-to-optimize real-parameter-optimization meta-black-box-optimization benchmark-platform

Updated Oct 10, 2025
Python

chengjun-xu / ai-eval-platform

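LLM evaluation platform with three-way backend support (local/API/HuggingFace/OpenCompass). Supports data generation (Self-Instruct/Evol-Instruct), long-tail scenario generation, weakness mining, regression analysis, contamination detection, and bad-case attribution. Includes an extensible benchmark system and LLM-as-Judge automatic scoring.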

python flask humaneval ai-evaluation gsm8k mmlu llm-evaluation benchmark-platform rag-evaluation llm-as-judge opencompass llm-benchmark data-contamination-detection

Updated Jun 3, 2026
Python
