HiPhO Leaderboard

HiPhO (High School Physics Olympiad Benchmark)





Metric: exam score.
Legend: Closed-source MLLM, Open-source MLLM, Open-source LLM
1. Cells are color-coded based on official medal thresholds. Models are ranked by Gold ↓, then Silver ↓, then Bronze ↓, with ties broken by IPhO 2025 score ↓.
2. Medal cutoffs are derived from the theoretical exam scores of human medalists.
3. Only the theoretical components of each exam are evaluated; experimental and diagram-drawing problems are excluded, so Full Mark (Model) ≤ Full Mark (Human).
4. Each model was run 8 times. For each problem, the score was averaged over the 8 runs, and the averaged problem scores were summed to compute the final exam score.
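
The scoring and ranking rules in notes 1 and 4 can be sketched in a few lines of Python. This is only an illustration, not the benchmark's actual evaluation code. The function names, the dictionary keys (`gold`, `silver`, `bronze`, `ipho_2025`), and the input layout are hypothetical. The sketch also assumes that Gold, Silver, and Bronze are per-model medal counts across exams.

```python
from statistics import mean


def exam_score(runs):
    """Compute a final exam score from repeated runs.

    `runs` is a list of runs; each run is a list of problem scores,
    with problems in the same order in every run (assumed layout).
    """
    # Average each problem over all runs, then sum the per-problem averages.
    per_problem_avg = [mean(scores) for scores in zip(*runs)]
    return sum(per_problem_avg)


def rank_models(models):
    """Order models by Gold, then Silver, then Bronze (all descending),
    breaking ties by IPhO 2025 score (descending).

    `models` is a list of dicts with the hypothetical keys
    'name', 'gold', 'silver', 'bronze', and 'ipho_2025'.
    """
    return sorted(
        models,
        key=lambda m: (-m["gold"], -m["silver"], -m["bronze"], -m["ipho_2025"]),
    )


# Made-up inputs for illustration only; these are not leaderboard results.
example_runs = [[1.0, 2.0], [3.0, 4.0]]  # 2 runs, 2 problems
example_models = [
    {"name": "model-a", "gold": 2, "silver": 1, "bronze": 0, "ipho_2025": 20.0},
    {"name": "model-b", "gold": 2, "silver": 1, "bronze": 0, "ipho_2025": 22.5},
    {"name": "model-c", "gold": 3, "silver": 0, "bronze": 0, "ipho_2025": 10.0},
]
example_score = exam_score(example_runs)
example_ranking = [m["name"] for m in rank_models(example_models)]
```

Because each problem is averaged first and the averages are then summed, the result equals the mean of the per-run exam totals whenever every run covers the same set of problems.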