Large Language Model Rankings (For Reference Only)

These rankings are for reference only: some models hosted on this site do not appear in the table, and some models in the table are not hosted here. When using the table as a reference, compare a model side by side with related models (a small script for doing so follows the table). In the table, "/" marks a benchmark score that has not been published, and a parameter count of 0.0 means the vendor has not disclosed the model's size.
Rankings last updated: June 25, 2024

| Rank | Model | Params (×100M) | MMLU | CEval | AGIEval | GSM8K | MATH | BBH | MT Bench |
|---|---|---|---|---|---|---|---|---|---|
| 1 | GPT-4o | / | 88.7 | / | / | 90.5 | 76.6 | / | / |
| 2 | Claude 3.5 Sonnet | / | 88.7 | / | / | 96.4 | 71.1 | / | / |
| 3 | Claude3-Opus | 0.0 | 86.8 | / | / | 95.0 | 60.1 | / | 9.43 |
| 4 | GPT-4 | 1750.0 | 86.4 | 68.7 | / | 87.1 | 42.5 | / | 9.32 |
| 5 | Llama3-400B-Instruct-InTraining | 4000.0 | 86.1 | / | / | 94.1 | 57.8 | / | / |
| 6 | Llama3-400B-InTraining | 4000.0 | 84.8 | / | / | / | / | / | / |
| 7 | Qwen2-72B | 727.0 | 84.2 | 91.0 | / | 89.5 | 51.1 | 82.4 | / |
| 8 | Gemini-ultra | 0.0 | 83.7 | / | / | 88.9 | 53.2 | / | / |
| 9 | Qwen2-72B-Instruct | 72.0 | 82.3 | 83.8 | / | 91.1 | 59.7 | / | 9.12 |
| 10 | Llama3-70B-Instruct | 700.0 | 82.0 | / | / | 93.0 | 50.4 | / | / |
| 11 | Gemini 1.5 Pro | 0.0 | 81.9 | / | / | 91.7 | 58.5 | / | / |
| 12 | GLM4 | 0.0 | 81.5 | / | / | 87.6 | 47.9 | 82.3 | / |
| 13 | Grok-1.5 | / | 81.3 | / | / | 90.0 | 50.6 | / | / |
| 14 | Mistral Large | 0.0 | 81.2 | / | / | 81.0 | 45.0 | / | 8.66 |
| 15 | YAYI2-30B | 300.0 | 80.5 | 80.9 | 62.0 | 71.2 | / | / | / |
| 16 | Qwen1.5-110B | 1100.0 | 80.4 | / | / | 85.4 | 49.6 | 74.8 | 8.88 |
| 17 | Llama3-70B | 700.0 | 79.5 | / | / | / | / | / | / |
| 18 | Gemini-pro | 1000.0 | 79.13 | / | / | 86.5 | / | / | / |
| 19 | Claude3-Sonnet | 0.0 | 79.0 | / | / | 92.3 | 43.1 | / | 9.18 |
| 20 | DeepSeek-V2-236B | 2360.0 | 78.5 | 81.7 | / | 79.2 | 43.6 | 78.9 | / |
| 21 | PaLM 2 | 3400.0 | 78.3 | / | / | 80.7 | / | / | / |
| 22 | Phi-3-medium 14B-preview | 140.0 | 78.2 | / | 48.4 | 90.3 | / | / | 8.91 |
| 23 | Mixtral-8×22B-MoE | 1410.0 | 77.75 | / | / | 78.6 | 41.8 | / | / |
| 24 | Qwen1.5-72B-Chat | 720.0 | 77.5 | 84.1 | / | 79.5 | 34.1 | 65.5 | 8.67 |
| 25 | Qwen-72B | 720.0 | 77.4 | 83.3 | 62.5 | 78.9 | / | / | / |
| 26 | Yi-1.5-34B | 340.0 | 77.1 | / | 71.1 | 82.7 | 41.0 | 76.4 | / |
| 27 | Qwen2-57B-A14B | 570.0 | 76.5 | 87.7 | / | 80.7 | 43.0 | 67.0 | / |
| 28 | Yi-34B | 340.0 | 76.3 | 81.4 | / | / | / | / | / |
| 29 | Yi-34B-200K | 340.0 | 76.1 | 81.9 | / | / | / | / | / |
| 30 | Phi-3-small 7B | 70.0 | 75.3 | / | 45.0 | 88.9 | / | / | 8.7 |
| 31 | Claude3-Haiku | 0.0 | 75.2 | / | / | 88.9 | 38.9 | / | / |
| 32 | Gemma2-27B | 270.0 | 75.0 | / | / | 75.0 | / | / | / |
| 33 | GLM-4-9B | 90.0 | 74.7 | / | / | 84.0 | 30.4 | / | / |
| 34 | DBRX Instruct | 1320.0 | 73.7 | / | / | 72.8 | / | / | 8.39 |
| 35 | Qwen1.5-32B | 320.0 | 73.4 | 83.5 | / | 77.4 | 36.1 | / | 8.3 |
| 36 | Grok-1 | 3140.0 | 73.0 | / | / | 62.9 | / | / | / |
| 37 | GLM-4-9B-Chat | 90.0 | 72.4 | 75.6 | / | 79.6 | 50.6 | / | 8.35 |
| 38 | Apollo-7B | 70.0 | 71.86 | / | / | / | / | / | / |
| 39 | DeepSeek-V2-236B-Chat | 2360.0 | 71.1 | 65.2 | / | 84.4 | 32.6 | 71.7 | / |
| 40 | XVERSE-65B | 650.0 | 70.8 | / | 61.8 | 60.3 | / | / | / |
| 41 | Mixtral-8×7B-MoE | 450.0 | 70.6 | / | / | 74.4 | 28.4 | / | 8.3 |
| 42 | Qwen2-7B | 70.0 | 70.3 | 83.2 | / | 79.9 | 44.2 | 62.6 | / |
| 43 | GPT-3.5 | 1750.0 | 70.0 | 54.4 | / | 57.1 | / | / | 8.39 |
| 44 | Yi-1.5-9B | 90.0 | 69.5 | / | 62.7 | 73.7 | 32.6 | 72.4 | / |
| 45 | PaLM | 5400.0 | 69.3 | / | / | 56.5 | / | / | / |
| 46 | LLaMA2 70B | 700.0 | 68.9 | / | 54.2 | 56.8 | / | / | / |
| 47 | Phi-3-mini 3.8B | 38.0 | 68.8 | / | 37.5 | 82.5 | / | / | 8.38 |
| 48 | Llama3-8B-Instruct | 80.0 | 68.4 | / | / | 79.6 | 30.0 | / | / |
| 49 | Yi-9B | 90.0 | 68.4 | / | / | 52.3 | 15.9 | / | / |
| 50 | Aquila2-34B | 340.0 | 67.79 | 63.07 | / | 58.4 | / | / | / |
| 51 | Jamba-v0.1 | 520.0 | 67.4 | / | / | 59.9 | / | 45.4 | / |
| 52 | Llama3-8B | 80.0 | 66.6 | / | / | / | / | / | / |
| 53 | Qwen-14B | 140.0 | 66.3 | 72.1 | / | 61.3 | / | / | / |
| 54 | Grok-0 | 330.0 | 65.7 | / | / | 56.8 | / | / | / |
| 55 | Gemma 7B | 70.0 | 64.3 | / | 41.7 | 46.4 | 24.3 | 55.1 | / |
| 56 | Yi-6B-200K | 60.0 | 64.0 | 73.5 | / | / | / | / | / |
| 57 | Starling-7B-LM-Beta | 70.0 | 63.9 | / | / | / | / | / | 8.09 |
| 58 | LLaMA 65B | 650.0 | 63.4 | 38.8 | 47.6 | 50.9 | / | / | / |
| 59 | Yi-6B | 60.0 | 63.2 | 72.0 | / | / | / | / | / |
| 60 | LLaMA2 34B | 340.0 | 62.6 | / | 43.4 | 42.2 | / | / | / |
| 61 | Qwen1.5-MoE-A2.7B | 143.0 | 62.5 | / | / | 61.5 | / | / | 7.17 |
| 62 | StableLM2-12B | 120.0 | 62.09 | / | / | 56.03 | / | / | 8.15 |
| 63 | ChatGLM3-6B-Base | 60.0 | 61.4 | 69.0 | 53.7 | 72.3 | / | / | / |
| 64 | StableLM2-12B-Chat | 120.0 | 61.14 | / | / | 57.7 | / | / | 8.15 |
| 65 | XVERSE-13B-Chat | 130.0 | 60.2 | 53.1 | 48.3 | / | / | / | / |
| 66 | XVERSE-MoE-A4.2B | 258.0 | 60.2 | 60.5 | 48.0 | 51.2 | / | / | / |
| 67 | Mistral 7B | 73.0 | 60.1 | / | 43.0 | 52.1 | / | / | / |
| 68 | DeciLM-7B | 70.4 | 59.76 | / | / | 47.38 | / | / | / |
| 69 | Baichuan2-13B-Base | 130.0 | 59.17 | 58.1 | 48.17 | 52.77 | / | / | / |
| 70 | MiniCPM-MoE-8x2B | 136.0 | 58.9 | 58.11 | / | 61.5 | 10.52 | 39.22 | / |
| 71 | LLaMA 33B | 330.0 | 57.8 | / | 41.7 | 35.6 | / | / | / |
| 72 | Phi-2 | 27.0 | 56.7 | / | / | 61.1 | / | / | / |
| 73 | Qwen-7B | 70.0 | 56.7 | 59.6 | / | 51.6 | / | / | / |
| 74 | Qwen2-1.5B | 15.0 | 56.5 | 70.6 | / | 58.5 | 21.7 | 37.2 | / |
| 75 | ChatGLM2 12B | 120.0 | 56.18 | 61.6 | / | 40.94 | / | / | / |
| 76 | XVERSE-13B | 130.0 | 55.1 | 54.7 | 41.4 | / | / | / | / |
| 77 | LLaMA2 13B | 130.0 | 54.84 | / | 39.1 | 28.7 | / | / | / |
| 78 | Baichuan2-7B-Base | 70.0 | 54.16 | 54.0 | 42.73 | 24.49 | / | / | / |
| 79 | GPT-3 | 1750.0 | 53.9 | / | / | / | / | / | / |
| 80 | MiniCPM-2B-DPO | 24.0 | 53.46 | 51.13 | / | 53.83 | 10.24 | 36.87 | 7.25 |
| 81 | Baichuan 13B - Chat | 130.0 | 52.1 | 51.5 | / | 26.6 | / | / | / |
| 82 | Baichuan 13B - Base | 130.0 | 51.62 | 52.4 | / | 26.6 | / | / | / |
| 83 | InternLM 7B | 70.0 | 51.0 | 53.4 | 37.6 | 31.2 | / | / | / |
| 84 | InternLM Chat 7B 8K | 70.0 | 50.8 | 53.2 | 42.5 | 31.2 | / | / | / |
| 85 | ChatGLM2-6B | 62.0 | 47.86 | 51.7 | / | 32.37 | / | / | / |
| 86 | LLaMA 13B | 130.0 | 46.94 | / | 33.9 | 17.8 | / | / | / |
| 87 | Stable LM Zephyr 3B | 30.0 | 45.9 | 30.34 | / | 52.54 | 12.2 | 37.86 | 6.64 |
| 88 | Qwen2-0.5B | 4.0 | 45.4 | 58.2 | / | 58.5 | 10.7 | 28.4 | / |
| 89 | Qwen-1.8B | 18.0 | 45.3 | / | / | 32.3 | / | / | / |
| 90 | LLaMA2 7B | 70.0 | 45.3 | / | 29.3 | 14.6 | / | / | / |
| 91 | GLM-130B | 1300.0 | 44.8 | 44.0 | / | / | / | / | / |
| 92 | Ziya-LLaMA-13B-Pretrain-v1 | 130.0 | 43.9 | 30.2 | 27.2 | / | / | / | / |
| 93 | OpenLLaMA 13B | 130.0 | 42.4 | 24.7 | 24.0 | / | / | / | / |
| 94 | Gemma 2B | 20.0 | 42.3 | / | 24.2 | 17.7 | 11.8 | 35.2 | / |
| 95 | Gemma 2B - It | 20.0 | 42.3 | / | 24.2 | 17.7 | 11.8 | 35.2 | / |
| 96 | Baichuan 7B | 70.0 | 42.3 | 42.8 | 34.44 | 9.7 | / | / | / |
| 97 | Stable LM 2 - 1.6B | 16.0 | 38.93 | / | / | 17.82 | / | / | / |
| 98 | RecurrentGemma-2B | 27.0 | 38.4 | / | 23.8 | 13.4 | 11.8 | / | / |
| 99 | Phi-1.5 | 13.0 | 37.6 | / | / | 40.2 | / | / | / |
| 100 | DeepSeek Coder-6.7B Instruct | 67.0 | 37.2 | / | / | 62.8 | 28.6 | 46.9 | / |
| 101 | ChatGLM-6B | 62.0 | 36.9 | 38.9 | / | 4.82 | / | / | / |
| 102 | LLaMA 7B | 70.0 | 35.1 | 27.1 | 23.9 | 11.0 | / | / | / |
| 103 | MOSS | 160.0 | 27.4 | 33.13 | 26.8 | / | / | / | / |
| 104 | OPT | 1750.0 | 25.2 | 25.0 | 24.2 | / | / | / | / |
| 105 | Pythia | 120.0 | 25.1 | 26.2 | 25.3 | / | / | / | / |
| 106 | TinyLlama | 11.0 | 24.3 | 25.02 | / | 2.27 | / | / | / |
| 107 | Phi-1 | 13.0 | / | / | / | / | / | / | / |
| 108 | CodeGemma-2B | 20.0 | / | / | / | 41.2 | 20.9 | / | / |
| 109 | CodeGemma-7B | 70.0 | / | / | / | 44.2 | 19.9 | / | / |
| 110 | CodeGemma-7B-IT | 70.0 | / | / | / | 41.2 | 20.9 | / | / |
| 111 | WizardLM-2-70B | 70.0 | / | / | / | / | / | / | 8.92 |
| 112 | WizardLM-2-7B | 70.0 | / | / | / | / | / | / | 8.28 |
| 113 | Aquila-7B | 70.0 | / | 25.5 | 25.58 | / | / | / | / |
| 114 | CPM-Bee | 100.0 | / | 54.1 | / | / | / | / | / |
| 115 | WizardLM-2 8x22B | 1760.0 | / | / | / | / | / | / | 9.12 |
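Below is a minimal sketch of the side-by-side comparison suggested above, assuming the table has been exported to a CSV file with the same column labels. The file name `rankings.csv` and the chosen models are illustrative assumptions, not part of the original page.

```python
import pandas as pd

# Load a hypothetical CSV export of the leaderboard above.
# "/" marks scores that were never published, so read it as missing data.
df = pd.read_csv("rankings.csv", na_values="/")

# Pick a few related models to compare side by side
# (any names from the "Model" column work).
models = ["Qwen2-72B", "Llama3-70B-Instruct", "DeepSeek-V2-236B"]
subset = df[df["Model"].isin(models)].set_index("Model")

# Keep only the benchmarks that every selected model actually reports,
# so the comparison stays apples to apples.
benchmarks = ["MMLU", "CEval", "GSM8K", "MATH", "BBH", "MT Bench"]
print(subset[benchmarks].dropna(axis=1, how="any"))
```

Dropping any benchmark column with a missing score keeps the comparison honest: a model is never penalized for a benchmark its vendor simply did not report.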