Scenema Audio — Best-of-N: Diminishing Returns

10 Path A prompts × 100 candidates — 29 ranking methods

Methodology: Ranking Method Formulas & Text Prompts

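All methods use reward = (1 − WER) × max(score, 0), where WER is the word error rate and negative scores are clamped to zero. Only the score term varies per method, as described below. In the Key column, the suffixes _L and _S (and the prefixes clap_l and clap_s) denote the VoiceCLAP-Large and VoiceCLAP-Small models.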
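The formula can be sketched in a few lines. This is a minimal illustration, assuming WER is supplied as a fraction computed elsewhere and `score` is the method-specific value from the table below; the function name `reward` is hypothetical and not taken from the evaluation code.

```python
def reward(wer: float, score: float) -> float:
    """Return (1 - WER) * max(score, 0), as stated in the methodology."""
    # Negative scores (possible for the contrastive methods) are clamped to zero
    # before being weighted by transcription accuracy.
    return (1.0 - wer) * max(score, 0.0)


if __name__ == "__main__":
    print(reward(0.25, 4.0))   # a candidate with 25% word errors
    print(reward(0.10, -0.3))  # a negative score contributes nothing
```

As written, the formula does not clamp the (1 − WER) factor, so a WER above 1 would yield a negative reward for a positive score. Whether the evaluation guards against that case is not stated.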

#	Key	Score Formula	Text Prompt(s)
0	standard	Content Enjoyment	—
1	clap_lq	cos(audio, quality_text)	"pleasant, realistic, genuine, authentic, natural performance, high-quality recording"
2	clap_sq	cos(audio, quality_text)	"pleasant, realistic, genuine, authentic, natural performance, high-quality recording"
3	clap_lp	cos(audio, prompt)	Original DramaBox prompt
4	clap_sp	cos(audio, prompt)	Original DramaBox prompt
5	v1_nat_L	cos(audio, nat)	"natural, spontaneous, lifelike speech with genuine emotion"
6	v2_auth_L	cos(audio, auth)	"authentic, emotionally truthful, deeply felt voice performance"
7	v3_pro_L	cos(audio, pro)	"professional studio recording, crystal clear high-fidelity audio"
8	v4_expr_L	cos(audio, expr)	"expressive, dynamic voice acting with rich emotional range"
9	v5_cine_L	cos(audio, cine)	"immersive cinematic narration, compelling storytelling"
10	v6_nat_S	cos(audio, nat)	"natural, spontaneous, lifelike speech with genuine emotion"
11	v7_auth_S	cos(audio, auth)	"authentic, emotionally truthful, deeply felt voice performance"
12	v8_pro_S	cos(audio, pro)	"professional studio recording, crystal clear high-fidelity audio"
13	v9_nr_L	cos(audio, nat) − cos(audio, rob)	+ "natural, spontaneous, lifelike speech with genuine emotion" / − "robotic, mechanical, monotonous, synthetic computer speech"
14	v10_ac_L	cos(audio, auth) − cos(audio, cheap)	+ "authentic, emotionally truthful, deeply felt voice performance" / − "cheap, amateurish, rehearsed, stilted text-to-speech output"
15	v11_pd_L	cos(audio, pro) − cos(audio, dist)	+ "professional studio recording, crystal clear high-fidelity audio" / − "distorted, noisy, muffled, low-quality poor recording"
16	v12_ef_L	cos(audio, expr) − cos(audio, flat)	+ "expressive, dynamic voice acting with rich emotional range" / − "flat, lifeless, boring, emotionally dead recitation"
17	v13_ff_L	cos(audio, full_pos) − cos(audio, full_neg)	+ "natural spontaneous genuine authentic high-quality voice performance" / − "robotic distorted monotonous rehearsed cheap artificial synthetic"
18	v14_wr_L	cos(audio, warm) − cos(audio, rob)	+ "warm, pleasant, engaging conversational human voice" / − "robotic, mechanical, monotonous, synthetic computer speech"
19	v15_nr_S	cos(audio, nat) − cos(audio, rob)	+ "natural, spontaneous, lifelike speech with genuine emotion" / − "robotic, mechanical, monotonous, synthetic computer speech"
20	v16_ac_S	cos(audio, auth) − cos(audio, cheap)	+ "authentic, emotionally truthful, deeply felt voice performance" / − "cheap, amateurish, rehearsed, stilted text-to-speech output"
21	v17_pd_S	cos(audio, pro) − cos(audio, dist)	+ "professional studio recording, crystal clear high-fidelity audio" / − "distorted, noisy, muffled, low-quality poor recording"
22	v18_ef_S	cos(audio, expr) − cos(audio, flat)	+ "expressive, dynamic voice acting with rich emotional range" / − "flat, lifeless, boring, emotionally dead recitation"
23	v19_ff_S	cos(audio, full_pos) − cos(audio, full_neg)	+ "natural spontaneous genuine authentic high-quality voice performance" / − "robotic distorted monotonous rehearsed cheap artificial synthetic"
24	v20_wr_S	cos(audio, warm) − cos(audio, rob)	+ "warm, pleasant, engaging conversational human voice" / − "robotic, mechanical, monotonous, synthetic computer speech"
25	v21_san_L	cos(audio, sanitized_prompt)	Prompt with quoted speech removed (Large)
26	v22_san_S	cos(audio, sanitized_prompt)	Prompt with quoted speech removed (Small)
27	v23_snr_L	cos(audio, sanitized) − cos(audio, neg_san)	+ sanitized prompt / − "robotic, distorted, uncanny" (Large)
28	v24_snr_S	cos(audio, sanitized) − cos(audio, neg_san)	+ sanitized prompt / − "robotic, distorted, uncanny" (Small)
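The single-prompt and contrastive score formulas in the table differ only in whether a negative text embedding is subtracted. The sketch below illustrates both, assuming the audio and text embeddings are already available as plain vectors; the toy vectors and the names `cosine` and `contrastive_score` are hypothetical. Some reported single-prompt rewards exceed 1, which a raw cosine times (1 − WER) cannot do, so the page presumably applies a scaling that is not documented here. The sketch uses raw cosine similarity.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def contrastive_score(audio, positive, negative) -> float:
    """cos(audio, positive) - cos(audio, negative), as in methods 13 to 24."""
    return cosine(audio, positive) - cosine(audio, negative)


if __name__ == "__main__":
    # Toy 2-D embeddings standing in for real model outputs (assumption).
    audio = [0.9, 0.1]
    natural = [1.0, 0.0]
    robotic = [0.0, 1.0]
    print(cosine(audio, natural))                      # single-prompt score
    print(contrastive_score(audio, natural, robotic))  # contrastive score
```

Because the contrastive score can be negative when the audio sits closer to the negative prompt, the max(score, 0) clamp in the reward matters mainly for these methods.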

Cross-Method Diminishing Returns Comparison

Method	N=5	N=10	N=25	N=50	N=100	Gain N=5→100	Knee Point
Standard: (1−WER) × Content Enjoyment	4.0428	4.1408	4.3040	4.3142	4.3628	+0.3199	N=50
VoiceCLAP-Large × Quality Text	0.9382	0.9669	0.9912	1.0029	1.0105	+0.0723	N=50
VoiceCLAP-Small × Quality Text	0.7145	0.7557	0.7905	0.7989	0.8163	+0.1018	N=50
VoiceCLAP-Large × Prompt Match	1.3999	1.4481	1.4806	1.4939	1.5071	+0.1072	N=50
VoiceCLAP-Small × Prompt Match	0.7935	0.8272	0.8358	0.8533	0.8745	+0.0811	N=25
v1 Natural (Large)	1.0182	1.0507	1.0753	1.0955	1.1007	+0.0825	N=50
v2 Authentic (Large)	1.0417	1.0757	1.1070	1.1228	1.1283	+0.0866	N=50
v3 Professional (Large)	0.8839	0.9104	0.9344	0.9464	0.9553	+0.0714	N=50
v4 Expressive (Large)	1.0674	1.0989	1.1503	1.1399	1.1609	+0.0936	N=50
v5 Cinematic (Large)	1.0103	1.0374	1.0845	1.0736	1.0972	+0.0869	N=50
v6 Natural (Small)	0.7287	0.7761	0.8128	0.8190	0.8507	+0.1220	N=50
v7 Authentic (Small)	0.6651	0.7163	0.7486	0.7537	0.7703	+0.1053	N=50
v8 Professional (Small)	0.6109	0.6477	0.6647	0.6831	0.7008	+0.0899	N=100
v9 Natural−Robotic (Large)	1.7977	1.8396	1.8937	1.9131	1.9235	+0.1258	N=50
v10 Authentic−Cheap (Large)	1.8655	1.9179	1.9764	1.9911	2.0030	+0.1375	N=50
v11 Professional−Distorted (Large)	1.7750	1.8247	1.8765	1.8977	1.9133	+0.1383	N=50
v12 Expressive−Flat (Large)	1.8149	1.8563	1.9296	1.9202	1.9482	+0.1333	N=50
v13 FullPos−FullNeg (Large)	1.7943	1.8421	1.8963	1.9123	1.9265	+0.1321	N=50
v14 Warm−Robotic (Large)	1.6974	1.7359	1.7887	1.8043	1.8119	+0.1145	N=50
v15 Natural−Robotic (Small)	1.8026	1.8541	1.9326	1.9311	1.9696	+0.1670	N=50
v16 Authentic−Cheap (Small)	1.7817	1.8385	1.9087	1.9118	1.9354	+0.1537	N=50
v17 Professional−Distorted (Small)	1.6513	1.7063	1.7464	1.7580	1.7837	+0.1324	N=50
v18 Expressive−Flat (Small)	1.7013	1.7937	1.8276	1.8426	1.8982	+0.1969	N=50
v19 FullPos−FullNeg (Small)	1.6088	1.6750	1.7174	1.7289	1.7616	+0.1528	N=50
v20 Warm−Robotic (Small)	1.7561	1.7931	1.8711	1.8821	1.9198	+0.1637	N=50
v21 Sanitized Prompt (Large)	1.1625	1.1956	1.2300	1.2426	1.2543	+0.0918	N=50
v22 Sanitized Prompt (Small)	0.7789	0.8160	0.8245	0.8353	0.8623	+0.0834	N=25
v23 Sanitized−Uncanny (Large)	1.9633	2.0168	2.0701	2.0903	2.0995	+0.1362	N=50
v24 Sanitized−Uncanny (Small)	1.6995	1.7653	1.7926	1.8202	1.8531	+0.1536	N=50
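The page does not say how the best-of-N values are sampled from the 100 candidates per prompt. One plausible reading, sketched below purely as an assumption, is to draw random subsets of size N from a prompt's reward pool and take the maximum of each. The function name `expected_best`, the trial count, and the synthetic pool are all hypothetical. The Gain column appears to be the N=100 value minus the N=5 value.

```python
import random
from statistics import mean
from typing import Sequence


def expected_best(rewards: Sequence[float], n: int,
                  trials: int = 1000, seed: int = 0) -> float:
    """Average of max(reward) over random size-n subsets of the pool."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    return mean(max(rng.sample(list(rewards), n)) for _ in range(trials))


if __name__ == "__main__":
    # Synthetic pool of 100 rewards standing in for one prompt (assumption).
    pool_rng = random.Random(42)
    pool = [pool_rng.uniform(0.0, 5.0) for _ in range(100)]
    curve = {n: expected_best(pool, n) for n in (5, 10, 25, 50, 100)}
    for n, value in curve.items():
        print(n, round(value, 4))
    print("gain 5->100:", round(curve[100] - curve[5], 4))
```

With many trials this estimate is non-decreasing in N. Several per-prompt entries on this page dip as N grows, which would be consistent with a single independent draw per N rather than an average, though that is an inference.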

[Chart: Diminishing Returns — All Methods Overlaid]

Marginal Improvement per Additional Candidate

Method	N=5→10	N=10→25	N=25→50	N=50→100
Standard: (1−WER) × Content Enjoyment	0.01959/cand (2.4%)	0.01088/cand (3.9%)	0.00041/cand (0.2%)	0.00097/cand (1.1%)
VoiceCLAP-Large × Quality Text	0.00572/cand (3.1%)	0.00162/cand (2.5%)	0.00047/cand (1.2%)	0.00015/cand (0.8%)
VoiceCLAP-Small × Quality Text	0.00823/cand (5.8%)	0.00232/cand (4.6%)	0.00033/cand (1.1%)	0.00035/cand (2.2%)
VoiceCLAP-Large × Prompt Match	0.00965/cand (3.4%)	0.00216/cand (2.2%)	0.00053/cand (0.9%)	0.00026/cand (0.9%)
VoiceCLAP-Small × Prompt Match	0.00676/cand (4.3%)	0.00057/cand (1.0%)	0.00070/cand (2.1%)	0.00043/cand (2.5%)
v1 Natural (Large)	0.00650/cand (3.2%)	0.00164/cand (2.3%)	0.00081/cand (1.9%)	0.00010/cand (0.5%)
v2 Authentic (Large)	0.00679/cand (3.3%)	0.00209/cand (2.9%)	0.00063/cand (1.4%)	0.00011/cand (0.5%)
v3 Professional (Large)	0.00531/cand (3.0%)	0.00160/cand (2.6%)	0.00048/cand (1.3%)	0.00018/cand (0.9%)
v4 Expressive (Large)	0.00630/cand (3.0%)	0.00343/cand (4.7%)	-0.00042/cand (-0.9%)	0.00042/cand (1.8%)
v5 Cinematic (Large)	0.00543/cand (2.7%)	0.00314/cand (4.5%)	-0.00044/cand (-1.0%)	0.00047/cand (2.2%)
v6 Natural (Small)	0.00948/cand (6.5%)	0.00244/cand (4.7%)	0.00025/cand (0.8%)	0.00063/cand (3.9%)
v7 Authentic (Small)	0.01026/cand (7.7%)	0.00215/cand (4.5%)	0.00020/cand (0.7%)	0.00033/cand (2.2%)
v8 Professional (Small)	0.00736/cand (6.0%)	0.00113/cand (2.6%)	0.00074/cand (2.8%)	0.00035/cand (2.6%)
v9 Natural−Robotic (Large)	0.00837/cand (2.3%)	0.00361/cand (2.9%)	0.00077/cand (1.0%)	0.00021/cand (0.5%)
v10 Authentic−Cheap (Large)	0.01047/cand (2.8%)	0.00390/cand (3.0%)	0.00059/cand (0.7%)	0.00024/cand (0.6%)
v11 Professional−Distorted (Large)	0.00994/cand (2.8%)	0.00346/cand (2.8%)	0.00085/cand (1.1%)	0.00031/cand (0.8%)
v12 Expressive−Flat (Large)	0.00828/cand (2.3%)	0.00489/cand (3.9%)	-0.00037/cand (-0.5%)	0.00056/cand (1.5%)
v13 FullPos−FullNeg (Large)	0.00956/cand (2.7%)	0.00361/cand (2.9%)	0.00064/cand (0.8%)	0.00028/cand (0.7%)
v14 Warm−Robotic (Large)	0.00770/cand (2.3%)	0.00352/cand (3.0%)	0.00062/cand (0.9%)	0.00015/cand (0.4%)
v15 Natural−Robotic (Small)	0.01031/cand (2.9%)	0.00523/cand (4.2%)	-0.00006/cand (-0.1%)	0.00077/cand (2.0%)
v16 Authentic−Cheap (Small)	0.01137/cand (3.2%)	0.00468/cand (3.8%)	0.00012/cand (0.2%)	0.00047/cand (1.2%)
v17 Professional−Distorted (Small)	0.01100/cand (3.3%)	0.00267/cand (2.4%)	0.00046/cand (0.7%)	0.00051/cand (1.5%)
v18 Expressive−Flat (Small)	0.01849/cand (5.4%)	0.00226/cand (1.9%)	0.00060/cand (0.8%)	0.00111/cand (3.0%)
v19 FullPos−FullNeg (Small)	0.01325/cand (4.1%)	0.00283/cand (2.5%)	0.00046/cand (0.7%)	0.00065/cand (1.9%)
v20 Warm−Robotic (Small)	0.00740/cand (2.1%)	0.00520/cand (4.3%)	0.00044/cand (0.6%)	0.00075/cand (2.0%)
v21 Sanitized Prompt (Large)	0.00663/cand (2.9%)	0.00229/cand (2.9%)	0.00050/cand (1.0%)	0.00023/cand (0.9%)
v22 Sanitized Prompt (Small)	0.00742/cand (4.8%)	0.00056/cand (1.0%)	0.00043/cand (1.3%)	0.00054/cand (3.2%)
v23 Sanitized−Uncanny (Large)	0.01069/cand (2.7%)	0.00355/cand (2.6%)	0.00081/cand (1.0%)	0.00018/cand (0.4%)
v24 Sanitized−Uncanny (Small)	0.01315/cand (3.9%)	0.00182/cand (1.5%)	0.00110/cand (1.5%)	0.00066/cand (1.8%)
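Each cell in the table above appears to be the change in Avg Best divided by the number of added candidates, with the relative change over the interval in parentheses. The sketch below recomputes the Standard row from the cross-method comparison values under that reading; the function name `marginal_gains` is hypothetical.

```python
NS = [5, 10, 25, 50, 100]
# Avg Best for "Standard: (1−WER) × Content Enjoyment", copied from the comparison table.
STANDARD_AVG_BEST = [4.0428, 4.1408, 4.3040, 4.3142, 4.3628]


def marginal_gains(ns, values):
    """Return (n_from, n_to, gain_per_candidate, percent_gain) per interval."""
    intervals = []
    for i in range(len(ns) - 1):
        delta = values[i + 1] - values[i]
        per_candidate = delta / (ns[i + 1] - ns[i])  # gain per extra candidate
        percent = 100.0 * delta / values[i]          # relative to interval start
        intervals.append((ns[i], ns[i + 1], per_candidate, percent))
    return intervals


if __name__ == "__main__":
    for n_from, n_to, per_cand, pct in marginal_gains(NS, STANDARD_AVG_BEST):
        print(f"N={n_from}→{n_to}: {per_cand:.5f}/cand ({pct:.1f}%)")
```

Small differences in the last digit against the table are expected, since the inputs here are already rounded to four decimals.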

Ablation: Pronunciation Suffix Effect

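Compares N=10 candidates generated without the pronunciation suffix against N=10 candidates generated with it. Each Δ is the with-suffix value minus the without-suffix value.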

Ranking Method	Without Suffix Mean	Without Suffix Best	Without Suffix Median	With Suffix Mean	With Suffix Best	With Suffix Median	Δ Mean	Δ Best
Standard: (1−WER) × Content Enjoyment	3.3615	4.1847	3.6175	3.3763	3.9955	3.5269	+0.0149	-0.1891
VoiceCLAP-Large × Quality Text	0.7838	0.9695	0.8348	0.7841	0.9143	0.8175	+0.0003	-0.0552
VoiceCLAP-Small × Quality Text	0.5774	0.7412	0.6003	0.5810	0.7199	0.6050	+0.0036	-0.0213
VoiceCLAP-Large × Prompt Match	1.1613	1.4424	1.2490	1.1714	1.3765	1.2262	+0.0101	-0.0659
VoiceCLAP-Small × Prompt Match	0.6495	0.8131	0.6829	0.6531	0.7878	0.6749	+0.0036	-0.0253
v1 Natural (Large)	0.8459	1.0597	0.9060	0.8467	0.9954	0.8844	+0.0008	-0.0643
v2 Authentic (Large)	0.8674	1.0861	0.9346	0.8735	1.0270	0.9118	+0.0061	-0.0591
v3 Professional (Large)	0.7392	0.9144	0.7867	0.7382	0.8605	0.7725	-0.0010	-0.0539
v4 Expressive (Large)	0.8872	1.1027	0.9527	0.8911	1.0474	0.9281	+0.0039	-0.0553
v5 Cinematic (Large)	0.8382	1.0399	0.9002	0.8417	0.9907	0.8810	+0.0034	-0.0492
v6 Natural (Small)	0.6090	0.7785	0.6431	0.6235	0.7449	0.6467	+0.0144	-0.0336
v7 Authentic (Small)	0.5564	0.7202	0.5876	0.5714	0.6861	0.5949	+0.0151	-0.0341
v8 Professional (Small)	0.4890	0.6297	0.5065	0.4877	0.6010	0.5041	-0.0013	-0.0286
v9 Natural−Robotic (Large)	1.4963	1.8591	1.6107	1.5032	1.7597	1.5698	+0.0069	-0.0994
v10 Authentic−Cheap (Large)	1.5558	1.9359	1.6813	1.5684	1.8436	1.6392	+0.0126	-0.0923
v11 Professional−Distorted (Large)	1.4769	1.8386	1.5808	1.4800	1.7328	1.5483	+0.0031	-0.1057
v12 Expressive−Flat (Large)	1.5067	1.8652	1.6254	1.5134	1.7795	1.5797	+0.0067	-0.0857
v13 FullPos−FullNeg (Large)	1.4975	1.8539	1.6158	1.5039	1.7564	1.5741	+0.0064	-0.0975
v14 Warm−Robotic (Large)	1.4152	1.7466	1.5236	1.4226	1.6600	1.4859	+0.0074	-0.0866
v15 Natural−Robotic (Small)	1.5035	1.8746	1.6153	1.5182	1.7863	1.5854	+0.0147	-0.0883
v16 Authentic−Cheap (Small)	1.4822	1.8690	1.5956	1.5071	1.7842	1.5690	+0.0249	-0.0848
v17 Professional−Distorted (Small)	1.3712	1.7032	1.4657	1.3774	1.6152	1.4430	+0.0062	-0.0880
v18 Expressive−Flat (Small)	1.4142	1.7884	1.5018	1.4204	1.7138	1.4643	+0.0063	-0.0745
v19 FullPos−FullNeg (Small)	1.3365	1.6734	1.4319	1.3446	1.5883	1.4059	+0.0081	-0.0851
v20 Warm−Robotic (Small)	1.4482	1.8182	1.5464	1.4453	1.7334	1.5037	-0.0029	-0.0849
v21 Sanitized Prompt (Large)	0.9617	1.2109	1.0314	0.9631	1.1460	1.0088	+0.0014	-0.0649
v22 Sanitized Prompt (Small)	0.6298	0.7958	0.6605	0.6325	0.7626	0.6558	+0.0026	-0.0332
v23 Sanitized−Uncanny (Large)	1.6294	2.0328	1.7549	1.6349	1.9304	1.7133	+0.0055	-0.1024
v24 Sanitized−Uncanny (Small)	1.4003	1.7655	1.4991	1.4042	1.6709	1.4626	+0.0039	-0.0946

Per-Prompt Ablation: Standard Reward (N=10)

#	Lang	No Suffix Mean	No Suffix Best	With Suffix Mean	With Suffix Best	Δ Mean	Δ Best
0	English	3.6461	4.5530	3.4901	3.7667	-0.1560	-0.7863
1	French	4.0515	5.1346	4.7163	5.0872	+0.6649	-0.0474
2	English	1.5349	2.6575	1.2702	1.3610	-0.2647	-1.2965
3	German	4.9259	5.0158	4.9545	5.0567	+0.0286	+0.0409
4	French	4.3312	4.5559	3.9856	4.5478	-0.3456	-0.0081
5	French	3.3635	4.5606	2.3323	4.4187	-1.0312	-0.1419
6	English	3.2741	3.8898	3.1473	3.7320	-0.1268	-0.1578
7	German	2.1434	2.9305	2.4512	2.9967	+0.3079	+0.0662
8	Spanish	2.8368	4.5719	3.7254	4.6909	+0.8886	+0.1190
9	French	3.5076	3.9773	3.6906	4.2977	+0.1829	+0.3204
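The per-prompt rows above can be rolled up into the Standard row of the suffix ablation table. The sketch below assumes that row is a plain average over the 10 prompts; the name `summarize` and the dictionary keys are hypothetical.

```python
from statistics import mean

# (no_suffix_mean, no_suffix_best, with_suffix_mean, with_suffix_best) per prompt,
# copied from the per-prompt ablation table.
ROWS = [
    (3.6461, 4.5530, 3.4901, 3.7667),
    (4.0515, 5.1346, 4.7163, 5.0872),
    (1.5349, 2.6575, 1.2702, 1.3610),
    (4.9259, 5.0158, 4.9545, 5.0567),
    (4.3312, 4.5559, 3.9856, 4.5478),
    (3.3635, 4.5606, 2.3323, 4.4187),
    (3.2741, 3.8898, 3.1473, 3.7320),
    (2.1434, 2.9305, 2.4512, 2.9967),
    (2.8368, 4.5719, 3.7254, 4.6909),
    (3.5076, 3.9773, 3.6906, 4.2977),
]


def summarize(rows):
    """Average each column over prompts and compute with-minus-without deltas."""
    no_mean, no_best, with_mean, with_best = (mean(col) for col in zip(*rows))
    return {
        "no_suffix_mean": no_mean,
        "no_suffix_best": no_best,
        "with_suffix_mean": with_mean,
        "with_suffix_best": with_best,
        "delta_mean": with_mean - no_mean,
        "delta_best": with_best - no_best,
    }


if __name__ == "__main__":
    for key, value in summarize(ROWS).items():
        print(f"{key}: {value:+.4f}")
```

Under this assumption the averages agree with the Standard row of the ablation table to within rounding. The small positive Δ Mean hides large per-prompt swings in both directions, so the aggregate alone understates how much the suffix matters for individual prompts.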

Standard: (1−WER) × Content Enjoyment — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	4.0428	4.1408	4.3040	4.3142	4.3628
Std Dev	0.9987	1.0256	0.8478	0.8574	0.8669
Avg Mean	3.3009	3.4083	3.3088	3.3431	3.3545

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	4.6092	4.4613	4.7767	4.8450	4.8450
1	French	4.8753	4.9405	4.9800	4.9405	5.1346
2	English	2.0074	2.0326	2.7127	2.6575	2.7127
3	German	4.9731	5.0104	5.0779	5.0760	5.0958
4	French	4.4928	4.5622	4.7721	4.6559	4.7721
5	French	3.9766	4.7602	4.8669	4.8669	4.8669
6	English	3.8362	3.9019	3.9019	3.9869	3.9869
7	German	2.5906	2.5978	2.9544	2.9718	3.0040
8	Spanish	4.7735	4.8485	4.6745	4.8485	4.8772
9	French	4.2935	4.2923	4.3229	4.2935	4.3324
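The summary rows can be cross-checked against the per-prompt table above. The sketch below assumes Avg Best is the plain mean over the 10 prompts and Std Dev is the sample standard deviation (n − 1 denominator). That reading matches the N=5 column to four decimals, but it is an inference rather than something the page states. Avg Mean cannot be recomputed this way because it depends on every candidate's reward, not only the best one.

```python
from statistics import mean, stdev

# Per-prompt best reward for the Standard method, copied from the table above.
BEST_N5 = [4.6092, 4.8753, 2.0074, 4.9731, 4.4928,
           3.9766, 3.8362, 2.5906, 4.7735, 4.2935]
BEST_N100 = [4.8450, 5.1346, 2.7127, 5.0958, 4.7721,
             4.8669, 3.9869, 3.0040, 4.8772, 4.3324]


def summarize_best(per_prompt_best):
    """Return (average, sample standard deviation) across prompts."""
    return mean(per_prompt_best), stdev(per_prompt_best)


if __name__ == "__main__":
    for label, column in (("N=5", BEST_N5), ("N=100", BEST_N100)):
        avg, sd = summarize_best(column)
        print(f"{label}: Avg Best {avg:.4f}, Std Dev {sd:.4f}")
```

The spread across prompts is large relative to the gain from N=5 to N=100, so the choice of prompt influences the best reward far more than the candidate budget does.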

VoiceCLAP-Large × Quality Text — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.9382	0.9669	0.9912	1.0029	1.0105
Std Dev	0.2386	0.2322	0.1934	0.1895	0.1935
Avg Mean	0.7730	0.8039	0.7775	0.7847	0.7867

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.1147	1.1060	1.1467	1.1844	1.1862
1	French	1.1369	1.1469	1.1469	1.1469	1.1535
2	English	0.5416	0.5584	0.7194	0.7339	0.7339
3	German	1.1707	1.1889	1.1718	1.1918	1.1918
4	French	1.1962	1.2330	1.2631	1.2276	1.2631
5	French	0.7821	0.9345	0.9020	0.9020	0.9345
6	English	0.8798	0.9053	0.9172	0.9319	0.9319
7	German	0.6082	0.6142	0.7200	0.7231	0.7231
8	Spanish	0.8656	0.9006	0.8644	0.9006	0.9006
9	French	1.0866	1.0808	1.0607	1.0866	1.0866

VoiceCLAP-Small × Quality Text — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.7145	0.7557	0.7905	0.7989	0.8163
Std Dev	0.2503	0.2709	0.2397	0.2265	0.2350
Avg Mean	0.5870	0.6080	0.5867	0.5937	0.5937

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	0.8147	0.7474	0.7407	0.8147	0.8147
1	French	0.8488	0.9494	0.9430	0.9430	1.0170
2	English	0.3480	0.3979	0.4652	0.5123	0.5123
3	German	1.0229	1.0576	1.0635	1.0671	1.0702
4	French	0.9377	1.1231	1.1231	1.1062	1.1231
5	French	0.4575	0.4484	0.5350	0.5576	0.5576
6	English	0.6366	0.7100	0.8218	0.7449	0.8218
7	German	0.4228	0.4753	0.5481	0.5481	0.5481
8	Spanish	0.6364	0.6185	0.6427	0.6692	0.6692
9	French	1.0197	1.0292	1.0223	1.0258	1.0292

VoiceCLAP-Large × Prompt Match — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.3999	1.4481	1.4806	1.4939	1.5071
Std Dev	0.4026	0.4093	0.3382	0.3519	0.3453
Avg Mean	1.1457	1.1872	1.1505	1.1642	1.1670

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.6556	1.6680	1.7490	1.7484	1.7490
1	French	1.8017	1.8237	1.8017	1.8334	1.8334
2	English	0.6004	0.6204	0.8337	0.7960	0.8337
3	German	1.8638	1.8633	1.8698	1.8768	1.8768
4	French	1.5106	1.5326	1.5106	1.5326	1.5465
5	French	1.4225	1.6769	1.6505	1.6505	1.6850
6	English	1.2575	1.3315	1.3354	1.3846	1.3846
7	German	0.8502	0.8613	1.0097	1.0097	1.0097
8	Spanish	1.6026	1.6564	1.5870	1.6564	1.6564
9	French	1.4338	1.4472	1.4584	1.4508	1.4959

VoiceCLAP-Small × Prompt Match — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.7935	0.8272	0.8358	0.8533	0.8745
Std Dev	0.2638	0.2790	0.2590	0.2505	0.2431
Avg Mean	0.6365	0.6639	0.6464	0.6498	0.6509

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.1269	1.2026	1.2313	1.2273	1.2313
1	French	0.7761	0.8312	0.8086	0.8157	0.8526
2	English	0.3177	0.3220	0.4026	0.4640	0.4640
3	German	1.1149	1.1149	1.1285	1.1585	1.1585
4	French	0.7659	0.7684	0.7746	0.7720	0.7769
5	French	0.7775	0.9295	0.8426	0.8637	0.9295
6	English	1.0807	1.1258	1.1101	1.1258	1.1258
7	German	0.5122	0.5182	0.5631	0.5778	0.6298
8	Spanish	0.6455	0.6840	0.7005	0.7105	0.7414
9	French	0.8172	0.7758	0.7964	0.8172	0.8355

v1 Natural (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.0182	1.0507	1.0753	1.0955	1.1007
Std Dev	0.2379	0.2240	0.1868	0.1745	0.1764
Avg Mean	0.8342	0.8666	0.8390	0.8465	0.8491

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.2314	1.2080	1.2740	1.2921	1.2921
1	French	1.2232	1.2332	1.2332	1.2377	1.2377
2	English	0.5997	0.6276	0.7881	0.8523	0.8523
3	German	1.1695	1.1722	1.1507	1.1986	1.1986
4	French	1.2457	1.2799	1.3074	1.2838	1.3074
5	French	0.8866	1.0465	1.0195	1.0195	1.0465
6	English	0.9340	0.9862	0.9862	1.0048	1.0048
7	German	0.7097	0.7272	0.8389	0.8410	0.8410
8	Spanish	0.9461	0.9893	0.9350	0.9893	0.9893
9	French	1.2362	1.2370	1.2198	1.2362	1.2370

v2 Authentic (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.0417	1.0757	1.1070	1.1228	1.1283
Std Dev	0.2386	0.2339	0.1810	0.1710	0.1751
Avg Mean	0.8528	0.8875	0.8582	0.8678	0.8704

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.1051	1.0830	1.1501	1.1702	1.1702
1	French	1.2936	1.3127	1.3127	1.3222	1.3222
2	English	0.5765	0.6058	0.7828	0.8170	0.8170
3	German	1.2334	1.2331	1.2296	1.2522	1.2522
4	French	1.2353	1.2564	1.2991	1.2639	1.2991
5	French	0.9750	1.1631	1.1466	1.1466	1.1631
6	English	0.9358	0.9734	0.9799	1.0093	1.0093
7	German	0.7218	0.7421	0.8478	0.8630	0.8630
8	Spanish	1.1311	1.1742	1.1216	1.1742	1.1742
9	French	1.2093	1.2128	1.1998	1.2093	1.2128

v3 Professional (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.8839	0.9104	0.9344	0.9464	0.9553
Std Dev	0.2272	0.2160	0.1796	0.1750	0.1772
Avg Mean	0.7294	0.7574	0.7333	0.7396	0.7416

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.0549	1.0490	1.0790	1.1192	1.1192
1	French	1.0704	1.0783	1.0783	1.0783	1.0923
2	English	0.5017	0.5310	0.6894	0.7025	0.7025
3	German	1.1204	1.1388	1.1290	1.1388	1.1388
4	French	1.1153	1.1394	1.1697	1.1394	1.1697
5	French	0.7279	0.8725	0.8282	0.8282	0.8725
6	English	0.8527	0.8724	0.9049	0.9047	0.9049
7	German	0.5705	0.5782	0.6768	0.6910	0.6910
8	Spanish	0.8131	0.8501	0.8116	0.8501	0.8501
9	French	1.0116	0.9944	0.9771	1.0116	1.0116

v4 Expressive (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.0674	1.0989	1.1503	1.1399	1.1609
Std Dev	0.2425	0.2529	0.2048	0.1955	0.2083
Avg Mean	0.8691	0.9041	0.8739	0.8839	0.8864

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.2289	1.2245	1.3037	1.3001	1.3037
1	French	1.2979	1.3219	1.3527	1.3219	1.3527
2	English	0.5868	0.6071	0.8132	0.8134	0.8134
3	German	1.1185	1.1288	1.1510	1.1510	1.1510
4	French	1.2537	1.2548	1.3456	1.2817	1.3456
5	French	1.0679	1.3005	1.2465	1.2465	1.3005
6	English	1.1771	1.1903	1.2155	1.1905	1.2155
7	German	0.6678	0.6654	0.7671	0.7616	0.7671
8	Spanish	1.1366	1.1717	1.1317	1.1717	1.1717
9	French	1.1385	1.1239	1.1762	1.1606	1.1881

v5 Cinematic (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.0103	1.0374	1.0845	1.0736	1.0972
Std Dev	0.2397	0.2533	0.2088	0.1943	0.2124
Avg Mean	0.8174	0.8536	0.8249	0.8344	0.8363

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.0651	1.0344	1.1200	1.1066	1.1200
1	French	1.2557	1.2760	1.2809	1.2583	1.2809
2	English	0.5464	0.5541	0.7294	0.7332	0.7332
3	German	1.1563	1.1557	1.1629	1.1597	1.1629
4	French	1.2169	1.2297	1.3178	1.2297	1.3178
5	French	1.0553	1.2633	1.2140	1.2140	1.2633
6	English	1.0399	1.0740	1.1273	1.1275	1.1275
7	German	0.6163	0.6187	0.7168	0.7167	0.7168
8	Spanish	1.0128	1.0402	1.0107	1.0402	1.0487
9	French	1.1381	1.1281	1.1657	1.1496	1.2004

v6 Natural (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.7287	0.7761	0.8128	0.8190	0.8507
Std Dev	0.2415	0.2469	0.2125	0.2120	0.2348
Avg Mean	0.6028	0.6300	0.6110	0.6160	0.6182

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	0.5059	0.5173	0.5334	0.5646	0.5646
1	French	1.0599	1.0794	1.1280	1.1280	1.1919
2	English	0.3369	0.4198	0.4931	0.5142	0.5142
3	German	0.9284	1.0025	1.0025	1.0094	1.0473
4	French	0.8590	0.9032	0.9129	0.9639	0.9639
5	French	0.5827	0.6050	0.7279	0.7124	0.7279
6	English	0.6672	0.7264	0.8368	0.7746	0.8368
7	German	0.5587	0.5796	0.6511	0.6488	0.6511
8	Spanish	0.7423	0.7995	0.8124	0.8272	0.8808
9	French	1.0463	1.1286	1.0296	1.0466	1.1286

v7 Authentic (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.6651	0.7163	0.7486	0.7537	0.7703
Std Dev	0.2203	0.2239	0.1956	0.2015	0.2119
Avg Mean	0.5480	0.5734	0.5544	0.5592	0.5620

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	0.4735	0.4737	0.5110	0.5215	0.5215
1	French	0.9586	1.0144	1.0517	1.0517	1.1154
2	English	0.3252	0.4292	0.4978	0.4965	0.4978
3	German	0.9092	0.9629	0.9629	0.9911	0.9939
4	French	0.7944	0.8742	0.8151	0.8742	0.8742
5	French	0.5869	0.5844	0.7027	0.7094	0.7094
6	English	0.5076	0.6049	0.6958	0.6361	0.6958
7	German	0.4675	0.4787	0.5227	0.5371	0.5371
8	Spanish	0.7505	0.8420	0.8443	0.8420	0.8590
9	French	0.8771	0.8990	0.8817	0.8771	0.8990

v8 Professional (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.6109	0.6477	0.6647	0.6831	0.7008
Std Dev	0.2137	0.2175	0.1938	0.1962	0.1948
Avg Mean	0.5024	0.5159	0.4972	0.5037	0.5041

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	0.9077	0.8387	0.7916	0.9145	0.9145
1	French	0.6426	0.7534	0.7256	0.7340	0.7567
2	English	0.3374	0.3942	0.4532	0.5248	0.5248
3	German	0.8259	0.8405	0.8592	0.8675	0.8862
4	French	0.7733	0.9628	0.9628	0.9288	0.9628
5	French	0.3198	0.3729	0.3862	0.3862	0.4062
6	English	0.5761	0.6278	0.6672	0.6410	0.6672
7	German	0.3825	0.3985	0.4904	0.4671	0.4904
8	Spanish	0.5511	0.5038	0.5166	0.5523	0.5845
9	French	0.7924	0.7844	0.7942	0.8146	0.8146

v9 Natural−Robotic (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.7977	1.8396	1.8937	1.9131	1.9235
Std Dev	0.4141	0.4187	0.3319	0.3083	0.3150
Avg Mean	1.4730	1.5270	1.4786	1.4947	1.4977

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	2.0661	2.0505	2.1619	2.1282	2.1619
1	French	2.1434	2.1459	2.1522	2.1463	2.1522
2	English	0.9397	0.9205	1.1950	1.2670	1.2670
3	German	2.1502	2.1543	2.1609	2.1822	2.1822
4	French	2.0839	2.1082	2.1479	2.1064	2.1479
5	French	1.6871	1.9733	1.9552	1.9552	1.9787
6	English	1.6684	1.7464	1.7464	1.8327	1.8327
7	German	1.2654	1.2715	1.4843	1.4843	1.4843
8	Spanish	1.8729	1.9281	1.8559	1.9281	1.9281
9	French	2.1002	2.0969	2.0773	2.1002	2.1002

v10 Authentic−Cheap (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.8655	1.9179	1.9764	1.9911	2.0030
Std Dev	0.4594	0.4635	0.3655	0.3525	0.3586
Avg Mean	1.5224	1.5874	1.5349	1.5534	1.5569

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.8768	1.8537	1.9771	1.9443	1.9771
1	French	2.3525	2.3673	2.3673	2.3777	2.3777
2	English	0.9413	0.9562	1.2536	1.3008	1.3008
3	German	2.2198	2.2221	2.2121	2.2457	2.2457
4	French	2.2105	2.2448	2.2996	2.2351	2.2996
5	French	1.8523	2.1548	2.1511	2.1511	2.1548
6	English	1.6756	1.7574	1.7574	1.8312	1.8312
7	German	1.2547	1.2891	1.4908	1.4908	1.4908
8	Spanish	2.0719	2.1348	2.0683	2.1348	2.1526
9	French	2.2000	2.1986	2.1863	2.2000	2.2000

v11 Professional−Distorted (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.7750	1.8247	1.8765	1.8977	1.9133
Std Dev	0.4303	0.4140	0.3341	0.3139	0.3261
Avg Mean	1.4484	1.5107	1.4600	1.4749	1.4790

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	2.0527	2.0382	2.1174	2.1641	2.1641
1	French	2.1680	2.1726	2.1726	2.1850	2.1900
2	English	0.9756	1.0290	1.3382	1.3799	1.3799
3	German	2.0677	2.0745	2.0411	2.0854	2.0874
4	French	2.2031	2.2408	2.3294	2.2408	2.3294
5	French	1.6320	1.9314	1.8749	1.8749	1.9314
6	English	1.7127	1.7319	1.7685	1.7832	1.7832
7	German	1.1227	1.1594	1.3337	1.3774	1.3774
8	Spanish	1.7627	1.8335	1.7722	1.8335	1.8372
9	French	2.0527	2.0358	2.0174	2.0527	2.0527

v12 Expressive−Flat (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.8149	1.8563	1.9296	1.9202	1.9482
Std Dev	0.4287	0.4519	0.3640	0.3593	0.3693
Avg Mean	1.4762	1.5332	1.4822	1.5004	1.5031

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	2.0991	2.0966	2.2230	2.1829	2.2230
1	French	2.2124	2.2356	2.2663	2.2356	2.2663
2	English	0.8995	0.8832	1.1952	1.1778	1.1952
3	German	2.0349	2.0397	2.0653	2.0653	2.0653
4	French	2.0778	2.0615	2.1639	2.0778	2.1639
5	French	1.8498	2.2058	2.1574	2.1574	2.2058
6	English	1.9146	1.9434	1.9497	2.0029	2.0029
7	German	1.1719	1.1811	1.3584	1.3512	1.3584
8	Spanish	1.9742	2.0180	1.9580	2.0180	2.0180
9	French	1.9146	1.8979	1.9585	1.9334	1.9831

v13 FullPos−FullNeg (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.7943	1.8421	1.8963	1.9123	1.9265
Std Dev	0.4474	0.4435	0.3601	0.3461	0.3552
Avg Mean	1.4693	1.5295	1.4792	1.4960	1.4989

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	2.0454	2.0362	2.1145	2.1440	2.1495
1	French	2.2890	2.2936	2.2936	2.2952	2.2952
2	English	0.9306	0.9466	1.2411	1.2727	1.2727
3	German	2.1082	2.1209	2.0988	2.1393	2.1393
4	French	2.1436	2.1673	2.2611	2.1752	2.2611
5	French	1.7027	1.9983	1.9584	1.9584	1.9983
6	English	1.6638	1.7470	1.7470	1.8141	1.8141
7	German	1.1577	1.1723	1.3648	1.3687	1.3687
8	Spanish	1.7973	1.8505	1.7913	1.8505	1.8609
9	French	2.1048	2.0886	2.0928	2.1048	2.1048

v14 Warm−Robotic (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.6974	1.7359	1.7887	1.8043	1.8119
Std Dev	0.4156	0.4236	0.3399	0.3324	0.3368
Avg Mean	1.3936	1.4459	1.3999	1.4153	1.4178

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.9632	1.9719	2.0616	2.0705	2.0788
1	French	2.0440	2.0538	2.0585	2.0550	2.0622
2	English	0.8698	0.8430	1.1213	1.1390	1.1390
3	German	2.1664	2.1821	2.1692	2.1821	2.1821
4	French	1.9631	1.9845	2.0474	2.0074	2.0474
5	French	1.5726	1.8374	1.8202	1.8202	1.8412
6	English	1.5978	1.6422	1.6411	1.7239	1.7239
7	German	1.1361	1.1446	1.3392	1.3392	1.3392
8	Spanish	1.7527	1.7974	1.7458	1.7974	1.7974
9	French	1.9081	1.9020	1.8825	1.9081	1.9081

v15 Natural−Robotic (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.8026	1.8541	1.9326	1.9311	1.9696
Std Dev	0.5010	0.4799	0.3846	0.3942	0.4028
Avg Mean	1.4779	1.5389	1.4892	1.5032	1.5074

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.7270	1.7081	1.7852	1.8093	1.8093
1	French	2.3623	2.3793	2.4191	2.3883	2.4598
2	English	0.8584	0.9144	1.1863	1.1794	1.1863
3	German	2.3182	2.3214	2.3273	2.3333	2.3333
4	French	2.2212	2.2375	2.2341	2.2474	2.2894
5	French	1.6948	1.8322	1.9607	1.9607	2.0144
6	English	1.5675	1.7655	1.9466	1.7968	1.9466
7	German	1.2139	1.2447	1.4508	1.4508	1.4508
8	Spanish	1.8193	1.9025	1.8971	1.9025	1.9630
9	French	2.2429	2.2356	2.1185	2.2429	2.2429

v16 Authentic−Cheap (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.7817	1.8385	1.9087	1.9118	1.9354
Std Dev	0.5341	0.5212	0.4440	0.4598	0.4623
Avg Mean	1.4495	1.5095	1.4612	1.4749	1.4803

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.4267	1.4622	1.5581	1.5558	1.5581
1	French	2.4885	2.4814	2.5158	2.4835	2.5732
2	English	0.8388	0.8956	1.1289	1.0909	1.1289
3	German	2.4015	2.4581	2.4679	2.4914	2.4980
4	French	2.0734	2.0838	2.0420	2.0838	2.0841
5	French	1.8814	1.9653	2.1144	2.1027	2.1144
6	English	1.4083	1.6056	1.7207	1.6402	1.7207
7	German	1.2321	1.2453	1.4454	1.4454	1.4454
8	Spanish	1.9827	2.1401	2.1290	2.1401	2.1476
9	French	2.0838	2.0481	1.9650	2.0838	2.0838

v17 Professional−Distorted (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.6513	1.7063	1.7464	1.7580	1.7837
Std Dev	0.4352	0.4605	0.3781	0.3729	0.3824
Avg Mean	1.3513	1.4076	1.3615	1.3761	1.3789

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.9622	1.9468	2.0330	2.0690	2.0690
1	French	1.9589	2.0657	2.0189	2.0335	2.0781
2	English	0.8182	0.7728	1.0274	1.0490	1.0490
3	German	2.0161	2.0477	2.0879	2.0561	2.0879
4	French	2.0757	2.2022	2.1892	2.1122	2.2022
5	French	1.4342	1.7543	1.7053	1.7053	1.7543
6	English	1.5985	1.6178	1.6909	1.7206	1.7206
7	German	1.0469	1.0500	1.2384	1.1971	1.2384
8	Spanish	1.6495	1.6846	1.5991	1.6846	1.6852
9	French	1.9524	1.9206	1.8735	1.9524	1.9524

v18 Expressive−Flat (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.7013	1.7937	1.8276	1.8426	1.8982
Std Dev	0.4111	0.4399	0.3261	0.3341	0.3534
Avg Mean	1.3805	1.4300	1.3837	1.3990	1.4019

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	2.0113	2.0734	2.0734	2.1087	2.1087
1	French	1.8646	2.0749	2.0749	2.1048	2.1626
2	English	0.8097	0.8343	1.1538	1.1034	1.1538
3	German	1.9289	1.9227	1.9575	1.9506	1.9797
4	French	2.0792	2.0168	2.1658	2.0792	2.1658
5	French	1.6738	2.1693	1.8946	1.9074	2.1693
6	English	1.7307	1.7948	1.7573	1.8398	1.8398
7	German	1.1270	1.1637	1.3555	1.3880	1.3880
8	Spanish	1.9344	2.0282	1.9304	2.0282	2.0979
9	French	1.8532	1.8592	1.9131	1.9163	1.9163

v19 FullPos−FullNeg (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.6088	1.6750	1.7174	1.7289	1.7616
Std Dev	0.3849	0.3989	0.3002	0.3166	0.3198
Avg Mean	1.3245	1.3711	1.3261	1.3382	1.3428

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.8097	1.7644	1.8451	1.9185	1.9185
1	French	1.9501	2.0587	2.0406	2.0406	2.0958
2	English	0.8356	0.8575	1.1132	1.1050	1.1132
3	German	1.9650	2.0260	2.0101	2.0182	2.0464
4	French	1.8949	2.0296	2.0049	1.9771	2.0296
5	French	1.4371	1.7113	1.6936	1.7124	1.7480
6	English	1.4514	1.5264	1.6523	1.5550	1.6523
7	German	1.1311	1.1354	1.3329	1.3050	1.3329
8	Spanish	1.7368	1.7817	1.7025	1.7817	1.8034
9	French	1.8759	1.8590	1.7791	1.8759	1.8759

v20 Warm−Robotic (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.7561	1.7931	1.8711	1.8821	1.9198
Std Dev	0.5128	0.4824	0.3923	0.4187	0.4156
Avg Mean	1.4272	1.4791	1.4304	1.4457	1.4476

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	2.0760	1.9657	2.0306	2.0822	2.0975
1	French	2.0265	2.0437	2.0827	2.0473	2.1600
2	English	0.8115	0.8188	1.1059	1.0800	1.1059
3	German	2.2480	2.2188	2.2751	2.3131	2.3131
4	French	2.2909	2.3624	2.3603	2.3928	2.3928
5	French	1.6177	1.8221	1.8468	1.8619	1.9305
6	English	1.5157	1.7344	1.8718	1.7344	1.8718
7	German	1.0554	1.1254	1.3563	1.3563	1.3563
8	Spanish	1.7153	1.7404	1.7171	1.7489	1.7656
9	French	2.2044	2.0996	2.0646	2.2044	2.2044

v21 Sanitized Prompt (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.1625	1.1956	1.2300	1.2426	1.2543
Std Dev	0.3060	0.3030	0.2460	0.2545	0.2517
Avg Mean	0.9454	0.9796	0.9509	0.9609	0.9635

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.4779	1.4527	1.5485	1.5434	1.5485
1	French	1.4304	1.4440	1.4440	1.4551	1.4551
2	English	0.5605	0.5952	0.7945	0.7625	0.7945
3	German	1.3969	1.3842	1.3788	1.3906	1.4043
4	French	1.2123	1.2185	1.2123	1.2185	1.2325
5	French	1.1994	1.4025	1.3866	1.3961	1.4199
6	English	1.1788	1.2434	1.2325	1.2911	1.2911
7	German	0.6810	0.6975	0.8356	0.8356	0.8356
8	Spanish	1.1934	1.2387	1.1831	1.2387	1.2387
9	French	1.2943	1.2796	1.2842	1.2943	1.3225

v22 Sanitized Prompt (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.7789	0.8160	0.8245	0.8353	0.8623
Std Dev	0.2527	0.2791	0.2632	0.2584	0.2538
Avg Mean	0.6175	0.6439	0.6281	0.6316	0.6324

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.1392	1.2228	1.2553	1.2746	1.2746
1	French	0.7895	0.8362	0.7981	0.8180	0.8362
2	English	0.3173	0.3265	0.3999	0.4420	0.4420
3	German	0.9779	0.9935	1.0080	0.9989	1.0138
4	French	0.6726	0.6710	0.6748	0.6828	0.6828
5	French	0.7903	0.9830	0.8756	0.8983	0.9830
6	English	1.1144	1.1540	1.1778	1.1778	1.1778
7	German	0.5287	0.5264	0.5855	0.5855	0.6361
8	Spanish	0.7127	0.7282	0.7284	0.7282	0.8130
9	French	0.7466	0.7187	0.7416	0.7466	0.7637

v23 Sanitized−Uncanny (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.9633	2.0168	2.0701	2.0903	2.0995
Std Dev	0.5137	0.5115	0.4097	0.4161	0.4159
Avg Mean	1.6025	1.6612	1.6097	1.6279	1.6312
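
The diminishing returns in the Avg Best row become easier to see when each step is expressed as reward gained per additional candidate. This is a small illustrative sketch using only the v23 numbers above; the per-candidate normalization is a choice made for this example, not a metric reported by the page.

```python
# N values and the v23 "Avg Best" row, copied from the table above.
ns = [5, 10, 25, 50, 100]
avg_best = [1.9633, 2.0168, 2.0701, 2.0903, 2.0995]

gains_per_candidate = []
for i in range(1, len(ns)):
    gain = avg_best[i] - avg_best[i - 1]    # absolute improvement for this step
    extra = ns[i] - ns[i - 1]               # additional candidates generated
    gains_per_candidate.append(gain / extra)
    print(f"N={ns[i - 1]:>3} -> {ns[i]:>3}: +{gain:.4f} total, "
          f"{gain / extra:.5f} per extra candidate")
```

Each step yields less reward per extra candidate than the one before it: going from N=50 to N=100 costs 50 more generations for a gain of about 0.009, while going from N=5 to N=10 costs 5 more for a gain of about 0.054.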

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	2.4235	2.4099	2.5149	2.5075	2.5149
1	French	2.4211	2.4300	2.4300	2.4361	2.4361
2	English	0.9111	0.9389	1.2605	1.2413	1.2605
3	German	2.3757	2.3748	2.3629	2.3808	2.3907
4	French	2.1337	2.1473	2.1297	2.1389	2.1473
5	French	1.9674	2.2905	2.2750	2.2811	2.3235
6	English	1.9078	2.0176	2.0176	2.1223	2.1223
7	German	1.2078	1.2391	1.4548	1.4548	1.4548
8	Spanish	2.0756	2.1304	2.0623	2.1304	2.1331
9	French	2.2095	2.1894	2.1934	2.2095	2.2118

v24 Sanitized−Uncanny (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.6995	1.7653	1.7926	1.8202	1.8531
Std Dev	0.4661	0.5016	0.4490	0.4356	0.4366
Avg Mean	1.3750	1.4230	1.3857	1.3972	1.4009

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	2.2043	2.2733	2.3453	2.3853	2.3853
1	French	1.9651	2.0335	2.0216	2.0457	2.0783
2	English	0.6984	0.6728	0.8523	0.9319	0.9319
3	German	2.2074	2.2074	2.2496	2.2390	2.2496
4	French	1.8575	1.8757	1.8739	1.8575	1.9090
5	French	1.6828	2.0877	1.8975	1.9718	2.0877
6	English	1.9177	2.0165	2.0698	2.0698	2.0698
7	German	1.1686	1.1817	1.3182	1.3182	1.3646
8	Spanish	1.5776	1.6667	1.6131	1.6667	1.7392
9	French	1.7159	1.6377	1.6843	1.7159	1.7159

Prompt #0 — English (Silicon Valley accent)

Language: English · Accent: Silicon Valley accent · Scored: 100/100

DramaBox Prompt

A young woman, possessing an extremely high fundamental frequency and bright, delicate harmonic texture, with a brisk, elevated momentum and a Silicon Valley accent; this is a pristine, high-quality studio voice recording with no background noise. She delivers the lines with a teasing lightness that occasionally borders on nervous energy, punctuated by small moments of genuine relief. (A brief, high-pitched Giggle escapes as she begins.) "Honestly, you think finding a solid Firestone review is that hard? Boggle, really. But look, that Lys thing actually worked." (She pauses, a subtle Contemplation washing over her features, then manages a slight, contained Chuckle.) "Just wait, I'll show you." She concludes with a soft, almost satisfied sigh, allowing the tension to dissipate.

Prompt #1 — French

Language: French · Scored: 100/100

DramaBox Prompt

High-pitched, delicately resonant, and possessing the slightly strained clarity of a young adult female soprano; the voice is bright and purely head-dominant, engineered for intimate projection.

Pauses briefly, gathering strength. "Malgré la profondeur de cette sombre forêt, je sens toujours cette confiance absolue en mon chemin, guidée par la lumière."
A slight, almost imperceptible hardening of tone. "Même au cœur de cette nuit insondable, ma boussole intérieure me montre la seule direction véritable."
She finishes, a note of unwavering certainty settling.

The pace remains glacially slow throughout the utterance. The delivery conveys immense, quiet self-assurance.
