Scenema Audio + Chatterbox VC — Best-of-N: Diminishing Returns

10 Path A prompts × 100 candidates — 29 ranking methods — Page 1/5

Methodology: Ranking Method Formulas & Text Prompts

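Every method ranks candidates by reward = (1 − WER) × max(score, 0), where WER is the word error rate and cos(·, ·) is cosine similarity. Only the score term differs between methods, as listed in the table below. In the keys, the suffixes _L and _S (and the l/s in clap_lq, clap_sq, clap_lp, clap_sp) denote the Large and Small VoiceCLAP models.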
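As a minimal sketch of the reward formula above, the snippet below scores three made-up candidates and picks the one with the highest reward. The candidate ids, WER values, and scores are illustrative assumptions, not data from this report.

```python
def reward(wer: float, score: float) -> float:
    """reward = (1 - WER) * max(score, 0), as stated in the methodology."""
    return (1.0 - wer) * max(score, 0.0)


# Hypothetical candidates: ids, WERs, and scores are invented for illustration.
candidates = [
    {"id": "a", "wer": 0.00, "score": 0.80},
    {"id": "b", "wer": 0.25, "score": 1.00},
    {"id": "c", "wer": 0.10, "score": -0.30},  # negative score is floored at 0
]

# Best-of-N selection: keep the candidate with the highest reward.
best = max(candidates, key=lambda c: reward(c["wer"], c["score"]))
print(best["id"], reward(best["wer"], best["score"]))
```

Candidate "b" has the higher raw score, but its transcription errors pull its reward below that of "a"; candidate "c" is zeroed out by the max(score, 0) floor.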

#	Key	Score Formula	Text Prompt(s)
0	standard	Content Enjoyment	—
1	clap_lq	cos(audio, quality_text)	"pleasant, realistic, genuine, authentic, natural performance, high-quality recording"
2	clap_sq	cos(audio, quality_text)	"pleasant, realistic, genuine, authentic, natural performance, high-quality recording"
3	clap_lp	cos(audio, prompt)	Original DramaBox prompt
4	clap_sp	cos(audio, prompt)	Original DramaBox prompt
5	v1_nat_L	cos(audio, nat)	"natural, spontaneous, lifelike speech with genuine emotion"
6	v2_auth_L	cos(audio, auth)	"authentic, emotionally truthful, deeply felt voice performance"
7	v3_pro_L	cos(audio, pro)	"professional studio recording, crystal clear high-fidelity audio"
8	v4_expr_L	cos(audio, expr)	"expressive, dynamic voice acting with rich emotional range"
9	v5_cine_L	cos(audio, cine)	"immersive cinematic narration, compelling storytelling"
10	v6_nat_S	cos(audio, nat)	"natural, spontaneous, lifelike speech with genuine emotion"
11	v7_auth_S	cos(audio, auth)	"authentic, emotionally truthful, deeply felt voice performance"
12	v8_pro_S	cos(audio, pro)	"professional studio recording, crystal clear high-fidelity audio"
13	v9_nr_L	cos(audio, nat) − cos(audio, rob)	+ "natural, spontaneous, lifelike speech with genuine emotion" / − "robotic, mechanical, monotonous, synthetic computer speech"
14	v10_ac_L	cos(audio, auth) − cos(audio, cheap)	+ "authentic, emotionally truthful, deeply felt voice performance" / − "cheap, amateurish, rehearsed, stilted text-to-speech output"
15	v11_pd_L	cos(audio, pro) − cos(audio, dist)	+ "professional studio recording, crystal clear high-fidelity audio" / − "distorted, noisy, muffled, low-quality poor recording"
16	v12_ef_L	cos(audio, expr) − cos(audio, flat)	+ "expressive, dynamic voice acting with rich emotional range" / − "flat, lifeless, boring, emotionally dead recitation"
17	v13_ff_L	cos(audio, full_pos) − cos(audio, full_neg)	+ "natural spontaneous genuine authentic high-quality voice performance" / − "robotic distorted monotonous rehearsed cheap artificial synthetic"
18	v14_wr_L	cos(audio, warm) − cos(audio, rob)	+ "warm, pleasant, engaging conversational human voice" / − "robotic, mechanical, monotonous, synthetic computer speech"
19	v15_nr_S	cos(audio, nat) − cos(audio, rob)	+ "natural, spontaneous, lifelike speech with genuine emotion" / − "robotic, mechanical, monotonous, synthetic computer speech"
20	v16_ac_S	cos(audio, auth) − cos(audio, cheap)	+ "authentic, emotionally truthful, deeply felt voice performance" / − "cheap, amateurish, rehearsed, stilted text-to-speech output"
21	v17_pd_S	cos(audio, pro) − cos(audio, dist)	+ "professional studio recording, crystal clear high-fidelity audio" / − "distorted, noisy, muffled, low-quality poor recording"
22	v18_ef_S	cos(audio, expr) − cos(audio, flat)	+ "expressive, dynamic voice acting with rich emotional range" / − "flat, lifeless, boring, emotionally dead recitation"
23	v19_ff_S	cos(audio, full_pos) − cos(audio, full_neg)	+ "natural spontaneous genuine authentic high-quality voice performance" / − "robotic distorted monotonous rehearsed cheap artificial synthetic"
24	v20_wr_S	cos(audio, warm) − cos(audio, rob)	+ "warm, pleasant, engaging conversational human voice" / − "robotic, mechanical, monotonous, synthetic computer speech"
25	v21_san_L	cos(audio, sanitized_prompt)	Quoted speech removed (Large)
26	v22_san_S	cos(audio, sanitized_prompt)	Quoted speech removed (Small)
27	v23_snr_L	cos(audio, sanitized) − cos(audio, neg_san)	Sanitized / − "robotic, distorted, uncanny" (Large)
28	v24_snr_S	cos(audio, sanitized) − cos(audio, neg_san)	Sanitized / − "robotic, distorted, uncanny" (Small)
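The contrastive rows (v9 through v20, v23, v24) subtract the similarity to a negative text from the similarity to a positive text. The sketch below shows that arithmetic on plain Python lists. The embedding vectors are toy values chosen for illustration; how VoiceCLAP actually produces audio and text embeddings is not shown in this report and is not modelled here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_score(audio, pos, neg):
    """score = cos(audio, pos) - cos(audio, neg); ranges over [-2, 2]."""
    return cosine(audio, pos) - cosine(audio, neg)


# Toy 2-D embeddings (assumptions, not real VoiceCLAP outputs).
audio_emb = [1.0, 0.0]
natural_emb = [1.0, 0.0]   # stands in for the "+" text prompt
robotic_emb = [0.0, 1.0]   # stands in for the "−" text prompt

score_good = contrastive_score(audio_emb, natural_emb, robotic_emb)
score_bad = contrastive_score(audio_emb, robotic_emb, natural_emb)
print(score_good, score_bad)
```

Because a single cosine is bounded by 1 while a difference of two cosines can reach 2, the contrastive methods can report rewards on a larger scale than the single-prompt methods, so raw rewards are not directly comparable across the two families.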

Cross-Method Diminishing Returns Comparison

Method	N=5	N=10	N=25	N=50	N=100	Gain N=5→100	Knee Point
Standard: (1−WER) × Content Enjoyment	3.9499	4.0503	4.1326	4.2322	4.3374	+0.3875	N=100
VoiceCLAP-Large × Quality Text	0.9211	0.9425	0.9569	0.9797	1.0083	+0.0872	N=50
VoiceCLAP-Small × Quality Text	0.7002	0.7131	0.7621	0.7904	0.8128	+0.1125	N=100
VoiceCLAP-Large × Prompt Match	1.3721	1.4216	1.4328	1.4692	1.4961	+0.1240	N=25
VoiceCLAP-Small × Prompt Match	0.7432	0.8048	0.8119	0.8366	0.8575	+0.1143	N=25
v1 Natural (Large)	0.9969	1.0212	1.0380	1.0663	1.0956	+0.0987	N=100
v2 Authentic (Large)	1.0189	1.0470	1.0676	1.0947	1.1260	+0.1071	N=100
v3 Professional (Large)	0.8686	0.8893	0.9063	0.9264	0.9556	+0.0870	N=50
v4 Expressive (Large)	1.0448	1.0798	1.1098	1.1203	1.1697	+0.1249	N=50
v5 Cinematic (Large)	0.9862	1.0159	1.0463	1.0487	1.0960	+0.1098	N=50
v6 Natural (Small)	0.7051	0.7428	0.7822	0.8010	0.8394	+0.1343	N=100
v7 Authentic (Small)	0.6484	0.6718	0.7154	0.7355	0.7643	+0.1159	N=100
v8 Professional (Small)	0.5983	0.6205	0.6340	0.6668	0.6859	+0.0876	N=100
v9 Natural−Robotic (Large)	1.7625	1.8124	1.8343	1.8778	1.9138	+0.1513	N=25
v10 Authentic−Cheap (Large)	1.8266	1.8804	1.9070	1.9459	1.9885	+0.1619	N=25
v11 Professional−Distorted (Large)	1.7341	1.7810	1.8064	1.8522	1.9012	+0.1671	N=25
v12 Expressive−Flat (Large)	1.7763	1.8355	1.8676	1.8895	1.9485	+0.1722	N=50
v13 FullPos−FullNeg (Large)	1.7583	1.8020	1.8264	1.8647	1.9145	+0.1561	N=25
v14 Warm−Robotic (Large)	1.6636	1.7092	1.7270	1.7695	1.8088	+0.1452	N=25
v15 Natural−Robotic (Small)	1.7629	1.8004	1.8645	1.8902	1.9578	+0.1949	N=50
v16 Authentic−Cheap (Small)	1.7445	1.7845	1.8444	1.8749	1.9338	+0.1893	N=50
v17 Professional−Distorted (Small)	1.5994	1.6570	1.6846	1.7406	1.7715	+0.1721	N=100
v18 Expressive−Flat (Small)	1.6577	1.7797	1.7808	1.8097	1.9099	+0.2522	N=25
v19 FullPos−FullNeg (Small)	1.5700	1.6128	1.6612	1.6989	1.7469	+0.1769	N=50
v20 Warm−Robotic (Small)	1.7081	1.7511	1.8092	1.8405	1.9087	+0.2006	N=50
v21 Sanitized Prompt (Large)	1.1298	1.1680	1.1868	1.2151	1.2431	+0.1134	N=100
v22 Sanitized Prompt (Small)	0.7170	0.7874	0.7996	0.8311	0.8531	+0.1360	N=100
v23 Sanitized−Uncanny (Large)	1.9141	1.9748	1.9965	2.0393	2.0803	+0.1662	N=25
v24 Sanitized−Uncanny (Small)	1.6078	1.7076	1.7431	1.7802	1.8432	+0.2353	N=50
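The report does not state how the N-candidate subsets behind each column were drawn. As one plausible approach, the sketch below estimates the expected best reward at each N by repeatedly sampling N candidates without replacement from a pool of 100. The pool is synthetic, and the function name, trial count, and seeds are assumptions made for illustration.

```python
import random
from statistics import mean


def expected_best(rewards, n, trials=1000, seed=0):
    """Monte Carlo estimate of the expected best reward among n candidates
    sampled without replacement from the pool."""
    rng = random.Random(seed)
    return mean(max(rng.sample(rewards, n)) for _ in range(trials))


# Synthetic pool of 100 rewards (not data from this report).
pool_rng = random.Random(42)
pool = [pool_rng.uniform(0.0, 5.0) for _ in range(100)]

curve = {n: expected_best(pool, n) for n in (5, 10, 25, 50, 100)}
for n, value in curve.items():
    print(f"N={n:<3d} expected best = {value:.4f}")
```

At N=100 every sample is the whole pool, so the estimate equals the pool maximum; smaller N can only match or fall below it.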

[Chart: Diminishing Returns, All Methods Overlaid]

Marginal Improvement per Additional Candidate

Method	N=5→10	N=10→25	N=25→50	N=50→100
Standard: (1−WER) × Content Enjoyment	0.02008/cand (2.5%)	0.00549/cand (2.0%)	0.00398/cand (2.4%)	0.00210/cand (2.5%)
VoiceCLAP-Large × Quality Text	0.00428/cand (2.3%)	0.00096/cand (1.5%)	0.00091/cand (2.4%)	0.00057/cand (2.9%)
VoiceCLAP-Small × Quality Text	0.00258/cand (1.8%)	0.00327/cand (6.9%)	0.00113/cand (3.7%)	0.00045/cand (2.8%)
VoiceCLAP-Large × Prompt Match	0.00990/cand (3.6%)	0.00075/cand (0.8%)	0.00145/cand (2.5%)	0.00054/cand (1.8%)
VoiceCLAP-Small × Prompt Match	0.01232/cand (8.3%)	0.00047/cand (0.9%)	0.00099/cand (3.0%)	0.00042/cand (2.5%)
v1 Natural (Large)	0.00488/cand (2.4%)	0.00111/cand (1.6%)	0.00113/cand (2.7%)	0.00059/cand (2.7%)
v2 Authentic (Large)	0.00560/cand (2.7%)	0.00138/cand (2.0%)	0.00108/cand (2.5%)	0.00063/cand (2.9%)
v3 Professional (Large)	0.00415/cand (2.4%)	0.00113/cand (1.9%)	0.00080/cand (2.2%)	0.00058/cand (3.2%)
v4 Expressive (Large)	0.00700/cand (3.4%)	0.00200/cand (2.8%)	0.00042/cand (0.9%)	0.00099/cand (4.4%)
v5 Cinematic (Large)	0.00595/cand (3.0%)	0.00203/cand (3.0%)	0.00010/cand (0.2%)	0.00095/cand (4.5%)
v6 Natural (Small)	0.00753/cand (5.3%)	0.00263/cand (5.3%)	0.00075/cand (2.4%)	0.00077/cand (4.8%)
v7 Authentic (Small)	0.00468/cand (3.6%)	0.00290/cand (6.5%)	0.00080/cand (2.8%)	0.00058/cand (3.9%)
v8 Professional (Small)	0.00443/cand (3.7%)	0.00090/cand (2.2%)	0.00131/cand (5.2%)	0.00038/cand (2.9%)
v9 Natural−Robotic (Large)	0.00998/cand (2.8%)	0.00146/cand (1.2%)	0.00174/cand (2.4%)	0.00072/cand (1.9%)
v10 Authentic−Cheap (Large)	0.01076/cand (2.9%)	0.00177/cand (1.4%)	0.00156/cand (2.0%)	0.00085/cand (2.2%)
v11 Professional−Distorted (Large)	0.00938/cand (2.7%)	0.00170/cand (1.4%)	0.00183/cand (2.5%)	0.00098/cand (2.6%)
v12 Expressive−Flat (Large)	0.01183/cand (3.3%)	0.00214/cand (1.7%)	0.00088/cand (1.2%)	0.00118/cand (3.1%)
v13 FullPos−FullNeg (Large)	0.00874/cand (2.5%)	0.00163/cand (1.4%)	0.00153/cand (2.1%)	0.00099/cand (2.7%)
v14 Warm−Robotic (Large)	0.00913/cand (2.7%)	0.00119/cand (1.0%)	0.00170/cand (2.5%)	0.00079/cand (2.2%)
v15 Natural−Robotic (Small)	0.00752/cand (2.1%)	0.00427/cand (3.6%)	0.00103/cand (1.4%)	0.00135/cand (3.6%)
v16 Authentic−Cheap (Small)	0.00802/cand (2.3%)	0.00399/cand (3.4%)	0.00122/cand (1.7%)	0.00118/cand (3.1%)
v17 Professional−Distorted (Small)	0.01152/cand (3.6%)	0.00184/cand (1.7%)	0.00224/cand (3.3%)	0.00062/cand (1.8%)
v18 Expressive−Flat (Small)	0.02440/cand (7.4%)	0.00008/cand (0.1%)	0.00115/cand (1.6%)	0.00200/cand (5.5%)
v19 FullPos−FullNeg (Small)	0.00856/cand (2.7%)	0.00322/cand (3.0%)	0.00151/cand (2.3%)	0.00096/cand (2.8%)
v20 Warm−Robotic (Small)	0.00859/cand (2.5%)	0.00388/cand (3.3%)	0.00125/cand (1.7%)	0.00136/cand (3.7%)
v21 Sanitized Prompt (Large)	0.00765/cand (3.4%)	0.00125/cand (1.6%)	0.00113/cand (2.4%)	0.00056/cand (2.3%)
v22 Sanitized Prompt (Small)	0.01407/cand (9.8%)	0.00081/cand (1.5%)	0.00126/cand (3.9%)	0.00044/cand (2.6%)
v23 Sanitized−Uncanny (Large)	0.01214/cand (3.2%)	0.00145/cand (1.1%)	0.00171/cand (2.1%)	0.00082/cand (2.0%)
v24 Sanitized−Uncanny (Small)	0.01996/cand (6.2%)	0.00236/cand (2.1%)	0.00148/cand (2.1%)	0.00126/cand (3.5%)
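The per-candidate figures in this table can be reproduced from the Avg Best values in the cross-method comparison. The sketch below does so for the Standard row; the percentages appear to be the segment gain relative to the best reward at the segment's starting N, which is an inference from the numbers rather than something the report states. The function name is hypothetical.

```python
NS = [5, 10, 25, 50, 100]
# Avg Best for "Standard: (1−WER) × Content Enjoyment", copied from the table.
STANDARD_BEST = [3.9499, 4.0503, 4.1326, 4.2322, 4.3374]


def marginal_gains(ns, best):
    """Return (n_from, n_to, gain per added candidate, percent gain) per segment."""
    rows = []
    for i in range(len(ns) - 1):
        gain = best[i + 1] - best[i]
        per_cand = gain / (ns[i + 1] - ns[i])   # reward gained per extra candidate
        pct = 100.0 * gain / best[i]            # gain relative to the starting value
        rows.append((ns[i], ns[i + 1], round(per_cand, 5), round(pct, 1)))
    return rows


rows = marginal_gains(NS, STANDARD_BEST)
total_gain = round(STANDARD_BEST[-1] - STANDARD_BEST[0], 4)

for n0, n1, per_cand, pct in rows:
    print(f"N={n0}→{n1}: {per_cand:.5f}/cand ({pct}%)")
print("Gain N=5→100:", total_gain)
```

The per-candidate gain falls roughly tenfold between the first and last segments even though each segment adds a similar percentage, because later segments spread that gain over many more candidates.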

Ablation: Pronunciation Suffix Effect

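Compares the N=10 candidates generated without the pronunciation suffix against the N=10 generated with it. Δ = with suffix − without suffix, so negative values mean the suffix lowered the reward.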

Ranking Method	No Suffix Mean	No Suffix Best	No Suffix Median	With Suffix Mean	With Suffix Best	With Suffix Median	Δ Mean	Δ Best
Standard: (1−WER) × Content Enjoyment	3.2689	3.9994	3.4583	3.1703	3.8596	3.2474	-0.0985	-0.1398
VoiceCLAP-Large × Quality Text	0.7617	0.9283	0.8028	0.6545	0.7825	0.6742	-0.1071	-0.1459
VoiceCLAP-Small × Quality Text	0.5434	0.6768	0.5619	0.6545	0.7825	0.6742	+0.1111	+0.1057
VoiceCLAP-Large × Prompt Match	1.1312	1.3906	1.2016	0.6545	0.7825	0.6742	-0.4767	-0.6081
VoiceCLAP-Small × Prompt Match	0.6277	0.7815	0.6585	0.6545	0.7825	0.6742	+0.0269	+0.0009
v1 Natural (Large)	0.8204	1.0064	0.8678	0.6545	0.7825	0.6742	-0.1659	-0.2239
v2 Authentic (Large)	0.8421	1.0298	0.8928	0.6545	0.7825	0.6742	-0.1876	-0.2473
v3 Professional (Large)	0.7172	0.8772	0.7569	0.6545	0.7825	0.6742	-0.0627	-0.0948
v4 Expressive (Large)	0.8646	1.0551	0.9225	0.6545	0.7825	0.6742	-0.2101	-0.2726
v5 Cinematic (Large)	0.8114	0.9955	0.8627	0.6545	0.7825	0.6742	-0.1569	-0.2130
v6 Natural (Small)	0.5832	0.7281	0.6083	0.6545	0.7825	0.6742	+0.0714	+0.0544
v7 Authentic (Small)	0.5299	0.6751	0.5559	0.6545	0.7825	0.6742	+0.1247	+0.1074
v8 Professional (Small)	0.4637	0.5698	0.4835	0.6545	0.7825	0.6742	+0.1908	+0.2126
v9 Natural−Robotic (Large)	1.4585	1.7750	1.5435	1.3091	1.5649	1.3485	-0.1494	-0.2100
v10 Authentic−Cheap (Large)	1.5104	1.8412	1.6025	1.3091	1.5649	1.3485	-0.2013	-0.2762
v11 Professional−Distorted (Large)	1.4294	1.7589	1.5130	1.3091	1.5649	1.3485	-0.1204	-0.1940
v12 Expressive−Flat (Large)	1.4713	1.7933	1.5638	1.3091	1.5649	1.3485	-0.1622	-0.2284
v13 FullPos−FullNeg (Large)	1.4548	1.7743	1.5384	1.3091	1.5649	1.3485	-0.1457	-0.2094
v14 Warm−Robotic (Large)	1.3816	1.6771	1.4605	1.3091	1.5649	1.3485	-0.0725	-0.1122
v15 Natural−Robotic (Small)	1.4460	1.7797	1.5273	1.3091	1.5649	1.3485	-0.1369	-0.2147
v16 Authentic−Cheap (Small)	1.4264	1.7975	1.5036	1.3091	1.5649	1.3485	-0.1174	-0.2326
v17 Professional−Distorted (Small)	1.3229	1.6166	1.3960	1.3091	1.5649	1.3485	-0.0138	-0.0517
v18 Expressive−Flat (Small)	1.3873	1.7335	1.4635	1.3091	1.5649	1.3485	-0.0782	-0.1686
v19 FullPos−FullNeg (Small)	1.2837	1.5825	1.3548	1.3091	1.5649	1.3485	+0.0254	-0.0175
v20 Warm−Robotic (Small)	1.3895	1.7197	1.4676	1.3091	1.5649	1.3485	-0.0804	-0.1548
v21 Sanitized Prompt (Large)	0.9259	1.1499	0.9847	0.6545	0.7825	0.6742	-0.2713	-0.3674
v22 Sanitized Prompt (Small)	0.6045	0.7650	0.6294	0.6545	0.7825	0.6742	+0.0500	+0.0175
v23 Sanitized−Uncanny (Large)	1.5768	1.9403	1.6740	1.3091	1.5649	1.3485	-0.2677	-0.3754
v24 Sanitized−Uncanny (Small)	1.3486	1.6856	1.4263	1.3091	1.5649	1.3485	-0.0395	-0.1207

Per-Prompt Ablation: Standard Reward (N=10)

#	Lang	No Suffix Mean	No Suffix Best	With Suffix Mean	With Suffix Best	Δ Mean	Δ Best
0	English	3.4860	4.4661	3.3486	3.5739	-0.1374	-0.8922
1	French	4.0988	4.9332	4.6067	5.0536	+0.5079	+0.1204
2	English	1.1279	1.3485	1.1376	1.3552	+0.0097	+0.0067
3	German	4.9285	5.0369	4.9212	4.9992	-0.0073	-0.0377
4	French	4.1896	4.5076	3.7950	4.4085	-0.3946	-0.0991
5	French	2.4379	4.2481	1.8008	4.2462	-0.6371	-0.0019
6	English	3.1621	3.9004	3.0621	3.7563	-0.1000	-0.1441
7	German	2.2189	2.9709	2.2829	2.5923	+0.0640	-0.3786
8	Spanish	3.7877	4.5971	3.2769	4.6148	-0.5108	+0.0177
9	French	3.2513	3.9850	3.4716	3.9956	+0.2203	+0.0106
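The deltas in this table are the with-suffix value minus the no-suffix value, and averaging them across the ten prompts gives the Standard row of the method-level ablation table. The sketch below recomputes both from the rows above; the variable names are illustrative.

```python
from statistics import mean

# (prompt, lang, no_suffix_mean, no_suffix_best, with_suffix_mean, with_suffix_best)
ROWS = [
    (0, "English", 3.4860, 4.4661, 3.3486, 3.5739),
    (1, "French",  4.0988, 4.9332, 4.6067, 5.0536),
    (2, "English", 1.1279, 1.3485, 1.1376, 1.3552),
    (3, "German",  4.9285, 5.0369, 4.9212, 4.9992),
    (4, "French",  4.1896, 4.5076, 3.7950, 4.4085),
    (5, "French",  2.4379, 4.2481, 1.8008, 4.2462),
    (6, "English", 3.1621, 3.9004, 3.0621, 3.7563),
    (7, "German",  2.2189, 2.9709, 2.2829, 2.5923),
    (8, "Spanish", 3.7877, 4.5971, 3.2769, 4.6148),
    (9, "French",  3.2513, 3.9850, 3.4716, 3.9956),
]

# Per-prompt deltas: with suffix minus without suffix.
delta_mean = [round(w_mean - n_mean, 4) for _, _, n_mean, _, w_mean, _ in ROWS]
delta_best = [round(w_best - n_best, 4) for _, _, _, n_best, _, w_best in ROWS]

# Averages across prompts; these land on the Standard row of the
# method-level ablation table to within rounding.
avg_delta_mean = mean(delta_mean)
avg_delta_best = mean(delta_best)
print(f"avg Δ Mean = {avg_delta_mean:+.4f}, avg Δ Best = {avg_delta_best:+.4f}")
```

The averages hide large per-prompt swings: prompt 1 gains about half a point in mean reward with the suffix, while prompt 5 loses more than that.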

Standard: (1−WER) × Content Enjoyment — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	3.9499	4.0503	4.1326	4.2322	4.3374
Std Dev	1.0361	1.1191	1.1542	0.9862	0.8503
Avg Mean	3.3231	3.3387	3.2637	3.2617	3.2778

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	4.3500	4.4322	4.7383	4.7751	4.8316
1	French	4.8991	4.7082	4.9703	5.0533	5.0533
2	English	1.9858	1.3609	1.3513	2.0546	2.6745
3	German	5.0848	5.0334	5.0562	5.0881	5.0943
4	French	4.4485	4.5793	4.7352	4.6121	4.7352
5	French	3.5508	4.8004	4.7510	4.7510	4.8004
6	English	3.6361	3.8367	3.8307	3.8307	3.9862
7	German	2.4792	2.9760	2.9955	3.0226	3.0226
8	Spanish	4.7640	4.7223	4.4861	4.7223	4.7640
9	French	4.3008	4.0539	4.4119	4.4119	4.4119
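The Avg Best row in the summary is the mean of the per-prompt best rewards in this table, and the Std Dev row is consistent with a sample standard deviation (dividing by n − 1) across the ten prompts. That second point is inferred from the N=5 column, not stated in the report. A short check:

```python
from statistics import mean, stdev

# Per-prompt best reward for the Standard method, copied from the table above.
BEST_N5 = [4.3500, 4.8991, 1.9858, 5.0848, 4.4485,
           3.5508, 3.6361, 2.4792, 4.7640, 4.3008]
BEST_N100 = [4.8316, 5.0533, 2.6745, 5.0943, 4.7352,
             4.8004, 3.9862, 3.0226, 4.7640, 4.4119]

avg_n5 = mean(BEST_N5)        # compare with Avg Best at N=5
std_n5 = stdev(BEST_N5)       # sample standard deviation (n - 1 denominator)
avg_n100 = mean(BEST_N100)    # compare with Avg Best at N=100

print(f"N=5:   avg best = {avg_n5:.4f}, std dev = {std_n5:.4f}")
print(f"N=100: avg best = {avg_n100:.4f}")
```

With only ten prompts and a standard deviation near 1.0 at N=5, the spread between prompts is several times larger than the +0.3875 gain from N=5 to N=100, so the averaged curve is dominated by a few hard prompts such as prompt 2.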

VoiceCLAP-Large × Quality Text — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.9211	0.9425	0.9569	0.9797	1.0083
Std Dev	0.2484	0.2534	0.2678	0.2269	0.1915
Avg Mean	0.7732	0.7865	0.7630	0.7635	0.7672

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.0773	1.0829	1.1419	1.1794	1.1794
1	French	1.1405	1.0858	1.1414	1.1502	1.1502
2	English	0.5424	0.3723	0.3766	0.5424	0.7031
3	German	1.1659	1.1632	1.1786	1.1740	1.1786
4	French	1.1891	1.2254	1.2540	1.2254	1.2540
5	French	0.7087	0.9286	0.8981	0.8981	0.9286
6	English	0.8410	0.9019	0.9239	0.9076	0.9239
7	German	0.5704	0.6978	0.7206	0.7206	0.7654
8	Spanish	0.8780	0.8871	0.8216	0.8871	0.8871
9	French	1.0975	1.0797	1.1122	1.1122	1.1122

VoiceCLAP-Small × Quality Text — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.7002	0.7131	0.7621	0.7904	0.8128
Std Dev	0.2619	0.2686	0.2662	0.2568	0.2409
Avg Mean	0.5708	0.5822	0.5655	0.5663	0.5663

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	0.7558	0.7150	0.7519	0.8193	0.8193
1	French	0.8168	0.8367	0.9473	0.9586	0.9910
2	English	0.3473	0.2701	0.2792	0.3510	0.4441
3	German	0.9741	1.0200	1.0530	1.0835	1.0835
4	French	0.9963	1.1118	1.0419	1.1118	1.1118
5	French	0.4261	0.4036	0.5839	0.5839	0.5839
6	English	0.6510	0.6714	0.7872	0.7143	0.7872
7	German	0.3838	0.5485	0.5099	0.5820	0.5820
8	Spanish	0.5974	0.6164	0.6089	0.6460	0.6618
9	French	1.0538	0.9380	1.0582	1.0538	1.0631

VoiceCLAP-Large × Prompt Match — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.3721	1.4216	1.4328	1.4692	1.4961
Std Dev	0.4051	0.4346	0.4325	0.3975	0.3458
Avg Mean	1.1526	1.1629	1.1362	1.1359	1.1405

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.5854	1.6481	1.7236	1.7487	1.7487
1	French	1.8114	1.7128	1.8114	1.8295	1.8295
2	English	0.6009	0.4222	0.4322	0.6009	0.8299
3	German	1.8505	1.8610	1.8732	1.8732	1.8732
4	French	1.5178	1.5404	1.5178	1.5404	1.5404
5	French	1.2678	1.6804	1.6614	1.6614	1.6804
6	English	1.2336	1.3489	1.3319	1.3631	1.3631
7	German	0.8176	0.9321	1.0047	1.0056	1.0056
8	Spanish	1.6013	1.6346	1.5167	1.6346	1.6346
9	French	1.4345	1.4352	1.4553	1.4345	1.4553

VoiceCLAP-Small × Prompt Match — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.7432	0.8048	0.8119	0.8366	0.8575
Std Dev	0.2636	0.2904	0.2888	0.2739	0.2613
Avg Mean	0.6323	0.6431	0.6310	0.6272	0.6293

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.0831	1.1955	1.2086	1.1961	1.2086
1	French	0.7432	0.7705	0.7957	0.7961	0.8134
2	English	0.3105	0.2363	0.2268	0.3107	0.3718
3	German	1.0727	1.0810	1.1113	1.1630	1.1630
4	French	0.7075	0.7617	0.7606	0.7637	0.7637
5	French	0.5903	0.9346	0.7457	0.8766	0.9346
6	English	1.0183	1.1058	1.1293	1.1214	1.1293
7	German	0.4189	0.5340	0.6233	0.6045	0.6233
8	Spanish	0.6782	0.6574	0.6805	0.6972	0.7299
9	French	0.8091	0.7712	0.8369	0.8369	0.8369

v1 Natural (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.9969	1.0212	1.0380	1.0663	1.0956
Std Dev	0.2571	0.2545	0.2695	0.2241	0.1802
Avg Mean	0.8320	0.8452	0.8195	0.8203	0.8246

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.1857	1.1952	1.2568	1.2876	1.2876
1	French	1.2315	1.1750	1.2315	1.2445	1.2445
2	English	0.6025	0.4230	0.4229	0.6025	0.8102
3	German	1.1574	1.1430	1.1649	1.1572	1.1726
4	French	1.2454	1.2471	1.2950	1.2807	1.2950
5	French	0.8118	1.0528	1.0213	1.0213	1.0528
6	English	0.9111	0.9781	0.9676	0.9781	0.9784
7	German	0.6136	0.7872	0.8525	0.8525	0.8758
8	Spanish	0.9475	0.9763	0.9047	0.9763	0.9763
9	French	1.2620	1.2348	1.2625	1.2620	1.2625

v2 Authentic (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.0189	1.0470	1.0676	1.0947	1.1260
Std Dev	0.2615	0.2638	0.2730	0.2190	0.1773
Avg Mean	0.8521	0.8651	0.8396	0.8412	0.8456

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.0628	1.0670	1.1388	1.1636	1.1636
1	French	1.3066	1.2607	1.3066	1.3200	1.3200
2	English	0.5752	0.4125	0.4117	0.6001	0.8088
3	German	1.2259	1.2139	1.2329	1.2223	1.2363
4	French	1.2335	1.2342	1.2831	1.2528	1.2831
5	French	0.8896	1.1530	1.1364	1.1364	1.1530
6	English	0.8826	0.9831	0.9943	0.9831	0.9943
7	German	0.6352	0.7913	0.8354	0.8683	0.8683
8	Spanish	1.1407	1.1630	1.0670	1.1630	1.1630
9	French	1.2372	1.1908	1.2696	1.2372	1.2696

v3 Professional (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.8686	0.8893	0.9063	0.9264	0.9556
Std Dev	0.2310	0.2348	0.2530	0.2122	0.1816
Avg Mean	0.7296	0.7414	0.7192	0.7193	0.7230

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.0141	1.0347	1.0811	1.1173	1.1173
1	French	1.0660	1.0103	1.0754	1.0801	1.0869
2	English	0.4986	0.3644	0.3644	0.5153	0.6674
3	German	1.1199	1.1236	1.1369	1.1286	1.1369
4	French	1.1103	1.1361	1.1688	1.1361	1.1688
5	French	0.6536	0.8675	0.8342	0.8342	0.8675
6	English	0.8106	0.8538	0.9262	0.8932	0.9262
7	German	0.5695	0.6619	0.6619	0.6801	0.7060
8	Spanish	0.8302	0.8381	0.7731	0.8381	0.8381
9	French	1.0133	1.0030	1.0409	1.0409	1.0409

v4 Expressive (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.0448	1.0798	1.1098	1.1203	1.1697
Std Dev	0.2537	0.2871	0.2920	0.2333	0.1949
Avg Mean	0.8741	0.8862	0.8619	0.8627	0.8664

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.1818	1.2041	1.2898	1.2948	1.2948
1	French	1.3248	1.2933	1.3248	1.3150	1.3482
2	English	0.6020	0.4137	0.4137	0.6054	0.8385
3	German	1.1259	1.1303	1.1303	1.1303	1.1303
4	French	1.2500	1.2681	1.3472	1.2977	1.3472
5	French	0.9723	1.2897	1.2420	1.2420	1.2897
6	English	1.1058	1.1917	1.2326	1.1752	1.2326
7	German	0.5900	0.7144	0.8013	0.8013	0.8126
8	Spanish	1.1330	1.1583	1.0709	1.1583	1.1583
9	French	1.1626	1.1348	1.2452	1.1826	1.2452

v5 Cinematic (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.9862	1.0159	1.0463	1.0487	1.0960
Std Dev	0.2490	0.2770	0.2902	0.2319	0.2100
Avg Mean	0.8194	0.8336	0.8095	0.8105	0.8141

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.0256	1.0255	1.1112	1.1020	1.1112
1	French	1.2755	1.1957	1.2755	1.2539	1.2755
2	English	0.5470	0.3853	0.3853	0.5470	0.7394
3	German	1.1468	1.1458	1.1464	1.1584	1.1584
4	French	1.2074	1.2290	1.3202	1.2269	1.3202
5	French	0.9483	1.2607	1.2090	1.2090	1.2607
6	English	0.9812	1.0776	1.1659	1.1002	1.1659
7	German	0.5633	0.6707	0.7077	0.7216	0.7216
8	Spanish	1.0284	1.0300	0.9650	1.0300	1.0300
9	French	1.1382	1.1388	1.1769	1.1382	1.1769

v6 Natural (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.7051	0.7428	0.7822	0.8010	0.8394
Std Dev	0.2490	0.2466	0.2551	0.2276	0.2172
Avg Mean	0.5952	0.6060	0.5918	0.5900	0.5929

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	0.4792	0.5143	0.5181	0.5728	0.5728
1	French	1.0104	1.0417	1.1146	1.1009	1.1587
2	English	0.3315	0.2907	0.2863	0.3804	0.4947
3	German	0.9029	0.9546	0.9873	0.9872	1.0199
4	French	0.8190	0.8630	0.8701	0.9605	0.9605
5	French	0.5005	0.5527	0.7225	0.7225	0.7225
6	English	0.6633	0.7047	0.8337	0.7120	0.8337
7	German	0.4840	0.6682	0.6294	0.6974	0.6974
8	Spanish	0.8113	0.7829	0.7950	0.8275	0.8690
9	French	1.0490	1.0547	1.0647	1.0490	1.0647

v7 Authentic (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.6484	0.6718	0.7154	0.7355	0.7643
Std Dev	0.2243	0.2197	0.2322	0.2159	0.2032
Avg Mean	0.5389	0.5498	0.5341	0.5322	0.5360

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	0.4540	0.4522	0.4923	0.5184	0.5184
1	French	0.8982	0.9189	1.0325	1.0340	1.0933
2	English	0.3151	0.2946	0.2949	0.3762	0.4996
3	German	0.8777	0.9326	0.9656	0.9917	0.9917
4	French	0.7368	0.8217	0.8058	0.8405	0.8548
5	French	0.5138	0.5512	0.6930	0.6930	0.6930
6	English	0.5382	0.5364	0.6964	0.6203	0.6964
7	German	0.4153	0.5608	0.4886	0.5608	0.5608
8	Spanish	0.8523	0.8358	0.8042	0.8366	0.8523
9	French	0.8831	0.8142	0.8803	0.8831	0.8831

v8 Professional (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.5983	0.6205	0.6340	0.6668	0.6859
Std Dev	0.2116	0.2117	0.2131	0.2144	0.1982
Avg Mean	0.4916	0.4993	0.4819	0.4839	0.4848

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	0.8533	0.8135	0.8216	0.9215	0.9215
1	French	0.6250	0.6815	0.7290	0.7672	0.7672
2	English	0.3390	0.2751	0.2717	0.3544	0.4492
3	German	0.7857	0.8350	0.8485	0.8485	0.8485
4	French	0.8347	0.9167	0.8709	0.9167	0.9167
5	French	0.3013	0.3754	0.3971	0.3971	0.3971
6	English	0.5794	0.5949	0.6289	0.6330	0.6330
7	German	0.3944	0.4776	0.4529	0.4767	0.4776
8	Spanish	0.4833	0.4990	0.5164	0.5436	0.6166
9	French	0.7873	0.7363	0.8028	0.8093	0.8317

v9 Natural−Robotic (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.7625	1.8124	1.8343	1.8778	1.9138
Std Dev	0.4270	0.4701	0.4782	0.3921	0.3250
Avg Mean	1.4787	1.4975	1.4580	1.4581	1.4641

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.9710	2.0491	2.1476	2.1351	2.1476
1	French	2.1428	2.0394	2.1498	2.1792	2.1792
2	English	0.9406	0.6121	0.6323	0.9406	1.2229
3	German	2.1491	2.1612	2.1636	2.1636	2.1636
4	French	2.0645	2.0823	2.1428	2.1103	2.1428
5	French	1.5301	1.9780	1.9490	1.9490	1.9780
6	English	1.6344	1.7506	1.7401	1.7506	1.7506
7	German	1.1934	1.4529	1.5171	1.5171	1.5171
8	Spanish	1.8767	1.9098	1.7736	1.9098	1.9098
9	French	2.1225	2.0889	2.1268	2.1225	2.1268

v10 Authentic−Cheap (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.8266	1.8804	1.9070	1.9459	1.9885
Std Dev	0.4869	0.5073	0.5149	0.4312	0.3630
Avg Mean	1.5242	1.5477	1.5052	1.5075	1.5142

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.7902	1.8367	1.9457	1.9314	1.9457
1	French	2.3703	2.2671	2.3703	2.3819	2.3819
2	English	0.9469	0.6536	0.6631	0.9666	1.2856
3	German	2.2040	2.2057	2.2057	2.2149	2.2149
4	French	2.2093	2.2185	2.2718	2.2225	2.2718
5	French	1.6745	2.1506	2.1444	2.1444	2.1506
6	English	1.5897	1.7623	1.7631	1.7623	1.7631
7	German	1.1506	1.4217	1.4693	1.5019	1.5019
8	Spanish	2.1024	2.1049	1.9721	2.1049	2.1049
9	French	2.2286	2.1832	2.2646	2.2286	2.2646

v11 Professional−Distorted (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.7341	1.7810	1.8064	1.8522	1.9012
Std Dev	0.4503	0.4617	0.4840	0.3898	0.3347
Avg Mean	1.4446	1.4704	1.4253	1.4278	1.4349

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.9732	2.0076	2.0894	2.1378	2.1378
1	French	2.1766	2.0661	2.1766	2.1824	2.1880
2	English	0.9795	0.7086	0.7086	1.0278	1.3325
3	German	2.0586	2.0397	2.0602	2.0567	2.0854
4	French	2.1923	2.2329	2.3106	2.2317	2.3106
5	French	1.4664	1.9309	1.8909	1.8909	1.9309
6	English	1.6002	1.7258	1.7735	1.7493	1.7735
7	German	1.0369	1.2607	1.2942	1.3620	1.3704
8	Spanish	1.7946	1.7953	1.6723	1.7953	1.7953
9	French	2.0625	2.0422	2.0878	2.0878	2.0878

v12 Expressive−Flat (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.7763	1.8355	1.8676	1.8895	1.9485
Std Dev	0.4260	0.5065	0.5070	0.4145	0.3568
Avg Mean	1.4899	1.5074	1.4696	1.4693	1.4744

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	2.0280	2.1126	2.2107	2.1920	2.2107
1	French	2.2424	2.1651	2.2424	2.2355	2.2673
2	English	0.9147	0.5840	0.6014	0.9147	1.2084
3	German	2.0131	2.0551	2.0551	2.0551	2.0551
4	French	2.0520	2.0891	2.1766	2.0949	2.1766
5	French	1.6742	2.1944	2.1445	2.1445	2.1944
6	English	1.8062	1.9320	1.9528	1.9274	1.9528
7	German	1.1379	1.3116	1.4041	1.4041	1.4041
8	Spanish	1.9660	1.9985	1.8712	1.9985	1.9985
9	French	1.9285	1.9121	2.0168	1.9285	2.0168

v13 FullPos−FullNeg (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.7583	1.8020	1.8264	1.8647	1.9145
Std Dev	0.4620	0.4861	0.5036	0.4216	0.3558
Avg Mean	1.4727	1.4948	1.4533	1.4549	1.4610

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.9608	2.0103	2.0847	2.1336	2.1336
1	French	2.2944	2.1780	2.2949	2.3020	2.3020
2	English	0.9360	0.6393	0.6423	0.9472	1.2411
3	German	2.0983	2.1017	2.1064	2.1075	2.1075
4	French	2.1395	2.1546	2.2458	2.1727	2.2458
5	French	1.5380	1.9911	1.9511	1.9511	1.9911
6	English	1.5882	1.7128	1.7285	1.7128	1.7504
7	German	1.0933	1.3085	1.3728	1.3728	1.4117
8	Spanish	1.8120	1.8251	1.7040	1.8251	1.8280
9	French	2.1227	2.0986	2.1334	2.1227	2.1334

v14 Warm−Robotic (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.6636	1.7092	1.7270	1.7695	1.8088
Std Dev	0.4263	0.4658	0.4792	0.3978	0.3466
Avg Mean	1.4010	1.4205	1.3838	1.3839	1.3890

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.8779	1.9566	2.0789	2.0521	2.0789
1	French	2.0346	1.9460	2.0595	2.0815	2.0815
2	English	0.8646	0.5522	0.5703	0.8646	1.0986
3	German	2.1651	2.1746	2.1773	2.1773	2.2009
4	French	1.9420	1.9618	2.0369	1.9870	2.0369
5	French	1.4221	1.8388	1.8078	1.8078	1.8388
6	English	1.5420	1.6624	1.6211	1.6624	1.6624
7	German	1.0902	1.3163	1.3516	1.3516	1.3788
8	Spanish	1.7649	1.7784	1.6591	1.7784	1.7784
9	French	1.9325	1.9052	1.9078	1.9325	1.9325

v15 Natural−Robotic (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.7629	1.8004	1.8645	1.8902	1.9578
Std Dev	0.5017	0.4960	0.5355	0.4633	0.3993
Avg Mean	1.4708	1.4898	1.4513	1.4492	1.4561

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.6488	1.7065	1.7676	1.7916	1.8014
1	French	2.3153	2.2224	2.4053	2.3848	2.4490
2	English	0.8467	0.6330	0.6096	0.8785	1.1780
3	German	2.3092	2.3154	2.3200	2.3361	2.3361
4	French	2.0907	2.1790	2.2116	2.2459	2.2919
5	French	1.5033	1.7800	1.9643	1.9643	1.9643
6	English	1.6529	1.7269	1.9498	1.6928	1.9498
7	German	1.1440	1.4311	1.3928	1.4866	1.4866
8	Spanish	1.8794	1.8827	1.7967	1.8827	1.8827
9	French	2.2383	2.1274	2.2274	2.2383	2.2383

v16 Authentic−Cheap (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.7445	1.7845	1.8444	1.8749	1.9338
Std Dev	0.5379	0.5445	0.5657	0.5210	0.4555
Avg Mean	1.4425	1.4595	1.4211	1.4186	1.4266

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.3809	1.4605	1.4941	1.5194	1.5194
1	French	2.4449	2.3596	2.4919	2.4988	2.5757
2	English	0.8119	0.6120	0.6047	0.8305	1.1598
3	German	2.4088	2.4501	2.4501	2.4869	2.4869
4	French	1.8841	2.0720	2.0840	2.0720	2.0906
5	French	1.6578	1.8901	2.0938	2.0938	2.0938
6	English	1.5176	1.5212	1.7346	1.5726	1.7346
7	German	1.1548	1.4413	1.3831	1.4581	1.4581
8	Spanish	2.1296	2.1277	2.0185	2.1277	2.1296
9	French	2.0541	1.9108	2.0891	2.0891	2.0891

v17 Professional−Distorted (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.5994	1.6570	1.6846	1.7406	1.7715
Std Dev	0.4479	0.4860	0.5001	0.4269	0.3855
Avg Mean	1.3421	1.3697	1.3318	1.3323	1.3378

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.8642	1.9530	2.0432	2.0555	2.0555
1	French	1.9058	1.9187	1.9901	2.0964	2.0964
2	English	0.8169	0.5120	0.5120	0.8169	1.0060
3	German	2.0186	2.0532	2.0532	2.0532	2.0625
4	French	2.0590	2.1292	2.1158	2.1036	2.1292
5	French	1.2630	1.7543	1.6779	1.6779	1.7543
6	English	1.5062	1.6104	1.6834	1.6756	1.6834
7	German	0.9686	1.1675	1.2249	1.2390	1.2390
8	Spanish	1.6446	1.6486	1.5433	1.6856	1.6856
9	French	1.9470	1.8230	2.0027	2.0027	2.0027

v18 Expressive−Flat (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.6577	1.7797	1.7808	1.8097	1.9099
Std Dev	0.4096	0.5021	0.4790	0.4030	0.3495
Avg Mean	1.3920	1.4088	1.3715	1.3702	1.3765

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.8925	2.0090	2.0555	2.0892	2.0892
1	French	1.9245	1.9676	1.9677	2.1396	2.1396
2	English	0.8228	0.5462	0.5565	0.8322	1.1512
3	German	2.0265	1.9755	2.0265	1.9834	2.0265
4	French	1.8850	2.0415	2.1878	2.0415	2.1878
5	French	1.4913	2.2316	1.8938	1.9778	2.2316
6	English	1.6425	1.8139	1.7999	1.7818	1.8370
7	German	1.0728	1.2596	1.3945	1.3945	1.4380
8	Spanish	1.9426	1.9672	1.9413	1.9803	2.0134
9	French	1.8763	1.9847	1.9847	1.8763	1.9847

v19 FullPos−FullNeg (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.5700	1.6128	1.6612	1.6989	1.7469
Std Dev	0.3859	0.4233	0.4468	0.3841	0.3160
Avg Mean	1.3126	1.3291	1.2899	1.2885	1.2964

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.6920	1.7246	1.8583	1.8986	1.8986
1	French	1.9208	1.8936	2.0274	2.0931	2.0931
2	English	0.8238	0.5746	0.5592	0.8238	1.1143
3	German	1.9430	2.0129	1.9983	1.9974	2.0437
4	French	1.8329	1.9385	1.9891	1.9638	1.9891
5	French	1.3100	1.6897	1.6879	1.6879	1.6897
6	English	1.4709	1.5158	1.6477	1.5318	1.6477
7	German	1.0837	1.2656	1.2837	1.3249	1.3249
8	Spanish	1.7527	1.7707	1.6895	1.7974	1.7974
9	French	1.8702	1.7422	1.8707	1.8707	1.8707

v20 Warm−Robotic (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.7081	1.7511	1.8092	1.8405	1.9087
Std Dev	0.5026	0.5152	0.5468	0.4675	0.4241
Avg Mean	1.4168	1.4319	1.3929	1.3934	1.3978

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.9556	2.0282	2.0467	2.0415	2.1879
1	French	1.9558	1.9259	2.0562	2.0916	2.1705
2	English	0.8164	0.5619	0.5469	0.8164	1.0715
3	German	2.2923	2.2210	2.2923	2.2858	2.2971
4	French	2.1272	2.3172	2.3547	2.3162	2.3547
5	French	1.5113	1.7863	1.8355	1.8355	1.8355
6	English	1.5639	1.6882	1.8540	1.7028	1.8540
7	German	0.9827	1.2525	1.2931	1.3667	1.3667
8	Spanish	1.6662	1.7121	1.6222	1.7393	1.7393
9	French	2.2096	2.0173	2.1907	2.2096	2.2096
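
The sketch below illustrates how a reward for a method such as Warm−Robotic might be computed, assuming it follows the same positive-minus-negative pattern as the contrastive methods in the methodology table, combined with reward = (1 − WER) × max(score, 0). The function names, the toy 2-D embeddings, and the `scale` parameter are all hypothetical: the tabulated rewards exceed 1, so the score is presumably on a scale other than a raw cosine, but that scaling is not stated on this page.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def contrastive_reward(audio_emb, pos_emb, neg_emb, wer, scale=1.0):
    """reward = (1 - WER) * max(score, 0), with a positive-minus-negative score.

    `scale` is a hypothetical knob; the page does not say how scores are scaled.
    """
    score = scale * (cosine(audio_emb, pos_emb) - cosine(audio_emb, neg_emb))
    return (1.0 - wer) * max(score, 0.0)


# Toy embeddings (not real model outputs): the audio matches the positive text.
r_good = contrastive_reward([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], wer=0.25)
# Audio closer to the negative text: the score is clipped to zero.
r_bad = contrastive_reward([0.0, 1.0], [1.0, 0.0], [0.0, 1.0], wer=0.0)
print(r_good, r_bad)
```

Clipping at zero means a candidate that sits closer to the negative text earns no reward regardless of its WER, while the (1 − WER) factor penalizes candidates that drift from the target transcript.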

v21 Sanitized Prompt (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.1298	1.1680	1.1868	1.2151	1.2431
Std Dev	0.3053	0.3328	0.3357	0.2999	0.2526
Avg Mean	0.9421	0.9517	0.9285	0.9285	0.9324

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.4114	1.4243	1.5158	1.5312	1.5312
1	French	1.4362	1.3614	1.4413	1.4571	1.4571
2	English	0.5642	0.4081	0.4143	0.5642	0.7911
3	German	1.3646	1.3644	1.3609	1.3722	1.4092
4	French	1.2187	1.2284	1.2187	1.2284	1.2284
5	French	1.0601	1.4074	1.4024	1.4024	1.4074
6	English	1.1419	1.2705	1.2699	1.2712	1.2712
7	German	0.6296	0.7288	0.8082	0.8175	0.8175
8	Spanish	1.1832	1.2148	1.1333	1.2148	1.2148
9	French	1.2876	1.2721	1.3036	1.2924	1.3036

v22 Sanitized Prompt (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.7170	0.7874	0.7996	0.8311	0.8531
Std Dev	0.2538	0.2969	0.2897	0.2738	0.2618
Avg Mean	0.6087	0.6219	0.6109	0.6070	0.6088

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.1052	1.2145	1.2221	1.2389	1.2409
1	French	0.7607	0.7726	0.7963	0.8128	0.8128
2	English	0.3121	0.2286	0.2201	0.3121	0.3787
3	German	0.9288	0.9550	1.0054	1.0163	1.0163
4	French	0.6158	0.6584	0.6674	0.6933	0.6933
5	French	0.5652	1.0005	0.7901	0.9109	1.0005
6	English	1.0324	1.1345	1.1808	1.1834	1.1834
7	German	0.4282	0.5274	0.6373	0.6185	0.6373
8	Spanish	0.6784	0.6631	0.7082	0.7567	0.7997
9	French	0.7437	0.7195	0.7679	0.7679	0.7679

v23 Sanitized−Uncanny (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.9141	1.9748	1.9965	2.0393	2.0803
Std Dev	0.5171	0.5552	0.5585	0.4952	0.4196
Avg Mean	1.6026	1.6198	1.5793	1.5795	1.5856

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	2.3112	2.3760	2.4739	2.4916	2.4916
1	French	2.4160	2.2890	2.4269	2.4384	2.4384
2	English	0.9139	0.6370	0.6574	0.9139	1.2489
3	German	2.3492	2.3531	2.3492	2.3566	2.3954
4	French	2.1244	2.1459	2.1244	2.1459	2.1459
5	French	1.7536	2.2921	2.2855	2.2855	2.2921
6	English	1.8642	2.0281	2.0294	2.0294	2.0294
7	German	1.1252	1.3491	1.4267	1.4288	1.4288
8	Spanish	2.0750	2.0953	1.9713	2.0953	2.1119
9	French	2.2079	2.1821	2.2206	2.2079	2.2206

v24 Sanitized−Uncanny (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.6078	1.7076	1.7431	1.7802	1.8432
Std Dev	0.4618	0.5393	0.5341	0.4798	0.4386
Avg Mean	1.3624	1.3770	1.3471	1.3442	1.3516

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	2.1217	2.2303	2.3251	2.3172	2.3251
1	French	1.9161	1.8960	1.9896	2.0428	2.0820
2	English	0.6843	0.4682	0.4413	0.6843	0.8463
3	German	2.1527	2.1854	2.1866	2.2201	2.2201
4	French	1.6717	1.7874	1.8827	1.7990	1.8827
5	French	1.3620	2.0900	1.7527	1.9163	2.0900
6	English	1.7763	2.0273	2.0737	2.0622	2.0737
7	German	1.0518	1.1949	1.3921	1.3324	1.4438
8	Spanish	1.6534	1.6348	1.5947	1.6348	1.6753
9	French	1.6882	1.5621	1.7926	1.7926	1.7926
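
Because each ranking method's score sits on its own scale, raw Avg Best values are hard to compare across methods. One possible normalization, an assumption of this sketch rather than something the page computes, is the ratio of Avg Best to Avg Mean at N=100, read here as how far the selected candidate sits above a typical candidate under the same method.

```python
# (Avg Best, Avg Mean) at N=100 for each method in this section,
# copied from the "Expected Best Reward by N" tables above.
n100 = {
    "v18": (1.9099, 1.3765),  # Expressive−Flat (Small)
    "v19": (1.7469, 1.2964),  # FullPos−FullNeg (Small)
    "v20": (1.9087, 1.3978),  # Warm−Robotic (Small)
    "v21": (1.2431, 0.9324),  # Sanitized Prompt (Large)
    "v22": (0.8531, 0.6088),  # Sanitized Prompt (Small)
    "v23": (2.0803, 1.5856),  # Sanitized−Uncanny (Large)
    "v24": (1.8432, 1.3516),  # Sanitized−Uncanny (Small)
}

# Lift of the best-of-100 pick over the average candidate, per method.
lift = {name: best / avg for name, (best, avg) in n100.items()}

for name, ratio in sorted(lift.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {ratio:.3f}")
```

On these numbers every method lands between about 1.31 and 1.40, with v22 highest and v23 lowest, so the relative benefit of selecting the best of 100 is similar across methods even though their absolute rewards differ widely.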

Prompt #0 — English (Silicon Valley accent)

Language: English · Accent: Silicon Valley accent · Scored: 100/100

DramaBox Prompt

A young woman, possessing an extremely high fundamental frequency and bright, delicate harmonic texture, with a brisk, elevated momentum and a Silicon Valley accent; this is a pristine, high-quality studio voice recording with no background noise. She delivers the lines with a teasing lightness that occasionally borders on nervous energy, punctuated by small moments of genuine relief. (A brief, high-pitched Giggle escapes as she begins.) "Honestly, you think finding a solid Firestone review is that hard? Boggle, really. But look, that Lys thing actually worked." (She pauses, a subtle Contemplation washing over her features, then manages a slight, contained Chuckle.) "Just wait, I'll show you." She concludes with a soft, almost satisfied sigh, allowing the tension to dissipate.

Prompt #1 — French

Language: French · Scored: 100/100

DramaBox Prompt

High-pitched, delicately resonant, and possessing the slightly strained clarity of a young adult female soprano; the voice is bright and purely head-dominant, engineered for intimate projection.

Pauses briefly, gathering strength. "Malgré la profondeur de cette sombre forêt, je sens toujours cette confiance absolue en mon chemin, guidée par la lumière."
A slight, almost imperceptible hardening of tone. "Même au cœur de cette nuit insondable, ma boussole intérieure me montre la seule direction véritable."
She finishes, a note of unwavering certainty settling.

The pace remains glacially slow throughout the utterance. The delivery conveys immense, quiet self-assurance.
