DramaBox + Chatterbox VC — Best-of-N: Diminishing Returns

10 Path A prompts × 100 candidates — 29 ranking methods

Methodology: Ranking Method Formulas & Text Prompts

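All methods use reward = (1 − WER) × max(score, 0), where WER is the word error rate. Only the score differs between methods, as listed below. Keys ending in _L or _S (and the l/s in clap_lq, clap_sq, clap_lp, clap_sp) refer to the VoiceCLAP-Large and VoiceCLAP-Small variants respectively.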
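The reward formula can be sketched in a few lines. This is a minimal illustration, not the pipeline's actual code: the function name and the candidate values are hypothetical. Note that WER can exceed 1 when a transcript contains many insertions, in which case the formula as written yields a negative reward; the sketch does not clamp WER because the source does not say whether the pipeline does.

```python
def reward(wer: float, score: float) -> float:
    """reward = (1 - WER) * max(score, 0), as stated in the methodology."""
    return (1.0 - wer) * max(score, 0.0)


# Hypothetical candidates: (word error rate, method score).
candidates = [(0.00, 4.2), (0.10, 4.6), (0.50, 4.9), (0.05, -0.3)]
rewards = [reward(w, s) for w, s in candidates]

# Best-of-N selection keeps the candidate with the highest reward.
best_index = max(range(len(rewards)), key=rewards.__getitem__)
print(rewards, best_index)
```

The third candidate has the highest raw score but loses to the first because half of its words are wrong, and the negative score in the last candidate is floored to a reward of zero.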

#	Key	Score Formula	Text Prompt(s)
0	standard	Content Enjoyment	—
1	clap_lq	cos(audio, quality_text)	"pleasant, realistic, genuine, authentic, natural performance, high-quality recording"
2	clap_sq	cos(audio, quality_text)	"pleasant, realistic, genuine, authentic, natural performance, high-quality recording"
3	clap_lp	cos(audio, prompt)	Original DramaBox prompt
4	clap_sp	cos(audio, prompt)	Original DramaBox prompt
5	v1_nat_L	cos(audio, nat)	"natural, spontaneous, lifelike speech with genuine emotion"
6	v2_auth_L	cos(audio, auth)	"authentic, emotionally truthful, deeply felt voice performance"
7	v3_pro_L	cos(audio, pro)	"professional studio recording, crystal clear high-fidelity audio"
8	v4_expr_L	cos(audio, expr)	"expressive, dynamic voice acting with rich emotional range"
9	v5_cine_L	cos(audio, cine)	"immersive cinematic narration, compelling storytelling"
10	v6_nat_S	cos(audio, nat)	"natural, spontaneous, lifelike speech with genuine emotion"
11	v7_auth_S	cos(audio, auth)	"authentic, emotionally truthful, deeply felt voice performance"
12	v8_pro_S	cos(audio, pro)	"professional studio recording, crystal clear high-fidelity audio"
13	v9_nr_L	cos(audio, nat) − cos(audio, rob)	+ "natural, spontaneous, lifelike speech with genuine emotion" / − "robotic, mechanical, monotonous, synthetic computer speech"
14	v10_ac_L	cos(audio, auth) − cos(audio, cheap)	+ "authentic, emotionally truthful, deeply felt voice performance" / − "cheap, amateurish, rehearsed, stilted text-to-speech output"
15	v11_pd_L	cos(audio, pro) − cos(audio, dist)	+ "professional studio recording, crystal clear high-fidelity audio" / − "distorted, noisy, muffled, low-quality poor recording"
16	v12_ef_L	cos(audio, expr) − cos(audio, flat)	+ "expressive, dynamic voice acting with rich emotional range" / − "flat, lifeless, boring, emotionally dead recitation"
17	v13_ff_L	cos(audio, full_pos) − cos(audio, full_neg)	+ "natural spontaneous genuine authentic high-quality voice performance" / − "robotic distorted monotonous rehearsed cheap artificial synthetic"
18	v14_wr_L	cos(audio, warm) − cos(audio, rob)	+ "warm, pleasant, engaging conversational human voice" / − "robotic, mechanical, monotonous, synthetic computer speech"
19	v15_nr_S	cos(audio, nat) − cos(audio, rob)	+ "natural, spontaneous, lifelike speech with genuine emotion" / − "robotic, mechanical, monotonous, synthetic computer speech"
20	v16_ac_S	cos(audio, auth) − cos(audio, cheap)	+ "authentic, emotionally truthful, deeply felt voice performance" / − "cheap, amateurish, rehearsed, stilted text-to-speech output"
21	v17_pd_S	cos(audio, pro) − cos(audio, dist)	+ "professional studio recording, crystal clear high-fidelity audio" / − "distorted, noisy, muffled, low-quality poor recording"
22	v18_ef_S	cos(audio, expr) − cos(audio, flat)	+ "expressive, dynamic voice acting with rich emotional range" / − "flat, lifeless, boring, emotionally dead recitation"
23	v19_ff_S	cos(audio, full_pos) − cos(audio, full_neg)	+ "natural spontaneous genuine authentic high-quality voice performance" / − "robotic distorted monotonous rehearsed cheap artificial synthetic"
24	v20_wr_S	cos(audio, warm) − cos(audio, rob)	+ "warm, pleasant, engaging conversational human voice" / − "robotic, mechanical, monotonous, synthetic computer speech"
25	v21_san_L	cos(audio, sanitized_prompt)	Original DramaBox prompt with quoted speech removed (Large)
26	v22_san_S	cos(audio, sanitized_prompt)	Original DramaBox prompt with quoted speech removed (Small)
27	v23_snr_L	cos(audio, sanitized) − cos(audio, neg_san)	+ Sanitized prompt / − "robotic, distorted, uncanny" (Large)
28	v24_snr_S	cos(audio, sanitized) − cos(audio, neg_san)	+ Sanitized prompt / − "robotic, distorted, uncanny" (Small)
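The single-prompt and contrastive score formulas in the table can be sketched as follows. The embedding vectors here are toy placeholders; in the real pipeline they would come from the VoiceCLAP audio and text encoders, whose API is not shown in this report. A raw cosine similarity cannot exceed 1 and a raw difference cannot exceed 2, yet some rewards reported below are larger (for example 1.3218 for v1 and 2.5602 for v16), so the pipeline presumably applies a scaling that is not documented here. The sketch makes no attempt to reproduce it.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def contrastive_score(audio: Sequence[float],
                      positive: Sequence[float],
                      negative: Sequence[float]) -> float:
    """cos(audio, positive) - cos(audio, negative), as in methods v9 to v20."""
    return cosine(audio, positive) - cosine(audio, negative)


# Toy 2-D embeddings (hypothetical): audio aligned with the positive text
# and orthogonal to the negative text.
audio_emb = [1.0, 0.0]
natural_emb = [2.0, 0.0]
robotic_emb = [0.0, 3.0]
print(contrastive_score(audio_emb, natural_emb, robotic_emb))
```

Subtracting the negative-prompt similarity rewards candidates that are close to the desired description and far from the undesired one, rather than merely close to the former.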

Cross-Method Diminishing Returns Comparison

Method	N=5	N=10	N=25	N=50	N=100	Gain N=5→100	Knee Point
Standard: (1−WER) × Content Enjoyment	3.9581	4.0268	4.0754	4.1300	4.1873	+0.2292	N=25
VoiceCLAP-Large × Quality Text	0.9157	0.9398	0.9571	0.9579	0.9734	+0.0577	N=50
VoiceCLAP-Small × Quality Text	0.7554	0.7672	0.7617	0.7889	0.8122	+0.0568	N=25
VoiceCLAP-Large × Prompt Match	1.4017	1.4245	1.4332	1.4464	1.4711	+0.0693	N=25
VoiceCLAP-Small × Prompt Match	0.8592	0.8813	0.8670	0.8846	0.9075	+0.0483	N=25
v1 Natural (Large)	0.9833	1.0110	1.0301	1.0319	1.0473	+0.0639	N=50
v2 Authentic (Large)	1.0174	1.0398	1.0532	1.0589	1.0713	+0.0538	N=25
v3 Professional (Large)	0.8629	0.8870	0.9011	0.9019	0.9169	+0.0540	N=50
v4 Expressive (Large)	1.0302	1.0487	1.0722	1.0744	1.0922	+0.0620	N=50
v5 Cinematic (Large)	0.9959	1.0112	1.0255	1.0311	1.0503	+0.0544	N=25
v6 Natural (Small)	0.7859	0.8138	0.8190	0.8478	0.8632	+0.0773	N=25
v7 Authentic (Small)	0.7514	0.7713	0.7856	0.8068	0.8224	+0.0710	N=100
v8 Professional (Small)	0.6329	0.6443	0.6365	0.6591	0.6732	+0.0403	N=25
v9 Natural−Robotic (Large)	1.7735	1.8105	1.8171	1.8315	1.8589	+0.0854	N=25
v10 Authentic−Cheap (Large)	1.8551	1.8824	1.8978	1.9100	1.9342	+0.0790	N=25
v11 Professional−Distorted (Large)	1.7497	1.7845	1.8056	1.8117	1.8378	+0.0881	N=25
v12 Expressive−Flat (Large)	1.7729	1.7995	1.8278	1.8410	1.8651	+0.0922	N=50
v13 FullPos−FullNeg (Large)	1.7672	1.8010	1.8270	1.8293	1.8584	+0.0912	N=25
v14 Warm−Robotic (Large)	1.6873	1.7219	1.7342	1.7475	1.7749	+0.0876	N=25
v15 Natural−Robotic (Small)	1.8351	1.8862	1.9036	1.9140	1.9449	+0.1098	N=25
v16 Authentic−Cheap (Small)	1.8619	1.9117	1.9390	1.9592	2.0056	+0.1437	N=25
v17 Professional−Distorted (Small)	1.7135	1.7285	1.7501	1.7619	1.7889	+0.0754	N=25
v18 Expressive−Flat (Small)	1.5851	1.6384	1.6706	1.6856	1.7089	+0.1238	N=50
v19 FullPos−FullNeg (Small)	1.6466	1.6811	1.6969	1.7114	1.7327	+0.0861	N=25
v20 Warm−Robotic (Small)	1.7386	1.7669	1.7929	1.8204	1.8502	+0.1115	N=50
v21 Sanitized Prompt (Large)	1.1543	1.1748	1.1891	1.2022	1.2222	+0.0679	N=25
v22 Sanitized Prompt (Small)	0.8341	0.8580	0.8388	0.8582	0.8839	+0.0498	N=25
v23 Sanitized−Uncanny (Large)	1.9557	1.9897	2.0095	2.0236	2.0529	+0.0972	N=25
v24 Sanitized−Uncanny (Small)	1.8150	1.8599	1.8555	1.8901	1.9179	+0.1028	N=25

[Figure: Diminishing Returns — All Methods Overlaid]

Marginal Improvement per Additional Candidate

Method	N=5→10	N=10→25	N=25→50	N=50→100
Standard: (1−WER) × Content Enjoyment	0.01374/cand (1.7%)	0.00324/cand (1.2%)	0.00218/cand (1.3%)	0.00115/cand (1.4%)
VoiceCLAP-Large × Quality Text	0.00484/cand (2.6%)	0.00115/cand (1.8%)	0.00003/cand (0.1%)	0.00031/cand (1.6%)
VoiceCLAP-Small × Quality Text	0.00236/cand (1.6%)	-0.00036/cand (-0.7%)	0.00109/cand (3.6%)	0.00047/cand (3.0%)
VoiceCLAP-Large × Prompt Match	0.00456/cand (1.6%)	0.00058/cand (0.6%)	0.00052/cand (0.9%)	0.00049/cand (1.7%)
VoiceCLAP-Small × Prompt Match	0.00442/cand (2.6%)	-0.00095/cand (-1.6%)	0.00070/cand (2.0%)	0.00046/cand (2.6%)
v1 Natural (Large)	0.00554/cand (2.8%)	0.00127/cand (1.9%)	0.00007/cand (0.2%)	0.00031/cand (1.5%)
v2 Authentic (Large)	0.00448/cand (2.2%)	0.00090/cand (1.3%)	0.00023/cand (0.5%)	0.00025/cand (1.2%)
v3 Professional (Large)	0.00483/cand (2.8%)	0.00094/cand (1.6%)	0.00003/cand (0.1%)	0.00030/cand (1.7%)
v4 Expressive (Large)	0.00370/cand (1.8%)	0.00157/cand (2.2%)	0.00009/cand (0.2%)	0.00036/cand (1.7%)
v5 Cinematic (Large)	0.00307/cand (1.5%)	0.00095/cand (1.4%)	0.00022/cand (0.5%)	0.00039/cand (1.9%)
v6 Natural (Small)	0.00559/cand (3.6%)	0.00035/cand (0.6%)	0.00115/cand (3.5%)	0.00031/cand (1.8%)
v7 Authentic (Small)	0.00397/cand (2.6%)	0.00096/cand (1.9%)	0.00085/cand (2.7%)	0.00031/cand (1.9%)
v8 Professional (Small)	0.00228/cand (1.8%)	-0.00052/cand (-1.2%)	0.00090/cand (3.5%)	0.00028/cand (2.1%)
v9 Natural−Robotic (Large)	0.00741/cand (2.1%)	0.00044/cand (0.4%)	0.00057/cand (0.8%)	0.00055/cand (1.5%)
v10 Authentic−Cheap (Large)	0.00545/cand (1.5%)	0.00103/cand (0.8%)	0.00049/cand (0.6%)	0.00048/cand (1.3%)
v11 Professional−Distorted (Large)	0.00696/cand (2.0%)	0.00141/cand (1.2%)	0.00024/cand (0.3%)	0.00052/cand (1.4%)
v12 Expressive−Flat (Large)	0.00533/cand (1.5%)	0.00188/cand (1.6%)	0.00053/cand (0.7%)	0.00048/cand (1.3%)
v13 FullPos−FullNeg (Large)	0.00676/cand (1.9%)	0.00173/cand (1.4%)	0.00009/cand (0.1%)	0.00058/cand (1.6%)
v14 Warm−Robotic (Large)	0.00692/cand (2.1%)	0.00082/cand (0.7%)	0.00053/cand (0.8%)	0.00055/cand (1.6%)
v15 Natural−Robotic (Small)	0.01022/cand (2.8%)	0.00115/cand (0.9%)	0.00042/cand (0.5%)	0.00062/cand (1.6%)
v16 Authentic−Cheap (Small)	0.00996/cand (2.7%)	0.00182/cand (1.4%)	0.00081/cand (1.0%)	0.00093/cand (2.4%)
v17 Professional−Distorted (Small)	0.00300/cand (0.9%)	0.00144/cand (1.2%)	0.00047/cand (0.7%)	0.00054/cand (1.5%)
v18 Expressive−Flat (Small)	0.01065/cand (3.4%)	0.00215/cand (2.0%)	0.00060/cand (0.9%)	0.00047/cand (1.4%)
v19 FullPos−FullNeg (Small)	0.00689/cand (2.1%)	0.00106/cand (0.9%)	0.00058/cand (0.9%)	0.00043/cand (1.2%)
v20 Warm−Robotic (Small)	0.00565/cand (1.6%)	0.00174/cand (1.5%)	0.00110/cand (1.5%)	0.00060/cand (1.6%)
v21 Sanitized Prompt (Large)	0.00409/cand (1.8%)	0.00095/cand (1.2%)	0.00052/cand (1.1%)	0.00040/cand (1.7%)
v22 Sanitized Prompt (Small)	0.00477/cand (2.9%)	-0.00128/cand (-2.2%)	0.00078/cand (2.3%)	0.00051/cand (3.0%)
v23 Sanitized−Uncanny (Large)	0.00680/cand (1.7%)	0.00132/cand (1.0%)	0.00056/cand (0.7%)	0.00059/cand (1.4%)
v24 Sanitized−Uncanny (Small)	0.00896/cand (2.5%)	-0.00029/cand (-0.2%)	0.00138/cand (1.9%)	0.00056/cand (1.5%)
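The cells above can be reproduced from the cross-method table. Judging from the Standard row, the per-candidate figure is the change in average best reward divided by the number of added candidates, and the percentage is the relative gain over the whole interval (not per candidate). The sketch below, with a hypothetical helper name, recomputes the Standard row from its published four-decimal averages; rows derived from unrounded data may differ in the last digit.

```python
from typing import Dict, List, Tuple


def marginal_gains(avg_best: Dict[int, float]) -> List[Tuple[int, int, float, float]]:
    """Return (n_from, n_to, gain per added candidate, relative gain in %)."""
    ns = sorted(avg_best)
    rows = []
    for lo, hi in zip(ns, ns[1:]):
        gain = avg_best[hi] - avg_best[lo]
        rows.append((lo, hi, gain / (hi - lo), 100.0 * gain / avg_best[lo]))
    return rows


# Average best reward for the Standard method, copied from the table above.
standard = {5: 3.9581, 10: 4.0268, 25: 4.0754, 50: 4.1300, 100: 4.1873}

for lo, hi, per_cand, pct in marginal_gains(standard):
    print(f"N={lo}→{hi}: {per_cand:.5f}/cand ({pct:.1f}%)")
```

For the Standard method the per-candidate gain falls by roughly a factor of twelve between the first and last interval, even though each interval still adds between 1% and 2% in total.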

Ablation: Pronunciation Suffix Effect

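Compares N=10 candidates generated without the pronunciation suffix against N=10 candidates generated with it. Deltas are With Suffix minus Without Suffix.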

Ranking Method	No Suffix Mean	No Suffix Best	No Suffix Median	With Suffix Mean	With Suffix Best	With Suffix Median	Δ Mean	Δ Best
Standard: (1−WER) × Content Enjoyment	3.6195	3.9607	3.6424	3.6106	4.0308	3.6205	-0.0090	+0.0702
VoiceCLAP-Large × Quality Text	0.8444	0.9283	0.8523	0.8329	0.9346	0.8389	-0.0114	+0.0064
VoiceCLAP-Small × Quality Text	0.6351	0.7551	0.6386	0.6189	0.7522	0.6206	-0.0162	-0.0030
VoiceCLAP-Large × Prompt Match	1.2870	1.4086	1.3024	1.2778	1.4212	1.2879	-0.0092	+0.0126
VoiceCLAP-Small × Prompt Match	0.7504	0.8416	0.7522	0.7446	0.8701	0.7434	-0.0058	+0.0285
v1 Natural (Large)	0.9049	1.0007	0.9144	0.8917	1.0024	0.8969	-0.0132	+0.0017
v2 Authentic (Large)	0.9343	1.0284	0.9436	0.9242	1.0312	0.9340	-0.0101	+0.0028
v3 Professional (Large)	0.7943	0.8754	0.8013	0.7844	0.8806	0.7911	-0.0099	+0.0052
v4 Expressive (Large)	0.9340	1.0333	0.9391	0.9271	1.0396	0.9385	-0.0069	+0.0062
v5 Cinematic (Large)	0.9045	1.0054	0.9096	0.8966	1.0054	0.9063	-0.0079	-0.0000
v6 Natural (Small)	0.6983	0.8117	0.6966	0.6881	0.8098	0.6849	-0.0102	-0.0019
v7 Authentic (Small)	0.6700	0.7680	0.6691	0.6649	0.7698	0.6667	-0.0051	+0.0018
v8 Professional (Small)	0.5315	0.6245	0.5312	0.5114	0.6184	0.5143	-0.0202	-0.0061
v9 Natural−Robotic (Large)	1.6313	1.7876	1.6492	1.6122	1.7956	1.6244	-0.0190	+0.0080
v10 Authentic−Cheap (Large)	1.7022	1.8613	1.7193	1.6859	1.8756	1.7018	-0.0163	+0.0143
v11 Professional−Distorted (Large)	1.6092	1.7664	1.6226	1.5894	1.7739	1.6019	-0.0198	+0.0076
v12 Expressive−Flat (Large)	1.6173	1.7740	1.6316	1.6040	1.7903	1.6215	-0.0133	+0.0163
v13 FullPos−FullNeg (Large)	1.6309	1.7793	1.6466	1.6119	1.7919	1.6223	-0.0190	+0.0126
v14 Warm−Robotic (Large)	1.5563	1.6963	1.5731	1.5380	1.7028	1.5489	-0.0183	+0.0065
v15 Natural−Robotic (Small)	1.6823	1.8742	1.6895	1.6675	1.8886	1.6774	-0.0148	+0.0144
v16 Authentic−Cheap (Small)	1.6965	1.8955	1.7027	1.6917	1.9241	1.7057	-0.0048	+0.0286
v17 Professional−Distorted (Small)	1.5521	1.7208	1.5647	1.5246	1.7014	1.5344	-0.0275	-0.0195
v18 Expressive−Flat (Small)	1.4477	1.6332	1.4492	1.4470	1.6288	1.4661	-0.0007	-0.0044
v19 FullPos−FullNeg (Small)	1.5022	1.6746	1.5111	1.4801	1.6635	1.4930	-0.0221	-0.0111
v20 Warm−Robotic (Small)	1.5724	1.7846	1.5727	1.5622	1.7670	1.5683	-0.0102	-0.0175
v21 Sanitized Prompt (Large)	1.0571	1.1594	1.0678	1.0474	1.1745	1.0524	-0.0097	+0.0151
v22 Sanitized Prompt (Small)	0.7158	0.7991	0.7177	0.7119	0.8345	0.7118	-0.0039	+0.0354
v23 Sanitized−Uncanny (Large)	1.8004	1.9687	1.8188	1.7835	1.9876	1.7944	-0.0170	+0.0189
v24 Sanitized−Uncanny (Small)	1.6304	1.8131	1.6364	1.6146	1.8390	1.6165	-0.0158	+0.0259

Per-Prompt Ablation: Standard Reward (N=10)

#	Lang	No Suffix Mean	No Suffix Best	With Suffix Mean	With Suffix Best	Δ Mean	Δ Best
0	English	4.4382	4.6876	4.2446	4.8360	-0.1936	+0.1484
1	French	4.7509	4.9606	4.5731	4.9682	-0.1778	+0.0076
2	English	0.7624	1.2826	0.8871	1.3244	+0.1247	+0.0418
3	German	4.7740	4.8920	4.8034	4.9066	+0.0295	+0.0146
4	French	4.5449	4.7755	4.4955	4.7763	-0.0493	+0.0008
5	French	3.6400	4.2156	3.8687	4.6396	+0.2287	+0.4240
6	English	2.9716	3.6152	3.1176	3.6882	+0.1460	+0.0730
7	German	2.2579	2.5002	2.2256	2.4999	-0.0323	-0.0003
8	Spanish	4.6049	4.8121	4.6370	4.8220	+0.0320	+0.0099
9	French	3.4505	3.8653	3.2531	3.8472	-0.1974	-0.0181
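The per-prompt rows aggregate to the Standard row of the ablation table. As a quick consistency check (a sketch using only the values printed above, with hypothetical variable names), averaging the ten per-prompt deltas gives values close to the reported Δ Mean of -0.0090 and Δ Best of +0.0702:

```python
from statistics import mean

# (no-suffix mean, no-suffix best, with-suffix mean, with-suffix best)
# for prompts 0-9, copied from the per-prompt table above.
per_prompt = [
    (4.4382, 4.6876, 4.2446, 4.8360),
    (4.7509, 4.9606, 4.5731, 4.9682),
    (0.7624, 1.2826, 0.8871, 1.3244),
    (4.7740, 4.8920, 4.8034, 4.9066),
    (4.5449, 4.7755, 4.4955, 4.7763),
    (3.6400, 4.2156, 3.8687, 4.6396),
    (2.9716, 3.6152, 3.1176, 3.6882),
    (2.2579, 2.5002, 2.2256, 2.4999),
    (4.6049, 4.8121, 4.6370, 4.8220),
    (3.4505, 3.8653, 3.2531, 3.8472),
]

delta_mean = mean(ws_mean - ns_mean for ns_mean, _, ws_mean, _ in per_prompt)
delta_best = mean(ws_best - ns_best for _, ns_best, _, ws_best in per_prompt)
prompts_with_better_best = sum(
    1 for _, ns_best, _, ws_best in per_prompt if ws_best > ns_best
)

print(f"mean Δ Mean = {delta_mean:+.4f}")
print(f"mean Δ Best = {delta_best:+.4f}")
print(f"prompts with a better best candidate: {prompts_with_better_best}/10")
```

The aggregate gain in Δ Best is dominated by prompt 5 (+0.4240); eight of the ten prompts improve, but most by less than 0.08.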

Standard: (1−WER) × Content Enjoyment — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	3.9581	4.0268	4.0754	4.1300	4.1873
Std Dev	1.2052	1.2232	1.2125	1.2485	1.2126
Avg Mean	3.6381	3.6722	3.5834	3.6057	3.6192
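The report does not state how the best-of-N values were sampled. The per-prompt numbers are not monotone in N (for example, prompt 6 under the Standard method drops from 4.1416 at N=10 to 3.7523 at N=25), and Avg Mean changes with N, which suggests that each N used a different draw of candidates. One alternative, offered here only as a sketch and not as the method used in this report, is the exact expected maximum over all size-N subsets of a scored pool, which is monotone in N by construction:

```python
from math import comb
from typing import Sequence


def expected_best_of_n(rewards: Sequence[float], n: int) -> float:
    """Expected maximum of n rewards drawn without replacement from the pool.

    With rewards sorted ascending, the i-th value (1-based) is the maximum
    of exactly C(i-1, n-1) of the C(M, n) possible subsets.
    """
    m = len(rewards)
    if not 1 <= n <= m:
        raise ValueError("n must be between 1 and len(rewards)")
    ordered = sorted(rewards)
    total = comb(m, n)
    return sum(
        value * comb(i - 1, n - 1) / total
        for i, value in enumerate(ordered, start=1)
        if i >= n
    )


# Hypothetical pool of candidate rewards for one prompt.
pool = [3.1, 4.0, 2.2, 4.7, 3.8, 4.1]
for n in (1, 2, 3, 6):
    print(n, round(expected_best_of_n(pool, n), 4))
```

With n=1 this reduces to the pool mean and with n equal to the pool size it returns the pool maximum, so a curve built this way can be read directly as diminishing returns without sampling noise.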

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	4.7697	4.7697	4.8043	4.8319	4.8905
1	French	4.8851	4.9270	5.0117	5.0117	5.0117
2	English	1.2964	1.3096	1.2896	1.2967	1.3108
3	German	4.9759	4.9558	4.9921	4.9518	4.9928
4	French	4.6222	4.7174	4.7755	4.8091	4.8091
5	French	4.1066	4.2291	4.4138	4.7376	4.8423
6	English	3.7682	4.1416	3.7523	4.1416	4.1416
7	German	2.4842	2.4531	2.8076	2.5773	2.9324
8	Spanish	4.8244	4.9292	5.0107	5.0107	5.0107
9	French	3.8486	3.8359	3.8964	3.9315	3.9315
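The Avg Best and Std Dev rows can be recovered from the per-prompt table. A short check using the N=5 column for the Standard method indicates that Std Dev is the sample standard deviation (n − 1 denominator) across the ten prompts; the population form gives about 1.1434 instead. Variable names are illustrative only.

```python
from statistics import mean, pstdev, stdev

# Per-prompt best reward at N=5 for the Standard method (prompts 0-9),
# copied from the table above.
best_n5 = [4.7697, 4.8851, 1.2964, 4.9759, 4.6222,
           4.1066, 3.7682, 2.4842, 4.8244, 3.8486]

avg_best = mean(best_n5)
sample_sd = stdev(best_n5)       # n - 1 denominator
population_sd = pstdev(best_n5)  # n denominator

print(f"Avg Best = {avg_best:.4f}")         # reported: 3.9581
print(f"Sample SD = {sample_sd:.4f}")       # reported Std Dev: 1.2052
print(f"Population SD = {population_sd:.4f}")
```

The spread is driven largely by prompts 2 and 7, whose best rewards sit far below the rest, so the cross-prompt standard deviation says more about prompt difficulty than about sampling noise.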

VoiceCLAP-Large × Quality Text — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.9157	0.9398	0.9571	0.9579	0.9734
Std Dev	0.2876	0.2867	0.2740	0.2800	0.2735
Avg Mean	0.8492	0.8551	0.8320	0.8389	0.8418

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.1801	1.1960	1.2199	1.2072	1.2199
1	French	1.1609	1.1651	1.1736	1.1746	1.1787
2	English	0.3421	0.3426	0.3644	0.3644	0.3644
3	German	1.1194	1.1489	1.1363	1.1489	1.1489
4	French	1.2584	1.2616	1.2747	1.2612	1.2747
5	French	0.8236	0.8705	0.8963	0.9109	0.9599
6	English	0.8797	0.9655	0.8966	0.9655	0.9655
7	German	0.5779	0.6083	0.7006	0.6249	0.7006
8	Spanish	0.8715	0.8787	0.9214	0.9234	0.9234
9	French	0.9430	0.9612	0.9875	0.9977	0.9977

VoiceCLAP-Small × Quality Text — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.7554	0.7672	0.7617	0.7889	0.8122
Std Dev	0.2944	0.2925	0.2815	0.2918	0.2915
Avg Mean	0.6519	0.6468	0.6185	0.6304	0.6341

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	0.7848	0.7696	0.8288	0.8288	0.8588
1	French	0.9654	1.0323	0.9752	1.0323	1.0323
2	English	0.2507	0.2656	0.2666	0.2629	0.2680
3	German	1.0764	1.0613	1.1007	1.0541	1.1007
4	French	1.1490	1.1624	1.1073	1.1721	1.1721
5	French	0.4992	0.5699	0.5871	0.6379	0.6379
6	English	0.7262	0.8032	0.7391	0.8032	0.8714
7	German	0.4466	0.4526	0.4866	0.4358	0.4866
8	Spanish	0.6695	0.5961	0.5669	0.6568	0.6695
9	French	0.9860	0.9590	0.9590	1.0047	1.0246

VoiceCLAP-Large × Prompt Match — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.4017	1.4245	1.4332	1.4464	1.4711
Std Dev	0.4496	0.4485	0.4446	0.4537	0.4416
Avg Mean	1.2874	1.3003	1.2653	1.2744	1.2805

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.7400	1.7779	1.7687	1.7775	1.7779
1	French	1.8340	1.8423	1.8407	1.8427	1.8553
2	English	0.4610	0.4587	0.4610	0.4610	0.4610
3	German	1.8086	1.8357	1.8307	1.8357	1.8357
4	French	1.5970	1.5916	1.5846	1.5878	1.5970
5	French	1.5405	1.5608	1.6132	1.6831	1.7237
6	English	1.3109	1.3858	1.3323	1.3858	1.4119
7	German	0.8211	0.8590	0.9034	0.8673	0.9943
8	Spanish	1.6251	1.6041	1.6447	1.6522	1.6741
9	French	1.2792	1.3295	1.3530	1.3704	1.3797

VoiceCLAP-Small × Prompt Match — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.8592	0.8813	0.8670	0.8846	0.9075
Std Dev	0.2891	0.2970	0.2800	0.2953	0.2898
Avg Mean	0.7620	0.7667	0.7428	0.7468	0.7500

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.2965	1.3337	1.3187	1.3187	1.3337
1	French	0.8539	0.8664	0.8710	0.8710	0.8710
2	English	0.3242	0.3066	0.3013	0.3037	0.3242
3	German	1.1672	1.1607	1.1445	1.1830	1.1879
4	French	0.8727	0.9134	0.9120	0.9210	0.9293
5	French	0.9042	0.9781	0.9781	0.9615	1.0058
6	English	1.1224	1.1348	0.9805	1.1348	1.1678
7	German	0.5815	0.6115	0.6267	0.6144	0.7063
8	Spanish	0.7488	0.7396	0.7697	0.7697	0.7809
9	French	0.7203	0.7679	0.7679	0.7679	0.7679

v1 Natural (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.9833	1.0110	1.0301	1.0319	1.0473
Std Dev	0.2930	0.2892	0.2820	0.2897	0.2820
Avg Mean	0.9095	0.9154	0.8904	0.8967	0.9003

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.2999	1.3086	1.3218	1.3206	1.3218
1	French	1.2314	1.2332	1.2605	1.2605	1.2605
2	English	0.3951	0.3878	0.4029	0.4029	0.4029
3	German	1.0973	1.1184	1.1089	1.1248	1.1248
4	French	1.3264	1.3264	1.3393	1.3286	1.3393
5	French	0.9290	0.9829	1.0206	1.0278	1.0844
6	English	0.9207	1.0350	0.9784	1.0350	1.0407
7	German	0.6350	0.6844	0.7618	0.6844	0.7618
8	Spanish	0.9463	0.9553	0.9966	1.0209	1.0209
9	French	1.0520	1.0780	1.1103	1.1140	1.1154

v2 Authentic (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.0174	1.0398	1.0532	1.0589	1.0713
Std Dev	0.2791	0.2804	0.2748	0.2857	0.2760
Avg Mean	0.9382	0.9451	0.9201	0.9268	0.9302

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.1549	1.1756	1.1729	1.1752	1.1756
1	French	1.2840	1.2896	1.2984	1.2984	1.2984
2	English	0.4005	0.3968	0.4061	0.4061	0.4061
3	German	1.1687	1.1819	1.1812	1.1937	1.1937
4	French	1.3010	1.3022	1.3179	1.3179	1.3179
5	French	1.0370	1.0809	1.1116	1.1435	1.1790
6	English	0.9344	1.0569	0.9639	1.0569	1.0569
7	German	0.6940	0.7049	0.7912	0.7038	0.7915
8	Spanish	1.1312	1.1358	1.1862	1.1880	1.1880
9	French	1.0684	1.0735	1.1030	1.1054	1.1054

v3 Professional (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.8629	0.8870	0.9011	0.9019	0.9169
Std Dev	0.2646	0.2657	0.2498	0.2591	0.2508
Avg Mean	0.7978	0.8034	0.7831	0.7890	0.7916

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.0788	1.1150	1.1273	1.1202	1.1351
1	French	1.0938	1.0996	1.1037	1.1057	1.1057
2	English	0.3301	0.3313	0.3570	0.3570	0.3570
3	German	1.0806	1.1039	1.0909	1.1052	1.1052
4	French	1.1657	1.1606	1.1726	1.1623	1.1726
5	French	0.7691	0.8260	0.8462	0.8519	0.8941
6	English	0.8495	0.9323	0.8728	0.9323	0.9323
7	German	0.5526	0.5725	0.6599	0.5784	0.6599
8	Spanish	0.8247	0.8339	0.8619	0.8796	0.8796
9	French	0.8837	0.8952	0.9183	0.9264	0.9273

v4 Expressive (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.0302	1.0487	1.0722	1.0744	1.0922
Std Dev	0.2973	0.2907	0.2866	0.3061	0.2913
Avg Mean	0.9373	0.9471	0.9224	0.9273	0.9308

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.2845	1.2997	1.3146	1.3146	1.3146
1	French	1.2896	1.2632	1.3303	1.3303	1.3303
2	English	0.3983	0.4149	0.4080	0.4110	0.4176
3	German	1.0538	1.0646	1.0646	1.0721	1.0779
4	French	1.2563	1.2671	1.2632	1.2867	1.2867
5	French	1.1055	1.1381	1.1962	1.2255	1.2476
6	English	1.1798	1.2267	1.1692	1.2267	1.2345
7	German	0.6153	0.6578	0.7594	0.6548	0.7594
8	Spanish	1.1322	1.1394	1.1824	1.1824	1.1851
9	French	0.9865	1.0151	1.0344	1.0401	1.0685

v5 Cinematic (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.9959	1.0112	1.0255	1.0311	1.0503
Std Dev	0.2751	0.2765	0.2611	0.2823	0.2707
Avg Mean	0.9071	0.9145	0.8905	0.8963	0.8999

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.0962	1.1126	1.1399	1.1399	1.1399
1	French	1.2493	1.2514	1.2545	1.2526	1.2545
2	English	0.3975	0.3952	0.3969	0.4072	0.4072
3	German	1.1020	1.1141	1.1141	1.1123	1.1141
4	French	1.2352	1.2321	1.2387	1.2437	1.2507
5	French	1.1387	1.1553	1.1818	1.2361	1.2825
6	English	1.0950	1.1501	1.0561	1.1501	1.1501
7	German	0.6107	0.6376	0.7574	0.6362	0.7574
8	Spanish	1.0335	1.0103	1.0450	1.0478	1.0478
9	French	1.0011	1.0538	1.0702	1.0846	1.0991

v6 Natural (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.7859	0.8138	0.8190	0.8478	0.8632
Std Dev	0.2624	0.2598	0.2469	0.2604	0.2545
Avg Mean	0.7038	0.7016	0.6778	0.6870	0.6902

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	0.5855	0.6011	0.6720	0.6980	0.7086
1	French	1.0678	1.1142	1.1085	1.1323	1.1323
2	English	0.3018	0.2919	0.3094	0.3111	0.3112
3	German	1.0163	0.9880	1.0250	1.0267	1.0267
4	French	1.0540	1.0493	1.0521	1.0951	1.0951
5	French	0.5972	0.7371	0.6884	0.7579	0.7579
6	English	0.7692	0.9272	0.7952	0.9272	0.9744
7	German	0.5935	0.5827	0.6630	0.5827	0.6653
8	Spanish	0.8304	0.8293	0.8474	0.9041	0.9041
9	French	1.0429	1.0175	1.0291	1.0429	1.0563

v7 Authentic (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.7514	0.7713	0.7856	0.8068	0.8224
Std Dev	0.2323	0.2361	0.2282	0.2368	0.2263
Avg Mean	0.6740	0.6731	0.6505	0.6594	0.6620

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	0.6015	0.6025	0.6906	0.6929	0.7054
1	French	1.0396	1.0476	1.0728	1.0733	1.0733
2	English	0.3161	0.3116	0.3160	0.3206	0.3267
3	German	0.9578	0.9621	0.9861	0.9854	0.9868
4	French	0.9614	0.9640	0.9895	1.0108	1.0108
5	French	0.6842	0.7927	0.7239	0.8119	0.8119
6	English	0.6189	0.8008	0.7064	0.8008	0.8349
7	German	0.5529	0.4810	0.5864	0.5201	0.6010
8	Spanish	0.8499	0.8258	0.8599	0.9206	0.9206
9	French	0.9321	0.9249	0.9249	0.9321	0.9527

v8 Professional (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	0.6329	0.6443	0.6365	0.6591	0.6732
Std Dev	0.2393	0.2342	0.2323	0.2318	0.2384
Avg Mean	0.5460	0.5425	0.5196	0.5284	0.5315

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	0.8298	0.7883	0.8568	0.8568	0.8672
1	French	0.7508	0.8219	0.7614	0.8219	0.8219
2	English	0.2313	0.2374	0.2399	0.2361	0.2407
3	German	0.8619	0.8717	0.8988	0.8493	0.8988
4	French	0.9454	0.9521	0.9063	0.9311	0.9521
5	French	0.3799	0.4489	0.4533	0.4821	0.4821
6	English	0.6101	0.6875	0.6297	0.6875	0.6930
7	German	0.3956	0.4062	0.4121	0.3958	0.4187
8	Spanish	0.5416	0.4901	0.4684	0.5479	0.5479
9	French	0.7823	0.7387	0.7387	0.7823	0.8095

v9 Natural−Robotic (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.7735	1.8105	1.8171	1.8315	1.8589
Std Dev	0.5023	0.5011	0.4863	0.5056	0.4894
Avg Mean	1.6346	1.6513	1.6035	1.6155	1.6223

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	2.1856	2.1856	2.1715	2.1856	2.1856
1	French	2.1406	2.1531	2.1525	2.1609	2.1609
2	English	0.6243	0.6164	0.6020	0.6243	0.6243
3	German	2.0682	2.0936	2.0832	2.0983	2.0983
4	French	2.2262	2.2262	2.2208	2.2298	2.2298
5	French	1.8148	1.9070	1.8824	1.9578	2.0445
6	English	1.6756	1.8555	1.7410	1.8555	1.8608
7	German	1.2134	1.2666	1.4427	1.2666	1.4427
8	Spanish	1.8958	1.9025	1.9592	2.0257	2.0257
9	French	1.8901	1.8987	1.9159	1.9103	1.9159

v10 Authentic−Cheap (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.8551	1.8824	1.8978	1.9100	1.9342
Std Dev	0.5162	0.5218	0.5084	0.5348	0.5142
Avg Mean	1.7055	1.7213	1.6731	1.6859	1.6929

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.9805	1.9805	1.9704	1.9805	1.9805
1	French	2.3129	2.3257	2.3353	2.3328	2.3405
2	English	0.6722	0.6662	0.6771	0.6722	0.6771
3	German	2.1338	2.1609	2.1511	2.1638	2.1638
4	French	2.3307	2.3370	2.3550	2.3550	2.3550
5	French	2.0096	2.0885	2.0934	2.2069	2.2325
6	English	1.7113	1.8737	1.7147	1.8737	1.8737
7	German	1.2790	1.2744	1.4621	1.2790	1.4680
8	Spanish	2.1274	2.1179	2.1960	2.2225	2.2225
9	French	1.9938	1.9988	2.0233	2.0136	2.0281

v11 Professional−Distorted (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.7497	1.7845	1.8056	1.8117	1.8378
Std Dev	0.5001	0.5083	0.4809	0.5052	0.4858
Avg Mean	1.6125	1.6263	1.5829	1.5948	1.6007

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	2.1368	2.1826	2.1965	2.1965	2.1965
1	French	2.1929	2.1956	2.2158	2.2158	2.2158
2	English	0.6719	0.6721	0.6976	0.6976	0.6976
3	German	1.9887	2.0254	2.0132	2.0317	2.0317
4	French	2.2817	2.2955	2.2954	2.2916	2.3069
5	French	1.7478	1.8559	1.8564	1.9257	1.9956
6	English	1.7129	1.8341	1.7216	1.8341	1.8341
7	German	1.1242	1.1291	1.3234	1.1479	1.3234
8	Spanish	1.8275	1.8163	1.8733	1.8899	1.8899
9	French	1.8124	1.8381	1.8629	1.8865	1.8865

v12 Expressive−Flat (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.7729	1.7995	1.8278	1.8410	1.8651
Std Dev	0.5165	0.5108	0.5006	0.5287	0.5120
Avg Mean	1.6170	1.6382	1.5944	1.6033	1.6100

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	2.2023	2.2023	2.2302	2.2302	2.2302
1	French	2.2038	2.1729	2.2349	2.2349	2.2349
2	English	0.5963	0.6130	0.6042	0.6210	0.6210
3	German	1.8879	1.9135	1.9238	1.9245	1.9448
4	French	2.1078	2.1368	2.1097	2.1339	2.1368
5	French	1.9549	1.9768	2.0464	2.1254	2.1770
6	English	1.9250	2.0643	1.9352	2.0643	2.0643
7	German	1.1267	1.1891	1.3549	1.1891	1.3549
8	Spanish	1.9866	1.9861	2.0638	2.1036	2.1036
9	French	1.7373	1.7403	1.7747	1.7833	1.7833

v13 FullPos−FullNeg (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.7672	1.8010	1.8270	1.8293	1.8584
Std Dev	0.5271	0.5227	0.5027	0.5267	0.5071
Avg Mean	1.6334	1.6490	1.6037	1.6153	1.6220

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	2.1537	2.1657	2.1748	2.1680	2.1748
1	French	2.2928	2.2831	2.3125	2.3125	2.3125
2	English	0.6100	0.6210	0.6349	0.6349	0.6349
3	German	2.0169	2.0474	2.0468	2.0480	2.0480
4	French	2.2687	2.2687	2.2729	2.2721	2.2729
5	French	1.8081	1.9001	1.9082	1.9826	2.0542
6	English	1.6926	1.8334	1.7388	1.8334	1.8515
7	German	1.1340	1.1675	1.3575	1.1729	1.3575
8	Spanish	1.8263	1.8329	1.9110	1.9583	1.9583
9	French	1.8690	1.8904	1.9124	1.9102	1.9195

v14 Warm−Robotic (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.6873	1.7219	1.7342	1.7475	1.7749
Std Dev	0.5051	0.5079	0.4916	0.5060	0.4943
Avg Mean	1.5595	1.5758	1.5304	1.5430	1.5492

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	2.0916	2.1012	2.1366	2.1164	2.1366
1	French	2.0784	2.0871	2.0954	2.0871	2.0980
2	English	0.5567	0.5437	0.5501	0.5604	0.5604
3	German	2.0835	2.1383	2.1268	2.1258	2.1383
4	French	2.1315	2.1315	2.1200	2.1369	2.1369
5	French	1.6906	1.7761	1.7486	1.8316	1.9095
6	English	1.6091	1.7514	1.6514	1.7712	1.7712
7	German	1.1128	1.1680	1.3346	1.1823	1.3346
8	Spanish	1.7799	1.7860	1.8399	1.8920	1.8920
9	French	1.7389	1.7359	1.7390	1.7711	1.7711

v15 Natural−Robotic (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.8351	1.8862	1.9036	1.9140	1.9449
Std Dev	0.5423	0.5520	0.5264	0.5606	0.5325
Avg Mean	1.6816	1.6968	1.6466	1.6609	1.6680

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.8539	1.9471	2.0145	2.0050	2.0303
1	French	2.4311	2.4143	2.4497	2.4497	2.4497
2	English	0.6333	0.6405	0.6522	0.6395	0.6522
3	German	2.2820	2.2947	2.2820	2.3039	2.3039
4	French	2.3169	2.3692	2.3504	2.3808	2.3836
5	French	1.8340	1.9902	1.9196	2.0329	2.0529
6	English	1.7990	1.9793	1.8037	1.9799	2.0081
7	German	1.2372	1.2136	1.4509	1.2305	1.4509
8	Spanish	1.9021	1.9495	1.9770	1.9818	1.9818
9	French	2.0619	2.0640	2.1355	2.1355	2.1355

v16 Authentic−Cheap (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.8619	1.9117	1.9390	1.9592	2.0056
Std Dev	0.5453	0.5732	0.5491	0.5795	0.5567
Avg Mean	1.6898	1.7107	1.6606	1.6750	1.6808

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.7439	1.7439	1.8753	1.8392	1.8753
1	French	2.5063	2.4778	2.5432	2.5602	2.5602
2	English	0.6713	0.6431	0.6527	0.6511	0.6713
3	German	2.3565	2.4256	2.3913	2.4270	2.4270
4	French	2.2102	2.2980	2.3179	2.2980	2.3311
5	French	2.0216	2.1406	2.1085	2.2248	2.3031
6	English	1.6249	1.9050	1.7063	1.9050	1.9050
7	German	1.3323	1.2505	1.5216	1.3198	1.5521
8	Spanish	2.1459	2.1939	2.1939	2.2821	2.2821
9	French	2.0060	2.0385	2.0792	2.0846	2.1484

v17 Professional−Distorted (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.7135	1.7285	1.7501	1.7619	1.7889
Std Dev	0.5271	0.5330	0.5180	0.5343	0.5170
Avg Mean	1.5574	1.5689	1.5235	1.5364	1.5427

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	2.0428	2.1051	2.1361	2.1361	2.1361
1	French	2.1708	2.1759	2.1575	2.1759	2.1759
2	English	0.5492	0.5412	0.5524	0.5524	0.5524
3	German	2.0547	2.0408	2.1070	2.0623	2.1070
4	French	2.2691	2.2691	2.2652	2.2691	2.2691
5	French	1.7026	1.6815	1.7862	1.8018	1.8460
6	English	1.6761	1.8117	1.6513	1.8504	1.8504
7	German	1.0969	1.1115	1.2562	1.1102	1.2815
8	Spanish	1.7161	1.6713	1.7120	1.7835	1.7835
9	French	1.8571	1.8772	1.8772	1.8772	1.8873

v18 Expressive−Flat (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.5851	1.6384	1.6706	1.6856	1.7089
Std Dev	0.4835	0.4827	0.4739	0.5027	0.4877
Avg Mean	1.4512	1.4592	1.4309	1.4359	1.4424

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	2.0046	2.0648	2.0412	2.0535	2.0648
1	French	1.8388	1.8060	1.9466	1.9466	1.9466
2	English	0.5136	0.5190	0.5264	0.5160	0.5424
3	German	1.7483	1.8018	1.8018	1.8250	1.8311
4	French	1.8219	1.8401	1.8869	1.9158	1.9565
5	French	1.7471	1.7990	1.8520	1.9030	1.9459
6	English	1.6545	1.8510	1.7710	1.8510	1.8510
7	German	0.9641	1.0629	1.1792	1.0735	1.1792
8	Spanish	2.0441	2.0416	2.0575	2.1107	2.1107
9	French	1.5140	1.5975	1.6431	1.6611	1.6611

v19 FullPos−FullNeg (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.6466	1.6811	1.6969	1.7114	1.7327
Std Dev	0.4764	0.4812	0.4669	0.4886	0.4652
Avg Mean	1.5050	1.5151	1.4714	1.4848	1.4912

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.8603	1.9313	1.9711	1.9711	1.9711
1	French	2.0814	2.1099	2.1060	2.1144	2.1144
2	English	0.5750	0.5750	0.5849	0.5849	0.5936
3	German	2.0124	1.9963	2.0124	2.0036	2.0124
4	French	2.0640	2.0905	2.1043	2.1005	2.1043
5	French	1.6111	1.7276	1.7058	1.8213	1.8213
6	English	1.5820	1.7505	1.5740	1.7554	1.7682
7	German	1.0959	1.1093	1.2826	1.1035	1.2826
8	Spanish	1.7775	1.7481	1.8525	1.8525	1.8525
9	French	1.8066	1.7723	1.7755	1.8066	1.8066

v20 Warm−Robotic (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

	N=5	N=10	N=25	N=50	N=100
Avg Best	1.7386	1.7669	1.7929	1.8204	1.8502
Std Dev	0.5543	0.5619	0.5456	0.5791	0.5598
Avg Mean	1.5798	1.5912	1.5472	1.5586	1.5654

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	2.0664	2.1809	2.1813	2.2174	2.2174
1	French	2.2268	2.1725	2.2305	2.2305	2.2305
2	English	0.5336	0.5565	0.5489	0.5565	0.5565
3	German	2.1818	2.1907	2.1988	2.2074	2.2089
4	French	2.2176	2.2926	2.2485	2.3549	2.3549
5	French	1.8035	1.7843	1.8619	1.8840	2.0068
6	English	1.6774	1.8177	1.6886	1.9310	1.9310
7	German	1.0253	1.0258	1.2075	1.0494	1.2075
8	Spanish	1.7583	1.7091	1.7362	1.7454	1.7609
9	French	1.8956	1.9385	2.0272	2.0272	2.0272

v21 Sanitized Prompt (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

Statistic	N=5	N=10	N=25	N=50	N=100
Avg Best	1.1543	1.1748	1.1891	1.2022	1.2222
Std Dev	0.3555	0.3527	0.3561	0.3626	0.3536
Avg Mean	1.0578	1.0693	1.0393	1.0461	1.0510

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.5629	1.5640	1.5809	1.5806	1.5893
1	French	1.4593	1.4519	1.4899	1.4899	1.4899
2	English	0.4213	0.4261	0.4235	0.4306	0.4306
3	German	1.3409	1.3787	1.3765	1.3740	1.3787
4	French	1.2723	1.2629	1.2645	1.2722	1.2723
5	French	1.3110	1.3130	1.3874	1.4408	1.4715
6	English	1.2032	1.3173	1.2355	1.3173	1.3389
7	German	0.6458	0.6843	0.7132	0.6967	0.7887
8	Spanish	1.2175	1.2031	1.2397	1.2397	1.2560
9	French	1.1087	1.1463	1.1797	1.1797	1.2061

v22 Sanitized Prompt (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

Statistic	N=5	N=10	N=25	N=50	N=100
Avg Best	0.8341	0.8580	0.8388	0.8582	0.8839
Std Dev	0.2808	0.2942	0.2797	0.2897	0.2913
Avg Mean	0.7333	0.7313	0.7077	0.7131	0.7175

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	1.3197	1.3575	1.3366	1.3313	1.3575
1	French	0.8224	0.8534	0.8511	0.8534	0.8534
2	English	0.3282	0.3110	0.3051	0.3051	0.3282
3	German	1.0338	1.0312	1.0465	1.0429	1.0612
4	French	0.7982	0.7982	0.7899	0.8270	0.8270
5	French	0.8852	0.9786	0.9786	0.9649	1.0212
6	English	1.1193	1.1572	1.0097	1.1572	1.1984
7	German	0.5814	0.6045	0.6086	0.6100	0.6846
8	Spanish	0.7799	0.7776	0.7509	0.7799	0.7971
9	French	0.6730	0.7106	0.7106	0.7106	0.7106

v23 Sanitized−Uncanny (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

Statistic	N=5	N=10	N=25	N=50	N=100
Avg Best	1.9557	1.9897	2.0095	2.0236	2.0529
Std Dev	0.5888	0.5896	0.5771	0.6004	0.5797
Avg Mean	1.7995	1.8210	1.7690	1.7814	1.7894

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	2.5445	2.5445	2.5636	2.5636	2.5663
1	French	2.4483	2.4475	2.4774	2.4774	2.4774
2	English	0.6573	0.6668	0.6650	0.6688	0.6688
3	German	2.2994	2.3630	2.3319	2.3630	2.3630
4	French	2.2450	2.2317	2.2406	2.2445	2.2450
5	French	2.1403	2.2009	2.2567	2.3476	2.3945
6	English	1.9472	2.1291	1.9977	2.1291	2.1600
7	German	1.1948	1.2248	1.3637	1.2438	1.4349
8	Spanish	2.1278	2.1085	2.1760	2.1760	2.1965
9	French	1.9521	1.9800	2.0227	2.0227	2.0227

v24 Sanitized−Uncanny (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

Statistic	N=5	N=10	N=25	N=50	N=100
Avg Best	1.8150	1.8599	1.8555	1.8901	1.9179
Std Dev	0.5338	0.5649	0.5413	0.5632	0.5388
Avg Mean	1.6325	1.6488	1.6062	1.6150	1.6212

Per-Prompt Best Reward by N

#	Lang	N=5	N=10	N=25	N=50	N=100
0	English	2.4921	2.5163	2.5123	2.5232	2.5232
1	French	2.1357	2.1380	2.1643	2.1643	2.1643
2	English	0.6260	0.5819	0.5762	0.6093	0.6260
3	German	2.2411	2.2763	2.2315	2.3090	2.3090
4	French	1.9493	2.0650	2.0665	2.0665	2.0756
5	French	1.9681	2.0667	2.1180	2.1521	2.1806
6	English	2.0144	2.1946	1.9419	2.1946	2.1946
7	German	1.2632	1.2862	1.4201	1.3081	1.5316
8	Spanish	1.7937	1.7579	1.8082	1.8579	1.8579
9	French	1.6668	1.7157	1.7157	1.7157	1.7157

Prompt #0 — English (Silicon Valley accent)

Language: English · Accent: Silicon Valley accent · Scored: 100/100

DramaBox Prompt

A young woman, possessing an extremely high fundamental frequency and bright, delicate harmonic texture, with a brisk, elevated momentum and a Silicon Valley accent; this is a pristine, high-quality studio voice recording with no background noise. She delivers the lines with a teasing lightness that occasionally borders on nervous energy, punctuated by small moments of genuine relief. (A brief, high-pitched Giggle escapes as she begins.) "Honestly, you think finding a solid Firestone review is that hard? Boggle, really. But look, that Lys thing actually worked." (She pauses, a subtle Contemplation washing over her features, then manages a slight, contained Chuckle.) "Just wait, I'll show you." She concludes with a soft, almost satisfied sigh, allowing the tension to dissipate.

Prompt #1 — French

Language: French · Scored: 100/100

DramaBox Prompt

High-pitched, delicately resonant, and possessing the slightly strained clarity of a young adult female soprano; the voice is bright and purely head-dominant, engineered for intimate projection.

Pauses briefly, gathering strength. "Malgré la profondeur de cette sombre forêt, je sens toujours cette confiance absolue en mon chemin, guidée par la lumière."
A slight, almost imperceptible hardening of tone. "Même au cœur de cette nuit insondable, ma boussole intérieure me montre la seule direction véritable."
She finishes, a note of unwavering certainty settling.

The pace remains glacially slow throughout the utterance. The delivery conveys immense, quiet self-assurance.
