DramaBox + Chatterbox VC — Best-of-N: Diminishing Returns

10 Path A prompts × 100 candidates — 29 ranking methods — Page 1/5

Methodology: Ranking Method Formulas & Text Prompts

All methods use reward = (1 − WER) × max(score, 0). The score varies per method as described below.
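
As a minimal sketch of the reward formula stated above, the function below multiplies the transcription accuracy term (1 − WER) by the method score clamped at zero. The candidate values are hypothetical and only illustrate how a negative score is zeroed out and how the best candidate would be picked.

```python
def reward(wer: float, score: float) -> float:
    """reward = (1 - WER) * max(score, 0), as stated in the methodology."""
    return (1.0 - wer) * max(score, 0.0)


# Hypothetical candidates as (word error rate, method score) pairs.
candidates = [(0.00, 0.82), (0.10, 0.95), (0.50, 1.10), (0.05, -0.20)]

rewards = [reward(wer, score) for wer, score in candidates]

# Best-of-N selection: keep the candidate with the highest reward.
best_index = max(range(len(rewards)), key=rewards.__getitem__)

print(rewards, best_index)
```

Note that the candidate with the highest raw score (1.10) does not win here, because its WER of 0.50 halves its reward.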

# | Key | Score Formula | Text Prompt(s)
0 | standard | Content Enjoyment |
1 | clap_lq | cos(audio, quality_text) | "pleasant, realistic, genuine, authentic, natural performance, high-quality recording"
2 | clap_sq | cos(audio, quality_text) | "pleasant, realistic, genuine, authentic, natural performance, high-quality recording"
3 | clap_lp | cos(audio, prompt) | Original DramaBox prompt
4 | clap_sp | cos(audio, prompt) | Original DramaBox prompt
5 | v1_nat_L | cos(audio, nat) | "natural, spontaneous, lifelike speech with genuine emotion"
6 | v2_auth_L | cos(audio, auth) | "authentic, emotionally truthful, deeply felt voice performance"
7 | v3_pro_L | cos(audio, pro) | "professional studio recording, crystal clear high-fidelity audio"
8 | v4_expr_L | cos(audio, expr) | "expressive, dynamic voice acting with rich emotional range"
9 | v5_cine_L | cos(audio, cine) | "immersive cinematic narration, compelling storytelling"
10 | v6_nat_S | cos(audio, nat) | "natural, spontaneous, lifelike speech with genuine emotion"
11 | v7_auth_S | cos(audio, auth) | "authentic, emotionally truthful, deeply felt voice performance"
12 | v8_pro_S | cos(audio, pro) | "professional studio recording, crystal clear high-fidelity audio"
13 | v9_nr_L | cos(audio, nat) − cos(audio, rob) | + "natural, spontaneous, lifelike speech with genuine emotion" / − "robotic, mechanical, monotonous, synthetic computer speech"
14 | v10_ac_L | cos(audio, auth) − cos(audio, cheap) | + "authentic, emotionally truthful, deeply felt voice performance" / − "cheap, amateurish, rehearsed, stilted text-to-speech output"
15 | v11_pd_L | cos(audio, pro) − cos(audio, dist) | + "professional studio recording, crystal clear high-fidelity audio" / − "distorted, noisy, muffled, low-quality poor recording"
16 | v12_ef_L | cos(audio, expr) − cos(audio, flat) | + "expressive, dynamic voice acting with rich emotional range" / − "flat, lifeless, boring, emotionally dead recitation"
17 | v13_ff_L | cos(audio, full_pos) − cos(audio, full_neg) | + "natural spontaneous genuine authentic high-quality voice performance" / − "robotic distorted monotonous rehearsed cheap artificial synthetic"
18 | v14_wr_L | cos(audio, warm) − cos(audio, rob) | + "warm, pleasant, engaging conversational human voice" / − "robotic, mechanical, monotonous, synthetic computer speech"
19 | v15_nr_S | cos(audio, nat) − cos(audio, rob) | + "natural, spontaneous, lifelike speech with genuine emotion" / − "robotic, mechanical, monotonous, synthetic computer speech"
20 | v16_ac_S | cos(audio, auth) − cos(audio, cheap) | + "authentic, emotionally truthful, deeply felt voice performance" / − "cheap, amateurish, rehearsed, stilted text-to-speech output"
21 | v17_pd_S | cos(audio, pro) − cos(audio, dist) | + "professional studio recording, crystal clear high-fidelity audio" / − "distorted, noisy, muffled, low-quality poor recording"
22 | v18_ef_S | cos(audio, expr) − cos(audio, flat) | + "expressive, dynamic voice acting with rich emotional range" / − "flat, lifeless, boring, emotionally dead recitation"
23 | v19_ff_S | cos(audio, full_pos) − cos(audio, full_neg) | + "natural spontaneous genuine authentic high-quality voice performance" / − "robotic distorted monotonous rehearsed cheap artificial synthetic"
24 | v20_wr_S | cos(audio, warm) − cos(audio, rob) | + "warm, pleasant, engaging conversational human voice" / − "robotic, mechanical, monotonous, synthetic computer speech"
25 | v21_san_L | cos(audio, sanitized_prompt) | Quoted speech removed (Large)
26 | v22_san_S | cos(audio, sanitized_prompt) | Quoted speech removed (Small)
27 | v23_snr_L | cos(audio, sanitized) − cos(audio, neg_san) | Sanitized / − "robotic, distorted, uncanny" (Large)
28 | v24_snr_S | cos(audio, sanitized) − cos(audio, neg_san) | Sanitized / − "robotic, distorted, uncanny" (Small)
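
The two score shapes in the table (a single cosine similarity, and a positive-minus-negative contrast) can be sketched as follows. This is an illustration only: the vectors are hypothetical 2-D stand-ins for audio and text embeddings, and the function names are not taken from the original pipeline. Several reported rewards for single-prompt methods exceed 1, which a raw cosine cannot produce after the (1 − WER) factor, so the real pipeline presumably rescales the similarity; that scaling is not described here and is left out.

```python
import math


def cos(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def positive_score(audio, text):
    """Positive-only methods: cos(audio, text)."""
    return cos(audio, text)


def contrast_score(audio, pos, neg):
    """Pos-neg methods: cos(audio, pos) - cos(audio, neg)."""
    return cos(audio, pos) - cos(audio, neg)


# Hypothetical embeddings (real ones would come from an audio/text encoder).
audio = [1.0, 0.0]
pos_text = [1.0, 1.0]    # e.g. the "natural ..." prompt
neg_text = [-1.0, 1.0]   # e.g. the "robotic ..." prompt

print(positive_score(audio, pos_text))
print(contrast_score(audio, pos_text, neg_text))
```

The contrast form rewards a candidate both for moving toward the positive description and for moving away from the negative one, which is why its range is twice that of the single-prompt form.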

Cross-Method Diminishing Returns Comparison

Method N=5 N=10 N=25 N=50 N=100 Gain N=5→100 Knee Point
Standard: (1−WER) × Content Enjoyment 3.9581 4.0268 4.0754 4.1300 4.1873 +0.2292 N=25
VoiceCLAP-Large × Quality Text 0.9157 0.9398 0.9571 0.9579 0.9734 +0.0577 N=50
VoiceCLAP-Small × Quality Text 0.7554 0.7672 0.7617 0.7889 0.8122 +0.0568 N=25
VoiceCLAP-Large × Prompt Match 1.4017 1.4245 1.4332 1.4464 1.4711 +0.0693 N=25
VoiceCLAP-Small × Prompt Match 0.8592 0.8813 0.8670 0.8846 0.9075 +0.0483 N=25
v1 Natural (Large) 0.9833 1.0110 1.0301 1.0319 1.0473 +0.0639 N=50
v2 Authentic (Large) 1.0174 1.0398 1.0532 1.0589 1.0713 +0.0538 N=25
v3 Professional (Large) 0.8629 0.8870 0.9011 0.9019 0.9169 +0.0540 N=50
v4 Expressive (Large) 1.0302 1.0487 1.0722 1.0744 1.0922 +0.0620 N=50
v5 Cinematic (Large) 0.9959 1.0112 1.0255 1.0311 1.0503 +0.0544 N=25
v6 Natural (Small) 0.7859 0.8138 0.8190 0.8478 0.8632 +0.0773 N=25
v7 Authentic (Small) 0.7514 0.7713 0.7856 0.8068 0.8224 +0.0710 N=100
v8 Professional (Small) 0.6329 0.6443 0.6365 0.6591 0.6732 +0.0403 N=25
v9 Natural−Robotic (Large) 1.7735 1.8105 1.8171 1.8315 1.8589 +0.0854 N=25
v10 Authentic−Cheap (Large) 1.8551 1.8824 1.8978 1.9100 1.9342 +0.0790 N=25
v11 Professional−Distorted (Large) 1.7497 1.7845 1.8056 1.8117 1.8378 +0.0881 N=25
v12 Expressive−Flat (Large) 1.7729 1.7995 1.8278 1.8410 1.8651 +0.0922 N=50
v13 FullPos−FullNeg (Large) 1.7672 1.8010 1.8270 1.8293 1.8584 +0.0912 N=25
v14 Warm−Robotic (Large) 1.6873 1.7219 1.7342 1.7475 1.7749 +0.0876 N=25
v15 Natural−Robotic (Small) 1.8351 1.8862 1.9036 1.9140 1.9449 +0.1098 N=25
v16 Authentic−Cheap (Small) 1.8619 1.9117 1.9390 1.9592 2.0056 +0.1437 N=25
v17 Professional−Distorted (Small) 1.7135 1.7285 1.7501 1.7619 1.7889 +0.0754 N=25
v18 Expressive−Flat (Small) 1.5851 1.6384 1.6706 1.6856 1.7089 +0.1238 N=50
v19 FullPos−FullNeg (Small) 1.6466 1.6811 1.6969 1.7114 1.7327 +0.0861 N=25
v20 Warm−Robotic (Small) 1.7386 1.7669 1.7929 1.8204 1.8502 +0.1115 N=50
v21 Sanitized Prompt (Large) 1.1543 1.1748 1.1891 1.2022 1.2222 +0.0679 N=25
v22 Sanitized Prompt (Small) 0.8341 0.8580 0.8388 0.8582 0.8839 +0.0498 N=25
v23 Sanitized−Uncanny (Large) 1.9557 1.9897 2.0095 2.0236 2.0529 +0.0972 N=25
v24 Sanitized−Uncanny (Small) 1.8150 1.8599 1.8555 1.8901 1.9179 +0.1028 N=25
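
The Gain column is the N=100 value minus the N=5 value. The sketch below recomputes it for three rows copied from the table and adds a relative gain, which is not part of the original report but may help when comparing methods whose rewards live on different scales.

```python
N_VALUES = [5, 10, 25, 50, 100]

# Average best reward per N, copied from three rows of the table above.
AVG_BEST = {
    "Standard": [3.9581, 4.0268, 4.0754, 4.1300, 4.1873],
    "VoiceCLAP-Large x Quality Text": [0.9157, 0.9398, 0.9571, 0.9579, 0.9734],
    "v16 Authentic-Cheap (Small)": [1.8619, 1.9117, 1.9390, 1.9592, 2.0056],
}


def gain(values):
    """Absolute gain from the smallest N to the largest N."""
    return values[-1] - values[0]


def relative_gain(values):
    """Gain as a fraction of the N=5 value (an added, illustrative metric)."""
    return gain(values) / values[0]


for name, values in AVG_BEST.items():
    print(f"{name}: gain={gain(values):+.4f}, relative={relative_gain(values):.1%}")
```

On these rows the absolute gains differ by a factor of about four (+0.2292 vs +0.0577), while the relative gains are much closer (roughly 5.8%, 6.3%, and 7.7%), so the Standard method's larger absolute gain mostly reflects its larger reward scale.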

Diminishing Returns — All Methods Overlaid

[Figure: best reward vs. N candidates (N=5, 10, 25, 50, 100), six panels: Original (5 methods); Positive-only, Large (5); Positive-only, Small (3); Pos−Neg, Large (6); Pos−Neg, Small (6); Sanitized Prompt (4).]

Marginal Improvement per Additional Candidate

Method N=5→10 N=10→25 N=25→50 N=50→100
Standard: (1−WER) × Content Enjoyment 0.01374/cand (1.7%) 0.00324/cand (1.2%) 0.00218/cand (1.3%) 0.00115/cand (1.4%)
VoiceCLAP-Large × Quality Text 0.00484/cand (2.6%) 0.00115/cand (1.8%) 0.00003/cand (0.1%) 0.00031/cand (1.6%)
VoiceCLAP-Small × Quality Text 0.00236/cand (1.6%) -0.00036/cand (-0.7%) 0.00109/cand (3.6%) 0.00047/cand (3.0%)
VoiceCLAP-Large × Prompt Match 0.00456/cand (1.6%) 0.00058/cand (0.6%) 0.00052/cand (0.9%) 0.00049/cand (1.7%)
VoiceCLAP-Small × Prompt Match 0.00442/cand (2.6%) -0.00095/cand (-1.6%) 0.00070/cand (2.0%) 0.00046/cand (2.6%)
v1 Natural (Large) 0.00554/cand (2.8%) 0.00127/cand (1.9%) 0.00007/cand (0.2%) 0.00031/cand (1.5%)
v2 Authentic (Large) 0.00448/cand (2.2%) 0.00090/cand (1.3%) 0.00023/cand (0.5%) 0.00025/cand (1.2%)
v3 Professional (Large) 0.00483/cand (2.8%) 0.00094/cand (1.6%) 0.00003/cand (0.1%) 0.00030/cand (1.7%)
v4 Expressive (Large) 0.00370/cand (1.8%) 0.00157/cand (2.2%) 0.00009/cand (0.2%) 0.00036/cand (1.7%)
v5 Cinematic (Large) 0.00307/cand (1.5%) 0.00095/cand (1.4%) 0.00022/cand (0.5%) 0.00039/cand (1.9%)
v6 Natural (Small) 0.00559/cand (3.6%) 0.00035/cand (0.6%) 0.00115/cand (3.5%) 0.00031/cand (1.8%)
v7 Authentic (Small) 0.00397/cand (2.6%) 0.00096/cand (1.9%) 0.00085/cand (2.7%) 0.00031/cand (1.9%)
v8 Professional (Small) 0.00228/cand (1.8%) -0.00052/cand (-1.2%) 0.00090/cand (3.5%) 0.00028/cand (2.1%)
v9 Natural−Robotic (Large) 0.00741/cand (2.1%) 0.00044/cand (0.4%) 0.00057/cand (0.8%) 0.00055/cand (1.5%)
v10 Authentic−Cheap (Large) 0.00545/cand (1.5%) 0.00103/cand (0.8%) 0.00049/cand (0.6%) 0.00048/cand (1.3%)
v11 Professional−Distorted (Large) 0.00696/cand (2.0%) 0.00141/cand (1.2%) 0.00024/cand (0.3%) 0.00052/cand (1.4%)
v12 Expressive−Flat (Large) 0.00533/cand (1.5%) 0.00188/cand (1.6%) 0.00053/cand (0.7%) 0.00048/cand (1.3%)
v13 FullPos−FullNeg (Large) 0.00676/cand (1.9%) 0.00173/cand (1.4%) 0.00009/cand (0.1%) 0.00058/cand (1.6%)
v14 Warm−Robotic (Large) 0.00692/cand (2.1%) 0.00082/cand (0.7%) 0.00053/cand (0.8%) 0.00055/cand (1.6%)
v15 Natural−Robotic (Small) 0.01022/cand (2.8%) 0.00115/cand (0.9%) 0.00042/cand (0.5%) 0.00062/cand (1.6%)
v16 Authentic−Cheap (Small) 0.00996/cand (2.7%) 0.00182/cand (1.4%) 0.00081/cand (1.0%) 0.00093/cand (2.4%)
v17 Professional−Distorted (Small) 0.00300/cand (0.9%) 0.00144/cand (1.2%) 0.00047/cand (0.7%) 0.00054/cand (1.5%)
v18 Expressive−Flat (Small) 0.01065/cand (3.4%) 0.00215/cand (2.0%) 0.00060/cand (0.9%) 0.00047/cand (1.4%)
v19 FullPos−FullNeg (Small) 0.00689/cand (2.1%) 0.00106/cand (0.9%) 0.00058/cand (0.9%) 0.00043/cand (1.2%)
v20 Warm−Robotic (Small) 0.00565/cand (1.6%) 0.00174/cand (1.5%) 0.00110/cand (1.5%) 0.00060/cand (1.6%)
v21 Sanitized Prompt (Large) 0.00409/cand (1.8%) 0.00095/cand (1.2%) 0.00052/cand (1.1%) 0.00040/cand (1.7%)
v22 Sanitized Prompt (Small) 0.00477/cand (2.9%) -0.00128/cand (-2.2%) 0.00078/cand (2.3%) 0.00051/cand (3.0%)
v23 Sanitized−Uncanny (Large) 0.00680/cand (1.7%) 0.00132/cand (1.0%) 0.00056/cand (0.7%) 0.00059/cand (1.4%)
v24 Sanitized−Uncanny (Small) 0.00896/cand (2.5%) -0.00029/cand (-0.2%) 0.00138/cand (1.9%) 0.00056/cand (1.5%)
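
The entries in this table appear to follow from the average best rewards: the per-candidate figure is the reward difference divided by the number of added candidates, and the percentage is the same difference relative to the reward at the smaller N. A short sketch, checked against the Standard row (other rows may differ in the last digit because the published inputs are rounded):

```python
N_VALUES = [5, 10, 25, 50, 100]
STANDARD_AVG_BEST = [3.9581, 4.0268, 4.0754, 4.1300, 4.1873]


def marginal_improvements(ns, best):
    """Return (n_from, n_to, gain per added candidate, percent gain) per step."""
    steps = []
    for i in range(len(ns) - 1):
        diff = best[i + 1] - best[i]
        per_candidate = diff / (ns[i + 1] - ns[i])
        percent = 100.0 * diff / best[i]
        steps.append((ns[i], ns[i + 1], per_candidate, percent))
    return steps


for n_from, n_to, per_cand, pct in marginal_improvements(N_VALUES, STANDARD_AVG_BEST):
    print(f"N={n_from}->{n_to}: {per_cand:.5f}/cand ({pct:.1f}%)")
```

A negative entry in the table (for example several Small-model rows at N=10→25) means the average best reward at the larger N was lower than at the smaller N.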

Ablation: Pronunciation Suffix Effect

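This ablation compares N=10 candidates generated without the pronunciation suffix against N=10 candidates generated with it. Each delta is the with-suffix value minus the without-suffix value.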

Ranking Method | Without Suffix (N=10): Mean | Best | Median | With Suffix (N=10): Mean | Best | Median | Δ Mean | Δ Best
Standard: (1−WER) × Content Enjoyment | 3.6195 | 3.9607 | 3.6424 | 3.6106 | 4.0308 | 3.6205 | -0.0090 | +0.0702
VoiceCLAP-Large × Quality Text | 0.8444 | 0.9283 | 0.8523 | 0.8329 | 0.9346 | 0.8389 | -0.0114 | +0.0064
VoiceCLAP-Small × Quality Text | 0.6351 | 0.7551 | 0.6386 | 0.6189 | 0.7522 | 0.6206 | -0.0162 | -0.0030
VoiceCLAP-Large × Prompt Match | 1.2870 | 1.4086 | 1.3024 | 1.2778 | 1.4212 | 1.2879 | -0.0092 | +0.0126
VoiceCLAP-Small × Prompt Match | 0.7504 | 0.8416 | 0.7522 | 0.7446 | 0.8701 | 0.7434 | -0.0058 | +0.0285
v1 Natural (Large) | 0.9049 | 1.0007 | 0.9144 | 0.8917 | 1.0024 | 0.8969 | -0.0132 | +0.0017
v2 Authentic (Large) | 0.9343 | 1.0284 | 0.9436 | 0.9242 | 1.0312 | 0.9340 | -0.0101 | +0.0028
v3 Professional (Large) | 0.7943 | 0.8754 | 0.8013 | 0.7844 | 0.8806 | 0.7911 | -0.0099 | +0.0052
v4 Expressive (Large) | 0.9340 | 1.0333 | 0.9391 | 0.9271 | 1.0396 | 0.9385 | -0.0069 | +0.0062
v5 Cinematic (Large) | 0.9045 | 1.0054 | 0.9096 | 0.8966 | 1.0054 | 0.9063 | -0.0079 | -0.0000
v6 Natural (Small) | 0.6983 | 0.8117 | 0.6966 | 0.6881 | 0.8098 | 0.6849 | -0.0102 | -0.0019
v7 Authentic (Small) | 0.6700 | 0.7680 | 0.6691 | 0.6649 | 0.7698 | 0.6667 | -0.0051 | +0.0018
v8 Professional (Small) | 0.5315 | 0.6245 | 0.5312 | 0.5114 | 0.6184 | 0.5143 | -0.0202 | -0.0061
v9 Natural−Robotic (Large) | 1.6313 | 1.7876 | 1.6492 | 1.6122 | 1.7956 | 1.6244 | -0.0190 | +0.0080
v10 Authentic−Cheap (Large) | 1.7022 | 1.8613 | 1.7193 | 1.6859 | 1.8756 | 1.7018 | -0.0163 | +0.0143
v11 Professional−Distorted (Large) | 1.6092 | 1.7664 | 1.6226 | 1.5894 | 1.7739 | 1.6019 | -0.0198 | +0.0076
v12 Expressive−Flat (Large) | 1.6173 | 1.7740 | 1.6316 | 1.6040 | 1.7903 | 1.6215 | -0.0133 | +0.0163
v13 FullPos−FullNeg (Large) | 1.6309 | 1.7793 | 1.6466 | 1.6119 | 1.7919 | 1.6223 | -0.0190 | +0.0126
v14 Warm−Robotic (Large) | 1.5563 | 1.6963 | 1.5731 | 1.5380 | 1.7028 | 1.5489 | -0.0183 | +0.0065
v15 Natural−Robotic (Small) | 1.6823 | 1.8742 | 1.6895 | 1.6675 | 1.8886 | 1.6774 | -0.0148 | +0.0144
v16 Authentic−Cheap (Small) | 1.6965 | 1.8955 | 1.7027 | 1.6917 | 1.9241 | 1.7057 | -0.0048 | +0.0286
v17 Professional−Distorted (Small) | 1.5521 | 1.7208 | 1.5647 | 1.5246 | 1.7014 | 1.5344 | -0.0275 | -0.0195
v18 Expressive−Flat (Small) | 1.4477 | 1.6332 | 1.4492 | 1.4470 | 1.6288 | 1.4661 | -0.0007 | -0.0044
v19 FullPos−FullNeg (Small) | 1.5022 | 1.6746 | 1.5111 | 1.4801 | 1.6635 | 1.4930 | -0.0221 | -0.0111
v20 Warm−Robotic (Small) | 1.5724 | 1.7846 | 1.5727 | 1.5622 | 1.7670 | 1.5683 | -0.0102 | -0.0175
v21 Sanitized Prompt (Large) | 1.0571 | 1.1594 | 1.0678 | 1.0474 | 1.1745 | 1.0524 | -0.0097 | +0.0151
v22 Sanitized Prompt (Small) | 0.7158 | 0.7991 | 0.7177 | 0.7119 | 0.8345 | 0.7118 | -0.0039 | +0.0354
v23 Sanitized−Uncanny (Large) | 1.8004 | 1.9687 | 1.8188 | 1.7835 | 1.9876 | 1.7944 | -0.0170 | +0.0189
v24 Sanitized−Uncanny (Small) | 1.6304 | 1.8131 | 1.6364 | 1.6146 | 1.8390 | 1.6165 | -0.0158 | +0.0259

Per-Prompt Ablation: Standard Reward (N=10)

# | Lang | No Suffix Mean | No Suffix Best | With Suffix Mean | With Suffix Best | Δ Mean | Δ Best
0 | English | 4.4382 | 4.6876 | 4.2446 | 4.8360 | -0.1936 | +0.1484
1 | French | 4.7509 | 4.9606 | 4.5731 | 4.9682 | -0.1778 | +0.0076
2 | English | 0.7624 | 1.2826 | 0.8871 | 1.3244 | +0.1247 | +0.0418
3 | German | 4.7740 | 4.8920 | 4.8034 | 4.9066 | +0.0295 | +0.0146
4 | French | 4.5449 | 4.7755 | 4.4955 | 4.7763 | -0.0493 | +0.0008
5 | French | 3.6400 | 4.2156 | 3.8687 | 4.6396 | +0.2287 | +0.4240
6 | English | 2.9716 | 3.6152 | 3.1176 | 3.6882 | +0.1460 | +0.0730
7 | German | 2.2579 | 2.5002 | 2.2256 | 2.4999 | -0.0323 | -0.0003
8 | Spanish | 4.6049 | 4.8121 | 4.6370 | 4.8220 | +0.0320 | +0.0099
9 | French | 3.4505 | 3.8653 | 3.2531 | 3.8472 | -0.1974 | -0.0181
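
The per-prompt rows above average out to the Standard row of the suffix ablation summary. The sketch below, using only the values from this table, recomputes that summary row and counts how many prompts improved with the suffix. The variable names are illustrative.

```python
from statistics import mean

# (prompt, language, no-suffix mean, no-suffix best, with-suffix mean, with-suffix best)
ROWS = [
    (0, "English", 4.4382, 4.6876, 4.2446, 4.8360),
    (1, "French", 4.7509, 4.9606, 4.5731, 4.9682),
    (2, "English", 0.7624, 1.2826, 0.8871, 1.3244),
    (3, "German", 4.7740, 4.8920, 4.8034, 4.9066),
    (4, "French", 4.5449, 4.7755, 4.4955, 4.7763),
    (5, "French", 3.6400, 4.2156, 3.8687, 4.6396),
    (6, "English", 2.9716, 3.6152, 3.1176, 3.6882),
    (7, "German", 2.2579, 2.5002, 2.2256, 2.4999),
    (8, "Spanish", 4.6049, 4.8121, 4.6370, 4.8220),
    (9, "French", 3.4505, 3.8653, 3.2531, 3.8472),
]


def column(index):
    """Extract one numeric column across all prompts."""
    return [row[index] for row in ROWS]


no_suffix_mean = mean(column(2))
no_suffix_best = mean(column(3))
with_suffix_mean = mean(column(4))
with_suffix_best = mean(column(5))

# Deltas are with-suffix minus no-suffix, matching the table's sign convention.
delta_mean = with_suffix_mean - no_suffix_mean
delta_best = with_suffix_best - no_suffix_best

# How many individual prompts improved with the suffix.
improved_mean = sum(1 for row in ROWS if row[4] > row[2])
improved_best = sum(1 for row in ROWS if row[5] > row[3])

print(f"mean: {no_suffix_mean:.4f} -> {with_suffix_mean:.4f} ({delta_mean:+.4f})")
print(f"best: {no_suffix_best:.4f} -> {with_suffix_best:.4f} ({delta_best:+.4f})")
print(f"prompts improved: mean {improved_mean}/10, best {improved_best}/10")
```

On this data the suffix raises the best reward on 8 of 10 prompts but the mean reward on only 5 of 10, and most of the aggregate Δ Best comes from prompt 5 (+0.4240).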

Standard: (1−WER) × Content Enjoyment — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 3.9581 4.0268 4.0754 4.1300 4.1873
Std Dev 1.2052 1.2232 1.2125 1.2485 1.2126
Avg Mean 3.6381 3.6722 3.5834 3.6057 3.6192

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 4.7697 4.7697 4.8043 4.8319 4.8905
1 French 4.8851 4.9270 5.0117 5.0117 5.0117
2 English 1.2964 1.3096 1.2896 1.2967 1.3108
3 German 4.9759 4.9558 4.9921 4.9518 4.9928
4 French 4.6222 4.7174 4.7755 4.8091 4.8091
5 French 4.1066 4.2291 4.4138 4.7376 4.8423
6 English 3.7682 4.1416 3.7523 4.1416 4.1416
7 German 2.4842 2.4531 2.8076 2.5773 2.9324
8 Spanish 4.8244 4.9292 5.0107 5.0107 5.0107
9 French 3.8486 3.8359 3.8964 3.9315 3.9315
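
The "Avg Best" row can be reproduced by averaging the per-prompt best rewards column by column. The sketch below does this for the Standard method. It also computes a sample standard deviation (n − 1 denominator), which matches the reported 1.2052 at N=5; whether the report used exactly this estimator for every column is an assumption.

```python
from statistics import mean, stdev

N_VALUES = [5, 10, 25, 50, 100]

# Per-prompt best reward for the Standard method, one row per prompt (0-9).
PER_PROMPT_BEST = [
    [4.7697, 4.7697, 4.8043, 4.8319, 4.8905],
    [4.8851, 4.9270, 5.0117, 5.0117, 5.0117],
    [1.2964, 1.3096, 1.2896, 1.2967, 1.3108],
    [4.9759, 4.9558, 4.9921, 4.9518, 4.9928],
    [4.6222, 4.7174, 4.7755, 4.8091, 4.8091],
    [4.1066, 4.2291, 4.4138, 4.7376, 4.8423],
    [3.7682, 4.1416, 3.7523, 4.1416, 4.1416],
    [2.4842, 2.4531, 2.8076, 2.5773, 2.9324],
    [4.8244, 4.9292, 5.0107, 5.0107, 5.0107],
    [3.8486, 3.8359, 3.8964, 3.9315, 3.9315],
]

# Transpose so that each entry holds the ten prompt values for one N.
columns = list(zip(*PER_PROMPT_BEST))

avg_best = [mean(col) for col in columns]
std_best = [stdev(col) for col in columns]  # sample std, n - 1 denominator

for n, avg, std in zip(N_VALUES, avg_best, std_best):
    print(f"N={n}: avg best={avg:.4f}, std={std:.4f}")
```

Some per-prompt values drop as N grows (prompt 6 goes from 4.1416 at N=10 to 3.7523 at N=25). A maximum over nested candidate sets cannot decrease, so the candidate subsets for different N are presumably drawn independently rather than nested.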

VoiceCLAP-Large × Quality Text — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 0.9157 0.9398 0.9571 0.9579 0.9734
Std Dev 0.2876 0.2867 0.2740 0.2800 0.2735
Avg Mean 0.8492 0.8551 0.8320 0.8389 0.8418

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 1.1801 1.1960 1.2199 1.2072 1.2199
1 French 1.1609 1.1651 1.1736 1.1746 1.1787
2 English 0.3421 0.3426 0.3644 0.3644 0.3644
3 German 1.1194 1.1489 1.1363 1.1489 1.1489
4 French 1.2584 1.2616 1.2747 1.2612 1.2747
5 French 0.8236 0.8705 0.8963 0.9109 0.9599
6 English 0.8797 0.9655 0.8966 0.9655 0.9655
7 German 0.5779 0.6083 0.7006 0.6249 0.7006
8 Spanish 0.8715 0.8787 0.9214 0.9234 0.9234
9 French 0.9430 0.9612 0.9875 0.9977 0.9977

VoiceCLAP-Small × Quality Text — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 0.7554 0.7672 0.7617 0.7889 0.8122
Std Dev 0.2944 0.2925 0.2815 0.2918 0.2915
Avg Mean 0.6519 0.6468 0.6185 0.6304 0.6341

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 0.7848 0.7696 0.8288 0.8288 0.8588
1 French 0.9654 1.0323 0.9752 1.0323 1.0323
2 English 0.2507 0.2656 0.2666 0.2629 0.2680
3 German 1.0764 1.0613 1.1007 1.0541 1.1007
4 French 1.1490 1.1624 1.1073 1.1721 1.1721
5 French 0.4992 0.5699 0.5871 0.6379 0.6379
6 English 0.7262 0.8032 0.7391 0.8032 0.8714
7 German 0.4466 0.4526 0.4866 0.4358 0.4866
8 Spanish 0.6695 0.5961 0.5669 0.6568 0.6695
9 French 0.9860 0.9590 0.9590 1.0047 1.0246

VoiceCLAP-Large × Prompt Match — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 1.4017 1.4245 1.4332 1.4464 1.4711
Std Dev 0.4496 0.4485 0.4446 0.4537 0.4416
Avg Mean 1.2874 1.3003 1.2653 1.2744 1.2805

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 1.7400 1.7779 1.7687 1.7775 1.7779
1 French 1.8340 1.8423 1.8407 1.8427 1.8553
2 English 0.4610 0.4587 0.4610 0.4610 0.4610
3 German 1.8086 1.8357 1.8307 1.8357 1.8357
4 French 1.5970 1.5916 1.5846 1.5878 1.5970
5 French 1.5405 1.5608 1.6132 1.6831 1.7237
6 English 1.3109 1.3858 1.3323 1.3858 1.4119
7 German 0.8211 0.8590 0.9034 0.8673 0.9943
8 Spanish 1.6251 1.6041 1.6447 1.6522 1.6741
9 French 1.2792 1.3295 1.3530 1.3704 1.3797

VoiceCLAP-Small × Prompt Match — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 0.8592 0.8813 0.8670 0.8846 0.9075
Std Dev 0.2891 0.2970 0.2800 0.2953 0.2898
Avg Mean 0.7620 0.7667 0.7428 0.7468 0.7500

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 1.2965 1.3337 1.3187 1.3187 1.3337
1 French 0.8539 0.8664 0.8710 0.8710 0.8710
2 English 0.3242 0.3066 0.3013 0.3037 0.3242
3 German 1.1672 1.1607 1.1445 1.1830 1.1879
4 French 0.8727 0.9134 0.9120 0.9210 0.9293
5 French 0.9042 0.9781 0.9781 0.9615 1.0058
6 English 1.1224 1.1348 0.9805 1.1348 1.1678
7 German 0.5815 0.6115 0.6267 0.6144 0.7063
8 Spanish 0.7488 0.7396 0.7697 0.7697 0.7809
9 French 0.7203 0.7679 0.7679 0.7679 0.7679

v1 Natural (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 0.9833 1.0110 1.0301 1.0319 1.0473
Std Dev 0.2930 0.2892 0.2820 0.2897 0.2820
Avg Mean 0.9095 0.9154 0.8904 0.8967 0.9003

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 1.2999 1.3086 1.3218 1.3206 1.3218
1 French 1.2314 1.2332 1.2605 1.2605 1.2605
2 English 0.3951 0.3878 0.4029 0.4029 0.4029
3 German 1.0973 1.1184 1.1089 1.1248 1.1248
4 French 1.3264 1.3264 1.3393 1.3286 1.3393
5 French 0.9290 0.9829 1.0206 1.0278 1.0844
6 English 0.9207 1.0350 0.9784 1.0350 1.0407
7 German 0.6350 0.6844 0.7618 0.6844 0.7618
8 Spanish 0.9463 0.9553 0.9966 1.0209 1.0209
9 French 1.0520 1.0780 1.1103 1.1140 1.1154

v2 Authentic (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 1.0174 1.0398 1.0532 1.0589 1.0713
Std Dev 0.2791 0.2804 0.2748 0.2857 0.2760
Avg Mean 0.9382 0.9451 0.9201 0.9268 0.9302

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 1.1549 1.1756 1.1729 1.1752 1.1756
1 French 1.2840 1.2896 1.2984 1.2984 1.2984
2 English 0.4005 0.3968 0.4061 0.4061 0.4061
3 German 1.1687 1.1819 1.1812 1.1937 1.1937
4 French 1.3010 1.3022 1.3179 1.3179 1.3179
5 French 1.0370 1.0809 1.1116 1.1435 1.1790
6 English 0.9344 1.0569 0.9639 1.0569 1.0569
7 German 0.6940 0.7049 0.7912 0.7038 0.7915
8 Spanish 1.1312 1.1358 1.1862 1.1880 1.1880
9 French 1.0684 1.0735 1.1030 1.1054 1.1054

v3 Professional (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 0.8629 0.8870 0.9011 0.9019 0.9169
Std Dev 0.2646 0.2657 0.2498 0.2591 0.2508
Avg Mean 0.7978 0.8034 0.7831 0.7890 0.7916

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 1.0788 1.1150 1.1273 1.1202 1.1351
1 French 1.0938 1.0996 1.1037 1.1057 1.1057
2 English 0.3301 0.3313 0.3570 0.3570 0.3570
3 German 1.0806 1.1039 1.0909 1.1052 1.1052
4 French 1.1657 1.1606 1.1726 1.1623 1.1726
5 French 0.7691 0.8260 0.8462 0.8519 0.8941
6 English 0.8495 0.9323 0.8728 0.9323 0.9323
7 German 0.5526 0.5725 0.6599 0.5784 0.6599
8 Spanish 0.8247 0.8339 0.8619 0.8796 0.8796
9 French 0.8837 0.8952 0.9183 0.9264 0.9273

v4 Expressive (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 1.0302 1.0487 1.0722 1.0744 1.0922
Std Dev 0.2973 0.2907 0.2866 0.3061 0.2913
Avg Mean 0.9373 0.9471 0.9224 0.9273 0.9308

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 1.2845 1.2997 1.3146 1.3146 1.3146
1 French 1.2896 1.2632 1.3303 1.3303 1.3303
2 English 0.3983 0.4149 0.4080 0.4110 0.4176
3 German 1.0538 1.0646 1.0646 1.0721 1.0779
4 French 1.2563 1.2671 1.2632 1.2867 1.2867
5 French 1.1055 1.1381 1.1962 1.2255 1.2476
6 English 1.1798 1.2267 1.1692 1.2267 1.2345
7 German 0.6153 0.6578 0.7594 0.6548 0.7594
8 Spanish 1.1322 1.1394 1.1824 1.1824 1.1851
9 French 0.9865 1.0151 1.0344 1.0401 1.0685

v5 Cinematic (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 0.9959 1.0112 1.0255 1.0311 1.0503
Std Dev 0.2751 0.2765 0.2611 0.2823 0.2707
Avg Mean 0.9071 0.9145 0.8905 0.8963 0.8999

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 1.0962 1.1126 1.1399 1.1399 1.1399
1 French 1.2493 1.2514 1.2545 1.2526 1.2545
2 English 0.3975 0.3952 0.3969 0.4072 0.4072
3 German 1.1020 1.1141 1.1141 1.1123 1.1141
4 French 1.2352 1.2321 1.2387 1.2437 1.2507
5 French 1.1387 1.1553 1.1818 1.2361 1.2825
6 English 1.0950 1.1501 1.0561 1.1501 1.1501
7 German 0.6107 0.6376 0.7574 0.6362 0.7574
8 Spanish 1.0335 1.0103 1.0450 1.0478 1.0478
9 French 1.0011 1.0538 1.0702 1.0846 1.0991

v6 Natural (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 0.7859 0.8138 0.8190 0.8478 0.8632
Std Dev 0.2624 0.2598 0.2469 0.2604 0.2545
Avg Mean 0.7038 0.7016 0.6778 0.6870 0.6902

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 0.5855 0.6011 0.6720 0.6980 0.7086
1 French 1.0678 1.1142 1.1085 1.1323 1.1323
2 English 0.3018 0.2919 0.3094 0.3111 0.3112
3 German 1.0163 0.9880 1.0250 1.0267 1.0267
4 French 1.0540 1.0493 1.0521 1.0951 1.0951
5 French 0.5972 0.7371 0.6884 0.7579 0.7579
6 English 0.7692 0.9272 0.7952 0.9272 0.9744
7 German 0.5935 0.5827 0.6630 0.5827 0.6653
8 Spanish 0.8304 0.8293 0.8474 0.9041 0.9041
9 French 1.0429 1.0175 1.0291 1.0429 1.0563

v7 Authentic (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 0.7514 0.7713 0.7856 0.8068 0.8224
Std Dev 0.2323 0.2361 0.2282 0.2368 0.2263
Avg Mean 0.6740 0.6731 0.6505 0.6594 0.6620

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 0.6015 0.6025 0.6906 0.6929 0.7054
1 French 1.0396 1.0476 1.0728 1.0733 1.0733
2 English 0.3161 0.3116 0.3160 0.3206 0.3267
3 German 0.9578 0.9621 0.9861 0.9854 0.9868
4 French 0.9614 0.9640 0.9895 1.0108 1.0108
5 French 0.6842 0.7927 0.7239 0.8119 0.8119
6 English 0.6189 0.8008 0.7064 0.8008 0.8349
7 German 0.5529 0.4810 0.5864 0.5201 0.6010
8 Spanish 0.8499 0.8258 0.8599 0.9206 0.9206
9 French 0.9321 0.9249 0.9249 0.9321 0.9527

v8 Professional (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 0.6329 0.6443 0.6365 0.6591 0.6732
Std Dev 0.2393 0.2342 0.2323 0.2318 0.2384
Avg Mean 0.5460 0.5425 0.5196 0.5284 0.5315

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 0.8298 0.7883 0.8568 0.8568 0.8672
1 French 0.7508 0.8219 0.7614 0.8219 0.8219
2 English 0.2313 0.2374 0.2399 0.2361 0.2407
3 German 0.8619 0.8717 0.8988 0.8493 0.8988
4 French 0.9454 0.9521 0.9063 0.9311 0.9521
5 French 0.3799 0.4489 0.4533 0.4821 0.4821
6 English 0.6101 0.6875 0.6297 0.6875 0.6930
7 German 0.3956 0.4062 0.4121 0.3958 0.4187
8 Spanish 0.5416 0.4901 0.4684 0.5479 0.5479
9 French 0.7823 0.7387 0.7387 0.7823 0.8095

v9 Natural−Robotic (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 1.7735 1.8105 1.8171 1.8315 1.8589
Std Dev 0.5023 0.5011 0.4863 0.5056 0.4894
Avg Mean 1.6346 1.6513 1.6035 1.6155 1.6223

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 2.1856 2.1856 2.1715 2.1856 2.1856
1 French 2.1406 2.1531 2.1525 2.1609 2.1609
2 English 0.6243 0.6164 0.6020 0.6243 0.6243
3 German 2.0682 2.0936 2.0832 2.0983 2.0983
4 French 2.2262 2.2262 2.2208 2.2298 2.2298
5 French 1.8148 1.9070 1.8824 1.9578 2.0445
6 English 1.6756 1.8555 1.7410 1.8555 1.8608
7 German 1.2134 1.2666 1.4427 1.2666 1.4427
8 Spanish 1.8958 1.9025 1.9592 2.0257 2.0257
9 French 1.8901 1.8987 1.9159 1.9103 1.9159

v10 Authentic−Cheap (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 1.8551 1.8824 1.8978 1.9100 1.9342
Std Dev 0.5162 0.5218 0.5084 0.5348 0.5142
Avg Mean 1.7055 1.7213 1.6731 1.6859 1.6929

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 1.9805 1.9805 1.9704 1.9805 1.9805
1 French 2.3129 2.3257 2.3353 2.3328 2.3405
2 English 0.6722 0.6662 0.6771 0.6722 0.6771
3 German 2.1338 2.1609 2.1511 2.1638 2.1638
4 French 2.3307 2.3370 2.3550 2.3550 2.3550
5 French 2.0096 2.0885 2.0934 2.2069 2.2325
6 English 1.7113 1.8737 1.7147 1.8737 1.8737
7 German 1.2790 1.2744 1.4621 1.2790 1.4680
8 Spanish 2.1274 2.1179 2.1960 2.2225 2.2225
9 French 1.9938 1.9988 2.0233 2.0136 2.0281

v11 Professional−Distorted (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 1.7497 1.7845 1.8056 1.8117 1.8378
Std Dev 0.5001 0.5083 0.4809 0.5052 0.4858
Avg Mean 1.6125 1.6263 1.5829 1.5948 1.6007

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 2.1368 2.1826 2.1965 2.1965 2.1965
1 French 2.1929 2.1956 2.2158 2.2158 2.2158
2 English 0.6719 0.6721 0.6976 0.6976 0.6976
3 German 1.9887 2.0254 2.0132 2.0317 2.0317
4 French 2.2817 2.2955 2.2954 2.2916 2.3069
5 French 1.7478 1.8559 1.8564 1.9257 1.9956
6 English 1.7129 1.8341 1.7216 1.8341 1.8341
7 German 1.1242 1.1291 1.3234 1.1479 1.3234
8 Spanish 1.8275 1.8163 1.8733 1.8899 1.8899
9 French 1.8124 1.8381 1.8629 1.8865 1.8865

v12 Expressive−Flat (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 1.7729 1.7995 1.8278 1.8410 1.8651
Std Dev 0.5165 0.5108 0.5006 0.5287 0.5120
Avg Mean 1.6170 1.6382 1.5944 1.6033 1.6100

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 2.2023 2.2023 2.2302 2.2302 2.2302
1 French 2.2038 2.1729 2.2349 2.2349 2.2349
2 English 0.5963 0.6130 0.6042 0.6210 0.6210
3 German 1.8879 1.9135 1.9238 1.9245 1.9448
4 French 2.1078 2.1368 2.1097 2.1339 2.1368
5 French 1.9549 1.9768 2.0464 2.1254 2.1770
6 English 1.9250 2.0643 1.9352 2.0643 2.0643
7 German 1.1267 1.1891 1.3549 1.1891 1.3549
8 Spanish 1.9866 1.9861 2.0638 2.1036 2.1036
9 French 1.7373 1.7403 1.7747 1.7833 1.7833

v13 FullPos−FullNeg (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 1.7672 1.8010 1.8270 1.8293 1.8584
Std Dev 0.5271 0.5227 0.5027 0.5267 0.5071
Avg Mean 1.6334 1.6490 1.6037 1.6153 1.6220

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 2.1537 2.1657 2.1748 2.1680 2.1748
1 French 2.2928 2.2831 2.3125 2.3125 2.3125
2 English 0.6100 0.6210 0.6349 0.6349 0.6349
3 German 2.0169 2.0474 2.0468 2.0480 2.0480
4 French 2.2687 2.2687 2.2729 2.2721 2.2729
5 French 1.8081 1.9001 1.9082 1.9826 2.0542
6 English 1.6926 1.8334 1.7388 1.8334 1.8515
7 German 1.1340 1.1675 1.3575 1.1729 1.3575
8 Spanish 1.8263 1.8329 1.9110 1.9583 1.9583
9 French 1.8690 1.8904 1.9124 1.9102 1.9195

v14 Warm−Robotic (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 1.6873 1.7219 1.7342 1.7475 1.7749
Std Dev 0.5051 0.5079 0.4916 0.5060 0.4943
Avg Mean 1.5595 1.5758 1.5304 1.5430 1.5492

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 2.0916 2.1012 2.1366 2.1164 2.1366
1 French 2.0784 2.0871 2.0954 2.0871 2.0980
2 English 0.5567 0.5437 0.5501 0.5604 0.5604
3 German 2.0835 2.1383 2.1268 2.1258 2.1383
4 French 2.1315 2.1315 2.1200 2.1369 2.1369
5 French 1.6906 1.7761 1.7486 1.8316 1.9095
6 English 1.6091 1.7514 1.6514 1.7712 1.7712
7 German 1.1128 1.1680 1.3346 1.1823 1.3346
8 Spanish 1.7799 1.7860 1.8399 1.8920 1.8920
9 French 1.7389 1.7359 1.7390 1.7711 1.7711

v15 Natural−Robotic (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 1.8351 1.8862 1.9036 1.9140 1.9449
Std Dev 0.5423 0.5520 0.5264 0.5606 0.5325
Avg Mean 1.6816 1.6968 1.6466 1.6609 1.6680

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 1.8539 1.9471 2.0145 2.0050 2.0303
1 French 2.4311 2.4143 2.4497 2.4497 2.4497
2 English 0.6333 0.6405 0.6522 0.6395 0.6522
3 German 2.2820 2.2947 2.2820 2.3039 2.3039
4 French 2.3169 2.3692 2.3504 2.3808 2.3836
5 French 1.8340 1.9902 1.9196 2.0329 2.0529
6 English 1.7990 1.9793 1.8037 1.9799 2.0081
7 German 1.2372 1.2136 1.4509 1.2305 1.4509
8 Spanish 1.9021 1.9495 1.9770 1.9818 1.9818
9 French 2.0619 2.0640 2.1355 2.1355 2.1355

v16 Authentic−Cheap (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 1.8619 1.9117 1.9390 1.9592 2.0056
Std Dev 0.5453 0.5732 0.5491 0.5795 0.5567
Avg Mean 1.6898 1.7107 1.6606 1.6750 1.6808

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 1.7439 1.7439 1.8753 1.8392 1.8753
1 French 2.5063 2.4778 2.5432 2.5602 2.5602
2 English 0.6713 0.6431 0.6527 0.6511 0.6713
3 German 2.3565 2.4256 2.3913 2.4270 2.4270
4 French 2.2102 2.2980 2.3179 2.2980 2.3311
5 French 2.0216 2.1406 2.1085 2.2248 2.3031
6 English 1.6249 1.9050 1.7063 1.9050 1.9050
7 German 1.3323 1.2505 1.5216 1.3198 1.5521
8 Spanish 2.1459 2.1939 2.1939 2.2821 2.2821
9 French 2.0060 2.0385 2.0792 2.0846 2.1484

v17 Professional−Distorted (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 1.7135 1.7285 1.7501 1.7619 1.7889
Std Dev 0.5271 0.5330 0.5180 0.5343 0.5170
Avg Mean 1.5574 1.5689 1.5235 1.5364 1.5427

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 2.0428 2.1051 2.1361 2.1361 2.1361
1 French 2.1708 2.1759 2.1575 2.1759 2.1759
2 English 0.5492 0.5412 0.5524 0.5524 0.5524
3 German 2.0547 2.0408 2.1070 2.0623 2.1070
4 French 2.2691 2.2691 2.2652 2.2691 2.2691
5 French 1.7026 1.6815 1.7862 1.8018 1.8460
6 English 1.6761 1.8117 1.6513 1.8504 1.8504
7 German 1.0969 1.1115 1.2562 1.1102 1.2815
8 Spanish 1.7161 1.6713 1.7120 1.7835 1.7835
9 French 1.8571 1.8772 1.8772 1.8772 1.8873

v18 Expressive−Flat (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 1.5851 1.6384 1.6706 1.6856 1.7089
Std Dev 0.4835 0.4827 0.4739 0.5027 0.4877
Avg Mean 1.4512 1.4592 1.4309 1.4359 1.4424

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 2.0046 2.0648 2.0412 2.0535 2.0648
1 French 1.8388 1.8060 1.9466 1.9466 1.9466
2 English 0.5136 0.5190 0.5264 0.5160 0.5424
3 German 1.7483 1.8018 1.8018 1.8250 1.8311
4 French 1.8219 1.8401 1.8869 1.9158 1.9565
5 French 1.7471 1.7990 1.8520 1.9030 1.9459
6 English 1.6545 1.8510 1.7710 1.8510 1.8510
7 German 0.9641 1.0629 1.1792 1.0735 1.1792
8 Spanish 2.0441 2.0416 2.0575 2.1107 2.1107
9 French 1.5140 1.5975 1.6431 1.6611 1.6611

v19 FullPos−FullNeg (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 1.6466 1.6811 1.6969 1.7114 1.7327
Std Dev 0.4764 0.4812 0.4669 0.4886 0.4652
Avg Mean 1.5050 1.5151 1.4714 1.4848 1.4912

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 1.8603 1.9313 1.9711 1.9711 1.9711
1 French 2.0814 2.1099 2.1060 2.1144 2.1144
2 English 0.5750 0.5750 0.5849 0.5849 0.5936
3 German 2.0124 1.9963 2.0124 2.0036 2.0124
4 French 2.0640 2.0905 2.1043 2.1005 2.1043
5 French 1.6111 1.7276 1.7058 1.8213 1.8213
6 English 1.5820 1.7505 1.5740 1.7554 1.7682
7 German 1.0959 1.1093 1.2826 1.1035 1.2826
8 Spanish 1.7775 1.7481 1.8525 1.8525 1.8525
9 French 1.8066 1.7723 1.7755 1.8066 1.8066

v20 Warm−Robotic (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 1.7386 1.7669 1.7929 1.8204 1.8502
Std Dev 0.5543 0.5619 0.5456 0.5791 0.5598
Avg Mean 1.5798 1.5912 1.5472 1.5586 1.5654

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 2.0664 2.1809 2.1813 2.2174 2.2174
1 French 2.2268 2.1725 2.2305 2.2305 2.2305
2 English 0.5336 0.5565 0.5489 0.5565 0.5565
3 German 2.1818 2.1907 2.1988 2.2074 2.2089
4 French 2.2176 2.2926 2.2485 2.3549 2.3549
5 French 1.8035 1.7843 1.8619 1.8840 2.0068
6 English 1.6774 1.8177 1.6886 1.9310 1.9310
7 German 1.0253 1.0258 1.2075 1.0494 1.2075
8 Spanish 1.7583 1.7091 1.7362 1.7454 1.7609
9 French 1.8956 1.9385 2.0272 2.0272 2.0272

v21 Sanitized Prompt (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 1.1543 1.1748 1.1891 1.2022 1.2222
Std Dev 0.3555 0.3527 0.3561 0.3626 0.3536
Avg Mean 1.0578 1.0693 1.0393 1.0461 1.0510

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 1.5629 1.5640 1.5809 1.5806 1.5893
1 French 1.4593 1.4519 1.4899 1.4899 1.4899
2 English 0.4213 0.4261 0.4235 0.4306 0.4306
3 German 1.3409 1.3787 1.3765 1.3740 1.3787
4 French 1.2723 1.2629 1.2645 1.2722 1.2723
5 French 1.3110 1.3130 1.3874 1.4408 1.4715
6 English 1.2032 1.3173 1.2355 1.3173 1.3389
7 German 0.6458 0.6843 0.7132 0.6967 0.7887
8 Spanish 1.2175 1.2031 1.2397 1.2397 1.2560
9 French 1.1087 1.1463 1.1797 1.1797 1.2061

v22 Sanitized Prompt (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 0.8341 0.8580 0.8388 0.8582 0.8839
Std Dev 0.2808 0.2942 0.2797 0.2897 0.2913
Avg Mean 0.7333 0.7313 0.7077 0.7131 0.7175

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 1.3197 1.3575 1.3366 1.3313 1.3575
1 French 0.8224 0.8534 0.8511 0.8534 0.8534
2 English 0.3282 0.3110 0.3051 0.3051 0.3282
3 German 1.0338 1.0312 1.0465 1.0429 1.0612
4 French 0.7982 0.7982 0.7899 0.8270 0.8270
5 French 0.8852 0.9786 0.9786 0.9649 1.0212
6 English 1.1193 1.1572 1.0097 1.1572 1.1984
7 German 0.5814 0.6045 0.6086 0.6100 0.6846
8 Spanish 0.7799 0.7776 0.7509 0.7799 0.7971
9 French 0.6730 0.7106 0.7106 0.7106 0.7106

v23 Sanitized−Uncanny (Large) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 1.9557 1.9897 2.0095 2.0236 2.0529
Std Dev 0.5888 0.5896 0.5771 0.6004 0.5797
Avg Mean 1.7995 1.8210 1.7690 1.7814 1.7894

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 2.5445 2.5445 2.5636 2.5636 2.5663
1 French 2.4483 2.4475 2.4774 2.4774 2.4774
2 English 0.6573 0.6668 0.6650 0.6688 0.6688
3 German 2.2994 2.3630 2.3319 2.3630 2.3630
4 French 2.2450 2.2317 2.2406 2.2445 2.2450
5 French 2.1403 2.2009 2.2567 2.3476 2.3945
6 English 1.9472 2.1291 1.9977 2.1291 2.1600
7 German 1.1948 1.2248 1.3637 1.2438 1.4349
8 Spanish 2.1278 2.1085 2.1760 2.1760 2.1965
9 French 1.9521 1.9800 2.0227 2.0227 2.0227

v24 Sanitized−Uncanny (Small) — Detailed Statistics

Expected Best Reward by N (averaged across all prompts)

N=5 N=10 N=25 N=50 N=100
Avg Best 1.8150 1.8599 1.8555 1.8901 1.9179
Std Dev 0.5338 0.5649 0.5413 0.5632 0.5388
Avg Mean 1.6325 1.6488 1.6062 1.6150 1.6212

Per-Prompt Best Reward by N

# Lang N=5 N=10 N=25 N=50 N=100
0 English 2.4921 2.5163 2.5123 2.5232 2.5232
1 French 2.1357 2.1380 2.1643 2.1643 2.1643
2 English 0.6260 0.5819 0.5762 0.6093 0.6260
3 German 2.2411 2.2763 2.2315 2.3090 2.3090
4 French 1.9493 2.0650 2.0665 2.0665 2.0756
5 French 1.9681 2.0667 2.1180 2.1521 2.1806
6 English 2.0144 2.1946 1.9419 2.1946 2.1946
7 German 1.2632 1.2862 1.4201 1.3081 1.5316
8 Spanish 1.7937 1.7579 1.8082 1.8579 1.8579
9 French 1.6668 1.7157 1.7157 1.7157 1.7157

Prompt #0 — English (Silicon Valley accent)

Language: English
Accent: Silicon Valley accent
Scored: 100/100
DramaBox Prompt
A young woman, possessing an extremely high fundamental frequency and bright, delicate harmonic texture, with a brisk, elevated momentum and a Silicon Valley accent; this is a pristine, high-quality studio voice recording with no background noise. She delivers the lines with a teasing lightness that occasionally borders on nervous energy, punctuated by small moments of genuine relief. (A brief, high-pitched Giggle escapes as she begins.) "Honestly, you think finding a solid Firestone review is that hard? Boggle, really. But look, that Lys thing actually worked." (She pauses, a subtle Contemplation washing over her features, then manages a slight, contained Chuckle.) "Just wait, I'll show you." She concludes with a soft, almost satisfied sigh, allowing the tension to dissipate.

Prompt #1 — French

Language: French
Scored: 100/100
DramaBox Prompt
High-pitched, delicately resonant, and possessing the slightly strained clarity of a young adult female soprano; the voice is bright and purely head-dominant, engineered for intimate projection.

Pauses briefly, gathering strength. "Malgré la profondeur de cette sombre forêt, je sens toujours cette confiance absolue en mon chemin, guidée par la lumière."
A slight, almost imperceptible hardening of tone. "Même au cœur de cette nuit insondable, ma boussole intérieure me montre la seule direction véritable."
She finishes, a note of unwavering certainty settling.

The pace remains glacially slow throughout the utterance. The delivery conveys immense, quiet self-assurance.