Thesis Measurement Spec
This page specifies how the thesis metrics are computed from logged artifacts and how they are interpreted. Use it together with:
Specification
Data sources (logged artifacts only)
- steps.parquet
- area_steps.parquet
- agents.parquet
- votes.parquet
- meta.yaml
- static.json
Primary time-series definitions
- turnout_pct_t from steps.turnout (0..100)
- gini_assets_t from steps.gini_index (0..100)
- mean_dissatisfaction_t from mean of agents.dissatisfaction_value by step
- gini_dissatisfaction_t from stepwise Gini over agents.dissatisfaction_value (0..100)
- quality_distance_t from eligible-weighted area quality distance by step
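As an illustration of the stepwise Gini series, here is a minimal sketch. It assumes the standard population Gini estimator (no small-sample correction), non-negative values, and NaN for an empty step or a zero total; none of these choices are stated in this spec. The function names and the row layout (dicts with step and dissatisfaction_value keys) are hypothetical.

```python
import math

def gini_0_100(values):
    """Population Gini coefficient of non-negative values, scaled to 0..100.

    Assumption: NaN for empty input or zero total (not specified in the spec).
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return math.nan
    # Rank-weighted sum over ascending values (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 100.0 * (2.0 * weighted / (n * total) - (n + 1) / n)

def gini_dissatisfaction_by_step(agent_rows):
    """Group hypothetical agents rows by step and compute one Gini per step."""
    by_step = {}
    for row in agent_rows:
        by_step.setdefault(row["step"], []).append(row["dissatisfaction_value"])
    return {step: gini_0_100(vals) for step, vals in sorted(by_step.items())}

rows = [
    {"step": 0, "dissatisfaction_value": 1.0},
    {"step": 0, "dissatisfaction_value": 1.0},
    {"step": 1, "dissatisfaction_value": 0.0},
    {"step": 1, "dissatisfaction_value": 2.0},
]
print(gini_dissatisfaction_by_step(rows))
```

Equal values give 0, and concentrating the whole total in one of n agents gives 100 * (n - 1) / n under this estimator.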
Mode-aware quality distance source:
- if quality_target_mode = puzzle: per-area value is area_steps.puzzle_distance
- if quality_target_mode = reality: per-area value is area_steps.dist_to_reality
Current weighted aggregation for quality_distance_t:
- numerator: sum_a quality_distance(a,t) * eligible_voters(a,t)
- denominator: sum_a eligible_voters(a,t)
- if denominator is zero: NaN
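The mode-aware source and the weighted aggregation can be sketched together as follows. This is an illustration under assumptions, not the project's implementation: the function name, the MODE_COLUMN mapping, and the row layout (one dict per area for a single step, with an eligible_voters key) are hypothetical.

```python
import math

# Hypothetical mapping from quality_target_mode to the area_steps column
# named in the spec.
MODE_COLUMN = {"puzzle": "puzzle_distance", "reality": "dist_to_reality"}

def quality_distance_t(area_rows, mode):
    """Eligible-weighted mean of per-area quality distance for one step.

    area_rows: list of dicts for a single step, one per area (assumed layout).
    Returns NaN when the summed eligible_voters weight is zero.
    """
    column = MODE_COLUMN[mode]
    numerator = sum(row[column] * row["eligible_voters"] for row in area_rows)
    denominator = sum(row["eligible_voters"] for row in area_rows)
    if denominator == 0:
        return math.nan
    return numerator / denominator

rows = [
    {"puzzle_distance": 0.2, "dist_to_reality": 0.5, "eligible_voters": 10},
    {"puzzle_distance": 0.4, "dist_to_reality": 0.1, "eligible_voters": 30},
]
print(quality_distance_t(rows, "puzzle"))   # (0.2*10 + 0.4*30) / 40
print(quality_distance_t(rows, "reality"))  # (0.5*10 + 0.1*30) / 40
```

Areas with zero eligible voters contribute nothing to either sum, so they only matter when every area is empty, in which case the result is NaN.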
Run-level summaries (summary_stats.json)
global_summary keys:
- turnout_mean, turnout_final
- turnout_volatility
- gini_assets_mean, gini_assets_final
- gini_assets_volatility
- gini_dissatisfaction_mean, gini_dissatisfaction_final
- gini_dissatisfaction_volatility
- mean_dissatisfaction_mean, mean_dissatisfaction_final
- quality_distance_mean, quality_distance_final
- quality_distance_volatility
- diversity_entropy_mean, diversity_entropy_final
Volatility definition (adjacent-step):
- step_volatility_l1(x) = mean_t |x_t - x_{t-1}| over finite adjacent pairs
- normalization:
  - turnout/gini series: divide by 100 (series are 0..100)
  - distance series (quality_distance): divide by 1
- no clamping is applied in the formula layer
- if no finite adjacent pair exists: NaN
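A minimal sketch of the adjacent-step volatility, assuming that a "finite adjacent pair" means both x_{t-1} and x_t are finite, and that normalization is a division by a scale argument (100 for turnout/gini series, 1 for distance series). The scale parameter name is hypothetical.

```python
import math

def step_volatility_l1(series, scale=1.0):
    """Mean absolute adjacent-step change over finite pairs, divided by scale.

    Returns NaN when no adjacent pair has two finite values. No clamping.
    """
    diffs = [
        abs(curr - prev)
        for prev, curr in zip(series, series[1:])
        if math.isfinite(prev) and math.isfinite(curr)
    ]
    if not diffs:
        return math.nan
    return (sum(diffs) / len(diffs)) / scale

turnout = [40.0, 50.0, math.nan, 70.0, 60.0]
# Finite pairs are (40, 50) and (70, 60); both pairs touching NaN are skipped.
print(step_volatility_l1(turnout, scale=100.0))
```

In this example the two usable pairs each change by 10 points, so the normalized volatility is about 0.1.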
Secondary descriptive metrics
- mean_altruism_t from steps.mean_altruism (mechanism diagnostic; non-confirmatory)
- diversity_first_choice_entropy_t from votes.rank_1_option_id
- dist_to_ref_* benchmark trajectories (analysis artifacts; not runtime schema columns)
  - dist_to_ref_utilitarian
  - dist_to_ref_nash
  - dist_to_ref_egalitarian
  - dist_to_ref_rawlsian
  - dist_to_ref_egalitarian_lam025
  - dist_to_ref_egalitarian_lam400
These are descriptive benchmark comparisons, not normative optimality claims.
Group-level descriptive diagnostics (non-confirmatory) may additionally be computed in analysis artifacts to inspect participation composition over time.
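For the first-choice diversity metric, one possible reading is the Shannon entropy of the distribution of votes.rank_1_option_id within a step. The sketch below assumes natural-log entropy without normalization and NaN for a step with no votes; the spec does not state the log base, the normalization, or the no-vote behavior, and the function name is hypothetical.

```python
import math
from collections import Counter

def first_choice_entropy(rank_1_option_ids):
    """Shannon entropy (natural log) of first-choice option shares.

    Assumption: NaN when the step has no votes.
    """
    counts = Counter(rank_1_option_ids)
    total = sum(counts.values())
    if total == 0:
        return math.nan
    shares = [count / total for count in counts.values()]
    return sum(-p * math.log(p) for p in shares)

print(first_choice_entropy(["a", "a", "b", "b"]))  # two equally popular options
print(first_choice_entropy(["a", "a", "a"]))       # unanimous first choice
```

Under these assumptions, a unanimous step has entropy 0 and k equally popular first choices have entropy log(k).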
Benchmark reference computation
Reference families currently used in analysis:
- utilitarian reference
- nash reference
- egalitarian reference (lambda=1.0) with sensitivity variants (lambda=0.25, lambda=4.0)
- rawlsian reference
Analysis output columns (time-indexed):
- dist_to_ref_utilitarian
- dist_to_ref_nash
- dist_to_ref_egalitarian
- dist_to_ref_rawlsian
- dist_to_ref_egalitarian_lam025
- dist_to_ref_egalitarian_lam400
Computation-layer assumptions:
- all dist_to_ref_* values are computed in analysis from logged artifacts
- no runtime reward-loop dependency on these benchmark trajectories
- optimization/tie policies are deterministic for fixed input artifacts
NaN policy:
- area-level benchmark distances are NaN for empty-area agent sets
- global benchmark distance is NaN only if the global agent set is empty
- no-vote steps still produce defined distances based on color distributions
Consistency checks (must hold)
- steps.turnout(t) == 100 * sum_a participants(a,t) / sum_a area_num_agents(a) (if denominator is zero, turnout is 0)
- area_steps.participants(a,t) == count(votes rows for (a,t))
- one agents row per (agent_id, step)
- no NaN/inf in thesis-critical emitted series (except explicitly allowed NaN semantics like denominator-zero quality_distance_t)
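Three of these checks can be sketched as small helpers. The function names and the row layout (lists of dicts keyed by the column names used above) are hypothetical; a real check would first load the parquet artifacts into this shape.

```python
import math
from collections import Counter

def expected_turnout(area_rows):
    """Turnout (0..100) for one step from per-area rows; 0 if no agents."""
    participants = sum(row["participants"] for row in area_rows)
    num_agents = sum(row["area_num_agents"] for row in area_rows)
    return 0.0 if num_agents == 0 else 100.0 * participants / num_agents

def duplicate_agent_rows(agent_rows):
    """Return (agent_id, step) keys that appear more than once."""
    counts = Counter((row["agent_id"], row["step"]) for row in agent_rows)
    return [key for key, n in counts.items() if n > 1]

def non_finite_indices(series):
    """Return positions holding NaN or +/-inf in a series."""
    return [i for i, value in enumerate(series) if not math.isfinite(value)]

area_rows = [
    {"participants": 5, "area_num_agents": 10},
    {"participants": 15, "area_num_agents": 30},
]
agent_rows = [
    {"agent_id": 1, "step": 0},
    {"agent_id": 1, "step": 1},
    {"agent_id": 1, "step": 1},
]
print(expected_turnout(area_rows))
print(duplicate_agent_rows(agent_rows))
print(non_finite_indices([1.0, math.nan, 3.0, math.inf]))
```

When comparing expected_turnout against the logged steps.turnout value, a small floating-point tolerance (for example via math.isclose) is likely safer than exact equality.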
Inference Specification
Confirmatory endpoint subset
Primary confirmatory endpoints are time means:
- turnout_mean
- gini_assets_mean
- gini_dissatisfaction_mean
- quality_distance_mean
These endpoints are computed from the time-series definitions above using fixed formulas.
Secondary reported endpoints
Additional summary_stats.json endpoints (final values, volatility, diversity entropy, mean dissatisfaction) are reported descriptively unless explicitly promoted in a separate analysis contract.
Inference-family rules
Rule-family roles for reporting:
- canonical confirmatory family: utilitarian (2), borda (3), schulze (4)
- reference family: plurality (0), random (5)
- context-only calibration arm: approval (1)
Family boundaries must remain explicit in results reporting.
Multiple-testing policy rule
- We use a predefined multiplicity correction policy within each reported family.
- We do not merge canonical confirmatory and reference-family p-value pools.
- We keep endpoint formulas fixed within one reported experiment set.
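To illustrate keeping the p-value pools separate, the sketch below applies a correction within each family independently. The Holm step-down procedure is used purely as a stand-in, since this page does not name the predefined correction method. The FAMILIES mapping, the function names, and the one-p-value-per-rule input are hypothetical.

```python
# Family membership as listed above; approval (context-only) is excluded.
FAMILIES = {
    "canonical_confirmatory": ["utilitarian", "borda", "schulze"],
    "reference": ["plurality", "random"],
}

def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[idx] = running_max
    return adjusted

def adjust_within_families(p_by_rule):
    """Adjust each family's p-values separately; pools are never merged."""
    result = {}
    for family, rules in FAMILIES.items():
        present = [rule for rule in rules if rule in p_by_rule]
        adjusted = holm_adjust([p_by_rule[rule] for rule in present])
        result[family] = dict(zip(present, adjusted))
    return result

# Made-up p-values for illustration only.
example = {"utilitarian": 0.01, "borda": 0.04, "schulze": 0.03,
           "plurality": 0.02, "random": 0.5}
print(adjust_within_families(example))
```

Because each family is adjusted on its own, adding or removing a reference-family comparison does not change the adjusted values in the canonical confirmatory family.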
Summary-layer separation rule
- summary_stats.json contains the baseline run-level endpoint set.
- Additional thesis inference outputs may extend beyond this set, but must be computed from logged artifacts with fixed formulas.