From Uniformity to Diversity: Engineering Richer Procedural Worlds

Tags: MISsimulation, TypeScript, Cloudflare Workers, Gaussian Distributions, Procedural Generation, Data Modeling

Our procedural generation engine for educational simulations had a problem. Despite generating technically correct demographic data, every island it created felt eerily similar. Same political distribution patterns. Same area clustering. Same everything.

The issue became clear when we ran multiple test generations: we'd essentially built a very sophisticated rubber stamp.

The Tech Stack and Starting Point

The generator runs on Cloudflare Workers using TypeScript. The core engine uses a seeded random approach with Gaussian distributions for realistic demographic modeling. We output both CSV files and MySQL dumps containing everything from voter registration to social media networks.

The problem wasn't the technology. Our architecture was sound: deterministic seeding (same inputs always produce identical outputs), efficient in-memory generation, and sub-30-second execution times even for 5,000+ resident populations.
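To make the "same inputs, identical outputs" property concrete, here is a minimal sketch of a salt-addressed seeded generator plus a Gaussian draw built on it. The names makeRandUnit and gaussian, the choice of the mulberry32 mixing step, and the sample parameters are all assumptions for illustration; the engine's actual randUnit implementation isn't shown in this write-up.

```typescript
// Sketch only: makeRandUnit and gaussian are hypothetical stand-ins,
// not the engine's real helpers.

// mulberry32: a small, widely used 32-bit PRNG step; returns values in [0, 1).
function mulberry32(state: number): () => number {
  let a = state >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Each (seed, salt) pair maps to one fixed value, so call order never matters.
function makeRandUnit(seed: number): (salt: number) => number {
  return (salt) => mulberry32(seed ^ Math.imul(salt, 0x9e3779b1))();
}

// Box-Muller transform: two uniform draws become one normally distributed value.
function gaussian(
  randUnit: (salt: number) => number,
  salt: number,
  mean: number,
  sd: number
): number {
  const u1 = 1 - randUnit(salt); // shift to (0, 1] so Math.log never sees 0
  const u2 = randUnit(salt + 1);
  return mean + sd * Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

const randUnit = makeRandUnit(12345);
// Hypothetical use: a resident age with mean 42 and standard deviation 12.
const age = gaussian(randUnit, 100, 42, 12);
```

Addressing draws by salt rather than by call order means adding a new random decision elsewhere in the pipeline does not shift every value generated after it.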

The problem was that we were using the same four demographic cluster templates for every single island. It worked, but it was predictable.

Problem 1: The Apolitical Rate Was Hardcoded

Originally, every island had roughly the same percentage of politically disengaged residents because we calculated it the same way every time. This created unrealistic uniformity across different "worlds" that should have had distinct political cultures.

The fix: We implemented a weighted random apolitical target using a power curve:

const apoliticalTarget = _.clamp(
  0.1 + Math.pow(Math.max(randUnit(880), 0), 1.4) * 0.4,
  0.1,
  0.5
);

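This gives each island an apolitical rate between 10% and 50%. The power curve (exponent 1.4) pulls draws toward the lower end of that range: assuming randUnit returns a uniform value in [0, 1), the median island lands near 25% and only about a quarter exceed 37%. Heavily disengaged islands stay possible but uncommon, which matches what we see in actual demographic data.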
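To see where the curve places islands, the mapping can be evaluated at a few quantiles of the input. This sketch assumes the random input is uniform on [0, 1) and replaces lodash's clamp with a local helper so it runs standalone; apoliticalTargetFromUnit is a name invented here.

```typescript
// Sketch only: apoliticalTargetFromUnit is a hypothetical wrapper around the
// formula above, taking the random input as a parameter.
const clamp = (x: number, lo: number, hi: number): number =>
  Math.min(hi, Math.max(lo, x));

function apoliticalTargetFromUnit(u: number): number {
  return clamp(0.1 + Math.pow(Math.max(u, 0), 1.4) * 0.4, 0.1, 0.5);
}

// If u is uniform, evaluating at u = 0.25, 0.5, 0.75 gives the quartiles
// of the resulting apolitical rate.
const quartiles = [0.25, 0.5, 0.75].map(apoliticalTargetFromUnit);
// quartiles ≈ [0.157, 0.252, 0.367]
```

Half of all islands fall between roughly 16% and 37% under that assumption, with the floor at 10% and the ceiling at 50%.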

But here's the tricky part: we couldn't just set a global variable and walk away. We needed to recalculate the "quiet majority" cluster's population share to hit that target while keeping all other demographic clusters intact:

const apoliticalOther = otherClusters.reduce((sum, cluster) => {
  return sum + cluster.share * cluster.apoliticalProbability;
}, 0);

const quietProb = Math.max(0.05, quietCluster.apoliticalProbability);
let desiredShare = (this.targetApoliticalRate - apoliticalOther) / quietProb;
desiredShare = _.clamp(desiredShare, 0.05, 0.8);

This dynamically adjusts cluster shares to achieve the target rate. Some islands end up with highly engaged populations, others are dominated by politically indifferent residents.
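A worked example makes the algebra easier to follow. The sketch below wraps the same calculation in a standalone function; the Cluster shape, the solveQuietShare name, and the sample numbers are assumptions for illustration. It also treats the other clusters' shares as fixed, so any rescaling needed to make all shares sum to 1 afterwards is left out because the original does not show it.

```typescript
// Sketch only: Cluster and solveQuietShare are hypothetical names.
interface Cluster {
  share: number;                 // fraction of the population in this cluster
  apoliticalProbability: number; // chance a member is politically disengaged
}

const clamp = (x: number, lo: number, hi: number): number =>
  Math.min(hi, Math.max(lo, x));

function solveQuietShare(targetRate: number, quiet: Cluster, others: Cluster[]): number {
  // Apolitical residents already contributed by the non-quiet clusters.
  const apoliticalOther = others.reduce(
    (sum, c) => sum + c.share * c.apoliticalProbability,
    0
  );
  // Floor the probability so the division below cannot blow up.
  const quietProb = Math.max(0.05, quiet.apoliticalProbability);
  // Solve target = apoliticalOther + share * quietProb for share, then bound it.
  return clamp((targetRate - apoliticalOther) / quietProb, 0.05, 0.8);
}

const others: Cluster[] = [
  { share: 0.3, apoliticalProbability: 0.2 }, // contributes 0.06
  { share: 0.3, apoliticalProbability: 0.1 }, // contributes 0.03
];
const quiet: Cluster = { share: 0.4, apoliticalProbability: 0.7 };

// (0.30 - 0.09) / 0.7 ≈ 0.30, so the quiet cluster should hold about 30%.
const desiredShare = solveQuietShare(0.3, quiet, others);
```

The two clamps matter at the edges: a low target can ask for a share below 5%, and a quiet cluster with a very low apolitical probability can ask for more than 80%, so the target is only hit exactly when the solved share lands inside those bounds.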

Problem 2: Template Fatigue

Nine demographic archetypes lived in our codebase, but we only ever used the same four (plus the quiet majority baseline). Every island got the exact same mix.

The technical constraint: we needed deterministic generation for testing. If you generate an island with seed 12345 today, it needs to be identical tomorrow.

The solution: Keep all nine templates but randomly select four based on the island's seed:

const optionalSelection = optionalTemplates
  .map((template, index) => ({ template, score: randUnit(500 + index) }))
  .sort((a, b) => a.score - b.score)
  .slice(0, optionalSlots)
  .map((entry) => entry.template);

Each template gets a seeded random score, we sort ascending by that score, and take the first four. Same seed = same selection every time. Different seed = different cluster composition.

This means one island might combine "young progressives" with "rural conservatives" and "wealthy moderates," while another gets "retirees," "working-class liberals," and "suburban independents." The combinatorial explosion from nine-choose-four gives us 126 unique demographic personalities before we even factor in the apolitical swing.
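The determinism claim and the 126 figure can both be checked with a short standalone sketch. Here seededUnit is a hypothetical replacement for the engine's randUnit, the template names are placeholders, and choose is a helper added only to verify the count.

```typescript
// Sketch only: seededUnit stands in for the engine's randUnit.
function seededUnit(seed: number, salt: number): number {
  let t = ((seed ^ Math.imul(salt, 0x9e3779b1)) + 0x6d2b79f5) >>> 0;
  t = Math.imul(t ^ (t >>> 15), t | 1);
  t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
  return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
}

// Score every template from the seed, sort ascending, keep the first `slots`.
function pickTemplates<T>(seed: number, templates: T[], slots: number): T[] {
  return templates
    .map((template, index) => ({ template, score: seededUnit(seed, 500 + index) }))
    .sort((a, b) => a.score - b.score)
    .slice(0, slots)
    .map((entry) => entry.template);
}

// Binomial coefficient, computed so every intermediate value stays an integer.
function choose(n: number, k: number): number {
  let result = 1;
  for (let i = 1; i <= k; i++) result = (result * (n - k + i)) / i;
  return result;
}

// Placeholder names; the real archetypes live in the engine's config.
const templates = ["t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9"];

const firstRun = pickTemplates(12345, templates, 4);
const secondRun = pickTemplates(12345, templates, 4); // same seed, same picks
const combinations = choose(9, 4); // 126
```

Because the sort key depends only on the seed and the template's index, reordering the template list would change which templates an existing seed selects, which is worth keeping in mind before touching that array.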

Problem 3: Geographic Distribution Was Too Uniform

Our area assignment code gave every geographic area roughly equal population. Real places don't work that way. City centers concentrate population. Rural areas spread thin. Some suburbs boom while others stagnate.

The original code multiplied cluster affinities by 1.0 for baseline, with minor adjustments. Too conservative.

The fix: We added a seeded area diversity bias that multiplies geographic weights by values between 0.65 and 2.0:

this.config.areas.forEach((area, index) => {
  let base = 0.65 + localRand(index * 19 + 1) * 1.35;

  // Urban preference adjustment
  if (['Central', 'Downtown', 'City Center'].includes(area)) {
    base *= 0.9 + this.config.urbanRuralSkew * 0.5;
  }

  bias[area] = base;
});

Then during resident assignment, we apply this bias on top of cluster affinities:

if (this.areaDiversityBias) {
  areas.forEach((area, index) => {
    const modifier = this.areaDiversityBias![area];
    if (modifier !== undefined) {
      const jitter = 0.9 + (index % 5) * 0.02;
      areaScores[area] *= modifier * jitter;
    }
  });
}

Now Central might soak up 25% of the population on one island and 6% on another. Southeast could boom in one scenario and barely register in the next. It creates geographic personality.
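For intuition on how biased scores become population shares, here is a minimal sketch of normalizing area weights and sampling an area from them. The toShares and pickArea helpers and the example weights are assumptions; the engine's actual assignment loop isn't shown in full above.

```typescript
// Sketch only: toShares and pickArea are hypothetical helpers.
type AreaWeights = Record<string, number>;

// Normalize raw scores so they sum to 1 and read as expected population shares.
function toShares(weights: AreaWeights): Record<string, number> {
  const total = Object.values(weights).reduce((sum, w) => sum + w, 0);
  const shares: Record<string, number> = {};
  for (const [area, w] of Object.entries(weights)) shares[area] = w / total;
  return shares;
}

// Weighted pick: walk the areas, subtracting each weight until u * total is used up.
function pickArea(weights: AreaWeights, u: number): string {
  const entries = Object.entries(weights);
  const total = entries.reduce((sum, [, w]) => sum + w, 0);
  let remaining = u * total;
  for (const [area, w] of entries) {
    if (remaining < w) return area;
    remaining -= w;
  }
  return entries[entries.length - 1][0]; // guard against rounding when u is near 1
}

// Example scores after bias: Central drew the top multiplier, the others stayed at 1.0.
const weights: AreaWeights = { Central: 2.0, North: 1.0, Southeast: 1.0 };
const shares = toShares(weights); // Central 0.5, North 0.25, Southeast 0.25
const area = pickArea(weights, 0.6); // "North"
```

Since only the ratios between weights matter, a multiplier range of 0.65 to 2.0 lets one area carry roughly three times the pull of another before cluster affinities are even considered.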

Bonus: Performance Instrumentation

We also switched from Date.now() to performance.now() for timing metrics:

const getTimestamp: () => number = (() => {
  try {
    if (typeof globalThis !== 'undefined') {
      const perf = (globalThis as any).performance;
      if (perf && typeof perf.now === 'function') {
        return () => perf.now();
      }
    }
  } catch (error) {
    // fallback
  }
  return () => Date.now();
})();

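Where the runtime exposes a high-resolution clock, this gives us sub-millisecond precision on the performance breakdown, which matters when you're trying to optimize generation time for a 30-second Cloudflare Workers CPU limit. One caveat: deployed Workers only advance timers after I/O as a Spectre mitigation, so the finest-grained per-stage numbers come from local runs.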
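A timestamp source like this is typically wrapped in a small helper that records how long each generation stage takes. The sketch below is one way to do that; timeStage, the timings map, and the sample workload are hypothetical and not taken from the engine.

```typescript
// Sketch only: timeStage and the timings map are hypothetical.
const now: () => number = (() => {
  const perf = (globalThis as any).performance;
  return perf && typeof perf.now === "function"
    ? () => perf.now()
    : () => Date.now();
})();

// Run one stage, add its elapsed milliseconds to the map, and pass the result through.
function timeStage<T>(
  label: string,
  timings: Record<string, number>,
  stage: () => T
): T {
  const start = now();
  try {
    return stage();
  } finally {
    // Recorded even if the stage throws, so failed runs still show up.
    timings[label] = (timings[label] ?? 0) + (now() - start);
  }
}

const timings: Record<string, number> = {};
const total = timeStage("sumSquares", timings, () => {
  let acc = 0;
  for (let i = 0; i < 1000; i++) acc += i * i;
  return acc;
});
```

Accumulating into the map rather than overwriting lets the same label be used inside a loop, which gives a per-stage total for the whole run.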

The Results

Before these changes:

After:

Same deterministic generation. Same performance budget. Vastly more interesting output.

The key insight: procedural generation isn't about randomness. It's about finding the right parameters to vary and the right constraints to maintain. We kept strict determinism while unlocking emergent complexity through targeted variability in apolitical rates, cluster composition, and geographic distribution.

Sometimes the best engineering isn't building something new. It's finding the three variables in your existing system that unlock the potential that was already there.