Fitness Design
Fitness function design is the single most impactful decision in a Quale experiment. Evolution has no opinions of its own: it relentlessly optimizes whatever you measure, including unintended shortcuts. A well-crafted fitness function creates a landscape where the behavior you want is the only viable path to high scores. A poorly crafted one produces agents that look like they are doing the right thing while actually gaming the metric. This guide covers how to think about fitness, not just how to write it.
Fitness Function Design

    fitness Survival {
        maximize survival: 10.0
        maximize health: 5.0
        reward food_eaten: 2.0
        penalize sickness: 5.0
        penalize idle: 1.0
        penalize complexity: 0.001
    }

Verb Choice
| Verb | When to Use |
|---|---|
| maximize | Continuous metric where higher is better (survival fraction, final health) |
| reward | Event count where more is better (items consumed, goals reached) |
| penalize | Metric where lower is better (sickness, idle time, complexity) |
Weight Balancing
The fitness score is a weighted sum: sum(sign * metric * weight), where sign is positive for maximize and reward terms and negative for penalize terms. The weights determine the relative importance of each term.
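As a worked illustration of the weighted sum, the following Python sketch scores one agent with the weights from the `Survival` example. It is not Quale's implementation: the `TERMS` table, the `fitness_score` helper, and every metric value are hypothetical, and Quale may scale or normalize metrics differently.

```python
# Hypothetical sketch of sum(sign * metric * weight); not Quale's implementation.

# (verb, metric name, weight), copied from the Survival example.
TERMS = [
    ("maximize", "survival", 10.0),
    ("maximize", "health", 5.0),
    ("reward", "food_eaten", 2.0),
    ("penalize", "sickness", 5.0),
    ("penalize", "idle", 1.0),
    ("penalize", "complexity", 0.001),
]

# Sign convention assumed from the verb meanings: penalize subtracts.
SIGN = {"maximize": 1.0, "reward": 1.0, "penalize": -1.0}


def fitness_score(metrics, terms=TERMS):
    """Return the weighted sum of the metrics named in terms."""
    return sum(SIGN[verb] * metrics[name] * weight for verb, name, weight in terms)


# Made-up metric values for a single agent.
agent = {
    "survival": 0.8,    # fraction of the episode survived
    "health": 0.5,      # final health, assumed normalized to 0..1
    "food_eaten": 3,    # event count
    "sickness": 0.2,
    "idle": 0.1,
    "complexity": 40,   # connection count
}

score = fitness_score(agent)
print(round(score, 2))  # 8.0 + 2.5 + 6.0 - 1.0 - 0.1 - 0.04 = 15.36
```

With these made-up numbers, survival and food contribute most of the score, while the complexity term subtracts only 0.04. Printing the per-term contributions like this is a quick way to see which weights actually drive selection.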
Start with survival at the highest weight. If the agent cannot survive, nothing else matters. Then add secondary objectives at lower weights and tune from there.
Example weight hierarchy:
- Survival (10.0) - staying alive is the foundation
- Primary objective (5.0-8.0) - the main behavior you are evolving
- Secondary objectives (2.0-3.0) - supporting behaviors
- Penalties (1.0-5.0) - discourage degenerate strategies
- Complexity (0.001) - gentle pressure for simpler networks
The Complexity Penalty
penalize complexity: 0.001 penalizes genome connection count. This is always available regardless of domain. A small weight (0.001) provides gentle pressure toward simpler networks without dominating the fitness function.
Without a complexity penalty, evolution tends to produce networks that are larger than necessary. With too high a penalty, networks cannot grow complex enough to solve the task.
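To make the trade-off concrete, this Python sketch compares the complexity term against the best possible survival term for a few penalty weights. The network size of 200 connections and the `complexity_cost` helper are hypothetical, and the sketch assumes the complexity metric is the raw connection count and that survival is a fraction with a maximum of 1.0.

```python
# Hypothetical arithmetic sketch; assumes complexity is the raw connection
# count and that the survival metric peaks at 1.0.

SURVIVAL_WEIGHT = 10.0
MAX_SURVIVAL_TERM = SURVIVAL_WEIGHT * 1.0  # best possible survival contribution


def complexity_cost(connections, weight):
    """Fitness subtracted by a complexity penalty with the given weight."""
    return connections * weight


connections = 200  # made-up network size

for weight in (0.001, 0.01, 0.05):
    cost = complexity_cost(connections, weight)
    share = cost / MAX_SURVIVAL_TERM
    print(f"weight={weight}: cost={cost:.2f}, {share:.0%} of the survival term")
```

Under these assumptions a weight of 0.001 costs a 200-connection network 0.2 points (2% of perfect survival), while a weight of 0.05 costs 10 points, as much as perfect survival earns. At that level, adding connections can only pay off if they improve the other metrics by more than survival itself is worth.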
Common Pitfalls
Binary throttle problem: If you reward or penalize only a binary outcome (alive/dead), the fitness function has no gradient. An agent that died after 10 ticks scores exactly the same as one that died after 290 ticks, so evolution cannot tell which is closer to surviving. Use a continuous metric (maximize survival) so that every extra tick survived raises the score.
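The difference between a binary and a continuous survival metric can be sketched in a few lines of Python. The 300-tick episode length and both helper functions are assumptions for illustration; they are not Quale's actual metric definitions.

```python
# Hypothetical sketch: two ways to score survival in one episode.

EPISODE_TICKS = 300  # assumed episode length


def binary_survival(ticks_survived):
    """1.0 only if the agent lasted the whole episode, otherwise 0.0."""
    return 1.0 if ticks_survived >= EPISODE_TICKS else 0.0


def continuous_survival(ticks_survived):
    """Fraction of the episode survived, between 0.0 and 1.0."""
    return min(ticks_survived, EPISODE_TICKS) / EPISODE_TICKS


for ticks in (10, 290, 300):
    print(ticks, binary_survival(ticks), round(continuous_survival(ticks), 3))
```

The binary metric gives the 10-tick and 290-tick agents the same score of 0.0, so selection cannot prefer the one that nearly survived. The continuous metric ranks them in order, which gives evolution a slope to climb.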
Attention without consequences: If you reward attention: 15.0 but inattention has no consequences, evolution discovers that always firing the attention actuator is the cheapest way to maximize fitness. Instead, make inattention cause missed signals or slower reaction times.
Degenerate strategies: Watch for agents that exploit the fitness function without exhibiting the intended behavior:
- Moving in circles to avoid idle penalties
- Eating everything indiscriminately when safe food is more common than dangerous food
- Standing still near a food source and eating whenever items respawn
Counter these by ensuring the fitness landscape has no cheap shortcuts. If moving in circles avoids the idle penalty, make sure the survival pressure requires actual food-seeking.
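One way to check for a cheap shortcut before running an experiment is to score a few hand-written behavior profiles and confirm that the degenerate one loses. The Python sketch below does this for the circling example. The profiles, the reduced weight table, and the `score` helper are all hypothetical, and the metric values are made up.

```python
# Hypothetical audit sketch: score hand-written behavior profiles and check
# that a degenerate one does not beat the intended one. All numbers are made up.

# Negative weights stand in for penalize terms.
WEIGHTS = {"survival": 10.0, "food_eaten": 2.0, "idle": -1.0}


def score(profile):
    """Weighted sum over the metrics listed in WEIGHTS."""
    return sum(WEIGHTS[name] * profile[name] for name in WEIGHTS)


profiles = {
    # Circles constantly: never idle, but finds no food and starves early.
    "circler": {"survival": 0.3, "food_eaten": 0, "idle": 0.0},
    # Seeks food: pauses occasionally, eats, survives most of the episode.
    "forager": {"survival": 0.9, "food_eaten": 4, "idle": 0.2},
}

scores = {name: score(profile) for name, profile in profiles.items()}
shortcut_pays = scores["circler"] >= scores["forager"]
print(scores, "shortcut pays:", shortcut_pays)
```

Here the circler avoids the idle penalty entirely but still scores far below the forager, because starvation costs more than idling. If the environment let a circling agent survive without foraging, the two scores would move much closer together, which is the signal to tighten the survival pressure.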