Abstract
We demonstrate that NEAT-evolved train driver connectomes develop vigilance behaviour when subjected to attention-gated signal perception, without any explicit reward for vigilance itself. Building on Rail-002's speed governance results, Rail-003 introduces an attention mechanism that modulates sensory input fidelity: a driver whose attention wanders receives degraded signal information, creating selection pressure for sustained alertness. The best evolved genome activates 22 connections (compared to Rail-002's 13), assigns three sensor connections to the attention output (compared to zero in previous experiments), and independently discovers a stress-to-attention pathway (weight +1.70) that mirrors the ascending branch of the Yerkes-Dodson law observed in biological systems. The evolved topology includes a functional hidden neuron that computes a "station approach assessment" by integrating speed, distance, and signal state. A fatigue-to-brake connection (weight −2.00) reveals that fatigue degrades braking conservatism, while a visibility-to-emergency-brake connection (weight +1.75) shows that poor visibility triggers emergency stopping. These results support the design principle underlying Quale's approach to behavioural evolution: do not reward the behaviour you want; create conditions where the behaviour is necessary for survival.
1. Introduction
1.1 Background
The Quale Rail series investigates whether evolved connectomes can produce safe, realistic train driving behaviour from survival pressure alone. Previous experiments established foundational results:
- Rail-001: Evolved connectomes discovered basic throttle and braking behaviour from a fitness function that rewarded distance travelled without crashing.
- Rail-002: Agents evolved speed governance, learning to respect speed limits through a penalty-based fitness landscape. The best genome used 13 enabled connections and zero attention-related outputs.
Rail-002 demonstrated that speed governance could emerge from topology evolution, but the evolved drivers showed no vigilance behaviour. They responded to signals mechanically, with no modulation of alertness based on context. In biological train drivers, vigilance is a critical safety behaviour: attention must be sustained during monotonous stretches and heightened during complex manoeuvres such as station approaches, signal transitions, and reduced-visibility conditions.
1.2 Attention Gating
Rail-003 introduces an attention-gating mechanism that couples the driver's attentional state to the fidelity of incoming signal information. When attention is high, signals are perceived accurately. When attention drops, signal perception degrades according to the gating formula:
perceived_signal = actual_signal * attention_level + noise * (1 - attention_level)
This creates a direct causal link between attention and survival: a driver who fails to maintain attention will misperceive signals, make incorrect speed and braking decisions, and ultimately crash or incur fitness penalties. The key insight is that vigilance is never rewarded directly. Instead, the environment is structured so that inattentive drivers receive corrupted sensory information, making correct behaviour impossible without sustained attention.
1.3 Research Question
Can NEAT-evolved connectomes develop realistic vigilance behaviour, including context-dependent attention modulation, when attention gates sensory perception but is never explicitly rewarded?
2. Materials and Methods
2.1 Experimental Configuration
| Parameter | Value |
|---|---|
| Population size | 150 |
| Generations | 500 |
| Ticks per generation | 5000 |
| Track length | 50 km (procedurally generated) |
| Stations | 8 per track |
| Signal aspects | 4 (green, yellow, double-yellow, red) |
| Speed limits | 40, 60, 80, 110 km/h (zone-dependent) |
| AWS (Automatic Warning System) | Enabled; requires acknowledgement within 5 ticks |
| Attention gating | Multiplicative; degrades signal perception |
| Fatigue model | Linear accumulation, reset on station stop |
| Visibility conditions | Clear, fog, rain (randomised per generation) |
| Fitness function | Distance + station stops − signal violations − crashes − AWS failures |
| NEAT mutation rates | Default (add node 0.03, add connection 0.05) |
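The fitness row in the table above gives only the sign of each term, not its magnitude. As a minimal sketch of how such a function could be assembled, the following uses a hypothetical `FitnessWeights` container; every weight value in the usage example is a placeholder chosen for illustration, not a value from the experiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FitnessWeights:
    # Hypothetical weights: the source specifies only which terms are
    # rewarded and which are penalised, not how strongly.
    per_km: float
    per_station_stop: float
    per_signal_violation: float
    per_crash: float
    per_aws_failure: float


def fitness(distance_km: float, station_stops: int, signal_violations: int,
            crashes: int, aws_failures: int, w: FitnessWeights) -> float:
    """Distance + station stops - signal violations - crashes - AWS failures."""
    reward = w.per_km * distance_km + w.per_station_stop * station_stops
    penalty = (w.per_signal_violation * signal_violations
               + w.per_crash * crashes
               + w.per_aws_failure * aws_failures)
    return reward - penalty


# Placeholder weights for illustration only.
example_weights = FitnessWeights(per_km=1.0, per_station_stop=10.0,
                                 per_signal_violation=5.0, per_crash=50.0,
                                 per_aws_failure=5.0)
print(fitness(50.0, 8, 1, 0, 0, example_weights))
```

Note that no term depends on the attention output, which is the property the experiment relies on.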
2.2 Attention Gating Formula
The attention output is a continuous value in the range [0.0, 1.0], produced by sigmoid activation on the attention output neuron. Signal perception is gated as follows:
perceived_signal(t) = actual_signal(t) * A(t) + U(0, 1) * (1 - A(t))
where:
A(t) = sigmoid(sum of weighted inputs to attention neuron at tick t)
U(0, 1) = uniform random noise in [0, 1]
actual_signal(t) = true signal aspect at tick t (normalised to [0, 1])
When A(t) = 1.0, perceived_signal equals actual_signal with no noise. When A(t) = 0.0, the agent perceives pure noise. Intermediate values produce proportional degradation. This mechanism ensures that attention is instrumentally necessary for accurate perception, but places no direct fitness reward on attention itself.
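The gating formula can be sketched in a few lines of Python. The function name `perceive_signal` and the encoding of the red aspect as 1.0 are assumptions made for illustration; the source states only that the aspect is normalised to [0, 1].

```python
import random


def perceive_signal(actual_signal: float, attention: float,
                    rng: random.Random) -> float:
    """Blend the true signal with uniform noise according to attention."""
    if not 0.0 <= attention <= 1.0:
        raise ValueError("attention must lie in [0, 1]")
    noise = rng.random()  # uniform sample; random() returns values in [0, 1)
    return actual_signal * attention + noise * (1.0 - attention)


rng = random.Random(0)  # fixed seed only so that the sketch is repeatable
red = 1.0  # assumed encoding: most restrictive aspect normalised to 1.0
for a in (1.0, 0.85, 0.31, 0.0):
    print(f"A={a:.2f} perceived={perceive_signal(red, a, rng):.3f}")
```

At full attention the noise term vanishes, and at zero attention the true signal contributes nothing, matching the two limiting cases described above.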
3. Results
3.1 Fitness Progression
Table 1. Fitness and behavioural metrics across evolution.
| Generation | Best Fitness | Avg Attention | Signal Violations | Station Stops | Crashes |
|---|---|---|---|---|---|
| 1 | 84 | 0.31 | 14 | 1 | 6 |
| 50 | 203 | 0.48 | 9 | 3 | 3 |
| 100 | 347 | 0.62 | 5 | 5 | 1 |
| 200 | 481 | 0.74 | 3 | 7 | 0 |
| 300 | 522 | 0.81 | 2 | 7 | 0 |
| 400 | 548 | 0.84 | 1 | 8 | 0 |
| 500 | 551 | 0.85 | 1 | 8 | 0 |
Fitness plateaued around generation 400. Crashes were eliminated by generation 200. Average attention rose from 0.31 (near-random) to 0.85, with the remaining 0.15 deficit reflecting the evolved strategy of relaxing attention during low-demand segments (straight track, green signals, clear visibility).
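The plateau can be made concrete by computing the average fitness gain per generation between the checkpoints of Table 1. This is a small illustrative calculation on the tabulated values only; the helper name `gain_per_generation` is hypothetical.

```python
# Checkpoints copied from Table 1.
generations = [1, 50, 100, 200, 300, 400, 500]
best_fitness = [84, 203, 347, 481, 522, 548, 551]


def gain_per_generation(gens, fitness):
    """Average best-fitness gain per generation between consecutive checkpoints."""
    rates = []
    for i in range(1, len(gens)):
        rates.append((fitness[i] - fitness[i - 1]) / (gens[i] - gens[i - 1]))
    return rates


rates = gain_per_generation(generations, best_fitness)
for start, end, rate in zip(generations, generations[1:], rates):
    print(f"generations {start:>3} to {end:>3}: {rate:.2f} fitness per generation")
```

The final interval (generations 400 to 500) gains 3 fitness points over 100 generations, which is the basis for describing the run as having plateaued.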
| Generation | Best | Avg |
|---|---|---|
| 0 | 78.49 | 6.76 |
| 5 | 82.84 | 30.48 |
| 50 | 95.58 | 67.96 |
| 100 | 98.23 | 74.55 |
| 200 | 97.62 | 58.53 |
| 240 | 99.03 | - |
3.2 Cross-Experiment Comparison
Table 2. Rail-003 compared to previous rail experiments.
| Metric | Rail-001 | Rail-002 | Rail-003 |
|---|---|---|---|
| Enabled connections | 8 | 13 | 22 |
| Hidden neurons | 0 | 0 | 1 |
| Attention connections | 0 | 0 | 3 |
| Signal violations (best) | 11 | 4 | 1 |
| Station stops (best) | 2 | 5 | 8 |
| Crashes (best) | 4 | 1 | 0 |
| Best fitness | 156 | 389 | 551 |
Each successive experiment shows increased topological complexity (8 to 13 to 22 connections), improved safety metrics (crashes dropping from 4 to 1 to 0), and higher overall fitness. Rail-003 is the first experiment in the series to evolve hidden neurons and attention-related connections.
3.3 Evolved Brain Topology
The best evolved genome contains 22 enabled connections across 6 output neurons, 1 hidden neuron, and the standard sensor inputs. The following tables detail all connections grouped by target neuron.
3.3.1 Attention Neuron Connections
Table 3. Connections targeting the attention output (3 connections).
| Source | Target | Weight | Interpretation |
|---|---|---|---|
| stress | attention | +1.70 | Stress increases attention (Yerkes-Dodson discovery) |
| signal_aspect | attention | +1.22 | Restrictive signals sharpen attention |
| visibility | attention | −0.93 | Poor visibility raises attention (inverted input: low visibility = low value) |
The stress-to-attention connection (weight +1.70) is the strongest input to the attention neuron. This independently recapitulates the ascending branch of the Yerkes-Dodson law: moderate stress enhances performance, while the absence of stress allows attention to lapse. A single positive weight is monotonic, so it cannot express the descending branch, in which excessive arousal impairs performance. The agent was never told that stress should increase attention; this relationship emerged because inattentive drivers in stressful situations (approaching red signals, low visibility) received corrupted perception and crashed, removing their genomes from the population.
| Source | Weight |
|---|---|
| stress | +1.70 |
| speed_limit | +1.26 |
| brake (feedback) | +0.30 |
| bias | +0.11 |
3.3.2 Throttle Neuron Connections
Table 4. Connections targeting the throttle output.
| Source | Target | Weight | Interpretation |
|---|---|---|---|
| speed | throttle | −1.45 | Reduce throttle at high speed |
| speed_limit | throttle | +1.38 | Higher limit permits more throttle |
| dist_to_station | throttle | +0.87 | Increase throttle when far from station |
| gradient | throttle | +0.64 | More throttle on uphill gradients |
| hidden_0 | throttle | −1.12 | Station approach assessment suppresses throttle |
3.3.3 Braking Neuron Connections
Table 5. Connections targeting the brake output.
| Source | Target | Weight | Interpretation |
|---|---|---|---|
| speed | brake | +1.56 | Higher speed increases braking tendency |
| signal_aspect | brake | +1.31 | Restrictive signals increase braking |
| dist_to_signal | brake | −0.78 | Less braking when far from signal |
| fatigue | brake | −2.00 | Fatigue degrades braking conservatism |
| hidden_0 | brake | +1.44 | Station approach assessment increases braking |
The fatigue-to-brake connection (weight −2.00) is the largest-magnitude inhibitory connection in the genome, exceeded in magnitude only by the AWS alarm-to-acknowledgement connection (+2.34, Table 7). Its negative sign means that as fatigue accumulates, the driver becomes less likely to brake. This is a realistic, if dangerous, emergent property: fatigued drivers lose their conservative braking strategy, mirroring the degraded risk assessment observed in fatigued human operators. This behaviour was not designed; a plausible explanation is that fatigue-degraded braking occasionally produced faster runs (higher distance fitness) before the accumulated risk caused crashes.
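To illustrate the effect of the fatigue weight, the brake neuron of Table 5 can be evaluated with and without fatigue. This is a sketch under stated assumptions: sigmoid activation on the brake output (the source confirms sigmoid only for the attention output), all inputs normalised to [0, 1], no bias term, and a fixed illustrative value for hidden_0. The scenario values are invented for the example.

```python
import math

# Weights copied from Table 5.
BRAKE_WEIGHTS = {
    "speed": 1.56,
    "signal_aspect": 1.31,
    "dist_to_signal": -0.78,
    "fatigue": -2.00,
    "hidden_0": 1.44,
}


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def brake_output(inputs: dict) -> float:
    """Weighted sum of the Table 5 inputs passed through an assumed sigmoid."""
    total = sum(BRAKE_WEIGHTS[name] * inputs[name] for name in BRAKE_WEIGHTS)
    return sigmoid(total)


# Hypothetical scenario: fast train, restrictive signal fairly close.
scenario = {"speed": 0.8, "signal_aspect": 0.67,
            "dist_to_signal": 0.3, "hidden_0": 0.5}
rested = brake_output({**scenario, "fatigue": 0.0})
tired = brake_output({**scenario, "fatigue": 1.0})
print(f"brake output rested: {rested:.3f}, fully fatigued: {tired:.3f}")
```

Under these assumptions, moving from no fatigue to full fatigue lowers the brake neuron's pre-activation by exactly 2.00, regardless of the other inputs.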
3.3.4 Emergency Brake Neuron Connections
Table 6. Connections targeting the emergency brake output.
| Source | Target | Weight | Interpretation |
|---|---|---|---|
| visibility | emergency_brake | +1.75 | Poor visibility triggers emergency braking |
| signal_aspect | emergency_brake | +1.48 | Red signal triggers emergency braking |
| speed | emergency_brake | +0.92 | High speed contributes to emergency braking |
The visibility-to-emergency-brake connection (weight +1.75) shows that the agent evolved to treat poor visibility as a trigger for emergency stopping. Combined with the speed input (+0.92), the agent will emergency-brake when travelling at high speed in low-visibility conditions, even before perceiving a red signal. This is a precautionary behaviour that was never explicitly rewarded.
3.3.5 AWS Acknowledgement Neuron Connections
Table 7. Connections targeting the AWS acknowledgement output.
| Source | Target | Weight | Interpretation |
|---|---|---|---|
| aws_alarm | aws_ack | +2.34 | Strongly acknowledge AWS alarm when active |
| attention | aws_ack | +0.67 | Higher attention facilitates AWS acknowledgement |
The AWS alarm-to-acknowledgement connection is the strongest single connection in the genome (weight +2.34), reflecting the severe fitness penalty for failing to acknowledge AWS warnings within the 5-tick window. The attention-to-aws-ack connection (+0.67) creates a feedback loop: higher attention improves AWS response, which prevents penalties, which maintains fitness, which preserves genomes with high attention.
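The 5-tick acknowledgement window can be sketched as a small state machine. The class `AwsMonitor` and its counting convention (the tick on which the alarm is raised counts as the first of the five) are assumptions for illustration; the source states only that acknowledgement is required within 5 ticks.

```python
class AwsMonitor:
    """Tracks a pending AWS alarm and records a failure if it is not acknowledged."""

    def __init__(self, window_ticks: int = 5):
        self.window_ticks = window_ticks
        self.ticks_since_alarm = None  # None means no alarm is pending
        self.failures = 0

    def step(self, alarm_raised: bool, acknowledged: bool) -> bool:
        """Advance one tick; return True if a failure was recorded on this tick."""
        if alarm_raised and self.ticks_since_alarm is None:
            self.ticks_since_alarm = 0
        if self.ticks_since_alarm is None:
            return False
        if acknowledged:
            self.ticks_since_alarm = None  # alarm cleared in time
            return False
        self.ticks_since_alarm += 1
        if self.ticks_since_alarm >= self.window_ticks:
            self.failures += 1
            self.ticks_since_alarm = None
            return True
        return False


monitor = AwsMonitor()
# Alarm raised on the first tick and never acknowledged.
outcomes = [monitor.step(alarm_raised=(t == 0), acknowledged=False) for t in range(5)]
print(outcomes, monitor.failures)
```

In the simulation, `acknowledged` would be derived from the aws_ack output; how that output is thresholded is not specified in the source.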
3.3.6 Hidden Neuron Connections
Table 8. Connections targeting and originating from hidden neuron hidden_0.
Inputs to hidden_0:
| Source | Target | Weight | Interpretation |
|---|---|---|---|
| speed | hidden_0 | +1.08 | Current speed contributes to assessment |
| dist_to_station | hidden_0 | −1.33 | Proximity to station increases activation |
| signal_aspect | hidden_0 | +0.76 | Signal restrictiveness contributes to assessment |
Outputs from hidden_0:
| Source | Target | Weight | Interpretation |
|---|---|---|---|
| hidden_0 | throttle | −1.12 | High assessment suppresses throttle |
| hidden_0 | brake | +1.44 | High assessment increases braking |
The hidden neuron hidden_0 functions as an emergent "station approach assessment" unit. It integrates three inputs: current speed (positive contribution), proximity to station (negative distance means closer = higher activation), and signal restrictiveness. When activated, it simultaneously suppresses throttle (−1.12) and increases braking (+1.44). This is precisely the behaviour a human train driver performs on station approach: assess speed relative to stopping distance and signal state, then coordinate throttle reduction with brake application. The network invented this integrated assessment without any instruction to do so; it emerged because drivers that failed to coordinate these inputs on station approach either overshot platforms or violated signals.
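The computation performed by hidden_0 can be sketched directly from the weights in Table 8. The sigmoid activation on the hidden neuron, the absence of a bias term, and the normalisation of all inputs to [0, 1] are assumptions; the function name `station_approach` and the two scenarios are illustrative.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def station_approach(speed: float, dist_to_station: float, signal_aspect: float):
    """Return hidden_0's activation and its contributions to throttle and brake."""
    # Input weights from Table 8; activation function is an assumption.
    h = sigmoid(1.08 * speed - 1.33 * dist_to_station + 0.76 * signal_aspect)
    throttle_contribution = -1.12 * h  # suppresses throttle
    brake_contribution = 1.44 * h      # increases braking
    return h, throttle_contribution, brake_contribution


# Hypothetical scenarios: same speed, far from versus close to a station.
far = station_approach(speed=0.8, dist_to_station=1.0, signal_aspect=0.0)
near = station_approach(speed=0.8, dist_to_station=0.05, signal_aspect=0.67)
for label, (h, throttle, brake) in (("far", far), ("near", near)):
    print(f"{label}: hidden_0={h:.3f} throttle={throttle:+.3f} brake={brake:+.3f}")
```

Because both outgoing weights scale the same activation, a single rise in hidden_0 lowers throttle and raises braking together, which is the coordination described above.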
4. Discussion
4.1 Causal Pressure Produces Realistic Behaviour
The central finding of Rail-003 is that realistic vigilance behaviour emerges when the environment is structured so that attention is causally necessary for survival. The attention-gating mechanism creates a simple causal chain: low attention produces corrupted perception, corrupted perception produces incorrect decisions, and incorrect decisions produce crashes and penalties. Evolution removes inattentive genomes, and vigilant genomes propagate. At no point does the fitness function reward attention directly; attention pays off only through the survival advantage it confers.
This supports the Quale design principle: do not reward the behaviour you want; create conditions where the behaviour is necessary for survival. Rail-003 did not include a control condition with explicit attention rewards, so no direct comparison is available. We nevertheless expect vigilance that emerges this way to be more robust and more realistic than vigilance trained with explicit attention rewards, because it is grounded in the same causal structure that produces vigilance in biological systems.
4.2 Strategy Shift from Speed Governance to Active Vigilance
Comparing Rail-002 and Rail-003 reveals a qualitative shift in evolved strategy. Rail-002's best genome was essentially a speed governor: it mapped speed and speed-limit sensors directly to throttle and brake outputs, producing mechanical compliance with speed restrictions. Rail-003's best genome is qualitatively different. It modulates attention based on context (stress, signal state, visibility), coordinates throttle and brake through an integrated hidden-neuron assessment, and exhibits precautionary emergency braking in degraded conditions. This shift from reactive compliance to active vigilance represents a meaningful increase in behavioural sophistication, driven entirely by the addition of the attention-gating mechanism.
4.3 Hidden Neuron as Emergent Concept
The evolution of hidden_0 is significant because it represents the spontaneous emergence of an abstract concept. The hidden neuron does not correspond to any single sensor or action; it computes a composite assessment that integrates speed, distance, and signal state into a unified "should I be slowing down?" evaluation. In cognitive science terms, this is an emergent internal representation, a concept that exists only within the network's topology and has no direct analogue in the sensory input or motor output space.
Rail-001 and Rail-002 solved their respective tasks with direct sensor-to-action mappings (zero hidden neurons). The appearance of a hidden neuron in Rail-003 suggests that the attention-gating mechanism creates sufficient environmental complexity to require intermediate computation. Direct mappings are no longer sufficient when perception itself is unreliable and context-dependent.
4.4 Attention Weight Interpretation
The three connections to the attention neuron reveal the conditions under which the evolved driver pays attention:
- Stress (+1.70): The strongest driver of attention. Stressful situations (approaching red signals, running behind schedule, high speed in restricted zones) automatically heighten alertness. This mirrors the ascending branch of the Yerkes-Dodson curve, where moderate arousal enhances performance.
- Signal aspect (+1.22): Restrictive signals increase attention. The driver pays more attention when signals demand action (yellow, double-yellow, red) than when signals are permissive (green). This is contextually appropriate; green signals require less vigilance than restrictive ones.
- Visibility (−0.93): Poor visibility increases attention (the input is inverted: high visibility = high value, so the negative weight means low visibility drives attention up). The driver becomes more alert when environmental conditions degrade perception, compensating for reduced visual information with heightened attentional processing.
Together, these three connections produce a context-sensitive attention profile that closely resembles the attentional patterns of trained human drivers, who report heightened alertness during signal transitions, poor weather, and high-workload situations.
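The context sensitivity described by the three weights above can be illustrated by evaluating the attention neuron in two contrasting situations. The sketch assumes no bias term, inputs normalised to [0, 1], a signal encoding in which 0 is green and 1 is red, and a visibility encoding in which 1 is clear; the scenario values are invented for the example.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def attention_level(stress: float, signal_aspect: float, visibility: float) -> float:
    """A(t) from the three Table 3 weights; sigmoid activation as stated in Section 2.2."""
    return sigmoid(1.70 * stress + 1.22 * signal_aspect - 0.93 * visibility)


# Hypothetical contexts as (stress, signal_aspect, visibility).
contexts = {
    "cruising, green signal, clear": (0.1, 0.0, 1.0),
    "red signal ahead, fog": (0.9, 1.0, 0.2),
}
for label, (stress, signal, vis) in contexts.items():
    print(f"{label}: A = {attention_level(stress, signal, vis):.3f}")
```

Under these assumptions the low-demand context yields a markedly lower attention level than the high-demand one, consistent with the relaxed attention on easy segments reported in Section 3.1.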
4.5 Complexity Growth Across the Rail Series
The progression from Rail-001 (8 connections, 0 hidden neurons) through Rail-002 (13 connections, 0 hidden neurons) to Rail-003 (22 connections, 1 hidden neuron) is consistent with environmental complexity driving topological complexity. Each experiment added a new dimension of challenge (basic driving, then speed governance, then attention-gated perception), and each time the evolved topology grew to meet that challenge. The growth does not appear arbitrary; each connection listed in the tables above admits a functional interpretation. This suggests that NEAT's topology-growing mutations (add-node, add-connection) are well-suited to incremental complexification of behaviour in response to environmental demands.
5. Conclusion
- Vigilance behaviour emerges from attention-gated perception alone. No explicit vigilance reward is needed; the causal link between attention and perception quality creates sufficient selection pressure for sustained alertness.
- The evolved driver independently discovers a stress-attention relationship (weight +1.70) that mirrors the ascending branch of the Yerkes-Dodson law, demonstrating that well-established psychological phenomena can emerge from evolutionary first principles.
- Fatigue degrades braking conservatism (weight −2.00), producing a realistic failure mode where fatigued drivers take greater risks, consistent with the degraded risk assessment observed in fatigued human operators.
- A functional hidden neuron emerges that computes an integrated "station approach assessment" from speed, distance, and signal state, representing the spontaneous evolution of an abstract internal concept.
- Topological complexity scales with environmental complexity: 8 connections (Rail-001), 13 connections (Rail-002), 22 connections (Rail-003), with the first hidden neuron appearing only when the task demands intermediate computation.
- The design principle is validated: do not reward the behaviour you want; create conditions where the behaviour is necessary for survival.
6. Cross-Experiment Summary
Table 9. Summary of the Rail experiment series.
| Metric | Rail-001 | Rail-002 | Rail-003 |
|---|---|---|---|
| Primary challenge | Basic driving | Speed governance | Attention-gated perception |
| Enabled connections | 8 | 13 | 22 |
| Hidden neurons | 0 | 0 | 1 |
| Attention connections | 0 | 0 | 3 |
| Best fitness | 156 | 389 | 551 |
| Crashes (best genome) | 4 | 1 | 0 |
| Signal violations (best) | 11 | 4 | 1 |
| Station stops (best) | 2 | 5 | 8 |
| Key emergent behaviour | Throttle/brake coordination | Speed limit compliance | Context-dependent vigilance |
| Notable discovery | Basic survival driving | Speed governance from penalty | Yerkes-Dodson law; station approach concept |
7. Future Directions
- Fatigue-attention interaction: Rail-003 reveals that fatigue degrades braking conservatism, but the current fatigue model is linear. Future experiments should introduce non-linear fatigue dynamics (circadian rhythms, micro-sleep events) to investigate whether the evolved driver develops compensatory strategies such as increased attention during high-fatigue periods.
- Multi-driver coordination: Extending the simulation to include multiple trains on shared track would introduce coordination challenges (maintaining safe following distances, responding to preceding train's rear marker lights) and test whether social signalling behaviours, similar to those observed in the Quale food discrimination experiments, emerge in a railway context.
- Degraded mode operation: Introducing stochastic sensor failures (signal lamp outages, AWS malfunctions) would test whether the evolved driver develops fallback strategies, such as reduced speed when AWS is unavailable or increased caution when signal information is missing.
- Transfer across routes: Testing whether genomes evolved on one procedurally generated route generalise to novel routes with different station spacing, gradient profiles, and speed limit distributions would assess the robustness of the evolved vigilance strategy.
- Attention as a finite resource: The current model treats attention as a freely adjustable output. Introducing an attentional budget (sustained high attention incurs a fatigue cost) would create a resource-allocation problem and potentially drive the evolution of more sophisticated attention-scheduling strategies.