TROISIÈME CYCLE DE LA PHYSIQUE
EN SUISSE ROMANDE
LA COMPLEXITÉ
Michel DROZ
Département de Physique Théorique, Université de Genève
24, quai Ernest Ansermet, CH-1211 Genève 4
Bastien CHOPARD
Centre Universitaire d’Informatique (CUI), Université de Genève
24, Rue du Général Dufour, CH-1211 Genève 4
Paolo DE LOS RIOS
Département de Physique Théorique, Université de Lausanne
BSP, CH-1015 Lausanne
Marco TOMASSINI
Institut d’informatique, Université de Lausanne
Collège Propédeutique, CH-1015 Lausanne
Robin GRAS
Institut Suisse de Bioinformatique, Université de Genève
1, rue Michel Servet, CH-1211 Genève 4
Jean-Louis DENEUBOURG
Centre d’étude des phénomènes non linéaires et des systèmes complexes
Université Libre de Bruxelles
CP 231, Boulevard du Triomphe, B-1050 Bruxelles
Winter semester 2001-2002
TROISIÈME CYCLE DE LA PHYSIQUE EN SUISSE ROMANDE
Universities of Berne, Fribourg, Genève, Lausanne and Neuchâtel, and the École Polytechnique Fédérale de Lausanne
Archives and sales of lecture notes:
M. D. Reymond, Université de Lausanne
Bâtiment des sciences physiques
CH-1015 LAUSANNE-DORIGNY
Preface
Many studies, in the exact sciences as well as in the humanities, are confronted with problems that share the feature of being "complex problems". Although it is difficult to give a general and unambiguous definition of complexity, the term takes on its full meaning in the context of concrete problems. In each field where complexity is present, specific tools have been developed to tackle the problems posed.
The idea of this course was to bring together a few specialists from fields in which complexity plays an important role, in order to try to identify common concepts and tools suited to the study of complexity.
This is an ambitious project, and I wish to thank all the lecturers for the efforts they made to convey the complex aspects of their respective fields to non-specialists.
Given the volume and content of the material contributed by the lecturers, it quickly became clear that writing lecture notes in the usual paper format was not appropriate and that a CD-ROM was the natural solution.
The production of this CD-ROM was made possible by the computing skills and the considerable work of Mr. François Coppex, the course assistant. I particularly wish to thank him for his commitment to this project.
Prof. Michel Droz
Table of contents
Part I: Michel Droz
1 Introduction to complexity
  1.1 What is complexity?
  1.2 Organization of the course
  1.3 Introduction to some approaches to complexity
    1.3.1 Generalities
    1.3.2 Dynamical systems
    1.3.3 Extended systems
    1.3.4 Equilibrium statistical mechanics
    1.3.5 The concept of scaling
    1.3.6 Non-equilibrium statistical mechanics
  1.4 Bibliography
Part II: Bastien Chopard
2 Cellular Automata and Lattice Boltzmann Techniques
  2.1 The Cellular Automata approach
    2.1.1 Introduction
    2.1.2 Definition
    2.1.3 CA as a model of the physical world
    2.1.4 Limitations, advantage, drawbacks and Extension
  2.2 Examples of simple rules
    2.2.1 A growth model
    2.2.2 Ising-like dynamics
    2.2.3 Competition models and cell differentiation
    2.2.4 Traffic models
    2.2.5 A simple gas: the HPP model
    2.2.6 Random walk
    2.2.7 The traveling ant
    2.2.8 Population dynamics
  2.3 From micro-physics to macro-physics
    2.3.1 The FHP model
    2.3.2 Microdynamics
    2.3.3 The macroscopic variables
    2.3.4 Multiscale Chapman-Enskog expansion
    2.3.5 Chapman-Enskog procedure
    2.3.6 Balance equations
    2.3.7 Local equilibrium
    2.3.8 Correction to local equilibrium
    2.3.9 The Navier-Stokes equation
    2.3.10 A two-phase CA fluids
  2.4 The Lattice Boltzmann Method
    2.4.1 From Boolean to real-valued fields
    2.4.2 BGK models
    2.4.3 Lattice Boltzmann fluids
    2.4.4 The Navier-Stokes equation
    2.4.5 A short summary of LB models
    2.4.6 Subgrid models
    2.4.7 Pattern formation in snow transport
  2.5 Reaction-Diffusion systems
    2.5.1 Excitable media
    2.5.2 Surface reaction models
    2.5.3 The reaction-diffusion rule
    2.5.4 The macroscopic behavior
    2.5.5 Liesegang patterns
  2.6 Multiparticle models
    2.6.1 Multiparticle diffusion model
    2.6.2 Numerical implementation
    2.6.3 The reaction algorithm
    2.6.4 Turing patterns
    2.6.5 A multiparticle fluid
  2.7 Wave model and fracture simulation
    2.7.1 The wave model
    2.7.2 Application to mobile communications
    2.7.3 Modeling Solid Body
    2.7.4 Fracture
    2.7.5 Wave localization
  2.8 Bibliography
Part III: Paolo De Los Rios
3 Self-Organized Critical Systems
  3.1 Spontaneous occurrence of scaling in nature
  3.2 The Bak-Sneppen model
    3.2.1 Description of the model
    3.2.2 Relevant quantities and their scaling behavior
    3.2.3 Numerical results in d = 1
    3.2.4 The Random Nearest Neighbor model: a mean-field limit
    3.2.5 An exact Master Equation
    3.2.6 Toward a solution: the Maslov equation
  3.3 Conclusions, and perspectives for Self-Organized Criticality
  3.4 Bibliography
Part IV: Marco Tomassini
4 Evolving Cellular Automata
  4.1 What Are Cellular Automata?
  4.2 Formal Definitions
  4.3 Cellular automata as complex and computational systems
  4.4 Variations on the Original Model
    4.4.1 Non-Uniform Cellular Automata
    4.4.2 Non-Standard Architectures
    4.4.3 Asynchronous Cellular Automata
    4.4.4 Probabilistic Cellular Automata
  4.5 Artificial Evolution of Cellular Automata
  4.6 The Cellular Programming Algorithm
  4.7 Applications of Cellular Programming
    4.7.1 The Density Task
    4.7.2 The Synchronization Task
    4.7.3 The Ordering Task
    4.7.4 Random Number Generation
  4.8 Asynchronous Cellular Automata
  4.9 Fault Tolerance
    4.9.1 Evolving Uniform Cellular Automata
  4.10 Concluding Remarks
  4.11 Bibliography
5 Evolutionary Algorithms
  5.1 Introduction
  5.2 Genetic Algorithms
    5.2.1 The Metaphor
    5.2.2 Representation
    5.2.3 The Evolutionary Cycle
    5.2.4 A First Example
    5.2.5 A Second Example
  5.3 Theoretical Background of Genetic Algorithms
    5.3.1 Notation
    5.3.2 Main Theoretical Results
  5.4 Introduction to Classifier Systems
  5.5 Genetic Programming
  5.6 Evolution Strategies and Evolutionary Programming
    5.6.1 Evolution Strategies
    5.6.2 Evolutionary Programming
  5.7 Advanced Topics
    5.7.1 Selection Methods and Reproduction Strategies
    5.7.2 Specialized Representations and Genetic Operators
    5.7.3 Handling Constraints
    5.7.4 Hybrid Evolutionary Algorithms
    5.7.5 Co-evolution
  5.8 Bibliography
6 Artificial Neural Networks
  6.1 Introduction
  6.2 Artificial Neurons
  6.3 Networks of Artificial Neurons
  6.4 Hopfield Networks
  6.5 Neural Learning
  6.6 Supervised Learning
    6.6.1 Perceptron Learning Algorithm
    6.6.2 LMS Rule and Delta Rule
    6.6.3 Backpropagation Algorithm
    6.6.4 Applications of Supervised Learning Networks
  6.7 Unsupervised Learning
    6.7.1 Hebbian Learning
    6.7.2 Competitive Learning
    6.7.3 Self-Organizing Features Maps
  6.8 Fault Tolerance
  6.9 Artificial Neural Nets and Statistics
  6.10 Bibliography
7 Evolutionary Design of Artificial Neural Networks
  7.1 Introduction
  7.2 Evolving Weights in a Predefined Network Architecture
    7.2.1 Genetic Algorithms and Reinforcement Learning Networks
  7.3 Evolving Network Architectures
    7.3.1 Direct Encoding
    7.3.2 Indirect Encoding
  7.4 Evolution of Learning Rules
  7.5 ANN Input Data Selection
  7.6 Evolution of Neural Machines
    7.6.1 Evolvable Neural-Network Hardware
    7.6.2 Evolving Digital Brains
  7.7 A Case Study: Evolutionary Autonomous Robots
    7.7.1 Method
    7.7.2 A Sequential Task
  7.8 Bibliography
Part V: Robin Gras
8 Complexity in biology and bioinformatics
  8.1 Introduction
  8.2 The complexity of the living world
  8.3 Elementary molecules
  8.4 Genomic information
  8.5 Proteins
  8.6 Classical bioinformatics methods
  8.7 Application of bio-inspired systems in proteomics
Part VI: Jean-Louis Deneubourg
9 Introduction
  9.1 Introduction
  9.2 Bibliography
10 Optimality of Collective Choices: a Stochastic Approach
  10.1 Introduction
  10.2 The model
    10.2.1 Mean field formulation
    10.2.2 Principle and implementation of a Monte Carlo simulation
  10.3 Results
    10.3.1 The role of the colony size
    10.3.2 Optimization of the selection
    10.3.3 High trail-laying vs. being numerous
  10.4 Discussion
  10.5 Bibliography
11 Emerging Patterns and Food Recruitment in Ants: an Analytical Study
  11.1 Introduction
  11.2 The model
  11.3 The case of identical sources and trails
    11.3.1 The case j = 1
  11.4 The case j > 1
    11.4.1 Biological relevance
  11.5 One of the sources is different
    11.5.1 The case C2 = . . . = Cs = C2
    11.5.2 The case C3 = 1/C2
  11.6 Biological relevance
  11.7 Conclusions
  11.8 Bibliography
12 Dynamics of Nest Excavation and Nest Size Regulation of Lasius Niger (Hymenoptera: Formicidae)
  12.1 Introduction
  12.2 Material and Methods
  12.3 Experimental procedures
    12.3.1 Dynamics of digging by groups of different sizes
    12.3.2 Effect of population increases on nest volume
    12.3.3 Examination of factors responsible for the digging dynamics
  12.4 Results
    12.4.1 Dynamics of digging
    12.4.2 Effect of population increases on nest volume
    12.4.3 Factors responsible of the digging dynamics
  12.5 Discussion
  12.6 Bibliography
13 Spatial Patterns in Ant Colonies
  13.1 Introduction
  13.2 Methods
    13.2.1 Colony Collection and Ant Maintenance
    13.2.2 Experiments
  13.3 Results
    13.3.1 Clustering behavior: collective and individual levels
    13.3.2 Model description
  13.4 Discussion
  13.5 Bibliography
Part VII: Michel Droz
14 Conclusion
  14.1 Synthesis
  14.2 Bibliography
Part One
Introduction to complexity
Michel Droz
Département de Physique Théorique, Université de Genève
24, quai Ernest Ansermet, CH-1211 Genève 4
Chapter 1
Introduction to complexity
1.1 What is complexity?
In recent years, various scientific fields have been confronted with a new class of problems: the study of what is called complexity, or the study of complex systems. Such systems are encountered in very different fields, including physics, chemistry, biology, the behavior of organized societies, and economics. The question then arises whether these fields have points in common and, in particular, whether unifying concepts can be identified. Some analysis methods developed in one particular context may prove useful in others.
The aim of this course is to study various aspects of complexity and to try to identify unifying concepts and tools common to the different fields in which complexity appears.
Although it is not possible to give a precise definition of complexity, two broad categories of complex systems can nevertheless be distinguished.
The first comprises systems that appear simple and are characterized by a very small number of degrees of freedom (see Section 1.3.2 below). Complexity in this case is closely tied to the nonlinear character of the equations governing the dynamics of the system. As a result, although the dynamics is completely known, the evolution of the system can appear chaotic: it depends strongly on the initial conditions. Moreover, when certain parameters of the dynamics are varied, the evolution can change drastically because of the appearance of bifurcations, limit cycles or strange attractors. The study of this type of complexity has grown considerably with the development of the theory of dynamical systems [1]. A natural extension of this case is to consider extended systems, for which the dynamical variables are local [2]. This type of complexity is encountered in many natural phenomena, for instance in hydrodynamics (hydrodynamic instabilities, turbulence) and in reaction-diffusion systems leading to the formation of spatio-temporal structures (Turing structures [3] or Liesegang structures [4], for example).
The second class of problems concerns systems with a very large number of degrees of freedom. However, not every system with a large number of degrees of freedom is complex. For example, an ideal gas made up of an astronomically large number of atoms shows no complex behavior: the properties of a small system do not differ much from those of a very large one [5]. Things are quite different if the atoms of the gas interact. Indeed, when the external parameters (temperature, pressure) are varied, the system can go from one phase to another (gas, liquid and solid phases). Near "continuous" (or second-order) transition points, the system exhibits large fluctuations on all length scales, from the ångström to the macroscopic size of the system. This multi-scale character is responsible for the "universality" properties of these systems [6]. Here, complexity thus appears at the level of equilibrium properties.
Another example of a system with many degrees of freedom is provided by systems of interacting agents on a lattice. Each site of the lattice, which is often regular and of dimension d, hosts an agent (individual, particle, etc.) that typically interacts with the other agents in a neighborhood of some extent. The dynamics can run in continuous or discrete time, and the sites can be updated sequentially or simultaneously [7]. Although the rules fixing the local dynamics can be very simple, the behavior of the system at a "coarse-grained" scale can be very rich. This is one of the hallmarks of complex systems: the collective behavior of the system is vastly richer than the individual behavior of the agents that compose it.
Two examples are the cellular automaton known as the "Game of Life" [8] and Resnick's termites [9].
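As an illustration of how little is needed to specify such lattice dynamics, here is a minimal sketch of one simultaneous (synchronous) update of the Game of Life. The function name `life_step`, the representation of the state as a set of live sites and the periodic boundary conditions are choices made for this example only; they are not taken from the course material.

```python
def life_step(alive, width, height):
    """One synchronous update of the Game of Life on a periodic grid.

    `alive` is a set of (row, col) pairs. All sites are updated from the
    same old configuration, i.e. the update is simultaneous.
    Assumes width >= 3 and height >= 3 so that the 8 neighbors are distinct.
    """
    # Count, for every site, how many of its 8 neighbors are alive.
    counts = {}
    for (r, c) in alive:
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr or dc:
                    key = ((r + dr) % height, (c + dc) % width)
                    counts[key] = counts.get(key, 0) + 1
    # Birth with exactly 3 live neighbors, survival with 2 or 3.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in alive)}


if __name__ == "__main__":
    # A "glider": a five-cell pattern that travels across the lattice.
    state = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}
    for generation in range(5):
        print(generation, sorted(state))
        state = life_step(state, 8, 8)
```

The local rule fits in two lines, yet the glider is a coherent object that moves diagonally by one site every four generations, a collective behavior that is not apparent from the rule itself. A sequential update would instead modify the sites one at a time, each seeing the changes already made to its neighbors.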
These examples show how difficult it is to quantify complexity. One proposed measure treats the problem in algorithmic terms: the complexity of a problem is associated with the size of the smallest "program" that solves it. One may ask whether such a measure is reasonable, given that the Game of Life mentioned above, whose computer code is only a few lines long, is in fact a universal computer.
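The size of the smallest program is not computable in general, but the length of a compressed description gives a crude upper bound on it. The sketch below, which is only an illustration and not a method used in the course, compares the zlib-compressed sizes of a periodic byte sequence and of a pseudo-random one. The helper names `compressed_size`, `regular_sequence` and `pseudo_random_sequence` are hypothetical.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (maximum compression level)."""
    return len(zlib.compress(data, 9))


def regular_sequence(n: int) -> bytes:
    """A periodic sequence: fully described by "repeat b'01' n/2 times"."""
    return b"01" * (n // 2)


def pseudo_random_sequence(n: int, seed: int = 0) -> bytes:
    """n pseudo-random bytes from a seeded generator (reproducible)."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))


if __name__ == "__main__":
    n = 10_000
    print("periodic     :", compressed_size(regular_sequence(n)), "bytes")
    print("pseudo-random:", compressed_size(pseudo_random_sequence(n)), "bytes")
```

The periodic sequence compresses to a tiny fraction of its length, while the pseudo-random one is essentially incompressible for zlib. Yet the pseudo-random sequence is itself produced by a program of a few lines (a generator and a seed), so its true algorithmic complexity is small. This is the same difficulty as the one raised above for the Game of Life: a short program can generate behavior that looks very complex.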
1.2 Organization of the course
As we have seen, complexity appears in a very large number of fields [10,11]. A one-semester course therefore cannot cover all facets of the problem. The topics treated in this course are the following:
1. The course begins with the present introduction.
2. In the second part, Bastien Chopard will discuss the cellular automata approach.
One possible way to better understand, predict and control a complex system is to model the process at a scale where its various constituents behave in a tractable way, and then to carry out a numerical simulation of the resulting model.
Cellular automata offer a suitable framework for this. They are a mathematical abstraction of the physical world in which many real processes can be represented at a mesoscopic scale, with a level of simplicity sufficient for efficient numerical simulations. First, the concept of a cellular automaton will be described and simple examples will be given: the parity rule, majority growth rules, Wolfram's rules, the Game of Life, Langton's ant and a self-reproducing automaton. We will see how the mesoscopic world of a cellular automaton retains many features of the real world, such as the ability to create rich spatio-temporal structures with generic properties. Second, these examples will be used to gain a better understanding of the notions of complexity and of their links with universal computation, unpredictability and the emergence of collective behavior. We will also discuss the fact that macroscopic behavior is essentially governed by the conservation laws and symmetries of the microscopic world, which justifies the use of cellular automata to represent the real world. Finally, examples of specific applications will illustrate how cellular automata can be used to study concrete problems. Various models will be discussed, such as cellular automata describing cell differentiation in biology, certain forms of social competition, and road traffic. An extension of this approach to the modeling of complex fluids will also be presented, together with a brief description of the mathematical tools that bridge the virtual universe of the automaton and the physical world we seek to describe.
3. In the third part, Paolo De Los Rios will discuss self-organized critical systems. The various concepts will be introduced through a detailed analysis of the Bak-Sneppen model, a simple model of extremal dynamics. The general problem of the existence in Nature of systems with scaling properties ("critical" power-law behavior) will be discussed.
4. In the fourth part, Marco Tomassini will address complexity in two biologically inspired systems: artificial neural networks and evolutionary algorithms. After a presentation of the basic concepts, theoretical models of these complex adaptive systems will be presented, followed by typical applications. The cooperative and hybrid aspects of these systems will be highlighted, in particular the interaction between evolution and artificial learning and the use of evolutionary algorithms to search for cellular automata rules that can perform given tasks.
5. In the fifth part, Robin Gras will deal with complex adaptive systems applied to bioinformatics. The analysis of genetic data includes the automatic detection of genes, the discovery of biological motifs, the identification and characterization of proteins, and the study of protein expression levels and of the multiple interactions between proteins. Because of the large number of factors that must be taken into account simultaneously, this analysis is very complex and requires specific algorithms capable of producing good solutions despite the combinatorial explosion and the dynamic nature of the problems. It can then be worthwhile to follow an approach based on complex adaptive systems, which use dynamic combinatorial methods to explore the search space efficiently and rely on the interaction of simple functionalities to produce complex emergent behavior. The methods concerned are genetic algorithms and genetic programming.
6. In the sixth part, Jean-Louis Deneubourg will discuss complexity and self-organization in social insects. Animal societies, and insect societies in particular, provide interesting models for studying the emergence of structures and decisions in multi-agent systems whose agents have a certain autonomy.
Two extreme mechanisms are involved in the genesis of these collective responses. The first emphasizes the abilities of the individuals: each situation is recognized and triggers a specific response. The second relies on the dynamics of the actions that lead to the collective response: there is emergence insofar as the animals follow no pre-established plan.
Using various examples taken from the biology of social insects (food collection, group formation and construction), we will discuss the link between the behavioral rules of the individuals, the constraints imposed by the environment and the response of the group. We will present the modeling methods (equations, simulations) used in the field and try to show how laboratory work and theory inform each other.
In these lectures we will try to present the points of view of the different disciplines involved: biology, chemistry and nonlinear physics. On the theoretical side, extensive use will be made of the methods and notions of nonlinear science.
7. Finally, in the last part I will try to give a synthesis and to identify the unifying concepts in the study of complex systems.
1.3 Introduction to some approaches to complexity
1.3.1 Generalities
So as not to lose from the outset those students who have no background in physics, we thought it useful to introduce a few notions and methods commonly used by physicists to describe complex systems. This introduction is of course not exhaustive, but it refreshes certain notions and points to appropriate references.
1.3.2 Dynamical systems
We study purely deterministic systems whose dynamics is given by a set of differential equations of the form
\[ \dot{x}_i = f_i(x_1, x_2, \ldots, x_n) \tag{1.1} \]
where $\dot{x}$ denotes the derivative of $x$ with respect to time and $n$ is the number of degrees of freedom of the system. The phase space is therefore $\mathbb{R}^n$. The functions $f_i$ are assumed to be independent of time.¹
It is sometimes simpler to consider dynamics in discrete time, for which the differential equations are replaced by maps of the form
\[ x_i^{k+1} = f_i(x_1^k, x_2^k, \ldots, x_n^k) \tag{1.2} \]
where the superscript $k$ labels the discrete time step.
It is important to distinguish between conservative and dissipative dynamics. Let $D$ be a domain of phase space, and consider how the volume $V(D)$ of this domain varies during the evolution. If $V(D)$ does not change in time, the system is conservative (Liouville's theorem); if the system is dissipative, $V(D)$ decreases during the evolution.
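This distinction can be made quantitative with a standard result, recalled here as a short aside rather than as part of the original notes. For a flow $\dot{x}_i = f_i(x_1, \ldots, x_n)$, the rate of change of the phase-space volume is the integral of the divergence of $f$ over the domain. The damped oscillator used as an example, with damping coefficient $\gamma$, is an illustration chosen for this note.

```latex
% Volume change of a phase-space domain D(t) transported by the flow \dot{x} = f(x)
\frac{dV(D)}{dt} = \int_{D(t)} \nabla \cdot f \; d^n x ,
\qquad
\nabla \cdot f = \sum_{i=1}^{n} \frac{\partial f_i}{\partial x_i} .

% Conservative: \nabla \cdot f = 0 everywhere, so V(D) is constant.
% Dissipative:  \nabla \cdot f < 0, so V(D) decreases.

% Example (assumed for illustration): damped harmonic oscillator with \gamma > 0
\dot{x}_1 = x_2 , \qquad \dot{x}_2 = -x_1 - \gamma x_2
\quad \Longrightarrow \quad
\nabla \cdot f = -\gamma , \qquad V(t) = V(0)\, e^{-\gamma t} .
```

For $\gamma = 0$ the oscillator is Hamiltonian and the volume is conserved; for $\gamma > 0$ any initial domain shrinks exponentially, here toward the fixed point at the origin.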
Starting from a given initial state, the system can evolve toward a single solution (fixed point), a limit cycle or a more complicated "attractor". These situations and their consequences are easily illustrated, in the discrete case, by the logistic map, defined as follows [1]:
\[ x_{n+1} = \mu x_n (1 - x_n), \qquad x_n \in [0,1] \tag{1.3} \]
Footnote 1. More general cases can be considered. If, for example, the functions $f_i$ depend periodically on time with frequency $\omega$, one introduces a new variable $x_{n+1}$ such that $\dot{x}_{n+1} = \omega$.
FIG. 1.1: Evolution of the logistic map for µ = 2
FIG. 1.2: Evolution of the logistic map for µ = 3.3
FIG. 1.3: Evolution of the logistic map for µ = 3.53
For a fixed value of $\mu$ and a given initial condition $x_0$, the successive values taken by $x$ are easily visualized with the geometric constructions of Figures 1.1 to 1.4. The behavior can be very different for different values of $\mu$. Figure 1.1 illustrates the case $\mu = 2$ and $x_0 = 0.2$: under iteration, the fixed point $0.5$ is reached (this holds for every $x_0$ with $0 < x_0 < 1$). If $\mu = 3.3$ (Figure 1.2), the system oscillates between $x \approx 0.48$ and $x \approx 0.82$, and $x_{n+2} = x_n$ (period doubling). As $\mu$ increases, further period doublings occur. Figure 1.3 shows the case $\mu = 3.53$, for which $x_{n+4} = x_n$. This period-doubling mechanism leads to chaos. Figure 1.4 shows the case $\mu = 3.9$: the values taken by $x_n$ are no longer restricted to a few particular values, and the behavior of the system is chaotic.
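The four regimes described above can be reproduced numerically. The following is a minimal sketch, not part of the original notes: the function names, the transient length and the tolerance used to count distinct values are illustrative assumptions.

```python
def logistic_orbit(mu, x0, n_transient=1000, n_keep=8):
    """Iterate x_{n+1} = mu * x_n * (1 - x_n); return n_keep values after the transient."""
    x = x0
    for _ in range(n_transient):      # discard the transient (length is an assumption)
        x = mu * x * (1.0 - x)
    orbit = []
    for _ in range(n_keep):
        x = mu * x * (1.0 - x)
        orbit.append(x)
    return orbit


def count_distinct(values, tol=1e-6):
    """Crude period estimate: number of values that differ by more than tol."""
    distinct = []
    for v in values:
        if all(abs(v - d) > tol for d in distinct):
            distinct.append(v)
    return len(distinct)


if __name__ == "__main__":
    for mu in (2.0, 3.3, 3.53, 3.9):
        orbit = logistic_orbit(mu, x0=0.2)
        print(mu, count_distinct(orbit), [round(x, 4) for x in orbit[:4]])
```

For µ = 2, 3.3 and 3.53 the orbit settles on 1, 2 and 4 distinct values respectively, while for µ = 3.9 no short cycle is found among the retained values.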
This period-doubling mechanism is interesting because it can be characterized by universal numbers which do not depend (within certain limits) on the nature of the map.
Let $\mu_k$, $k = 1, 2, \ldots$, be the successive values of $\mu$ at which a bifurcation occurs. Then
$$\lim_{k\to\infty} \frac{\mu_k - \mu_{k-1}}{\mu_{k+1} - \mu_k} = \delta = 4.6692\ldots \qquad (1.4)$$
$\delta$ is the Feigenbaum number. This relation is a universal property of maps that have a quadratic maximum and reach chaos through period doubling. One can also show that an infinite number of bifurcations occurs as $\mu$ approaches the value $\mu_\infty \approx 3.5699$. The bifurcation diagram is given in Figure 1.5.
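As a rough illustration of the limit (1.4), one can form the ratios from the first few bifurcation values of the logistic map. This is only a sketch: $\mu_1 = 3$ and $\mu_2 = 1 + \sqrt{6}$ are exact, whereas the next two thresholds are rounded values quoted from the standard literature (an assumption, they are not given in these notes).

```python
import math

# Onsets of period 2, 4, 8 and 16 for the logistic map.
# The first two are exact; the last two are rounded literature values (assumption).
MU = [3.0, 1.0 + math.sqrt(6.0), 3.544090, 3.564407]


def feigenbaum_ratios(mu):
    """Successive ratios (mu_k - mu_{k-1}) / (mu_{k+1} - mu_k)."""
    return [(mu[k] - mu[k - 1]) / (mu[k + 1] - mu[k]) for k in range(1, len(mu) - 1)]


if __name__ == "__main__":
    print(feigenbaum_ratios(MU))
```

With only four thresholds the ratios are already within a few percent of $\delta$; the convergence is fast, which is why the constant is easy to estimate numerically.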
FIG. 1.4: Evolution of the logistic map for µ = 3.9
A notion widely used to describe chaotic behavior is that of the Liapunov exponent. Consider the map
$$x_{n+1} = f(x_n) \qquad (1.5)$$
and look at how two neighboring initial conditions, separated by a small distance $\epsilon$, behave after $n$ steps of evolution:
$$|f^n(x + \epsilon) - f^n(x)| \sim \epsilon \exp(n\lambda) \qquad (1.6)$$
where $f^n$ denotes the $n$-th iterate of $f$ and $\lambda$ is the Liapunov exponent.
If $\lambda < 0$, two initially close trajectories converge during the evolution. If, on the other hand, $\lambda > 0$, the two trajectories diverge: the dynamics is sensitive to initial conditions and the system is chaotic.
In the limit $n \to \infty$, one can show that
$$\lambda = \lim_{n\to\infty} \frac{1}{n} \sum_{i=0}^{n-1} \ln |f'(x_i)| \qquad (1.7)$$
The Liapunov exponent thus characterizes the rate of separation per iteration, averaged over the trajectory (see footnote 2).
The Liapunov exponent of the logistic map is shown in Figure 1.6.
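Formula (1.7) translates directly into a trajectory average of $\ln|f'(x_i)|$ with $f'(x) = \mu(1 - 2x)$ for the logistic map. The sketch below assumes a finite number of steps in place of the limit; the function name, the transient length and the guard against $\ln 0$ are illustrative choices.

```python
import math


def lyapunov_logistic(mu, x0=0.2, n_transient=1000, n_steps=100_000):
    """Finite-n estimate of the Liapunov exponent of x -> mu * x * (1 - x)."""
    x = x0
    for _ in range(n_transient):                     # let the orbit reach the attractor
        x = mu * x * (1.0 - x)
    total = 0.0
    for _ in range(n_steps):
        derivative = abs(mu * (1.0 - 2.0 * x))       # |f'(x_i)|
        total += math.log(max(derivative, 1e-300))   # guard against log(0)
        x = mu * x * (1.0 - x)
    return total / n_steps


if __name__ == "__main__":
    for mu in (2.5, 3.3, 3.9):
        print(mu, lyapunov_logistic(mu))
```

For µ = 2.5 the orbit sits on the fixed point $x^* = 0.6$, where $|f'(x^*)| = 0.5$, so the estimate is $\ln 0.5 < 0$; for µ = 3.9 the estimate is positive, in line with the chaotic behavior described above.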
Footnote 2. For a map in $m$ dimensions there are $m$ Liapunov exponents. An initial volume $V_0$ in phase space becomes, after $n$ iterations, $V_n = V_0 \exp[n(\lambda_1 + \lambda_2 + \ldots + \lambda_m)]$. For a dissipative system the sum of the exponents must be negative, and if the system is chaotic at least one of the exponents must be positive.
FIG. 1.5: Bifurcation diagram of the logistic map.
FIG. 1.6: Liapunov exponent of the logistic map.
It is natural to want to characterize the degree of chaoticity in terms of an entropy. Consider a statistical system whose possible “states” (labeled by the index $i$, $i = 1, \ldots, N$) are realized with probability $p_i$. The corresponding entropy is
$$S = -\sum_{i=1}^{N} p_i \ln p_i \qquad (1.8)$$
FIG. 1.7: Entropy of the logistic map.
This concept can then be applied to the logistic map. Divide the unit interval into $N$ equal parts. If the system is in a non-chaotic state, the possible values of $x_n$ fall into only a very small number of intervals and the entropy is small. If, on the other hand, the system is in a chaotic state, the entropy is larger, reaching the maximal value $\ln N$ for a uniform distribution. Figure 1.7 shows the behavior of the entropy of the logistic map for $N = 40$.
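The binning procedure just described can be sketched as follows. The probabilities $p_i$ are estimated as visit frequencies of the $N$ intervals along one trajectory; the sample size, the transient and the function name are assumptions made for the illustration.

```python
import math


def logistic_entropy(mu, n_bins=40, x0=0.2, n_transient=1000, n_samples=100_000):
    """Entropy (1.8) of the visit frequencies of n_bins equal sub-intervals of [0, 1]."""
    x = x0
    for _ in range(n_transient):
        x = mu * x * (1.0 - x)
    counts = [0] * n_bins
    for _ in range(n_samples):
        x = mu * x * (1.0 - x)
        counts[min(int(x * n_bins), n_bins - 1)] += 1   # x = 1 goes into the last bin
    return -sum((c / n_samples) * math.log(c / n_samples) for c in counts if c > 0)


if __name__ == "__main__":
    for mu in (3.3, 3.53, 3.9):
        print(mu, logistic_entropy(mu), "max:", math.log(40))
```

On the period-2 cycle at µ = 3.3 the two points fall into two different intervals with equal weight, so the entropy is $\ln 2$; in the chaotic regime it is larger but stays below $\ln 40$.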
1.3.3 Extended systems
Another class of systems that can exhibit complex behavior is that of so-called “extended” systems. The physical quantities describing the properties of the system are then local. They may depend continuously on space, or in a discrete manner if the system lives on a lattice.
As an illustration, consider a reaction-diffusion system made of two chemical species, A and B. The two constituents diffuse (in a gel, for example) and react chemically.
Let $A(\mathbf{r},t)$ and $B(\mathbf{r},t)$ be the respective local densities. In the long-time limit, this system can reach a homogeneous stationary state with densities $A_0$ and $B_0$. The question is under which conditions this homogeneous state is stable. In particular, can diffusion act as a destabilizing factor, leading to an inhomogeneous stationary state? Turing showed in a famous article on morphogenesis that this is possible [3].
Let us see how this phenomenon comes about.
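The reaction-diffusion equations read
$$\partial_t A(\mathbf{r},t) = D_A \nabla^2 A(\mathbf{r},t) + F_1(A,B) \qquad (1.9)$$
$$\partial_t B(\mathbf{r},t) = D_B \nabla^2 B(\mathbf{r},t) + F_2(A,B) \qquad (1.10)$$
where $D_A$ and $D_B$ are the respective diffusion constants and $F_1$, $F_2$ are the (nonlinear) functions describing the reactions.
We carry out a linear stability analysis. We write
$$A(\mathbf{r},t) = A_0 + a(\mathbf{r},t), \qquad B(\mathbf{r},t) = B_0 + b(\mathbf{r},t) \qquad (1.11)$$
where $a$ and $b$ are small perturbations, and we linearize the equations of motion in $a$ and $b$. Hence
$$\partial_t a(\mathbf{r},t) = D_A \nabla^2 a(\mathbf{r},t) + L_{AA}\, a(\mathbf{r},t) + L_{AB}\, b(\mathbf{r},t) \qquad (1.12)$$
$$\partial_t b(\mathbf{r},t) = D_B \nabla^2 b(\mathbf{r},t) + L_{BB}\, b(\mathbf{r},t) + L_{BA}\, a(\mathbf{r},t) \qquad (1.13)$$
where
$$L_{IJ} = \left.\frac{\partial F_I}{\partial C_J}\right|_{A_0,B_0} \qquad (1.14)$$
$C_J$ being one of the two concentrations ($A$ or $B$), with $F_A \equiv F_1$ and $F_B \equiv F_2$.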
For an ideal reactor that prevents any spatial inhomogeneity, we examine the stability of the homogeneous solution with respect to perturbations of the form
$$a(t) = a_0 \exp(\omega_0 t), \qquad b(t) = b_0 \exp(\omega_0 t) \qquad (1.15)$$
The stability conditions are then
$$L_{AA} + L_{BB} < 0 \qquad (1.16)$$
and
$$L_{AA}L_{BB} - L_{AB}L_{BA} = \Delta(0) > 0 \qquad (1.17)$$
Let us now return to the problem of spatial inhomogeneities (in an infinitely extended system) by studying the stability of perturbations of the type
$$a(\mathbf{r},t) = \sum_{\mathbf{k}} \exp(\omega_k t) \exp(i\mathbf{k}\cdot\mathbf{r})\, \tilde{a}(\mathbf{k}) \qquad (1.18)$$
$$b(\mathbf{r},t) = \sum_{\mathbf{k}} \exp(\omega_k t) \exp(i\mathbf{k}\cdot\mathbf{r})\, \tilde{b}(\mathbf{k}) \qquad (1.19)$$
Substituting these perturbations into the linearized equations, one can extract the possible eigenvalues $\omega_k^{\pm}$ for the allowed growth rates. If a solution has a positive real part, the associated perturbation grows exponentially in time. This is the signature of the instability. A detailed calculation shows that the marginal stability condition gives, for the wave vector $k_c$ associated with the first mode that becomes unstable in the system,
$$k_c^2 = \frac{D_A L_{BB} + D_B L_{AA}}{2 D_A D_B} \qquad (1.20)$$
which is the value of $k^2$ that minimizes the determinant $\Delta(k) = (L_{AA} - D_A k^2)(L_{BB} - D_B k^2) - L_{AB}L_{BA}$ of the linearized problem. The wavelength of the first unstable mode is therefore an intrinsic length, independent of the geometry of the system. Such an instability is called a Turing instability.
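The stability analysis above amounts to finding, for each wavenumber, the eigenvalues of the 2×2 matrix obtained by replacing $\nabla^2$ with $-k^2$ in the linearized equations (1.12) and (1.13). The sketch below computes the largest real part of $\omega_k^{\pm}$. The numerical values of the coefficients $L_{IJ}$ and of the diffusion constants are hypothetical, chosen only so that conditions (1.16) and (1.17) hold.

```python
import cmath


def growth_rate(k2, L, D_A, D_B):
    """Largest real part of the eigenvalues omega_k for squared wavenumber k2."""
    (L_AA, L_AB), (L_BA, L_BB) = L
    m_AA = L_AA - D_A * k2            # diffusion shifts the diagonal by -D k^2
    m_BB = L_BB - D_B * k2
    trace = m_AA + m_BB
    det = m_AA * m_BB - L_AB * L_BA
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    return max(((trace + disc) / 2.0).real, ((trace - disc) / 2.0).real)


def k2_min_det(L, D_A, D_B):
    """Value of k^2 that minimizes the determinant of the linearized problem."""
    (L_AA, _), (_, L_BB) = L
    return (D_A * L_BB + D_B * L_AA) / (2.0 * D_A * D_B)


# Hypothetical activator-inhibitor coefficients: trace < 0 and determinant > 0.
L_EXAMPLE = ((1.0, -1.0), (2.0, -1.5))

if __name__ == "__main__":
    for D_B in (1.0, 10.0):
        rates = [growth_rate(0.01 * i, L_EXAMPLE, 1.0, D_B) for i in range(200)]
        print("D_B =", D_B, "max growth rate =", max(rates))
```

With equal diffusion constants every mode decays, whereas a sufficiently fast-diffusing inhibitor ($D_B = 10 D_A$ here) makes a band of modes with finite wavenumber grow, while the $k = 0$ mode remains stable.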
The Turing instability results from a competition between two mechanisms. On the one hand, inhomogeneities (fluctuations) are amplified by autocatalytic effects; on the other hand, they are suppressed beyond a certain distance by the diffusion of the inhibitor. Complex structures can emerge from the competition between these two mechanisms (activation, inhibition). These instabilities play an important role in biology, and many examples are discussed in the book by Murray [12].
Another example of an extended system on a lattice is provided by Resnick's termite model [9]. In this model, space and time are discretized. The model is defined as follows. Agents (termites) are randomly distributed on the nodes of a square lattice (hence in 2 dimensions). Wood chips can be deposited on each site. Each termite performs a random walk on the lattice, moving by one lattice spacing per time step. In addition, each termite may or may not carry a wood chip. The evolution rules are as follows.
Each termite performs a random walk until it encounters a wood chip on a site.
1. If it carries nothing, it picks up a chip and continues its random walk until it encounters a new site with chips.
2. If it is already loaded, it drops its chip on the site and continues its random walk “empty”, until it encounters a new site with chips.
What happens in such a system? The situation at different times is illustrated in Figure 1.8. The chips, initially distributed at random, are progressively gathered into piles.
Although the evolution rules are very simple, the system develops a complex behavior, characterized by the emergence of order in the system.
Many other examples of discrete systems exhibiting complex behavior will be studied in the chapter on cellular automata [7].
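The two rules above are simple enough to be simulated in a few lines. The following sketch is one possible reading of the model, under stated assumptions that the text does not fix: several chips may share a site, the lattice has periodic boundaries, termites are updated one after the other within each time step, and the lattice size, numbers of chips and termites, and random seed are arbitrary.

```python
import random


def simulate_termites(size=30, n_chips=200, n_termites=20, n_sweeps=5000, seed=0):
    """Return (occupied sites before, occupied sites after, total number of chips)."""
    rng = random.Random(seed)
    chips = [[0] * size for _ in range(size)]       # chips[x][y] = chips on that site
    for _ in range(n_chips):
        chips[rng.randrange(size)][rng.randrange(size)] += 1
    # each termite is [x, y, carrying]
    termites = [[rng.randrange(size), rng.randrange(size), False]
                for _ in range(n_termites)]
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]

    def occupied():
        return sum(1 for row in chips for c in row if c > 0)

    before = occupied()
    for _ in range(n_sweeps):
        for t in termites:
            dx, dy = rng.choice(moves)              # one random-walk step
            t[0] = (t[0] + dx) % size               # periodic boundaries (assumption)
            t[1] = (t[1] + dy) % size
            if chips[t[0]][t[1]] > 0:               # the site holds at least one chip
                if t[2]:
                    chips[t[0]][t[1]] += 1          # rule 2: drop the carried chip
                    t[2] = False
                else:
                    chips[t[0]][t[1]] -= 1          # rule 1: pick up a chip
                    t[2] = True
    total = sum(map(sum, chips)) + sum(1 for t in termites if t[2])
    return before, occupied(), total


if __name__ == "__main__":
    print(simulate_termites())
```

Because chips are only ever dropped on sites that already hold chips, a site that has been emptied can never be refilled: the number of occupied sites can only decrease, which is the mechanism behind the formation of piles.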
FIG. 1.8: Distribution of the chips at different times in Resnick's model.
1.3.4 Equilibrium statistical mechanics
The aim of equilibrium statistical mechanics is to explain, starting from a microscopic description, the macroscopic (or thermodynamic) properties of bodies made of a very large number of constituents. Starting from a Hamiltonian describing the dynamics of the elementary constituents, one deduces calculation rules from which the macroscopic properties can be derived. This theory, called ensemble theory [5], is probabilistic in nature. Nevertheless, for systems made of $N$ microscopic entities, the relative fluctuations of observables about their mean values are of order $N^{-1/2}$ and hence negligible in the limit $N \to \infty$.
The aim of this introduction is not to present all the concepts of statistical mechanics, but to describe qualitatively one type of complexity that appears in statistical systems at equilibrium: continuous phase transitions.
A well-known example of a phase transition is the ferromagnetic-paramagnetic transition. Typically, a magnetic crystal consists of a regular crystal lattice. On each lattice site sits an ion carrying a magnetic moment, or “spin”.
The simplest model is the Ising model, in which one assumes that each site carries a classical spin that can point in only two directions (along the $z$ axis, for example), $s_i = \pm 1$. Only spins on nearest-neighbor sites interact. In the presence of an external magnetic field $h$ there is a Zeeman-type interaction. The Ising Hamiltonian therefore reads
$$H = -J \sum_{\langle i,j \rangle} s_i s_j - h \sum_i s_i \qquad (1.21)$$
We assume that the coupling is ferromagnetic ($J > 0$); it follows that in the ground state all spins point in the same direction (up or down).
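To make the Hamiltonian (1.21) concrete, the sketch below evaluates it for a configuration on a small square lattice. The periodic boundary conditions, the lattice size and the function name are assumptions made for the illustration.

```python
def ising_energy(spins, J=1.0, h=0.0):
    """Energy (1.21) of a square-lattice spin configuration, periodic boundaries."""
    n = len(spins)
    energy = 0.0
    for i in range(n):
        for j in range(n):
            s = spins[i][j]
            # count each nearest-neighbor bond once: right and lower neighbors
            energy -= J * s * (spins[i][(j + 1) % n] + spins[(i + 1) % n][j])
            energy -= h * s                          # Zeeman term
    return energy


if __name__ == "__main__":
    up = [[1] * 4 for _ in range(4)]
    down = [[-1] * 4 for _ in range(4)]
    print(ising_energy(up), ising_energy(down), ising_energy(up, h=0.5))
```

In zero field the all-up and all-down configurations have the same energy, $-2JN$ for $N$ sites, which is the twofold degeneracy of the ground state; a field $h > 0$ lifts it in favor of the up state.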
A generalization of the Ising model consists in placing on each site a “classical” spin $\mathbf{s}_j$ that can point in any direction on the unit sphere. One then speaks of the classical Heisenberg model, whose Hamiltonian reads
$$H = -J \sum_{\langle i,j \rangle} \mathbf{s}_i \cdot \mathbf{s}_j - \mathbf{h} \cdot \sum_i \mathbf{s}_i \qquad (1.22)$$
Let us consider these models in zero magnetic field for now. We note that the Heisenberg Hamiltonian is invariant under a global rotation of all the spins, whereas the Ising Hamiltonian is invariant under a reversal of all the spins. Moreover, the spins are in contact with a heat bath (made up of the other degrees of freedom not contained in the Hamiltonian above) and change state in the course of time. What happens qualitatively in the system as a function of temperature?
i). At high temperature: thermal agitation is large. The system is in a paramagnetic phase. The spins point in all directions. The isothermal magnetic susceptibility follows a Curie law, i.e. $\chi \sim 1/T$.
ii). At low temperature: at zero temperature, the system minimizes its energy by aligning all its spins parallel to one another. The system is in the ferromagnetic phase. A spontaneous magnetization appears in a privileged direction, say $z$. The symmetry of the low-temperature phase is therefore lower than that of the high-temperature phase. One says that the symmetry is spontaneously broken. Yet the direction called $z$ is arbitrary. Thus the ground state is not unique: it is twofold degenerate in the Ising case and infinitely degenerate in the Heisenberg case. In the latter case, one can go continuously from one ground state to another without supplying energy. It follows that there exist elementary excitations whose energy spectrum $\omega(q) \to 0$ as the wave vector $q \to 0$ (uniform rotation of the system). These excitations, called Goldstone bosons, are in this example the well-known acoustic spin waves.
iii). Intermediate temperatures: as the temperature is raised from $T = 0$, the spins no longer all remain aligned in the same direction and the spontaneous magnetization decreases. Spin waves are excited in the Heisenberg case, and spins are flipped in the Ising case. The extent of the disorder can be characterized by a length $\xi_-$. In the Ising model, $\xi_-$ is the characteristic size of the domains in which the spins point opposite to the spontaneous magnetization. $\xi_-$ is called the correlation length of the low-temperature phase. As the temperature increases, $\xi_-$ increases until the critical temperature $T_c$ is reached, where the spontaneous magnetization vanishes and the correlation length diverges.
Conversely, starting from high temperatures, the situation is as follows. At very high temperature the spins are essentially free. As the temperature decreases, the spins align within domains whose characteristic size is $\xi_+$, the high-temperature correlation length. As the temperature is lowered, the disorder decreases. For $T \to T_c$, $\xi_+$ diverges.
The transition is marked by the appearance of a spontaneous magnetization. Moreover, near the transition, several physical quantities exhibit singular behavior characterized by power laws.
Note also that the magnetic susceptibility is related to the correlations through the fluctuation-dissipation theorem (FDT), which for the Heisenberg model reads
$$\chi(T) = \frac{1}{k_B T} \sum_j \left[ \frac{1}{3} \langle \mathbf{s}_0 \cdot \mathbf{s}_j \rangle - m^2 \right] \qquad (1.23)$$
where $m$ denotes the magnetization per spin.
Consider the FDT for $T > T_c$. For the sum to diverge, the system must have long-range correlations. Although the equilibrium magnetization is zero, there are large regions that carry a nonzero average magnetization over a time that is long on the microscopic scale. There are thus large fluctuations in the system.
For $T < T_c$ the situation is the same in terms of reduced correlation functions. In addition, $\langle s(r) s(0) \rangle \to \langle s(r) \rangle \langle s(0) \rangle \neq 0$ for $r \to \infty$. There is long-range order in the system, characterized by the order parameter, here the spontaneous magnetization.
Critical exponents, scaling laws and universality.
The critical exponents, in the general case of a second-order transition, are defined as follows. Let $\psi$ be the order parameter of the problem considered. Note that $\psi$ is not necessarily a scalar, but can be an object with several components (vector, tensor). We denote by $n$ the number of components of $\psi$. The external field, i.e. the variable thermodynamically conjugate to the order parameter, is denoted $h$.
Let $t = (T - T_c)/T_c$ be the reduced temperature; $t$ thus measures the relative distance from the critical point $T_c$. The following critical exponents are then introduced:
– A). Order-parameter exponent $\beta$:
For $t < 0$, $\psi \sim |t|^{\beta}$
– B). Correlation-length exponents $\nu$ and $\nu'$:
For $t > 0$, $\xi_+(t) \sim t^{-\nu}$
For $t < 0$, $\xi_-(t) \sim |t|^{-\nu'}$
– C). Exponents of the specific heat at constant field, $\alpha$ and $\alpha'$:
For $t > 0$, $c_h(t) \sim t^{-\alpha}$
For $t < 0$, $c_h(t) \sim |t|^{-\alpha'}$
– D). Exponents of the (longitudinal) susceptibility, $\gamma$ and $\gamma'$:
For $t > 0$, $\chi(t) \sim t^{-\gamma}$
For $t < 0$, $\chi(t) \sim |t|^{-\gamma'}$
– E). Exponent $\delta$ of the field dependence of the order parameter:
At the critical point itself, i.e. for $T = T_c$, $\psi \sim h^{1/\delta}$
– F). Exponent $\eta$ of the two-point correlation function:
The order parameter $\psi$ is in general the average of a field $\varphi$. The two-point correlation function is defined as
$$G(\mathbf{r}) = \langle \varphi(\mathbf{r}) \varphi(0) \rangle \qquad (1.24)$$
and its Fourier transform is
$$\tilde{G}(\mathbf{q}) = \int d\mathbf{R}\, \exp(i\mathbf{q}\cdot\mathbf{R})\, G(\mathbf{R}) \qquad (1.25)$$
At the critical point, i.e. at $T = T_c$, these functions behave as
$$G(r) \sim \frac{1}{r^{d-2+\eta}}, \qquad r \gg a \qquad (1.26)$$
and
$$\tilde{G}(q) \sim \frac{1}{q^{2-\eta}}, \qquad q \ll \frac{1}{a} \qquad (1.27)$$
where $a$ is the spacing of the underlying lattice.
There are thus 9 critical exponents characterizing the behavior near the critical point.
Values of the critical exponents.
The exponents have the following properties:
1. The critical temperature $T_c$ varies widely from one model to another. It depends on the details of the system.
2. The critical exponents, by contrast, exhibit remarkable universality properties.
a) They depend only on the dimension $d$ of the system considered and on the number $n$ of components of the order parameter (see footnote 3). One says that there is universality. Thus systems as different as a gas and a ferromagnet have the same critical exponents.
b) The exponents are “symmetric”, i.e. $\alpha = \alpha'$, $\gamma = \gamma'$ and $\nu = \nu'$.
c) The critical exponents satisfy particular relations called scaling laws, which read
$$R = \alpha + 2\beta + \gamma - 2 = 0 \qquad (1.28)$$
$$W = \gamma - \beta(\delta - 1) = 0 \qquad (1.29)$$
$$F = \gamma - \nu(2 - \eta) = 0 \qquad (1.30)$$
$$J = d\nu - (2 - \alpha) = 0 \qquad (1.31)$$
Only the last relation, called the hyperscaling relation, depends explicitly on the dimension $d$. These relations have an even higher degree of universality than the exponents themselves.
Given the symmetries and the scaling laws, only two of the nine critical exponents are independent.
Footnote 3. We assume that the interactions are short-ranged. Corrections appear if the interactions are long-ranged.
The presence of power-law behavior is a strong expression of the cooperative nature of the phenomenon.
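The scaling laws (1.28) to (1.31) are easy to check on known sets of exponents. The sketch below uses the exactly known exponents of the two-dimensional Ising model and the mean-field exponents (for which hyperscaling holds at $d = 4$). These values are standard results that are not quoted in these notes, and the function and variable names are illustrative.

```python
from fractions import Fraction


def scaling_residuals(alpha, beta, gamma, delta, nu, eta, d):
    """Left-hand sides R, W, F, J of the scaling laws (1.28)-(1.31)."""
    return {
        "R": alpha + 2 * beta + gamma - 2,
        "W": gamma - beta * (delta - 1),
        "F": gamma - nu * (2 - eta),
        "J": d * nu - (2 - alpha),
    }


# Exact exponents of the two-dimensional Ising model (standard result).
ISING_2D = dict(alpha=Fraction(0), beta=Fraction(1, 8), gamma=Fraction(7, 4),
                delta=Fraction(15), nu=Fraction(1), eta=Fraction(1, 4), d=2)
# Mean-field exponents; hyperscaling is satisfied only at d = 4.
MEAN_FIELD = dict(alpha=Fraction(0), beta=Fraction(1, 2), gamma=Fraction(1),
                  delta=Fraction(3), nu=Fraction(1, 2), eta=Fraction(0), d=4)

if __name__ == "__main__":
    print(scaling_residuals(**ISING_2D))
    print(scaling_residuals(**MEAN_FIELD))
```

Exact rational arithmetic is used so that the four residuals come out as exactly zero rather than as small floating-point numbers.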
For equilibrium phase transitions, some parameters of the problem must be finely tuned to reach the so-called “critical” regime in which scaling behavior is observed. In some cases, the system places itself in this critical regime on its own. One then speaks of self-organized critical systems; this topic will be developed in the third part of this course.
1.3.5 The concept of scaling
The existence of scaling laws is a phenomenon often encountered in Nature. Many important properties can follow from the existence of such laws. In this section we illustrate some of these properties.
A system has scaling properties if measurable quantities are related to one another by power laws. Let us mention a few examples:
1. In zoology: the parameters of many animals have been measured, for animals ranging from the smallest to the largest. Many scaling laws are found. For example, for mammals (from the mouse, of mass $m_{\mathrm{mouse}} \sim 10^{-2}$ kg, to the whale, $m_{\mathrm{whale}} \sim 10^{4}$ kg) experiment shows that:
a. The lung volume $V_p$ and the mass $m$ of the animal obey the relation
$$V_p \sim m^{p} \qquad (1.32)$$
with $p \sim 1.0$.
b. The mass of the skeleton $m_s$ and the mass $m$ of the animal obey the relation
$$m_s \sim m^{p} \qquad (1.33)$$
with $p \sim 1.13$.
2. In geology: the intensity of an earthquake is measured in terms of the dissipated energy $E$. The energies involved span several orders of magnitude and, over a large part of this range, the distribution of intensities $P(E)$ follows a power law
$$P(E) \sim E^{-s} \qquad (1.34)$$
where the exponent $s$ varies from region to region in the interval 1.8 to 2.2. Moreover, the number of aftershocks $N_r(t)$ following a major earthquake decreases with time according to Omori's law
$$N_r(t) \sim t^{-r} \qquad (1.35)$$
where the exponent $r$ lies in the interval 1.8 to 2.2.
3. In solid-state physics: when a solid is irradiated, pairs of defects are created in the crystal structure (Frenkel defects). Once created, these defects recombine and their density $c(t)$ decreases in time as
$$c(t) \sim t^{-3/2} \qquad (1.36)$$
for a three-dimensional system.
4. In hydrodynamics: a well-known result concerns the speed of waves in shallow water, whose depth $h$ is small relative to the wavelength $\lambda$ (waves reaching a beach, for example). If viscosity and surface-tension effects are neglected, the wave speed is $v = \sqrt{gh}$, where $g$ is the acceleration due to gravity. Hence
$$v \sim h^{1/2} \qquad (1.37)$$
5. In magnetism: the magnetization of a magnetic system at the critical temperature grows with the associated magnetic field (for small fields) as
$$m(T_c,h) \sim h^{1/\delta} \qquad (1.38)$$
It is then natural to ask under which circumstances such laws can be expected. Giving a general answer is not easy. However, we can see how such properties arise in simple cases.
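In practice, an exponent such as $p$ in (1.33) is read off as the slope of a straight line in a log-log plot. The sketch below recovers the exponent by least squares on synthetic, noise-free data generated with $p = 1.13$. The data and the prefactor 0.06 are invented for the illustration and are not measurements.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of log y = log c + p * log x; returns (p, c)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx
    return slope, math.exp(mean_y - slope * mean_x)


# Synthetic body masses from 10^-2 kg to 10^4 kg, one point per decade.
MASSES = [10.0 ** e for e in range(-2, 5)]
SKELETON = [0.06 * m ** 1.13 for m in MASSES]   # hypothetical prefactor

if __name__ == "__main__":
    print(fit_power_law(MASSES, SKELETON))
```

With real data the points scatter around the line, and the range of masses (here six decades) determines how well the exponent can be pinned down.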
Scaling and dimensional analysis.
The laws governing physical phenomena must not depend on the units chosen. It is therefore sensible to express them in terms of dimensionless quantities. This simple observation can lead to nontrivial conclusions.
Consider a simple example: the propagation of waves in shallow water. Without restricting to the limit $h/\lambda \to 0$, classical hydrodynamics shows that the wave speed is given by
$$v^2 = \frac{g\lambda}{2\pi} \tanh\left(\frac{2\pi h}{\lambda}\right) \qquad (1.39)$$
The only physical quantities on which the speed $v$ can depend (the liquid is assumed inviscid) are the density of the liquid $\rho$, the gravitational acceleration $g$, the wavelength $\lambda$ and the depth $h$. If $M$, $L$ and $T$ are the units of mass, length and time respectively, the dimensions of the relevant physical quantities are (for a three-dimensional system)
$$[\rho] = M L^{-3}, \qquad [g] = L T^{-2}, \qquad [h] = [\lambda] = L \qquad (1.40)$$
where $[A]$ denotes the dimension of $A$. Hence, on dimensional grounds, one is led to write
$$v = (gh)^{1/2} f\left(\frac{h}{\lambda}\right) \qquad (1.41)$$
where $f(x)$ is an unknown function of the dimensionless variable $x = h/\lambda$. In the limit $x \to 0$ one recovers the scaling law
$$v = (gh)^{1/2} f(0) \sim h^{1/2} \qquad (1.42)$$
Thus a simple dimensional analysis seems to lead directly to the desired law. This is of course not true. An important assumption has been made: the function $f$ behaves “reasonably” for small argument, i.e. $f(x \to 0) = \mathrm{const}$. This behavior can be anticipated from physical arguments: in the limit $x \to 0$ the speed $v$ must not depend on $\lambda$.
The importance of this remark is highlighted by the following point. Still on dimensional grounds, one can also write
$$v = (g\lambda)^{1/2} k\left(\frac{h}{\lambda}\right) \qquad (1.43)$$
where $k(x)$ is an unknown function of the variable $x = h/\lambda$. For the speed $v$ not to depend on $\lambda$ in the limit $x \to 0$, $k(x)$ must behave in a particular way for $x \to 0$: one needs $k(x) = x^{1/2} f(x)$, where $f(x)$ tends to a constant as $x \to 0$.
Although it does not solve the problem, dimensional analysis can be a very useful tool [13].
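For this example the scaling function is actually known: comparing (1.39) with (1.41) gives $f(x) = \sqrt{\tanh(2\pi x)/(2\pi x)}$. The sketch below checks numerically that $f(x) \to 1$ for $x \to 0$, which is the “reasonable” behavior assumed above. The value of $g$ and the function names are assumptions.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def wave_speed(h, lam, g=G):
    """Wave speed from (1.39) for depth h and wavelength lam."""
    return math.sqrt(g * lam / (2.0 * math.pi) * math.tanh(2.0 * math.pi * h / lam))


def f(x):
    """Scaling function of (1.41): f(x) = v / sqrt(g h), with x = h / lambda."""
    y = 2.0 * math.pi * x
    return math.sqrt(math.tanh(y) / y)


if __name__ == "__main__":
    for x in (1e-3, 1e-2, 1e-1, 1.0):
        print(x, f(x), x ** 0.5 * f(x))   # last column is k(x) of (1.43)
```

In the opposite limit $h \gg \lambda$ the hyperbolic tangent saturates and $v^2 \to g\lambda/2\pi$, so the speed depends on $\lambda$ and no longer on $h$.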
Note also that the scaling exponents obtained from dimensional analysis have, by construction, simple rational values. Experiment shows that exponents do not in general take simple values. Determining them goes beyond dimensional analysis and calls for newer techniques such as the renormalization group method.
To complete this section, let us look at one more application of dimensional analysis that illustrates the value of such an approach. In a nuclear explosion, the energy $E$ is released almost at a point and gives rise to a formidable shock wave. A classical aerodynamic analysis shows that the radius $R(t)$ of the shock wave at time $t$ after the explosion depends only on the initial density of the air $\rho$, on $E$ and on $t$. Dimensional analysis gives
$$[R] = [E]^{1/5} \cdot [t]^{2/5} \cdot [\rho]^{-1/5} \qquad (1.44)$$
It follows that
$$R = \mathrm{const}\; E^{1/5} t^{2/5} \rho^{-1/5} \qquad (1.45)$$
Moreover, the aerodynamic analysis shows that the constant const is of order 1. Thus, measuring the radius of the shock wave as a function of time (from a sequence of high-speed photographs) makes it possible to deduce the released energy $E$. During the first American A-bomb tests in 1945, journalists published photographs of the explosions in the press. The energy of the bombs, however, was a highly classified figure. Those in charge of the program were very surprised to see these values published in scientific journals. They followed from a simple dimensional analysis.
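The exponents in (1.44) follow from matching the powers of mass, length and time, using $[E] = M L^2 T^{-2}$, $[t] = T$ and $[\rho] = M L^{-3}$. The sketch below solves the resulting 3×3 linear system exactly and then inverts (1.45) for $E$. The solver is a plain Gauss-Jordan elimination written for this illustration, and the numbers in the usage lines are arbitrary, not data from any test.

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Solve a small square linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    a = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        a[col] = [v / a[col][col] for v in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [vr - factor * vc for vr, vc in zip(a[r], a[col])]
    return [row[-1] for row in a]


# R ~ E^a t^b rho^c. Rows: powers of M, L, T; columns: E, t, rho; rhs: [R] = L.
DIMENSIONS = [[1, 0, 1],     # M:  a + c = 0
              [2, 0, -3],    # L:  2a - 3c = 1
              [-2, 1, 0]]    # T: -2a + b = 0
EXPONENTS = solve(DIMENSIONS, [0, 1, 0])


def energy_from_radius(radius, t, rho, const=1.0):
    """Invert (1.45): E = rho * R^5 / (const^5 * t^2)."""
    return rho * radius ** 5 / (const ** 5 * t ** 2)


if __name__ == "__main__":
    print(EXPONENTS)
    # arbitrary illustrative numbers (SI units)
    print(energy_from_radius(radius=100.0, t=0.01, rho=1.2))
```

Because $E$ scales as $R^5$, a modest uncertainty on the measured radius translates into a much larger one on the energy, so the estimate is an order of magnitude rather than a precise value.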
1.3.6 Nonequilibrium statistical mechanics
Nonequilibrium statistical mechanics aims to give a microscopic (or mesoscopic) description of the phenomena occurring in systems which, in the long-time limit, do not go to an equilibrium state (in the sense of equilibrium statistical mechanics). The system may reach a stationary state, but it may also settle on a limit cycle or behave chaotically. Independently of the properties of the stationary state, the dynamics of these systems can be very rich and complex [14,15].
Equilibrium statistical mechanics is a (particularly simple) limiting case of the nonequilibrium problem. One therefore expects systems out of equilibrium to show more complex behavior than systems at equilibrium.
An important feature of nonequilibrium systems is that they are “open”: fluxes (of matter, energy, etc.) pass through them. A large class of nonequilibrium systems is provided by living systems, whose complexity needs no demonstration.
How should the dynamics of such systems be described? This is a difficult problem. An important observation is that, in most cases, the dynamical variables describing a given problem do not all have the same status. They can be roughly divided into two classes: “fast” variables and “slow” variables, the latter being the relevant variables at the mesoscopic scale. The dynamics of the slow variables is then obtained by “eliminating” the fast variables. This program can only rarely be carried out explicitly. Hence one often postulates the equations governing the dynamics of the slow variables. The fast variables are viewed as random perturbations acting on the slow variables. One thus introduces a statistical description and considers the probability $P(s_1, s_2, \ldots, s_M, t)$ that the state characterized by the slow variables $S = (s_1, s_2, \ldots, s_M)$ is realized at time $t$.
In its simplest form, the evolution equation for $P$ is a master equation [16,17]:
$$\partial_t P(S,t) = \sum_{\{S'\}} \left[ \omega(S' \to S) P(S',t) - \omega(S \to S') P(S,t) \right] \qquad (1.46)$$
The role of the fast variables is “hidden” in the transition rates $\omega(S \to S')$. This equation is said to be “Markovian” because it contains no memory effects.
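The gain and loss structure of the master equation (1.46) can be illustrated on a system with a handful of states. The sketch below integrates the equation with an explicit Euler scheme for a two-state system. The transition rates, the time step and the function name are arbitrary choices made for the illustration.

```python
def evolve_master_equation(p, rates, dt, n_steps):
    """Explicit Euler integration of (1.46); rates[i][j] is the rate from state i to j."""
    n = len(p)
    p = list(p)
    for _ in range(n_steps):
        dp = [0.0] * n
        for s in range(n):
            for s2 in range(n):
                if s2 != s:
                    # gain from s2 -> s minus loss from s -> s2
                    dp[s] += rates[s2][s] * p[s2] - rates[s][s2] * p[s]
        p = [pi + dt * d for pi, d in zip(p, dp)]
    return p


# Hypothetical two-state system: rate 2 for 0 -> 1 and rate 1 for 1 -> 0.
RATES = [[0.0, 2.0],
         [1.0, 0.0]]

if __name__ == "__main__":
    print(evolve_master_equation([1.0, 0.0], RATES, dt=0.01, n_steps=2000))
```

The gain and loss terms cancel when summed over all states, so the total probability is conserved, and the distribution relaxes to the stationary state in which the two probability fluxes balance, here $P(0) = 1/3$ and $P(1) = 2/3$.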
An even coarser description, in terms of a few slow variables, is sometimes sufficient. This is the framework of nonequilibrium thermodynamics [18,19]. The resulting equations belong to the class of stochastic differential equations (or generalized Langevin equations) [16,17]. They take the form of partial differential equations, generally nonlinear, to which a noise term (a stochastic variable) is added. In the limit where the noise is neglected, one recovers equations of the type discussed in the section on extended systems.
1.4 Bibliography
[1] G.L. Baker and J.P. Gollub, “Chaotic dynamics”, Cambridge University Press (1990).
[2] Ch. Vidal, G. Dewel, P. Borckmans, “Au-delà de l’équilibre”, Hermann (1994).
[3] A. Turing, “The chemical basis of morphogenesis”, Phil. Trans. Royal Soc. London 237B, 37 (1952).
[4] H. K. Henisch, “Periodic precipitation”, Pergamon Press (1991).
[5] K. Huang, “Statistical Mechanics”, John Wiley and Sons, New York (196).
[6] K. Wilson, “Les phénomènes de longueur et d’échelles de longueurs”, Bibliothèque pour la Science (1988).
[7] B. Chopard and M. Droz, “Cellular Automata and Modeling of Physical Systems”, Cambridge University Press (1998).
[8] E.R. Berlekamp, J.H. Conway and R.K. Guy, “Winning Ways for your Mathematical Plays”, Academic Press, New York (1982).
[9] M. Resnick, “Turtles, termites and traffic jams: explorations in massively parallel microworlds”, Bradford Books, MIT Press, Cambridge Mass. (1994).
[10] S.Y. Auyang, “Foundations of Complex-Systems Theories”, Cambridge University Press (1998).
[11] “Physique de la Complexité”, T. Daussiox and M. Droz, eds., Edt. Frontières (1995).
[12] J. D. Murray, “Mathematical Biology”, Springer-Verlag (1989).
[13] G. I. Barenblatt, “Scaling, self-similarity and intermediate asymptotics”, Cambridge University Press (1996).
[14] H. J. Kreuzer, “Nonequilibrium thermodynamics and its Statistical Foundations”, Oxford Science Publ., Oxford (1981).
[15] R. Balescu, “Equilibrium and nonequilibrium statistical mechanics”, John Wiley & Sons, New York (1975).
[16] C. W. Gardiner, “Handbook of Stochastic Methods”, Springer Verlag, Berlin (1990).
[17] N.G. van Kampen, “Stochastic Processes in Physics and Chemistry”, North-Holland (1981).
[18] I. Prigogine, “Introduction to thermodynamics of irreversible processes”, John Wiley & Sons, New York (1955).
[19] P. Glansdorff and I. Prigogine, “Structure, stabilité et fluctuations”, Masson, Paris (1971).
Part Two
Cellular automata and modeling of complex systems
Bastien Chopard
Centre Universitaire d’Informatique (CUI), Université de Genève
24, Rue du Général Dufour, CH-1211 Genève 4
Chapter 2
Cellular Automata and Lattice
Boltzmann Techniques
Bastien Chopard, Pascal Luthi and Alexandre Masselot
Computer Science Department, University of Geneva
1211 Geneva 4, Switzerland
Abstract
We discuss the cellular automata approach and its exten-
sions, the lattice Boltzmann and multiparticle methods. The
potential of these techniques is demonstrated in the case of
modeling complex systems. In particular, we consider appli-
cations taken from various fields of physics, such as reaction-
diffusion systems, pattern formation phenomena, fluid flows,
fracture processes and road traffic models.
2.1 The Cellular Automata approach
2.1.1 Introduction
Cellular automata (often termed CA) are an idealization of a physical system in which space and time are discrete. In addition, the physical quantities (or state of the automaton) take only a finite set of values. Since their invention by von Neumann in the late 1940s, cellular automata have been applied to a large range of scientific problems (see for instance [1,2,3,4,5,6,7,8,9,10,11,12,13,14]).
The original motivation of von Neumann was to extract the abstract mechanisms
leading to self-reproduction of biological organisms [15]. In other words, the problem
is to devise a system having the capability (and the recipe) to produce another
organism of equivalent complexity using only its own resources.
Following the suggestions of S. Ulam[16], von Neumann addressed this question in
the framework of a fully discrete universe made up of cells. Each cell is characterized
by an internal state, which typically consists of a finite number of information bits.
Von Neumann suggested that this system of cells evolves, in discrete time steps, like
simple automata which only know of a simple recipe to compute their new internal
state. The rule determining the evolution of this system is the same for all cells and
is a function of the states of the neighbor cells. As in any
biological system, the activity of the cells takes place simultaneously. However, the
same clock drives the evolution of each cell and the updating of the internal state of
each cell occurs synchronously.
Such a fully discrete dynamical system (cellular space), as invented by von Neumann,
is now referred to as a cellular automaton.
After the work of von Neumann, other authors followed the same line of
research; the problem is still of interest today [17] and has led to interesting
developments for new computer architectures [18].
Many other applications of CAs to physical science have been considered. In
1970, the mathematician John Conway proposed his famous game of life [19]. His
motivation was to find a simple rule leading to complex behaviors. He imagined a
two-dimensional square lattice, like a checkerboard, in which each cell can be either
alive (state one) or dead (state zero). The updating rule of the game of life is as
follows: a dead cell surrounded by exactly three living cells comes to life; a living cell
surrounded by fewer than two or more than three living neighbors dies of isolation or
overcrowding. Here, the surrounding cells correspond to the neighborhood composed
of the four nearest cells (north, south, east and west), plus the four second nearest
neighbors, along the diagonals. It turns out that the game of life automaton has an
unexpectedly rich behavior. Complex structures emerge out of a primitive “soup” and
evolve so as to develop some skills.
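The updating rule of the game of life stated above can be sketched in a few lines of Python. This is only a minimal illustration: the periodic boundaries, the list-of-lists storage, the function name life_step and the "blinker" test pattern are assumptions made for the example and are not taken from [19].

```python
def life_step(grid):
    """One synchronous game-of-life update on a periodic lattice of 0/1 cells."""
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Count the living cells among the eight surrounding cells.
            alive = sum(
                grid[(i + di) % rows][(j + dj) % cols]
                for di in (-1, 0, 1)
                for dj in (-1, 0, 1)
                if (di, dj) != (0, 0)
            )
            if grid[i][j] == 0:
                # A dead cell with exactly three living neighbors comes to life.
                new[i][j] = 1 if alive == 3 else 0
            else:
                # A living cell survives only with two or three living neighbors.
                new[i][j] = 1 if alive in (2, 3) else 0
    return new


# Hypothetical test pattern: three aligned living cells (a "blinker").
blinker = [[0] * 5 for _ in range(5)]
for j in (1, 2, 3):
    blinker[2][j] = 1

after_one = life_step(blinker)    # the row of three becomes a column of three
after_two = life_step(after_one)  # and then returns to the initial row
```

The new lattice is built from the old one without modifying it in place, which is what makes the update synchronous.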
Like von Neumann's rule, the game of life is a cellular automaton capable of universal
computation: it is always possible to find an initial configuration of the cellular
space reproducing the behavior of any electronic gate and, thus, to mimic any
computation process. Although this observation has little practical interest, it is very
important from a theoretical point of view since it establishes that CAs are a
nonrestrictive computational technique.
A very important feature of CAs is that they provide simple models of complex
systems. They exemplify the fact that a collective behavior can emerge out of the
sum of many, simply interacting, components. Even if the basic and local interac-
tions are perfectly known, it is possible that the global behavior obeys new laws that
are not obviously extrapolated from the individual properties, as if the whole were more
than the sum of all the parts. This property makes cellular automata a very interesting
approach to model physical systems and in particular to simulate complex and
nonequilibrium phenomena.
The studies undertaken by S. Wolfram in the 1980s [20,12] clearly established that
a CA (the famous Wolfram rules) may exhibit many of the behaviors encountered
in continuous systems, yet in a much simpler mathematical framework. A further
step is to recognize that CAs do not merely behave similarly to some dynamical
processes: they can also represent an actual model of a given physical system, leading to
macroscopic predictions that could be checked experimentally. This fact follows from
statistical mechanics, which tells us that the macroscopic behavior of many systems is
quite disconnected from their microscopic reality and that only symmetries and
conservation laws survive the change of observation level: it is well known that the flows
of a fluid, a gas or even a granular medium are very similar at a macroscopic scale, in
spite of their different microscopic nature.
An interesting example is the FHP fluid model proposed by Frisch, Hasslacher and
Pomeau in 1986 [21], which can be viewed as a fully discrete molecular dynamics and
yet behaves as predicted by the Navier-Stokes equation when the observation time and
length scales are much larger than the automaton time step and the lattice spacing.
Cellular automata fluids like the FHP model (or lattice gas automata (LGA), as
these models are often termed) cannot directly compete with standard computational
fluid dynamics techniques for high Reynolds number flows. However, they have been very
successful in modeling complex situations for which traditional computing techniques
are hardly applicable. Flows in porous media [22,23,24], immiscible flows and
instabilities [25,26,27,28], spreading of a liquid droplet and wetting phenomena [14,29],
granular flows [30,31], microemulsion [32], erosion and transport problems [14,33] are
some examples pertaining to fluid dynamics.
Other physical situations, like pattern formation, reaction-diffusion processes
[34,35,36], nucleation-aggregation growth phenomena and traffic processes [37,38,39],
are very well suited to the cellular automata approach.
The cellular automata paradigm presents some weaknesses inherent to its discrete
nature. Lattice Boltzmann (LB) models have been proposed to remedy some of these
problems, using real-valued states instead of Boolean variables. It turns out that LB
models are indeed a very powerful approach which combines numerical efficiency with
the advantage of having a model whose microscopic components are intuitive.
This chapter is organized as follows. In the remainder of section 2.1, a precise definition
of a cellular automaton is given. We present some arguments to justify the approach
and, finally, the advantages and drawbacks of the method are outlined. In section
2.2, a sampler of CA rules is presented in order to illustrate the methodology and
give an account of the large variety of possible applications. Section 2.3 shows, for
the case of a fluid, how to derive rigorously the macroscopic behavior of a cellular
automata model, starting from its Boolean dynamics. Section 2.4 discusses the lattice
Boltzmann (LB) method and presents an application to compute deposition patterns in
snow transport. Section 2.5 is devoted to reaction-diffusion systems and some examples
of pattern formation. In section 2.6 we introduce multiparticle models that reconcile
some of the advantages of the CA and LB approaches. Finally, section 2.7 proposes an
LB model for wave propagation in heterogeneous media, as well as its application to
modeling a fracture process and wave localization.
2.1.2 Definition
In order to give a definition of a cellular automaton, we first present a simple example.
Although it is very basic, the rule we discuss here exhibits a surprisingly rich behavior.
It was initially proposed by Edward Fredkin in the 1970s [40] and is defined on a
two-dimensional square lattice.
Each site of the lattice is a cell which is labeled by its position r = (i, j) where i
and j are the row and column indices. A function ψt(r) is associated with the lattice to
describe the state of each cell at iteration t. This quantity can be either 0 or 1.
The cellular automata rule specifies how the states ψt+1 are to be computed from
the states at iteration t. We start from an initial condition at time t = 0 with a given
configuration of the values ψ0(r) on the lattice. The state at time t = 1 is obtained
as follows:
(1) Each site r computes the sum of the values ψ0(r′) on the four nearest neighbor
sites r′ at north, west, south and east. The system is supposed to be periodic in
both i and j directions (like on a torus) so that this calculation is well defined for
all sites.
(2) If this sum is even, the new state ψ1(r) is 0 (white); otherwise it is 1 (black).
The same rule (steps 1 and 2) is repeated to find the states at time t = 2, 3, 4, ....
From a mathematical point of view, this cellular automaton parity rule can be
expressed by the following relation

$$
\psi_{t+1}(i, j) = \psi_t(i+1, j) \oplus \psi_t(i-1, j) \oplus \psi_t(i, j+1) \oplus \psi_t(i, j-1) \tag{2.1}
$$

where the symbol ⊕ stands for the exclusive OR logical operation. It is also the sum
modulo 2: 1 ⊕ 1 = 0 ⊕ 0 = 0 and 1 ⊕ 0 = 0 ⊕ 1 = 1.
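As an illustration, relation (2.1) can be transcribed almost literally into Python, whose `^` operator is the exclusive OR. The sketch below assumes that the lattice is stored as a list of lists of 0/1 values with periodic boundaries; the name parity_step and the single-cell initial condition are choices made for the example only.

```python
def parity_step(psi):
    """Apply the parity rule (2.1) once on a periodic lattice of 0/1 values."""
    rows, cols = len(psi), len(psi[0])
    return [
        [
            psi[(i + 1) % rows][j]
            ^ psi[(i - 1) % rows][j]
            ^ psi[i][(j + 1) % cols]
            ^ psi[i][(j - 1) % cols]
            for j in range(cols)
        ]
        for i in range(rows)
    ]


# Hypothetical initial condition: a single black cell in an 8 x 8 lattice.
psi0 = [[0] * 8 for _ in range(8)]
psi0[4][4] = 1

psi1 = parity_step(psi0)  # four copies of the cell, shifted by one site
psi2 = parity_step(psi1)  # four copies shifted by two sites; the rest cancels
```

After two iterations, the contributions reaching the center and the diagonal sites arrive an even number of times and cancel modulo 2, which gives a first glimpse of how shifted copies of the initial configuration combine.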
When this rule is iterated, very nice geometric patterns are observed, as shown
in figure 2.1. This property of generating complex patterns starting from a simple
rule is typical of many cellular automata rules. Here, complexity results from some
spatial organization which builds up as the rule is iterated. The various contributions
of successive iterations combine together in a specific way. The spatial patterns that
are observed reflect how the terms are combined algebraically.
This example shows that despite the simplicity of the local rule, the global
behavior of a CA model can be quite complex. In the present case, the mechanism
yielding these complex patterns can be unraveled by working out how successive
iterations combine several copies of the initial configuration, all shifted by a different
amount [14].
Figure 2.1: The ⊕ rule on a 256 × 256 periodic lattice. (a) initial configuration. (b)
and (c) configurations after tb = 93 and tc = 110 iterations, respectively.
Based on this example we now give a definition of a cellular automaton. Formally, a
cellular automaton consists of
(i) A regular lattice of cells covering a portion of a d-dimensional space.
(ii) A set Φ(r, t) = {Φ1(r, t), Φ2(r, t), ..., Φm(r, t)} of Boolean variables attached
to each site r of the lattice and giving the local state of each cell at the time
t = 0, 1, 2, ....
(iii) A rule R = {R1, R2, ..., Rm} which specifies the time evolution of the states
Φ(r, t) in the following way

$$
\Phi_j(r, t+\tau) = R_j\big(\Phi(r, t), \Phi(r+\delta_1, t), \Phi(r+\delta_2, t), \ldots, \Phi(r+\delta_q, t)\big) \tag{2.2}
$$

where r + δk designates the cells belonging to a given neighborhood of cell r.
The example discussed above is a particular case in which the state
of each cell consists of a single bit Φ1(r, t) = ψt(r) of information and the rule is the
addition modulo 2.
In the above definition, the rule R is identical for all sites and is applied
simultaneously to each of them, leading to a synchronous dynamics. It is important to notice
that the rule is homogeneous, that is, it cannot depend explicitly on the cell position
r. However, spatial (or even temporal) inhomogeneities can be introduced anyway by
having some Φj(r) systematically 1 in some given locations of the lattice to mark
particular cells on which a different rule applies. Boundary cells are a typical example of
spatial inhomogeneities. Similarly, it is easy to alternate between two rules by having
a bit which is 1 at even time steps and 0 at odd time steps.
The neighborhood (i.e. the spatial region around each cell used to compute the next
state) is usually made of the adjacent cells of the central cell. It is often restricted to the
nearest or next to nearest neighbors, otherwise the complexity of the rule is too large.
For a two-dimensional cellular automaton, two neighborhoods are often considered:
the von Neumann neighborhood which consists of a central cell (the one which is to
be updated) and its four geographical neighbors North, West, South and East. The
Moore neighborhood contains, in addition, the second nearest neighbors North-East,
North-West, South-West and South-East, that is a total of nine cells.
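The definition above, together with the two usual neighborhoods, can be sketched as a generic synchronous update. In this illustration the displacements δk are given as a list of (row, column) offsets and the rule is any function of the list of neighborhood states; the names VON_NEUMANN, MOORE, ca_step and parity_rule, as well as the periodic boundaries, are assumptions made for the example.

```python
# Offsets (delta_k) of the two common two-dimensional neighborhoods.
# The first entry (0, 0) is the central cell itself.
VON_NEUMANN = [(0, 0), (-1, 0), (0, -1), (1, 0), (0, 1)]
MOORE = VON_NEUMANN + [(-1, 1), (-1, -1), (1, 1), (1, -1)]


def ca_step(grid, offsets, rule):
    """Apply the same rule to every cell at once, on a periodic lattice.

    rule receives the list of states found at the given offsets,
    in the same order as the offsets.
    """
    rows, cols = len(grid), len(grid[0])
    return [
        [
            rule([grid[(i + di) % rows][(j + dj) % cols] for di, dj in offsets])
            for j in range(cols)
        ]
        for i in range(rows)
    ]


def parity_rule(states):
    # states[0] is the central cell, which the parity rule ignores.
    return sum(states[1:]) % 2


grid = [[0] * 5 for _ in range(5)]
grid[2][2] = 1
new_grid = ca_step(grid, VON_NEUMANN, parity_rule)
```

Because ca_step builds a new lattice from the old one, every cell sees the states of its neighbors at time t, as required by a synchronous dynamics.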
According to the above definition, a cellular automaton is deterministic. The rule
R is some well defined function and a given initial configuration will always evolve
identically. However, as we shall see later, it may be very convenient for some applica-
tions to have a certain degree of randomness in the rule. For instance, it may be desir-
able that a rule selects one outcome among several possible states, with a probability
p. Cellular automata whose updating rule is driven by some external probabilities are
called probabilistic cellular automata. On the other hand, those which strictly comply
with the definition given above, are referred to as deterministic cellular automata.
Probabilistic cellular automata are a very useful generalization because they offer
a way to adjust the parameters of a rule in a continuous range of values, despite the
discrete nature of the cellular automata world. This is very convenient when modeling
physical systems in which, for instance, particles are annihilated or created at some
given rate.
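A probabilistic rule of this kind can be sketched as follows for the simplest case, in which each particle is annihilated independently with probability p at every time step. This is only a toy illustration: the function name annihilation_step, the one-dimensional array of cells and the use of Python's random module are assumptions of the example, not a rule discussed in the text.

```python
import random


def annihilation_step(cells, p, rng):
    """Each occupied cell (state 1) becomes empty (state 0) with probability p."""
    return [0 if (c == 1 and rng.random() < p) else c for c in cells]


rng = random.Random(2024)      # seeded generator, for reproducibility
cells = [1] * 10000            # hypothetical initial state: all cells occupied

unchanged = annihilation_step(cells, 0.0, rng)   # p = 0: nothing happens
emptied = annihilation_step(cells, 1.0, rng)     # p = 1: every particle disappears
partial = annihilation_step(cells, 0.3, rng)     # about 70% of the particles survive
```

The probability p can take any value between 0 and 1, which is how a continuous parameter such as a reaction rate enters an otherwise discrete dynamics.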
2.1.3 CA as a model of the physical world
A natural way to describe a physical system is to propose a model of what we think
is happening. During this process we usually retain only the ingredients we believe to
be essential in order to capture the behavior we are interested in. Using appropriate
mathematical machinery, such a model can then be expressed in terms of a set of equations
whose solution gives the desired answers about the system. The description in terms
of equations is very powerful and corresponds to a rather high level of abstraction. For
a long time, this methodology has been the only tractable way for scientists to address
a problem.
Another approach which has been made possible by the advent of fast computers
is to stay at the level of the model. The idea is that all the information is already
contained in the model and that a computer simulation will be able to answer any
possible question on the system by just running the model for some time. Thus there
is no need to use a complicated mathematical tool to obtain a high level of description.
We just need to express the model in a way that is suitable for an effective computer
implementation. In the framework of CAs, this last step is usually very intuitive and
requires little development time.
The degree of reality of the model depends on the level of description we expect.
When we are interested in the global or macroscopic properties of a system (and this is
the case here), we already mentioned that, except for the symmetries and conservation
laws, the microscopic details are often not relevant. It is therefore a clear advantage to
invent a much simpler microscopic reality, which is more appropriate to our numerical
means of investigation.
A cellular automata model can be seen as a fictitious universe which has its own
microscopic reality but, nevertheless, has the same macroscopic behavior as the real
system we are interested in. The example we shall give in the next section will illustrate
this statement.
2.1.4 Limitations, advantages, drawbacks and extensions
Modeling a system at a microscopic level of description has significant advantages.
The interpretation of the cellular automata dynamics in terms of simple microscopic
rules offers a very intuitive and powerful approach to model phenomena that are very
difficult to include in more traditional approaches (such as differential equations). For
instance, boundary conditions are often naturally implemented in a cellular automata
model because they have a natural interpretation at this level of description (e.g. particles
bouncing back on an obstacle). As another example, the phenomenon of wetting of a solid
substrate by a spreading liquid illustrates the difficulty of defining appropriate boundary
conditions at the level of the Navier-Stokes equation. Yet, in the framework of a CA
description, this can be achieved in a simple way [14].
Numerically, an advantage of the CA approach is its simplicity and its suitability
for computer architectures and parallel machines. In addition, working with Boolean
quantities prevents numerical instabilities since an exact computation is made. There
is no truncation or approximation in the dynamics itself. Finally, a CA model is an
implementation of an N-body system where all correlations are taken into account, as
well as spontaneous fluctuations arising in a system made up of many particles.
On the other hand, cellular automata models have several drawbacks related to their
fully discrete nature. An important one is the statistical noise, which requires systematic
averaging. Another one is the limited flexibility to adjust the parameters of a rule
in order to describe a wider range of physical situations.
At the end of the 1980s, McNamara and Zanetti [41] and Higuera, Jimenez and
Succi [42] showed the advantage of extending the Boolean dynamics of the automaton
to work directly on real numbers representing, somehow, the probability for a
cell to have a given state. This approach, called the lattice Boltzmann (LB) method, is
numerically much more efficient than the Boolean dynamics and provides a new
computational model much more appropriate for simulating high Reynolds number flows and many
other relevant applications (for instance glacier flow [43] and fracture processes). On
the other hand, the LB approach reintroduces the risk of numerical instabilities and
also requires some hypotheses of factorization of the joint probability in order to write
the interaction. We will return to this approach in section 2.4.
Another generalization of the original definition of a CA is the multiparticle
method, in which the number of states of each cell is infinite so that an arbitrary number
of particles can stay simultaneously at each site. This offers much more flexibility to
tune the parameters of the rule and considerably reduces the statistical noise. A
multiparticle model goes in the same direction as the LB models but it does not need a
factorization assumption and is not sensitive to numerical instability. Unfortunately, as
explained in section 2.6, it requires more implementation effort than the LB approach
and is also numerically less efficient.
Figure 2.2: Evolution of the annealing rule. The inherent “surface tension” present
in the rule tends to separate the black phase s = 1 from the white phase s = 0.
The snapshots (a), (b) and (c) correspond to t = 0, t = 72 and t = 270 iterations,
respectively. The extra gray levels indicate how “capes” have been eroded and “bays”
filled: dark gray shows the black regions that have been eroded during the last few
iterations and light gray marks the white regions that have been filled.
Finally, we should remark that the cellular automata approach is not a rigid
framework but should allow for many extensions according to the problem at hand. The CA
methodology is a philosophy of modeling where one seeks a description in terms of
simple but essential mechanisms. Its richness and interest come from the microscopic
content of its rule, for which there is, in general, a clear physical or intuitive
interpretation of the dynamics directly at the level of the cell.
2.2 Examples of simple rules
In this section we consider several CA rules in order to illustrate the ideas we have
introduced in section 2.1. Although the rules we will present here have a clear physical
content, some of them should be considered as toy models because their ability to
describe the macroscopic behavior of a real physical system does not withstand a detailed
analysis. Our goal, however, is to present the flavor of the CA approach, not to give
a proof that the rule we propose is rigorously related to a given process.
2.2.1 A growth model
A natural class of cellular automata rules consists of the so-called majority rules. The
updating selects the new state of each cell so as to conform to the value currently held
by the majority of the neighbors. Typically, in these majority rules, the state is either 0
or 1.
A very interesting behavior is observed with the twisted majority rule proposed by
G. Vichniac [44]: in two dimensions, each cell considers its Moore neighborhood (i.e.
itself plus its eight nearest neighbors) and computes the number of cells having the value
1. This sum can be any value between 0 and 9. The new state sij(t + 1) of each cell is
then determined from this local sum, according to the following table:

sumij(t)     0 1 2 3 4 5 6 7 8 9
sij(t + 1)   0 0 0 0 1 0 1 1 1 1        (2.3)
As opposed to the plain majority rule, here, the two middle entries of the table have
been swapped. Therefore, when there is a slight majority of 1 around a cell, it turns to
0. Conversely, if there is a slight majority of 0, the cell becomes 1.
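Table (2.3) translates directly into a lookup array indexed by the local sum. The following sketch assumes a periodic lattice stored as a list of lists; the names TWISTED_MAJORITY and annealing_step and the test configurations are choices made for this illustration.

```python
# New state as a function of the local sum (0..9), as in table (2.3).
TWISTED_MAJORITY = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]


def annealing_step(s):
    """One synchronous update of the twisted majority rule (periodic lattice)."""
    rows, cols = len(s), len(s[0])
    new = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Sum over the Moore neighborhood, central cell included.
            total = sum(
                s[(i + di) % rows][(j + dj) % cols]
                for di in (-1, 0, 1)
                for dj in (-1, 0, 1)
            )
            new[i][j] = TWISTED_MAJORITY[total]
    return new


# An isolated 1 in a sea of 0s: its local sum is 1, so it disappears.
lonely = [[0] * 6 for _ in range(6)]
lonely[3][3] = 1
after_lonely = annealing_step(lonely)

# A 4 x 4 square of 1s inside a 10 x 10 lattice of 0s.
square = [[1 if 3 <= i <= 6 and 3 <= j <= 6 else 0 for j in range(10)]
          for i in range(10)]
after_square = annealing_step(square)
```

In the second configuration the corner cells see a local sum of 4 and therefore stay at 1, so the square is left unchanged by the rule.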
Surprisingly enough this rule describes the interface motion between two phases, as
illustrated in Figure 2.2. Vichniac has observed that the normal velocity of the interface
is proportional to its local curvature, as required by the Allen-Cahn [45] equation. Of
course, due to its local nature, the rule cannot detect the curvature of the interface
directly. However, as the rule is iterated, local information is propagated to the
nearest neighbors and the radius of curvature emerges as a collective effect.
This rule is particularly interesting when the initial configuration is a random mixture
of the two phases, with equal concentration. Otherwise, some pathological behaviors
may occur. For instance, an initial square of 1s surrounded by 0s will not
evolve: right angles are not eroded but are stable structures.
2.2.2 Ising-like dynamics
The Ising model is extensively used in physics. Its basic constituents are spins si
which can be in one of two states: si ∈ {−1, 1}. These spins are organized on a regular
lattice in d-dimensions and coupled in the sense that each pair (si, sj) of neighbor spins
contributes an amount −Jsisj to the energy of the system. Intuitively, the dynamics of
such a system is that a spin flips (si → −si) if this is favorable in view of the energy
of the local configuration.
In the 1980s, Vichniac [44] proposed a CA rule, called Q2R, that simulates
the behavior of an Ising spin dynamics. The model is as follows.
We consider a two-dimensional square lattice such that each site holds a spin si
which is either up (si = 1) or down (si = 0) (instead of ±1). The coupling between
spins is assumed to come from the von Neumann neighborhood (i.e. north, west, south
and east neighbors).
In this simple model, the spins will flip (or not flip) during their discrete time
evolution according to a local energy conservation principle. This means we are con-
sidering a system which cannot exchange energy with its surroundings. The model
will be a microcanonical cellular automata simulation of Ising spin dynamics, without
a temperature but with a critical energy.
A spin si can flip at time t to become 1 − si at time t + 1 if and only if this move
does not cause any energy change. Accordingly, spin si will flip if the number of
its neighbors with spin up is the same as the number of its neighbors with spin down.
However, one has to remember that the motion of all spins is simultaneous in a cellular
automaton. The decision to flip is based on the assumption that the neighbors are not
changing. If they are allowed to flip too (because they obey the same rule), then
energy may not be conserved.
A way to cure this problem is to split the updating in two phases and consider
a partition of the lattice in odd and even sites (e.g. the white and black squares of
a chess-board in 2D): first, one flips the spins located at odd positions, according
to the configuration of the even spins. In the second phase, the even sublattice is
updated according to the odd one. The spatial structure (defining the two sublattices)
is obtained by adding an extra bit b to each lattice site, whose value is 0 for the odd
sublattice and 1 for the even sublattice. The flipping rule described earlier is then
regulated by the value of b. It takes place only for those sites for which b = 1. Of
course, the value of b is also updated at each iteration according to b(t + 1) = 1 − b(t),
so that at the next iteration, the other sublattice is considered. In two dimensions, the
Q2R rule can then be expressed by the following relations

$$
s_{ij}(t+1) =
\begin{cases}
1 - s_{ij}(t) & \text{if } b_{ij} = 1 \text{ and } s_{i-1,j} + s_{i+1,j} + s_{i,j-1} + s_{i,j+1} = 2 \\
s_{ij}(t) & \text{otherwise}
\end{cases}
\tag{2.4}
$$

and

$$
b_{ij}(t+1) = 1 - b_{ij}(t) \tag{2.5}
$$

where the indices (i, j) label the Cartesian coordinates and sij(t = 0) is either one or
zero.
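Relations (2.4) and (2.5) can be sketched as follows. The code assumes a periodic lattice with even side lengths (so that the checkerboard partition is consistent across the boundaries) and takes b = 1 on the sites where i + j is even; the names q2r_step and bond_energy, the lattice size and the initial fraction of spins up are hypothetical choices. The energy is measured here as the number of unlike nearest-neighbor pairs, which differs from the Ising energy only by a constant factor and offset.

```python
import random


def q2r_step(s, b):
    """One Q2R update: flip s[i][j] where b[i][j] == 1 and two of the four
    von Neumann neighbors are up; then exchange the roles of the sublattices."""
    rows, cols = len(s), len(s[0])
    new_s = [row[:] for row in s]
    for i in range(rows):
        for j in range(cols):
            if b[i][j] == 1:
                up = (s[(i - 1) % rows][j] + s[(i + 1) % rows][j]
                      + s[i][(j - 1) % cols] + s[i][(j + 1) % cols])
                if up == 2:
                    new_s[i][j] = 1 - s[i][j]
    new_b = [[1 - x for x in row] for row in b]
    return new_s, new_b


def bond_energy(s):
    """Number of unlike nearest-neighbor pairs on the periodic lattice."""
    rows, cols = len(s), len(s[0])
    return sum(
        (s[i][j] != s[(i + 1) % rows][j]) + (s[i][j] != s[i][(j + 1) % cols])
        for i in range(rows)
        for j in range(cols)
    )


rng = random.Random(1)
L = 8                                  # even lattice size (assumption)
s = [[1 if rng.random() < 0.3 else 0 for _ in range(L)] for _ in range(L)]
b = [[1 if (i + j) % 2 == 0 else 0 for j in range(L)] for i in range(L)]

e0 = bond_energy(s)
energies = []
for _ in range(20):
    s, b = q2r_step(s, b)
    energies.append(bond_energy(s))    # stays equal to e0 at every step
```

Since all the neighbors of an updated site belong to the other sublattice, they are frozen during the update, and flipping a spin with two neighbors up and two down leaves the number of unlike pairs unchanged.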
The question is now how well this cellular automata rule describes
an Ising model. Figure 2.3 shows a computer simulation of the Q2R rule, starting
from an initial configuration with approximately 11% of spins sij = 1 (figure 2.3
(a)). After a transient phase (figures (b) and (c)), the system reaches a stationary state
where domains with “up” magnetization (white regions) are surrounded by domains
of “down” magnetization (black regions).
In this dynamics, energy is exactly conserved because that is the way the rule is
built. However, the number of spins down and up may vary. In the present experiment,
the fraction of spins up increases from 11% in the initial state to about 40% in the sta-
tionary state. Since there is an excess of spins down in this system, there is a resulting
macroscopic magnetization.
It is interesting to study this model with various initial fractions ρs of spins up.
When starting with a random initial condition, similar to that of figure 2.3 (a), it is
observed that, for many values of ρs, the system evolves to a state where there is, on
average, the same number of spins down and up, that is no macroscopic magnetization.
However, if the initial configuration presents a sufficiently large excess of one kind
of spins, then a macroscopic magnetization builds up as time goes on. This means
there is a phase transition between a situation of zero magnetization and a situation of
positive or negative magnetization.
It turns out that this transition occurs when the total energy E of the system is
low enough (a low energy means that most of the spins are aligned and that there is
an excess of one species over the other), or more precisely when E is smaller than
a critical energy Ec. In that sense, the Q2R rule captures an important aspect of a
real magnetic system, namely a non-zero magnetization at low energy (which can be
related to a low temperature situation) and a transition to a nonmagnetic phase at high
energy.
Figure 2.3: Evolution of a system of spins with the Q2R rule. Black represents the
spins down sij = 0 and white the spins up sij = 1. The four images (a), (b), (c) and
(d) show the system at four different times ta = 0 < tb < tc ≪ td.
However, Q2R also exhibits unexpected behaviors that are difficult to detect from a
simple observation. There is a breaking of ergodicity: a given initial configuration of
energy E0 evolves without visiting completely the region of the phase space characterized
by E = E0.
This is illustrated by the following simple 1D example, where a ring of four spins
with periodic boundary conditions is considered:

t:      1001
t + 1:  1100
t + 2:  0110
t + 3:  0011
t + 4:  1001        (2.6)
After four iterations, the system cycles back to its original state. The configuration
of this example has E0 = 0. As we observed, it never evolves to 0111, which is
also a configuration of zero energy. This non-ergodicity means that not only energy
is conserved during the evolution of the automaton, but also another quantity which
partitions the energy surface in independent regions.
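The four-spin example can be checked with a short script. In one dimension a spin has two neighbors, so it flips when exactly one of them is up. The sketch below assumes that the sites are numbered 0 to 3 from left to right and that the odd-numbered sites are updated first, a convention which reproduces the sequence (2.6); the names q2r_ring_step and orbit are hypothetical.

```python
def q2r_ring_step(s, parity):
    """Update the sites i with i % 2 == parity on a periodic ring:
    a spin flips when exactly one of its two neighbors is up."""
    n = len(s)
    new = s[:]
    for i in range(n):
        if i % 2 == parity and s[(i - 1) % n] + s[(i + 1) % n] == 1:
            new[i] = 1 - s[i]
    return new


def orbit(s0, first_parity=1):
    """List the configurations visited until the initial state (spins and
    active sublattice) is reached again. The dynamics is reversible, so
    this always happens."""
    visited = [s0]
    s, parity = s0, first_parity
    while True:
        s = q2r_ring_step(s, parity)
        parity = 1 - parity
        if s == s0 and parity == first_parity:
            return visited
        visited.append(s)


cycle = orbit([1, 0, 0, 1])
```

Only four configurations are visited before the ring returns to 1001, and 0111 is not among them, although it has the same energy.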
Figure 2.4: Final (stationary) configuration of the competition CA model. (a) A typical
situation with about 23% of active cells, obtained with almost any value of panihil and
pgrowth. (b) Configuration obtained with panihil = 1 and pgrowth = 0.8, yielding
a fraction of 28% of active cells; one clearly sees the close-packed regions and the
defects.
2.2.3 Competition models and cell differentiation
In section 2.2.1 we have discussed a majority rule in which the cells imitate their
neighbors. In some sense, this corresponds to a cooperative behavior between the cells.
A quite different situation can be obtained if the cells obey a competitive dynamics.
For instance we may imagine that the cells compete for some resources at the expense
of their nearest neighbors. A winner is a cell of state 1 and a loser a cell of state 0.
No two winner cells can be neighbors and any loser cell must have at least one winner
neighbor (otherwise nothing would have prevented it from also winning).
It is interesting to note that this problem has a direct application in biology, to
the study of cell differentiation. It has been observed in the development of Drosophila
that about 25% of the cells forming the embryo evolve to the state of neuroblast,
while the remaining 75% do not. How can we explain this differentiation and
the observed fraction since, at the beginning of the process, all cells can be assumed
equivalent? A possible mechanism [46] is that some competition takes place between
adjacent biological cells. In other words, each cell produces some substance S but
the production rate is inhibited by the amount of S already present in the neighboring
cells. Differentiation occurs when a cell reaches a level of S above a given threshold.
The competition CA model we propose to describe this situation is the following.
Due to the analogy with the biological system, we shall consider a hexagonal
lattice, which is a reasonable approximation of the cell arrangement observed in the
Drosophila embryo. We assume that the values of S can be 0 (inhibited) or 1 (active)
in each lattice cell.
• An S = 0 cell will grow (i.e. turn to S = 1) with probability pgrowth provided that
all its neighbors are 0. Otherwise, it stays inhibited.
• A cell in state S = 1 will decay (i.e. turn to S = 0) with probability panihil if
it is surrounded by at least one active cell. If the active cell is isolated (all the
neighbors are in state 0) it remains in state 1.
The evolution stops (stationary process) when no S = 1 cell feels any more inhibition
from its neighbors and when all S = 0 cells are inhibited by their neighborhood. Then,
cells with S = 1 are those which will differentiate.
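The two rules and the stopping condition above can be sketched as follows. Several choices in this illustration are assumptions rather than part of the model as stated: the hexagonal lattice is represented with two skewed ("axial") coordinates and periodic boundaries, the update is synchronous, all cells start inhibited, and the names HEX_NEIGHBORS, competition_step, is_stationary and run_competition are hypothetical. The lattice size and the probabilities used at the end are arbitrary.

```python
import random

# The six neighbors of a cell of the hexagonal lattice, in axial coordinates.
HEX_NEIGHBORS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def active_neighbors(S, i, j):
    n, m = len(S), len(S[0])
    return sum(S[(i + di) % n][(j + dj) % m] for di, dj in HEX_NEIGHBORS)


def competition_step(S, p_growth, p_anihil, rng):
    """One synchronous update of the competition rule."""
    new = [row[:] for row in S]
    for i in range(len(S)):
        for j in range(len(S[0])):
            k = active_neighbors(S, i, j)
            if S[i][j] == 0:
                # An inhibited cell grows only if none of its neighbors is active.
                if k == 0 and rng.random() < p_growth:
                    new[i][j] = 1
            elif k > 0 and rng.random() < p_anihil:
                # An active cell with an active neighbor may decay.
                new[i][j] = 0
    return new


def is_stationary(S):
    """True when every active cell is isolated and every inhibited cell
    has at least one active neighbor."""
    for i in range(len(S)):
        for j in range(len(S[0])):
            k = active_neighbors(S, i, j)
            if (S[i][j] == 1 and k > 0) or (S[i][j] == 0 and k == 0):
                return False
    return True


def run_competition(n, p_growth, p_anihil, seed=0, max_steps=10_000):
    rng = random.Random(seed)
    S = [[0] * n for _ in range(n)]
    for _ in range(max_steps):
        if is_stationary(S):
            break
        S = competition_step(S, p_growth, p_anihil, rng)
    return S


final = run_competition(12, 0.5, 0.5)
fraction = sum(map(sum, final)) / (12 * 12)   # fraction of active cells
```

A lattice this small is only meant to show the mechanics of the rule; estimating the stationary fraction of active cells reliably would require much larger lattices and averages over many runs.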
What is the expected fraction of these S = 1 cells in the final configuration?
Clearly, the maximum value is 1/3 which, according to the inhibition condition we
imposed, is the close-packed situation on the hexagonal lattice. On the other hand,
the minimal value is 1/6, corresponding to a situation where the lattice is partitioned
in blocks with one active cell surrounded by 5 inhibited cells. In practice we do not
expect either of these two limits to occur spontaneously after the automaton evolution.
On the contrary, we should observe clusters of close-packed active cells surrounded by
defects, i.e. regions of low density of active cells (see figure 2.4).
CA simulations give a very interesting result, namely that the fraction s of active
cells when the stationary state is reached is
0.23 ≤ s ≤ 0.24
almost irrespective of the values chosen for panihil and pgrowth (the decay and growth
probabilities, written pdecay and pgrow above). This is exactly what
we expect from the biological observations made on the drosophila embryo. Thus,
cell differentiation can be explained by a geometrical competition without having to
specify the inhibitory couplings between adjacent cells and the production rate (i.e. the
values of panihil and pgrowth): the result is quite robust against any possible choice.
In our CA model, there are, however, some pathological results when either panihil
or pgrowth equals one. For instance, with panihil = 1 and pgrowth = 0.8, we obtain s ≈ 0.28.
This situation is illustrated in figure 2.4 (b).
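The growth and decay rules of this competition model can be tried out with a short program. The following Python sketch is only an illustration under stated assumptions: the hexagonal lattice is stored in axial coordinates on an n × n periodic patch, all cells are updated synchronously, and the function names, lattice size, probabilities and seed are hypothetical choices rather than the setup used for the results quoted in the text.

```python
import random

# The six neighbours of a hexagonal cell in axial coordinates (q, r).
HEX_NEIGHBOURS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def active_neighbours(s, q, r):
    """Number of active (S = 1) neighbours of cell (q, r), periodic boundaries."""
    n = len(s)
    return sum(s[(q + dq) % n][(r + dr) % n] for dq, dr in HEX_NEIGHBOURS)


def step(s, p_grow, p_decay, rng):
    """One synchronous update of the competition rule."""
    n = len(s)
    new = [row[:] for row in s]
    for q in range(n):
        for r in range(n):
            k = active_neighbours(s, q, r)
            if s[q][r] == 0 and k == 0 and rng.random() < p_grow:
                new[q][r] = 1   # uninhibited cell grows
            elif s[q][r] == 1 and k > 0 and rng.random() < p_decay:
                new[q][r] = 0   # active cell inhibited by a neighbour decays
    return new


def is_stationary(s):
    """True when no active cell is inhibited and every inactive cell is."""
    n = len(s)
    for q in range(n):
        for r in range(n):
            k = active_neighbours(s, q, r)
            if (s[q][r] == 1 and k > 0) or (s[q][r] == 0 and k == 0):
                return False
    return True


def run(n=24, p_grow=0.5, p_decay=0.5, seed=0, max_steps=100000):
    """Start from an all-inhibited lattice and iterate until stationary."""
    rng = random.Random(seed)
    s = [[0] * n for _ in range(n)]
    for _ in range(max_steps):
        if is_stationary(s):
            break
        s = step(s, p_grow, p_decay, rng)
    return s


def active_fraction(s):
    """Fraction of cells in state S = 1."""
    return sum(map(sum, s)) / len(s) ** 2
```

Calling `active_fraction(run())` for several seeds and probabilities gives a quick way to compare such a toy lattice with the range of s reported above; the value obtained on a small periodic patch need not coincide with it.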
2.2.4 Traffic models
Cellular automata models for road traffic have received a great deal of interest during
the past few years (see [47,37,38,39,48,49,50,51] for instance).
One-dimensional models
One-dimensional models for single lane car motions are quite simple and elegant. The
road is represented as a line of cells, each of them being occupied or not by a vehicle.
All cars travel in the same direction (say to the right). Their positions are updated
synchronously. During the motion, each car can be at rest or jump to the nearest
neighbor site, along the direction of motion. The rule is simply that a car moves only
if its destination cell is empty. This means that the drivers are short-sighted and do not
know whether the car in front will move or is also stuck by another car. Therefore, the
state of each cell $s_i$ is entirely determined by the occupancy of the cell itself and its two
nearest neighbors $s_{i-1}$ and $s_{i+1}$. The motion rule can be summarized by the following
table, where all eight possible configurations $(s_{i-1} s_i s_{i+1})_t \to (s_i)_{t+1}$ are given
(111) (110) (101) (100) (011) (010) (001) (000)
  1     0     1     1     1     0     0     0          (2.7)
This cellular automaton rule turns out to be Wolfram’s rule 184 [20,47].
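As a minimal sketch of the rule in table 2.7, the following Python code updates a ring of cells synchronously and recomputes the Wolfram rule number from the eight outputs. The periodic boundary and the function names are assumptions made for the illustration.

```python
def rule184_local(left, here, right):
    """New state of a cell from its neighbourhood (1 = car, 0 = empty)."""
    if here == 1:
        return right      # a car stays only if the cell ahead is occupied
    return left           # an empty cell is filled by a car coming from the left


def rule184_step(cells):
    """One synchronous update of a ring of cells; cars move to the right."""
    n = len(cells)
    return [rule184_local(cells[i - 1], cells[i], cells[(i + 1) % n])
            for i in range(n)]


def wolfram_number(local_rule):
    """Rule number: output bits indexed by the neighbourhood read as a 3-bit integer."""
    number = 0
    for code in range(8):
        left, here, right = (code >> 2) & 1, (code >> 1) & 1, code & 1
        number |= local_rule(left, here, right) << code
    return number
```

For instance, `rule184_step([1, 1, 0])` returns `[1, 0, 1]`: the blocked car stays in place while the car with a free cell ahead advances.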
This simple dynamics captures an interesting feature of real car motion: traffic
congestion. Suppose we have a low car density ρ in the system, for instance something
like
. . . 0010000010010000010 . . .
(2.8)
This is a free traffic regime in which all the cars are able to move. The average velocity
< v >, defined as the number of motions divided by the number of cars, is then
$\langle v_f \rangle = 1$    (2.9)
where the subscript f indicates a free state. On the other hand, in a high density
configuration such as
. . . 110101110101101110 . . .
(2.10)
only 6 of the 12 cars will move and < v > = 1/2. This is a partially jammed regime.
If the car positions were uncorrelated, the number of moving cars (i.e. the number
of particle-hole pairs) would be given by Lρ(1 − ρ), where L is the system size. Since
the number of cars is ρL, the average velocity would be
$\langle v_{uncorrel} \rangle = 1 - \rho$    (2.11)
However, in this model, the car occupancy of adjacent sites is highly correlated and
the vehicles cannot move until a hole has appeared in front of them. The car distribu-
tion tries to self-adjust to a situation where there is one spacing between consecutive
cars. For densities less than one-half, this is easily realized and the system can organize
to have one car every other site.
Therefore, due to these correlations, equation 2.11 is wrong in the high density
regime. In this case, since a car needs a hole to move to, we expect that the number
of moving cars simply equals the number of empty cells [47]. Thus, the number of
motions is L(1 − ρ) and the average velocity in the jammed phase is
$\langle v_j \rangle = \frac{1 - \rho}{\rho}$    (2.12)
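The two regimes can be checked numerically. The sketch below (an illustration with hypothetical function names, assuming a periodic road and a random initial placement of the cars) lets rule 184 relax for a number of steps and then measures the fraction of cars that move, to be compared with 1 in the free regime and with (1 − ρ)/ρ in the jammed regime.

```python
import random


def rule184_step(cells):
    """One synchronous update of a ring of cells (1 = car, 0 = empty)."""
    n = len(cells)
    return [cells[(i + 1) % n] if cells[i] else cells[i - 1] for i in range(n)]


def moving_cars(cells):
    """Number of cars whose destination cell is empty."""
    n = len(cells)
    return sum(1 for i in range(n) if cells[i] == 1 and cells[(i + 1) % n] == 0)


def mean_velocity(n_cells, n_cars, seed=0):
    """Average velocity measured after a relaxation of 2 * n_cells steps."""
    rng = random.Random(seed)
    cells = [1] * n_cars + [0] * (n_cells - n_cars)
    rng.shuffle(cells)                 # random initial car positions
    for _ in range(2 * n_cells):       # let the correlations build up
        cells = rule184_step(cells)
    return moving_cars(cells) / n_cars
```

With 100 cells, `mean_velocity(100, 30)` can be compared with the free-traffic value and `mean_velocity(100, 70)` with equation 2.12, while the uncorrelated estimate 1 − ρ of equation 2.11 gives a different number in both cases.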
A richer version of the above CA traffic model is due to Nagel and Schrecken-
berg [50,37,38]. The cars may have several possible velocities u = 0, 1, 2, ..., umax.
Let ui be the velocity of car i and di the distance, along the road, separating cars i and
i + 1. The updating rule is:
• The cars accelerate when possible: $u_i \to u_i' = u_i + 1$, if $u_i < u_{max}$.
• The cars slow down when required: $u_i \to u_i' = d_i - 1$, if $u_i \ge d_i$.
• The cars have a random behavior: $u_i \to u_i' = u_i - 1$, with probability $p_i$ if $u_i > 0$.
• Finally the cars move $u_i$ sites ahead.
This rule captures some important behaviors of real traffic on a highway: velocity
fluctuations due to the non-deterministic behavior of the drivers, and “stop-and-go” waves
observed in the high-density traffic regime (i.e. some cars stop for no specific reason).
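The four steps of the Nagel and Schreckenberg rule translate almost line by line into code. The following Python sketch is an illustration only: it assumes a periodic road, the same slowing-down probability p for every car, and cars stored in their order along the road; the function and parameter names are hypothetical.

```python
import random


def nasch_step(pos, vel, road_len, u_max, p, rng):
    """One update of all cars.

    pos[i], vel[i]: position and velocity of car i; car i + 1 is the car
    directly ahead of car i (cyclically), and all positions are distinct.
    """
    n = len(pos)
    new_vel = []
    for i in range(n):
        # Distance d_i to the car ahead (the whole ring if the car is alone).
        d = (pos[(i + 1) % n] - pos[i]) % road_len or road_len
        u = vel[i]
        if u < u_max:                      # 1. accelerate when possible
            u += 1
        if u >= d:                         # 2. slow down when required
            u = d - 1
        if u > 0 and rng.random() < p:     # 3. random slowing down
            u -= 1
        new_vel.append(u)
    # 4. every car moves u sites ahead
    new_pos = [(x + u) % road_len for x, u in zip(pos, new_vel)]
    return new_pos, new_vel
```

Because step 2 limits the velocity to d − 1, a car never reaches the cell occupied by the car ahead, so the ordering of the cars along the road is preserved from one step to the next.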
A 2D traffic model
A CA traffic model can also be defined for the situation of a street network, where
several lanes may cross, provided that the rule is extended to deal with cars entering the
same road junction. In the case of urban traffic, we may restrict ourselves to a
one-speed CA.
Our approach is to model a road intersection as a rotary. Cars in the rotary have
priority over those wishing to enter. It is easy to add traffic lights in such a model by
blocking the entry to the rotary for cars coming from a given road. Note that road
crossings may be a bottleneck limiting the traffic flow and, thus, causing congestion.
Let us consider the case of a Manhattan-like city. We assume that horizontal roads
consist of two lanes, one for eastward motion and the other for westward motion.
Similarly, vertical streets are composed of northbound and southbound lanes. Road
junctions are formed by central points around which the traffic moves always in the
same direction.
A four-corner junction is shown in figure 2.5. The four middle cells constitute the
rotary. A vehicle on the rotary (like b or d) can either rotate counterclockwise or exit.
A local flag tf is used to decide the motion of a car in a rotary. If tf = 0, the vehicle
(like d) exits in the direction allowed by the color of its lane (see figure caption).
If tf = 1, the vehicle moves counterclockwise, like b. The value of the local turn
flag tf can be updated according to the modeling needs: it can be constant for some
amount of time to impose a particular motion at a given junction, completely random,
random with some bias to favor a direction of motion, or may change deterministically
according to any user specified rule.
Figure 2.6 shows typical traffic configurations. In figure (a), a vehicle has a
probability 1/2 to exit at each rotary cell. In figure (b), the turn flag tf has an initial
random distribution on the rotary. This distribution is fixed for the first 20 iterations
and then flips to tf = 1 − tf for the next 20 steps, and so on. In this way, a junction
acts as a kind of traffic light which, for some amount of time, allows only a given flow
pattern. We observed that the global traffic pattern is different in the two cases: in case
(a), the car distribution is quite homogeneous along the streets. On the other hand, in
case (b), cars get queued at some junctions while some other streets remain empty.
Figure 2.5: Example of a traffic configuration near a junction. The four central cells
represent a rotary which is traveled counterclockwise. The grey levels indicate the
different traffic lanes: white is a northbound lane, light grey an eastbound lane, grey a
southbound lane and, finally, dark grey is a westbound lane. The dots labeled a, b, c, d,
e, f, g and h are cars which will move to the destination cell indicated by the arrows,
as determined by the cell turn flag tf. Cars without an arrow are forbidden to move.
The behavior of the above traffic model can be described analytically [39]. The first
important fact is that a rotary junction has a maximum possible flow of cars. Thus, the
number of vehicles able to enter a rotary per unit time cannot be larger than a given
value determined by the rule of motion. Therefore, there is a critical average density
$\rho^{crit}_1$ above which the traffic is not free but constrained by this maximum rotary flow.
As a result, car queues are formed at road junctions.
The second key observation is that, in the regime above $\rho^{crit}_1$, the system
self-organizes in three different regions of fixed car densities: the queues that form before
a junction, the road segments after a junction, characterized by a low traffic density,
and the region inside a rotary. The three densities associated with these different regions
correspond to a jammed density ρj, a free traffic density ρf and a rotary density ρr,
respectively.
As the overall car number is increased, ρj, ρf and ρr remain constant: the result
of increasing the number of cars is to extend the length $\ell$ of the car queues, without
changing the density in the three regions. The reason for fixed densities is that, due
to the flow diagram of rule 184 [47], there are only two possible densities ρf and ρj
compatible with a given traffic flow ρ < v > along a road segment. Thus, the only
way to absorb an excess of cars is to increase the size of the queue.
When one keeps adding cars in the system, there is a second critical average density
$\rho^{crit}_2$ for which the length of some queues becomes larger than the distance separating
two consecutive street intersections. The up-traffic rotary output gets disturbed and,
from a maximum-flow traffic regime, one gets into a strongly jammed phase.
Figure 2.6: Traffic configuration after 600 iterations, for a car density of 30%. Streets
are white, buildings grey and the black pixels represent the cars. Situation (a)
corresponds to an equally likely behavior at each rotary junction, whereas image (b)
mimics the presence of traffic lights. In the second case, queues are more likely to form
and the global mobility is less than in the first case.

Provided that the turning decision at rotaries is random and not time correlated,
one typically obtains [39]
$\rho_f = \frac{1}{4}, \qquad \rho_j = \frac{3}{4}, \qquad \rho_r = \frac{1}{2}$    (2.13)
Assuming that the queue length is $\ell$ along all road segments and that the separation
between two consecutive junctions is L (the network period), we can relate the average
car density ρ to $\ell$ by the relation [47]
$4(L - 2 - \ell)\rho_f + 4\ell\rho_j + 4\rho_r = 4L\rho$    (2.14)
Equation 2.14 simply reflects that the total number of cars is distributed in three
regions: queues of length $\ell$ and density ρj, free traffic segments of length $L - \ell - 2$ and
density ρf, and rotaries of size four and density ρr.
In the case of large L, the queue length can be approximated by
$\frac{\ell}{L} = \frac{\rho - \rho_f}{\rho_j - \rho_f}$    (2.15)
Equation 2.15 provides a way to determine the critical densities $\rho^{crit}_1$ and $\rho^{crit}_2$. For
ρ < ρf, $\ell$ is negative, which should be interpreted in the sense that no queue is formed.
This is the free traffic regime. Thus, $\rho^{crit}_1 = \rho_f = 1/4$ and the average velocity is
< v > = 1, independent of ρ.
On the other hand, for ρf < ρ < ρj, car queues form but their lengths are smaller
than the distance between successive intersections. This is the maximum flow regime.
In this case, we have ρ < v >= J = const = 1/4, that is < v >= 1/(4ρ).
Finally, for ρ > ρj = $\rho^{crit}_2$, the queues reach their maximum length L and the rotary
exits are hindered. This is the strongly jammed traffic regime. The traffic velocity is
governed by the motion of holes and obeys equation 2.12, namely < v > = (1 − ρ)/ρ. If < v >
is taken as the order parameter, both of these transitions are second order.
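The three regimes can be collected in a single piecewise function. The sketch below simply encodes the formulas of this section with the densities of equation 2.13; the names are hypothetical, and clipping the queue fraction to the interval [0, 1] is an assumption reflecting the interpretation given in the text (no queue below ρf, queues of maximal length L above ρj).

```python
RHO_F, RHO_J, RHO_R = 0.25, 0.75, 0.5   # densities of equation 2.13


def queue_fraction(rho):
    """Queue length divided by L (equation 2.15), clipped to [0, 1]."""
    return min(1.0, max(0.0, (rho - RHO_F) / (RHO_J - RHO_F)))


def mean_velocity(rho):
    """Average velocity in the three traffic regimes of the street network."""
    if rho <= RHO_F:
        return 1.0                 # free traffic
    if rho <= RHO_J:
        return RHO_F / rho         # maximum flow: rho * <v> = J = 1/4
    return (1.0 - rho) / rho       # strongly jammed, equation 2.12
```

The velocity is continuous at both critical densities (it equals 1 at ρ = 1/4 and 1/3 at ρ = 3/4), which is consistent with the two transitions being second order when < v > is used as the order parameter.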
Figure 2.7 (a) shows the velocity-density diagram obtained from CA simulations,
for the situation we just described. We have considered various road spacings for our
measurements (i.e. the distance L separating consecutive intersections). The larger the
spacing, the better the agreement with the analytical description. Note that for small L,
the correlation along the lane cannot build up and < v > obeys equation 2.11.
In figure 2.7 (b), we also show the velocity-density diagram in the case where the drivers
choose the rotary exit at random but stick to this decision even if the exit they have
chosen is not free.
[Figure 2.7: panels (a) “free rotary” and (b) “fixed decision”; legend entries: road spacing = 256, 128, 64, 32 and road spacing = 256, 32, 4; axes: <v> (0 to 1) versus car density (0 to 1).]
Figure 2.7: Average velocity versus average density for the cellular automata street
network, for (a) time-uncorrelated turning strategies and (b) a fixed driver’s decision.
The different curves correspond to different distances L between successive road junc-
tions. The dashed line is the analytical prediction. Junction deadlock is likely to occur
in (b), resulting in a completely jammed state.
The present CA model can be adapted to simulate traffic in more realistic situations.
We have considered the case of the city of Geneva and its suburbs [52,53]. The
simulation uses the full road network (4000 km, 3145 road segments and 1066 junctions,
with a total of 800765 cells) and a large set of origin and destination pairs
(about 50’000) for the cars traveling during the rush hour.
The precise departure time of each vehicle is not known from observations. It is
natural to assume that the distribution of these departure times is not uniform. Here
we assume that this distribution has the form shown in figure 2.8 and is characterized
by two parameters: (i) the duration I of the departure period and (ii) the ratio p2/p1
specifying the degree of non-uniformity. Empirically we choose p2/p1 = 6 and I = 45
minutes (so that almost all cars have arrived after 90 minutes).
Due to the lack of data concerning the real evolution of the traffic state in the city
of Geneva, we did not investigate systematically the effect of varying p2/p1 and I.
Rather, we focused on the problem of measuring the time necessary for a test car to
travel from a given origin A to a given destination B. This time is of direct interest
to the drivers because it determines, for instance, when they must leave home in
order to be on time at work. This is also a quantity which is easily compared
with reality by actually driving from A to B.

Figure 2.8: Distribution of departure times used in the simulation of the city of Geneva
traffic. [Axes: probability (levels p1 and p2) versus time, from 0 to I.]
The interesting fact is that the travel time is a fluctuating quantity. If one repeats
the same trip under the same conditions (for instance the next day, at the same time), the
drive is likely to be longer or shorter. This fact is well known from everyday experience
and is also well reproduced in the CA model because the probability distribution of
the departure times gives the necessary randomness to produce fluctuations when the
simulation is repeated.
Our main result is that the amplitude of the variations of the travel times depends
very much on the departure time of the test car and on its trip. In the simulations, we
studied the four trips shown in figure 2.9.
The measured times obtained from the simulation for trips 2 and 3 are shown in
figure 2.10. The results for trips 1 and 4 are similar.
For trip 3, the average time needed to reach the desired destination is not constant:
it is maximal if the driver leaves 15 to 20 minutes after the start of the rush hour. It is
minimal if the driver leaves at the very beginning or the very end of interval I. On the
other hand, the average time for trip 2 is quite stable. These two situations differ by
the fact that trip 3 uses heavily loaded sections with many crossings while trip 2 uses
higher capacity sections.
We also observe that, for trip 3, it is impossible to make accurate predictions of
the time needed to reach the destination point. Variations of up to 30% show up. We call
this variation the risk¹ associated with the trip (for a given departure time) to describe
the fact that an expected outcome is likely not to occur. In practice, for trip 3, in
which the variation is high, there is a large risk of arriving late at the destination, or of being
too early, which may not be acceptable either. This also means that it is not possible to
establish an accurate schedule for taxis or public transportation, unless dedicated lanes
are available.
¹ In finance, the term risk is also used to describe the standard deviation of a random quantity.

Figure 2.9: The road network of Geneva used in our simulation and the four selected
trips considered to measure the travel time of a test car. [The map marks the origin and
destination of trips 1 to 4.]

Figure 2.10: Expectation time and “risk” of trips 2 and 3 of figure 2.9. The horizontal
axis corresponds to the departure time of a test vehicle within interval I. The dashed
line shows the average driving time and the shaded region indicates the amplitude of
the variation of this time (computed as the standard deviation). Note that the times
shown here are quite realistic, thus giving an indirect validation of our simulations
for the case of Geneva. [Panels: Trip 2 and Trip 3; axes: travel time in minutes (10 to 35)
versus departure time in minutes (0 to 45).]

Figure 2.11: Dynamical flow diagram for p2/p1 = 6. As time goes on (t ∈ [0, I]), the
car density first increases and the upper branch of the diagram is formed; then, when
the density decreases, the lower branch is measured. [Axes: average speed in km/h
(41 to 50) versus car density in % (0 to 6).]

Finally, figure 2.11 shows the dependence of < v >, the average car velocity in
the network, as a function of the average car density ρ. Since the traffic load is not
stationary but concentrated within about one and a half hours, the steady-state
density-velocity diagram (as shown for instance in fig. 2.7) is no longer valid and must be
replaced by a “dynamic” diagram which shows a significant hysteresis.
2.2.5 A simple gas: the HPP model
The HPP rule is a simple example of an important class of cellular automata models:
lattice gas automata (LGA). The basic ingredients of such models are point particles
that move on a lattice, according to appropriate rules so as to mimic a fully discrete
“molecular dynamics.”
The HPP lattice gas automaton is traditionally defined on a two-dimensional square
lattice. Particles can move along the main directions of the lattice, as shown in
figure 2.12. The model limits to 1 the number of particles entering a given site with a
given direction of motion. This is the exclusion principle, which is common in most
LGA. Consequently, four bits of information in each site are enough to describe the
system during its evolution. For instance, if at iteration t site r has the following state
s(r, t) = (1011), it means that three particles are entering the site along directions 1, 3
and 4, respectively.
Figure 2.12: Example of a configuration of HPP particles
The cellular automata rule describing the evolution of s(r, t) is often split in two
steps: collision and motion (or propagation). The collision phase specifies how the par-
ticles entering the same site will interact and change their trajectories. The purpose of
the HPP rule is to model a gas of colliding particles and, thus, essential features of this
step are borrowed from the real microscopic interactions, namely local conservation of
momentum and particle number. Since the collision phase amounts to rearranging the
particles in different directions, it ensures that the exclusion principle will be satisfied,
provided that it was at time t = 0.
During the propagation phase, the particles actually move to the nearest neighbor
site they are traveling to. Figure 2.13 illustrates the HPP rules. This decomposition
into two phases is a quite convenient way to partition the space so that the collision
rule is purely local.
According to our Boolean representation of the particles at each site, the collision
rules for the two head-on collisions are expressed as
(1010) → (0101),    (0101) → (1010)    (2.16)
all the other configurations being unchanged. During the propagation phase, the first
bit of the state variable is shifted to the east neighbor cell, the second bit to the north
and so on.
The aim of this rule is to reproduce some aspects of the real interactions between
particles, namely that momentum and particle number are conserved during a collision.

[Figure 2.13 panels: (a), (b) and (c), each shown at time t and at time t+1.]
Figure 2.13: The HPP rule: (a) a single particle has a ballistic motion until it expe-
riences a collision; (b) and (c) the two non-trivial collisions of the HPP model: two
particles experiencing a head on collision are deflected in the perpendicular direction.
In the other situations, the motion is ballistic, that is the particles are transparent to
each other when they cross the same site.
From figure 2.13, it is easily checked that these properties are obeyed: a pair of zero
momentum particles along a given direction is transformed into another pair of zero
momentum along the perpendicular axis.
The HPP rule captures another important ingredient of the microscopic nature of
a real interaction: invariance under time reversal. Figures 2.13 (b) and (c) show that,
if at some given time, the directions of motion of all particles are reversed, the system
will just trace back its own history. Since the dynamics of a deterministic cellular
automaton is exact, this fact allows us to demonstrate the property of physical systems
to return to their original situation when all the particles reverse their velocity.
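The collision and propagation steps, together with the time-reversal property, can be illustrated with a few lines of Python. This is a sketch under stated assumptions, not the implementation used for the figures: the lattice is periodic, the state is stored as four occupation grids in the bit order east, north, west, south, and all function names are hypothetical. Because the stored state describes particles entering a site (before collision), the `reverse` helper applies the collision before flipping the velocities; this bookkeeping detail is a choice of the sketch.

```python
import random

DIRS = [(1, 0), (0, 1), (-1, 0), (0, -1)]   # bit order: east, north, west, south


def collide(n):
    """Head-on pairs (1010) <-> (0101); all other configurations unchanged."""
    size = len(n[0])
    out = [[row[:] for row in grid] for grid in n]
    for y in range(size):
        for x in range(size):
            s = tuple(n[i][y][x] for i in range(4))
            if s == (1, 0, 1, 0) or s == (0, 1, 0, 1):
                for i in range(4):
                    out[i][y][x] = 1 - s[i]
    return out


def propagate(n):
    """Shift bit i of every site to the neighbouring site in direction DIRS[i]."""
    size = len(n[0])
    out = [[[0] * size for _ in range(size)] for _ in range(4)]
    for i, (dx, dy) in enumerate(DIRS):
        for y in range(size):
            for x in range(size):
                if n[i][y][x]:
                    out[i][(y + dy) % size][(x + dx) % size] = 1
    return out


def hpp_step(n):
    """One HPP iteration: collision followed by propagation."""
    return propagate(collide(n))


def reverse(n):
    """Collide, then reverse every velocity (east <-> west, north <-> south)."""
    c = collide(n)
    return [c[2], c[3], c[0], c[1]]


def mass(n):
    return sum(sum(map(sum, grid)) for grid in n)


def momentum(n):
    e, no, w, s = (sum(map(sum, grid)) for grid in n)
    return (e - w, no - s)


def random_gas(size, density, seed=0):
    """Each of the four bits of each site is occupied with probability density."""
    rng = random.Random(seed)
    return [[[1 if rng.random() < density else 0 for _ in range(size)]
             for _ in range(size)] for _ in range(4)]
```

Running T steps, calling `reverse`, running T more steps and calling `reverse` again brings the gas back to its initial configuration exactly, while `mass` and `momentum` keep the same values throughout.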
Figure 2.14 illustrates the time evolution of an HPP gas initially confined in the left
compartment of a container. There is an aperture on the wall of the compartment and
the gas particles will flow so as to fill the entire space available to them. In order to
include a solid boundary in the system, the HPP rule is modified as follows: when a
site is a wall (indicated by an extra bit), the particles no longer experience the HPP
collision but bounce back from where they came. Therefore, particles cannot escape a
region delimited by such a reflecting boundary.
Figure 2.14: Time evolution of a HPP gas. (a) From the initial state to equilibrium.
(b) Illustration of time reversal invariance: in the rightmost image of (a), the velocity
of each particle is reversed and the particles naturally return to their initial position.
If the system of figure 2.14 is evolved, it reaches an equilibrium after a long enough
time and no macroscopic trace of its initial state is visible any longer. However, no
information has been lost during the process (no numerical dissipation) and the system
retains the memory of where it came from. Reversing all the velocities and iterating the
HPP rule makes all particles go back to the compartment in which they were initially
located.
This behavior is only possible because the dynamics is perfectly exact and
no numerical errors are present in the numerical scheme. If one externally introduces
some errors (for instance, by adding an extra particle to the system) before the
direction of motion of each particle is reversed, then reversibility is lost.
The HPP rule is important because it contains the basic ingredients of many models
we are going to discuss below. However, the capability of this rule to model a real gas
of particles is poor, due to a lack of isotropy and spurious invariants. We shall see in
section 2.3 that a remedy to this problem is to use a different lattice.
2.2.6 Random walk
The HPP rule we discussed in the previous section can be easily modified to produce
many synchronous random walks. Instead of experiencing a mass and momentum
conserving collision, each particle now selects, at random, a new direction of motion
among the possible values permitted by the lattice. Since several particles may enter
the same site (up to four, on a two-dimensional square lattice), the random change of
directions should be such that there are never two or more particles exiting a site in the
same direction. This would otherwise violate the exclusion principle.
The solution is to shuffle the directions of motion or, more precisely, to perform
a random permutation of the velocity vectors, independently at each lattice site and
each time step. Figure 2.15 illustrates this probabilistic evolution rule. Note that at
a macroscopic level of description, the random walk rule corresponds to a diffusion
process (see section 2.5.3).
Figure 2.15: How the entering particles are deflected at a typical site, as a result of the
diffusion rule. The four possible outcomes occur with respective probabilities p0, p1,
p2 and p3. The figure shows four particles, but the mechanism is data-blind and any
one of the arrows can be removed when fewer entering particles are present.
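A minimal Python sketch of such a diffusion rule is given here. It assumes that the four outcomes of figure 2.15 are rotations of all entering velocities by 0, 90, 180 and 270 degrees, chosen with probabilities p0 to p3; this reading of the figure, the periodic lattice and the function names are assumptions of the illustration. Since a rotation is a permutation of the four directions, two particles never leave a site along the same direction.

```python
import random

DIRS = [(1, 0), (0, 1), (-1, 0), (0, -1)]   # east, north, west, south


def diffusion_step(n, probs, rng):
    """Rotate the entering particles of each site by k * 90 degrees, then move them.

    n[i][y][x] = 1 if a particle enters site (x, y) along DIRS[i];
    probs = (p0, p1, p2, p3) are the probabilities of the four rotations.
    """
    size = len(n[0])
    out = [[[0] * size for _ in range(size)] for _ in range(4)]
    for y in range(size):
        for x in range(size):
            # One rotation per site and per time step, the same for all
            # particles present (data-blind mechanism).
            k = rng.choices(range(4), weights=probs)[0]
            for i in range(4):
                if n[i][y][x]:
                    j = (i + k) % 4
                    dx, dy = DIRS[j]
                    out[j][(y + dy) % size][(x + dx) % size] = 1
    return out


def particle_number(n):
    return sum(sum(map(sum, grid)) for grid in n)
```

With `probs = (1, 0, 0, 0)` the particles move ballistically, while equal probabilities give an unbiased random walk for every particle; in all cases the number of particles is conserved.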
As an example of the use of the present random walk cellular automata rule, we
discuss an application to growth processes. In many cases, growth is governed by
a spatial quantity such as an electric field, a local temperature, or a particle density
field [54]. Aggregation constitutes an important mechanism: like particles stick to
each other as they meet and, as a result, form a complicated pattern with a branching
structure.
A prototype model of aggregation is the so-called DLA model (diffusion-limited
aggregation), introduced by Witten and Sander [55] in the early 1980s. Since its
introduction, the DLA model has been investigated in great detail. However,
diffusion-limited aggregation is a far-from-equilibrium process which cannot be described
theoretically from first principles alone. Spatial fluctuations that are typical of DLA growth
are difficult to take into account and a numerical approach is necessary to complete the
analysis.
DLA-like processes can be readily modeled by our diffusion cellular automata,
provided that an appropriate rule is added to take into account the particle-particle
aggregation. The first step is to introduce rest particles to represent the particles of
the aggregate. Therefore, in a two-dimensional system, a lattice site can be occupied
by up to four diffusing particles, or by one “solid” particle. Our approach has some
differences compared with the original Witten and Sander model. All particles reside
on a lattice and move simultaneously. They can stick to different parts of the cluster and
we do not launch them, one after the other, from a region far away from the cluster.
For this reason, we may expect some quantitative variation from the original DLA
properties.

Figure 2.16: Two-dimensional cellular automata DLA-like cluster (black), obtained
with ps = 1, an aggregation threshold of 1 particle and a density of diffusing particles
of 0.06 per lattice direction. The gray dots represent the diffusing particles not yet
aggregated.
Figure 2.16 shows a two-dimensional DLA-like cluster grown by the cellular automata
dynamics. At the beginning of the simulation, one or more rest particles are
introduced in the system to act as aggregation seeds. The rest of the system is filled
with particles with average concentration ρ. When a diffusing particle becomes a nearest
neighbor of a rest particle, it stops and sticks to it by transforming into a rest particle.
Since several particles can enter the same site, we may choose to aggregate all of them
at once (i.e. a rest particle is actually composed of several moving particles), or to
accept the aggregation only when a single particle is present.
In addition to this question, the sticking condition is important. If any diffusing
particle always sticks to the DLA cluster, the growth is very fast and can be influenced
by the underlying lattice anisotropy. It is therefore more appropriate to stick with some
probability ps. Since up to four particles may simultaneously be candidates for
aggregation, we can also use this fact to modify the sticking condition. A simple way is to
require that the local density of particles be larger than some threshold (say 3 particles)
to yield aggregation. The cluster shown in figure 2.16 has fractal dimension df = 1.78,
which is not very different from the genuine, off-lattice DLA fractal dimension [56,54]
df = 1.70.
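The aggregation rule can be combined with the diffusion rule in a compact sketch. The Python code below is an illustration under several assumptions that are not taken from the text: the lattice is periodic, all particles present on a site are aggregated at once, the velocities are rotated by a uniformly chosen multiple of 90 degrees, and a particle that would enter a solid site bounces back (which only matters when ps < 1 or the threshold exceeds 1). The threshold is assumed to be at least 1, and all names are hypothetical.

```python
import random

DIRS = [(1, 0), (0, 1), (-1, 0), (0, -1)]   # east, north, west, south


def initial_state(size, density, seed=0):
    """Random diffusing particles and a single rest particle in the middle."""
    rng = random.Random(seed)
    n = [[[1 if rng.random() < density else 0 for _ in range(size)]
          for _ in range(size)] for _ in range(4)]
    solid = [[0] * size for _ in range(size)]
    c = size // 2
    solid[c][c] = 1                       # the aggregation seed
    for i in range(4):
        n[i][c][c] = 0                    # no diffusing particle on the seed
    return n, solid, rng


def dla_step(n, solid, p_stick, threshold, rng):
    """Aggregation followed by one diffusion step; solid[y][x] counts frozen particles."""
    size = len(solid)
    # 1. Aggregation, decided from the cluster as it is at the start of the step.
    frozen = []
    for y in range(size):
        for x in range(size):
            if solid[y][x]:
                continue
            count = sum(n[i][y][x] for i in range(4))
            touches = any(solid[(y + dy) % size][(x + dx) % size] for dx, dy in DIRS)
            if touches and count >= threshold and rng.random() < p_stick:
                frozen.append((x, y, count))
    for x, y, count in frozen:
        solid[y][x] = count
        for i in range(4):
            n[i][y][x] = 0
    # 2. Diffusion: random rotation of the velocities, then propagation.
    out = [[[0] * size for _ in range(size)] for _ in range(4)]
    for y in range(size):
        for x in range(size):
            k = rng.randrange(4)
            for i in range(4):
                if n[i][y][x]:
                    j = (i + k) % 4
                    dx, dy = DIRS[j]
                    tx, ty = (x + dx) % size, (y + dy) % size
                    if solid[ty][tx]:
                        out[(j + 2) % 4][y][x] = 1   # bounce back from the cluster
                    else:
                        out[j][ty][tx] = 1
    return out, solid


def total_particles(n, solid):
    """Diffusing particles plus particles frozen in the cluster."""
    return sum(sum(map(sum, grid)) for grid in n) + sum(map(sum, solid))
```

Since a site can only solidify next to an existing rest particle, the cluster grown in this way stays connected to its seed, and `sum(map(sum, solid))` gives the cluster mass M(t) at any time.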
The cellular automata approach is also well suited to study dynamical properties
such as the DLA growth rate. The standard numerical experiment is to distribute
uniformly the initial diffusing particles on the lattice with a single aggregation seed in
the middle. As time t goes on, more and more particles get solidified and the cluster
mass M(t) increases. Our simulations indicate (see figure 2.17) that this process has
an intermediate regime governed by a power law
$M(t) \sim t^{\alpha}$, where $\alpha \approx 2$,
in both two and three dimensions. Although these results are not sufficient to conclude
definitely that the 2-D and 3-D exponents are the same, an explanation would be that in
3-D there is more surface to stick to than in 2-D, but also more space to explore before
a diffusing particle can aggregate. These two effects may just compensate.

Figure 2.17: Formation rate of cellular automata DLA clusters in two and three
dimensions. The lattice has periodic boundary conditions. [Axes: ln(mass) (2.3 to 12.3)
versus ln(time) (3 to 9.8); curves labeled 2-D and 3-D.]
2.2.7 The traveling ant
The ant rule is a cellular automaton invented by Chris Langton [57] and Greg Turk which
models the behavior of a hypothetical animal (ant) having a very simple algorithm of
motion. The ant moves on a square lattice whose sites are either white or gray. When
the ant enters a white cell, it turns 90 degrees to the left and paints the cell gray.
Similarly, if it enters a gray cell, it paints it white and turns 90 degrees to the right.
It turns out that the motion of this ant exhibits a very complex behavior. Suppose
the ant starts in a completely white space. After a series of about 500 steps where it
essentially keeps returning to its initial position, it enters a chaotic phase during which
its motion is unpredictable. Then, after about 10000 steps of this very irregular motion,
the ant suddenly performs a very regular motion which brings it far away from where
it started.
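The ant rule is short enough to be written out in full. The following Python sketch (hypothetical names, an unbounded lattice stored as a set of gray cells, and an ant initially heading up, all of which are choices of the illustration) records the successive positions of a single ant starting in a completely white space.

```python
def langton_path(steps):
    """Positions visited by a single ant and the final set of gray cells."""
    gray = set()                 # all cells are white initially
    x, y = 0, 0                  # the ant starts at the origin
    dx, dy = 0, 1                # heading up
    path = [(x, y)]
    for _ in range(steps):
        if (x, y) in gray:
            dx, dy = dy, -dx     # gray cell: turn 90 degrees to the right
            gray.remove((x, y))  # and paint the cell white
        else:
            dx, dy = -dy, dx     # white cell: turn 90 degrees to the left
            gray.add((x, y))     # and paint the cell gray
        x, y = x + dx, y + dy
        path.append((x, y))
    return path, gray
```

Comparing `path[t + 104]` with `path[t]` for large t is a simple way to detect the regular regime: once the ant has left the irregular region, the displacement over 104 steps no longer depends on t.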
[Figure 2.18 panels: t = 6900, t = 10431 and t = 12000.]
Figure 2.18: Langton’s ant rule. The motion of a single ant starts with a chaotic
phase of about 10000 time steps, followed by the formation of a highway. The figure
shows the state of each lattice cell (gray or white) and the ant position (marked by
the black dot). In the initial condition all cells are white and the ant is located in the
middle of the image.
Figure 2.18 illustrates the ant motion. The path the ant creates to escape the chaotic
initial region has been called a highway [58]. Although this highway is oriented at 45
degrees with respect to the lattice direction, it is traveled by the ant in a way which
is strongly reminiscent of a sewing machine: the pattern is a sequence of 104 steps
which is repeated indefinitely.
The Langton ant is a good example of a cellular automaton whose rule is very simple
and yet generates a complex behavior which seems beyond our understanding.
Somehow, this fact is typical of the cellular automata approach: although we do know
everything about the fundamental laws governing a system (because we set up the rules
ourselves!), we are often unable to explain its macroscopic behavior.
There is nevertheless a global property of the ant motion that can be proved: the ant
visits an unbounded region of space, whatever the initial space texture is (configuration
of gray and white cells).
The proof (due to Bunimovitch and Troubetzkoy) goes as follows: suppose the
region the ant visits is bounded. Then, it contains a finite number of cells. Since the
number of iterations is infinite, there is a domain of cells that are visited infinitely often.
Moreover, due to the rule of motion, a cell is either entered horizontally (we call it an
H cell) or vertically (we call it a V cell). Since the ant turns by 90 degrees after each
step, an H cell is surrounded by four V cells and conversely. As a consequence, the
H and V cells tile the lattice in a fixed checkerboard pattern. Now, we consider the
upper rightmost cell of the domain, that is a cell whose right and upper neighbors are not
visited. This cell exists if the trajectory is bounded. If this cell is an H cell (and
remains so forever), it has to be entered horizontally from the left and exited vertically
downward and, consequently, be gray. However, after the ant has left, the cell is white,
so at its next visit the ant would turn left and exit upward, toward a cell that is not
visited: there is a contradiction. The same contradiction appears if the cell is a V cell.
Therefore, the ant trajectory is not bounded.
As described so far, the rule is defined only when a single ant moves on the lattice.
We can easily generalize it to the case where many ants are simultaneously present, so
that up to four of them may enter the same site at the same time, from different sides.
Following the same idea as in the HPP rule, we introduce $n_i(\mathbf{r}, t)$ as a Boolean
variable representing the presence ($n_i = 1$) or the absence ($n_i = 0$) of an ant entering
site $\mathbf{r}$ at time $t$ along lattice direction $\mathbf{c}_i$, where $\mathbf{c}_1$, $\mathbf{c}_2$,
$\mathbf{c}_3$ and $\mathbf{c}_4$ stand for the directions right, up, left and down, respectively.
If the color $\mu(\mathbf{r}, t)$ of the site is gray ($\mu = 0$), all entering ants turn 90 degrees
to the right. On the other hand, if the site is white ($\mu = 1$), they all turn 90 degrees to
the left. The color of each cell is modified after one or more ants have gone through.
Here, we choose to switch $\mu \to 1 - \mu$ only when an odd number of ants is present.
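The generalized rule can be sketched as one synchronous update over all occupied sites. This is only an illustration under stated assumptions: the lattice is unbounded, gray cells are kept in a set, and each ant is stored as a tuple `(x, y, dx, dy)` meaning "entering site (x, y) along direction (dx, dy)". The name `multi_ant_step` and this data layout are hypothetical choices, not taken from the text.

```python
from collections import defaultdict


def multi_ant_step(gray, ants):
    """One synchronous update of the multi-ant rule.

    gray : set of (x, y) cells that are gray (mu = 0); all other cells are white (mu = 1)
    ants : set of (x, y, dx, dy), an ant entering site (x, y) along direction (dx, dy)
    Returns the new (gray, ants) pair; the inputs are left unchanged.
    """
    by_site = defaultdict(list)
    for x, y, dx, dy in ants:
        by_site[(x, y)].append((dx, dy))

    new_gray = set(gray)
    new_ants = set()
    for (x, y), directions in by_site.items():
        is_gray = (x, y) in gray
        for dx, dy in directions:
            # gray: turn right, white: turn left (all ants on the site turn the same way)
            ndx, ndy = (dy, -dx) if is_gray else (-dy, dx)
            new_ants.add((x + ndx, y + ndy, ndx, ndy))
        if len(directions) % 2 == 1:
            # the color switches only when an odd number of ants is present
            new_gray.symmetric_difference_update({(x, y)})
    return new_gray, new_ants


# One ant on a white site: it turns left and the site becomes gray.
print(multi_ant_step(set(), {(0, 0, 1, 0)}))
```

With a single ant this reduces to the original rule; with two ants on the same site the color is left unchanged, as prescribed above.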
[Panels: t = 2600, t = 4900, t = 8564]
Figure 2.19: Motion of several Langton’s ants. Gray and white indicate the colors
of the cells at the current time. Ant locations are marked by the black dots. At the
initial time, all cells are white and a few ants are randomly distributed in the central
region, with random directions of motion. The first highway appears much earlier
than when the ant is alone. In addition, the highway can be used by other ants to travel
much faster. However, the “highway builder” is usually prevented from continuing its
construction as soon as it is reached by the following ants. For instance, the highway
heading north-west after 4900 steps gets destroyed. A new highway emerges later on
from the remains, as we see from the snapshot at time t = 8564.
When several ants travel simultaneously on the lattice, cooperative and destructive
behaviors are observed. First, the erratic motion of several ants favors the formation of
a local arrangement of colors allowing the creation of a highway. One has to wait much
less time before the first highway appears. Second, once a highway is being created,
other ants may use it to travel very fast (they do not have to follow the complicated
pattern of the highway builder). In this sense, the term “highway” is very appropriate.
Third, a destructive effect occurs when the second ant catches up with the highway
builder. It breaks the pattern and several situations may be observed. For instance, both
ants may enter a new chaotic motion; or the highway is traveled in the other direction
(note that the rule is time reversal invariant) and destroyed. Figure 2.19 illustrates the
multi-ant behavior.
The question of whether the trajectory is unbounded arises again with this generalized
motion. The assumption of Bunimovitch-Troubetzkoy’s proof no longer holds in this case
because a cell may be both an H and a V cell. Indeed, two different ants may enter the
same cell, one vertically and the other horizontally. Actually, the theorem of unbounded
motion fails in several cases where two ants are present. Periodic motions may
occur when the initial positions are well chosen.
For instance, when the relative location of the second ant with respect to the first
one is (∆x, ∆y) = (2, 3), the two ants return to their initial positions after 478
iterations of the rule (provided they started on a uniformly white substrate, with the
same direction of motion). A very complicated periodic behavior is observed when
(∆x, ∆y) = (1, 24): the two ants start a chaotic-like motion for several thousands of
steps. Then, one ant builds a highway and escapes from the central region. After a
while, the second ant finds the entrance of the highway and rapidly catches up with the
first one. After the two ants meet, they start undoing their previous paths and return to
their original positions. This complete cycle takes about 30000 iterations.
More generally, it is found empirically that, when ∆x + ∆y is odd and the ants
enter their sites with the same initial direction, the two-ant motion is likely to be
periodic. However, this is not a general rule: the configuration (∆x, ∆y) = (1, 0) yields
an unbounded motion, a diamond pattern of increasing diameter which is traveled in the
same direction by the two ants.
It turns out that the periodic behavior of a two-ant configuration is not so surprising.
The rule we defined is reversible in time, provided that there is never more than one ant
at the same site. Time reversal symmetry means that if the directions of motion of all
ants are reversed, they will move backward through their own sequence of steps, with
the opposite direction of motion. Therefore, if at some point of their motion the two
ants cross each other (on a lattice link, not on a site), the first ant will go through the
past of the second one, and vice versa. They will return to the initial situation (the two
ants being exchanged) and build a new pattern, symmetrical to the first one, due to the
inversion of the directions of motion. The whole process then cycles forever. Periodic
trajectories are therefore related to the probability that the two ants will, at some
time, cross each other in a suitable way. The conditions for this to happen are fulfilled
when the ants sit on different sublattices (black or white sites of the checkerboard)
and exit two adjacent sites toward each other. This explains why a periodic motion is
likely to occur when ∆x + ∆y is odd.
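Time reversal invariance can be checked numerically in the simplest setting of a single ant. The sketch below assumes that the state of the ant is the pair "site being entered, direction of entry", so that reversing the motion means stepping back onto the site just left while heading the opposite way. The helper names `step`, `reverse` and `round_trip` are illustrative only.

```python
def step(gray, ant):
    """Advance one ant by one step: gray -> right turn, white -> left turn, color flips."""
    x, y, dx, dy = ant
    if (x, y) in gray:
        dx, dy = dy, -dx
        gray.remove((x, y))
    else:
        dx, dy = -dy, dx
        gray.add((x, y))
    return (x + dx, y + dy, dx, dy)


def reverse(ant):
    """Time-reversed state: re-enter the site just left, with the opposite direction."""
    x, y, dx, dy = ant
    return (x - dx, y - dy, -dx, -dy)


def round_trip(n_steps):
    """Run n_steps forward, reverse the ant, run n_steps again; return the final state."""
    gray, ant = set(), (0, 0, 1, 0)
    for _ in range(n_steps):
        ant = step(gray, ant)
    ant = reverse(ant)
    for _ in range(n_steps):
        ant = step(gray, ant)
    return gray, ant


gray, ant = round_trip(5000)
print(len(gray), ant)
```

After the round trip the lattice is entirely white again and the ant is in the reversed image of its initial state, which is the property the argument above relies on.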
2.2.8 Population dynamics
In addition to physical, chemical or biological systems, the CA approach is interesting
for the study of simple population models. Several different problems can be envisaged,
such as the simulation of ecosystems or the social behavior in a population of
interacting individuals. Here we consider an example of the latter situation.
The social behavior of a group of persons is certainly related to the fact that each
individual has its own autonomy and perception of the environment. On the other hand,
the behavior of a whole population may also reflect some “mechanical” or spontaneous
response of an individual to the situation it is confronted with. We may hope that the
collective behavior that may emerge from such a process could be captured by some
CA model, provided that one is able to find the rule which each individual obeys. At
least it is worth checking whether a given social behavior can be explained by such
mechanisms before invoking the fact that each individual is free to think and act
in its own way.
Here we address the generic problem of the competition between two different
groups over a fixed area. We present a “voter model” which describes the dynamical
behavior of a population with bimodal conflicting interests and study the conditions of
extinction of one of the initial groups [59].
This model can be thought of as describing the fight between smokers and non-smokers:
in a small group of persons, a majority of smokers will usually convince the few others to
smoke and vice versa. The interesting case is when an equal number of smokers and
non-smokers meet. In that case, it may be assumed that a social trend will decide between
the two attitudes. In the US, smoking is viewed as a disadvantage whereas, in France,
it is rather well accepted. In other words, there is a bias that selects the winning
party in an even situation. In our example, whether one studies the French or the US case,
the bias will be in favor of the smokers or the non-smokers, respectively.
The same mechanism can be associated with the problem of competing standards.
The choice of one or the other standard is often driven by the opinion of the majority
of people one meets. But, when the two competing systems are equally represented,
the intrinsic quality of the product will be decisive. Price and technological advance
then play the role of a bias.
Here we consider the case of four-person confrontations in a spatially extended
system in which the actors (species A or B) move randomly. Initially, the B species
is present with density b0 and the A species with density 1 − b0. The B individuals
are supposed to have a qualitative advantage over the As but are less numerous. The
question we want to address is: what is the minimal density b0 which makes the Bs win
over the As (i.e. invade the entire system at the expense of the A individuals)? The
process of spatial contamination of opinion plays a crucial role in this dynamics.
The CA rule we propose here [59] to describe this process is derived from a model
by Galam [60], in which the four individuals involved in a tournament are randomly
chosen among the current population, whose composition in A or B type of persons
evolves after each confrontation. The density threshold for an invading emergence of
B is bc = 0.23 if the B group has a qualitative bias over A. With a spatial distribution
of the species, even if b0 < bc, B can still win over A provided that it strives for
confrontation. Therefore a qualitative advantage is found not to be enough to win. A
geographic organization as well as a definite degree of aggressiveness are instrumental
in overcoming the less fit majority.
The model we use to describe the two populations A and B influencing each other
or competing for some unique resources, is based on the diffusion automaton proposed
in section 2.2.6. The particles have two possible internal states (±1), coding for the A
or B species, respectively.
The individuals move on a two-dimensional square lattice. At each site, there are
always four individuals (any combination of A’s and B’s is possible). These four
individuals each travel in a different lattice direction (north, east, south and west).
The interaction takes place in the form of “fights” between the four individuals
meeting on the same site. At each fight, the group nature (A or B) is updated according
to the majority rule when possible, and otherwise with a bias in favor of the fitter
group:
• The local majority species (if any) wins:
$$ nA + mB \to \begin{cases} (n + m)A & \text{if } n > m \\ (n + m)B & \text{if } n < m \end{cases} $$
where n + m = 4.
• When there is an equal number of A and B on a site, B wins the confrontation
with probability 1/2 + β/2. The quantity β ∈ [0, 1] is the bias accounting for
some advantage (or extra fitness) of species B.
The above rule is applied with probability k. Thus, with probability 1 − k the group
composition does not change because no fight occurs. Between fights, the agents of both
populations perform a random walk on the lattice.
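The local fight rule described above can be sketched for a single site as follows. This is an illustrative fragment only: it covers one site holding four individuals and leaves out the random-walk propagation step. The function name `fight` and the representation of a site as a list of `'A'`/`'B'` labels are assumptions made for the example.

```python
import random


def fight(site, k, beta, rng=random):
    """Apply the local rule to one site holding four individuals ('A' or 'B').

    k    : probability that a fight takes place at this time step
    beta : bias in [0, 1]; on a tie, B wins with probability 1/2 + beta/2
    """
    if rng.random() >= k:             # with probability 1 - k no fight occurs
        return list(site)
    n_b = site.count('B')
    if n_b > 2:                       # local majority of B
        winner = 'B'
    elif n_b < 2:                     # local majority of A
        winner = 'A'
    else:                             # tie: biased coin in favor of B
        winner = 'B' if rng.random() < 0.5 + beta / 2 else 'A'
    return [winner] * 4


print(fight(list('AABB'), k=1.0, beta=1.0))   # tie with full bias
print(fight(list('AAAB'), k=1.0, beta=1.0))   # majority of A
```

With β = 1 every tie goes to B, which is the case shown in figures 2.20 and 2.21; with β = 0 ties are decided by a fair coin.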
The behavior of this model is illustrated in figure 2.20. The current configuration
is shown at three different time steps. We can observe the growth of dense clusters of
B invading the system.
It is clear that the richness of the model comes from the even confrontations. If only
fights between an odd number of individuals took place, the initial majority population
would always win after some short time. The key parameters of this model are (i) k, the
aggressiveness (probability of confrontation), (ii) β, the bias of B for winning a tie and
(iii) b0, the initial density of B.
The strategy according to which a minority of B’s (albeit with a technical, genetic or
persuasive advantage) can win against a large population of A’s is not obvious. Should
they fight very often, try to spread or accept a peace agreement? We study the parameter
space by running the cellular automaton.
[Panels: t = 10, t = 30, t = 70]
Figure 2.20: Configurations of the voter CA model, at three different times. The A and
B species are represented by the gray and white regions, respectively. The parameters
of the simulation are b0 = 0.1, k = 0.5 and β = 1.
In the limit of low aggressiveness (k → 0), the particles move for a long time before
fighting. Due to the diffusive motion, correlations between successive fights are
destroyed and B wins provided that b0 > 0.23 and β = 1. This is the mean-field level of
our dynamical model, which corresponds to the theoretical calculations made in [60].
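The value 0.23 can be recovered from a simple mean-field sketch. The map below is an assumption consistent with the rule stated earlier, not a transcription of [60]: groups of four are drawn at random from a well-mixed population with B density p, a majority converts the whole group, and a tie goes to B with probability 1/2 + β/2. For β = 1 the nontrivial fixed point of this map is (5 − √13)/6 ≈ 0.232.

```python
from math import sqrt


def mean_field_update(p, beta=1.0):
    """Density of B after one round of random four-person fights (well-mixed population)."""
    q = 1.0 - p
    majority_b = p**4 + 4 * p**3 * q          # groups with 4 or 3 B's
    tie = 6 * p**2 * q**2                     # groups with 2 A's and 2 B's
    return majority_b + (0.5 + beta / 2) * tie


# Unstable fixed point for beta = 1, i.e. the root of 3p^2 - 5p + 1 = 0 lying in (0, 1).
b_c = (5 - sqrt(13)) / 6


def iterate(p, n_rounds=50):
    """Apply the mean-field map n_rounds times."""
    for _ in range(n_rounds):
        p = mean_field_update(p)
    return p


print(round(b_c, 4), iterate(0.20), iterate(0.25))
```

Starting slightly below the fixed point the B density decays to zero, and starting slightly above it grows to one, which is the threshold behavior described in the text.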
More generally, we observe that B can win even when b0 < 0.23, provided it
acts aggressively, i.e. has a large enough k. Thus, there is a critical density
bdeath(k) < 0.23 such that, when b0 > bdeath(k), all A’s are eliminated in the final
outcome. Below bdeath, B loses unless some specific spatial configurations of B’s are
present.
Therefore the growth of species B at the expense of A is obtained by a spatial
organization. Small clusters that may accidentally form act as nuclei from which the
B’s can develop. In other words, above the mean-field threshold bc = 0.23 there is no
need to organize in order to win but, below this value, only condensed regions will be
able to grow. When k is too small, such an organization is not possible (it is destroyed
by diffusion) and the strength advantage of B does not lead to success.
Figure 2.21 summarizes, as a function of b0 and k, the regions where either A or
B succeeds. It is found that the separation curve satisfies the equation
$(k + 1)^7 (b_0 - 0.077) = 0.153$.
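Solving the separation curve for the initial density gives the critical value directly as a function of k. The short sketch below only rearranges the fitted equation quoted above; the function name `b_death` is an illustrative choice.

```python
def b_death(k):
    """Critical initial density of B from the fitted curve (k + 1)^7 (b0 - 0.077) = 0.153."""
    return 0.077 + 0.153 / (k + 1) ** 7


for k in (0.0, 0.25, 0.5, 1.0):
    print(k, b_death(k))
```

For k = 0 the expression reduces to 0.077 + 0.153 = 0.23, the mean-field threshold, and it decreases monotonically as the aggressiveness k grows.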
It is also interesting to study the time needed to completely annihilate the loser.
Here, time is measured as the number of fights per site (i.e. kt, where t is the iteration
time of the automaton). We observe that the dynamics is quite fast and a few units of
time are sufficient to yield a collective change of opinion.
Following the same methodology, more complicated interactions between individ-
uals can be investigated. The case of a non-constant bias is quite interesting and is
described in [59]. In conclusion, although this model is very simple, it abstracts the
complicated behavior of real life agents by capturing some essential ingredients. For
this reason, the results we have presented may shed light on the generic mechanisms
observed in a social system of opinion making.
[Figure 2.21 plot: k on the vertical axis and b0 on the horizontal axis, both from 0.0 to 1.0, with regions labeled A and B]
Figure 2.21: Phase diagram for our socio-physical model with β = 1. The curve
delineates the regions where, on the left, A wins with high probability and, on the
right, B wins with probability one. The outcome depends on b0, the initial density of
B and k, the probability of a confrontation.
In particular we see that the correlations existing between successive fights may
strongly affect the global behavior of the system and that an organization is the key
feature to obtain a definite advantage over the other population. This observation is
important. For instance, during a campaign against smoking or an attempt to impose a
new system, it is much more efficient (and cheaper) to target the effort on small nuclei
of persons rather than sending the information in an uncorrelated manner.
2.3 From micro-physics to macro-physics
In the previous section, we have discussed several cellular automata rules which are
relevant to the description of physical processes. The question is, of course, how close
these models are to the reality they are supposed to simulate.
In general, space and time are not discrete and, in classical physics, the state variables
are continuous. Thus, it is crucial to show how a cellular automaton rule is connected
to the laws of physics or to the usual quantities describing the phenomena which
are modeled. This is particularly important if the cellular automaton is intended to be
used as a numerical scheme to solve practical problems.
Lattice gas automata have a large potential of applications in hydrodynamics and
reaction-diffusion processes. The purpose of this section is to present the techniques
that are used to establish the connection between the macroscopic physics and the
microscopic discrete dynamics of the automaton. The problem one has to address is
the statistical description of a system of many interacting particles. The methods we
shall discuss here are very close, in spirit, to those applied in kinetic theory: the N-body
dynamics is described in terms of macroscopic quantities like the particle density or the
velocity field. The derivation of a Boltzmann equation is a main step in this process.
To illustrate the method we first present the so-called FHP CA fluid model because
this system features all the relevant steps of the derivation.
2.3.1 The FHP model
The FHP rule is a model of a two-dimensional fluid which has been introduced by
Frisch, Hasslacher and Pomeau [21], in 1986. We will show here how the fully discrete
microscopic dynamics maps onto the macroscopic behavior of hydrodynamics.
The model describes the motion of particles traveling in a discrete space and col-
liding with each other, very much in the same spirit as the HPP lattice gas discussed
in section 2.2.5. The main difference is that, for isotropy reasons that will become
clear below, the lattice is hexagonal (i.e. each site has six neighbors, as shown in
figure 2.22).
The FHP model is an abstraction, at a microscopic scale, of a fluid. It is expected
to contain all the salient features of a real fluid. It is well known that the continuity and
Navier-Stokes equations of hydrodynamics express the local conservation of mass and
momentum in a fluid. The detailed nature of the microscopic interactions does not
affect the form of these equations but only the values of the coefficients (such as the
viscosity) appearing in them. Therefore, the basic ingredients one has to include in
the microdynamics of the FHP model are the conservation of particles and of momentum
after each updating step. In addition, some symmetries are required so that, in the
macroscopic limit, where time and space can be considered as continuous variables,
the system is isotropic.
As in the case of the HPP model, the microdynamics of FHP is given in terms of
Boolean variables describing the occupation numbers at each site of the lattice and at
each time step (i.e. the presence or the absence of a fluid particle). The FHP particles
move in discrete time steps, with a velocity of constant modulus, pointing along one
of the six directions of the lattice. The dynamics is such that no more than one particle
enters the same site at the same time with the same velocity. This restriction (the
exclusion principle) ensures that six boolean variables at each lattice site are always
enough to represent the microdynamics.
Interactions take place among particles entering the same site at the same time and
result in a new local distribution of particle velocities. In order to conserve the number
of particles and the momentum during each interaction, only a few configurations
lead to a non-trivial collision (i.e. a collision in which the directions of motion have
changed). For instance, when exactly two particles enter the same site with opposite
velocities, both of them are deflected by 60 degrees so that the output of the collision
is still a zero momentum configuration with two particles. As shown in figure 2.22,
the deflection can occur to the right or to the left, indifferently. For symmetry reasons,
the two possibilities are chosen randomly, with equal probability.
Another type of collision is considered: when exactly three particles collide with
an angle of 120 degrees between each other, they bounce back (so that the momentum
after collision is zero, as it was before collision). Figure 2.23 illustrates this rule.
Figure 2.22: The two-body collision in the FHP model. On the right part of the figure,
the two possible outcomes of the collision are shown in dark and light gray, respec-
tively. They both occur with probability one-half.
Figure 2.23: The three-body collision in the FHP model.
Several variants of the FHP model exist in the literature [3], including some with rest
particles, like the FHP-II and FHP-III models.
For the simplest case we are considering here, all interactions come from the two
collision processes described above. For all other configurations (i.e. those which are
not obtained by rotations of the situations given in figures 2.22 and 2.23) no collision
occurs and the particles go through as if they were transparent to each other.
Both two- and three-body collisions are necessary to avoid extra conservation laws.
The two-particle collision removes a pair of particles with a zero total momentum and
moves it to another lattice direction. Therefore, it conserves momentum along each
line of the lattice. On the other hand, three-body interactions deflect particles by 180
degrees and cause the net momentum of each lattice line to change. However, three-
body collisions conserve the number of particles within each lattice line.
2.3.2 Microdynamics
The full microdynamics of the FHP model can be expressed by evolution equations for
the occupation numbers: we introduce $n_i(\mathbf{r}, t)$ as the number of particles (which can
be either 0 or 1) entering site $\mathbf{r}$ at time $t$ with a velocity pointing along direction
$\mathbf{c}_i$, where $i = 1, 2, ..., 6$ labels the six lattice directions. The unit vectors
$\mathbf{c}_i$ are shown in figure 2.24.
[Figure 2.24 diagram: the six unit vectors c1 to c6 of the hexagonal lattice]
Figure 2.24: The directions of motion $\mathbf{c}_i$.
We also define the time step as $\tau$ and the lattice spacing as $\lambda$. Thus, the six possible
velocities $\mathbf{v}_i$ of the particles are related to their directions of motion by
$$ \mathbf{v}_i = \frac{\lambda}{\tau}\,\mathbf{c}_i $$
Without interactions between particles, the evolution equations for the $n_i$ would be
given by
$$ n_i(\mathbf{r} + \lambda\mathbf{c}_i, t + \tau) = n_i(\mathbf{r}, t) \tag{2.17} $$
which expresses that a particle entering site $\mathbf{r}$ with velocity along $\mathbf{c}_i$ will continue
in a straight line so that, at the next time step, it will enter site $\mathbf{r} + \lambda\mathbf{c}_i$ with still
the same direction of motion. However, due to collisions, a particle can be removed from
its original direction or another one can be deflected into direction $\mathbf{c}_i$.
For instance, if only $n_i$ and $n_{i+3}$ are 1 at site $\mathbf{r}$, a collision occurs and the particle
traveling with velocity $\mathbf{v}_i$ will then move with either velocity $\mathbf{v}_{i-1}$ or
$\mathbf{v}_{i+1}$ (note that the operations on index $i$ are wrapped onto the values 1, 2, ..., 6).
The quantity
$$ D_i = n_i n_{i+3}(1 - n_{i+1})(1 - n_{i+2})(1 - n_{i+4})(1 - n_{i+5}) \tag{2.18} $$
indicates, when $D_i = 1$, that such a collision will take place. Therefore,
$$ n_i - D_i $$
is the number of particles left in direction $\mathbf{c}_i$ after a two-particle collision along this
direction.
Now, when $n_i = 0$, a new particle can appear in direction $\mathbf{c}_i$, as the result of a
collision between $n_{i+1}$ and $n_{i+4}$ or a collision between $n_{i-1}$ and $n_{i+2}$. It is
convenient to introduce a random Boolean variable $q(\mathbf{r}, t)$ which decides whether the
particles are deflected to the right ($q = 1$) or to the left ($q = 0$) when a two-body collision
takes place. Therefore, the number of particles created in direction $\mathbf{c}_i$ is
$$ q D_{i-1} + (1 - q) D_{i+1} $$
Figure 2.25: Development of a sound wave in a FHP gas, due to an excess particle
concentration in the middle of the system.
Particles can also be created in (or removed from) direction $\mathbf{c}_i$ because of a
three-body collision. The quantity which expresses the occurrence of a three-body collision
with particles $n_i$, $n_{i+2}$ and $n_{i+4}$ is
$$ T_i = n_i n_{i+2} n_{i+4}(1 - n_{i+1})(1 - n_{i+3})(1 - n_{i+5}) \tag{2.19} $$
As before, the result of a three-body collision is to modify the number of particles in
direction $\mathbf{c}_i$ as
$$ n_i - T_i + T_{i+3} $$
Thus, in full generality, the microdynamics of a lattice gas automaton (LGA) is written as
$$ n_i(\mathbf{r} + \lambda\mathbf{c}_i, t + \tau) = n_i(\mathbf{r}, t) + \Omega_i(n(\mathbf{r}, t)) \tag{2.20} $$
where $\Omega_i$ is called the collision term.
For the FHP model, $\Omega_i$ is defined so as to reproduce the collisions, that is
$$ \Omega_i(n) = -D_i + q D_{i-1} + (1 - q) D_{i+1} - T_i + T_{i+3} \tag{2.21} $$
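Using the full expression for $D_i$ and $T_i$, we obtain
$$
\begin{aligned}
\Omega_i(n) = {} & -n_i n_{i+2} n_{i+4}(1 - n_{i+1})(1 - n_{i+3})(1 - n_{i+5}) \\
& + n_{i+1} n_{i+3} n_{i+5}(1 - n_i)(1 - n_{i+2})(1 - n_{i+4}) \\
& - n_i n_{i+3}(1 - n_{i+1})(1 - n_{i+2})(1 - n_{i+4})(1 - n_{i+5}) \\
& + (1 - q)\, n_{i+1} n_{i+4}(1 - n_i)(1 - n_{i+2})(1 - n_{i+3})(1 - n_{i+5}) \\
& + q\, n_{i+2} n_{i+5}(1 - n_i)(1 - n_{i+1})(1 - n_{i+3})(1 - n_{i+4})
\end{aligned}
\tag{2.22}
$$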
These equations are easy to code on a computer and yield a fast and exact implementation
of the model. As an example, figure 2.25 illustrates a sound wave in a FHP
gas at rest. Note that, usually, the so-called FHP-III model [61], which includes a rest
particle and a more complete set of collisions, is preferred when simulating a fluid, due
to its better physical properties.
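As an illustration of how directly equation 2.21 translates into code, the following sketch evaluates the collision term at a single site and checks exhaustively, over all 64 local configurations and both values of q, that particle number and momentum are conserved. It is a minimal example rather than an optimized implementation: the array layout (index 0 to 5 standing for directions c1 to c6, at angles that are multiples of 60 degrees) and the function names are assumptions made here.

```python
import itertools

import numpy as np

# Six unit vectors c_1..c_6 of the hexagonal lattice (array row i-1 holds c_i).
ANGLES = np.arange(6) * np.pi / 3
C = np.stack([np.cos(ANGLES), np.sin(ANGLES)], axis=1)


def collision_term(n, q):
    """Omega_i(n) of equation 2.21 for one site; n holds six 0/1 occupation numbers."""
    n = np.asarray(n)

    def nn(i):                       # occupation number with the index wrapped onto 0..5
        return n[i % 6]

    def D(i):                        # head-on pair along direction i (equation 2.18)
        return (nn(i) * nn(i + 3) * (1 - nn(i + 1)) * (1 - nn(i + 2))
                * (1 - nn(i + 4)) * (1 - nn(i + 5)))

    def T(i):                        # symmetric three-body configuration (equation 2.19)
        return (nn(i) * nn(i + 2) * nn(i + 4) * (1 - nn(i + 1))
                * (1 - nn(i + 3)) * (1 - nn(i + 5)))

    return np.array([-D(i) + q * D(i - 1) + (1 - q) * D(i + 1) - T(i) + T(i + 3)
                     for i in range(6)])


def conserves_mass_and_momentum():
    """Check sum_i Omega_i = 0 and sum_i c_i Omega_i = 0 for every local configuration."""
    for n in itertools.product((0, 1), repeat=6):
        for q in (0, 1):
            omega = collision_term(n, q)
            out = np.array(n) + omega
            if omega.sum() != 0 or not np.allclose(omega @ C, 0.0):
                return False
            if not set(out.tolist()) <= {0, 1}:   # output must still be Boolean
                return False
    return True


print(conserves_mass_and_momentum())
print(collision_term([1, 0, 0, 1, 0, 0], q=1))    # head-on pair along c_1 and c_4
```

A full simulation would apply `collision_term` at every site and then propagate each $n_i$ to the neighboring site along $\mathbf{c}_i$, as in equation 2.20; the propagation step on the hexagonal lattice is left out of this sketch.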
2.3.3 The macroscopic variables
In a lattice gas automaton, the physical quantities of interest are not so much the
Boolean variables $n_i$ but macroscopic quantities or average values, such as, for
instance, the average density of particles and the average velocity field at each point
of the system. These quantities are defined from the ensemble average
$N_i(\mathbf{r}, t) = \langle n_i(\mathbf{r}, t) \rangle$ of the microscopic occupation variables.
$N_i(\mathbf{r}, t)$ is also the probability of having a particle entering site $\mathbf{r}$, at time $t$,
with velocity $\mathbf{v}_i = (\lambda/\tau)\mathbf{c}_i$.
In general, a LGA is characterized by the number $z$ of lattice directions and the
spatial dimensionality $d$. For a square lattice in $d = 2$ dimensions, we have $z = 4$,
whereas, for a hexagonal lattice, $z = 6$. It is also convenient to add a $(z + 1)$th direction,
$i = 0$, corresponding to a population of rest particles for which, obviously, $\mathbf{v}_0 = 0$.
Following the usual definition of statistical mechanics, the local density of particles
is the sum of the average number of particles traveling along each direction $\mathbf{c}_i$
$$ \rho(\mathbf{r}, t) = \sum_{i=0}^{z} N_i(\mathbf{r}, t) \tag{2.23} $$
Similarly, the particle current, which is the density $\rho$ times the velocity field $\mathbf{u}$, is
expressed by
$$ \rho(\mathbf{r}, t)\,\mathbf{u}(\mathbf{r}, t) = \sum_{i=1}^{z} \mathbf{v}_i N_i(\mathbf{r}, t) \tag{2.24} $$
Another quantity which will play an important role in the upcoming derivation is the
momentum tensor $\Pi$ defined as
$$ \Pi_{\alpha\beta} = \sum_{i=1}^{z} v_{i\alpha} v_{i\beta} N_i(\mathbf{r}, t) \tag{2.25} $$
where the Greek indices $\alpha$ and $\beta$ label the $d$ spatial components of the vectors. The
quantity $\Pi_{\alpha\beta}$ represents the flux of the $\alpha$-component of momentum transported along the
$\beta$-axis. This term will contain the pressure contribution and the effects of viscosity.
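The definitions 2.23 to 2.25 are simple sums and can be evaluated in a few lines. The sketch below assumes a hexagonal lattice without rest particles (so all sums run over i = 1, ..., 6) and takes the six mean occupations at one site as input; the function name `macroscopic` and the default values λ = τ = 1 are illustrative assumptions.

```python
import numpy as np


def macroscopic(N, lam=1.0, tau=1.0):
    """Density, velocity field and momentum tensor from mean occupations N_1..N_6.

    Hexagonal lattice without rest particles, so the sums run over i = 1..6.
    """
    angles = np.arange(6) * np.pi / 3
    v = (lam / tau) * np.stack([np.cos(angles), np.sin(angles)], axis=1)   # v_i
    N = np.asarray(N, dtype=float)
    rho = N.sum()                                   # equation 2.23 (no rest particle)
    u = (N @ v) / rho                               # from equation 2.24
    Pi = np.einsum('i,ia,ib->ab', N, v, v)          # equation 2.25
    return rho, u, Pi


rho, u, Pi = macroscopic([0.2] * 6)
print(rho, u, Pi)
```

For a uniform distribution $N_i = \rho/6$ the velocity field vanishes and the momentum tensor is diagonal, $\Pi_{\alpha\beta} = (\rho v^2/2)\,\delta_{\alpha\beta}$ with $v = \lambda/\tau$, because $\sum_i c_{i\alpha} c_{i\beta} = 3\,\delta_{\alpha\beta}$ on the hexagonal lattice.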
2.3.4 Multiscale Chapman-Enskog expansion
It is important to show that the discrete CA world is, at some appropriate scale of
observation, governed by admissible equations: the physical conservation laws and
the symmetry of the space are to be present and the discreteness of the lattice should
not show up. The connection between the microscopic Boolean dynamics and the
macroscopic, continuous world has to be established in order to assess the validity of
the model.
In what follows we restrict the discussion to the case where all velocities $\mathbf{v}_i$ have the
same modulus and there are no rest particles.
The starting point to obtain the macroscopic behavior of the CA fluid is to derive
an equation for the $N_i$’s. Averaging the microdynamics 2.20 yields
$$ N_i(\mathbf{r} + \lambda\mathbf{c}_i, t + \tau) - N_i(\mathbf{r}, t) = \langle \Omega_i \rangle \tag{2.26} $$
where $\Omega_i$ is the collision term of the LGA under study. It is important to notice that
$\Omega_i(n)$ has some generic properties, namely
$$ \sum_{i=1}^{z} \Omega_i = 0 \qquad \sum_{i=1}^{z} \mathbf{v}_i \Omega_i = 0 \tag{2.27} $$
expressing the fact that particle number and momentum are conserved during the collision
process (the incoming sum of mass or momentum equals the outgoing sum).
If more conservation laws exist (e.g. energy), the collision term should reflect them.
It is also expected that no extra quantities are conserved in addition to the physical
ones. This is usually not the case: spurious invariants are found in several lattice models
[14,62,63,64,65] and they may affect the physical behavior.
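Equation 2.26 is still discrete in space and time. The $N_i$’s vary between 0 and 1
and, at a scale $L \gg \lambda$, $T \gg \tau$, one can expect them to be smooth functions of the
space and time coordinates. Therefore, equation 2.26 can be Taylor expanded up to
second order and gives
$$ \tau \partial_t N_i + \lambda(\mathbf{c}_i \cdot \partial_{\mathbf{r}}) N_i + \frac{\tau^2}{2} \partial_t^2 N_i + \frac{\lambda^2}{2} (\mathbf{c}_i \cdot \partial_{\mathbf{r}})^2 N_i + \lambda\tau (\mathbf{c}_i \cdot \partial_{\mathbf{r}}) \partial_t N_i = \langle \Omega_i \rangle \tag{2.28} $$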
The macroscopic limit of a LGA dynamics will require the solution of this equation.
However, in its present form, there is little hope of solving it. Several approximations
will be needed. At some point, it will be necessary to use the so-called Boltzmann
assumption, which states that the occupation numbers $n_i$ and $n_j$ are uncorrelated if
$i \neq j$, and to approximate $\langle \Omega_i(n) \rangle$ by $\Omega_i(N)$ (with all random Boolean
variables replaced by their average values).
Then, we will have to solve a nonlinear equation, which can be handled provided
that we use a perturbation technique. For this purpose, we need a small parameter $\epsilon$.
As we said, we are interested in observing the system at a macroscopic scale $L \gg \lambda$.
Thus, we introduce a new space variable $\mathbf{r}_1$ such that
$$ \mathbf{r}_1 = \epsilon\,\mathbf{r} \qquad \partial_{\mathbf{r}} = \epsilon\,\partial_{\mathbf{r}_1} \tag{2.29} $$
with $\epsilon \ll 1$.
Unfortunately, the equation we obtain by replacing $\partial_{\mathbf{r}}$ in 2.28 with its expression
in terms of $\mathbf{r}_1$ cannot be solved with a naive perturbation method. It is necessary
to introduce several time scales, otherwise divergences will occur. Following the procedure
of the so-called multiscale expansion (see for instance [66]), we introduce the
extra time variables $t_1$ and $t_2$, as well as new functions $N_i^\epsilon$ depending on
$\mathbf{r}_1$, $t_1$ and $t_2$
$$ N_i^\epsilon = N_i^\epsilon(t_1, t_2, \mathbf{r}_1) $$
Suppose now that we formally substitute into equation 2.28
$$ N_i \to N_i^\epsilon \qquad \partial_t \to \epsilon\,\partial_{t_1} + \epsilon^2 \partial_{t_2} \qquad \partial_{\mathbf{r}} \to \epsilon\,\partial_{\mathbf{r}_1} \tag{2.30} $$
together with the corresponding expressions for the second order derivatives. Rigorously,
we have no reason to consider only two time scales. However this will be
enough here and, from a physical standpoint, we may anticipate that $t_1$ will be the
scale of the convective phenomena, while $t_2$ will describe dissipative processes.
After substitution of 2.30, we then obtain new equations for the new functions $N_i^\epsilon$.
The advantage is that, now, these equations can be solved by a perturbation method and
the divergences removed [66]. Thus, we may write
$$ N_i^\epsilon = N_i^{(0)} + \epsilon N_i^{(1)} + \epsilon^2 N_i^{(2)} + \dots \tag{2.31} $$
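In addition we notice that, on the region
$$ t_1 = \epsilon t \qquad t_2 = \epsilon^2 t $$
we precisely have the equality
$$ \partial_t = \epsilon\,\partial_{t_1} + \epsilon^2 \partial_{t_2} \tag{2.32} $$
and our new equations have for solutions
$$ N_i^\epsilon(\epsilon t, \epsilon^2 t, \mathbf{r}_1) = N_i(t, \mathbf{r}) $$
From now on, we shall omit the superscript $\epsilon$ on $N_i$ because we are only interested in
what happens when $t_1 = \epsilon t$ and $t_2 = \epsilon^2 t$.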
2.3.5 Chapman-Enskog procedure
The Chapman-Enskog method is the standard procedure used in statistical mechanics
to solve an equation like 2.28 with a perturbation parameter $\epsilon$. Assuming that
$\langle \Omega_i(n) \rangle$ can be factorized into $\Omega_i(N)$, we write the contributions of each order in
$\epsilon$. According to the multiscale expansion 2.31, the right-hand side of 2.28 reads
$$ \Omega_i(N) = \Omega_i(N^{(0)}) + \epsilon \sum_{j=1}^{z} \frac{\partial \Omega_i(N^{(0)})}{\partial N_j} N_j^{(1)} + O(\epsilon^2) $$
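Using expressions 2.31, 2.32 and 2.29 for $N_i$, $\partial_t$ and $\partial_{\mathbf{r}}$ in the left-hand side
of 2.28 yields the following conditions for the first two orders in $\epsilon$
$$ O(\epsilon^0): \quad \Omega_i(N^{(0)}) = 0 \tag{2.33} $$
and
$$ O(\epsilon^1): \quad \partial_{t_1} N_i^{(0)} + \partial_{1\alpha} v_{i\alpha} N_i^{(0)} = \frac{1}{\tau} \sum_{j=1}^{z} \frac{\partial \Omega_i(N^{(0)})}{\partial N_j} N_j^{(1)} \tag{2.34} $$
where the subscript 1 in spatial derivatives (e.g. $\partial_{1\alpha}$) indicates a differential operator
expressed in the variable $\mathbf{r}_1$, and summation over the repeated Greek index $\alpha$ is implied.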
The first equation determines the $N_i^{(0)}$’s. Once they are known, they can be substituted
into the second equation in order to obtain a solution for the $N_i^{(1)}$. Unfortunately,
this procedure is not as simple as it first looks, because the matrix $(\partial\Omega/\partial N)$ (whose
elements are $\partial\Omega_i/\partial N_j$) is not invertible, due to the conservation laws 2.27. Indeed
$$ \sum_i \frac{\partial \Omega_i}{\partial N_j} = \frac{\partial}{\partial N_j} \sum_i \Omega_i = 0 $$
and, similarly
$$ \sum_i v_{i\alpha} \frac{\partial \Omega_i}{\partial N_j} = 0 $$
Thus, the rows of the matrix $(\partial\Omega/\partial N)$ are linearly dependent and
the determinant is zero. The above two equations can also be written as
$$ \left(\frac{\partial \Omega}{\partial N}\right)^T E_0 = 0 \qquad \left(\frac{\partial \Omega}{\partial N}\right)^T E_\alpha = 0 \qquad 1 \le \alpha \le d \tag{2.35} $$
where the quantities $E_0$ and $E_\alpha$ are called the collisional invariants and are vectors of
$\mathbb{R}^z$ defined as
$$ E_0 = (1, ..., 1) \qquad E_\alpha = (v_{1\alpha}, ..., v_{z\alpha}) \tag{2.36} $$
The reason the $E_\alpha$ are called collisional invariants is that they describe the
conserved quantities of the dynamics, namely
$$ N \cdot E_0 = \rho \qquad N \cdot E_\alpha = \rho u_\alpha \tag{2.37} $$
where $\cdot$ denotes the scalar product in $\mathbb{R}^z$.
In order for equation 2.34 to have a solution, it is necessary that
$\partial_{t_1} N_i^{(0)} + \partial_{1\alpha} v_{i\alpha} N_i^{(0)}$ be in the image space of $(\partial\Omega/\partial N)$. It is well known from linear algebra
that the image of a matrix is orthogonal (in the sense of the scalar product) to the
kernel of its transpose
$$\mathrm{Im}\left(\frac{\partial\Omega}{\partial N}\right) = \left[\mathrm{Ker}\left(\frac{\partial\Omega}{\partial N}\right)^{T}\right]^{\perp} \tag{2.38}$$
Therefore, the solubility condition of equation 2.34 requires that
$\partial_{t_1} N_i^{(0)} + \partial_{1\alpha} v_{i\alpha} N_i^{(0)}$
be orthogonal to $E_0$, $E_1$ and $E_2$. We shall see in the next section that this condition is
satisfied (equations 2.41 and 2.42).
Finally, note that when a solution to equation 2.34 exists, it is not unique (again,
due to the fact that $(\partial\Omega/\partial N)$ is not invertible). For this reason, we also impose the
extra conditions that the macroscopic quantities $\rho$ and $\rho u$ are entirely given by the zero
order of expansion 2.31
$$\rho = \sum_{i=1}^{z} N_i^{(0)}, \qquad \rho u = \sum_{i=1}^{z} v_i N_i^{(0)} \tag{2.39}$$
and, therefore
$$\sum_{i=1}^{z} N_i^{(\ell)} = 0, \qquad \sum_{i=1}^{z} v_i N_i^{(\ell)} = 0, \qquad \text{for } \ell \ge 1 \tag{2.40}$$
In other words, this amounts to asking that the solution $N^{(1)}$ is also orthogonal to the
collisional invariants and belongs to $\mathrm{Im}(\partial\Omega/\partial N)$.
2.3.6
Balance equations
Before we solve equations 2.33 and 2.34, remember that we are interested in the behavior
of the macroscopic quantities $\rho$ and $\rho u$. Conservation laws 2.27 imply some
important balance equations for these variables.
Summing equation 2.28 over i yields zero for the right-hand side. The same is true
if we first multiply 2.28 by $v_i$ before summing. If, again, we express 2.28 in terms
of $r_1$, $t_1$, $t_2$ and $N_i^{(\ell)}$, we obtain (using equations 2.39) the following results at order $\epsilon$
$$O(\epsilon): \quad \partial_{t_1}\rho + \mathrm{div}_1\,\rho u = 0 \tag{2.41}$$
and
$$O(\epsilon): \quad \partial_{t_1}\rho u_\alpha + \partial_{1\beta}\Pi^{(0)}_{\alpha\beta} = 0 \tag{2.42}$$
The quantity $\Pi^{(0)}_{\alpha\beta} = \sum_i v_{i\alpha} v_{i\beta} N_i^{(0)}$ is the zero order approximation of the momentum
tensor defined in 2.25. One recognizes in equation 2.41 the usual continuity equation,
while equation 2.42 expresses momentum conservation and corresponds to the Euler
equation of hydrodynamics, in which dissipative effects are absent.
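The same calculation can be repeated for the order $O(\epsilon^2)$. Remembering relations 2.40,
we find
$$\partial_{t_2}\rho + \frac{\tau}{2}\,\partial_{t_1}^2\rho + \frac{\tau}{2}\,\partial_{1\alpha}\partial_{1\beta}\Pi^{(0)}_{\alpha\beta} + \tau\,\partial_{t_1}\partial_{1\alpha}\rho u_\alpha = 0 \tag{2.43}$$
and
$$\partial_{t_2}\rho u_\alpha + \partial_{1\beta}\Pi^{(1)}_{\alpha\beta} + \frac{\tau}{2}\,\partial_{t_1}^2\rho u_\alpha + \frac{\tau}{2}\,\partial_{1\beta}\partial_{1\gamma}S^{(0)}_{\alpha\beta\gamma} + \tau\,\partial_{t_1}\partial_{1\beta}\Pi^{(0)}_{\alpha\beta} = 0 \tag{2.44}$$
where S is the third-order tensor
$$S_{\alpha\beta\gamma} = \sum_{i=1}^{z} v_{i\alpha} v_{i\beta} v_{i\gamma} N_i \tag{2.45}$$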
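These last two equations can be simplified using relations 2.41 and 2.42. Let us first
consider the case of equation 2.43. One has
$$\frac{\tau}{2}\,\partial_{t_1}^2\rho = -\frac{\tau}{2}\,\partial_{t_1}\partial_{1\alpha}\rho u_\alpha \tag{2.46}$$
and, therefore
$$\frac{\tau}{2}\,\partial_{t_1}^2\rho + \frac{\tau}{2}\,\partial_{1\alpha}\partial_{1\beta}\Pi^{(0)}_{\alpha\beta} + \tau\,\partial_{t_1}\partial_{1\alpha}\rho u_\alpha = \frac{\tau}{2}\,\partial_{1\alpha}\left(\partial_{t_1}\rho u_\alpha + \partial_{1\beta}\Pi^{(0)}_{\alpha\beta}\right) = 0 \tag{2.47}$$
Thus, equation 2.43 reduces to
$$\partial_{t_2}\rho = 0 \tag{2.48}$$
Similarly, since
$$\frac{\tau}{2}\,\partial_{t_1}^2\rho u_\alpha = -\frac{\tau}{2}\,\partial_{t_1}\partial_{1\beta}\Pi^{(0)}_{\alpha\beta}$$
equation 2.44 becomes
$$\partial_{t_2}\rho u_\alpha + \partial_{1\beta}\left[\Pi^{(1)}_{\alpha\beta} + \frac{\tau}{2}\left(\partial_{t_1}\Pi^{(0)}_{\alpha\beta} + \partial_{1\gamma}S^{(0)}_{\alpha\beta\gamma}\right)\right] = 0 \tag{2.49}$$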
This last equation contains the dissipative contributions to the Euler equation 2.42.
The first contribution is $\Pi^{(1)}_{\alpha\beta}$, which is the dissipative part of the momentum tensor.
The second part, namely $\frac{\tau}{2}\left(\partial_{t_1}\Pi^{(0)}_{\alpha\beta} + \partial_{1\gamma}S^{(0)}_{\alpha\beta\gamma}\right)$, comes from the second order terms
of the Taylor expansion of the discrete Boltzmann equation. These terms account for
the discreteness of the lattice and have no counterpart in standard hydrodynamics. As
we shall see, they will lead to the so-called lattice viscosity.
The orders $\epsilon$ and $\epsilon^2$ can be grouped together to give the general equations governing
our system. Summing equations 2.49 and 2.42 with the appropriate power of $\epsilon$ as factor
gives
$$\partial_t\rho u_\alpha + \frac{\partial}{\partial r_\beta}\left[\Pi_{\alpha\beta} + \frac{\tau}{2}\left(\epsilon\,\partial_{t_1}\Pi^{(0)}_{\alpha\beta} + \frac{\partial}{\partial r_\gamma}S^{(0)}_{\alpha\beta\gamma}\right)\right] = 0 \tag{2.50}$$
where $\Pi_{\alpha\beta} = \Pi^{(0)}_{\alpha\beta} + \epsilon\,\Pi^{(1)}_{\alpha\beta}$ and we have used that $\partial_t = \epsilon\,\partial_{t_1} + \epsilon^2\,\partial_{t_2}$ and $\partial_\alpha = \epsilon\,\partial_{1\alpha}$.
Similarly, equations 2.41 and 2.48 yield
$$\partial_t\rho + \mathrm{div}\,\rho u = 0 \tag{2.51}$$
which is the standard continuity equation. Equation 2.50 corresponds to the Navier-Stokes
equation. With the present form, it is not very useful because the tensors $\Pi$ and
S are not given in terms of the quantities $\rho$ and u. To go further, we will have to solve
equations 2.33 and 2.34 to find an expression for $N_i^{(0)}$ and $N_i^{(1)}$ as a function of $\rho$ and
u. However, for the time being, it is important to remember that the derivation of the
continuity equation 2.51 and the Navier-Stokes equation 2.50 is solely based on very
general considerations, namely that $\sum_i \Omega_i = \sum_i v_i\Omega_i = 0$. The specific collision rules
of the LGA under study (FHP for instance) do not affect the structure of these balance
equations. However, the details of the collision rule will play a role for the explicit
expression of $\Pi$ and S.
2.3.7
Local equilibrium
We now turn to the problem of solving equation 2.33, together with conditions 2.39, in
order to find $N_i^{(0)}$ as a function of $\rho$ and $\rho u$.
The solutions $N_i^{(0)}$ which make the collision term $\Omega$ vanish are known as the local
equilibrium solutions. Physically, they correspond to a situation where the rate of
each type of collision equilibrates. Since the collision time $\tau$ is much smaller than the
macroscopic observation time, it is reasonable to expect, in first approximation, that
an equilibrium is reached locally.
Provided that the collision behaves reasonably, it is found [61] that the generic
solution is
$$N_i^{(0)} = \frac{1}{1 + \exp(-A - B\cdot v_i)} \tag{2.52}$$
This expression has the form of a Fermi-Dirac distribution. This is a consequence of
the exclusion principle we have imposed in the cellular automata rule (no more than
one particle per site and direction). This form is explicitly obtained for the FHP model
by assuming that the rates of direct and inverse collisions are equal, namely
$$T_i(N^{(0)}) = T_{i+3}(N^{(0)})$$
and
$$\tfrac{1}{2}D_i(N^{(0)}) = \tfrac{1}{2}D_{i+1}(N^{(0)}), \qquad \tfrac{1}{2}D_i(N^{(0)}) = \tfrac{1}{2}D_{i-1}(N^{(0)})$$
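The quantities A and B in 2.52 are functions of the density $\rho$ and the velocity field
u and are to be determined according to equations 2.39. In order to carry out this
calculation, $N_i^{(0)}$ is Taylor expanded up to second order in the velocity field u (i.e.
second order in the Mach number). One obtains (see [14] for full details in the case of
the FHP model)
$$N_i^{(0)} = a\rho + \frac{b\rho}{v^2}\, v_i\cdot u + \frac{\rho G(\rho)}{v^4}\, Q_{i\alpha\beta}\, u_\alpha u_\beta \tag{2.53}$$
where $v = \lambda/\tau$ and
$$Q_{i\alpha\beta} = v_{i\alpha}v_{i\beta} - \frac{v^2}{d}\,\delta_{\alpha\beta} \tag{2.54}$$
The coefficients entering this expression can be determined from 2.39. First we
assume that the lattice velocities have the following important properties
$$\sum_{i=1}^{z} v_i = 0 \tag{2.55}$$
$$\sum_{i=1}^{z} v_{i\alpha}v_{i\beta} = v^2 C_2\,\delta_{\alpha\beta} \tag{2.56}$$
$$\sum_{i=1}^{z} v_{i\alpha}v_{i\beta}v_{i\gamma} = 0 \tag{2.57}$$
$$\sum_{i=1}^{z} v_{i\alpha}v_{i\beta}v_{i\gamma}v_{i\delta} = v^4 C_4\left(\delta_{\alpha\beta}\delta_{\gamma\delta} + \delta_{\alpha\gamma}\delta_{\beta\delta} + \delta_{\alpha\delta}\delta_{\beta\gamma}\right) \tag{2.58}$$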
These conditions express the isotropy of tensors up to fourth order on the lattice. These
properties are necessary in order for the CA fluid flow to be isotropic (i.e. independent
of a specific lattice orientation), up to order $u^2$. They hold for the hexagonal lattice
with $C_2 = 3$ and $C_4 = 3/4$ (see [14,67]), but 2.58 does not hold for a 2D square lattice, and
that is the reason why the FHP model is defined on a hexagonal lattice.
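From 2.56, one has $\sum_i v_{i\alpha}v_{i\alpha} = v^2 C_2\,\delta_{\alpha\alpha} = v^2 d\,C_2$. On the other hand, if all $v_i$
have the same modulus v, a direct calculation gives $\sum_i v_{i\alpha}v_{i\alpha} = z v^2$. Thus
$$C_2 = \frac{z}{d}$$
Similarly, using 2.58,
$$\sum_{i=1}^{z} v_{i\alpha}v_{i\beta}v_{i\gamma}v_{i\gamma} = v^4 C_4\left(d\,\delta_{\alpha\beta} + \delta_{\alpha\beta} + \delta_{\alpha\beta}\right) = v^4 (d+2)\,C_4\,\delta_{\alpha\beta}$$
Again, if all $v_i$ are of the same length, $\sum_{i=1}^{z} v_{i\alpha}v_{i\beta}v_{i\gamma}v_{i\gamma} = v^2\sum_{i=1}^{z} v_{i\alpha}v_{i\beta}$ and, from 2.56,
it is equal to $v^4 C_2\,\delta_{\alpha\beta}$. Therefore,
$$C_4 = \frac{C_2}{d+2} = \frac{z}{d(d+2)}$$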
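Using the above properties, it is easy to see that $\sum_i Q_{i\alpha\beta} = \sum_i Q_{i\alpha\beta}\, v_{i\gamma} = 0$. Thus
the determination of the values of a and b is straightforward from 2.39
$$\rho = \sum_{i=1}^{z} N_i^{(0)} = a z \rho, \qquad \rho u_\alpha = \sum_{i=1}^{z} v_{i\alpha} N_i^{(0)} = b\,C_2\,\rho u_\alpha$$
Hence,
$$a = \frac{1}{z}, \qquad b = \frac{1}{C_2} = \frac{d}{z}$$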
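The lattice sums 2.56 and 2.58 and the values of $C_2$ and $C_4$ can be checked numerically. The following Python sketch is an illustration only: the helper names are invented here, the velocities are taken with unit modulus (v = 1), and a planar lattice with z equally spaced directions is assumed. It tests the hexagonal case and shows that the four-direction square lattice fails the fourth-order condition.

```python
import math

def lattice(z):
    """Unit velocity vectors of a planar lattice with z equally spaced directions."""
    return [(math.cos(2 * math.pi * i / z), math.sin(2 * math.pi * i / z))
            for i in range(z)]

def moment(vs, idx):
    """Sum over i of the product of the velocity components listed in idx."""
    return sum(math.prod(v[a] for a in idx) for v in vs)

def delta(a, b):
    return 1.0 if a == b else 0.0

def fourth_order_isotropic(vs, d=2, tol=1e-12):
    """Check eq. 2.58 with v = 1 and C4 = z / (d (d + 2))."""
    c4 = len(vs) / (d * (d + 2))
    for a in range(d):
        for b in range(d):
            for g in range(d):
                for e in range(d):
                    iso = c4 * (delta(a, b) * delta(g, e)
                                + delta(a, g) * delta(b, e)
                                + delta(a, e) * delta(b, g))
                    if abs(moment(vs, (a, b, g, e)) - iso) > tol:
                        return False
    return True

hexagonal, square = lattice(6), lattice(4)
c2_hex = moment(hexagonal, (0, 0))   # second moment, expected z/d = 3
print(c2_hex, fourth_order_isotropic(hexagonal), fourth_order_isotropic(square))
```

With z = 6 and d = 2 the same numbers give a = 1/6 and b = 1/3.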
The function G is obtained from the fact that $N_i^{(0)}$ is the Taylor expansion of a
Fermi-Dirac distribution. For FHP, it is found [14,61]
$$G(\rho) = \frac{2}{3}\,\frac{3-\rho}{6-\rho}$$
The fact that $G(\rho)$ is not equal to 1 and depends on $\rho$ expresses the lack of Galilean
invariance of the CA fluid. Note that adding several rest particles to the model is a way
to gradually restore this invariance.
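We may now compute the local equilibrium part of the momentum tensor, $\Pi^{(0)}_{\alpha\beta}$.
This calculation requires multiplying equation 2.53 by $v_{i\alpha}v_{i\beta}$ and summing over i.
$$\begin{aligned}
\Pi^{(0)}_{\alpha\beta} &= \sum_i N_i^{(0)} v_{i\alpha}v_{i\beta} \\
&= \rho\,a\,C_2 v^2\,\delta_{\alpha\beta} - \frac{C_2\rho}{d}\,G(\rho)\,u^2\,\delta_{\alpha\beta} + C_4\,\rho\,G(\rho)\left(u^2\delta_{\alpha\beta} + 2u_\alpha u_\beta\right) \\
&= \left[a\,C_2 v^2\rho - \left(\frac{C_2}{d} - C_4\right)\rho\,G(\rho)\,u^2\right]\delta_{\alpha\beta} + 2C_4\,\rho\,G(\rho)\,u_\alpha u_\beta
\end{aligned} \tag{2.59}$$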
The quantity
$$p = a\,C_2 v^2\rho - \left(\frac{C_2}{d} - C_4\right)\rho\,G(\rho)\,u^2 \tag{2.60}$$
is called the pressure term and $2C_4 G(\rho)\rho u_\alpha u_\beta$ the convective part of the momentum
tensor. Thus the microdynamics gives an explicit expression for the pressure. The term
$a\,C_2 v^2\rho$ corresponds to a perfect gas contribution, at fixed temperature. It is usually
written as
$$p = \rho\,c_s^2 \tag{2.61}$$
where $c_s$ is the speed of sound. From this relation, we may identify
$$c_s^2 = a\,C_2 v^2 = \frac{v^2}{d}$$
The other term, containing a $u^2$ dependence, is not physical and implies a spurious behavior.
This contribution can be suppressed in LB models (see section 2.4).
Note that in the FHP model, the temperature is not defined and the balance equation
for the kinetic energy is identical to the mass conservation equation, since all particles
have the same velocities. Temperature has been introduced in multispeed lattice gas
models, through the equipartition theorem[68,69,70].
2.3.8
Correction to local equilibrium
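The next step is to compute the terms involved in the Navier-Stokes equation 2.50
$$\partial_t\rho u_\alpha + \frac{\partial}{\partial r_\beta}\left[\Pi^{(0)}_{\alpha\beta} + \epsilon\,\Pi^{(1)}_{\alpha\beta} + \frac{\tau}{2}\left(\epsilon\,\partial_{t_1}\Pi^{(0)}_{\alpha\beta} + \frac{\partial}{\partial r_\gamma}S^{(0)}_{\alpha\beta\gamma}\right)\right] = 0 \tag{2.62}$$
We shall restrict ourselves to first order in the velocity field u.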
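The lattice viscosity: From equation 2.59 we have
$$\Pi^{(0)}_{\alpha\beta} = c_s^2\,\rho\,\delta_{\alpha\beta} + O(u^2)$$
Since $\partial_{t_1}\rho = -\mathrm{div}_1\,\rho u$ (from equation 2.41), one has $\epsilon\,\partial_{t_1}\rho = -\mathrm{div}\,\rho u$ and
$$\frac{\tau}{2}\frac{\partial}{\partial r_\beta}\,\epsilon\,\partial_{t_1}\Pi^{(0)}_{\alpha\beta} = \frac{\tau c_s^2}{2}\frac{\partial}{\partial r_\beta}\,\epsilon\,\partial_{t_1}\rho\,\delta_{\alpha\beta} = -\frac{\tau c_s^2}{2}\frac{\partial}{\partial r_\alpha}\,\mathrm{div}\,\rho u \tag{2.63}$$
To compute the term involving
$$S^{(0)}_{\alpha\beta\gamma} = \sum_{i=1}^{z} v_{i\alpha}v_{i\beta}v_{i\gamma}N_i^{(0)},$$
we first notice that the only contribution to $N_i^{(0)}$ given by equation 2.53 will be
$N_i^{(0)} = [\rho b/v^2]\,v_i\cdot u$, because the other terms lead to sums over an odd number of $v_i$. Thus, using 2.58,
$$S^{(0)}_{\alpha\beta\gamma} = v^2 C_4\,b\,\rho\left(\delta_{\alpha\beta}u_\gamma + \delta_{\alpha\gamma}u_\beta + \delta_{\beta\gamma}u_\alpha\right)$$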
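and
$$\frac{\tau}{2}\frac{\partial^2}{\partial r_\beta\,\partial r_\gamma}S^{(0)}_{\alpha\beta\gamma} = \frac{\tau v^2}{2}\,C_4 b\,\nabla^2\rho u_\alpha + \tau v^2 C_4 b\,\frac{\partial}{\partial r_\alpha}\,\mathrm{div}\,\rho u \tag{2.64}$$
Substituting the results 2.63 and 2.64 into the Navier-Stokes equation 2.62 yields
$$\partial_t\rho u_\alpha + \frac{\partial}{\partial r_\beta}\Pi^{(0)}_{\alpha\beta} = -\epsilon\,\frac{\partial}{\partial r_\beta}\Pi^{(1)}_{\alpha\beta} - \frac{\tau v^2}{2}\,C_4 b\,\nabla^2\rho u_\alpha - \tau\left(v^2 C_4 b - \frac{c_s^2}{2}\right)\frac{\partial}{\partial r_\alpha}\,\mathrm{div}\,\rho u \tag{2.65}$$
The last term vanishes since $v^2 C_4 b - c_s^2/2 = 0$ and the other term has the form of a
viscous effect $\nu_{\mathrm{lattice}}\nabla^2\rho u$, where
$$\nu_{\mathrm{lattice}} = -C_4 b\,\frac{\tau v^2}{2} = -\frac{\tau v^2}{2(d+2)} \tag{2.66}$$
is a negative viscosity. The origin of this contribution is the discreteness
of the lattice ($S^{(0)}_{\alpha\beta\gamma}$ and $\partial_{t_1}\Pi^{(0)}_{\alpha\beta}$ come from the Taylor expansion). For this reason, this
term is referred to as a lattice contribution to the viscosity. The fact that it is negative
is of no consequence because the last contribution $-\epsilon\,\partial_\beta\Pi^{(1)}_{\alpha\beta}$, which we still have to
calculate, will be positive and larger than the present one.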
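As a quick consistency check of the statement that the last term of 2.65 vanishes, one can insert the values $b = d/z$, $C_4 = z/(d(d+2))$ and $c_s^2 = v^2/d$ obtained above. This is only a worked substitution, under the same assumption that all velocities have the same modulus v:
$$v^2 C_4 b - \frac{c_s^2}{2} = \frac{v^2}{d+2} - \frac{v^2}{2d} = \frac{(d-2)\,v^2}{2d\,(d+2)},$$
which is zero for d = 2, the case of the FHP model, where 2.66 then gives $\nu_{\mathrm{lattice}} = -\tau v^2/8$.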
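The collisional viscosity: The usual contribution to viscosity is due to the collisions
between the fluid particles. This contribution is captured by the term $\epsilon\,\partial_\beta\Pi^{(1)}_{\alpha\beta}$ in equation 2.65.
In order to compute it, we first have to solve equation 2.34 for $N_i^{(1)}$. To
lowest order in the velocity flow u, we have
$$\begin{aligned}
\frac{1}{\tau}\sum_{j=1}^{z}\frac{\partial\Omega_i(N^{(0)})}{\partial N_j}\,N_j^{(1)} &= \partial_{t_1}N_i^{(0)} + \partial_{1\alpha}v_{i\alpha}N_i^{(0)} \\
&= -\frac{\partial N_i^{(0)}}{\partial\rho}\,\mathrm{div}_1\,\rho u - \frac{\partial N_i^{(0)}}{\partial\rho u_\alpha}\,\partial_{1\beta}\Pi^{(0)}_{\alpha\beta} + \partial_{1\alpha}v_{i\alpha}N_i^{(0)}
\end{aligned} \tag{2.67}$$
where we have expressed the time derivative of $N_i^{(0)}$ in terms of the derivatives with
respect to $\rho$ and $\rho u_\alpha$
$$\partial_{t_1}N_i^{(0)} = \frac{\partial N_i^{(0)}}{\partial\rho}\,\partial_{t_1}\rho + \frac{\partial N_i^{(0)}}{\partial\rho u_\alpha}\,\partial_{t_1}\rho u_\alpha$$
and used equations 2.41 and 2.42 to express $\partial_{t_1}\rho$ and $\partial_{t_1}\rho u_\alpha$. These substitutions
ensure that the right-hand side of equation 2.67 is in the image of $(\partial\Omega/\partial N)$.
As we did for the lattice viscosity, we shall only consider the first order in the
velocity flow u. The omitted terms are expected to be of the order $O(u^3)$. From the
expressions 2.53 and 2.59, we have for the lowest order in u
$$N_i^{(0)} = a\rho + \frac{b}{v^2}\,v_{i\alpha}\,\rho u_\alpha \qquad\text{and}\qquad \Pi^{(0)}_{\alpha\beta} = c_s^2\,\rho\,\delta_{\alpha\beta}$$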
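Thus
$$\frac{\partial N_i^{(0)}}{\partial\rho} = a \qquad\text{and}\qquad \frac{\partial N_i^{(0)}}{\partial\rho u_\alpha} = \frac{b\,v_{i\alpha}}{v^2}$$
and we can rewrite 2.67 as
$$\begin{aligned}
\frac{1}{\tau}\sum_{j=1}^{z}\frac{\partial\Omega_i(N^{(0)})}{\partial N_j}\,N_j^{(1)} &= -a\,\mathrm{div}_1\,\rho u - \frac{b\,v_{i\alpha}}{v^2}\,\partial_{1\beta}\Pi^{(0)}_{\alpha\beta} + \partial_{1\alpha}v_{i\alpha}N_i^{(0)} \\
&= \frac{b}{v^2}\left(v_{i\alpha}v_{i\beta} - \frac{v^2}{d}\,\delta_{\alpha\beta}\right)\partial_{1\beta}\,\rho u_\alpha \\
&= \frac{b}{v^2}\,Q_{i\alpha\beta}\,\partial_{1\beta}\,\rho u_\alpha
\end{aligned} \tag{2.68}$$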
From this result, it is now clear that equation 2.67 will have a solution since, as noticed
previously, the z-dimensional vectors Qαβ of component Qiαβ are orthogonal to the
collisional invariants E0 and Eγ. Since E0 and Eα are in the kernel of (∂Ω/∂N )T ,
then Qαβ is in the image space of (∂Ω/∂N ) (see equation 2.38).
We now consider the left-hand side of equation 2.67, for u = 0 (remember that we
want to obtain the first contribution to $N_i^{(1)}$).
An interesting observation is that, in general, the vectors $Q_{\alpha\beta}$ are eigenvectors of
the matrix $(\partial\Omega/\partial N)$. Thus we write
$$\left.\frac{\partial\Omega(N^{(0)})}{\partial N}\right|_{u=0} Q_{\alpha\beta} = -\Lambda\,Q_{\alpha\beta}$$
where $-\Lambda$ is the associated eigenvalue (for FHP, it is found that $\Lambda = 3s(1-s)^3$,
with $s = \rho/6$). This yields immediately the solution for $N^{(1)}$ as a multiple of $Q_{\alpha\beta}$.
Since $Q_{\alpha\beta}$ is orthogonal to the collisional invariants, $N^{(1)}$ will clearly satisfy the extra
conditions 2.40. Thus we have
$$N_i^{(1)} = -\frac{\tau b}{\Lambda v^2}\,Q_{i\alpha\beta}\,\partial_{1\beta}\,\rho u_\alpha \tag{2.69}$$
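We may now compute the correction $\epsilon\,\Pi^{(1)}$ to the momentum tensor. Since $\epsilon\,\partial_{1\beta} = \partial_\beta$,
we get
$$\begin{aligned}
\epsilon\,\Pi^{(1)}_{\alpha\beta} &= \epsilon\sum_i N_i^{(1)} v_{i\alpha}v_{i\beta} \\
&= \frac{\tau v^2 b}{\Lambda}\left[\frac{C_2}{d}\,\mathrm{div}\,\rho u\,\delta_{\alpha\beta} - C_4\left(\delta_{\alpha\beta}\delta_{\gamma\delta} + \delta_{\alpha\gamma}\delta_{\beta\delta} + \delta_{\alpha\delta}\delta_{\beta\gamma}\right)\partial_\gamma\,\rho u_\delta\right] \\
&= \tau v^2\,\frac{b}{\Lambda}\left[\left(\frac{C_2}{d} - C_4\right)\mathrm{div}\,\rho u\,\delta_{\alpha\beta} - C_4\left(\partial_\alpha\,\rho u_\beta + \partial_\beta\,\rho u_\alpha\right)\right]
\end{aligned} \tag{2.70}$$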
2.3.9
The Navier-Stokes equation
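We can now rewrite (to first order in $\epsilon$ and second order in the velocity flow u) the
Navier-Stokes equation 2.65. Using expression 2.59 for $\Pi^{(0)}_{\alpha\beta}$, we get
$$\begin{aligned}
\partial_t\rho u_\alpha + \partial_\beta\left(2\rho\,C_4 G(\rho)\,u_\alpha u_\beta\right) = {} & -\partial_\alpha p - \frac{\tau v^2}{2}\,C_4 b\,\nabla^2\rho u_\alpha \\
& - \partial_\beta\left[\tau v^2\,\frac{b}{\Lambda}\left(\left(\frac{C_2}{d} - C_4\right)\delta_{\alpha\beta}\,\mathrm{div}\,\rho u - C_4\left(\partial_\alpha\,\rho u_\beta + \partial_\beta\,\rho u_\alpha\right)\right)\right]
\end{aligned} \tag{2.71}$$
where the pressure p is given by relation 2.60.
In the limit of low Mach number, the density can be assumed to be a constant,
except in the pressure term [71]. From the continuity equation 2.51, we then get
$\mathrm{div}\,\rho u = 0$ and
$$\frac{1}{\rho}\,\partial_\beta\,\epsilon\,\Pi^{(1)}_{\alpha\beta} = -\frac{1}{\rho}\,\tau v^2\,\frac{b\,C_4}{\Lambda}\left(\partial_\alpha\partial_\beta\,\rho u_\beta + \rho\,\partial_\beta^2 u_\alpha\right) = -\nu_{\mathrm{coll}}\,\nabla^2 u_\alpha \tag{2.72}$$
with
$$\nu_{\mathrm{coll}} = \tau v^2\,\frac{b\,C_4}{\Lambda}$$
Within this approximation equation 2.71 can be cast into
$$\partial_t u + 2C_4 G(\rho)\,(u\cdot\nabla)u = -\frac{1}{\rho}\nabla p + \nu\,\nabla^2 u \tag{2.73}$$
The quantity $\nu$ is the kinematic viscosity of our discrete fluid, whose expression is
composed of the lattice and collisional viscosities
$$\nu = \tau v^2\,b\,C_4\left(\frac{1}{\Lambda} - \frac{1}{2}\right) = \frac{\tau v^2}{d+2}\left(\frac{1}{\Lambda} - \frac{1}{2}\right) \tag{2.74}$$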
The presence of the coefficient $C_4$ in the viscosity indicates that our results rely on
the isotropy of the fourth order tensor $\sum_i v_{i\alpha}v_{i\beta}v_{i\gamma}v_{i\delta}$. Thus, for a 2D square lattice
(e.g. the HPP model), a viscosity cannot be defined, even to first order in u.
For the FHP model, the viscosity depends strongly on the density ($\Lambda = (\rho/2)[1 - (\rho/6)]^3$)
and may become arbitrarily large for the limiting values $\rho = 0$ and $\rho = 6$. Its
minimal value is obtained for $\rho = 3/2$.
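The density dependence of the viscosity can be tabulated directly from 2.74 and the FHP expression for $\Lambda$. The short Python sketch below is an illustration, not part of the original derivation: the function names are hypothetical and lattice units $\tau = v = 1$ are assumed. It scans the density and locates the minimum of $\nu$.

```python
def fhp_lambda(rho):
    """Lambda = (rho/2) * (1 - rho/6)**3, as quoted in the text for FHP."""
    return 0.5 * rho * (1.0 - rho / 6.0) ** 3

def fhp_viscosity(rho, tau=1.0, v=1.0, d=2):
    """Kinematic viscosity of eq. 2.74: tau v^2 / (d + 2) * (1/Lambda - 1/2)."""
    return tau * v * v / (d + 2) * (1.0 / fhp_lambda(rho) - 0.5)

# Scan densities strictly between 0 and 6, where nu is finite.
rhos = [0.01 * k for k in range(1, 600)]
rho_min = min(rhos, key=fhp_viscosity)
nu_min = fhp_viscosity(rho_min)
print(rho_min, nu_min)
```

At $\rho = 3/2$ one has $\Lambda = 81/256$, so the minimal viscosity in these units is $(256/81 - 1/2)/4$, about 0.67.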
Whereas the form of equation 2.73 depends little on the type of collision the particles
experience, the expression 2.74 is very sensitive to the collision processes, through
the value of $\Lambda$. In a lattice gas dynamics, the viscosity is intrinsic to the model and is
not an adjustable parameter. In order to change the viscosity, collision rules should be
modified. This is why the FHP model has been extended to obtain the FHP-III model
with a lower intrinsic viscosity.
Up to the factor of $2C_4 G(\rho)$, equation 2.73 is the standard Navier-Stokes equation.
The fact that the coefficient of the convective term $(u\cdot\nabla)u$ is different from 1 is an
indication of the non Galilean invariance of the model. However, if we assume that
$\rho$ is nearly constant ($\rho \simeq$ const), this factor can be absorbed in a renormalization of the time and the lattice
dynamics is described by the usual hydrodynamic equation.
In section 2.4 we shall see that Galilean invariance can be restored in a more general
way when using a lattice Boltzmann dynamics. Also, viscosity will be an adjustable
parameter.
2.3.10
A two-phase CA fluid
The ability of a cellular automata fluid, like FHP, to model a real fluid depends very
much on the application one considers. It is not appropriate to simulate high Reynolds
number flows (because the viscosity is too high), but it can be very useful to describe
situations with complicated boundary conditions (porous media) and multi-phase or reactive
flows (see for instance [13,72,73,74,75]).
In this section we consider a two-phase cellular automata fluid. Each particle of
the fluid can be in two possible states, say s = 1 or s = −1. If we call this extra degree
of freedom a spin, this fluid can be compared with an Ising system in which the spins
can move according to some hydrodynamic rules.
We consider an interaction between nearest neighbors similar to that found in classical
dynamical Ising models. This will produce a surface tension effect at the interface
between the two phases. The introduction of such an interaction requires real-valued
fields (like the temperature) and, thus, the present model goes beyond a simple, fully
discrete cellular automaton.
It is interesting to remark that, in addition to being a binary fluid model, this sys-
tem has some of the ingredients of a ferrofluid [76], if the spin is interpreted as the
magnetization carried by the particles.
The collision rule
Before we define more precisely the spin interaction, let us return
to the particle motion. A collision rule which conserves mass, momentum and spin can
be defined in analogy with the FHP rule described in section 2.3.1.
We denote by $s_i(r, t) \in \{-1, 0, 1\}$ the state of the automaton along lattice direction
i at site r and time t ($s_i = 0$ means an absence of particle). Clearly the presence of a
particle is characterized by $s_i^2 = 1$, regardless of its spin. Thus, the collision term can
be obtained by using $s_i^2$ as an occupation number.
When a collision takes place, the particles are redistributed among the lattice direc-
tions but the same number of spin +1 and -1 particles should be present in the output
state as there were in the input state. A way to guarantee this spin conservation is to
assume that the particles are distinguishable, at least for what concerns their spin.
Therefore the full collision rule of an Ising fluid obeying FHP-like collisions reads
$$\begin{aligned}
s_i(r + \lambda c_i, t + \tau) = {} & s_i \\
& - s_i\, s_{i+2}^2 s_{i+4}^2 (1 - s_{i+1}^2)(1 - s_{i+3}^2)(1 - s_{i+5}^2) \\
& + s_{i+3}\, s_{i+1}^2 s_{i+5}^2 (1 - s_i^2)(1 - s_{i+2}^2)(1 - s_{i+4}^2) \\
& - s_i\, s_{i+3}^2 (1 - s_{i+1}^2)(1 - s_{i+2}^2)(1 - s_{i+4}^2)(1 - s_{i+5}^2) \\
& + pq\, s_{i+1}\, s_{i+4}^2 (1 - s_i^2)(1 - s_{i+2}^2)(1 - s_{i+3}^2)(1 - s_{i+5}^2) \\
& + p(1-q)\, s_{i+4}\, s_{i+1}^2 (1 - s_i^2)(1 - s_{i+2}^2)(1 - s_{i+3}^2)(1 - s_{i+5}^2) \\
& + (1-p)(1-q)\, s_{i+2}\, s_{i+5}^2 (1 - s_i^2)(1 - s_{i+1}^2)(1 - s_{i+3}^2)(1 - s_{i+4}^2) \\
& + (1-p)q\, s_{i+5}\, s_{i+2}^2 (1 - s_i^2)(1 - s_{i+1}^2)(1 - s_{i+3}^2)(1 - s_{i+4}^2)
\end{aligned} \tag{2.75}$$
where p and q are random Boolean variables that are 1 with probability 1/2, independently
at each site and time step. These quantities select one of the possible
outcomes in the two-body collisions.
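To make rule 2.75 concrete, the following Python sketch evaluates its right-hand side at a single site and checks by brute force that particle number, total spin and momentum are conserved for every input state. The names `collide`, `invariants` and `conserved_everywhere` are invented for this illustration, the lattice directions are assumed to be the unit vectors $c_i = (\cos(\pi i/3), \sin(\pi i/3))$, and the propagation to the neighboring site $r + \lambda c_i$ is left out.

```python
import math
from itertools import product

# Assumed unit lattice directions c_i of the hexagonal lattice.
C = [(math.cos(math.pi * i / 3), math.sin(math.pi * i / 3)) for i in range(6)]

def collide(s, p, q):
    """Post-collision states given by the right-hand side of eq. 2.75.
    s: six values in {-1, 0, 1}; p, q: Boolean variables (0 or 1)."""
    n = [x * x for x in s]                      # occupation numbers s_i^2
    out = []
    for i in range(6):
        k = lambda shift: (i + shift) % 6       # direction index i + shift
        empty = lambda *ds: math.prod(1 - n[k(d)] for d in ds)
        out.append(
            s[i]
            - s[i] * n[k(2)] * n[k(4)] * empty(1, 3, 5)
            + s[k(3)] * n[k(1)] * n[k(5)] * empty(0, 2, 4)
            - s[i] * n[k(3)] * empty(1, 2, 4, 5)
            + p * q * s[k(1)] * n[k(4)] * empty(0, 2, 3, 5)
            + p * (1 - q) * s[k(4)] * n[k(1)] * empty(0, 2, 3, 5)
            + (1 - p) * (1 - q) * s[k(2)] * n[k(5)] * empty(0, 1, 3, 4)
            + (1 - p) * q * s[k(5)] * n[k(2)] * empty(0, 1, 3, 4)
        )
    return out

def invariants(s):
    """Particle number, total spin and the two momentum components."""
    n = [x * x for x in s]
    px = sum(ni * c[0] for ni, c in zip(n, C))
    py = sum(ni * c[1] for ni, c in zip(n, C))
    return sum(n), sum(s), px, py

def conserved_everywhere(tol=1e-12):
    """Test all 3^6 site states and the four (p, q) choices."""
    for s in product((-1, 0, 1), repeat=6):
        for p, q in product((0, 1), repeat=2):
            out = collide(list(s), p, q)
            if any(x not in (-1, 0, 1) for x in out):
                return False
            if any(abs(a - b) > tol for a, b in zip(invariants(s), invariants(out))):
                return False
    return True

print(conserved_everywhere())
```

For a head-on pair such as `[1, 0, 0, -1, 0, 0]`, the choice p = q = 1 rotates the pair by one lattice direction and each particle keeps its spin.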
Spin interaction
An important part of this Ising fluid model is the interaction between
spins at the same site and spins sitting on adjacent lattice sites. This interaction
produces the surface tension and can be adjusted through a parameter which corresponds
to the temperature of the system (which is assumed to be uniform here).
The interaction we propose here does not conserve the number of spins of each
sign. It only conserves the number of particles and, for this reason, does not represent
two different fluids but two possible states of the same fluid. Of course, the miscibility
or immiscibility of the two phases can be tuned through the temperature.
The updating rule for the spin dynamics is taken from the Monte-Carlo
method [77], using the so-called Glauber transition rule. The main idea is that a spin
flips (changes sign) if it can lower the local energy of the system. The energy of the pair
of spins $s_i$ and $s_j$ is computed as $E = -J_1 s_i s_j$ if the two spins are nearest neighbors
on the hexagonal lattice and $E = -J_0 s_i s_j$ if they both sit on the same site (remember
that up to six particles can populate a given site).
However, a spin can flip even if this results in a local increase of energy. But, then,
the change is accepted only with a probability $W(s \to -s)$ which depends on the
temperature. In the Glauber dynamics, this probability is given by
$$W(s_i \to -s_i) = \frac{1}{2}\left(1 - s_i\tanh(E_i)\right)$$
where $E_i$ is the energy before the flip
$$E_i = \frac{1}{k_B T}\left(J_0 m_i + J_1 M_i\right)s_i$$
and $m_i = \sum_{j\neq i} s_j$ is the on-site "magnetization" seen by spin $s_i$ and $M_i = \sum_{\langle ji\rangle} s_j$
is the "magnetization" carried by all the particles j on the neighboring sites of spin i.
Figure 2.26: The three sub-lattices on the hexagonal lattice used for the synchronous
spin update. The values 0, 1, 2 label the sites according to the sub-lattice they belong to.
The quantity T is the temperature and $k_B$ the Boltzmann constant, which we can set to 1
when working with an arbitrary temperature scale. When more than one particle is
present at a site, only one of them, chosen at random, is checked for such a spin flip.
The above transition rule is obtained from the detailed balance condition, namely
$$\frac{W(s_i \to -s_i)}{W(-s_i \to s_i)} = \frac{\exp\left(-E(-s_i)/(k_B T)\right)}{\exp\left(-E(s_i)/(k_B T)\right)}$$
where $E(\pm s_i)$ denotes the Ising energy as a function of $s_i$, and it has the property of
driving an ergodic system to thermodynamic equilibrium.
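A minimal Python sketch of such a spin update is given below. It is an illustration under stated assumptions rather than the implementation used for the figures: it assumes the standard Glauber form, in which the argument of tanh is the local field $(J_0 m_i + J_1 M_i)/(k_B T)$ and the flip probability is $\frac{1}{2}(1 - s_i\tanh(\cdot))$, and it takes the Ising energy of spin $s_i$ as $-(J_0 m_i + J_1 M_i)s_i$, following the pair energies defined above. All function names are hypothetical.

```python
import math
import random

def flip_probability(s, m, M, J0, J1, T, kB=1.0):
    """Glauber probability that spin s (+1 or -1) flips.
    m: on-site magnetization seen by the spin, M: magnetization of the neighbors."""
    field = (J0 * m + J1 * M) / (kB * T)     # assumed argument of tanh
    return 0.5 * (1.0 - s * math.tanh(field))

def ising_energy(s, m, M, J0, J1):
    """Energy of spin s in the field of the other spins (sum of pair energies)."""
    return -(J0 * m + J1 * M) * s

def glauber_update(s, m, M, J0, J1, T, rng=random):
    """Return the new value of the spin after one Glauber trial."""
    if rng.random() < flip_probability(s, m, M, J0, J1, T):
        return -s
    return s

# Ratio of forward and backward rates compared with the Boltzmann factor.
s, m, M, J0, J1, T = 1, -2, 1, 1.0, 0.5, 2.0
lhs = flip_probability(s, m, M, J0, J1, T) / flip_probability(-s, m, M, J0, J1, T)
rhs = math.exp(-(ising_energy(-s, m, M, J0, J1) - ising_energy(s, m, M, J0, J1)) / T)
print(lhs, rhs)
```

With these assumptions the two printed numbers agree, which is the detailed balance condition written above.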
As opposed to the standard Monte-Carlo approach, where the lattice sites are
visited sequentially and in a random way, here we update synchronously all the sites
belonging to a given sub-lattice. Indeed, for the coherence of the dynamics it is important
not to update simultaneously any two spins that are neighbors on the lattice. This
is for the same reason as explained in section 2.2.2 when we discussed the Q2R rule.
In a hexagonal lattice, it is easy to see that the space can be partitioned into three
sub-lattices so that all the neighbors of one sub-lattice always belong to the two others
(see figure 2.26).
Therefore, the spin interaction rule described above cycles over these three sub-lattices
and alternates with the FHP particle motion given by equation 2.75.
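The three-sub-lattice partition can be written down explicitly. The sketch below is only an illustration: it assumes that the sites are indexed by two integer (axial) coordinates in which the six nearest neighbors have the offsets listed in `NEIGHBORS`, and the labeling function `sublattice` is one possible choice, not necessarily the one of figure 2.26.

```python
# Six nearest-neighbor offsets of the hexagonal lattice in axial coordinates (assumed layout).
NEIGHBORS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]

def sublattice(x, y):
    """Label in {0, 1, 2} of the sub-lattice containing site (x, y)."""
    return (x - y) % 3

def partition_is_valid(n):
    """True if no two neighboring sites of an n x n periodic lattice share a label."""
    for x in range(n):
        for y in range(n):
            for dx, dy in NEIGHBORS:
                if sublattice(x, y) == sublattice((x + dx) % n, (y + dy) % n):
                    return False
    return True

def sites_of(label, n):
    """Sites that may be updated synchronously during one sub-step."""
    return [(x, y) for x in range(n) for y in range(n) if sublattice(x, y) == label]

print(partition_is_valid(12), [len(sites_of(k, 12)) for k in range(3)])
```

With periodic boundaries the linear size must be a multiple of 3, otherwise two sites with the same label become neighbors across the boundary.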
It is of course possible to vary the relative frequency of the two rules (Glauber and
FHP). For instance we can perform n successive FHP steps followed by m successive
steps of the Ising rule in order to give more or less importance to the particle motion
with respect to the spin flip. When n = 0 we have a pure Ising model on a hexagonal
lattice but with possibly a different number of spins per site.
If the temperature is large enough and periodic boundary conditions are imposed,
the system evolves to a configuration where, on average, there are equal numbers of
particles with spins up and down. Of course, the situation is not frozen and the particles
keep moving and spins continuously flip. As in regular Ising systems, there is a critical
temperature below which we can observe a global magnetization and the growth of
domains containing one type of spin. This situation is illustrated in figure 2.27 and
corresponds to the case n = m = 1, namely one spin update cycle followed by one
step of FHP motion. It is observed that the critical temperature depends on the update
frequency n and m.
Figure 2.27: Three snapshots of the evolution of the Ising FHP model below the critical
temperature. Particles with spin +1 are shown in black while gray points show
particles with spin -1. White cells indicate empty sites.
Figure 2.28: Rayleigh-Taylor instability of the interface between two immiscible fluids,
shown at t=50, t=150, t=300 and t=350.
Particles with spin +1 are shown in black and are "lighter" than the gray particles
having spin -1. An approximate immiscibility is obtained by choosing a low temperature
in the model.
Another interesting situation corresponds to the simulation of a Rayleigh-Taylor
instability (see figure 2.28). Two immiscible fluids are on top of each other and
the heavier is above the lighter. Due to gravity, the upper fluid wants to penetrate
through the lower one. Since the two fluids are immiscible, the interface between
them becomes unstable and, as time goes on, gives rise to a mushroom-like pattern.
An external force like gravity can be added to our model by deflecting with some
probability (and when possible) the trajectory of particles in a given direction. Two
immiscible fluids can be modeled by having a low temperature T in the Glauber dynamics
so as to produce the necessary surface tension. The upper fluid layer is initialized with
only particles of spin -1, whereas the lower layer contains only spins +1. Gravity
is adjusted so that “light” particles go up and heavy particles go down. After a few
iterations, the flat interface destabilizes as shown in the last panel of figure 2.28.
2.4
The Lattice Boltzmann Method
Cellular automata fluids, such as those discussed in the previous section, represent
idealized N-body systems. Their time evolution can be performed exactly on a computer,
without many of the approximations usually made when computing numerically the
motion of a fluid. In particular, there is no need, in a CA simulation, to assume some
factorization of the many-body correlation functions into a product of one-particle
density functions.
Of course, the cellular automata model may be inadequate to represent a real situation
but it naturally includes the intrinsic fluctuations present in any system composed
of many particles. This feature is out of reach of most tractable numerical techniques.
In many physical situations, spontaneous fluctuations and many-particle correlations
can be safely ignored. This is however not always the case and, in sections 2.6.3 and
2.6.5, we shall see some examples of systems where intrinsic fluctuations are crucial.
On the other hand, a cellular automata simulation is very noisy (because it deals
with Boolean quantities). In order to obtain the macroscopic behavior of a system (like
the streaklines in a flow past an obstacle), one has to average the state of each cell over a
rather large patch of cells (for instance a 32 × 32 square) and over several consecutive
time steps. This requires larger systems and longer simulation times. Therefore, the
benefit of the cellular automata approach over more traditional numerical techniques
gets blurred [78] when simulating pure fluid flows in simple geometries.
In addition, due to their Boolean nature, cellular automata models offer little flexibility
to finely adjust external parameters. Many tunings are done through probabilities,
which is not always the most efficient way.
2.4.1
From Boolean to real-valued fields
When correlations can be neglected and the Boltzmann molecular chaos hypothesis is
valid, it may be much more effective to directly simulate on the computer the lattice
Boltzmann equation
$$N_i(r + \lambda c_i, t + \tau) = N_i(r, t) + \Omega_i(N) \tag{2.76}$$
with $\Omega_i$ given, for instance, by 2.22 with q replaced by 1/2. It is more advantageous to
average the microdynamics before simulating it rather than after. The quantities
of interest $N_i$ are no longer Boolean variables but probabilities of presence, which
are continuous variables ranging in the interval [0, 1].
A direct simulation of the lattice Boltzmann dynamics was first considered by
McNamara and Zanetti [41]. It considerably decreases the statistical noise that plagues
cellular automata models and considerably reduces the computational requirements.
The main drawback of this approach is that it neglects many-body correlations and
may become numerically unstable.
The lattice Boltzmann (LB) method has been widely used for simulating various
fluid flows [79] and is believed to be a very serious candidate to overcome traditional
numerical techniques of CFD (Computational Fluid Dynamics). Its microscopic
level of description provides a natural interpretation of the numerical scheme and permits
intuitive generalizations to complex flow problems (two-phase flow [13,28,74],
magnetohydrodynamics [80], flow in porous media [23,24] or thermohydrodynamics [81]).
The main weakness of current LB models is that they are defined on regular
lattices, while CFD techniques can deal with arbitrary irregular meshes. For some
applications where the geometry cannot be fitted by a regular lattice, this is a strong
limitation. Some efforts are now devoted to extending LB models to irregular lattices [82].
The successful approach is probably to assume an underlying discrete velocity Boltzmann
equation and express its evolution on a coarse grain discrete spatial mesh. As a
tentative example, section 2.5.3 shows a simple LB diffusion model in polar coordinates.
In a lattice Boltzmann fluid, the most natural way to define the collision term $\Omega_i$ is
to average the microdynamics of a given underlying cellular automata fluid and factorize
it into a product of average quantities, as we did in section 2.3 to get the Boltzmann
approximation. However, as one considers more sophisticated lattice gas fluids (like
FHP-III [3] or 3D models [61]), the collision term requires a very large number of
floating point operations at each lattice site and time step. Even on a massively parallel
computer, in which all cells are computed simultaneously, this may not be
acceptable.
The first solution to this problem is to consider the same approximation as we
used with the Chapman-Enskog expansion when deriving the macroscopic behavior
of the FHP fluid. The idea is to linearize the collision term around its local equilib-
rium solution. This approach has been proposed by Higuera and coworkers [42] and
considerably reduces the complexity of the operations involved.
2.4.2
BGK models
Following the same idea, a further simplification can be considered [83]: the collision
term need not be related to an existing cellular automata microdynamics, as long as
particle number and momentum are conserved. In its simplest form, the lattice Boltzmann
dynamics can be written as a relaxation equation [84,85]
$$f_i(r + \tau v_i, t + \tau) - f_i(r, t) = \Omega_i(f) = \frac{1}{\xi}\left(f_i^{(0)}(r, t) - f_i(r, t)\right) \tag{2.77}$$
where $f_i(r, t)$ denotes the probability that, at time t, a particle is entering site r along
lattice direction i (note that here, we use the notation $f_i$ instead of $N_i$). The quantity $\xi$
is a relaxation time, which is a free parameter of the model. It actually will determine
the fluid viscosity.
Equivalently, equation 2.77 reads
$$f_i(r + \tau v_i, t + \tau) = \frac{1}{\xi}\,f_i^{(0)}(r, t) + \left(1 - \frac{1}{\xi}\right)f_i(r, t) \tag{2.78}$$
which is the appropriate form for a numerical implementation.
The local equilibrium solution $f_i^{(0)}$ is a function of the actual density $\rho = \sum_i f_i$
and velocity flow $\rho u = \sum_i f_i v_i$. Therefore, when implementing 2.78 on a computer,
one first computes, at each site, $\rho$ and u from the current values of the $f_i$'s and then one
may compute $f_i^{(0)}(\rho, u)$. In general, $f^{(0)}$ is a nonlinear function of $\rho$ and u and thus,
equ. 2.78 is nonlinear in the $f_i$'s.
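The two-stage procedure just described (compute $\rho$ and u, then relax towards $f_i^{(0)}$ and propagate) can be sketched in a few lines of Python. This is an illustration under explicit assumptions, not the model defined in these notes: it uses the widely used D2Q9 equilibrium with weights 4/9, 1/9 and 1/36 and unit particle masses, lattice units $\tau = \lambda = 1$, periodic boundaries, and hypothetical function names.

```python
import math

# D2Q9 velocities (rest particle first) and the standard weights (assumed here).
C = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1), (1, 1), (-1, 1), (-1, -1), (1, -1)]
W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4

def equilibrium(rho, ux, uy):
    """Second-order (in u) local equilibrium f_i^(0)(rho, u)."""
    usq = ux * ux + uy * uy
    feq = []
    for (cx, cy), w in zip(C, W):
        cu = cx * ux + cy * uy
        feq.append(w * rho * (1 + 3 * cu + 4.5 * cu * cu - 1.5 * usq))
    return feq

def bgk_step(f, xi):
    """One application of eq. 2.78 on a periodic grid; f[x][y][i] are the populations."""
    nx, ny = len(f), len(f[0])
    new = [[[0.0] * 9 for _ in range(ny)] for _ in range(nx)]
    for x in range(nx):
        for y in range(ny):
            fi = f[x][y]
            rho = sum(fi)
            ux = sum(fk * c[0] for fk, c in zip(fi, C)) / rho
            uy = sum(fk * c[1] for fk, c in zip(fi, C)) / rho
            feq = equilibrium(rho, ux, uy)
            for i, (cx, cy) in enumerate(C):
                # Relax towards equilibrium, then move to the neighboring site.
                new[(x + cx) % nx][(y + cy) % ny][i] = feq[i] / xi + (1 - 1 / xi) * fi[i]
    return new

def totals(f):
    """Total mass and momentum of the grid."""
    mass = px = py = 0.0
    for column in f:
        for site in column:
            for fk, (cx, cy) in zip(site, C):
                mass += fk
                px += fk * cx
                py += fk * cy
    return mass, px, py

# Small shear wave as initial condition, relaxation time xi = 0.8.
nx, ny, xi = 8, 8, 0.8
f = [[equilibrium(1.0, 0.05 * math.sin(2 * math.pi * y / ny), 0.0) for y in range(ny)]
     for x in range(nx)]
before = totals(f)
for _ in range(20):
    f = bgk_step(f, xi)
after = totals(f)
print(before, after)
```

Because the equilibrium is built from the local $\rho$ and u, the relaxation step conserves mass and momentum at each site, so the totals printed before and after agree up to rounding errors.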
It is important to notice that $f^{(0)}$ is model dependent and can be adjusted so as to
produce a given, expected, behavior. In particular, the lack of Galilean invariance that
plagues cellular automata fluids can be cured, as well as the spurious velocity contribution
appearing in expression 2.60 of the pressure term. In a more general context, $f_i^{(0)}$
could include other physical features, such as a local temperature [81,86], and can be
tuned to describe other physical situations, as shown in section 2.7.
Equation 2.77 is referred to as the lattice BGK method [84] (BGK stands for Bhatnagar,
Gross and Krook [87], who first considered a collision term with a single relaxation
time, in 1954). Equation 2.77 has been studied by several authors [88,79], due to its
ability to deal with high Reynolds number flows. However, one difficulty of this approach
is the numerical instabilities which may develop when large velocity gradients
are present.
2.4.3
Lattice Boltzmann fluids
In this section we define the generic dynamics of LB fluid models (precisely, BGK
models) and derive the corresponding macroscopic behavior.
A common example of an LB fluid is the so-called D2Q9 model (see [79]), defined
in two dimensions (D2) with nine variables, or quantities, per site (Q9). This lattice
and its possible directions of motion are shown in figure 2.29. Note that a ninth
direction $i = 0$ is defined to describe a population $f_0(\mathbf{r}, t)$ of particles at rest (i.e.
having $\mathbf{v}_0 = 0$). The isotropy problems inherent to square lattices in 2D are solved
by weighting differently the eight possible directions of motion. Here we interpret
these weights as masses $m_i$ associated with the particles traveling along each direction.
Figure 2.29 (right) gives the appropriate masses for the D2Q9 model.
In a general DdQ(z + 1) LB fluid, the macroscopic quantities, such as the local
density ρ or the velocity flow u are defined as usual as
$$\rho = \sum_{i=0}^{z} m_i f_i \qquad\qquad \rho\mathbf{u} = \sum_{i=0}^{z} m_i f_i \mathbf{v}_i \tag{2.79}$$
where z is again the number of non-zero velocities in the model.
We set v = λ/τ and assume that the lattice has the following properties
$$\sum_{i=1}^{z} m_i = C_0 \qquad\qquad \sum_{i=1}^{z} m_i v_{i\alpha} v_{i\beta} = C_2 v^2 \delta_{\alpha\beta} \tag{2.80}$$
Figure 2.29: The eight velocities in the D2Q9 lattice Boltzmann model of a two-
dimensional fluid (on the left) and the mass associated to each of these directions
(on the right).
and
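$$\sum_{i=1}^{z} m_i v_{i\alpha} v_{i\beta} v_{i\gamma} v_{i\delta} = C_4 v^4 \left(\delta_{\alpha\beta}\delta_{\gamma\delta} + \delta_{\alpha\gamma}\delta_{\beta\delta} + \delta_{\alpha\delta}\delta_{\beta\gamma}\right) \tag{2.81}$$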
Note, also, that odd tensors are supposed to vanish. For the D2Q9 model, we have
$m_i = 1$ for diagonal motions, $m_i = 4$ for horizontal and vertical motions and (see [14])
$$C_0 = 20, \qquad C_2 = 12, \qquad C_4 = 4$$
For the D2Q7 model (hexagonal lattice in two dimensions) one has $m_i = 1$ for all $i$ and
$$C_0 = 6, \qquad C_2 = 3, \qquad C_4 = \frac{3}{4}$$
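The values quoted above can be checked numerically from definitions 2.80 and 2.81. The short sketch below does so for two-dimensional lattices with v = 1; the function name and the ordering of the velocity lists are assumptions made for the example.

```python
import math


def lattice_constants(velocities, masses):
    """Return (C0, C2, C4) for the moving directions of a 2D lattice, v = 1."""
    c0 = sum(masses)
    # alpha = beta = x in equ. 2.80
    c2 = sum(m * vx * vx for m, (vx, vy) in zip(masses, velocities))
    # alpha = beta = x, gamma = delta = y in equ. 2.81: only the first
    # product of Kronecker symbols survives, so the sum equals C4.
    c4 = sum(m * vx * vx * vy * vy for m, (vx, vy) in zip(masses, velocities))
    return c0, c2, c4


# D2Q9: four axis directions (mass 4) and four diagonals (mass 1)
D2Q9_V = [(1, 0), (0, 1), (-1, 0), (0, -1), (1, 1), (-1, 1), (-1, -1), (1, -1)]
D2Q9_M = [4, 4, 4, 4, 1, 1, 1, 1]

# D2Q7: six unit vectors of the hexagonal lattice, all of mass 1
HEX_V = [(math.cos(k * math.pi / 3), math.sin(k * math.pi / 3)) for k in range(6)]
HEX_M = [1] * 6

d2q9 = lattice_constants(D2Q9_V, D2Q9_M)
d2q7 = lattice_constants(HEX_V, HEX_M)
```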
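The next step is to define the local equilibrium distribution $f_i^{(0)}$ as a function of the
macroscopic quantities $\rho$ and $\mathbf{u}$. A natural choice is to adopt an expression similar to
that obtained for the FHP model, namely equation 2.53. Accordingly, we define
$$\begin{aligned}
f_i^{(0)} &= a\rho + \frac{b}{v^2}\,\rho\,\mathbf{v}_i\cdot\mathbf{u} + \rho\, e\,\frac{u^2}{v^2} + \rho\,\frac{h}{v^4}\, v_{i\alpha} v_{i\beta} u_\alpha u_\beta \qquad i \ge 1 \\
f_0^{(0)} &= a_0\rho + \rho\, e_0\,\frac{u^2}{v^2}
\end{aligned} \tag{2.82}$$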
where a, a0, b, e, e0 and h are coefficients which will now be determined, first using
mass and momentum conservation, and second by matching the form of the momen-
tum tensor with the standard expression of hydrodynamics.
Mass and momentum conservation impose
$$\sum_{i=0}^{z} m_i \Omega_i = 0 \qquad\text{and}\qquad \sum_{i=0}^{z} m_i \mathbf{v}_i \Omega_i = 0$$
This implies that
$$\sum_{i=0}^{z} m_i f_i^{(0)} = \rho \qquad\text{and}\qquad \sum_{i=0}^{z} m_i \mathbf{v}_i f_i^{(0)} = \rho\mathbf{u} \tag{2.83}$$
because ρ and ρu are defined through relations 2.79. Using relations 2.80 and 2.81, we
obtain from 2.82
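$$\begin{aligned}
\sum_{i=0}^{z} m_i f_i^{(0)} &= (m_0 a_0 + C_0 a)\,\rho + (m_0 e_0 + C_0 e + C_2 h)\,\rho\,\frac{u^2}{v^2} \\
\sum_{i=0}^{z} m_i \mathbf{v}_i f_i^{(0)} &= C_2\, b\,\rho\mathbf{u}
\end{aligned} \tag{2.84}$$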
As in the case of a CA fluid (see section 2.3) we assume here that the LB dynamics
can be solved by a multiscale Chapman-Enskog expansion. Thus, we write
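$$f_i = f_i^{(0)} + \epsilon f_i^{(1)} + \dots$$
and the zeroth order of the momentum tensor is
$$\begin{aligned}
\Pi^{(0)}_{\alpha\beta} &= \sum_{i=0}^{z} m_i v_{i\alpha} v_{i\beta} f_i^{(0)} \\
&= C_2 v^2 \left[ a + \left( e + \frac{C_4}{C_2}\, h \right) \frac{u^2}{v^2} \right] \rho\,\delta_{\alpha\beta} + 2 C_4 h\, \rho\, u_\alpha u_\beta
\end{aligned} \tag{2.85}$$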
In a real fluid, when the dissipative terms are disregarded (Euler equation) one has the
following expression for the momentum tensor
$$\Pi^{(0)}_{\alpha\beta} = c_s^2\,\rho\,\delta_{\alpha\beta} + \rho\, u_\alpha u_\beta \tag{2.86}$$
where cs is the sound speed.
By comparing equation 2.83 with 2.84 and equation 2.86 with 2.85 we obtain the
following conditions
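$$a = \frac{1}{C_2}\frac{c_s^2}{v^2} \qquad\qquad m_0 a_0 = 1 - \frac{C_0}{C_2}\frac{c_s^2}{v^2}$$
and
$$b = \frac{1}{C_2} \qquad e = -\frac{1}{2C_2} \qquad m_0 e_0 = \frac{C_0}{2C_2} - \frac{C_2}{2C_4} \qquad h = \frac{1}{2C_4}$$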
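where the sound speed is considered as an adjustable parameter. With the above result,
we can rewrite 2.82 as
$$\begin{aligned}
f_i^{(0)} &= \rho \left[ \frac{1}{C_2}\frac{c_s^2}{v^2} + \frac{1}{C_2}\frac{\mathbf{v}_i\cdot\mathbf{u}}{v^2} + \frac{1}{2C_4 v^4}\left( v_{i\alpha} v_{i\beta} - \frac{C_4}{C_2}\, v^2 \delta_{\alpha\beta} \right) u_\alpha u_\beta \right] \\
m_0 f_0^{(0)} &= \rho \left[ 1 - \frac{C_0}{C_2}\frac{c_s^2}{v^2} + \left( \frac{C_0}{2C_2} - \frac{C_2}{2C_4} \right) \frac{u^2}{v^2} \right]
\end{aligned} \tag{2.87}$$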
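The conservation constraints 2.83 and the form 2.86 of the momentum tensor can be verified numerically for the equilibrium 2.87. The sketch below does this for the D2Q9 lattice in units where v = 1; the direction ordering, the choice m0 = 1 and all function names are assumptions made for the example.

```python
# D2Q9 in lattice units (v = 1). Assumed ordering: rest particle first,
# then the four axis directions, then the four diagonals.
V = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1), (1, 1), (-1, 1), (-1, -1), (1, -1)]
M = [1, 4, 4, 4, 4, 1, 1, 1, 1]          # m0 = 1 is a free choice
C0, C2, C4 = 20.0, 12.0, 4.0


def weighted_equilibrium(rho, u, cs2):
    """Return the list of m_i f_i^(0) given by equ. 2.87, rest direction first."""
    ux, uy = u
    u2 = ux * ux + uy * uy
    out = [rho * (1.0 - (C0 / C2) * cs2 + (C0 / (2 * C2) - C2 / (2 * C4)) * u2)]
    for (vx, vy), m in zip(V[1:], M[1:]):
        vu = vx * ux + vy * uy
        # (v_ia v_ib - (C4/C2) v^2 delta_ab) u_a u_b with v = 1
        quad = vu * vu - (C4 / C2) * u2
        out.append(m * rho * (cs2 / C2 + vu / C2 + quad / (2 * C4)))
    return out


rho, u, cs2 = 1.2, (0.05, -0.02), 1.0 / 3.0
g0 = weighted_equilibrium(rho, u, cs2)
mass = sum(g0)                                          # should equal rho
mom_x = sum(g * vx for g, (vx, vy) in zip(g0, V))       # should equal rho * u_x
mom_y = sum(g * vy for g, (vx, vy) in zip(g0, V))       # should equal rho * u_y
pi_xx = sum(g * vx * vx for g, (vx, vy) in zip(g0, V))  # c_s^2 rho + rho u_x^2
```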
2.4.4
The Navier-Stokes equation
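In equation 2.50, we have obtained the following result
$$\partial_t \rho u_\alpha + \partial_\beta \left[ \Pi_{\alpha\beta} + \frac{\tau}{2} \left( \epsilon\, \partial_{t_1} \Pi^{(0)}_{\alpha\beta} + \partial_\gamma S^{(0)}_{\alpha\beta\gamma} \right) \right] = 0 \tag{2.88}$$
where $t = t_1/\epsilon + t_2/\epsilon^2$ and $\mathbf{r} = \mathbf{r}_1/\epsilon$ take into account the different time scales of the
problem (see equations 2.29 and 2.32).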
The derivation of 2.88 only relies on the fact that
$$\sum_i m_i \Omega_i = 0 \qquad\text{and}\qquad \sum_i m_i \Omega_i \mathbf{v}_i = 0$$
and, thus, this equation is still valid here.
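We have already obtained $\Pi^{(0)}$ in the previous section. We still need to compute
$S^{(0)}_{\alpha\beta\gamma}$ and $\Pi^{(1)}$. The quantity $f^{(1)}$ is defined by an equation similar to
relation 2.34, namely
$$\frac{1}{\tau} \sum_{j=0}^{z} \frac{\partial \Omega_i(f^{(0)})}{\partial f_j}\, f_j^{(1)} = \partial_{t_1} f_i^{(0)} + \partial_{1\alpha} v_{i\alpha} f_i^{(0)}$$
Since $\Omega_i = \frac{1}{\xi}\left( f_i^{(0)}(\mathbf{r},t) - f_i(\mathbf{r},t) \right)$, the above equation simply reads
$$\begin{aligned}
-\frac{1}{\tau\xi}\, f_i^{(1)} &= \partial_{t_1} f_i^{(0)} + \partial_{1\alpha} v_{i\alpha} f_i^{(0)} \\
&= -\frac{\partial f_i^{(0)}}{\partial \rho}\, \mathrm{div}_1 \rho\mathbf{u} - \frac{\partial f_i^{(0)}}{\partial \rho u_\alpha}\, \partial_{1\beta} \Pi^{(0)}_{\alpha\beta} + \partial_{1\alpha} v_{i\alpha} f_i^{(0)}
\end{aligned} \tag{2.89}$$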
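with $\mathrm{div}_1 = \sum_\alpha \partial_{1\alpha}$. We shall now compute $f_i^{(1)}$ to the first order in $\mathbf{u}$. We have
$$f_i^{(0)} = a\rho + \frac{b}{v^2}\, \rho\, v_{i\alpha} u_\alpha \qquad\qquad f_0^{(0)} = a_0 \rho$$
and
$$\Pi^{(0)}_{\alpha\beta} = c_s^2\, \rho\, \delta_{\alpha\beta} \tag{2.90}$$
Thus,
$$\frac{\partial f_i^{(0)}}{\partial \rho} = a \qquad \frac{\partial f_i^{(0)}}{\partial \rho u_\alpha} = \frac{b}{v^2}\, v_{i\alpha} \qquad \frac{\partial f_0^{(0)}}{\partial \rho} = a_0 \qquad \frac{\partial f_0^{(0)}}{\partial \rho u_\alpha} = 0$$
and, since $a = b\, c_s^2/v^2$, the terms proportional to $v_{i\alpha}\partial_{1\alpha}\rho$ cancel and we obtain
$$f_i^{(1)} = -\frac{\tau\xi}{C_2 v^2} \left( v_{i\gamma} v_{i\delta} - c_s^2 \delta_{\gamma\delta} \right) \partial_{1\gamma} \rho u_\delta \qquad\text{and}\qquad f_0^{(1)} = \tau\xi\, a_0\, \mathrm{div}_1 \rho\mathbf{u}$$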
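Using that $\epsilon\,\partial_{1\gamma} = \partial_\gamma$, the order $O(\epsilon)$ contribution to $\Pi$ reads
$$\begin{aligned}
\epsilon\,\Pi^{(1)}_{\alpha\beta} &= \epsilon \sum_{i=0}^{z} m_i f_i^{(1)} v_{i\alpha} v_{i\beta} \\
&= \tau v^2 \xi \left[ \left( \frac{c_s^2}{v^2} - \frac{C_4}{C_2} \right) \delta_{\alpha\beta}\, \mathrm{div} \rho\mathbf{u} - \frac{C_4}{C_2} \left( \partial_\beta \rho u_\alpha + \partial_\alpha \rho u_\beta \right) \right]
\end{aligned} \tag{2.91}$$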
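Thus
$$\partial_\beta\, \epsilon\,\Pi^{(1)}_{\alpha\beta} = -\tau v^2 \xi \left[ \left( 2\frac{C_4}{C_2} - \frac{c_s^2}{v^2} \right) \partial_\alpha \mathrm{div} \rho\mathbf{u} + \frac{C_4}{C_2}\, \partial_\beta^2\, \rho u_\alpha \right] \tag{2.92}$$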
From this expression, we get two viscosity coefficients (shear and bulk viscosity), as
usual in compressible fluids.
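The final step is the calculation of the lattice viscosity. The first term in 2.88 giving
a contribution to the lattice viscosity is $\partial_\beta (\tau/2)\, \epsilon\, \partial_{t_1} \Pi^{(0)}_{\alpha\beta}$. With $\Pi^{(0)}_{\alpha\beta} = c_s^2 \rho\, \delta_{\alpha\beta} + O(u^2)$,
we have
$$\frac{\tau}{2}\, \epsilon\, \partial_{t_1} \Pi^{(0)}_{\alpha\beta} = \epsilon\, \tau\, \frac{c_s^2}{2}\, \partial_{t_1} \rho\, \delta_{\alpha\beta} = -\tau\, \frac{c_s^2}{2}\, \delta_{\alpha\beta}\, \mathrm{div} \rho\mathbf{u}$$
where we have used that $\partial_{t_1} \rho + \mathrm{div}_1 \rho\mathbf{u} = 0$ (see equation 2.41) and the definition of
the length scale $\epsilon\, \mathrm{div}_1 = \mathrm{div}$. Therefore
$$\frac{\tau}{2}\, \epsilon\, \partial_\beta \partial_{t_1} \Pi^{(0)}_{\alpha\beta} = -\tau\, \frac{c_s^2}{2}\, \partial_\alpha \mathrm{div} \rho\mathbf{u} \tag{2.93}$$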
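Similarly, we must compute the contribution due to $S^{(0)}_{\alpha\beta\gamma}$ in equation 2.88
$$\begin{aligned}
S^{(0)}_{\alpha\beta\gamma} &= \sum_{i=0}^{z} m_i v_{i\alpha} v_{i\beta} v_{i\gamma} f_i^{(0)} \\
&= v^2\, \frac{C_4}{C_2}\, \rho \left( u_\gamma \delta_{\alpha\beta} + u_\beta \delta_{\alpha\gamma} + u_\alpha \delta_{\beta\gamma} \right)
\end{aligned} \tag{2.94}$$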
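Consequently, we obtain the dissipative lattice contributions
$$\frac{\tau}{2}\, \partial_\beta \left( \epsilon\, \partial_{t_1} \Pi^{(0)}_{\alpha\beta} + \partial_\gamma S^{(0)}_{\alpha\beta\gamma} \right) = \frac{\tau v^2}{2} \left[ \frac{C_4}{C_2}\, \nabla^2 (\rho u_\alpha) + \left( 2\frac{C_4}{C_2} - \frac{c_s^2}{v^2} \right) \partial_\alpha \mathrm{div} \rho\mathbf{u} \right] \tag{2.95}$$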
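Finally, after substitution of 2.95, 2.92 and 2.90 into equation 2.88, we obtain
$$\begin{aligned}
\partial_t \rho u_\alpha + \rho u_\beta \partial_\beta u_\alpha + u_\alpha\, \mathrm{div} \rho\mathbf{u} = {} & -c_s^2\, \partial_\alpha \rho + \tau v^2\, \frac{C_4}{C_2} \left( \xi - \frac{1}{2} \right) \nabla^2 \rho u_\alpha \\
& + \tau v^2 \left( \xi - \frac{1}{2} \right) \left( 2\frac{C_4}{C_2} - \frac{c_s^2}{v^2} \right) \partial_\alpha \mathrm{div} \rho\mathbf{u}
\end{aligned} \tag{2.96}$$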
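In the case of an incompressible fluid (at low Mach number, for instance) one has
$\mathrm{div} \rho\mathbf{u} = 0$ and one recovers the usual Navier-Stokes equation
$$\partial_t \mathbf{u} + (\mathbf{u} \cdot \nabla)\mathbf{u} = -\frac{1}{\rho} \nabla p + \nu_{lb} \nabla^2 \mathbf{u} \tag{2.97}$$
where $p = c_s^2 \rho$ is the scalar pressure and $\nu_{lb}$ is the kinematic viscosity
$$\nu_{lb} = \tau v^2\, \frac{C_4}{C_2} \left( \xi - \frac{1}{2} \right) \tag{2.98}$$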
As we see from this result, there are two free physical quantities in this model, $c_s^2$ and
$\xi$, and three parameters $C_0$, $C_2$ and $C_4$ depending on the specific lattice chosen for the
simulation.
Changing $c_s$ within acceptable limits will modify the sound speed (or the temperature,
since $p = c_s^2 \rho$). Clearly $c_s^2 < (C_2/C_0)\, v^2$, otherwise $a_0$ becomes negative.
Also, the relaxation time $\xi$ can be tuned to adjust the viscosity within some range.
We can see that when $\xi$ is small, relaxation to $f^{(0)}$ is fast and the viscosity is small. This
means that the collisions between the particles are quite effective at restoring the local
equilibrium.
However, $\xi$ cannot be made arbitrarily small since $\xi < 1/2$ would imply a negative
viscosity. In practice, more restrictions are expected, because the dissipation length
scale should be much larger than the lattice spacing. The value $\xi = 1/2$ yields numerical
instability and the smallest acceptable value depends on the velocity gradients.
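The two constraints discussed above are easy to evaluate for a given lattice. The following sketch computes the viscosity 2.98 and the upper bound on the squared sound speed; the function names are illustrative and the default constants are those quoted earlier for D2Q9.

```python
def lb_viscosity(xi, tau=1.0, v=1.0, c2=12.0, c4=4.0):
    """Kinematic viscosity nu_lb = tau v^2 (C4/C2) (xi - 1/2) of equ. 2.98."""
    return tau * v * v * (c4 / c2) * (xi - 0.5)


def max_sound_speed_sq(v=1.0, c0=20.0, c2=12.0):
    """Upper bound (C2/C0) v^2 on c_s^2, beyond which a_0 becomes negative."""
    return (c2 / c0) * v * v


nu = lb_viscosity(1.0)            # D2Q9 in lattice units with xi = 1
bound = max_sound_speed_sq()      # D2Q9 bound on c_s^2
```

For D2Q9 in lattice units, ξ = 1 gives a viscosity of 1/6, and any ξ below 1/2 gives a negative value, in line with the discussion above.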
Figure 2.30: Non-stationary flow past a plate obtained with the D2Q8 lattice Boltzmann
model. System size is 512 × 128, ξ = 1 and the entry speed is u∞ = 0.025.
From left to right and top to bottom, the figure shows the different stages of evolution.
Figure 2.30 illustrates the behavior of the LB fluid in a simulation of a flow past a
plate leading to a von Karman street pattern.
2.4.5
A short summary of LB models
This section summarizes the main findings of the above discussion, in order to highlight
the important steps necessary to implement an LB fluid simulation on a computer.
1.
The system is described in terms of z + 1 quantities fi(r, t) giving the probability
of presence of a particle entering lattice site r with velocity vi, at time t. The field f0
corresponds to a population of rest particles, with v0 = 0. The other possible velocities
vi depend on the lattice under consideration. Usually, one has slow and fast velocities.
The former ones have modulus $v = \lambda/\tau$ where $\lambda$ is the lattice spacing and $\tau$ the time
step. The modulus of the fast velocities is lattice-dependent.
The physical quantities are the density $\rho$ and velocity field $\mathbf{u}$ defined as
$$\rho = \sum_{i=0}^{z} m_i f_i \qquad\qquad \rho\mathbf{u} = \sum_{i=1}^{z} m_i f_i \mathbf{v}_i$$
where $m_0$ can be set to 1 without loss of generality, and the other $m_i$ are chosen so
as to ensure the isotropy of the fourth order tensor $\sum_{i=1}^{z} m_i v_{i\alpha} v_{i\beta} v_{i\gamma} v_{i\delta}$. Usually, the
value of $m_i$ depends on whether $\mathbf{v}_i$ is a fast or a slow velocity.
2.
For lattice sites not corresponding to a boundary of the system, the dynamics is
given by equ. 2.78
$$f_i(\mathbf{r}+\tau\mathbf{v}_i,\, t+\tau) = \frac{1}{\xi}\, f_i^{(0)}(\mathbf{r},t) + \left(1 - \frac{1}{\xi}\right) f_i(\mathbf{r},t) \tag{2.99}$$
For boundary sites, a no-slip condition is enforced by bouncing back the incoming fi.
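The local equilibrium distribution is given by equ. 2.87. From a numerical point of
view, it makes sense to compute directly $m_i f_i$ and $m_i f_i^{(0)}$
$$\begin{aligned}
m_i f_i^{(0)} &= m_i\, \rho \left[ \frac{1}{C_2}\frac{c_s^2}{v^2} + \frac{1}{C_2}\frac{\mathbf{v}_i\cdot\mathbf{u}}{v^2} + \frac{1}{2C_4 v^4}\left( v_{i\alpha} v_{i\beta} - \frac{C_4}{C_2}\, v^2 \delta_{\alpha\beta} \right) u_\alpha u_\beta \right] \\
m_0 f_0^{(0)} &= \rho \left[ 1 - \frac{C_0}{C_2}\frac{c_s^2}{v^2} + \left( \frac{C_0}{2C_2} - \frac{C_2}{2C_4} \right) \frac{u^2}{v^2} \right]
\end{aligned}$$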
where ρ and u are computed from the current values of fi, as explained in step 1.
3.
The coefficients C0, C2 and C4 are defined in equ. 2.80 and 2.81, i.e.
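$$\sum_{i=1}^{z} m_i = C_0 \qquad\qquad \sum_{i=1}^{z} m_i v_{i\alpha} v_{i\beta} = C_2 v^2 \delta_{\alpha\beta}$$
and
$$\sum_{i=1}^{z} m_i v_{i\alpha} v_{i\beta} v_{i\gamma} v_{i\delta} = C_4 v^4 \left(\delta_{\alpha\beta}\delta_{\gamma\delta} + \delta_{\alpha\gamma}\delta_{\beta\delta} + \delta_{\alpha\delta}\delta_{\beta\gamma}\right)$$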
These quantities are lattice dependent and are given in table 2.1 for some standard
lattices (see also [79] for a slightly different formulation). Note that there is some
arbitrariness in the choice of the mi. The important point is to keep the correct ratio
between the slow and fast masses. Then, if all $m_i$ with $i \ge 1$ are multiplied by the same
factor, $C_0$, $C_2$ and $C_4$ are modified proportionally and it is easy to check that $f_0^{(0)}$
and $m_i f_i^{(0)}$ are invariant under such a scaling of mass.
model    slow velocities       fast velocities             C0    C2    C4     geometry
D1Q3     |vi| = v, mi = 1      -                           2     2     2/3    linear lattice
D2Q9     |vi| = v, mi = 4      |vi| = √2 v, mi = 1         20    12    4      square lattice
D2Q7     |vi| = v, mi = 1      -                           6     3     3/4    hex lattice
D3Q15    |vi| = v, mi = 1      |vi| = √3 v, mi = 1/8       7     3     1      cubic lattice
D3Q19    |vi| = v, mi = 2      |vi| = √2 v, mi = 1         24    12    4      cubic lattice
Table 2.1: The geometrical coefficients necessary to compute the local equilibrium
distribution in a LB simulation.
The coefficient $\xi$ determines the fluid viscosity as
$$\nu_{lb} = \tau v^2\, \frac{C_4}{C_2} \left( \xi - \frac{1}{2} \right)$$
and $c_s$ can be tuned to select the sound speed. The maximal value is limited by
$$c_s^2 < (C_2/C_0)\, v^2$$
A commonly chosen value is $c_s^2 = v^2 (C_4/C_2)$.
Remember that numerical instabilities may develop in LB fluid models (see sec-
tion 2.4.6).
4.
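Up to order $O(u^2)$, the above numerical scheme solves the continuity equation
$$\partial_t \rho + \mathrm{div} \rho\mathbf{u} = 0$$
and Navier–Stokes equation
$$\begin{aligned}
\partial_t \rho u_\alpha + \rho u_\beta \partial_\beta u_\alpha + u_\alpha\, \mathrm{div} \rho\mathbf{u} = {} & -c_s^2\, \partial_\alpha \rho + \nu_{lb} \nabla^2 \rho u_\alpha \\
& + \tau v^2 \left( \xi - \frac{1}{2} \right) \left( 2\frac{C_4}{C_2} - \frac{c_s^2}{v^2} \right) \partial_\alpha \mathrm{div} \rho\mathbf{u}
\end{aligned} \tag{2.100}$$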
This equation simplifies when (u/cs) << 1 since, at low Mach number one may
assume that divρu = 0.
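The steps summarized above can be put together in a compact form. The sketch below performs one collision and propagation step of a D2Q9 BGK fluid on a small periodic lattice, in lattice units (v = 1, τ = 1) and storing the weighted populations m_i f_i directly. The data layout, the function names, the direction ordering and the periodic boundaries are assumptions of the example; the bounce-back rule for solid boundaries is not included.

```python
# Assumed ordering: rest particle, four axis directions, four diagonals.
V = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1), (1, 1), (-1, 1), (-1, -1), (1, -1)]
M = [1, 4, 4, 4, 4, 1, 1, 1, 1]
C0, C2, C4 = 20.0, 12.0, 4.0
CS2 = C4 / C2          # the commonly chosen value c_s^2 = v^2 (C4/C2)


def equilibrium(rho, ux, uy):
    """Weighted equilibrium populations m_i f_i^(0) of equ. 2.87."""
    u2 = ux * ux + uy * uy
    g0 = [rho * (1.0 - (C0 / C2) * CS2 + (C0 / (2 * C2) - C2 / (2 * C4)) * u2)]
    for (vx, vy), m in zip(V[1:], M[1:]):
        vu = vx * ux + vy * uy
        g0.append(m * rho * (CS2 / C2 + vu / C2
                             + (vu * vu - (C4 / C2) * u2) / (2 * C4)))
    return g0


def lb_step(g, xi):
    """One BGK collision followed by propagation; g[x][y][i] holds m_i f_i."""
    nx, ny = len(g), len(g[0])
    new = [[[0.0] * 9 for _ in range(ny)] for _ in range(nx)]
    for x in range(nx):
        for y in range(ny):
            site = g[x][y]
            rho = sum(site)                                   # step 1: rho, u
            ux = sum(gi * vx for gi, (vx, vy) in zip(site, V)) / rho
            uy = sum(gi * vy for gi, (vx, vy) in zip(site, V)) / rho
            g0 = equilibrium(rho, ux, uy)                     # step 2: f^(0)
            for i, (vx, vy) in enumerate(V):
                post = g0[i] / xi + (1.0 - 1.0 / xi) * site[i]
                new[(x + vx) % nx][(y + vy) % ny][i] = post   # propagation
    return new


def total_mass(g):
    return sum(sum(site) for column in g for site in column)
```

Because the equilibrium carries the same density and momentum as the populations it replaces, the total mass and momentum of the periodic system are unchanged by the update, up to rounding errors.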
2.4.6
Subgrid models
In order to achieve high Reynolds number, the viscosity should be made as small as
possible. A solution to overcome the problem of numerical instabilities that appear
when ξ ≈ 1/2 is to make the viscosity vary locally at each time step so as to self-
adjust to the flow pattern.
This is the main idea of what is called a subgrid model (a standard approach in
computational fluid dynamics). One assumes that an effective viscosity results from
the unresolved scales, that is the scale below the lattice spacing λ. Our goal is not
to give a theoretical discussion of subgrid models but, rather, to adopt a pragmatic
approach and show how to introduce eddy viscosity in a lattice Boltzmann model,
following the work of Hou et al. [88].
Instability appears where the magnitude of the local strain tensor
$$S_{\alpha\beta} = \frac{1}{2} \left( \partial_\beta u_\alpha + \partial_\alpha u_\beta \right)$$
is large. In the so-called Smagorinsky subgrid model, the relaxation time is modified
as
$$\xi' = \xi + 3\, C_{smago}^2\, \lambda^2\, |S|$$
where $\xi = 1/2$ and $C_{smago} > 0$ is the Smagorinsky constant (in practice, a parameter
of the model set for instance to 0.5).
The magnitude $|S|$ of the tensor $S_{\alpha\beta}$ can be computed locally, without taking
extra derivatives, just by considering the nonequilibrium momentum tensor
$\Pi^{(1)}_{\alpha\beta} = \sum_i v_{i\alpha} v_{i\beta} (f_i - f_i^{eq})$. Then, the quantity $|S|$ is directly obtained as
$$|S| = \frac{-\xi + \sqrt{\xi^2 + 18\, \lambda^2 C_{smago}^2 \sqrt{\Pi^{(1)}_{\alpha\beta} \Pi^{(1)}_{\alpha\beta}}}}{6\, \lambda^2 C_{smago}^2}$$
Therefore the bare viscosity $\nu$ is transformed into
$$\nu' = \nu + \nu_t$$
with $\nu_t$ the Smagorinsky eddy viscosity
$$\nu_t = C_{smago}^2\, \lambda^2\, |S|$$
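The local adjustment of the relaxation time can be sketched as follows. The code assumes that the closed form for |S| is the positive root of the quadratic equation obtained by combining the modified relaxation time with the nonequilibrium momentum tensor, with the norm of the tensor taken as the square root of the sum of its squared components (our reading of the expression above, in the spirit of Hou et al. [88]). The function name and the 2×2 list layout are illustrative.

```python
import math


def smagorinsky_relaxation(pi_neq, xi, c_smago=0.5, lam=1.0):
    """Return (|S|, xi') from the nonequilibrium momentum tensor at one site.

    pi_neq is a 2x2 nested list holding Pi^(1)_ab; c_smago must be > 0.
    """
    # Assumed norm: sqrt(Pi_ab Pi_ab), summed over both indices
    pi_norm = math.sqrt(sum(p * p for row in pi_neq for p in row))
    k = c_smago ** 2 * lam ** 2
    s = (-xi + math.sqrt(xi * xi + 18.0 * k * pi_norm)) / (6.0 * k)
    return s, xi + 3.0 * k * s


s_mag, xi_eff = smagorinsky_relaxation([[0.02, 0.01], [0.01, -0.02]], xi=0.55)
```

Where the flow is smooth the nonequilibrium tensor is small and the relaxation time stays close to its bare value; it increases only where the local strain is large.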
2.4.7
Pattern formation in snow transport
In this section, we apply the LB and CA methods to the problem of modeling the
formation of snowdrifts (see [89,90] for more details). The dynamics of the erosion,
transport and deposition of solid particles due to the action of a streaming fluid plays a crucial
role in sand dune formation, sedimentation problems and snow transport. This field
remains rather empirical compared to other domains of science and experts do not all
agree on the mechanisms involved in these processes. The CA and LB approaches give a
new and promising way to address these difficult problems.
Phenomenologically, snow transport has been divided into three main processes,
creeping, saltation and suspension, each corresponding to a different observation
scale [91]. Various patterns of accumulation can be observed [91,92], with quite
different characteristic sizes: they range from small ripples (oscillations of a few
centimeters over a flat surface) up to large wind slabs tens of meters long leeward of a
mountain crest.
The wind-snow model
In our approach, the wind is described with a D2Q9 LB model, using the subgrid
technique presented in the previous section in order to achieve large enough Reynolds
number flows. The Smagorinsky constant is varied from 0 to $C_\infty$ within the few lattice
sites above the ground profile (i.e. the top of the deposition layer) so as to ensure a log
profile for the wind velocity field $\mathbf{u}$ [89].
The snow transport is obtained by adding solid particles on top of the LB wind
model. We define Ni(r, t) ∈ {0, 1, 2, ..., ∞} as the number of particles entering site r
at time t with velocity vi. Snow particles can be injected in the simulation (snowfall) or
eroded from the ground, deposited and transported according to the combined effect
of gravity and local wind velocity. We consider the following rules to describe the
transport, erosion and deposition phenomena:
Transport: an arbitrary number of snow grains may reside at each lattice site.
During the updating step, they synchronously move to the nearest neighbor sites. Between
times $t$ and $t + \tau_s$, particles at site $\mathbf{r}$ should move to $\mathbf{r} + \tau_s \mathbf{w}$, where $\tau_s$ is the time
step associated with solid particle motion, $\mathbf{w} = \mathbf{u} + \mathbf{u}_{fall}$, with $\mathbf{u}$ the local wind speed
and $\mathbf{u}_{fall}$ the falling speed (accounting for gravity). Usually, $\mathbf{r} + \tau_s \mathbf{w}$ does not correspond
to a lattice node and the amount of grains that reach each neighbor is computed
according to the following randomized algorithm, which ensures that the average motion
is correct. One computes $p_i = \max\left(0, (\tau_s/\tau)(\mathbf{v}_i \cdot \mathbf{w})/|\mathbf{v}_i|^2\right)$, for $i = 1, 3, 5, 7$
(if $p_i > 0$, then $p_{i+4} = 0$, since $\mathbf{v}_i = -\mathbf{v}_{i+4}$). For efficiency, we choose $\tau_s \ge \tau$,
but small enough so that $p_i$ is always less than 1. Then, each particle jumps to site
$\mathbf{r} + \mu_1 \mathbf{v}_1 + \mu_3 \mathbf{v}_3 + \mu_5 \mathbf{v}_5 + \mu_7 \mathbf{v}_7$, where $\mu_i$ is a Boolean quantity which is 1 with probability
$p_i$. If $N = \sum_i N_i$ is large enough, this binomial scattering can be approximated by
a Gaussian distribution [93]. Note that in this algorithm, there is no attempt to include
specific rules for creeping, saltation or suspension.
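The randomized transport rule can be sketched for a single grain as follows, in lattice units (λ = τ = 1). The assignment of the indices 1, 3, 5, 7 to the east, north, west and south directions is an assumption read off figure 2.29, and the function names are illustrative.

```python
import random

# Assumed axis directions of the D2Q9 lattice (lattice units)
AXIS = {1: (1, 0), 3: (0, 1), 5: (-1, 0), 7: (0, -1)}


def jump_probabilities(w, tau_ratio=1.0):
    """p_i = max(0, (tau_s/tau) (v_i . w) / |v_i|^2) for i = 1, 3, 5, 7."""
    return {i: max(0.0, tau_ratio * (vx * w[0] + vy * w[1]) / (vx * vx + vy * vy))
            for i, (vx, vy) in AXIS.items()}


def move_particle(r, w, rng, tau_ratio=1.0):
    """Move one grain from site r; each axis jump is taken with probability p_i."""
    p = jump_probabilities(w, tau_ratio)
    x, y = r
    for i, (vx, vy) in AXIS.items():
        if rng.random() < p[i]:      # mu_i = 1 with probability p_i
            x, y = x + vx, y + vy
    return x, y


rng = random.Random(0)
probs = jump_probabilities((0.3, -0.2))
samples = [move_particle((0, 0), (0.3, -0.2), rng) for _ in range(20000)]
mean_dx = sum(x for x, y in samples) / len(samples)
mean_dy = sum(y for x, y in samples) / len(samples)
```

Since opposite directions are never both selected, a grain moves by at most one site along each axis, and the average displacement per step equals the drift velocity multiplied by the time step, as required.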
Deposition: lattice sites can be either solid (original landscape or deposited snow)
or free (air). Snow particles on a free site may "freeze" if the neighbor site $i$ they want
to jump to is a solid site: $N_{frozen} \to N_{frozen} + N_i$, $N_i \to 0$. When $N_{frozen}$ exceeds
some pre-assigned threshold $N_s$, the site becomes solid and subsequent incoming wind
particles will bounce back (hence defining a new ground profile). This threshold gives
a way to assign some size to the snow flakes. When a site solidifies, the wind particles
that may be present get trapped until erosion frees them again.
Erosion: deposited particles may be eroded under some conditions. For snow, the
erosion rate seems to be related to the wind speed above the solid site [94], the con-
centration of snow being transported, the saturation concentration and the efficiency
of the transport [91]. In our model, we express these mechanisms in a very simple
way: erosion means that each frozen snow particle is ejected upwards (N3 → N3 + 1,
Nfrozen → Nfrozen − 1) with probability p. When the local wind is fast enough, these
ejected snow particles will be transported. Otherwise they fall back and freeze again.
Figure 2.31: Deposition pattern of snow in a trench (0.7m×1.7m, lattice spacing
0.03m, Ns = 10, p = 0.04 and C∞ = 0.3). Snow particles are introduced on the
left corner of the simulation; profiles are shown every 1000 iterations. The experimen-
tal profiles measured by Kobayashi [95] (given every 1/2 hour for the first layers) are
sketched in the inset.
Deposition patterns
The above simple rules, when combined with the LB wind dynamics, are sufficient to
produce realistic deposition patterns at different space scales by varying the Smagorinsky
constant $C_\infty$, the threshold $N_s$ and the erosion probability $p$ [90].
Figure 2.31 illustrates the filling of a trench excavated in a large flat area. Good
qualitative agreement is observed between the model and reality [95], mainly for the
first part of the experiment (growth of two deposition peaks) before the wind has
slowed down in the outdoor experiment.
Figure 2.32 shows small scale patterns known as ripples, which occur with both sand
and snow transport. Ripples are mainly due to creeping transport. The ratio we find
between the height and the spacing of the oscillations (called the wave index) ranges
around 6; this value agrees with the lowest index found for sand [96] in field observations,
and fits well with wind tunnel experimental values [97] and with sand ripples in water [96].
Outdoor snow ripples are more complicated since freezing and cohesion have to be
taken into account; their wave index has been measured to be around 16 [91,98]. In
agreement with real observations, we also see in our simulation that ripples move horizontally.
This effect is illustrated in the figure. As observed in [99], our model also
shows that large ripples can be built through the merging of smaller ones, which travel
faster.
This model not only produces quantitatively realistic deposits, it also provides,
through simple and intuitive rules, a better understanding of the basic (and quite controversial)
mechanisms that occur in particle transport. Various patterns of deposition
result from the emergence of a collective effect rather than from mechanisms that have
not yet been identified. In a CA type of approach, creeping, saltation or suspension are
no longer three phenomena each requiring a special treatment: they are all captured
by the same erosion/transport mechanisms. Therefore, a unified view of the basic laws
governing the formation of particle deposition patterns is gained.
Figure 2.32: Formation of ripples, as obtained from our model. Particles are continuously
injected in the lower left corner of the simulation and the ripples grow spontaneously.
The deposition profile is given every 1000 time steps, which makes the
horizontal ripple motion quite clear (as well as the higher speed of the smaller ripples
"escaping" rightwards). The lattice spacing is around 0.03m, Ns = 10, p = 0.02 and
C∞ = 5.0.
2.5
Reaction-Diffusion systems
Diffusive phenomena and reaction processes play an important role in many areas of
physics, chemistry and biology and still constitute an active field of research. Systems
in which reactive particles are brought into contact by a diffusion process and
transform often give rise to very complex behaviors. Pattern formation [100,101] is a
typical example of such a behavior in reaction-diffusion processes.
In addition to a clear academic interest, reaction-diffusion phenomena are also
quite important in technical sciences and still constitute numerical challenges. As an
example, we may mention the famous problem of carbonation in concrete [102,103].
In many reaction-diffusion problems a particle based model, such as a lattice gas
dynamics, provides a useful approach and efficient numerical tool.
For instance, processes such as aggregation, formation of a diffusion front, trapping
of particles performing a random walk in some specific region of space [104,105], or
the adsorption of diffusing particles on a substrate [106] are important problems that
are difficult to solve with the standard diffusion equation. A microscopic model, based
on a cellular automata dynamics, is therefore of clear interest.
Reaction processes, as well as growth mechanisms, are most of the time nonlinear
phenomena, characterized by a threshold dynamics. While they are naturally implemented
in terms of a point-particle description, they may be very difficult to analyze
theoretically and even numerically, with standard techniques, due to the important role
that fluctuations may play. In the simplest cases, fluctuations are responsible for symmetry
breaking, which may produce interesting patterns, as we shall see later in this
section.
More surprisingly, microscopic fluctuations are sometimes relevant at a macroscopic
level of observation because they may induce an anomalous dynamics, as in
the A + A → 0 or A + B → 0 annihilation reactions [34,107]. These systems depart
from the behavior predicted by the classical approach based on differential equations
for the densities. The reason is that they are fluctuation-driven and that correlations
cannot be neglected. In other words, one has to deal with a full N-body problem and
the Boltzmann factorization assumption is not valid. For this kind of problem, a lattice
gas automata approach turns out to be very successful.
Cellular automata particles can be equipped with diffusive and reactive proper-
ties, in order to mimic real experiments and model several complex reaction-diffusion-
growth processes, in the same spirit as a cellular automata fluid simulates a fluid flow:
these systems are expected to retain the relevant aspects of the microscopic world
they are modeling. Diffusion can be obtained with the rule described in section 2.2.6.
Chemical reactions, such as A + B → C, are treated in an abstract way, as a particle
transformation phenomenon rather than a real chemical interaction.
Within the CA approach, there are two ways of modeling a spatially extended
system with local reactive interactions. The first one is to use a standard CA scheme:
each cell is updated according to the state of its neighbors. The second way is to
consider a lattice gas (LG) approach. As already mentioned, LG are a particular class
of cellular automata, characterized by a two-phase dynamics: first, a completely local
interaction on each lattice point, and then particle transport (or propagation) to nearest-
neighbor sites. This way of partitioning the space prevents the problem of having a
particle simultaneously involved in several different interactions.
Here we shall start the discussion with the first kind of model. Some reactive
phenomena can be nicely described by simple rules, without the space partitioning of
the LG paradigm. In section 2.5.1, we present a model of excitable media in which
chemical waves are observed and, in section 2.5.2, we shall see an example of a surface
reaction on a catalytic substrate.
Then, in section 2.5.3, we shall concentrate on the LG approach which is well
suited to represent many reaction-diffusion processes in terms of fictitious particles
evolving in a discrete universe. We shall first present the generic model for diffusion
with only one species of particles. The approach can be extended to the case where
several different chemical species coexist simultaneously on the same lattice and dif-
fuse. It just requires more bits of information to store the extra automaton states. Then,
it is easy to supplement the diffusion rule with the annihilation or creation of particles
of a different kind, depending on the species present at each lattice site and the reaction
rule under study.
The microdynamics will be given, as well as its link to macroscopic rate equa-
tions. The corresponding LB extension will be discussed too. As an illustration of the
method, an application to the formation of patterns of precipitate in a reaction-diffusion
process (the so-called Liesegang structures) will be presented.
Note that in section 2.6, we shall consider other reaction-diffusion processes, using
the multiparticle method. Other examples and applications can be found in [36].
2.5.1
Excitable media
Figure 2.33: Excitable medium: evolution of a stable initial configuration with 10% of
excited states φ = 1, for n = 10 and k = 3. The color black indicates resting states.
After a transient phase, the system sets up in a state where pairs of counter-rotating
spiral waves propagate. When the two extremities come into contact, a new, similar
pattern is produced. (Panels: t = 5, t = 110, t = 115, t = 120.)
An excitable medium is basically characterized by three states [36]: the resting
state, the excited state and the refractory state. The resting state is a stable state of the
system. But a resting state can respond to a local perturbation and become excited.
Then, the excited state evolves to a refractory state where it no longer influences its
neighbors and, finally, returns to the resting state.
A generic behavior of excitable media is to produce chemical waves of various
geometries [108,109]. Ring and spiral waves are typical patterns of excitation. Many
chemical systems exhibit an excitable behavior. The Selkov model [110] and the
Belousov–Zhabotinsky reaction are examples. Chemical waves play an important role
in many biological processes (nervous systems, muscles) since they can mediate the
transport of information from one place to another.
The Greenberg–Hastings model is an example of a cellular automata model of
an excitable medium. This rule, and its generalizations, have been extensively studied
[111,112].
The implementation we propose here for the Greenberg–Hastings model is the following:
the state φ(r, t) of site r at time t takes its value in the set {0, 1, 2, ..., n − 1}.
The state φ = 0 is the resting state. The states φ = 1, ..., n/2 (n is assumed to be even)
correspond to excited states. The rest, φ = n/2 + 1, ..., n − 1, are the refractory states.
The cellular automata evolution rule is the following:
1. If φ(r, t) is excited or refractory, then φ(r, t + 1) = φ(r, t) + 1 mod n.
2. If φ(r, t) = 0 (resting state) it remains so, unless there are at least k excited sites
in the Moore neighborhood of site r. In this case φ(r, t + 1) = 1.
The n states play the role of a clock: an excited state evolves through the sequence of
all possible states until it returns to 0, which corresponds to a stable situation.
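The rule described above can be sketched in a few lines. The periodic boundary conditions, the list-of-lists storage and the function name are assumptions made for the example.

```python
def gh_step(phi, n, k):
    """One synchronous update of the Greenberg-Hastings rule on a periodic grid."""
    rows, cols = len(phi), len(phi[0])
    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            state = phi[r][c]
            if state != 0:
                # Excited and refractory states advance like a clock, modulo n
                new[r][c] = (state + 1) % n
            else:
                # Count the excited sites (1 .. n/2) in the Moore neighborhood
                excited = sum(
                    1 <= phi[(r + dr) % rows][(c + dc) % cols] <= n // 2
                    for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                    if (dr, dc) != (0, 0))
                new[r][c] = 1 if excited >= k else 0
    return new


grid = [[0] * 5 for _ in range(5)]
grid[2][2] = 1                      # a single excited site
after = gh_step(grid, n=4, k=1)     # with k = 1, all eight neighbors get excited
```

The new configuration is built in a separate array so that all sites are updated from the same time step, as required for a cellular automaton.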
The behavior of this rule is quite sensitive to the value of n and the excitation
threshold k. Figures 2.33 and 2.34 show the evolution of this automaton for two dif-
ferent sets of parameters n and k. Both simulations are started with a uniform config-
uration of resting states, perturbed by some excited sites randomly distributed over the
system. If the concentration of perturbation is low enough, excitation dies out rapidly
and the system returns to the rest state. Increasing the number of perturbed states leads
to the formation of traveling waves and self-sustained oscillations may appear in the
form of ring or spiral waves.
Figure 2.34: Excitable medium: evolution of a configuration with 5% of excited states
φ = 1, and 95% of resting states (black), for n = 8 and k = 3. (Panels: t = 5, t = 20, t = 250.)
The Greenberg–Hastings model has some similarity with the "tube-worms" rule
proposed by Toffoli and Margolus [10]. This rule is intended to model the Belousov–Zhabotinsky
reaction and is as follows. The state of each site is either 0 (refractory) or
1 (excited) and a local timer (whose value is 3, 2, 1 or 0) controls the refractory period.
Each iteration of the rule can be expressed by the following sequence of operations:
(i) where the timer is zero, the state is excited; (ii) the timer is decreased by 1 unless
it is 0; (iii) a site becomes refractory whenever the timer is equal to 2; (iv) the timer is
reset to 3 for the excited sites which have two, or more than four, excited sites in their
Moore neighborhood.
Figure 2.35: The tube-worms rule for an excitable medium.
Figure 2.35 shows a simulation of this automaton, starting from a random initial
configuration of the timers and the excited states. We observe the formation of spiral
pairs of excitations. Note that this rule is very sensitive to small modifications (in
particular to the order of operations (i) to (iv)).
Another rule which is also similar to the Greenberg–Hastings and Margolus–Toffoli
tube-worms models is the so-called forest-fire model. This rule describes the propagation
of a fire or, in a different context, may also be used to mimic contagion in the case
of an epidemic. Here we describe the case of a forest-fire rule.
The forest-fire rule is a probabilistic CA defined on a d-dimensional hypercubic
lattice. Initially, each site is occupied by a tree or a burning tree, or is empty. The
state of the system is updated in parallel according to the following rule: (1) a burning
tree becomes an empty site; (2) a green tree becomes a burning tree if at least one of
its nearest neighbors is burning; (3) at an empty site, a tree grows with probability p;
(4) a tree without a burning nearest neighbor becomes a burning tree during one time
step with probability f (lightning).
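The four steps of the forest-fire rule translate directly into code. The sketch below treats the two-dimensional case with periodic boundaries and the four nearest neighbors; the state encoding and the function name are assumptions made for the example.

```python
import random

EMPTY, TREE, FIRE = 0, 1, 2          # assumed encoding of the three states
NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def forest_fire_step(grid, p, f, rng):
    """One parallel update of the forest-fire rule on a periodic 2D lattice."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            state = grid[r][c]
            if state == FIRE:
                new[r][c] = EMPTY                       # rule (1)
            elif state == TREE:
                near_fire = any(
                    grid[(r + dr) % rows][(c + dc) % cols] == FIRE
                    for dr, dc in NEIGHBORS)
                # rule (2), or rule (4) when no neighbor is burning
                if near_fire or rng.random() < f:
                    new[r][c] = FIRE
            elif rng.random() < p:
                new[r][c] = TREE                        # rule (3)
    return new


rng = random.Random(1)
forest = [[TREE] * 5 for _ in range(5)]
forest[2][2] = FIRE
spread = forest_fire_step(forest, p=0.0, f=0.0, rng=rng)
```

With p = 0 and f = 0 the update is deterministic: the burning site becomes empty and the fire spreads to its four nearest neighbors.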
Figure 2.36 illustrates the behavior of this rule, in a two-dimensional situation.
Provided that the time scales of tree growth and burning down of forest clusters are
well separated (i.e. in the limit f/p → 0), this model has self-organized critical
states [113]. This means that in the steady state, several physical quantities characterizing
the system have a power law behavior. For example, the cluster size distribution
N(s) and radius of a forest cluster R(s) vary with the number of trees s in the forest
cluster as $N(s) \sim s^{-\tau}\, \mathcal{C}(s/s_{max})$ and $R(s) \sim s^{1/\mu}\, \mathcal{S}(s/s_{max})$. Scaling relations can be
established between the critical exponents $\tau$ and $\mu$, and the scaling functions $\mathcal{C}$ and $\mathcal{S}$
can be computed.
Figure 2.36: The forest fire rule: grey sites correspond to a grown tree, black pixels
represent burned sites and the white color indicates a burning tree. The snapshot given
here represents the situation after a few hundred iterations. The parameters of the rule
are p = 0.3 and f = 6 × 10^{-5}.
2.5.2 Surface reaction models
Nonequilibrium phase transitions are an important topic in physics. The
situation is not as clear as it is for equilibrium systems and no general theory is available
to describe such systems. Most of the known results are based on numerical
simulations. However, as is the case for equilibrium systems, the concept of universality
classes appears to be relevant, although we do not completely understand how the
universality classes are characterized.
In this section, we discuss the case of a nonequilibrium phase transition in a simple
model of reaction on a catalytic surface. The system is out of equilibrium because it
is an open system in which material continuously flows in and out. However, after a
while, it reaches a stationary state and, depending on some control parameters, may be
in different phases.
The system we shall consider is the so-called Ziff model [114]. This model is
based upon some of the known steps of the reaction A − B2 on a catalyst surface (for
example CO − O2). The basic steps are
• A gas mixture with concentrations XB2 of B2 and XA of A sits above the surface
and can be adsorbed. The surface can be divided into elementary cells. Each cell
can adsorb one atom only.
• The B species can be adsorbed only in the atomic form. A molecule B2 dissociates
into two B atoms only if two adjacent cells are empty. Otherwise the B2
molecule is rejected. The first two steps correspond to the reactions
$$A \to A(\mathrm{ads}), \qquad B_2 \to 2B(\mathrm{ads}) \tag{2.101}$$
• If two nearest neighbor cells are occupied by different species they chemically
react according to the reaction
$$A(\mathrm{ads}) + B(\mathrm{ads}) \to AB(\mathrm{desorb}) \tag{2.102}$$
and the product of the reaction is desorbed. In the example of the CO − O2
reaction, the desorbed product is a CO2 molecule.
This final desorption step is necessary for the product to be recovered and for the cata-
lyst to be regenerated. However, the gas above the surface is assumed to be continually
replenished by fresh material so that its composition remains constant during the whole
evolution.
It is found by sequential numerical simulation [114] that a reactive steady state
occurs only in a window defined by
$$X_1 < X_A < X_2$$
where X1 = 0.389 ± 0.005 and X2 = 0.525 ± 0.001 (provided that XB2 = 1 − XA). This
situation is illustrated in figure 2.37, though for the corresponding cellular automata
dynamics and XB2 ≠ 1 − XA.
Outside this window, the steady state is a “poisoned” catalyst of pure A (XA > X2)
or pure B (XA < X1). For XA > X1, the coverage fraction varies continuously with
XA and one speaks of a continuous (or second-order) nonequilibrium phase transition.
At XA = X2, the coverage fraction varies discontinuously with XA and one speaks
of a discontinuous (or first-order) nonequilibrium phase transition. The asymmetry
of behavior at X1 and X2 comes from the fact that A and B atoms have a different
adsorption rule: two vacant adjacent sites are necessary for B to stick on the surface,
whereas one empty site is enough for A.
From a physical point of view, the dynamics of such a system is not sequential
since many cells can be reacting simultaneously, within a given small time interval.
A parallel, asynchronous dynamics would then be a more realistic updating scheme.
However, it is interesting to study the Ziff model with a fully parallel, synchronous
cellular automata dynamics [115], which represents the other limiting case.
In a CA approach the elementary cells of the catalyst are mapped onto the cells of
the automaton. In order to model the different processes, each cell j can be in one of
four different states, denoted |ψj⟩ = |0⟩, |A⟩, |B⟩ or |C⟩.
The state |0⟩ corresponds to an empty cell, |A⟩ to a cell occupied by an atom A,
and |B⟩ to a cell occupied by an atom B. The state |C⟩ is artificial and represents a
precursor state describing the conditional occupation of the cell by an atom B. Conditional
means that during the next evolution step of the automaton, |C⟩ will become
|B⟩ or |0⟩ depending on whether a nearest neighbor cell is empty and ready to
receive the second B atom of the molecule B2. This conditional state is necessary to
describe the dissociation of B2 molecules on the surface.
Figure 2.37: Typical microscopic configuration in the stationary state of the CA Ziff
model, where there is coexistence of the two species. The simulation corresponds to
the generalized model described by rules R1, R2, R3 and R4 below. The gray and black
dots represent, respectively, the A and B particles, while the empty sites are white. The
control parameter XA is larger in the right image than it is in the left one.
The main difficulty when implementing the Ziff model with a fully synchronous
updating scheme is to ensure that the correct stoichiometry is obeyed. Indeed, since
all atoms take a decision at the same time, the same atom could well take part in a
reaction with several different neighbors, unless some care is taken.
The solution to this problem is to add a vector field to every site in the lattice [116],
as shown in figure 2.38. A vector field is a collection of arrows, one at each lattice site,
that can point in any of the four directions of the lattice. The directions of the arrows
at each time step are assigned randomly. Thus, a two-site process is carried out only
on those pairs of sites in which the arrows point toward each other (matching nearest-
neighbor pairs (MNN)). This concept of reacting matching pairs is a general way to
partition the parallel computation in local parts.
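The matching-pair construction can be sketched as follows. The direction encoding (0 to 3 for east, north, west, south), the function name and the periodic boundaries are assumptions of this example; the only ingredient taken from the text is that a pair is selected when the two arrows point toward each other.

```python
import numpy as np

# (row, col) offsets of the four lattice directions; direction (k + 2) % 4 is opposite to k.
OFFSETS = [(0, 1), (-1, 0), (0, -1), (1, 0)]  # east, north, west, south

def matching_pairs(arrows):
    """matched[k, r, c] is True when site (r, c) points in direction k
    and the neighbor it points to points back at it."""
    matched = np.zeros((4,) + arrows.shape, dtype=bool)
    for k, (dr, dc) in enumerate(OFFSETS):
        # Arrow carried by the neighbor at (r + dr, c + dc), periodic boundaries.
        neighbor_arrow = np.roll(arrows, shift=(-dr, -dc), axis=(0, 1))
        matched[k] = (arrows == k) & (neighbor_arrow == (k + 2) % 4)
    return matched

# Usage: draw one random arrow per site, then find the matching pairs.
rng = np.random.default_rng(0)
arrows = rng.integers(0, 4, size=(8, 8))
matched = matching_pairs(arrows)
```

Because every site carries a single arrow, a site can belong to at most one matching pair, which is what protects the stoichiometry in a synchronous update.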
In the present implementation, the following generalization of the dynamics is included:
an empty site remains empty with some probability. One then has two control
parameters to play with: XA and XB2, which are the arrival probabilities of an A atom and a B2
molecule, respectively.
Thus, the time evolution of the CA is given by the following set of rules, fixing the
state of the cell j at time t + 1, |ψj(t + 1)⟩, as a function of the state of the cell j and
its nearest neighbors (von Neumann neighborhood) at time t. Rules R1 and R4 describe
the adsorption–dissociation mechanism while rules R2 and R3 (illustrated in figure 2.38)
describe the reaction–desorption process.
Figure 2.38: Illustration of rules R2 and R3. The arrows select which neighbor is
considered for a reaction. Dark and white particles represent the A and B species,
respectively. The shaded region corresponds to cells that are not relevant to the present
discussion such as, for instance, cells occupied by the intermediate C species.
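$$\text{R1: if } |\psi_j(t)\rangle = |0\rangle \text{ then } |\psi_j(t+1)\rangle = \begin{cases} |A\rangle & \text{with probability } X_A \\ |C\rangle & \text{with probability } X_{B_2} \\ |0\rangle & \text{with probability } 1 - X_A - X_{B_2} \end{cases} \tag{2.103}$$
$$\text{R2: if } |\psi_j(t)\rangle = |A\rangle \text{ then } |\psi_j(t+1)\rangle = \begin{cases} |0\rangle & \text{if the MNN of } j \text{ was in the state } |B\rangle \text{ at time } t \\ |A\rangle & \text{otherwise} \end{cases} \tag{2.104}$$
$$\text{R3: if } |\psi_j(t)\rangle = |B\rangle \text{ then } |\psi_j(t+1)\rangle = \begin{cases} |0\rangle & \text{if the MNN of } j \text{ was in the state } |A\rangle \text{ at time } t \\ |B\rangle & \text{otherwise} \end{cases} \tag{2.105}$$
$$\text{R4: if } |\psi_j(t)\rangle = |C\rangle \text{ then } |\psi_j(t+1)\rangle = \begin{cases} |B\rangle & \text{if the MNN is in the state } |C\rangle \text{ at time } t \\ |0\rangle & \text{otherwise} \end{cases} \tag{2.106}$$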
In addition, equation 2.106 is supplemented by the following rule: a cell in the interme-
diate state C will give two adjacent B atoms if its matching arrow points to an empty
site which is not pointed to by another C state. Rule R4 is illustrated in figure 2.39.
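As a partial illustration, rules R1 to R3 can be sketched with the matching nearest-neighbor construction. This is not the full model: the dissociation rule R4 and its supplement are deliberately left out (cells in state C are simply left untouched), and the state encoding, function names and periodic boundaries are assumptions of the example.

```python
import numpy as np

# Hypothetical integer encoding of the four cell states.
EMPTY, A, B, C = 0, 1, 2, 3
OFFSETS = [(0, 1), (-1, 0), (0, -1), (1, 0)]  # east, north, west, south

def mnn_state(state, arrows):
    """State of each site's matching nearest neighbor, or -1 if the site is unmatched."""
    partner = np.full(state.shape, -1)
    for k, (dr, dc) in enumerate(OFFSETS):
        nb_state = np.roll(state, (-dr, -dc), axis=(0, 1))
        nb_arrow = np.roll(arrows, (-dr, -dc), axis=(0, 1))
        match = (arrows == k) & (nb_arrow == (k + 2) % 4)
        partner[match] = nb_state[match]
    return partner

def ziff_step_r1_r3(state, x_a, x_b2, rng):
    """Synchronous update applying R1, R2 and R3 only (R4 is omitted in this sketch)."""
    arrows = rng.integers(0, 4, size=state.shape)  # fresh random arrows at each step
    partner = mnn_state(state, arrows)
    u = rng.random(state.shape)
    new = state.copy()
    empty = state == EMPTY
    new[empty & (u < x_a)] = A                         # R1: adsorption of A
    new[empty & (u >= x_a) & (u < x_a + x_b2)] = C     # R1: precursor state for B2
    new[(state == A) & (partner == B)] = EMPTY         # R2: A reacts with its MNN
    new[(state == B) & (partner == A)] = EMPTY         # R3: B reacts with its MNN
    return new
```

Since matching is symmetric, every A removed by R2 is paired with exactly one B removed by R3, so the reaction consumes the two species one for one.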
Figure 2.37 shows typical stationary configurations obtained with a cellular automata
version of the Ziff model. At time t = 0, all the cells are empty and a randomly
prepared mixture of gases with fixed concentrations XA and XB2 sits on top of the
surface. The rules are iterated until a stationary state is reached. The stationary state
is a state for which the mean coverage fractions $X^a_A$ and $X^a_B$ of atoms of type A or
B do not change in time, although microscopically the configurations of the surface
change.
Figure 2.39: Dissociation rule R4. The B2 molecule (or C state) is represented as two
disks on top of each other. Dissociation is possible if the upper disk can move to the
site indicated by the arrow without conflict with other moves.
The phase diagram obtained for this generalized CA Ziff model is given in figure 2.40,
with the value XB2 = 0.1. This phase diagram is topologically similar to
the sequential updating case (with XB2 = 1 − XA) since we observe a first and a second
order transition surrounding a region of coexistence of both species. However the
locations of the critical points are different, illustrating the nonuniversal character of
these quantities.
Figure 2.40: Stationary state phase diagram corresponding to the CA Ziff model
(A coverage and B coverage, between 0 and 1, versus XA, for XB2 = 0.1).
2.5.3 The reaction-diffusion rule
In this section we shall introduce an LGA model for reaction-diffusion processes. Our
model will be very similar in spirit to the cellular automata fluids discussed in section 2.3
except that, here, the collision rule will reproduce a diffusive behavior and
implement some particle transformations. We shall first discuss the diffusion rule and
then show how a reaction term can be included.
The diffusion rule
At a microscopic level of description, a diffusive phenomenon corresponds to the random
walk of many particles. Particle number is conserved but not momentum. This
random motion is typically due to the properties of the environment the particles are
moving in. When one is not interested in an explicit description of this environment, it
can be considered as a source of thermal noise and its effective action on the particles
can be assumed to be stochastic. Thus, the CA rule proposed in section 2.2.6 gives us
the basic model for diffusion.
This evolution rule requires random numbers and therefore corresponds to a probabilistic
cellular automaton.
Thus, our diffusion model consists of particles moving along the main directions
of a hypercubic lattice (a square lattice in two dimensions or a cubic lattice in three
dimensions). As opposed to cellular automata fluids, we do not have to consider here
more complicated lattices. The reason is that diffusion processes do not require a
fourth-order tensor for their description. The random motion is obtained by permuting
the direction of the incoming particles. If d is the space dimension, there are 2d lattice
directions. These 2d directions of motion can be shuffled in (2d)! ways, which is the
number of permutations of 2d objects. However, it is not necessary to consider all
permutations. A subset of them is enough to produce the desired random motion and,
as in section 2.2.6, we restrict ourselves to cyclic permutations. Thus, at each time
step, the directions of the lattice are “rotated” by an angle αi chosen at random, with
probability pi, independently for each site of the lattice. With this mechanism, the
direction a particle will exit a given site depends on the direction it had when entering
the site. The modification of its velocity determines its next location on the lattice.
By labeling the lattice directions with the unit vectors ci we can introduce the
occupation numbers ni(r) defined as the number of particles entering the site r, at
time t with a velocity pointing in direction ci.
With this notation, the CA rule governing the dynamics of our model reads
$$n_i(r + \lambda c_i, t + \tau) = \sum_{\ell=0}^{2d-1} \mu_\ell(r, t)\, n_{i+\ell}(r, t) \tag{2.107}$$
where i + ℓ is wrapped onto {1, 2, ..., 2d}. The µℓ ∈ {0, 1} are Boolean variables which
select only one of the 2d terms in the right-hand side. Therefore they must obey the
condition
$$\sum_{\ell=0}^{2d-1} \mu_\ell = 1 \tag{2.108}$$
Practically, this condition can be enforced in a simulation by dividing the interval [0,1]
into 2d bins of length pℓ, each assigned to one of the µℓ. Then, at each lattice site and
each time step, a real random number between 0 and 1 is computed (with a random
number generator). The bin it falls in determines which µℓ is the one that will
be non-zero. This rule is illustrated in figure 2.15 for the case of a two-dimensional
system.
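A minimal sketch of rule 2.107 with the bin construction, for a two-dimensional square lattice, could look as follows. The array layout, the 0-based direction labels (0 to 3 instead of 1 to 2d), the function name and the periodic boundaries are assumptions of this example.

```python
import numpy as np

# Lattice directions c_i as (row, col) shifts: east, north, west, south.
DIRS = [(0, 1), (-1, 0), (0, -1), (1, 0)]

def diffusion_step(n, probs, rng):
    """n[i] is the Boolean field of particles entering each site along direction i;
    probs[l] is the probability p_l that mu_l = 1 (rotation by l lattice directions)."""
    # Divide [0, 1] into bins of length p_l and draw one random number per site.
    edges = np.cumsum(probs)
    u = rng.random(n.shape[1:])
    # Index l of the bin containing u; the clip guards against round-off in the last edge.
    ell = np.minimum(np.searchsorted(edges, u, side="right"), 3)
    out = np.empty_like(n)
    for i in range(4):
        # Collision: direction i receives the particle that entered along i + l (mod 4).
        src = (i + ell) % 4
        rotated = np.take_along_axis(n, src[None, :, :], axis=0)[0]
        # Propagation: move to the neighbor in direction c_i (periodic boundaries).
        out[i] = np.roll(rotated, DIRS[i], axis=(0, 1))
    return out

# Usage: random initial occupation, equal probabilities for the four rotations.
rng = np.random.default_rng(0)
n = rng.random((4, 16, 16)) < 0.3
n_next = diffusion_step(n, [0.25, 0.25, 0.25, 0.25], rng)
```

For a fixed ℓ the map i → i + ℓ is a cyclic permutation of the directions, so the number of particles at each site is unchanged by the collision, as required for a diffusive process.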
The macroscopic behavior resulting from microdynamics 2.107 in the limit of infinitely
small lattice spacing λ and time step τ can be obtained with the same techniques
as developed in section 2.3, namely the multiscale Chapman-Enskog expansion [14].
Since the dynamics is linear, a more direct calculation is also possible if the limit is
taken in such a way that λ²/τ remains constant.
As expected, the result is that the quantity $\rho = \sum_{i=1}^{2d} \langle n_i \rangle$, where $\langle n_i \rangle$ is
the average occupation number at site r and time t, obeys the diffusion equation [14]
$$\partial_t \rho + \mathrm{div}\,[-D\, \mathrm{grad}\, \rho] = 0$$
where D is the diffusion constant whose expression, in a two-dimensional square lattice,
is
$$D = \frac{\lambda^2}{\tau}\left(\frac{1}{4(p + p_2)} - \frac{1}{4}\right) = \frac{\lambda^2}{\tau}\, \frac{p + p_0}{4[1 - (p + p_0)]} \tag{2.109}$$
For the one- and three-dimensional cases, a similar expression can be found [14].
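The equality between the two forms of equation 2.109 can be checked in one line. The step below assumes that, as in section 2.2.6, p0 and p2 are the probabilities of a rotation by 0° and 180°, that p is the common probability of a rotation by +90° or −90°, and hence that p0 + 2p + p2 = 1; this normalization is not restated in the present section.

```latex
% Assumption: p_0 + 2p + p_2 = 1, i.e. p + p_2 = 1 - (p + p_0).
\frac{1}{4(p + p_2)} - \frac{1}{4}
  = \frac{1 - (p + p_2)}{4(p + p_2)}
  = \frac{p + p_0}{4\,[1 - (p + p_0)]}
```

In particular, for equal probabilities p0 = p = p2 = 1/4 both forms give D = λ²/(4τ).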
Lattice Boltzmann diffusion rule
If, instead of Boolean variables, the diffusion process is described in terms of the
probability of presence fi(r, t) of a particle entering site r at time t along direction
i, the diffusion rule can be written down using the LB (lattice Boltzmann) formalism
introduced in section 2.4.
The evolution rule takes the form
$$f_i(r + \tau v_i, t + \tau) = \frac{1}{\xi} f_i^{(0)}(r, t) + \left(1 - \frac{1}{\xi}\right) f_i(r, t)$$
where $f_i^{(0)}$ is the local equilibrium distribution and ξ the relaxation time. Since the
only conserved quantity in a diffusive process is the particle number $\rho = \sum_{i=1}^{2d} f_i$, we
choose
$$f_i^{(0)} = \frac{1}{2d}\, \rho$$
so that (i) ρ is indeed conserved and (ii) the local equilibrium depends on r and t only
through the conserved quantities.
Thus, the evolution rule can be rewritten as
$$f_i(r + \tau v_i, t + \tau) = \left[1 - \frac{1}{\xi}\left(1 - \frac{1}{2d}\right)\right] f_i(r, t) + \frac{1}{2d\xi} \sum_{j \neq i} f_j(r, t)$$
This is equivalent to the lattice Boltzmann equation associated with the diffusive CA
having the probabilities of rotation
$$p_0 = 1 - \frac{1}{\xi}\left(1 - \frac{1}{2d}\right), \qquad p_j = \frac{1}{2d\xi} \quad (j \neq 0)$$
For a two-dimensional square lattice and according to equation 2.109, these values of
pi correspond to a diffusion constant
$$D = \frac{1}{2}\left(\xi - \frac{1}{2}\right) \frac{\lambda^2}{\tau}$$
From this, we conclude that ξ ≥ 1/2, otherwise D becomes negative. However, from
the expression for p0, we see also that ξ ≥ 1 − 1/(2d) if we want to interpret p0 as a
probability. Thus, in two dimensions, the situation 1/2 < ξ < 3/4 does not correspond
to a CA realization. Yet, the CA model can have D = 0 in a different way since it does
not require all the pi other than p0 to be equal. This also shows that the numerical behaviour
of the LB scheme must be checked in more detail when 1/2 < ξ < 3/4. Finally,
notice that too large a value of ξ may yield an anisotropic behavior because it favors
the lattice axes too strongly.
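The relation between ξ and D can be checked numerically with a short sketch, here in lattice units λ = τ = 1 on a two-dimensional square lattice. The function names, the periodic grid, the point-source initial condition and the use of the mean square displacement (which grows as 4Dt for two-dimensional diffusion) are choices made for this example.

```python
import numpy as np

# Lattice velocities v_i as (row, col) shifts: east, north, west, south.
SHIFTS = [(0, 1), (-1, 0), (0, -1), (1, 0)]

def lb_diffusion_step(f, xi):
    """Relax f_i toward f_i^(0) = rho / 4 with relaxation time xi, then propagate."""
    rho = f.sum(axis=0)
    post = rho / (4.0 * xi) + (1.0 - 1.0 / xi) * f
    return np.stack([np.roll(post[i], SHIFTS[i], axis=(0, 1)) for i in range(4)])

def mean_square_displacement(f):
    """Density-weighted mean square distance from the center of the grid."""
    rho = f.sum(axis=0)
    rows, cols = np.indices(rho.shape)
    r2 = (rows - rho.shape[0] // 2) ** 2 + (cols - rho.shape[1] // 2) ** 2
    return (rho * r2).sum() / rho.sum()

xi = 2.0
size, steps = 129, 60            # the grid is large enough that nothing wraps around
f = np.zeros((4, size, size))
f[:, size // 2, size // 2] = 0.25  # unit mass at the center, isotropic in direction
msd = []
for _ in range(steps):
    f = lb_diffusion_step(f, xi)
    msd.append(mean_square_displacement(f))

# In two dimensions the mean square displacement grows as 4 D t at long times.
d_measured = (msd[-1] - msd[39]) / 20 / 4   # slope between steps 40 and 60
d_predicted = 0.5 * (xi - 0.5)              # D = (xi - 1/2) / 2 with lambda = tau = 1
print(d_measured, d_predicted)
```

The slope is measured after an initial transient because, for ξ ≠ 1, a particle keeps a memory of its direction over a few time steps.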
LB diffusion in polar coordinates
The models presented so far (whether hydrodynamical or diffusive) require a regular
lattice to be defined properly. There is a clear interest in relaxing this limitation and allowing
“body-fitted” meshes that can be adapted to a given geometry of boundaries. This
problem is still an active field of research [82,117].
Here we simply present a way to define an LB model in polar coordinates, assuming
that the system has an angular symmetry. Thus, the variables fi depend only on the
distance r to the center of the system. We shall also assume that the system is described
by an underlying lattice dynamics, independent of the space discretization given by the
polar coordinates.
We want to compute how many particles enter a polar cell located at distance r.
Particles traveling toward larger values of r are described by the quantity f1(r, t),
whereas particles moving to the center of the system are described by f2(r, t). Due
to the angular symmetry, there is no need to consider other directions of motion.
In the case of a diffusive system, the populations f1 and f2 are mixed according to
$f_1' = p f_1 + (1 - p) f_2$ and $f_2' = p f_2 + (1 - p) f_1$.
The particles entering cell r + dr in the positive direction are those
exiting cell r after the diffusion step. The density of such particles is given by $f_1'$.
Since we work in polar coordinates, the cross section of cell r is $\sigma r^{d-1}$, where σ is
some constant and d the space dimensionality. Therefore, there are $\sigma r^{d-1} f_1'$ particles
moving from cell r to cell r + dr. Since the cross section of cell r + dr is $\sigma (r + dr)^{d-1}$,
the density f1(r + dr, t + τ) is defined by the balance equation
$$\sigma (r + dr)^{d-1} f_1(r + dr, t + \tau) = \sigma r^{d-1} f_1'$$
A similar derivation holds for f2(r − dr, t + τ). Thus, for a diffusion process, we obtain
$$f_1(r + dr, t + \tau) = \left(\frac{r}{r + dr}\right)^{d-1} [p f_1 + (1 - p) f_2]$$
$$f_2(r - dr, t + \tau) = \left(\frac{r}{r - dr}\right)^{d-1} [p f_2 + (1 - p) f_1] \tag{2.110}$$
Therefore, the effect of the polar coordinate system is to modify the propagation
scheme. It can be checked that numerical simulations of eq. 2.110, with fixed boundary
conditions at r = r0 and r = r1, converge to the corresponding solution of the Laplace
equation in polar coordinates.
The reaction rule
In this section we add a reaction term on top of the diffusion rule described in the
previous section. Our aim is to simulate processes such as
$$A + B \xrightarrow{K} C \tag{2.111}$$
where A, B and C are different chemical species, all diffusing in the same solvent, and
K is the reaction constant. To account for this reaction, one can consider the following
mechanism: at the “microscopic” level of the discrete lattice dynamics, all the three
species are first governed by a diffusion rule. When an A and a B particle enter the
same site at the same time, they disappear and form a C particle.
Of course, there are several ways to select the events that will produce a C when
more than one A or one B are simultaneously present at a given site. Also, when Cs
already exist at this site, the exclusion principle may prevent the formation of new
ones. A simple choice is to have A and B react only when they perform a head-on
collision and when no Cs are present in the perpendicular directions. Other rules can
be considered if we want to enhance the reaction (make it more likely) or to deal with
more complex situations (2A + B → C, for instance).
A parameter k can be introduced to tune the reaction rate K by controlling the
probability of a reaction taking place.
In order to write down the microdynamic equation of this process, we shall denote
by ai(r, t), bi(r, t) and ci(r, t) ∈ {0, 1} the presence or absence of a particle of type A,
B or C, entering site r at time t pointing in lattice direction i.
We shall assume that the reaction process takes place first. Then, the left-over
particles, or the newly created ones, are randomly deflected according to the diffusion
rule. Thus, using equation 2.107, we can write the reaction-diffusion microdynamics
as (d is the dimensionality of the Cartesian lattice)
$$a_i(r + \lambda e_i, t + \tau) = \sum_{\ell=0}^{2d-1} \mu_\ell(r, t) \left[ a_{i+\ell}(r, t) + R^a_{i+\ell}(a, b, c) \right] \tag{2.112}$$
and similarly for the two other species B and C.
As before, the µℓ(r, t) are independent random Boolean variables producing the
direction shuffling. The lattice spacing λ and time step τ are introduced as usual
and the lattice directions ei are defined as east, north, west and south, in the case of a
two-dimensional lattice.
The quantity $R^a_j(a, b, c)$ is the reaction term: it describes the creation or the annihilation
of an A particle in the direction j, due to the presence of the other species. In
the case of an annihilation process, the reaction term takes the value $R^a_j = -1$ so that
$a_j + R^a_j = 0$. On the other hand, when a creation process takes place, $a_j = 0$ and
$R^a_j = 1$. When no interaction occurs, $R^a_j = 0$.
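For instance, in the case of the reaction 2.111 (illustrated in figure 2.41), the reaction
terms could be written as
$$\begin{aligned}
R^a_i &= -\kappa\, a_i b_{i+2} \left[ \nu (1 - c_{i+1}) + (1 - \nu)(1 - c_{i-1}) \right] \\
R^b_i &= R^a_{i+2} \\
R^c_i &= \kappa (1 - c_i) \left[ \nu\, a_{i-1} b_{i+1} + (1 - \nu)\, a_{i+1} b_{i-1} \right]
\end{aligned} \tag{2.113}$$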
Figure 2.41: Automata implementation of the A + B → C reaction process, shown for
ν = 1 and ν = 0.
$R^a_i$ and $R^b_i$ are annihilation operators, whereas $R^c_i$ corresponds to particle creation.
One can easily check that, for each A (or B) particle which disappears, a C particle is
created. That is,
$$\sum_{i=1}^{2d} R^a_i = \sum_{i=1}^{2d} R^b_i = -\sum_{i=1}^{2d} R^c_i$$
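The reaction terms 2.113 and the particle balance above can be checked with a short sketch for the two-dimensional case (2d = 4). The array layout (direction index first, labelled 0 to 3), the function names and the use of integer 0/1 arrays are assumptions of this example.

```python
import numpy as np

def shifted(x, s):
    """shifted(x, s)[i] == x[(i + s) % 4] along the direction axis."""
    return np.roll(x, -s, axis=0)

def reaction_terms(a, b, c, kappa, nu):
    """a, b, c: 0/1 integer arrays of shape (4, H, W); kappa, nu: 0/1 arrays of shape (H, W)."""
    ra = -kappa * a * shifted(b, 2) * (
        nu * (1 - shifted(c, 1)) + (1 - nu) * (1 - shifted(c, -1))
    )
    rb = shifted(ra, 2)  # R^b_i = R^a_{i+2}
    rc = kappa * (1 - c) * (
        nu * shifted(a, -1) * shifted(b, 1) + (1 - nu) * shifted(a, 1) * shifted(b, -1)
    )
    return ra, rb, rc

# Usage on a random configuration, with reaction probability k = 0.5.
rng = np.random.default_rng(0)
shape = (4, 16, 16)
a, b, c = (rng.integers(0, 2, size=shape) for _ in range(3))
kappa = (rng.random(shape[1:]) < 0.5).astype(int)
nu = rng.integers(0, 2, size=shape[1:])
ra, rb, rc = reaction_terms(a, b, c, kappa, nu)
```

On every site the three sums over directions satisfy the balance stated above, and the updated occupation numbers a + ra, b + rb and c + rc remain Boolean.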
The quantities ν(r, t) and κ(r, t) in equations 2.113 are independent random bits,
introduced in order to select among the various possible events: ν(r, t) is 1 with probability
1/2 and decides in which direction the reaction product C is created. When
ν = 1, the new C particle forms a +90° angle with respect to the old A particle. This
angle is −90° when ν = 0.
The occurrence of the reaction is subject to the value of the Boolean variable κ.
With probability k, κ = 1. Changing the value of k is a way to adjust the reaction
constant K. We shall see that k and K are proportional.
The presence of the terms involving ci in the right-hand side of equations 2.113
may seem unphysical. Actually, these terms are introduced here in order to satisfy the
exclusion principle: a new C cannot be created in direction i if ci is already equal to 1.
With this formulation, the reaction slows down as the number of C particles increases.
At some point one may reach saturation if no more room is available to create new
particles.
In practice, however, this should not be too much of a problem if one works at low
concentration. Also, quite often, the C species also undergoes a transformation: the
reaction can be reversible or C particles can precipitate if the concentration reaches
some threshold. Or, sometimes, one is only interested in the production rate
$\sum_j R^a_j = \sum_j R^b_j$ of the species C and one can forget about them once they are created. In this
case, one simply puts ci = 0 in the first two equations of 2.113.
Clearly, the exclusion principle may introduce some renormalization of the reaction
rate. If for some reason, this is undesirable, multiparticle models offer an alternative
to the LGA approach. This will be discussed in section 2.6.
Due to the simple microscopic interpretation, equation 2.113 is easily generalized
to other reaction processes. A common situation is when one species is kept at a fixed
concentration. This means that the system is fed a chemical by an external mechanism.
In this case, the corresponding occupation numbers (for instance the bi) can
be replaced by random Boolean variables which are 1 with a probability given by the
selected concentration of the species.
2.5.4 The macroscopic behavior
Here we establish the link between the discrete reaction-diffusion cellular automata
dynamics and the corresponding macroscopic level of description. We shall perform
this calculation for the case of three species A, B and C but a generalization to other
reaction schemes is straightforward.
Our approach is similar to that used in sections 2.3 and 2.4. We use the Boltzmann
molecular chaos assumption, in which correlations are neglected. Within this approximation,
we shall see that the microdynamics of the A + B → 0 reaction-diffusion
process yields the usual rate equation
$$\partial_t \rho_A = D \nabla^2 \rho_A - K \rho_A \rho_B \tag{2.114}$$
To derive the macroscopic behavior of our automata rule, we first average equation 2.112
$$A_i(r + \lambda e_i, t + \tau) - A_i(r, t) = \sum_{j=1}^{2d} \Omega_{ij} A_j(r, t) + \sum_{j=1}^{2d} (\delta_{ij} + \Omega_{ij})\, R^a_j(A, B, C) \tag{2.115}$$
where $A_i = \langle a_i \rangle$ is the average value of the occupation numbers ai. The matrix Ω is
the matrix expressing the diffusion rule, that is
$$\Omega_{ii} = p_0 - 1, \qquad \Omega_{ij} = p_{j-i}$$
where j − i is defined modulo 2d. Equations similar to 2.115 hold for Bi and Ci.
Using the Boltzmann hypothesis, the average value of the reaction term is written
as
$$\langle R^a_i(a, b, c, \kappa, \nu) \rangle \approx R^a_i(A, B, C, \langle \kappa \rangle, \langle \nu \rangle) \tag{2.116}$$
Note that this factorization may be wrong for simple annihilation reaction-diffusion
processes, as discussed in section 2.6.3.
The second step is to replace the finite difference in the left-hand side of 2.115 by
its Taylor expansion
$$A_i(r + \lambda e_i, t + \tau) - A_i(r, t) = \left[ \tau \partial_t + \frac{\tau^2}{2} \partial_t^2 + \lambda (e_i \cdot \partial_r) + \frac{\lambda^2}{2} (e_i \cdot \partial_r)^2 + \tau \lambda\, \partial_t (e_i \cdot \partial_r) \right] A_i \tag{2.117}$$
and similarly for the other species B and C. As in the hydrodynamic case, we consider
a Chapman–Enskog-like expansion and look for a solution of the following form
$$A_i = A_i^{(0)} + \epsilon A_i^{(1)} + \epsilon^2 A_i^{(2)} + \dots \tag{2.118}$$
Since particle motion is governed by the diffusion process, we will use the fact that,
when taking the continuous limit, the time and length scales are of the following order
of magnitude
$$\lambda = \epsilon \lambda_1 \quad \text{and} \quad \tau = \epsilon^2 \tau_2 \tag{2.119}$$
In reactive systems, as opposed to hydrodynamics or pure diffusion, neither momen-
tum nor particle number are conserved in general. For instance, in the annihilation
process A + A → ∅, no conservation law holds.
On the other hand, the reaction term can be considered as a perturbation to the
diffusion process, which makes derivation of the macroscopic limit rather simple. In
equation 2.114, the reaction constant K has the dimension of the inverse of a time.
This quantity defines at what speed the reaction occurs. At the level of the automaton,
this reaction rate is controlled by the reaction probability k =< κ > introduced in the
previous section.
When the continuous limit is taken, the automaton time step τ goes to zero. Thus,
the number of reactions per second will increase as τ decreases, unless the reaction
probability k also diminishes in the right ratio. In other words, to obtain a finite reaction
constant K in the macroscopic limit, it is necessary to consider that k ∝ τ. Since
τ is of the order ε² in our Chapman–Enskog expansion, the reaction term $R^a_i$ is also to
be considered as an O(ε²) contribution and we shall write
$$R^a_i = \epsilon^2 R^a_{2i} \tag{2.120}$$
At the macroscopic level the physical quantities of interest are the particle densities of
each species. Following the usual method, we define the density ρA of the A species
as
$$\rho_A = \sum_{i=1}^{2d} A_i^{(0)}$$
with the condition
$$\sum_{i=1}^{2d} A_i^{(\ell)} = 0 \quad \text{if } \ell \geq 1$$
Now we have to identify the different orders in ε which appear in equation 2.115,
using the expressions 2.117, 2.118, 2.119 and 2.120. We obtain
$$O(\epsilon^0): \quad \sum_j \Omega_{ij} A_j^{(0)} = 0 \tag{2.121}$$
$$O(\epsilon^1): \quad \lambda_1 (e_i \cdot \nabla) A_i^{(0)} = \sum_j \Omega_{ij} A_j^{(1)} \tag{2.122}$$
These equations are exactly the same as those derived in the case of pure diffusion
(see [14]) and the result is that
$$A_i^{(0)} = \frac{\rho_A}{2d}$$
and
$$A_i^{(1)} = \frac{\lambda_1}{2d}\, \frac{1}{V}\, e_{i\alpha} \partial_\alpha \rho_A$$
where V is the eigenvalue of the diffusion matrix Ω for the eigenvector
$$E_\alpha = (e_{1,\alpha};\, e_{2,\alpha};\, \dots;\, e_{2d,\alpha})$$
The equation for the density ρA is now obtained by summing equation 2.115 over i,
remembering that
$$\sum_i \Omega_{ij} = 0$$
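Collecting all the terms up to O(ε²), we see that the orders O(ε⁰) and O(ε) vanish and
we are left with
$$\epsilon^2 \tau_2\, \partial_t \sum_i A_i^{(0)} + \epsilon^2 \lambda_1 \sum_i (e_i \cdot \nabla) A_i^{(1)} + \epsilon^2 \frac{\lambda_1^2}{2} \sum_i (e_i \cdot \nabla)^2 A_i^{(0)} = \epsilon^2 \sum_j R^a_{2j}(A^{(0)}, B^{(0)}, C^{(0)})$$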
Using the definitions of τ2, λ1 and $R^a_{2j}$, and performing the summations, yields
$$\partial_t \rho_A = D \nabla^2 \rho_A + \frac{1}{\tau} \sum_j R^a_j\!\left( \frac{\rho_A}{2d}, \frac{\rho_B}{2d}, \frac{\rho_C}{2d} \right) \tag{2.123}$$
where D is the same diffusion constant as would be obtained without the chemical
reactions (see section 2.5.3).
It is interesting to note that expression 2.123 has been obtained without knowing
the explicit expression for the reaction terms R and independently of the number of
species. Actually, from this derivation, we see that the reaction term enters in a very
natural way in the macroscopic limit: we just have to replace the occupation numbers
by ρ/2d, the random Boolean fields by their average values and sum up this result for
all lattice directions.
For the case of the A + B → C process in two dimensions, with the reaction term
given by 2.113, equation 2.123 shows that the macroscopic behavior is described by
the rate equations
$$\begin{aligned}
\partial_t \rho_A &= D_A \nabla^2 \rho_A - \frac{k}{4\tau} \left( 1 - \frac{\rho_C}{4} \right) \rho_A \rho_B \\
\partial_t \rho_B &= D_B \nabla^2 \rho_B - \frac{k}{4\tau} \left( 1 - \frac{\rho_C}{4} \right) \rho_A \rho_B \\
\partial_t \rho_C &= D_C \nabla^2 \rho_C + \frac{k}{4\tau} \left( 1 - \frac{\rho_C}{4} \right) \rho_A \rho_B
\end{aligned} \tag{2.124}$$
where, in principle, a different diffusion constant can be chosen for each species. We
also observe that the reaction constant K is related to the reaction probability k by
$$K = \frac{k}{4\tau}$$
As explained previously, the exclusion principle introduces a correction (1 − ρC/4)
which remains small as long as C is kept in low concentration.
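As a worked check of the prescription of equation 2.123, the reaction term for the A species can be evaluated explicitly in two dimensions (2d = 4). The occupation numbers in 2.113 are replaced by ρ/4 and the random bits by their averages ⟨κ⟩ = k and ⟨ν⟩ = 1/2, and the four lattice directions contribute equally.

```latex
\frac{1}{\tau} \sum_{i=1}^{4} R^a_i\!\left(\frac{\rho_A}{4}, \frac{\rho_B}{4}, \frac{\rho_C}{4}\right)
  = -\frac{4k}{\tau}\, \frac{\rho_A}{4}\, \frac{\rho_B}{4}
    \left[ \frac{1}{2}\left(1 - \frac{\rho_C}{4}\right) + \frac{1}{2}\left(1 - \frac{\rho_C}{4}\right) \right]
  = -\frac{k}{4\tau} \left(1 - \frac{\rho_C}{4}\right) \rho_A \rho_B
```

Comparing with the rate equation 2.114 identifies K with k/(4τ), up to the exclusion factor (1 − ρC/4).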
2.5.5 Liesegang patterns
In this section we shall study a more complex system in which reaction-diffusion will
be accompanied by solidification and growth phenomena. This gives rise to nice and
complex structures that can be naturally modeled and analyzed in the framework of
the cellular automata approach.
These structures are known as Liesegang patterns, from the German chemist R.E.
Liesegang who first discovered them at the end of the nineteenth century [118].
Liesegang patterns are produced by precipitation and aggregation in the wake of
a moving reaction front. Typically, they are observed in a test tube containing a gel
in which a chemical species B (for example AgNO3) reacts with another species A
(for example HCl). At the beginning of the experiment, B is uniformly distributed in
the gel with concentration b0. The other species A, with concentration a0, is allowed
to diffuse into the tube from its open extremity. Provided that the concentration a0 is
larger than b0, a reaction front propagates in the tube. As this A + B reaction goes on,
formation of consecutive bands of precipitate (AgCl in our example) is observed in the
tube, as shown in figure 2.42. Although this figure is from a computer simulation, it is
very close to the picture of a real experiment.
Figure 2.42: Example of the formation of Liesegang bands in a cellular automata
simulation. The white bands correspond to the precipitate which results from the A + B
reaction.
The presence of bands is clearly related to the geometry of the system. Other
geometries lead to the formation of rings or spirals.
Depending on the experimental situation, some Liesegang patterns can present unexpected
structures (inverse banding [119], effect of gravity, shape of the container
and other exotic behaviors [120]). Therefore a complete analysis of the phenomena is
difficult and still under investigation [121,122].
On the other hand, for many different substances, generic formation laws can be
identified. For instance, after a transient time, Liesegang bands appear at some positions
xi and times ti and have a width wi. It is first observed that the center position
xn of the nth band is related to the time tn of its formation through the so-called time
law xn ∼ √tn.
Second, the ratio pn ≡ xn/xn−1 of the positions of two consecutive bands approaches
a constant value p for large enough n. This last property is known as the
Jablczynski law [123] or the spacing law. Finally, the width law states that the width
wn of the nth band is an increasing function of n. These features are related to the
properties of the reaction front which moves in the system. The time law appears to be
a simple consequence of the diffusive dynamics. On the other hand, spacing and width
laws cannot be derived with reaction-diffusion hypotheses alone. Extra nucleation–aggregation
mechanisms have to be introduced, which makes any analytical derivation
quite intricate [124,125,126].
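To make the time law and the spacing law concrete, the short sketch below shows how they could be tested on a list of band positions and formation times. The data are synthetic and purely hypothetical (generated so that both laws hold exactly), and the function names are choices made for this example.

```python
import numpy as np

def time_law_exponent(x, t):
    """Least-squares slope of log x_n versus log t_n; the time law predicts 1/2."""
    slope, _intercept = np.polyfit(np.log(t), np.log(x), 1)
    return slope

def spacing_ratios(x):
    """Ratios p_n = x_n / x_{n-1} of consecutive band positions."""
    x = np.asarray(x, dtype=float)
    return x[1:] / x[:-1]

# Hypothetical band data, constructed to obey both laws exactly.
p_true = 1.1
x = 5.0 * p_true ** np.arange(10)   # positions in geometric progression
t = (x / 0.7) ** 2                  # formation times such that x_n ~ sqrt(t_n)

print(time_law_exponent(x, t))      # close to 0.5
print(spacing_ratios(x))            # all close to p_true
```

With measured data one would expect the ratios pn to approach a constant only for large n, so the first few bands are usually discarded before estimating p.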
From an abstract point of view, the most successful mechanism that can be pro-
posed to explain the formation of Liesegang patterns is certainly the supersaturation
assumption based on Ostwald’s ideas [127]. This mechanism can be understood using
the formation scenario proposed by Dee [128]: the two species A and B react to pro-
duce a new species C (a colloid, in chemical terminology) which also diffuses in the
gel.
When the local concentration of C reaches some threshold value, nucleation oc-
curs: that is, spontaneously, the C particles precipitate and become solid D particles
at rest. This process is described by the following equations
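$$
\begin{aligned}
\partial_t a &= D_a \nabla^2 a - R_{ab} \\
\partial_t b &= D_b \nabla^2 b - R_{ab} \\
\partial_t c &= D_c \nabla^2 c + R_{ab} - n_c \\
\partial_t d &= n_c
\end{aligned}
\tag{2.125}
$$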
where, as usual, a, b, c, d stand for the concentration at time t and position r of the A,
B, C and D species, respectively. The term Rab expresses the production of the C
species due to the A + B reaction. Classically, a mean-field approximation is used for
this term and Rab = Kab, where K is the reaction constant. The quantity nc describes
the depletion of the C species resulting from nucleation and aggregation on existing
D clusters. An analytical expression for this quantity is rather complicated. However,
at the level of a cellular automata model, this depletion term can be included quite
naturally.
Within this framework, the supersaturation hypothesis can be stated as follows: due
to aggregation, the clusters of nucleated D particles formed at the reaction front deplete
their surroundings of the reaction product C. As a result, the level of supersaturation
drops dramatically and the nucleation and solidification processes stop. To reach
suitable conditions for forming new D nuclei again, the A + B reaction has to produce sufficient
new C particles. But the reaction front moves, so this happens at some location
further away. As a result, separate bands appear.
Most of the ingredients needed for modeling the formation of Liesegang patterns in
terms of a CA approach have already been introduced in the previous section, when
describing the A + B → C reaction-diffusion process. In the case of Dee’s scenario,
we also need to provide a mechanism for spontaneous nucleation (or precipitation)
in order to model the transformation of a diffusing C particle into a solid D particle.
Finally, aggregation of C particles on an existing D cluster will be modeled in very
much the same spirit as the DLA growth described in section 2.2.6. The key idea will
be to introduce threshold values to control both of these processes.
The C particles, once created, diffuse until their local density (computed as the
number of particles in a small neighborhood divided by its total number of sites and
lattice directions), reaches a threshold value ksp. Then they spontaneously precipitate
and become D particles at rest (nucleation). Here, we typically consider 3 × 3 Moore
neighborhoods centered around each lattice site.
Moreover, C particles located in the vicinity of one or more precipitate D particles
aggregate provided that their local density (computed as before) is larger than an
aggregation threshold kp < ksp.
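The two threshold rules can be sketched as a single lattice update. This is only an illustration under stated assumptions, not the authors' implementation: the function names `moore_sum` and `precipitate` are hypothetical, each site is assumed to hold between 0 and 4 diffusing C particles (one per lattice direction), boundaries are taken periodic, and "vicinity" is read as the 3 × 3 Moore neighborhood.

```python
import numpy as np

def moore_sum(a):
    """Sum of a over the 3 x 3 Moore neighborhood of each site (periodic edges)."""
    return sum(np.roll(np.roll(a, dy, axis=0), dx, axis=1)
               for dy in (-1, 0, 1) for dx in (-1, 0, 1))

def precipitate(c, d, ksp, kp, directions=4):
    """Turn C particles into solid D particles by nucleation or aggregation.

    c : int array, number of diffusing C particles per site (0..directions)
    d : int array, number of solid D particles per site
    """
    # Local density: particles in the neighborhood / (sites * lattice directions)
    density = moore_sum(c) / (9 * directions)
    # A site is "near solid" if any D particle sits in its Moore neighborhood
    near_solid = moore_sum((d > 0).astype(int)) > 0
    nucleate = density >= ksp                 # spontaneous nucleation
    aggregate = near_solid & (density > kp)   # aggregation on existing clusters
    freeze = (nucleate | aggregate) & (c > 0)
    d_new = d + np.where(freeze, c, 0)
    c_new = np.where(freeze, 0, c)
    return c_new, d_new
```

Since kp < ksp, a site close to an existing cluster solidifies at a lower local density than an isolated one, which is what depletes C around a growing band.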
The parameters kp and ksp are two control parameters of the model. The introduction
of these critical values refers to the qualitative models of solidification theory,
relating supersaturation and growth behavior [129].
Figure 2.43 (plot of xn versus xn−1): Verification of the spacing law for the situation with C particles. The
ratio xn/xn−1 tends to p = 1.08.
An important aspect of the mechanism of Liesegang patterns formation is the role
of spontaneous fluctuations. Precipitation and aggregation processes (such as DLA)
are clearly dependent on local density fluctuations. For instance, even if the average
concentration of C particles is less than the supersaturation threshold, it may
be higher locally and give rise to spontaneous nucleation. Similarly, aggregation is a
function of the particle density in the vicinity of an existing solid cluster, which is also
a locally fluctuating quantity.
The cellular automata approach naturally accounts for these fluctuation phenomena
and, in addition, captures the mesoscopic nature of the precipitate clusters, which can be
fractal.
Figure 2.42 shows a typical example of a cellular automata simulation with C
particles, giving rise to bands. The initial condition is built as follows: at time t = 0,
the left part of the system (x ≤ 0) is randomly occupied by A particles, with a density
a0 and the right part (x > 0) is filled with B particles with a density b0.
From the positions xn and the formation time tn of each band, we can verify the
spacing and the time laws. For instance, the plot given in figure 2.43 shows very good
agreement for the relation xn/xn−1 → p. It is found that the so-called Jablczynski
coefficient p is 1.08, a value corresponding to experimental findings. The way the
value of p depends on the parameters of the model is expected to follow the so-called
Matalon-Pakter [130] experimental law. From a numerical and theoretical point of
view, this dependence is still under investigation [122].
Liesegang patterns are found only if the parameters of the experiment are thor-
oughly adjusted. In our simulation, kp and ksp are among the natural quantities that
control supersaturation and aggregation. In practice, however, one cannot directly
modify these parameters. On the other hand, it is experimentally possible to change
some properties of the gel (its pH for example) and thus influence the properties of the
aggregation processes or the level of supersaturation.
Figure 2.44 (diagram with axes kp and ksp; regions labeled: no precipitation, homogeneous clustering, bands, dendrites, amorphous solidification): Qualitative phase diagram showing the different possible patterns that
can be obtained with our cellular automata model, as a function of the values of ksp
and kp.
Outside of the region where Liesegang patterns are formed, our simulations show
that, when kp and ksp vary, other types of patterns are obtained. These various patterns
can be classified in a qualitative phase diagram, as shown in figure 2.44. An example
of some of these “phases” is illustrated in figure 2.45. Note that the limits between the
different “phases” do not correspond to any drastic modification of the patterns. There
is rather a smooth crossover between the different domains. The associated names are
borrowed from the phenomenological theory of solidification [129].
The term dendrite comes from the tree-like structures that are sometimes
found on the surfaces of limestone rocks or plates and that can be confused with fossils.
The plant-shaped deposit is made of iron or manganese oxides that appear when at
some point in the geological past the limestone was penetrated by a supersaturated
solution of manganese or iron ions. It turns out that the formation of these mineral
dendrites can be simulated by the same scenario as Liesegang patterns, but with an
aggregation threshold kp = 0. Figure 2.46 shows the results of such a modeling. The
fractal dimension of these clusters is found to be around 1.77, a value which is very
close to that measured in a real sample [131].
The patterns we have presented so far show axial symmetry, reflecting the proper-
ties of the experimental setup. But the same simulations can be repeated with different
initial conditions. A case of interest is the situation of radial symmetry responsible for
the formation of rings or spirals. The reactant A is injected in the central region of
a two-dimensional gel initially filled with B particles. The result of the cellular au-
tomata simulation is shown in figure 2.47. In (a) concentric rings are formed, starting
from the middle of the system and appearing as the reaction front radially moves away.
In (b) a spiral-shaped structure is created. Although the two situations are similar as
far as the simulation parameters are concerned, the appearance of a spiral stems from a
spontaneous spatial fluctuation which breaks the radial symmetry.
Figure 2.45: Examples of patterns that are described in the phase diagram: (a) cor-
responds to homogeneous clustering; this is also the case of pattern (b) but closer to
the region of band formation. Pattern (c) shows an example of what we called a den-
drite structure. Amorphous solidification would correspond to a completely uniform
picture.
[Figure 2.46, panels: (a) simulated dendrite; (b) ln(mass) versus ln(size), slope = 1.77 ± 0.05; (c) ln(occupancy) versus ln(size), slope = −1.75 ± 0.05.]
Figure 2.46: (a) Example of a mineral dendrite obtained from a cellular automata simulation
with kp = 0; in this figure, the reaction front moves upward. The two
graphs on the right show the numerical measurement of the fractal dimension using:
(b) a sand-box method and (c) a box-counting technique.
Figure 2.47: Liesegang rings (a) and spiral (b), as obtained after 2000 iterations of
the cellular automata model, with C particles indicated in gray.
Figure 2.48: Example of the formation of Liesegang bands in a lattice Boltzmann
simulation.
Liesegang patterns are obtained when the initial A concentration is significantly
larger than the initial B concentration. In a cellular automata model with an exclusion
principle, a large concentration difference implies having very few B particles. As a
consequence, the production rate of C particles is quite low because very few reactions
take place. For this reason, the simulations presented above have been produced with
a pseudo-three-dimensional system composed of several two-dimensional layers. The
reaction has been implemented so that particles of different layers can interact.
Therefore, pure CA simulations of Liesegang structures can be very demanding in
terms of CPU time. It turns out that an LB approach is also possible, with much less
computer resources, and makes it possible to investigate large systems exhibiting
many more bands.
The LB model follows the same line as in the CA approach but some external noise
is added to describe aggregation and nucleation as probabilistic processes. We refer the
reader to [132] for a more detailed discussion. Below we just show some of the patterns
generated with the LB model. Figure 2.48 shows an example of a lattice Boltzmann
simulation containing up to 30 consecutive bands, in a system of size 1024 × 64.
We can also consider again the case of Liesegang rings and spirals in the framework
of the LB approach.
Figure 2.49(a) shows the situation where concentric rings of precipitate are formed.
The numerical parameters are: a0 = 1, b0/a0 = 0.013, Db/Da = 0.1, ksp/a0 =
0.0087, kp/a0 = 0.0065. The nucleation process takes place with a probability of 0.05
and aggregation with a probability close to 1. This pattern turns out to be quite similar
to real Liesegang structures obtained in a similar experimental situation [118].
Figure 2.49: Formation of (a) Liesegang rings and (b) spiral-shaped pattern, as obtained
after 2000 iterations of the lattice Boltzmann model.
For the same set of parameters, but b0/a0 = 0.016, a different pattern is observed
in figure 2.49(b). There, a local defect produced by a fluctuation develops and a spiral
of precipitate appears instead of a set of rings. Such a spiral pattern will never be
obtained from a deterministic model without a stochastic component.
From these data, we can check the validity of the spacing law for ring formation.
The relation
rn/rn−1 → p
holds, where rn is the radius of the nth ring. In figure 2.50, the Jablczynski coefficient
p is plotted as a function of the concentration of B particles b0 (for a0 = 1) both for ax-
ial (bands) and radial (rings) symmetries. We notice that p decreases when b0 increases
in agreement with experimental data. Moreover, for the same set of parameters, the
value of p is found to be larger in the case of rings than it is for bands.
2.6 Multiparticle models
Multiparticle models (also termed Integer Lattice Gas Automata [133]) are lattice gas
models without an exclusion principle. They are designed to combine the advantages
of the CA and LB models. LB models are less noisy and provide more flexibility
than their Boolean (CA) counterparts. However, they may exhibit numerical
instabilities (as is the case for lattice BGK models of fluids) and they sometimes
fail to account for relevant physical phenomena because fluctuations are neglected. An
example is provided in section 2.6.3 by the anomalous kinetics in the simple A + A → ∅
reaction-diffusion process.
[Figure 2.50: Jablczynski coefficient (vertical axis, 1.0 to 1.3) versus initial concentration of B (horizontal axis, 0.009 to 0.017); curves labeled Liesegang rings and Liesegang bands.]
Figure 2.50: Jablczynski coefficients p as a function of the concentration b0 of B par-
ticles (for a0 = 1), for bands (lower curve) and rings (upper curve).
Multiparticle models conserve the point-like nature of particles, as in cellular au-
tomata, but allow an arbitrary number of them to be present at each lattice site. This
eliminates the exclusion principle that plagues the cellular automata approach and
which appears as a numerical artifact rather than a desirable physical property.
Mathematically speaking, this means that the state of each lattice site cannot be
described with a finite number of information bits. However, in practice, it is easy to
allocate a 32- or 64-bit computer word to each lattice site, to safely assume that “any”
number of particles can be described at that site.
Multiparticle models lead to a reduced statistical noise: if the number of particles
per site is N, the intrinsic fluctuations due to the discrete nature of the particles will
typically be of the order $\sqrt{N}$. This is small compared to N, if N is large enough.
Therefore, we do not have to perform much averaging to get a meaningful result.
In addition, with an arbitrary number of particles per site, we have much more
freedom to enforce a given boundary condition, or tune a parameter of the simulation.
Actually, when modeling a reaction process, it is often necessary to get rid of the
exclusion principle. For instance, to describe processes such as mA + nB → C, it is
highly desirable to have more than four particles per site.
Unfortunately, the numerical implementation of multiparticle models is much more
involved than LB or CA models and the computation time is also much higher. On the
other hand, we restore in a natural way the fluctuations that are absent in LB simula-
tions and provide an intrinsically stable numerical scheme (since we deal with positive
integer numbers). Besides, when compared to CA, the extra computational time may
be well compensated by the fact that less averaging is required.
In this section we first consider the case of a reaction-diffusion system and then
we shall describe how a hydrodynamical model can be defined within the context of a
multiparticle approach.
2.6.1 Multiparticle diffusion model
Our algorithm is defined on a d-dimensional Cartesian lattice of spacing λ [93]. Each
lattice site r is occupied, at time t, by an arbitrary number of particles n(r, t). The
discrete time diffusion process is defined as follows: during the time interval τ , each
particle can jump to one of its 2d nearest-neighbor sites along lattice direction i with
probability pi, or stay at rest with a probability $p_0 = 1 - \sum_{i=1}^{2d} p_i$.
An advantage of dealing with multiparticle dynamics is that advection mechanisms
can be added to the diffusion process. When the probabilities of jumping to a nearest-
neighbor site are different in each direction, a drift is introduced. This adds a density
gradient term to the diffusion equation which then reads
$$\partial_t \rho = V \cdot \nabla \rho + D \nabla^2 \rho$$
where V is the advection velocity. Such an advection effect is difficult to produce
without an artifact when an exclusion principle holds.
For the sake of simplicity, we shall now consider a two-dimensional case. The
generalization is straightforward and follows the same reasoning.
The idea is to loop over every particle at each site, decide where it goes and move
it to its destination site. In terms of the particle numbers n(r, t), our multiparticle rule
can be expressed as
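$$
\begin{aligned}
n(r, t + \tau) ={}& \sum_{\ell=1}^{n(r,t)} p_{0\ell}(r, t)
 + \sum_{\ell=1}^{n(r+\lambda e_3,t)} p_{1\ell}(r + \lambda e_3, t) \\
&+ \sum_{\ell=1}^{n(r+\lambda e_1,t)} p_{3\ell}(r + \lambda e_1, t)
 + \sum_{\ell=1}^{n(r+\lambda e_4,t)} p_{2\ell}(r + \lambda e_4, t) \\
&+ \sum_{\ell=1}^{n(r+\lambda e_2,t)} p_{4\ell}(r + \lambda e_2, t)
\end{aligned}
\tag{2.126}
$$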
The vectors $e_1 = -e_3$, $e_2 = -e_4$ are the four unit vectors along the main directions of
the lattice. The stochastic Boolean variable $p_{i\ell}(r, t)$ is 1 with probability $p_i$ and selects
whether or not particle $\ell$ chooses to move to site $r + \lambda e_i$. Since each particle has only
one choice, we must have
p0 + p1 + p2 + p3 + p4 = 1
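The rule above can be sketched in a few lines. This is an illustrative sketch only: the function name `diffuse` is hypothetical, periodic boundaries are assumed, e1 and e2 are arbitrarily identified with the +x and +y array directions, and a single multinomial draw per site is used because it is statistically equivalent to looping over the particles one by one.

```python
import numpy as np

def diffuse(n, p, rng):
    """One multiparticle diffusion step on a periodic two-dimensional lattice.

    n : 2D integer array, n[y, x] = number of particles at a site
    p : sequence (p0, p1, p2, p3, p4) of rest and jump probabilities, summing to 1
    """
    ny, nx = n.shape
    new = np.zeros_like(n)
    # (dy, dx) for rest, e1, e2, e3 = -e1, e4 = -e2 (orientation is an assumption)
    shifts = [(0, 0), (0, 1), (1, 0), (0, -1), (-1, 0)]
    for y in range(ny):
        for x in range(nx):
            # How many of the particles at this site choose each of the 5 options
            counts = rng.multinomial(n[y, x], p)
            for (dy, dx), c in zip(shifts, counts):
                new[(y + dy) % ny, (x + dx) % nx] += c
    return new

rng = np.random.default_rng(0)
lattice = rng.integers(0, 50, size=(8, 8))
after = diffuse(lattice, [0.2, 0.2, 0.2, 0.2, 0.2], rng)
```

The explicit Python loops are kept for readability; they would be far too slow for production runs on large lattices.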
The macroscopic occupation number $N(r, t) = \langle n(r, t) \rangle$ is obtained by averaging
the above evolution rule over an ensemble of equivalent systems. Clearly, one
has
$$\left\langle \sum_{\ell=1}^{n(r,t)} p_{i\ell}(r, t) \right\rangle = p_i N(r, t)$$
Thus, we obtain the following equation of motion:
$$
\begin{aligned}
N(r, t + \tau) ={}& p_0 N(r, t) + p_1 N(r + \lambda e_3, t) \\
&+ p_3 N(r + \lambda e_1, t) + p_2 N(r + \lambda e_4, t) + p_4 N(r + \lambda e_2, t)
\end{aligned}
\tag{2.127}
$$
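Assuming N varies slowly on the lattice, we can perform a Taylor expansion in both
space and time to obtain the continuous limit. Using $\sum_{i=0}^{4} p_i = 1$ and $e_i = -e_{i+2}$, we
obtain
$$
\begin{aligned}
\tau \partial_t N(r, t) + \frac{\tau^2}{2} \partial_t^2 N(r, t) + O(\tau^3)
={}& \lambda \left[ (p_3 - p_1) e_1 + (p_4 - p_2) e_2 \right] \cdot \nabla N(r, t) \\
&+ \frac{\lambda^2}{2} (p_1 + p_3) (e_1 \cdot \nabla)^2 N(r, t)
 + \frac{\lambda^2}{2} (p_2 + p_4) (e_2 \cdot \nabla)^2 N(r, t) + O(\lambda^3)
\end{aligned}
\tag{2.128}
$$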
Since e1 and e2 are orthonormal, we have
$$(e_1 \cdot \nabla)^2 + (e_2 \cdot \nabla)^2 = \nabla^2$$
In order to use this property it is necessary that p1 + p3 = p2 + p4, otherwise the lattice
directions will be “visible”. Thus we impose the isotropy condition
$$p_1 + p_3 = p_2 + p_4 = \frac{1 - p_0}{2}$$
and we obtain
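$$\partial_t N(r, t) + \frac{\tau}{2} \partial_t^2 N(r, t) + O(\tau^2) = V \cdot \nabla N(r, t) + D \nabla^2 N(r, t) + O(\lambda^3) \tag{2.129}$$
where V is the advection velocity
$$V = \frac{\lambda}{\tau} \left[ (p_3 - p_1) e_1 + (p_4 - p_2) e_2 \right]$$
and D the diffusion constant
$$D = \frac{\lambda^2}{4\tau} (1 - p_0) \tag{2.130}$$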
We may now consider the limit λ → 0 and τ → 0 with λ²/τ → constant, as usual
in a diffusion process. However, here, some additional care is needed. If p3 ≠ p1 or
p4 ≠ p2, the advective term will diverge in the limit. This means that p3 − p1 and p4 − p2
must decrease proportionally to λ when the limit is taken. Thus, with a halved lattice
spacing, the difference between pi and pi+2 must also be halved in order to produce
the same advection. With these assumptions, we obtain, in the macroscopic limit
$$\partial_t N = V \cdot \nabla N + D \nabla^2 N$$
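As a small numerical check of the expressions for V and D, the sketch below evaluates them for a given set of jump probabilities and verifies the isotropy condition. The function name `transport_coefficients` and the numerical values are illustrative assumptions, not taken from the text.

```python
def transport_coefficients(p, lam, tau):
    """Return (V, D) for jump probabilities p = (p0, p1, p2, p3, p4).

    V is returned as its components along e1 and e2, following
    V = (lam / tau) [(p3 - p1) e1 + (p4 - p2) e2] and D = lam**2 (1 - p0) / (4 tau).
    """
    p0, p1, p2, p3, p4 = p
    if abs(sum(p) - 1.0) > 1e-12:
        raise ValueError("probabilities must sum to 1")
    if abs((p1 + p3) - (p2 + p4)) > 1e-12:
        raise ValueError("isotropy condition p1 + p3 = p2 + p4 is violated")
    D = lam ** 2 * (1.0 - p0) / (4.0 * tau)
    V = (lam / tau * (p3 - p1), lam / tau * (p4 - p2))
    return V, D

# Coarse lattice: lam = 1, tau = 1, bias p3 - p1 = 0.1
V1, D1 = transport_coefficients((0.2, 0.15, 0.2, 0.25, 0.2), lam=1.0, tau=1.0)
# Halved spacing with lam**2 / tau fixed: the bias must be halved as well
V2, D2 = transport_coefficients((0.2, 0.175, 0.2, 0.225, 0.2), lam=0.5, tau=0.25)
```

The two calls illustrate the scaling argument: halving λ at fixed λ²/τ leaves D unchanged, and V is unchanged only because the bias p3 − p1 was halved too.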
2.6.2 Numerical implementation
The main problem when implementing our algorithm on a computer (for instance, for
the two-dimensional case we described in the previous section) is to find an efficient
way to select the particles at rest and distribute randomly the others among the four
possible directions of motion. More precisely, we have to compute quantities such as
$$n_i = \sum_{\ell=1}^{n(r,t)} p_{i\ell}(r, t)$$
In practice, we can loop over all $\ell$ particles at every site and, for each of them, choose
a random number r, uniformly distributed in the interval [0, 1]. Then, we consider a
division of this interval into subintervals $[r_j, r_{j+1}[$, j = 0, ..., 4, so that $p_i = r_{i+1} - r_i$.
We say that $p_{i\ell} = 1$ if and only if $r_i \le r < r_{i+1}$. The quantities $n_i$ are thus distributed
according to a multinomial distribution.
This procedure is acceptable for small values of n but, otherwise, very time consuming.
However, when n is large (more precisely when $n p_i (1 - p_i) \gg 1$), the statistical
distribution of each $n_i$ is expected to approach a Gaussian distribution of mean $n p_i$ and
variance $n p_i (1 - p_i)$. This Gaussian approximation allows us to be much more efficient
because we no longer have to generate a random number for each particle at each site.
For simplicity, take the case p0 = p and p1 = p2 = p3 = p4 = (1 − p)/4. The
ni’s can be approximated as follows: we draw a random number n0 from a Gaussian
distribution of mean np and variance np(1 − p) (for instance using the Box–Muller
method [134]). This number is then rounded to the nearest integer.
Thus, in one operation, this procedure splits the population into two parts: $n_0$
particles that will stay motionless and $n_m = n - n_0$ that will move. In a second step, the $n_m$
moving particles are divided into two subsets according to a Gaussian distribution
of mean $n_m/2$ and variance $n_m (1/2)(1/2)$. Splitting up each of these subsets one
more time yields the number $n_i$ of particles that will move in each of the four lattice
directions.
If advection is present, we can proceed similarly. First, we divide up the moving
particle population into two parts: on the one hand, those going north and east,
for instance, and on the other hand, those going south and west. Second, each subpopulation
is, in turn, split into two subsets according to the values of the pi. Of course,
as in traditional lattice gas automata, these splitting operations can be performed simultaneously
(in parallel) at each lattice site.
Empirical considerations, supported by theoretical arguments on binomial distri-
butions, show that ni = 40 is a good threshold value in two dimensions, above which
the Gaussian procedure can be used. Below this critical value, it is safer to have the
algorithm loop over all particles. Note that in a given simulation, important differences
in the particle number can be found from site to site and the two different algorithms
may have to be used at different places.
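The successive Gaussian splittings described in this section can be sketched as follows for the symmetric case p1 = p2 = p3 = p4 = (1 − p0)/4. Several choices here are assumptions made for the illustration rather than details given in the text: the names `gaussian_split` and `distribute`, the clipping of each draw to the admissible range, the use of NumPy's normal generator in place of an explicit Box–Muller routine, and the application of the threshold of 40 to the site population n.

```python
import numpy as np

def gaussian_split(n, p, rng):
    """Approximate a Binomial(n, p) draw by a rounded Gaussian, clipped to [0, n]."""
    mean, var = n * p, n * p * (1.0 - p)
    k = int(round(float(rng.normal(mean, np.sqrt(var)))))
    return min(max(k, 0), n)

def distribute(n, p0, rng, threshold=40):
    """Split n particles into (n0, n1, n2, n3, n4) with equal jump probabilities."""
    if n < threshold:
        # Small populations: exact multinomial draw (equivalent to the particle loop)
        q = (1.0 - p0) / 4.0
        return tuple(int(v) for v in rng.multinomial(n, [p0, q, q, q, q]))
    n0 = gaussian_split(n, p0, rng)          # particles staying at rest
    nm = n - n0                              # moving particles
    half = gaussian_split(nm, 0.5, rng)      # first binary split of the movers
    n1 = gaussian_split(half, 0.5, rng)      # split each half once more
    n2 = gaussian_split(nm - half, 0.5, rng)
    return (n0, n1, n2, half - n1, nm - half - n2)

rng = np.random.default_rng(1)
small = distribute(5, 0.2, rng)
large = distribute(1000, 0.2, rng)
```

Because every split is taken from what remains, the five numbers always add up to n, so particle number is conserved exactly even though each individual draw is only approximately binomial.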
2.6.3 The reaction algorithm
We will now discuss how reaction processes can be implemented in the framework
of multiparticle models (see also [135]). Reaction-diffusion phenomena can then be
simulated by alternating the reaction process between the different species and then the
diffusion of the resulting products, according to the multiparticle diffusion algorithm
just described.
A reaction process couples locally the different species $A_l$, l = 1, ..., q to produce
new species $B_j$, j = 1, ..., m according to the relation
$$\alpha_1 A_1 + \alpha_2 A_2 + \dots + \alpha_q A_q \xrightarrow{K} \beta_1 B_1 + \beta_2 B_2 + \dots + \beta_m B_m \tag{2.131}$$
The quantities $\alpha_l$, $\beta_j$ are the stoichiometric coefficients, and K is the reaction constant.
In order to model this reaction scheme with a multiparticle dynamics, one considers
all the q-tuples that can be formed with α1 particles of A1, α2 particles of A2, etc. These
q-tuples are transformed into m-tuples of Bj particles with probability k. At site r and
time t, there are
$$N(r, t) \equiv \binom{n_{A_1}}{\alpha_1} \binom{n_{A_2}}{\alpha_2} \cdots \binom{n_{A_q}}{\alpha_q} (r, t)$$
ways to form these q-tuples, where $n_X(r, t)$ denotes the number of particles of species
X present at (r, t). If one of the $n_{A_i} < \alpha_i$ then obviously N = 0.
This technique offers a natural way to consider all possible reaction scenarios.
For instance, in the case of the annihilation reaction 2A → ∅, suppose we have three
particles (labeled a1, a2, a3) available at a given lattice site. Then, there are three
possible ways to form a reacting pair: (a1, a2), (a1, a3) and (a2, a3). In principle, all
these combinations have the same chance of forming and reacting. However, if (a1, a2)
react, then only a3 is left and there is no point in considering (a1, a3) or (a2, a3) as
possible candidates for reaction. Thus N is the maximal number of possible events,
but it is likely that the available particles are exhausted before reaching the end of this
list of possible reactions.
The multiparticle reaction rule can therefore be summarized as follows:
• As long as there are enough particles left (i.e. at least $\alpha_l$ of species $A_l$, for
each l), but no more than N times, choose a random Boolean variable κ which is 1 with
probability k.
• If κ = 1, remove from each species $A_l$ a number $\alpha_l$ of particles ($n_{A_l} \to n_{A_l} - \alpha_l$)
and add a number $\beta_j$ of particles to each species $B_j$, j = 1, ..., m ($n_{B_j} \to n_{B_j} + \beta_j$).
This algorithm can easily be extended to a reversible reaction.
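The two-step rule summarized above can be written down almost literally for a single lattice site. The sketch below is a minimal illustration; the function name `react` and its argument layout (lists of particle numbers and stoichiometric coefficients) are assumptions, not the authors' code.

```python
import math
import random

def react(nA, nB, alpha, beta, k, rng):
    """Apply the multiparticle reaction rule once at a single site.

    nA, nB : particle numbers of the reactant and product species
    alpha, beta : stoichiometric coefficients of reactants and products
    k : probability that a candidate q-tuple actually reacts
    """
    nA, nB = list(nA), list(nB)
    # Maximal number of reaction events: number of distinct q-tuples
    N = math.prod(math.comb(n, a) for n, a in zip(nA, alpha))
    for _ in range(N):
        # Stop as soon as one reactant species is exhausted
        if any(n < a for n, a in zip(nA, alpha)):
            break
        if rng.random() < k:  # Boolean kappa, equal to 1 with probability k
            nA = [n - a for n, a in zip(nA, alpha)]
            nB = [n + b for n, b in zip(nB, beta)]
    return nA, nB

rng = random.Random(0)
# Annihilation 2A -> 0 with three particles and k = 1: one pair reacts, one A is left
left, _ = react([3], [], [2], [], 1.0, rng)
```

For the three-particle example discussed in the text, N = 3 candidate pairs exist, but the loop stops after the first accepted reaction because a single particle cannot form a pair.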
When k is very small, we may assume that all the N q-tuples need to be considered
and the above reaction rule can be expressed as
$$
\begin{aligned}
n_{A_l}(r, t + \tau) &= n_{A_l}(r, t) - \alpha_l \sum_{h=1}^{N(r,t)} \kappa_h \\
n_{B_j}(r, t + \tau) &= n_{B_j}(r, t) + \beta_j \sum_{h=1}^{N(r,t)} \kappa_h
\end{aligned}
\tag{2.132}
$$
where $\kappa_h$ is 1 with probability k.
This algorithm may become quite slow in terms of computer time if the $n_X$ are
large and $k \ll 1$. In this case, the Gaussian approximation described in the previous
section can be used to speed up the numerical simulations: the number of accepted
reactions can be computed from a local Gaussian distribution of mean kN(r, t) and
variance k(1 − k)N(r, t).
Diffusive annihilation
In order to check that our multiparticle reaction rule captures the true nature of fluctuations
and correlations, we simulated the A + A → ∅ reaction-diffusion process, where
the A particles are initially uniformly distributed in the system. This reaction exhibits a
non-mean-field decay law in one-dimensional systems [34]: the time evolution of $N_A(t)$
(the number of A particles left in the system at time t) departs from the behavior predicted
by the rate equation $\partial_t N_A(t) = -K N_A^2(t)$, whose solution is $N_A(t) \sim t^{-1}$, for
large t.
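For completeness, the mean-field decay quoted above follows from a short separation of variables. This is a standard calculus step added for illustration, with $N_A(0)$ denoting the initial particle number:

```latex
\frac{dN_A}{dt} = -K N_A^2
\quad\Longrightarrow\quad
\frac{1}{N_A(t)} = \frac{1}{N_A(0)} + K t
\quad\Longrightarrow\quad
N_A(t) = \frac{N_A(0)}{1 + K N_A(0)\, t} \simeq \frac{1}{K t}
\quad (t \to \infty)
```

The long-time behavior is thus independent of the initial condition, which makes the exponent −1 a convenient mean-field reference for the simulated decay.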
Figure 2.51 gives the behavior of a simulation performed on a line of 64’536 sites,
with an initial number of about 100 particles per site. Diffusion and reaction processes
are simulated with our multiparticle algorithms with a probability 1/2 that each particle
moves left or right and a reaction probability k = 0.8. We observe that the total number
of A particles decreases with time as the power law $N_A(t) \sim t^{-1/2}$, which is the correct
result in d = 1 dimension.
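A much smaller version of this numerical experiment can be sketched as follows. It is an illustration under stated assumptions, not the simulation behind figure 2.51: the lattice is short, boundaries are periodic, the reaction uses the exact pair loop rather than the Gaussian approximation, and the names `annihilate`, `diffuse_1d` and `run` are hypothetical.

```python
import math
import numpy as np

def annihilate(n, k, rng):
    """Apply the 2A -> 0 multiparticle rule independently at every site."""
    out = n.copy()
    for i, m in enumerate(n):
        m = int(m)
        for _ in range(math.comb(m, 2)):   # at most N = m(m-1)/2 candidate pairs
            if m < 2:                      # no pair left at this site
                break
            if rng.random() < k:
                m -= 2
        out[i] = m
    return out

def diffuse_1d(n, rng):
    """Each particle jumps left or right with probability 1/2 (periodic line)."""
    left = rng.binomial(n, 0.5)
    right = n - left
    return np.roll(left, -1) + np.roll(right, 1)

def run(sites=256, n0=20, k=0.8, steps=200, seed=0):
    """Return the total number of A particles after each reaction-diffusion step."""
    rng = np.random.default_rng(seed)
    n = np.full(sites, n0, dtype=np.int64)
    totals = []
    for _ in range(steps):
        n = diffuse_1d(annihilate(n, k, rng), rng)
        totals.append(int(n.sum()))
    return totals
```

To estimate the decay exponent, one could fit the slope of ln(total) against ln(step) at late times; a system this small is only meant to show the mechanics of the algorithm, and a reliable exponent requires far more sites and particles.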
Rate equation approximation
In a mean-field approximation, i.e. when the multipoint correlation functions are fac-
torized as a product of one-point functions and the reaction probability k is much
smaller than 1, our multiparticle dynamics gives the expected rate equation given by
the mass action law. We define $N_{A_l}$ and $N_{B_j}$ as the average particle numbers per site
of species $A_l$ and $B_j$, respectively.
For the reaction process 2.131, it is possible to show that our multiparticle reaction
algorithm yields (in the limit of a large lattice)
$$
\begin{aligned}
N_{A_i}(t + \tau) - N_{A_i}(t) &= -K N_{A_1}^{\alpha_1} N_{A_2}^{\alpha_2} \cdots N_{A_q}^{\alpha_q} \\
N_{B_j}(t + \tau) - N_{B_j}(t) &= K N_{A_1}^{\alpha_1} N_{A_2}^{\alpha_2} \cdots N_{A_q}^{\alpha_q}
\end{aligned}
$$
where K is the reaction constant whose expression is
$$K = \frac{k}{\alpha_1! \, \alpha_2! \cdots \alpha_q!}$$
[Figure 2.51: plot of ln(N_A) versus ln(time), for k = 0.8.]
Figure 2.51: Time decay of N_A, the total number of A particles in the A + A → ∅
reaction-diffusion process, with the multiparticle method. A non-mean-field power
law $t^{-d/2}$ is observed, in agreement with theoretical arguments.
This calculation is based on combinatorial arguments and the equiprobability of all
configurations with the same number of particles. More details can be found in [14,93].
In the limit τ → 0, we obtain the usual form of the rate equations for the reaction
process under study, namely
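$$
\begin{aligned}
\partial_t N_{A_i}(t) &= -\frac{K}{\tau} N_{A_1}^{\alpha_1} N_{A_2}^{\alpha_2} \cdots N_{A_q}^{\alpha_q} \\
\partial_t N_{B_j}(t) &= \frac{K}{\tau} N_{A_1}^{\alpha_1} N_{A_2}^{\alpha_2} \cdots N_{A_q}^{\alpha_q}
\end{aligned}
$$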
2.6.4 Turing patterns
In this section, we use our multiparticle reaction-diffusion model to simulate the for-
mation of the so-called Turing structures. Turing [136] was the first to suggest that,
under certain conditions, chemicals can react and diffuse so as to produce steady-state
heterogeneous spatial patterns of chemical concentrations [36]. Turing structures
are believed to play an important role in biological pattern formation processes, such
as the stripes observed on the zebra skin [101]. In contrast to most hydrodynamical
instabilities, the structure of Turing patterns is not related to any imposed macroscopic
length scales (like the size of the container). Turing patterns exhibit regular structure
with an intrinsic wavelength depending on the diffusion constants and reaction rates.
Typical examples of inhomogeneous stationary states observed in experiments have a
hexagonal or a striped structure [137].
For the sake of simplicity, we consider here only one of the simplest models showing
Turing patterns: the Schnakenberg reaction-diffusion model [138] in two dimensions.
It describes the following autocatalytic reaction:
$$
A \xrightarrow{k_1} X, \qquad
X \xrightarrow{k_2} \emptyset, \qquad
2X + Y \xrightarrow{k_3} 3X, \qquad
B \xrightarrow{k_4} Y
\tag{2.133}
$$
where the densities of the species A and B are kept fixed (for instance by external
feeding of the system). This situation of having a fixed concentration of some chemical
is quite common in reaction-diffusion processes. As a result, there is no need to include
all the dynamics of such reagents in cellular automata or multiparticle models. It is
usually enough to create randomly a local population of these particles at each lattice
site.
Here we consider a two-dimensional multispecies, multiparticle model with alternating
reaction and diffusion steps. Instead of varying p0 in eq. 2.130, the diffusion
coefficient is adjusted by performing $\ell$ consecutive diffusion steps for a given species.
This technique amounts to introducing a different time step $\tau_m = \tau/\ell$ for this species
and yields $D = \lambda^2/(4\tau_m)$.
The instability of the homogeneous state leading to Turing structures can be under-
stood using the corresponding macroscopic rate equations [101] for the local average
densities x and y
$$
\begin{aligned}
\partial_t x &= k_1 a - k_2 x + k_3 x^2 y + D_x \nabla^2 x \\
\partial_t y &= k_4 b - k_3 x^2 y + D_y \nabla^2 y
\end{aligned}
\tag{2.134}
$$
where a and b represent the densities of particles A and B, respectively. A conventional
analysis shows that for some values of the parameters, a homogeneous stationary state
is unstable towards local density perturbations. Inhomogeneous patterns can evolve by
diffusion-driven instabilities provided that the diffusion constants Dx and Dy are not
the same. The region of the parameter space (a, b, Dy/Dx,...) for which homogeneous
states of the system are unstable is called the deterministic Turing space.
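The "conventional analysis" mentioned above is a linear stability analysis of the homogeneous steady state of eq. 2.134. The sketch below applies the standard textbook conditions for a diffusion-driven instability of a two-species system; the function name `schnakenberg_turing` and the parameter values in the example are assumptions chosen for illustration, not values used in the text.

```python
def schnakenberg_turing(a, b, k1, k2, k3, k4, Dx, Dy):
    """Return the homogeneous steady state (x, y) and whether it is Turing unstable."""
    # Steady state: adding the two rate equations gives k1 a + k4 b = k2 x
    x = (k1 * a + k4 * b) / k2
    y = k4 * b / (k3 * x * x)
    # Jacobian of the reaction terms at the steady state
    fx = -k2 + 2.0 * k3 * x * y
    fy = k3 * x * x
    gx = -2.0 * k3 * x * y
    gy = -k3 * x * x
    trace = fx + gy
    det = fx * gy - fy * gx
    # The state must be stable without diffusion ...
    stable_without_diffusion = trace < 0 and det > 0
    # ... and some finite-wavelength perturbation must grow when diffusion acts
    s = Dy * fx + Dx * gy
    diffusion_driven = s > 0 and s * s > 4.0 * Dx * Dy * det
    return (x, y), stable_without_diffusion and diffusion_driven

state, unstable = schnakenberg_turing(0.1, 0.9, 1.0, 1.0, 1.0, 1.0, Dx=1.0, Dy=10.0)
_, unstable_equal_D = schnakenberg_turing(0.1, 0.9, 1.0, 1.0, 1.0, 1.0, Dx=1.0, Dy=1.0)
```

The second call illustrates the statement in the text that no pattern can form when the two diffusion constants are equal.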
Figure 2.52 shows the configuration obtained in the long time regime with our mul-
tiparticle model and the corresponding rate equations 2.134. In both cases, a hexagonal
geometry is selected. The right panel corresponds to the solution of the rate equations,
while the left panel corresponds to the multiparticle simulation. As we can see, the two
pictures are quite similar. Although it is not clear that the multiparticle model (which brings
fluctuations into play) adds anything compared with the predictions of the mean-field
rate equations (which use less computer time), there are some indications [139] that the
Turing space may be enlarged when fluctuations are considered.
Figure 2.52: Turing patterns obtained in the Schnakenberg reaction in the long time
regime. (a) Multiparticle model and (b) mean-field rate equations.
2.6.5 A multiparticle fluid
In this section we show that the multiparticle method can also be used to model
hydrodynamic behavior. The key problem is to build the appropriate collision rule. Defining
a collision between an arbitrary number of particles which conserves mass and momentum
is not an easy task: particles are indivisible and fractions of them cannot be
distributed among the lattice directions to satisfy the conservation laws. Furthermore,
it is not possible to pre-compute all possible collisions (as we do in a cellular automa-
ton) because there are an infinite number of configurations. Thus, more sophisticated
algorithms should be devised which may slow down the computation of the collision
output.
We also would like to define a model in which the viscosity is an adjustable pa-
rameter. The approach we propose here is to develop a collision procedure which, on
average, obeys the lattice BGK equation for hydrodynamics (see section 2.4). Thus,
we write the evolution rule as
$$f_i(r + \tau v_i, t + \tau) = f_i(r, t) + F_i(f(r, t))$$
where the $f_i$ are integer variables ($f_i \in \{0, 1, 2, \dots, \infty\}$) describing the number of particles
entering site r at time t with velocity $v_i$. The quantity $F_i$ is the collision term. As
usual, the particle density ρ and velocity field u are defined as
$$\rho(r, t) = \sum_i f_i(r, t), \qquad \rho u(r, t) = \sum_i f_i(r, t) v_i$$
where index i runs over the lattice directions.
We now assume that the main effect of the interaction is to restore the local equi-
librium distribution 2.82 obtained in the LB formalism
$$ f_i^{(0)} = a\rho + \frac{b}{v^2}\,\rho\,\mathbf{v}_i \cdot \mathbf{u} + \rho e\,\frac{u^2}{v^2} + \rho\,\frac{h}{v^4}\,v_{i\alpha} v_{i\beta} u_\alpha u_\beta \tag{2.135} $$
Note that $f_i$ is an integer whereas $f_i^{(0)}$ is a real number. The parameters a, b, e and h
should be determined according to the geometry of the lattice, with the condition that
the Navier–Stokes equation describes the dynamics of the system, and that
$\rho(\mathbf{r}, t) = \sum_i f_i^{(0)}(\mathbf{r}, t)$ and $\rho\mathbf{u}(\mathbf{r}, t) = \sum_i f_i^{(0)}(\mathbf{r}, t)\,\mathbf{v}_i$.
We shall require that, as in the BGK situation, the relaxation to the local equilibrium
is governed by a parameter ξ. Thus, the number of particles $f'_i$ leaving (after
collision) a given site along direction i is
$$ f'_i = f_i + \frac{1}{\xi}\left(f_i^{(0)} - f_i\right) + \Delta f_i \tag{2.136} $$
where $\Delta f_i$ is a random quantity accounting for the fact that (after collision) the actual
particle distribution may depart from its ideal value.
In practice $f'_i$ is obtained as follows. Let $N = \sum_i f_i$ be the total number of particles
at the given site. We assign to each direction i a weight $w_i$ computed as
$$ w_i = \max\left(0,\; \frac{1}{\xi} f_i^{(0)} + \left(1 - \frac{1}{\xi}\right) f_i\right) $$
From these weights, we define $p_i$, the probability for a particle to leave the site along
direction i, as $p_i = w_i/M$, where $M = \sum_i w_i$ is a normalization constant.
To compute the collision output, we run through each of the N particles and place
them in direction i with probability $p_i$. This gives us a temporary particle distribution
$\tilde{f}_i$ which then must be corrected to obtain $f'_i$, in order to ensure exact momentum
conservation.
In our algorithm, $\tilde{f}_i$ is computed as
$$ \tilde{f}_i = \sum_{h=1}^{N} \left(s_{i-1} \le r_h < s_i\right) \tag{2.137} $$
where $(s_{i-1} \le r_h < s_i)$ is to be taken as a boolean value which is 1 when the condition
is true and zero otherwise. The quantities $s_i$ are defined by $s_i = \sum_{j=1}^{i} p_j$, $s_0 = 0$, and
$r_h$ is a random variable uniformly distributed in [0, 1[. It is then easy to check that
$(s_{i-1} \le r_h < s_i) = 1$ with probability $p_i$.
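Therefore, the expectation of $\tilde{f}_i$ is $\langle \tilde{f}_i \rangle = \sum_{h=1}^{N} p_i = N p_i$. If none of the $p_i$ is zero, we
have M = N and
$$ \langle \tilde{f}_i \rangle = \frac{1}{\xi} f_i^{(0)} + \left(1 - \frac{1}{\xi}\right) f_i \tag{2.138} $$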
Note that when N is large enough, equation 2.137 can be computed using a Gaussian
approximation, as explained for the reaction-diffusion multiparticle model.
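The sampling step described above (weights, probabilities and equation 2.137) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `outgoing_weights` and `redistribute`, the list-based layout of the $f_i$ and $f_i^{(0)}$, and the handling of the degenerate case where all weights vanish are choices made for this sketch, not part of the algorithm as published.

```python
import random

def outgoing_weights(f, f_eq, xi):
    """Weights w_i = max(0, f_eq_i / xi + (1 - 1/xi) * f_i)."""
    return [max(0.0, fe / xi + (1.0 - 1.0 / xi) * fi) for fi, fe in zip(f, f_eq)]

def redistribute(f, f_eq, xi, rng=random):
    """Temporary outgoing distribution for the N = sum(f) particles of one site."""
    n_particles = sum(f)
    w = outgoing_weights(f, f_eq, xi)
    total = sum(w)                          # normalization constant M
    if total == 0.0:
        return list(f)                      # assumption: leave the site unchanged
    # cumulative sums s_1, s_2, ... of the probabilities p_i = w_i / M
    s, acc = [], 0.0
    for wi in w:
        acc += wi / total
        s.append(acc)
    # last direction with a non-zero weight (protects against round-off in s)
    last = max(i for i, wi in enumerate(w) if wi > 0.0)
    f_tmp = [0] * len(f)
    for _ in range(n_particles):
        r = rng.random()                    # r_h, uniform in [0, 1)
        i = 0
        while i < last and r >= s[i]:       # find i with s_{i-1} <= r_h < s_i
            i += 1
        f_tmp[i] += 1
    return f_tmp
```

By construction the number of particles is conserved exactly, while momentum is conserved only on average, which is why the correction step of the text is still needed.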
While the distribution $\tilde{f}_i$ of outgoing particles obviously conserves the number of
particles, equation 2.138 shows that it conserves momentum only on average, and
some particles must be redirected to ensure exact conservation. The momentum tuning
is performed iteratively, according to the following steps
• At each site where momentum is not correctly given by $\sum_j \tilde{f}_j \mathbf{v}_j$, choose at random
one lattice direction i.
• If $\tilde{f}_i \neq 0$, move one particle randomly to an adjacent direction.
• Accept the change if it does not increase the momentum error. It is important to
accept modifications which do not improve the error because it may happen that
only a two-particle redirection decreases the error.
• Iterate this procedure until the outgoing particle distribution satisfies momentum
conservation $\sum_j f'_j \mathbf{v}_j = \sum_j f_j \mathbf{v}_j$.
From the way the particles are distributed, we expect that roughly $\sqrt{N}$ of them are
misplaced. This gives an estimate of the number of iterations necessary to re-adjust the
particle directions.
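The iterative momentum tuning listed above can be sketched as follows for a hexagonal lattice without rest particles. Several points are assumptions of this sketch rather than statements of the original method: the directions are written in an integer (skewed) basis so that momenta are exact integers, the momentum error is measured by the squared Euclidean length of the mismatch, and a cap `max_iter` is added because no convergence guarantee is claimed here. The names `HEX`, `momentum`, `error` and `fix_momentum` are hypothetical.

```python
import random

# Six hexagonal directions in an integer basis (a1, a2) with a 60 degree angle.
HEX = [(1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)]

def momentum(f):
    """Total momentum of the distribution f, in the integer basis."""
    return (sum(n * c[0] for n, c in zip(f, HEX)),
            sum(n * c[1] for n, c in zip(f, HEX)))

def error(f, target):
    """Squared Euclidean length of the momentum mismatch."""
    px, py = momentum(f)
    ex, ey = px - target[0], py - target[1]
    return ex * ex + ex * ey + ey * ey      # |x a1 + y a2|^2 for a 60 degree basis

def fix_momentum(f_tmp, target, rng=random, max_iter=10000):
    """Redirect particles one at a time until the momentum equals target."""
    f = list(f_tmp)
    err = error(f, target)
    for _ in range(max_iter):
        if err == 0:
            return f, True
        i = rng.randrange(6)                # random lattice direction
        if f[i] == 0:
            continue                        # no particle to move
        j = (i + rng.choice((-1, 1))) % 6   # adjacent direction
        f[i] -= 1
        f[j] += 1
        new_err = error(f, target)
        if new_err <= err:                  # accept moves that do not increase the error
            err = new_err
        else:                               # reject: undo the move
            f[i] += 1
            f[j] -= 1
    return f, err == 0
```

A typical call would pass the temporary distribution and the momentum of the incoming one, for instance `fix_momentum(f_tmp, momentum(f_in))`; the returned flag tells whether exact conservation was reached within the iteration cap.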
According to the above discussion, the quantity $\Delta f_i$ defined in equation 2.136
vanishes on average. This fact is confirmed numerically. Consequently, we write
$$ \langle f_i(\mathbf{r} + \tau \mathbf{v}_i, t + \tau) \rangle = \frac{1}{\xi} f_i^{(0)}(\mathbf{r}, t) + \left(1 - \frac{1}{\xi}\right) f_i(\mathbf{r}, t) $$
where we have used that $f_i(\mathbf{r} + \tau \mathbf{v}_i, t + \tau) = f'_i(\mathbf{r}, t)$, due to the definition of particle
motion.
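In the limit where the correlations between the $f_i$'s can be neglected (remember
that $f_i^{(0)}$ is a nonlinear function of all $f_j$'s) we may take the average of the above
equation and we obtain
$$ \langle f_i(\mathbf{r} + \tau \mathbf{v}_i, t + \tau) \rangle = \frac{1}{\xi} f_i^{(0)}(\langle \rho \rangle, \langle \rho\mathbf{u} \rangle) + \left(1 - \frac{1}{\xi}\right) \langle f_i \rangle \tag{2.139} $$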
Equation 2.139 is identical to the usual BGK microdynamics (see section 2.4), except
that now it approximates a multiparticle dynamics in which the $f_i$ are integer
quantities. Therefore, the standard multiscale Chapman-Enskog expansion [14] can
be applied exactly as in the BGK case and the same hydrodynamical behavior
emerges: equation 2.139 is equivalent to the Navier-Stokes equation with viscosity
$\nu = \tau v^2 (C_4/C_2)(\xi - 1/2)$, where $C_2$ and $C_4$ are model dependent (different in hexagonal,
square or cubic lattices) and are defined in equ. 2.80 and 2.81.
The present multiparticle scheme is intrinsically stable. No small fluctuation will be
amplified unphysically to make the arithmetic blow up, as happens with the LB model
when ν becomes too small. Any value of the relaxation parameter ξ can be considered
without numerical problems, but the physical limit of our model when ξ → 1/2 (or
ξ < 1/2) has not yet been explored.
We now present some applications of our multiparticle fluid, on a two-dimensional
hexagonal lattice and with a population of rest particles. Figure 2.53 shows the mea-
sured velocity profile in a simulation of a Poiseuille flow [71]. Fluid particles are
injected on the left side of a channel of length L and width W with a rightward ve-
locity. On the upper and lower channel limits, the usual no-slip condition is imposed,
[Plot: vertical position y (vertical axis) versus x-velocity (horizontal axis).]
Figure 2.53: Velocity profile in a multiparticle Poiseuille flow. The plot shows the
horizontal average velocity < ux(y) > as a function of y, the vertical position between
the upper and lower boundaries. The solid line corresponds to the best parabola fitting
the data.
[Plot: log(N(t)) versus log(t) for lattice sizes 32, 64, 128, 256, 512 and 1024.]
Figure 2.54: Decay laws for the ballistic annihilation simulations, using the multiparticle
lattice gas model. The various plots correspond to the lattice sizes indicated in
the box. The decay exponent x is given by the slopes of the lines which are all within
x = 0.875 ± 0.005, except for the smallest lattice.
by bouncing back incoming particles in order to produce a zero speed flow at the
boundary. We observe a parabolic velocity profile in agreement with the prediction of
hydrodynamics.
As a second example, we consider the ballistic annihilation problem A + A → 0,
where particles A evolve according to our multiparticle fluid rule. This is a variant
of the diffusive annihilation problem discussed in section 2.6.3: here a hydrodynamic
behavior is imposed on the particles instead of a diffusive motion.
When two particles meet at the same site with opposite velocities, they annihilate
each other. Thus, before the hydrodynamic collision takes place, our multiparticle
dynamics is supplemented by a reaction term which modifies the particle distributions
$f_i$ as $f_i \to \max(0, f_i - f_{i'})$, where $i$ and $i'$ correspond to opposite velocities ($\mathbf{v}_i = -\mathbf{v}_{i'}$).
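As an illustration of this reaction term, the sketch below applies $f_i \to \max(0, f_i - f_{i'})$ simultaneously to all directions of one site. It assumes that the list `f` holds only the moving populations of a lattice with an even number of directions ordered so that direction `i + len(f)//2` is opposite to `i` (for instance the six hexagonal directions); rest particles, which have no opposite direction, are left out. The function name `annihilate` is hypothetical.

```python
def annihilate(f):
    """Pairwise annihilation of particles with opposite velocities at one site."""
    n_dir = len(f)
    half = n_dir // 2
    # every new value is computed from the old populations (simultaneous update)
    return [max(0, f[i] - f[(i + half) % n_dir]) for i in range(n_dir)]
```

For example, `annihilate([3, 0, 2, 1, 5, 2])` gives `[2, 0, 0, 0, 5, 0]`: three opposite pairs disappear, so six particles are removed.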
We are interested in measuring the number N(t) of A particles left in the system as
time goes on. It is known [140,141,142,143] that this quantity obeys a power law
$N(t) \sim t^{-x}$. Molecular dynamics simulations [144] predict an exponent x between
0.86 and 0.89 depending on the size of the sample, in a two-dimensional system.
The simulation performed with the multiparticle model fully agrees with this prediction
since an exponent x = 0.87 ± 0.005 is found [145]. The simulation time required
to obtain this value is several orders of magnitude shorter than a full molecular
dynamics computation. The results for the decay law are summarized in figure 2.54.
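The exponent x is read off as the slope of log N(t) versus log t. A minimal sketch of such a fit, using an ordinary least-squares line, is given below; the function name `decay_exponent` is hypothetical and the way the published value was actually fitted is not stated in the text.

```python
import math

def decay_exponent(times, counts):
    """Least-squares slope of log N versus log t; returns x such that N ~ t**(-x)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(n) for n in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return -cov / var
```

Applied to synthetic data following an exact power law (not simulation results), the function returns the exponent used to generate them.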
The decay exponent x depends on the space dimension, as well as the velocity
distribution [142,143]. For one-dimensional systems with particles of velocity ±v, it
is found that x = 1/2. In two dimensions, the molecular dynamics simulations [144]
indicate that the velocity distribution tends to a Maxwellian, in the long time regime. It
is then interesting to note that our multiparticle model imposes from the very beginning
a discrete, truncated Maxwellian velocity distribution (equ. 2.135).
2.7 Wave model and fracture simulation
In the previous sections, the LB approach has been applied to hydrodynamic systems
and reaction diffusion processes. Here we show that it can also be used to define a
wave dynamics. This section presents the basic aspects of the model, as well as
some of its applications.
2.7.1 The wave model
Wave phenomena, whether mechanical or electromagnetic, derive from two conserved
quantities Ψ and J, together with time reversal invariance and a linear response of the
media. The quantity Ψ is a scalar field and J its associated current. For sound waves, Ψ
and J are respectively the density and the momentum variations. In electrodynamics,
Ψ is the energy density and J the Poynting vector [146].
The idea behind the LB approach is to “generalize” a physical process to a discrete
space and time universe, so that it can be efficiently simulated on a (parallel) com-
puter. For waves, this generalization is obtained by keeping the essential ingredients
of the real phenomenon, namely conservation of Ψ and J, linearity and time reversal
invariance. Thus, in a discrete space-time universe, a generic system leading to wave
propagation is obtained from the lattice BGK equation
$$ f_i(\mathbf{r} + \tau \mathbf{v}_i, t + \tau) - f_i(\mathbf{r}, t) = \frac{1}{\xi}\left(f_i^{(0)}(\mathbf{r}, t) - f_i(\mathbf{r}, t)\right) \tag{2.140} $$
by an appropriate choice of the local equilibrium distribution
$$ f_i^{(0)} = a\Psi + b\,\frac{\mathbf{v}_i \cdot \mathbf{J}}{v^2} \quad \text{if } i \neq 0, \qquad \text{and} \qquad f_0^{(0)} = a_0 \Psi \tag{2.141} $$
where v is the ratio of the lattice spacing λ to the time step τ, and Ψ and J are related
to the $f_i$'s in the standard way: $\Psi = \sum_i m_i f_i$ and $\mathbf{J} = \sum_i m_i f_i \mathbf{v}_i$. The quantities
$m_i$ are the weights associated with each lattice direction; their values depend on
the chosen lattice (here $m_0 = 1$ whatever the lattice is). Note that, here, we make no
restriction on the sign of the $f_i$'s, which may well be negative in order to represent a
wave.
As opposed to hydrodynamics [79], $f_i^{(0)}$ is a linear function of the conserved quantities,
which ensures the superposition principle. The parameters a, b and $a_0$ are computed
so that $\Psi = \sum_i m_i f_i^{(0)}$ and $\mathbf{J} = \sum_i m_i \mathbf{v}_i f_i^{(0)}$, which ensures conservation of Ψ
and J.
Following the same derivation as in section 2.4, we obtain
$$ a_0 + a C_0 = 1, \qquad b = \frac{1}{C_2} $$
where $C_0 = \sum_{i \ge 1} m_i$ and $\sum_{i \ge 1} m_i v_{i\alpha} v_{i\beta} = C_2 v^2 \delta_{\alpha\beta}$. For the two-dimensional square
lattice with rest particle (D2Q5), $m_i = 1$, $C_0 = 4$ and $C_2 = 2$.
Writing the momentum tensor $\Pi^{(0)}_{\alpha\beta} = \sum_i m_i v_{i\alpha} v_{i\beta} f_i^{(0)}$ as $\Pi^{(0)}_{\alpha\beta} = c_s^2 \Psi \delta_{\alpha\beta}$, we
obtain
$$ a = \frac{c_s^2}{v^2 C_2}, \qquad a_0 = 1 - \frac{c_s^2 C_0}{v^2 C_2} $$
where $c_s$ is a free parameter giving the wave propagation speed. This parameter can
be adjusted locally to model a medium with different refraction indices.
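We can now compute the macroscopic behavior of Ψ and J, using the procedure of
section 2.4. The main difference is that here, we do not have to neglect the higher orders
in J, since the dynamics is defined as linear. A straightforward calculation gives
$$ \partial_t \Psi + \partial_\beta J_\beta = 0 \tag{2.142} $$
$$ \partial_t J_\alpha + c_s^2 \partial_\alpha \Psi + \left(\xi - \frac{1}{2}\right)\left[\tau c_s^2\, \partial_\alpha\, \mathrm{div}\,\mathbf{J} - \frac{\tau}{C_2 v^2}\, T_{\alpha\beta\gamma\delta}\, \partial_\beta \partial_\gamma J_\delta\right] = 0 \tag{2.143} $$
where $T_{\alpha\beta\gamma\delta} = \sum_i v_{i\alpha} v_{i\beta} v_{i\gamma} v_{i\delta}$. Depending on the lattice, this fourth order tensor may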
not be isotropic. This is precisely the case of the D2Q5 lattice which is known for
giving anisotropic contributions to the hydrodynamic equations. However, this term
vanishes when ξ = 1/2. This is interesting since the condition ξ = 1/2 is required to
ensure time reversal invariance, as can be easily checked from eq. 2.140 with J → −J
and Ψ → Ψ in relation 2.141.
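Equations 2.142 and 2.143 can be combined (space derivative of the second substituted
in time derivative of the first). This yields
$$ \partial_t^2 \Psi - c_s^2 \nabla^2 \Psi = \left(\xi - \frac{1}{2}\right) \partial_\alpha \left[\tau c_s^2\, \partial_\alpha\, \mathrm{div}\,\mathbf{J} - \frac{\tau}{C_2 v^2}\, T_{\alpha\beta\gamma\delta}\, \partial_\beta \partial_\gamma J_\delta\right] $$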
With ξ = 1/2, we recover the wave equation
$$ \partial_t^2 \Psi - c_s^2 \nabla^2 \Psi = 0 \tag{2.144} $$
In hydrodynamic models, ξ = 1/2 corresponds to the limit of zero viscosity (see sec-
tion 2.4), which is numerically unstable. In our case, this instability does not show
up provided we use an appropriate lattice. In the D2Q5 lattice, our dynamics is also
unitary [147] which ensures that $\sum_i f_i^2$ is conserved. This extra condition prevents the
$f_i$'s from becoming arbitrarily large (with positive and negative signs, since Ψ is conserved).
This is no longer the case with the D2Q9 lattice, where numerical instabilities
develop for this wave dynamics.
Note that dissipation can be included in our microdynamics. Using ξ > 1/2 allows
us to describe waves with viscous-like dissipation. This makes sense with the hexago-
nal lattice D2Q7, where no stability problem occurs when ξ = 1/2 and no anisotropy
problem appears when the viscosity is non-zero (ξ > 1/2).
There is another (and simpler) way to include dissipation in this model, which is
suitable for the D2Q5 lattice and appropriate to our purpose of modeling fracture propagation
(see section 2.7.4): absorption on non-perfect transmitter sites can be obtained
by modifying the conservation of Ψ to $\sum_i m_i f_i^{(0)} = \mu\Psi$, where $0 \le \mu \le 1$ is an attenuation
factor. In this way, µ = 0 corresponds to perfect reflection (see equation 2.146),
µ = 1 to perfect transmission and 0 < µ < 1 describes a situation where the wave is
partially absorbed.
In equation 2.144, the propagation speed is given by $c_s^2 = a v^2 C_2$. For the stability
of the numerical scheme we must impose that $a_0 \ge 0$. This yields the largest possible
value of a and, thus, the maximum propagation speed of the model is
$$ c_{max}^2 = \frac{C_2}{C_0}\, v^2 $$
(note that v is the speed at which information travels). We define the refraction index
n (which may depend on the position) as
$$ n(\mathbf{r}) = \frac{c_{max}}{c_s(\mathbf{r})}, \qquad n \ge 1 $$
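The relations between $c_s$, $c_{max}$, n and the coefficients a, $a_0$ and b can be checked numerically. The sketch below uses only the formulas quoted in this section ($a = c_s^2/(v^2 C_2)$, $a_0 = 1 - aC_0$, $b = 1/C_2$, $c_{max}^2 = (C_2/C_0)v^2$); the function name `wave_coefficients`, the dictionary return value, the D2Q5 defaults and the small round-off tolerance are assumptions made for illustration.

```python
import math

def wave_coefficients(c_s, v=1.0, C0=4.0, C2=2.0):
    """Coefficients of the local equilibrium for a wave speed c_s (D2Q5 by default)."""
    c_max = v * math.sqrt(C2 / C0)          # largest admissible propagation speed
    a = c_s ** 2 / (v ** 2 * C2)
    a0 = 1.0 - a * C0
    if a0 < -1e-12:                         # tolerance for round-off only
        raise ValueError("c_s exceeds c_max: a0 would be negative")
    return {"a": a, "a0": max(a0, 0.0), "b": 1.0 / C2,
            "c_max": c_max, "n": c_max / c_s}
```

With $c_s = c_{max}/2$ on the D2Q5 lattice this gives n = 2, a = 1/16 and $a_0$ = 3/4, in agreement with writing a as $1/(C_0 n^2)$ and $a_0$ as $1 - 1/n^2$.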
From these results, we may rewrite a and $a_0$ as
$$ a = \frac{1}{C_0 n^2}, \qquad a_0 = 1 - \frac{1}{n^2} $$
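and equation 2.140 reads
$$ f_i(\mathbf{r} + \tau \mathbf{v}_i, t + \tau) = \frac{\mu}{\xi}\,\frac{1}{C_0 n^2}\,\Psi + \frac{1}{\xi C_2 v^2} \sum_j \mathbf{v}_i \cdot \mathbf{v}_j\, m_j f_j - \left(\frac{1}{\xi} - 1\right) f_i(\mathbf{r}, t) $$
$$ f_0(\mathbf{r}, t + \tau) = \frac{\mu}{\xi}\,\frac{n^2 - 1}{n^2}\,\Psi - \left(\frac{1}{\xi} - 1\right) f_0(\mathbf{r}, t) \tag{2.145} $$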
Figure 2.55: Simulation with the LB wave model: focusing of light by a convex lens
where the propagation speed is smaller than in vacuum (left). Focusing by a parabolic
mirror (right).
where µ is the dissipation factor.
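For ξ = 1/2 and a d-dimensional cartesian lattice, we have $m_i = 1$, $C_2 = 2$,
$C_0 = 2d$ and the above equations reduce to
$$ f_i(\mathbf{r} + \tau \mathbf{v}_i, t + \tau) = \frac{\mu}{d n^2}\,\Psi - f_{i'}(\mathbf{r}, t) $$
$$ f_0(\mathbf{r}, t + \tau) = 2\mu\,\frac{n^2 - 1}{n^2}\,\Psi - f_0(\mathbf{r}, t) \tag{2.146} $$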
where $i'$ is defined as the direction opposite to $i$, i.e. that having $\mathbf{v}_{i'} = -\mathbf{v}_i$. When
µ = 0, the microdynamics becomes $f_i(\mathbf{r} + \tau \mathbf{v}_i, t + \tau) = -f_{i'}(\mathbf{r}, t)$. This corresponds
to a perfect reflection on a mirror site, that is, the fluxes bounce back to where they came
from with a change of sign. This is a way to define a boundary condition by tuning the
parameter µ on some selected sites.
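To make the update rule 2.146 concrete, here is a minimal pure-Python sketch of one time step on a two-dimensional D2Q5 lattice (d = 2). The periodic boundaries, the nested-list storage `f[k][x][y]`, the direction numbering (0 rest, 1 east, 2 north, 3 west, 4 south) and the name `wave_step` are all assumptions of the sketch; a production code would use arrays and a different data layout.

```python
# D2Q5 velocities and, for each direction, the index of the opposite one.
VEL = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1)]
OPP = [0, 3, 4, 1, 2]

def wave_step(f, mu, n):
    """One step of eq. 2.146 on a periodic L x L lattice.

    f[k][x][y] is field k at site (x, y); mu[x][y] and n[x][y] are the
    attenuation factor and refraction index of each site.
    """
    L = len(f[0])
    new = [[[0.0] * L for _ in range(L)] for _ in range(5)]
    for x in range(L):
        for y in range(L):
            psi = sum(f[k][x][y] for k in range(5))     # Psi with all m_i = 1
            m, n2 = mu[x][y], n[x][y] ** 2
            # rest field stays on the site
            new[0][x][y] = 2.0 * m * (n2 - 1.0) / n2 * psi - f[0][x][y]
            for k in range(1, 5):
                out = m * psi / (2.0 * n2) - f[OPP[k]][x][y]
                xn = (x + VEL[k][0]) % L                # propagate to the neighbor
                yn = (y + VEL[k][1]) % L
                new[k][xn][yn] = out
    return new
```

With µ = 1 and n = 1 a unit flux entering a site along direction 1 is scattered into 1/2, 1/2, 1/2 and -1/2, which leaves both the sum of the fields and the sum of their squares unchanged; with µ = 0 the flux is sent back with its sign reversed.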
Since equation 2.146 is linear, it can also be expressed using a matrix formulation
$f_i(\mathbf{r} + \tau \mathbf{v}_i, t + \tau) = \sum_j W_{ij} f_j(\mathbf{r}, t)$. However, from the point of view of a numerical
implementation, equ. 2.146 implies less computation.
Figure 2.55 (left) shows a simulation (D2Q5) of equation 2.146 in a situation where
two media are present. A plane wave is produced in medium M1 by forcing a sine
oscillation for the fi’s on some vertical line. The wave propagates at speed c0 till it
penetrates in medium M2 which has the shape of a convex lens. There, propagation
speed is set to c < c0. The shape of the lens naturally produces a focusing of the
energy when the wave re-enters medium M1. In this simulation, µ = 1.
An example of a wave reflected on a parabolic mirror is shown in figure 2.55
(right). Each lattice site in the black region is a perfect reflector with µ = 0. As a
result of the collective effect of these mirror sites, we observe that the incoming plane
wave concentrates at the focal point of the parabola.
A natural interpretation of our LB wave model is to assume that the fi’s repre-
sent some physical fields (a local deformation or deviation from an equilibrium state).
These fields propagate on the lattice and are scattered when reaching a site as illus-
trated in figure 2.56.
The idea of expressing wave propagation as a discrete formulation of the Huygens
principle has been considered by several authors [148,149,150,151]. Not surprisingly,
[Diagram, panels (a) and (b): a D2Q5 lattice site with its fields f0, f1, f2, f3, f4 and velocities v1, v2, v3, v4.]
Figure 2.56: Scattering of an incoming flux f1 = 1 at a D2Q5 lattice site, according
to equation 2.146.
the resulting numerical schemes bear a strong similarity to ours. Nevertheless the con-
text of these studies is different from ours and none have noticed the existing link with
the lattice BGK approach. Models of refs. [150,151] use a reduced set of conserved
quantities, which may not be appropriate in our case. Other models [152,153] con-
sider wave propagation in a LB approach, but with a significantly more complicated
microdynamics and a different purpose.
2.7.2 Application to mobile communications
The above LB wave model can be used to compute the wave intensity pattern in a
system with complicated boundary conditions. Here we consider the problem of pre-
dicting the intensity of a wave propagating in a city. This application is relevant to the
field of cellular phone and mobile communication devices.
An efficient planning of the deployment of wireless communication networks is
based on accurate predictions of radio-wave propagation in urban environments. Ra-
dio waves are absorbed, reflected, diffracted and scattered in a complicated way on
the buildings and this constitutes a difficult propagation problem which is studied by
various authors [154,155,156] and is beyond analytical calculation. Yet, the coverage
region of an antenna is a crucial question because the base stations must be placed
in appropriate locations so that a complete coverage is guaranteed with a minimum
number of cells, each of them no larger than what is allowed by traffic or propagation
requirements.
The LB model presented in the previous section (with n = 1) produces fast and
accurate predictions of the wave propagation in urban environment [157]. The proce-
dure starts by discretizing the building layout by, for example, scanning a city map.
Depending on the nature of each pixel (building or not), a different set of coefficients
is defined for the microdynamics of the fi. The value is chosen appropriately after
comparison with real measurements performed by Swisscom. A source wave of wavelength
Λ is simulated at site r by imposing the value A(r) sin(2πt/T) for the fi(r, t), where
T = Λ/c is the period and A(r) some chosen amplitude.
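The preprocessing described in this paragraph can be sketched as follows. Everything specific here is hypothetical: the character `#` used to mark building pixels, the default values of the attenuation factor µ on streets and buildings (the text only says that the coefficients are calibrated against measurements), and the function names `mu_map_from_layout` and `source_value`. The source period is taken as the wavelength divided by the propagation speed.

```python
import math

def mu_map_from_layout(rows, mu_street=1.0, mu_building=0.0):
    """Turn a pixel map ('#' = building, anything else = free space) into a mu map."""
    return [[mu_building if ch == "#" else mu_street for ch in row] for row in rows]

def source_value(t, amplitude, wavelength, c):
    """Value A sin(2 pi t / T) imposed on the f_i at the source, with T = wavelength / c."""
    period = wavelength / c
    return amplitude * math.sin(2.0 * math.pi * t / period)
```

Choosing `mu_building=0.0` corresponds to perfectly reflecting walls, while a value between 0 and 1 would describe walls that partially absorb the wave.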
The simulation then consists of a synchronized updating of each site, according
to the LB microdynamics until a steady state of the signal intensity (defined as the
[Plots: pathloss [dB] versus distance d [m] on Breitfeld street, upper and lower graphs.]
Figure 2.57: LB simulation of wave propagation in the city of Bern on a square lattice
of size 512×512. The white blocks represent the buildings, the gray levels indicate the
simulated intensity of the wave (decreasing from white to black) and the dot marks the
position of the source. The plots show the measured and computed intensity along the
street which is indicated by the dotted white line. Two types of boundary conditions
were applied for the sites limiting the buildings in the discretized layout: reflecting
walls in the upper graph and permeable walls in the lower graph.
amplitude of Ψ) is reached. A re-normalization scheme Ψ' = R(δ, Λ/Λ0)Ψ must
then be applied in order to account for the three-dimensional geometry of the real propagation
problem, and for the possibly wrong wavelength Λ chosen for numerical reasons
(the wavelength must be large compared to the lattice spacing). In the function R, the
quantity δ is the distance to the source depending on the layout (see [157]) and Λ0 is
the real wavelength concerned by the prediction.
Figure 2.57 shows a typical simulation of the wave intensity pattern produced by a
transmitter located in an urban area. The predictions of the LB model and the renormalization
procedure are in good agreement with the corresponding real measurements
performed by Swisscom in the real environment.
2.7.3 Modeling Solid Body
Whereas LB methods have been largely used to simulate systems of point particles
which interact locally, modeling a solid body with this approach (i.e. modeling an
object made of many particles that maintains its shape and coherence over distances
much larger than the interparticle spacing) has remained mostly unexplored. A successful
attempt to model a one-dimensional solid as a cellular automaton is described
in [158]. The crucial ingredient of this model is the fact that collective motion is
achieved because the “atoms” making up the solid vibrate in a coherent way and pro-
duce an overall displacement. This vibration propagates as a wave throughout the solid
and reflects at the boundary.
A 2D solid-body can be thought of as a square lattice of particles linked to their
nearest neighbors with a spring-like interaction. Generalizing the model given in [158]
requires us to consider this solid as made up of two sublattices. We term them black
and white, by analogy to the checkerboard decomposition. The dynamics consists in
moving the black particles as a function of the positions of their white, motionless
neighbors, and vice-versa, at every other step.
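Let us denote the location of a black particle by $\mathbf{r}_{i,j} = (x_{i,j}, y_{i,j})$. The surrounding
white particles will be at positions $\mathbf{r}_{i-1,j}$, $\mathbf{r}_{i+1,j}$, $\mathbf{r}_{i,j-1}$ and $\mathbf{r}_{i,j+1}$. We define the
separation to the central black particle as (see figure 2.58)
$$ \begin{aligned}
\mathbf{f}_1(i, j, t) &= \mathbf{r}_{i,j}(t) - (\mathbf{r}_{i-1,j}(t) + \mathbf{h}) \\
\mathbf{f}_2(i, j, t) &= \mathbf{r}_{i,j}(t) - (\mathbf{r}_{i,j-1}(t) + \mathbf{u}) \\
\mathbf{f}_3(i, j, t) &= \mathbf{r}_{i,j}(t) - (\mathbf{r}_{i+1,j}(t) - \mathbf{h}) \\
\mathbf{f}_4(i, j, t) &= \mathbf{r}_{i,j}(t) - (\mathbf{r}_{i,j+1}(t) - \mathbf{u})
\end{aligned} \tag{2.147} $$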
where the fi are now vector quantities, and h = (r0, 0) and u = (0, r0) can be thought
of as representing the equilibrium lengths of the horizontal and vertical springs connecting
adjacent particles. With this formulation, the coupling between adjacent particles
is not given by the Euclidean distance but is decoupled along each coordinate axis
(however, a deformation along the x-direction will propagate along the y-direction
and conversely). This method makes it possible to work with a square lattice, which
is usually not taken into account when describing deformation in a solid because, with
the Euclidean distance, the y-axis can be tilted by an angle α without applying any
force. The breaking of the rotational invariance is expected not to play a role in the
fracture process we shall consider below.
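The locations $\mathbf{r}_{ij}(t + 1)$ of the black particles are obtained by updating the corresponding
$\mathbf{f}_i$'s by equ. 2.146, with n = 1 and for i > 0. Next, the quantities $\mathbf{f}_i$ are
interpreted as the deformations seen by the white particles,
$$ \begin{aligned}
\mathbf{f}_1(t + 1) &= \mathbf{r}_{i+1,j} - (\mathbf{r}_{ij}(t + 1) + \mathbf{h}) \\
\mathbf{f}_2(t + 1) &= \mathbf{r}_{i,j+1} - (\mathbf{r}_{ij}(t + 1) + \mathbf{u}) \\
\mathbf{f}_3(t + 1) &= \mathbf{r}_{i-1,j} - (\mathbf{r}_{ij}(t + 1) - \mathbf{h}) \\
\mathbf{f}_4(t + 1) &= \mathbf{r}_{i,j-1} - (\mathbf{r}_{ij}(t + 1) - \mathbf{u})
\end{aligned} \tag{2.148} $$
Then, the same procedure can be re-applied to move the white particles.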
It turns out that equ. 2.146 (with n = 1 and i ≠ 0) is equivalent to moving the
particles to a symmetric location with respect to
$\frac{1}{4}\left[\mathbf{r}_{i-1,j} + \mathbf{h} + \mathbf{r}_{i+1,j} - \mathbf{h} + \mathbf{r}_{i,j-1} + \mathbf{u} + \mathbf{r}_{i,j+1} - \mathbf{u}\right]$
(i.e. the center of mass of the neighbors, as shown in figure 2.58).
[Diagram: the black particle and its four white neighbors at time t and at time t + 1, with the deformations f1, f2, f3 and f4.]
Figure 2.58: Illustration of the way the fis are defined. The cross indicates the location
of the geometrical center of mass of the four white particles. At the next iteration, the
black particle jumps to a symmetrical position with respect to this point.
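Indeed, in this case the new location of the particle will be
$$ \mathbf{r}_{i,j}(t + 1) = \mathbf{r}_{ij} + 2(\mathbf{r}_{CM} - \mathbf{r}_{ij}) = \mathbf{r}_{ij} - \frac{1}{2}\left(\mathbf{f}_1 + \mathbf{f}_2 + \mathbf{f}_3 + \mathbf{f}_4\right) \tag{2.149} $$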
If this expression is substituted into equ. 2.148, it is easy to check that, for instance,
$$ \mathbf{f}_1(t + 1) = \frac{1}{2}\left(\mathbf{f}_1 + \mathbf{f}_2 + \mathbf{f}_3 + \mathbf{f}_4\right) - \mathbf{f}_3 \tag{2.150} $$
and similarly for f2(t + 1), f3(t + 1) and f4(t + 1). This shows that the dynamics given
in equation 2.150 is identical to the LB wave model described in relation 2.146 for
n = 1 and f0 = 0.
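This equivalence can be verified numerically for a single particle. The sketch below computes the deformations of equ. 2.147, moves the particle by reflection through the center of mass, recomputes the deformations seen by the neighbors as in equ. 2.148 and compares them with the LB rule $\mathbf{f}_i \to \frac{1}{2}\sum_k \mathbf{f}_k - \mathbf{f}_{i'}$ (equ. 2.146 with d = 2, n = 1, µ = 1). Positions are plain 2D tuples and all function names are hypothetical.

```python
def deformations(r, west, south, east, north, r0=1.0):
    """f_1..f_4 of eq. 2.147 for a particle at r and its four neighbors."""
    return [(r[0] - (west[0] + r0), r[1] - west[1]),
            (r[0] - south[0], r[1] - (south[1] + r0)),
            (r[0] - (east[0] - r0), r[1] - east[1]),
            (r[0] - north[0], r[1] - (north[1] - r0))]

def move_particle(r, f):
    """New position r - (1/2)(f_1 + f_2 + f_3 + f_4), i.e. reflection through r_CM."""
    sx = sum(fi[0] for fi in f)
    sy = sum(fi[1] for fi in f)
    return (r[0] - 0.5 * sx, r[1] - 0.5 * sy)

def deformations_after(r_new, west, south, east, north, r0=1.0):
    """f_1..f_4 of eq. 2.148, as seen by the neighbors after the move."""
    return [(east[0] - (r_new[0] + r0), east[1] - r_new[1]),
            (north[0] - r_new[0], north[1] - (r_new[1] + r0)),
            (west[0] - (r_new[0] - r0), west[1] - r_new[1]),
            (south[0] - r_new[0], south[1] - (r_new[1] - r0))]

def lb_update(f):
    """LB wave rule: f_i -> (1/2) * sum_k f_k - f_opposite(i), with f_0 = 0."""
    sx = sum(fi[0] for fi in f)
    sy = sum(fi[1] for fi in f)
    return [(0.5 * sx - f[(k + 2) % 4][0], 0.5 * sy - f[(k + 2) % 4][1])
            for k in range(4)]
```

For any choice of positions the two routes give the same new deformations, up to floating-point round-off.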
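The momentum $\mathbf{p}_{ij}$ associated with the motion of particle (i, j) is then
$$ \mathbf{p}_{ij} \equiv \mathbf{r}_{ij}(t + 1) - \mathbf{r}_{ij}(t) = -\frac{1}{2}\left(\mathbf{f}_1 + \mathbf{f}_2 + \mathbf{f}_3 + \mathbf{f}_4\right) $$
which is, up to the factor −1/2, the conserved quantity Ψ introduced in the LB wave model.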
At the boundary of the domain a different rule of motion has to be considered
since the particles may have fewer than four links. With the interpretation of the rule as
a symmetrical motion with respect to
$$ \mathbf{r}_{CM} = \frac{(\mathbf{r}_{i-1,j} + \mathbf{h})\,n_e + (\mathbf{r}_{i+1,j} - \mathbf{h})\,n_w + (\mathbf{r}_{i,j-1} + \mathbf{u})\,n_s + (\mathbf{r}_{i,j+1} - \mathbf{u})\,n_n}{n_e + n_w + n_s + n_n} $$
where ne, nw, ns and nn are Boolean variables indicating the presence or absence of
a neighbor along the east, west, south and north directions, the evolution rule can be
written down for particles missing some of their links, either because they are at the
boundary of the domain or because some links are broken, as described below.
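A minimal sketch of this generalized rule is given below. The four "anchor" points are the shifted neighbor positions appearing in the numerator of the formula for $\mathbf{r}_{CM}$, and `present` holds the corresponding Boolean variables. The function names and the treatment of a particle with no remaining link (it is simply left where it is) are assumptions of the sketch; the text does not specify that case.

```python
def center_of_mass(anchors, present):
    """Average of the anchor points whose link is present; None if no link is left."""
    count = sum(1 for p in present if p)
    if count == 0:
        return None
    x = sum(a[0] for a, p in zip(anchors, present) if p) / count
    y = sum(a[1] for a, p in zip(anchors, present) if p) / count
    return (x, y)

def reflect(r, cm):
    """Move r to the symmetric position with respect to cm."""
    if cm is None:
        return r                    # assumption: an isolated particle does not move
    return (2.0 * cm[0] - r[0], 2.0 * cm[1] - r[1])
```

With all four links present this reduces to the bulk rule; removing links changes only which anchor points enter the average.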
2.7.4 Fracture
An interesting application of our LB solid body model is the study of a fracture process.
How things break is still an important open problem in science: a complete theory is lacking
and no satisfactory understanding has yet been achieved [159].
Figure 2.59: Fracture simulation obtained in a LB solid with 128 × 128 atoms when
applying an opposite force on both sides of the sample.
The key idea when using our approach as a model of dynamic crack is to assume
that a bond linking two adjacent atoms may break if the local deformation exceeds
some given threshold. This threshold can possibly be different for each bond and
spatial disorder can be introduced in this way. Once a bond is broken, the atoms on
each side of the crack behave as free ends. A broken link weakens the material because
a local deformation can no longer be distributed uniformly among the four neighbors.
Usually, the next bond to break is a nearest neighbor of an already broken bond.
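The breaking rule can be sketched as a simple test applied to each bond of a site. How the "local deformation" is measured is not specified in the text; the sketch assumes it is the Euclidean length of the deformation vector $\mathbf{f}_i$ carried by the bond. The function name `break_bonds` and the per-bond threshold list are likewise hypothetical.

```python
def break_bonds(f, intact, thresholds):
    """Return the new state of each bond: it stays intact only if it was intact
    and its deformation length does not exceed its own threshold."""
    return [ok and (fx * fx + fy * fy) ** 0.5 <= th
            for (fx, fy), ok, th in zip(f, intact, thresholds)]
```

Giving each bond its own threshold is what allows spatial disorder to be introduced, and a bond that is already broken never heals in this sketch.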
A typical experiment which is performed when studying fracture formation is to
apply a stress by pulling the left and right extremities of a solid sample in opposite directions.
A small notch (artificially broken links) is made in the middle of the sample to favor
the appearance of the fracture at this position. Once a given strain is reached, a crack
forms and propagates from that notch through the bulk, breaking the system into two
or more pieces. The fracture is perpendicular to the direction of the stress. This
situation is illustrated in figure 2.59 where each dot shows the position of an atom.
The shape of the fracture we obtain is qualitatively similar to what is observed in
real experiments [159]. Several situations can be reproduced, depending on the value
of the model parameters. It is found that adding some attenuation in the motion (i.e.
having µ < 1) yields fractures with less branching. Figure 2.60 shows some of the
simulation results. In panel (b) no damping of the wave is included while in (a) a
damping factor µ = 0.92 is added. Panels (c) and (d) have less disorder than (a) and
(b) in the sense that the breaking threshold varies weakly over space. The damping in
(d) is µ = 0.91, slightly stronger than in (c) (µ = 0.92). The stretching rate (i.e. the
displacement of the solid boundary at each time step) is the same for all experiments.
In the above simulations, once the fracture starts propagating, the external stress is
turned off.
We have measured the propagation speed of the fracture by recording the location
of the crack tip l(t) for each time step. In case of branching we consider the most advanced
crack. Figure 2.61 shows the average velocity v(t) = l(t)/t of the propagating
fracture as a function of time. These measurements made from our simulation are in
qualitative agreement with experimental data. In particular the crack speed is slower
[Panels (a), (b), (c) and (d).]
Figure 2.60: Fracture (top) and the corresponding map of the broken bonds (bottom)
for several runs with different parameters.
[Plot: (crack speed)/c0, from 0 to 1, versus time, from 0 to 1000.]
Figure 2.61: Crack propagation speed measured in the LB fracture simulation. The
upper and lower curves correspond to the fractures shown in figure 2.60 (b) and (d),
respectively.
than the speed of sound (which is here c0 = 1/√2 in lattice units) and it is faster when
the fracture is complex.
2.7.5 Wave localization
In this section, we consider another problem for which the LB wave model is useful:
propagation in disordered media. Our purpose is not to derive new physical properties
but, rather, to show how the LB approach can be easily applied to study a difficult
problem such as wave localization. Note also that this type of approach has been
considered for this problem by other authors [150].
A coherent, but by no means complete understanding of the problem of waves in
disordered media has only emerged recently [160]. Disordered media means here that
the waves are supposed to undergo multiple scattering and the problem is quite differ-
ent from the case of propagation in an urban medium, as considered in section 2.7.2.
The two problems are not logically separated but the treatment of radio-wave propagation
in urban areas involves obstacles with typical sizes much larger than the wavelength.
Thus, it is rather a diffraction problem, involving departures from geometrical
optics caused by the finite wavelength of the waves and actual scattering analysis en-
ters the game only once the small scale roughness of the buildings or the corners are
taken into account.
For the scattering of waves by systems whose characteristic sizes are small com-
pared to the wavelength, it is convenient to think of the incident fields as inducing a
response that oscillates in definite phase relationship with the incident wave and radiates
energy in directions other than the direction of incidence. If the medium contains
such small scatterers distributed at random, the picture of multiply scattered waves is
very different from that we normally associate with waves. Although the medium
is purely elastic, the wave can have a diffusive-like behavior or become localized,
showing no more spatial periodicity or possibility for transport.
Our wave propagation model is particularly well adapted to investigate numeri-
cally wave propagation in random media beyond what is analytically possible. Here
we consider a two-dimensional medium with two different refraction indices n: the
background sites have a value of n0 = 1, whereas the randomly distributed scatterers
have n1 > 1.
Different media may be designed. For instance we could choose a different value
of n for each scatterer, or even a different value of n for each lattice site. Figure 2.62
shows the typical pattern of energy issued from a point source located in a random
medium composed of 2% of scatterers. Note that the dynamics of our model is time-
reversal invariant and that the new propagation pattern we observe is not due to some
form of dissipation.
The pattern shows large fluctuations and further analysis or comparison with classical
diffusion involves an averaging over different spatial configurations. To avoid the
excess of computation generated by an averaging process over successive configurations,
we consider a two-dimensional system with a one-dimensional symmetry. The
Figure 2.62: Snapshot of energy propagation pattern in a random medium composed
of a background of refraction index n0 = 1 and containing 2% of randomly distributed
scatterers (black dots), all with n1 = 10. The source is placed at the center of the sam-
ple and oscillates with a period of 16 time steps. The pattern shows large fluctuations
and the diffusive, or sub-diffusive, behavior only emerges after an averaging over the
configurations.
[Plot: Π versus Time [iterations], logarithmic axes. Curves: Pure Wave Propagation; 10 % of impurities; 1 % of impurities; Pure diffusion.]
Figure 2.63: Transition from wave to diffusive transport in a 1-dimensional
geometry. The strip-like domain size is 4096 × 64 and the refraction index ratio of
background sites over impurities is 1/10. The square root of the second moment of
the energy distribution, Π, is plotted as a function of time. Between the two extreme cases,
homogeneous medium and pure diffusion, we observe a smooth transition for random
media (1% and 10%) from the wave regime Π ∝ t to the diffusive regime Π ∝ t^{1/2}.
averaging is achieved by a reduction over the “irrelevant” dimension. We consider the
propagation of the energy issued from a “line-pulse” in a two-dimensional long strip-
like medium (typically of size 4096 × 64). The line source is placed in the middle and
radiates synchronously two oscillations of a wave with a given period T = 6. Two free
parameters determine the medium: ρ the density of randomly distributed scatterers and
n1 the scattering strength, or refraction index of the scatterers.
In order to extract the propagation properties of a wave traveling in such disordered
media, we study the average behavior of the square root of the second moment of the
energy distribution,
$$ \Pi(t) = \sqrt{\int A^2(r, t)\, r^2\, d^2r} $$
where r is the distance to the source and A the amplitude of the wave at position r.
Thus Π is .
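On a lattice the quantity Π can be evaluated by replacing the integral with a sum over sites. The sketch below is one possible way to do it, not the code used for the figures: the function name `energy_spread` and the point-source convention are assumptions, and r is taken as the Euclidean distance to the source site. For the strip geometry with a line source, r would instead be the distance along the long axis.

```python
import numpy as np

def energy_spread(amplitude, source):
    """Square root of the second moment of the energy A^2 about `source`.

    amplitude : 2D array of wave amplitudes A at one time step
    source    : (row, column) index of the source site
    """
    rows, cols = np.indices(amplitude.shape)
    r2 = (rows - source[0]) ** 2 + (cols - source[1]) ** 2
    return np.sqrt(np.sum(amplitude ** 2 * r2))

# Toy check: unit amplitude on a single site at distance 5 from the source.
field = np.zeros((11, 11))
field[5 + 3, 5 + 4] = 1.0
spread = energy_spread(field, source=(5, 5))
```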
The results are shown in figure 2.63. It can be seen that for the homogeneous
medium, a pure wave propagation is characterized, as expected, by Π(t) ∝ t. For
random non-dissipative media the dynamics switches to a behavior given by Π(t) ∝
√t which is typical of a diffusive transport regime. The cross-over is smooth and
happens earlier in case of increasing disorder, or increasing scattering strength.
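The two regimes can be told apart by the slope of Π(t) on a log-log plot. The following sketch fits that slope by least squares; the helper name `scaling_exponent` and the synthetic input curves are assumptions for illustration only and are not simulation data.

```python
import numpy as np

def scaling_exponent(t, pi):
    """Least-squares slope of log(pi) versus log(t), i.e. alpha in pi ~ t**alpha."""
    slope, _intercept = np.polyfit(np.log(t), np.log(pi), 1)
    return slope

# Synthetic curves standing in for measured Pi(t).
t = np.arange(1, 1001, dtype=float)
alpha_wave = scaling_exponent(t, 2.0 * t)                # ballistic: Pi ~ t
alpha_diffusive = scaling_exponent(t, 3.0 * np.sqrt(t))  # diffusive: Pi ~ t**0.5
```

On real data the fit should be restricted to times beyond the cross-over, since the early-time wave regime would otherwise bias the slope.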
For the sake of comparison we also show the behavior of Π(t) in the case of true
diffusion, with the model discussed in section 2.5.3. However, for the diffusion case,
the definition of Π must be changed: instead of the “energy” A^2 we take the local field
value Ψ = \sum_i f_i. The good agreement between classical diffusion and wave diffusion (or weak
localization) is shown in figure 2.63.
[Plot: Π versus Time [iterations], logarithmic axes. Curve: Strong localization: 20% (n1 = 31).]
Figure 2.64: Sub-diffusive (or localized) behavior obtained with a stronger disorder:
n1 = 30 and an impurity density of 20%. We have Π ∝ t^α with α ≈ 0.24.
Strong localization is presented in [160] as a tendency for the diffusion coefficient
to fall towards 0. When the amount of disorder is significantly increased, the measured
quantity Π behaves as Π ∝ t^α with α < 1/2, as illustrated in
figure 2.64.
It is interesting to note that, when the refraction index n1 is large, the scatterers
behave as energy-conserving reflectors (i.e. with µ = 0 and n = 1 in equation 2.146).
Thus, each site has µ = 1 with probability 1 − ρ and µ = 0 with probability ρ. If
the averaging process over propagation patterns is replaced by a spatial averaging of
the disorder (i.e. the averaging is done before propagation is simulated), we find
that the strong localization case behaves as propagation in an absorbing medium with
µ = 1 − ρ.
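The value µ = 1 − ρ quoted above is simply the site average of µ. A short check, assuming the sites are independent and using the numerical example ρ = 0.2 of figure 2.64:

```latex
\langle \mu \rangle = (1-\rho)\cdot 1 + \rho\cdot 0 = 1-\rho ,
\qquad
\rho = 0.2 \;\Rightarrow\; \langle \mu \rangle = 0.8 .
```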
2.8
Bibliography
[1] R. Livi, S. Ruffo, S. Ciliberto, and M. Buiatti, editors. Chaos and Complexity.
World Scientific, 1988.
[2] I.S.I and R. Monaco, editors. Discrete Kinetic Theory, Lattice Gas Dynamics
and Foundations of Hydrodynamics. World Scientific, 1989.
[3] G. Doolen, editor. Lattice Gas Method for Partial Differential Equations.
Addison-Wesley, 1990.
[4] P. Manneville, N. Boccara, G.Y. Vichniac, and R. Bideau, editors. Cellular
Automata and Modeling of Complex Physical Systems. Springer Verlag, 1989.
Proceedings in Physics 46.
[5] A. Pires, D.P. Landau, and H. Herrmann, editors. Computational Physics and
Cellular Automata. World Scientific, 1990.
[6] J.M. Perdang and A. Lejeune, editors. Cellular Automata: Prospect in Astro-
physical Applications. World Scientific, 1993.
[7] Minnesota IMA cellular automata bibliography.
http://www.ima.umn.edu/bibtex/ca.bib.
[8] Santa-fe cellular automata bibliography.
ftp://alife.santafe.edu/pub/topics/cas/ca-faq.bib.
[9] D. Farmer, T. Toffoli, and S. Wolfram, editors. Cellular Automata, Proceedings
of an Interdisciplinary Workshop, Los Alamos, volume 10. Physica D, North-
Holland, 1984.
[10] T. Toffoli and N. Margolus. Cellular Automata Machines: a New Environment
for Modeling. The MIT Press, 1987.
[11] J.-P. Boon, editor. Advanced Research Workshop on Lattice Gas Automata The-
ory, Implementations, and Simulation, volume 68 (3/4). J. Stat. Phys, 1992.
[12] S. Wolfram. Cellular Automata and Complexity. Addison-Wesley, Reading
MA, 1994.
[13] D. Rothman and S. Zaleski. Lattice-Gas Cellular Automata: Simple Models of
Complex Hydrodynamics. Collection Aléa. Cambridge University Press, 1997.
[14] B. Chopard and M. Droz. Cellular Automata Modeling of Physical Systems.
Cambridge University Press, 1998.
[15] A.W. Burks. Von neumann’s self-reproducing automata. In A.W. Burks, editor,
Essays on Cellular Automata, pages 3–64. University of Illinois Press, 1970.
[16] S. Ulam. Random processes and transformations. Proc. Int. Congr. Math.,
2:264–275, 1952.
[17] A. Reggia, S.L. Armentrout, H.-H. Chou, and Y. Peng. Simple systems that
exhibit self-directed replication. Science, 259:1282, 1993.
[18] D. Mange and M. Tomassini, editors. Bio-Inspired Computing Machines: towards
novel computational architectures. Presses Polytechniques et Universitaires
Romandes, 1998.
[19] M. Gardner. The fantastic combinations of john conway’s new solitaire game
life. Scientific American, 220(4):120, 1970.
[20] S. Wolfram. Theory and Application of Cellular Automata. World Scientific,
1986.
[21] U. Frisch, B. Hasslacher, and Y. Pomeau. Lattice-gas automata for the navier-
stokes equation. Phys. Rev. Lett., 56:1505, 1986.
[22] S. Chen, K. Diemer, G.D. Doolen, K. Eggert, C. Fu, S. Gutman, and B.J. Travis.
Lattice gas automata for flow through porous media. Physica D, 47:72–84,
1991.
[23] E. Aharonov and D. Rothman. Non-newtonian flow (through porous media): a
lattice boltzmann method. Geophys. Res. Lett., 20:679–682, 1993.
[24] D.W. Grunau, T. Lookman, S.Y. Chen, and A.S. Lapedes. Domain growth,
wetting and scaling in porous media. Phys. Rev. Lett., 71:4198–4201, 1993.
[25] D.H. Rothman. Immiscible lattice gases:new results, new models. In P. Man-
neville, N. Boccara, G.Y. Vichniac, and R. Bideau, editors, Cellular Automata
and Modeling of Complex Physical Systems, pages 206–231. Springer Verlag,
1989. Proceedings in Physics 46.
[26] M. Bonetti, A. Noullez, and J.-P. Boon. Lattice gas simulation of 2-d viscous
fingering. In P. Manneville, N. Boccara, G.Y. Vichniac, and R. Bideau, editors,
Cellular Automata and Modeling of Complex Physical Systems, pages 239–241.
Springer Verlag, 1989. Proceedings in Physics 46.
[27] A.K. Gunstensen, D.H. Rothman, S. Zaleski, and G. Zanetti. Lattice boltzmann
model of immiscible fluids. Phys. Rev. A, 43:4320–4327, 1991.
[28] D. Grunau, Shiyi Chen, and K. Eggert. A lattice boltzmann model for multi-
phase fluid flows. Phys. Fluids A, 5:2557–2562, 1993.
[29] U. D’Ortona, M. Cieplak, R.B. Rybka, and J.R. Banavar. Two-color nonlinear
cellular automata: surface tension and wetting. Phys. Rev. E, 51:3718–28, 1995.
[30] A. Károlyi and J. Kertész. Hydrodynamics cellular automata for granular media.
In R. Gruber and M. Tomassini, editors, Proceeding of the 6th Joint EPS-APS
International Conference on Physics Computing: PC ’94, pages 675–681, 1994.
[31] G. Peng and H.J. Herrmann. Density waves of granular flow in a pipe using
lattice-gas automata. Phys. Rev. E, 49:R1796–R1799, 1994.
[32] B. Boghosian, P. Coveney, and A. Emerton. A lattice-gas model of microemul-
sions. Proceedings of the Royal Society of London, 452:1221–1250, 1996.
[33] J.T. Wells, D.R. Janecky, and B.J. Travis. A lattice gas automata model for het-
erogeneous chemical reaction at mineral surfaces and in pores network. Physica
D, 47:115–123, 1991.
[34] S. Cornell, M. Droz, and B. Chopard. Some properties of the diffusion-limited
reaction nA + mB → C with homogeneous and inhomogeneous initial condi-
tions. Physica A, 188:322–336, 1992.
[35] B. Chopard, P. Luthi, and M. Droz. Reaction-diffusion cellular automata model
for the formation of Liesegang patterns. Phys. Rev. Lett., 72(9):1384–1387,
1994.
[36] J.-P. Boon, D. Dab, R. Kapral, and A. Lawniczak. Lattice gas automata for
reactive systems. Phys. Rep., 273:55–148, 1996.
[37] D.E. Wolf, M. Schreckenberg, and A. Bachem, editors. Traffic and Granular
Flow. World Scientific, 1996.
[38] D.E. Wolf et al., editor. Traffic and Granular Flow ’97. Springer, to appear.
[39] B. Chopard, P. O. Luthi, and P.-A. Queloz. Cellular automata model of car
traffic in two-dimensional street networks. J. Phys. A, 29:2325–2336, 1996.
[40] E. Banks. Information processing and transmission in cellular automata. Tech-
nical report, MIT, 1971. MAC TR-81.
[41] G.G. McNamara and G. Zanetti. Use of the boltzmann equation to simulate
lattice-gas automata. Phys. Rev. Lett., 61:2332–2335, 1988.
[42] F. Higuera, J. Jimenez, and S. Succi. Boltzmann approach to lattice gas simula-
tions. Europhys. Lett, 9:663, 1989.
[43] D.B. Bahr and J.B. Rundle. Theory of lattice boltzmann simulation of glacier
flow. J. of Glaciology, 41(139):634–40, 1995.
[44] G. Vichniac. Simulating physics with cellular automata. Physica D, 10:96–115,
1984.
[45] J.D. Gunton and M. Droz. Introduction to the Theory of Metastable and Unsta-
ble States. Springer Verlag, 1983.
[46] Pascal O. Luthi, Anette Preiss, Jeremy J. Ramsden, and B. Chopard. A cellular
automaton model for neurogenesis in drosophila. Physica D, 1998. to appear.
[47] S. Yukawa, M. Kikuchi, and S. Tadaki. Dynamical phase transition in
one-dimensional traffic flow model with blockage. J. Phys. Soc. Jpn,
63(10):3609–3618, 1994.
[48] M. Schreckenberg, A. Schadschneider, K. Nagel, and N. Ito. Discrete stochastic
models for traffic flow. Phys. Rev. E, 51:2939, 1995.
[49] A. Schadschneider and M. Schreckenberg. Cellular automaton models and traf-
fic flow. J. Phys., A(26):L679, 1993.
[50] K. Nagel and M. Schreckenberg. Cellular automaton model for freeway traffic.
J. Physique I (Paris), 2:2221, 1992.
[51] K. Nagel and H.J. Herrmann. Deterministic models for traffic jams. Physica A,
199:254, 1993.
[52] A. Dupuis. Simulateur micro-cellulaire parallèle de trafic routier urbain et
application à la ville de Genève. Technical report, CUI, University of Geneva, 1997.
Master's dissertation.
[53] B. Chopard, A. Dupuis, and P. Luthi. A cellular automata model for urban traffic
and its application to the city of geneva. In D. Wolf et al., editor, Traffic and
Granular Flow ’97. Springer, to appear.
[54] T. Vicsek. Fractal Growth Phenomena. World Scientific, 1989.
[55] T.A. Witten and L.M. Sander. Diffusion-limited aggregation. Phys. Rev. B,
27:5686, 1983.
[56] S. Tolman and P. Meakin. Off-lattice and hypercubic-lattice models for
diffusion-limited aggregation in dimension 2–8. Phys. Rev. A, 40:428–37, 1989.
[57] I. Stewart. The ultimate in anty-particle. Scientific American, pages 88–91, July
1994.
[58] J. Propp. Trajectory of generalized ants. Math. Intelligencer, 16(1):37–42,
1994.
[59] S. Galam, B. Chopard, A. Masselot, and M. Droz. Competing species dynamics:
Qualitative advantage versus geography. Eur. Phys. J., 84, 529, 1998.
[60] Serge Galam. Social paradoxes of majority rule voting and renormalization
group. J. Stat. Phys., 61:943–951, 1990.
[61] U. Frisch, D. d’Humières, B. Hasslacher, P. Lallemand, Y. Pomeau, and J.-P.
Rivet. Lattice gas hydrodynamics in two and three dimension. Complex Sys-
tems, 1:649–707, 1987. Reprinted in Lattice Gas Methods for Partial Differen-
tial Equations, ed. G. Doolen, p.77, Addison-Wesley, 1990.
[62] L.P. Kadanoff, G.R. McNamara, and G. Zanetti. From automata to fluid flow:
comparison of simulation and theory. Phys. Rev. A, 40:4527–4541, 1989.
[63] G. Zanetti. Hydrodynamics of lattice-gas automata. Phys. Rev. A, 40:1539–
1548, 1989.
[64] T. Naitoh, M.H. Ernst, and J.M. Dufty. Long-time tails in two-dimensional
cellular automata fluids. Phys. Rev. A, 42:7187, 1990.
[65] R. Brito and M.H. Ernst. Propagating staggered wave in cellular automata fluids.
J. of Phys. A, 24:3331, 1991.
[66] Jaroslaw Piasecki. Echelles de temps multiples en théorie cinétique. Cahiers
de physique. Presses polytechniques et universitaires romandes, 1997.
[67] S. Wolfram. Cellular automaton fluid: basic theory. J. Stat. Phys., 45:471, 1986.
[68] D. d’Humières and P. Lallemand. Lattice gas models for 3d hydrodynamics.
Europhys. Lett., 2:291, 1986.
[69] B. Chopard and M. Droz. Cellular automata model for heat conduction in a
fluid. Phys. Lett. A, 126:476, 1988.
[70] P. Grosfils, J.-P. Boon, and P. Lallemand. Spontaneous fluctuation correlation
in thermal lattice gas automata. Phys. Rev. Lett., 68:1077, 1992.
[71] D.J. Tritton. Physical Fluid Dynamics. Clarendon Press, 1988.
[72] P. Clavin, D. d’Humières, P. Lallemand, and Y. Pomeau. Cellular automata for
hydrodynamics with free boundaries in two and three dimensions. C.R. Acad.
Sci. Paris, II, 303:1169, 1986. Reprinted in Lattice Gas Methods for Partial
Differential Equations, ed. G. Doolen, p.415, Addison-Wesley, 1990.
[73] D.H. Rothman and J.M. Keller. Immiscible cellular automaton fluids. J. Stat.
Phys, 52:275–282, 1988.
[74] A. Gunstensen, D.H. Rothman, S. Zaleski, and G. Zanetti. Lattice boltzmann
model of immiscible fluids. Phys. Rev. A, 43:4320–4327, 1991.
[75] R. Holme and D.H. Rothman. Lattice gas and lattice boltzmann models of
miscible fluids. J. Stat. Phys., 68:409–429, 1992.
[76] R.E. Rosensweig. Magnetic fluids. Scientific American, pages 124–132, Octo-
ber 1982.
[77] K. Binder and D.W. Heermann. Monte Carlo Simulation in Statistical Physics.
Springer-Verlag, 1992.
[78] S.A. Orszag and V. Yakhot. Reynolds number scaling of cellular automata hy-
drodynamics. Phys. Rev. Lett, 56:1691–1693, 1986.
[79] Y.H. Qian, S. Succi, and S.A. Orszag. Recent advances in lattice boltzmann
computing. In D. Stauffer, editor, Annual Reviews of Computational Physics
III, pages 195–242. World Scientific, 1996.
[80] S. Succi, M. Vergassola, and R. Benzi. Lattice-boltzmann scheme for two-
dimensional magnetohydrodynamics. Phys. Rev. A, 43:4521, 1991.
[81] F.J. Alexander, S. Chen, and J.D. Sterling. Lattice boltzmann thermohydrody-
namics. Phys. Rev. E, 47:2249–2252, 1993.
[82] S. Succi, G. Amati, and R. Benzi. J. Stat. Phys., 81:5, 1995.
[83] F. Higuera, J. Jimenez, and S. Succi. Lattice gas dynamics with enhanced colli-
sion. Europhys. Lett, 9:345, 1989.
[84] Y.H. Qian, D. d’Humières, and P. Lallemand. Lattice BGK models for navier–
stokes equation. Europhys. Lett, 17(6):470–84, 1992.
[85] Hudong Chen, Shiyi Chen, and W.H. Matthaeus. Recovery of navier–stokes
equations using a lattice-gas boltzmann method. Phys. Rev. A, 45:R5339–42,
1992.
[86] A. Renda, G. Bella, S. Succi, and I.V. Karlin. Thermohydrodynamic lattice
BGK schemes with non-perturbative equilibria. Europhys. Lett., 41:279–283,
1998.
[87] P. Bhatnagar, E.P. Gross, and M.K. Krook. A model for collision process in
gases. Phys. Rev., 94:511, 1954.
[88] S. Hou, J. Sterling, S. Chen, and G.D. Doolen. A lattice boltzmann subgrid
model for high reynolds number flows. Fields Institute Communications, 6:151–
166, 1996.
[89] A. Masselot and B. Chopard. A lattice boltzmann model for particle transport
and deposition. Europhys. Lett., 1998. to appear.
[90] A. Masselot and B. Chopard. Snow transport and deposition: a numerical
model. J. of Glaciology, 1998. submitted.
[91] T. Castelle. Transport de la neige par le vent en montagne: approche expéri-
mentale du site du col du Lac Blanc. PhD thesis, EPF Lausanne, Switzerland,
1995.
[92] U. Radok. Snow drift. Journal of Glaciology, 19(81):123–139, 1977.
[93] B. Chopard, L. Frachebourg, and M. Droz. Multiparticle lattice gas automata
for reaction-diffusion systems. Int. J. of Mod. Phys. C, 5:47–63, 1994.
[94] M. Takeuchi. Vertical profile and horizontal increase of drift snow transport.
Journal of Glaciology, 26(94):481–492, 1980.
[95] D. Kobayashi. Studies of snow transport in low-level drifting snow. Contribu-
tions from the Institute of Low Temperature Science, Series A(24):1–58, 1972.
[96] R.P. Sharp. Wind ripples. Journal of Geology, 71:617–636, 1963.
[97] H. Martinez. Contribution à la modélisation du transport éolien de particules.
Mesures de profils de concentration en soufflerie diphasique. PhD thesis,
Université Joseph Fourier - Grenoble I, February 1996.
[98] V. Cornish. Waves of sand and snow. T. Fisher Unwin, London, 1914.
[99] B.Y. Werner and D.T. Gillespie. Fundamentally discrete stochastic model for
wind ripple dynamics. Phys. Rev. Lett., 71:3230–3233, 1993.
[100] J. E. Pearson. Complex patterns in a simple system. Science, 261:189–192, July
1993.
[101] J.D. Murray. Mathematical Biology. Springer-Verlag, 1990.
[102] L.M. Brieger and E. Bonomi. A stochastic cellular automaton model of nonlin-
ear diffusion and diffusion with reaction. J. Comp. Phys., 94:467–486, 1991.
[103] E.J. Garboczi. Permeability, diffusivity and microstructural parameters: a criti-
cal review. Cement and Concrete Res., 20:591–601, 1990.
[104] G.H. Weiss. Random walks and their applications. American Scientist, 71:65,
1983.
[105] G.H. Weiss, editor. Contemporary Problems in Statistical Physics. SIAM, 1994.
[106] P. O. Luthi, J. Ramsden, and B. Chopard. The role of diffusion in irreversible
deposition. Phys. Rev. E, 55:3111–3115, 1997.
[107] S. Cornell, M. Droz, and B. Chopard. Role of fluctuations for inhomogeneous
reaction-diffusion phenomena. Phys. Rev. A, 44:4826–32, 1991.
[108] R. Kapral and K. Showalter, editors. Chemical Waves and Patterns. Kluwer
Academic, 1995.
[109] J.P. Keener and J.J. Tyson. The dynamics of scroll waves in excitable media.
SIAM Rev., 34:1–39, 1992.
[110] E.E. Selkov. Self-oscillation in glycolysis: A simple kinetic model. Eur. J.
Biochem., 4:79, 1968.
[111] R. Fisch, J. Gravner, and D. Griffeath. Threshold-range scaling of excitable
cellular automata. Statistics and Computing, 1:23, 1991.
[112] J. Gravner and D. Griffeath. Threshold growth dynamics. Trans. Amer. Math.
Soc., 340:837, 1993.
[113] B. Drossel and F. Schwabl. Self-organized critical forest-fire model. Phys. Rev.
Lett., 69:1629, 1992.
[114] R.M. Ziff, E. Gulari, and Y. Barshad. Kinetic phase transitions in an irreversible
surface-reaction model. Phys. Rev. Lett., 56:2553, 1986.
[115] B. Chopard and M. Droz. Cellular automata approach to non equilibrium phase
transitions in a surface reaction model : static and dynamic properties. J. Phys.
A, 21:205, 1987.
[116] R.M. Ziff, K. Fichthorn, and E. Gulari. Cellular automaton version of the ab2
reaction model obeying proper stoichiometry. J. Phys. A, 24:3727, 1991.
[117] F. Nannelli and S. Succi. J. Stat. Phys., 68:401, 1992.
[118] H. K. Henisch. Crystals in Gels and Liesegang Rings. Cambridge University
Press, 1988.
[119] B. Mathur and S. Ghosh. Liesegang rings-part i: Revert system of liesegang
rings. Kolloid-Zeitschrift, 159:143, 1958.
[120] E. Kárpáti-Smidróczki, A. Büki, and M. Zrinyi. Pattern forming precipitation in
gels due to coupling of chemical reaction with diffusion. Colloid. Polym. Sci.,
273:857–865, 1995.
[121] M. Droz, J. Magnin, M. Zrinyi, and B. Chopard. Effect of dissociation on
liesegang systems. Jour. Chem. Phys. 110, 9618, 1999.
[122] T. Antal, M. Droz, J. Magnin, Z. Rácz, and M. Zrinyi. Derivation of the matalon-
packter law for liesegang structures. Jour. Chem. Phys. 109, 9479, 1998.
[123] K. Jablczynski. La formation rythmique des précipités: Les anneaux de
liesegang. Bull. Soc. Chim. France, 33:1592, 1923.
[124] S. Prager. Periodic precipitation. J. Chem. Phys., 25:279, 1956.
[125] Y.B. Zeldovitch, G.I. Barrenblatt, and R.L. Salganik. The quasi-periodical
formation of precipitates occurring when two substances diffuse into each other.
Sov. Phys. Dokl., 6:869, 1962.
[126] D.A. Smith. On ostwald’s supersaturation theory of rhythmic precipitation
(liesegang rings). J. Phys. Chem., 81:3102, 1984.
[127] W. Ostwald. Lehrbuch der allgemeinen Chemie. Engelman, Leipzig, 1897.
[128] G.T. Dee. Patterns produced by precipitation at a moving reaction front. Phys.
Rev. Lett., 57:275–78, 1986.
[129] H. K. Henisch. Periodic Precipitation. Pergamon Press, 1991.
[130] R. Matalon and A. Packter. J. Colloid Sci., 10:46, 1955.
[131] B. Chopard, H.J. Herrmann, and T. Vicsek. Structure and growth mechanism of
mineral dendrites. Nature, 353:409–412, October 1991.
[132] B. Chopard, P. Luthi, and M. Droz. Microscopic approach to the formation of
Liesegang patterns. J. Stat. Phys., 76:661–677, 1994.
[133] B.M. Boghosian, J. Yepez, F.J. Alexander, and N.H. Margolus. Integer lattice
gases. Phys. Rev. E, 55:4137–4147, 1997.
[134] W.H. Press, B.P. Flannery, S.A. Teukolsky, and W.T. Vetterling. Numerical
Recipes: The Art of Scientific Computing. Cambridge University Press, 1989.
[135] T. Karapiperis and B. Blankleider. Cellular automata model of
reaction-transport process. Physica D, 78:30–64, 1994.
[136] A. Turing. The chemical basis of morphogenesis. Phil. Trans. Roy. Soc. London,
B237:37, 1952.
[137] V. Dufiet and J. Boissonnade. Conventional and unconventional turing patterns.
J. Chem. Phys., 96:664, 1991.
[138] J. Schnackenberg. Simple chemical reaction systems with limit cycle behaviour.
J. Theor. Biol., 81:389, 1979.
[139] B. Chopard, M. Droz, S. Cornell, and L. Frachebourg. Cellular automata ap-
proach to reaction-diffusion systems: theory and applications. In J.M. Perdang
and A. Lejeune, editors, Cellular Automata: Prospects in Astrophysical Appli-
cations, pages 157–186. World Scientific, 1993.
[140] Y. Elskens and H.L. Frisch. Phys. Rev. A, 31:3812, 1985.
[141] E. Ben-Naim, S. Redner, and F. Leyvraz. Phys. Rev. Lett., 70:1890, 1993.
[142] P.A. Rey, L. Frachebourg, M. Droz, and J. Piasecki. Phys. Rev. Lett., 75:160,
1995.
[143] P.A. Rey, M. Droz, and J. Piasecki. Phys. Rev. E, 57:138–145, 1998.
[144] E. Trizac. private communication.
[145] B. Chopard, A. Masselot, and M. Droz. A multiparticle lattice gas model for a
fluid. application to ballistic annihilation. Phys. Rev. Lett., 81, 1845, 1998.
[146] J.D. Jackson. Classical Electrodynamics. John Wiley, 1975.
[147] P.O. Luthi. Lattice Wave Automata: from radio wave to fracture propaga-
tion. PhD thesis, Computer Science Department, University of Geneva, 24 rue
General-Dufour, 1211 Geneva 4, Switzerland, 1998.
[148] W. J. R. Hoeffer. The transmission-line matrix method. theory and applications.
IEEE Trans. on Microwave Theory and Techniques, MTT-33(10):882–893, Oc-
tober 1985.
[149] H. J. Hrgovčić. Discrete representation of the n-dimensional wave equation. J.
Phys. A, 25:1329–1350, 1991.
[150] C. Vanneste, P. Sebbah, and D. Sornette. A wave automaton for time-dependent
wave propagation in random media. Europhys. Lett., 17:715, 1992.
[151] S. de Toro Arias and C. Vanneste. A new construction for scalar wave equation
in inhomogeneous media. J. Phys. I France, 7:1071–1096, 1997.
[152] P. Mora. Lattice boltzmann phonic lattice model. J. Stat. Phys., 68:591–609,
1992.
[153] Y.-H. Qian and Y.F. Deng. A lattice bgk model for viscoelastic media. Phys.
Rev. Lett., 79:2742–2745, 1997.
[154] H. L. Bertoni, W. Honcharenko, L. R. Maciel, and H. Xia. Uhf propagation
prediction for wireless personal communications. In IEEE Proceedings, Vol 82,
No. 9, pages 1333–1359, 1994.
[155] T. Kürner, D. J. Cichon, and W. Wiesbeck. Concepts and results for 3d digital
terrain-based wave propagation models: an overview. IEEE Jour. on Selected
Areas in Communications, 11(7):1002–1012, Sept. 1993.
[156] K. Rizk, J.-F. Wagen, and F. Gardiol. Ray tracing based path loss prediction in
two microcellular environments. In Proceedings IEEE PIMRC’94, pages 210–
214, The Hague Netherlands, September 1994.
[157] B. Chopard, P.O. Luthi, and Jean-Frédéric Wagen. A lattice boltzmann method
for wave propagation in urban microcells. IEE Proceedings - Microwaves, An-
tennas and Propagation, 144:251–255, 1997.
[158] B. Chopard. A cellular automata model of large scale moving objects. J. Phys.
A, 23:1671–1687, 1990.
[159] M. Marder and J. Fineberg. How things break. Physics Today, pages 24–29,
September 1996.
[160] Ping Sheng. Introduction to Wave Scattering, Localization, and Mesoscopic
Phenomena. Academic Press, 1995.
Part Three
Self-Organized Critical Systems
Paolo De Los Rios
Département de Physique Théorique, Université de Lausanne
BSP, CH-1015 Lausanne
Chapter 3
Self-Organized Critical Systems
The aim of this course is to provide an introduction to the concepts and
applications of self-organized criticality through the analysis of a prototype, the
Bak-Sneppen model, which is the simplest and best understood representative of the class of
self-organized models driven to criticality by extremal dynamic rules. A brief
summary of the course follows:
1. Spontaneous occurrence of scaling (critical) behavior in nature: examples. Lack
of parameter fine tuning as a motivation for self-organized criticality.
2. Introduction to the Bak-Sneppen model. Dynamical rules and their
consequences: fitness histogram and avalanche distribution.
3. The relevant measurable quantities of the model, their exponents and the rela-
tions among them.
4. The Random-Nearest-Neighbor version of the model as a Mean-Field limit: so-
lution and critical exponents.
5. A non trivial relation between exponents: the Maslov equation.
6. Conclusions, and open problems.
3.1
Spontaneous occurrence of scaling in nature
Since we are going to talk about Self-Organized Criticality (SOC), the first question
we would like to answer is: why SOC? What does it mean, and why is there any need
for it?
The word criticality in statistical physics applies (or used to apply) to a quite
precise context: second order phase transitions at the critical temperature. There, quite
a few extraordinary things happen, related to the appearance of power-law relations
between certain physical quantities. For example, the specific heat C diverges
algebraically with |T − Tc|, as C ∼ |T − Tc|^{−α}. The exponent α is known as a critical
Figure 3.1: Gutenberg-Richter law.
exponent. Close to Tc many other quantities behave algebraically. More recently, it has
become customary to use the word criticality to describe systems exhibiting power-law
relations among their characteristic quantities. Some differences remain, however: for a
second order phase transition some parameters, such as the temperature, have to be
fine-tuned in order to reach the critical limit. In Nature, by contrast, many systems are
critical without any need for fine-tuning: somehow the rules of these systems drive
them to criticality without any detailed adjustment of the parameters. This
class of systems self-organizes to the critical state, hence Self-Organized Criticality.
We will now briefly describe three systems and a model that will not be the focus of
these notes, postponing a detailed description of the Bak-Sneppen model to subsequent
sections.
The first problems that attracted the attention of physicists were systems such as
earthquakes. It is well known that we are still unable to predict when an earthquake
will occur, yet there are some properties of earthquakes that make them appealing to
a physicist. If we look at the distribution of the number of earthquakes as a function
of the energy they release, P(E), we realize that it is a power-law, P(E) ∼ E^{−γ}
with γ close to 1. This law, known as the Gutenberg-Richter law [1] (Fig. 3.1), is robust
whether we decide to take into account all earthquakes on the planet or only those of the San
Francisco Bay area, and irrespective of the date of the catalog we take data from.
The bad news is that large earthquakes are not so much rarer than small ones. The
good news (cynically) is that very likely the mechanism producing earthquakes is
less random than one might think, or else a Gaussian-like distribution would have been
found. Moreover, there is no fine-tuning of parameters in this mechanism: it is a good
candidate for SOC. Interestingly, earthquakes and their power-laws are becoming
popular among astrophysicists studying neutron stars. It has been found that the emissions
from pulsars exhibit bursts of energy and that these bursts are power-law
distributed with respect to the released energy. One of the more serious candidate theories
is that, due to their high rotation velocity, these pulsars can develop a high strain in their
Figure 3.2: Distribution of vortex avalanches with respect to the number of vortices.
crusts, and that sometimes cracks can develop on their surface, associated with large
magnetic emissions. These earthquakes have been named starquakes.
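The remark that large earthquakes are "not so much rarer" than small ones can be made quantitative with a short comparison. It assumes γ = 1 for the power law and, purely for illustration, a Gaussian of width σ as the alternative distribution:

```latex
\frac{P(10E)}{P(E)} = \frac{(10E)^{-\gamma}}{E^{-\gamma}} = 10^{-\gamma} \approx \frac{1}{10}
\quad (\gamma \approx 1),
\qquad
\frac{e^{-(10E)^2/2\sigma^2}}{e^{-E^2/2\sigma^2}}\bigg|_{E=\sigma} = e^{-99/2} \approx 3\times 10^{-22}.
```

With a power law, an event releasing ten times more energy is only about ten times less frequent, whatever the value of E; with a Gaussian the same factor of ten makes the event practically impossible.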
A second system showing critical behavior is in the realm of solid-state physics.
It is known that type II superconductors for magnetic fields above their first critical
field Hc1 admit flux-lines in their interior. The thermodynamics of this problem is
very well understood. The dynamics has been studied more recently and it has been
discovered to be highly non-trivial (a definitive explanation is still lacking). Flux lines
enter the sample from the boundary, but their influx is not slow or gradual. Instead, it
proceeds by bursts (avalanches) that can be recorded (see Fig. 3.2). The histogram of
the avalanches classified according to their size is a power-law, and some attempts to
explain this behavior in terms of SOC models have been made [2].
Further systems that show critical behavior in some of their quantities are
rice-piles. Although they represent quite an unorthodox field of research for physicists (but
they could be of relevance for people with seed storage duties), they have been studied
by several groups, particularly in Norway (not the world's leading rice producer, by the
way). The result is that, when grains are added on top of the pile, rearrangements on the
sides are not gradual, but take place in avalanches whose size distribution is, again, a
power-law.
These notes are devoted to a self-organized critical model of biological evolution.
Why should biological evolution need a SOC explanation? Indeed the layman's
picture of evolution is well represented by "The Walk of Life", where a walking ape
slowly turns into a modern Homo sapiens through gradual changes. This corresponds to
the gradualist point of view of evolution. Recently some speculation on whether life
really evolved gradually has emerged. Indeed the fossil record clearly shows long
periods of stasis, where almost nothing happens and the same fossils are recovered over
millions of years, separated by short bursts of evolutionary activity, where some species
Figure 3.3: Self-similarity in the fossil record time series. Here we show the
fluctuations in families of Ammonoidea (over a period of 320 Myr, with a time resolution
of 2 Myr). In (a) only one of every four original data points is shown, giving a coarse
resolution of 8 Myr. In (b) we plot a segment of the previous data, but at full (2-Myr)
resolution.
disappear and others appear and diversify. This point of view goes under the name of
punctuated evolution [3]. The distribution of the stasis periods against their duration is,
not surprisingly by now, a power-law. Likewise, the distribution of the number of species that
disappear during the short evolutionary bursts is a power-law (see Figs. 3.3 and 3.4) [4]. So the
question is whether these data are compatible with the usual view of evolution which,
since Darwin, can be summarized as "mutation and selection".
A possible first explanation is of course the catastrophe theory: bursts of activity
correspond to catastrophes (meteors, eruptions and so on) that eliminate some species
giving way to others that can then evolve and diversify. Here we want to show that
actually the basic rules of evolution itself can lead a system to a critical state where
mass extinctions, stasis periods and activity bursts emerge spontaneously.
Figure 3.4: (a) Pattern of family extinctions through time of all organisms in terms
of the number of families. Maximum and minimum curves are shown (inset) as well
as the corresponding power spectra (main figure). (b) Fluctuation in the marine fami-
lies (inset) and the corresponding power spectrum (main figure). Both series are in a
timescale of Myr.
3.2 The Bak-Sneppen model
In the previous section we mentioned that the record of life history on Earth has some
critical features. The distribution of mass extinctions against the number of species
that disappeared in each extinction is a power-law
$$P(n) \sim n^{-\gamma} \tag{3.1}$$
with γ ≃ 2. Interestingly, the big five mass extinctions (among them the one that
wiped out the dinosaurs) also lie on that curve. There are other features in the paleontological
record that are power-law distributed, but we do not want to address them here.
What is important for us is to realize that there are some data in need of a non-trivial
explanation. One possible explanation is that some catastrophes took place (such as
the Chicxulub meteor impact some 65 million years ago), and that the size of these events is
distributed according to power-laws. Although this hypothesis cannot be ruled out at
the moment, it is healthy to explore some other directions (and we will see at the end that
what we can learn from simple models can in fact be reconciled with the catastrophe
theory).
Evolution is the major theory commonly accepted to explain the diversity of life
forms, their origin and the links they have to each other. Although the theory has
changed greatly since its first conception by Charles Darwin, the basic premises still
hold: the weakest species, the ones that are less fit to survive in their environment,
disappear and are replaced by new ones, which could be their evolutionary descendants
or newcomers filling the niche that was left behind. In both cases the new species will
have different genomes and phenotypes with respect to the ancient ones. This process
can be captured with the words "selection and mutation". Yet, the concept itself of
environment is all-including: the environment is determined not only by landscape,
terrain, weather and other physical and geophysical features, but also by other species.
So, the disappearance and replacement of a species changes the environment of the
ones that most interacted with it (predator-prey relations, parasites, symbiosis and so
on), so that suddenly they have a different fitness and could be forced to change. As
can easily be imagined, this kind of interaction can lead to a chain reaction of extinctions.
What are the features of these extinctions?
It is typical of the approach of a theoretical physicist to try to answer this question
with a model as simple as possible, and the Bak-Sneppen (BS) model has been a first
attempt in this direction [5].
3.2.1 Description of the model
The model is extremely simple. We assume the ecosystem to be made of N species on
a lattice, each one with its own fitness f. The fitness represents the survival capability
of the species, and it does not have a single definition in biology. Ethologists and zoologists
think of it as the number of offspring of an individual of the species that reach the
adult, reproductive age; for microbiologists, instead, the fitness is related to the resistance
of the species to mutations. Actually, these two definitions agree: if the typical
number of successful offspring is high, the species will be abundant and any mutation
will have greater difficulties in spreading to all individuals. The selection mechanism
is implemented in the model by an extremal rule: the species i (i = 1, ..., N ) with the
lowest fitness fi disappears, and its place is taken by a new species with a new ran-
dom fitness. Then those species that most interact with it (the nearest-neighbors on the
lattice) find a different species, i.e. a different environment, and therefore their fitness
changes too, at random. And the process is repeated. This scheme can be expressed as
a simple algorithm:
1. Assign to every species on the lattice a random fitness taken from the uniform
distribution between 0 and 1;
2. search the species i with the lowest fitness and change it, taking the new fitness
again from the uniform distribution between 0 and 1;
3. change the fitness of the nearest neighbors of i with new values, at random be-
tween 0 and 1;
4. begin again from point 2.
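The four steps above can be sketched in a few lines of Python. This is only a minimal illustration, assuming a one-dimensional ring with periodic boundaries; the function name `bak_sneppen` and its parameters (`n_sites`, `n_steps`, `seed`) are hypothetical choices made here, not taken from the original papers.

```python
import random


def bak_sneppen(n_sites=200, n_steps=50_000, seed=0):
    """Run the d = 1 Bak-Sneppen model on a ring and return the final fitnesses."""
    rng = random.Random(seed)
    # Step 1: every species gets a random fitness, uniform in [0, 1).
    fitness = [rng.random() for _ in range(n_sites)]
    for _ in range(n_steps):
        # Step 2: find the species with the lowest fitness.
        i = min(range(n_sites), key=fitness.__getitem__)
        # Steps 2 and 3: replace its fitness and that of its two nearest
        # neighbors (indices wrap around the ring) with new random values.
        for j in (i - 1, i, i + 1):
            fitness[j % n_sites] = rng.random()
        # Step 4: the loop starts again from step 2.
    return fitness
```

A histogram of the returned list, taken after a long enough run, should show only a few sites at low fitness; how long "long enough" is depends on the lattice size and is not addressed by this sketch.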
What happens to the system as the steps above are repeated? It is clear that there are
two competing processes at work: the selection mechanism tends to clear the system
of low fitness, but the mutation of the neighbors, that could have in principle high
fitness values, counterbalances it. As a result we could expect some non-trivial fitness
distribution in the system. Indeed it can be seen that most sites have a fitness above a
threshold fc that is close to 2/3 (the exact value is not known, but the best numerical
estimate is 0.66702... [6]), with a few sites below it. In Fig.3.5 the fitness histogram
is shown: fitnesses are uniformly distributed above fc [5]. This is the first hint that the
system is non-trivially organized thanks to the rules of the dynamics themselves: it
contains a certain degree of spontaneous order. In biological terms, this result implies
that the number of species that are not well fit to survive in the environment is quite
low. How general is this result? The detailed value of the threshold fc depends on the
number of neighbors that are updated each time step. If also the next-nearest neighbors
are involved in the mutations, then fc is lower, as it can be intuitively understood: the
ordering part of the dynamics (the selection of the lowest fitness) does not change,
but the disorder introduced (changing the neighbors) is greater, so that we expect the
system to be more disordered with a lower fc (ideally, a perfectly ordered system would
have fc = 1 and a completely disordered one fc = 0). Yet, above fc the distribution
would still be uniform, so that some generality holds.
Besides the fitness distribution, there are other quantities of interest.
3.2.2 Relevant quantities and their scaling behavior
If we look carefully at the dynamics, we realize that mutations are clustered in space
and time: usually, the mutation of the lowest fitness triggers the mutation of a neighbor,
Figure 3.5: Fitness distribution for the d = 1 Bak-Sneppen model.
which becomes in turn the weakest species, and so on. So clusters, that in the jargon
are called avalanches, have an initiator (the first species/site to mutate) and finish when
the new minimal fitness is unrelated to the initiator. Two quantities are of interest then:
the duration s of an avalanche, that is, how many time steps it lasts, and the number of
species n touched by it.
From computer simulations [7], the probability distribution of avalanche durations,
P(s), is a power-law for large s (see Fig. 3.6)
$$P(s) \sim s^{-\tau}, \qquad s \to \infty . \tag{3.2}$$
The exponent τ is the first in a series of exponents that we can define for the Bak-Sneppen
model; numerical simulations of various versions of the model
show that 1 < τ < 3/2. This is an interesting result, since
it tells us that, although every avalanche is finite, the average avalanche duration is not.
To every avalanche of duration s we can associate the number n(s) of species that
have undergone mutations. Then, we can classify avalanches by n rather than by s.
Simulations show that avalanches are power-law distributed also with respect to n,
$$P(n) \sim n^{-\lambda}, \qquad n \to \infty . \tag{3.3}$$
Simulations show also that n(s) is (on the average) an increasing function of s, and in
particular that there is an allometric relation
$$n(s) \sim s^{\mu} \tag{3.4}$$
with 0 ≤ µ ≤ 1 depending on the particular model. The two bounds are easily under-
stood: µ = 0 implies that the system would always mutate the same sites, whereas we
know that, as long as the avalanche lasts, it has the tendency to involve new species
Figure 3.6: Avalanche distribution for the d = 1 Bak-Sneppen model.
(hence µ ≥ 0). At the same time, it is clear that every time step involves at most a
finite number of new sites (the neighbors), so that n can be at most proportional to s
and µ ≤ 1.
The three exponents τ, λ and µ are not independent. Indeed, since n is a
monotonically increasing function of s and the number of avalanches is conserved, the
relation
$$P(s)\,ds = P(n)\,dn \tag{3.5}$$
holds, which in turn tells us that
$$P(s) = P(n)\,\frac{dn}{ds} \tag{3.6}$$
and using (3.2), (3.3) and (3.4) we obtain λ = (τ − 1)/µ + 1. Therefore, only two out
of three exponents are independent.
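For readers who want the intermediate step, the substitution can be written out as follows. This is a sketch that keeps only the leading power of s and drops constant prefactors.

```latex
\begin{align*}
P(s) &= P(n)\,\frac{dn}{ds}
      \sim \left(s^{\mu}\right)^{-\lambda}\, \mu\, s^{\mu-1}
      \sim s^{-\mu\lambda + \mu - 1}, \\
-\tau &= -\mu\lambda + \mu - 1
      \quad\Longrightarrow\quad
      \lambda = \frac{\tau - 1}{\mu} + 1 .
\end{align*}
```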
A further, important exponent σ rules the behavior of the system away from criticality.
Here, we refer to the critical limit as the one that the system reaches spontaneously
(at least in an ill-defined thermodynamic limit). Avalanches start from the site
with the lowest fitness higher than fc and stop when the new lowest fitness is again
above fc. In this regime, the avalanche probability distribution is a power-law. Yet,
we could try to explore the behavior of so-called f-avalanches, that is, avalanches
that stop as soon as the lowest fitness is above a given f. Numerically, we find that
these avalanches are not critical, but are rather affected by a cutoff. This result is not
surprising, since it is typical of critical systems away from criticality. For example, in
magnetic systems undergoing second order phase transitions, the correlation length ξ
diverges as $\xi \sim |T - T_c|^{-\eta}$, so that away from the critical temperature Tc the correlation
length is finite and the system is not strictly critical. Likewise, in percolation, it is
known that the correlation length ξ diverges as $\xi \sim |p - p_c|^{-\eta}$, where pc is the critical
value of the site (or bond, depending on whether we consider site or bond percolation)
probability above which there is a single cluster spanning the system. If |p − pc| ≠ 0
then the system loses its critical features above a typical length scale ξ. The percola-
tion example is particularly important since the Bak-Sneppen model has been put in
relation to both directed percolation and invasion percolation [8], so that also some of
the language typical of percolation has been used for the Bak-Sneppen model. The
role of the critical percolation probability is played by the critical fitness fc and the
growing percolation clusters are the analog of Bak-Sneppen avalanches.
The exponent σ affects criticality by introducing a cutoff in the avalanche probability
distribution and by making the average avalanche duration ⟨s⟩ finite. The
new probability distribution is
$$P(s, f) = s^{-\tau}\, g\!\left(s\,(f_c - f)^{1/\sigma}\right) \tag{3.7}$$
so that we can define a typical duration (not to be confused with ⟨s⟩!) $s_c \sim (f_c - f)^{-1/\sigma}$;
the function g(x) is such that g(0) = const, so that criticality is recovered when
f = fc, and decreases faster than a power-law for x → ∞. The average duration ⟨s⟩
is then
$$\langle s \rangle = \sum_{s=0}^{\infty} s\,P(s,f) \simeq \int_0^{\infty} s\, s^{-\tau}\, g\!\left(s\,(f_c - f)^{1/\sigma}\right) ds = (f_c - f)^{\frac{\tau-2}{\sigma}} \int_0^{\infty} x^{1-\tau} g(x)\,dx ; \tag{3.8}$$
since the last integral is finite thanks to the behavior of g(x), the average duration is
finite if f ≠ fc. Analogously we can compute the average number of touched sites
$$\langle n \rangle \sim (f_c - f)^{(\tau-\mu-1)/\sigma} . \tag{3.9}$$
⟨n⟩ allows us to find a further relation between σ, τ and µ. Indeed, the probability
that an f-avalanche leaves behind a site with fitness between f and f + df is (in the df → 0 limit)
⟨n⟩ df/(1 − f). Therefore the increment of the average duration of an f-avalanche
when f is raised by an infinitesimal df is
$$d\langle s \rangle_f = \langle n \rangle_f\, \frac{df}{1-f}\, \langle s \rangle_f ; \tag{3.10}$$
(3.10) means that with infinitesimal probability a second f-avalanche starts after the
first one. From (3.10) it is readily obtained that $\langle n \rangle \sim (f_c - f)^{-1}$ so that, from
(3.9), we get τ = µ − σ + 1.
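The step from (3.10) to the exponent relation can be spelled out as below. This is a sketch under the scaling forms (3.8) and (3.9), with constant prefactors ignored; the factor (1 − f) is regular at f = f_c and does not affect the exponent.

```latex
\begin{align*}
\langle s \rangle_f \sim (f_c - f)^{(\tau-2)/\sigma}
  \;&\Longrightarrow\;
  \frac{d \ln \langle s \rangle_f}{df} = \frac{2-\tau}{\sigma}\,\frac{1}{f_c - f}, \\
\text{(3.10)}
  \;&\Longrightarrow\;
  \frac{d \ln \langle s \rangle_f}{df} = \frac{\langle n \rangle_f}{1 - f}, \\
\langle n \rangle_f = \frac{(2-\tau)(1-f)}{\sigma\,(f_c - f)} \sim (f_c - f)^{-1}
  \;&\Longrightarrow\;
  \frac{\tau - \mu - 1}{\sigma} = -1
  \;\Longrightarrow\;
  \tau = \mu - \sigma + 1 .
\end{align*}
```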
We have introduced four exponents, τ , µ, σ and λ, and we have found that only
two of them are independent, let’s say τ and µ. Many other exponents could be defined
for the Bak-Sneppen model, but all of them could be deduced from τ and µ. Just two
exponents escape this dependence, and they are related to the microscopic dynamics
of the model.
Let us call the site with the minimal fitness the active site. Then we can measure the
probability of return of the activity to the same site. In a biological language it could
be interpreted as the life-span of a species. The return to a previously visited site is not
a new quantity in probability theory, and is central to Random Walk theory. There, it
is customary to define the first and all return probability distributions Pf(t) and Pa(t),
where t has been used to measure time inside avalanches. Pf(t) is the distribution of
the time intervals between successive returns: if activity, for example, passes through a
given site at times t = 5, t = 13 and t = 23, these events contribute to Pf(13 − 5 = 8)
and to Pf(23 − 13 = 10). If moreover t = 5 was the first passage ever to that site, then
they also contribute to Pa(13 − 5 = 8) and to Pa(23 − 5 = 18), that is, Pa(t) is the
distribution of all the return times after the first passage. Not surprisingly, these
two probability distributions are also power-laws, $P_f(t) \sim t^{-\tau_f}$ and $P_a(t) \sim t^{-\tau_a}$. These
two exponents are not independent either. Indeed, we can find a relation between them
using a relation between Pa(t) and Pf(t) [9]:
$$P_a(t) = \delta_{t,0} + \sum_{t'=1}^{t-1} P_a(t')\,P_f(t - t') \tag{3.11}$$
where $\delta_{t,0}$ sets the initial condition. Then we use the generating functions
$$G_{a,f}(z) = \sum_{t=0}^{\infty} P_{a,f}(t)\,z^t \tag{3.12}$$
to disentangle (3.11): we multiply the two sides by $z^t$ and sum over t. The resulting
equation is
$$G_a(z) = 1 + G_a(z)\,G_f(z) \tag{3.13}$$
and finally
$$G_f(z) = 1 - \frac{1}{G_a(z)} . \tag{3.14}$$
The last step is to use Tauberian theorems that tell us that, if $P(t) \sim t^{-\tau}$ for t → ∞,
then
$$G(z) \sim (1-z)^{\tau-1} \ \ \text{if } \tau < 1, \qquad G(z) \sim a + b\,(1-z)^{\tau-1} \ \ \text{if } \tau > 1 . \tag{3.15}$$
With (3.15) and (3.14) we find at last that
$$\tau_f + \tau_a = 2 \ \ \text{if } \tau_a < 1, \qquad \tau_f = \tau_a \ \ \text{if } \tau_a > 1 \tag{3.16}$$
so that just one of the two exponents is independent.
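The bookkeeping behind the two return-time distributions can be illustrated on the example times quoted above (t = 5, 13, 23). The helper below is a hypothetical sketch. It follows the convention of the text, in which "all returns" are measured from the first passage, and it handles the visit times of a single site only.

```python
def return_intervals(visit_times):
    """Return (first_returns, all_returns) for the visit times of one site."""
    times = sorted(visit_times)
    # First returns: intervals between successive visits to the site.
    first_returns = [later - earlier for earlier, later in zip(times, times[1:])]
    # All returns: intervals measured from the first passage to the site.
    all_returns = [t - times[0] for t in times[1:]]
    return first_returns, all_returns
```

Collecting these intervals over all sites and over a long simulation, and histogramming them, would give numerical estimates of Pf(t) and Pa(t).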
Summarizing, we have found that using simple scaling arguments and results
known from traditional probability theory we have been able to reduce the number
of independent exponents of the Bak-Sneppen model to three.
3.2.3 Numerical results in d = 1
Up to now, an analytical solution of the Bak-Sneppen model, even in d = 1, has been
elusive, despite the simplicity of the model itself. Therefore we are forced to resort
to numerical simulations to obtain the values of the threshold fc and of the relevant
exponents. The results are shown in Figs. 3.5 and 3.6, corresponding to the already
mentioned fc = 0.66702... with the update of the two nearest neighbors of the lowest
fitness, to τ = 1.073 ± 0.005, µ = 0.42 ± 0.01, σ = 0.35 ± 0.01, in agreement with
their relation τ = µ − σ + 1; τa is indistinguishable from µ within numerical error and
τf = 1.58 ± 0.01, again in agreement with τa + τf = 2.
From these results it is clear that the exponent relations are respected. Actually,
they can be used the other way around: they guide numerical simulations so that, had
we found numerical exponents in disagreement with the relations, we should have
checked our simulations.
We have mentioned above that τa is indistinguishable from µ within numerical
errors. There are some arguments according to which they should indeed be
equal; yet we will not deal with this problem here since a complete picture has not yet
emerged and, as we are going to show in the next section, there are counterexamples.
3.2.4 The Random Nearest Neighbor model: a mean-field limit
Lacking analytical solutions in d = 1, we can try to move to the other limit where
exact solutions can usually be found, the Mean-Field approximation. In traditional
statistical physics the Mean-Field approximation consists, essentially, in reducing the
correlations, as would happen in a high-dimensional system. Hence, in many cases
the Mean-Field solution is considered as being the exact solution in d = ∞.
The d = ∞ limit has to be conceived carefully for the Bak-Sneppen model. Indeed,
for usual lattice models the infinite dimension limit is obtained by letting every site
interact with all the others, and by rescaling the interaction strengths in such a way
that energy is extensive. Unfortunately, there is no interaction strength in the Bak-
Sneppen model, and if we let every site be considered as nearest neighbor of the active
one, then there would be no organization at all in the model.
An alternative way to implement the Mean-Field approximation is to choose
the nearest neighbors at random over the lattice. In this way any particular finite-d connectivity
of the lattice is broken, just as it would be in d = ∞ (these considerations
have become of great relevance recently in the context of social networks, where some
intermediate cases have emerged). This is the Random Nearest Neighbor (RNN) ver-
sion of the Bak-Sneppen model [10]. In particular, at each time step the minimum
fitness is changed, together with the fitness of two other sites chosen at random over
the lattice, different at every step. In this way the finite-d connectivity of the lattice is
maximally lost.
As usual, the RNN model was first studied by numerical simulations, finding
fc = 1/3 (Fig. 3.7), τ = τa = τf = 3/2, σ = 1/2 and µ = 1. All these values
(the exponents respect the scaling relations) can be obtained analytically, giving a first
insight into the dynamics of the model.
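A minimal sketch of the RNN update rule is given below, assuming that the two "neighbors" are drawn afresh at every step, uniformly among all sites other than the active one. The function name `rnn_bak_sneppen` and the parameter `k` (number of random neighbors) are illustrative choices, not notation from reference [10].

```python
import random


def rnn_bak_sneppen(n_sites=200, n_steps=50_000, k=2, seed=0):
    """Random Nearest Neighbor variant: the minimum and k random sites are renewed."""
    rng = random.Random(seed)
    fitness = [rng.random() for _ in range(n_sites)]
    for _ in range(n_steps):
        # Site with the minimal fitness (the active site).
        i = min(range(n_sites), key=fitness.__getitem__)
        # Draw k distinct sites among the other n_sites - 1 sites:
        # sample from a range with one slot removed, then shift past i.
        picks = rng.sample(range(n_sites - 1), k)
        neighbors = [j if j < i else j + 1 for j in picks]
        # Renew the fitness of the active site and of its random neighbors.
        for j in [i] + neighbors:
            fitness[j] = rng.random()
    return fitness
```

Compared with the lattice version, only the choice of neighbors changes; the selection of the minimum is untouched.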
First, the threshold value is easily calculated using a simple master equation for the
Figure 3.7: Fitness distribution for the RNN Bak-Sneppen model.
site fitness probability:
$$p(f,t+1) = p(f,t) - \frac{1}{N}\,p_1(f,t) - \frac{K}{N-1}\left[p(f,t) - \frac{1}{N}\,p_1(f,t)\right] + \frac{K+1}{N} \tag{3.17}$$
where N is the number of sites in the lattice, K the number of nearest neighbors and p1(f)
is the probability that the minimal fitness is between f and f + df,
$$p_1(f) = N\,p(f)\,Q^{N-1}(f) \tag{3.18}$$
where
$$Q(f) = \int_f^1 p(f')\,df' \tag{3.19}$$
is the probability for a fitness to be higher than f. In the stationary state (where p(f) is independent
of t), in the N → ∞ limit, and for f → 1 (hence p(f) → 1/(1 − fc) and Q(f) → 0),
(3.17) simplifies to
$$K\,\frac{1}{1-f_c} = K + 1 \tag{3.20}$$
and fc = 1/(K + 1), that is, fc = 1/3 for the K = 2 random neighbors used here, as numerically obtained.
The exponent µ can be obtained straightforwardly by observing that at every time
step 2 new sites are updated (on a very large lattice the probability that the random
neighbors had already been touched by the avalanche is negligible). Therefore n is
proportional to s and µ = 1.
The exponent τ can be obtained through a clever mapping of the dynamics to a
random walk in a population space. Indeed, at every time step the avalanche is characterized
by the number na of active (i.e. below the threshold fc) sites. The avalanche
stops when na = 0. At every time step na can change by −1, 0, +1 or +2 with probabilities
$$p_{-1} = (1-f_c)^3, \quad p_0 = 3 f_c (1-f_c)^2, \quad p_{+1} = 3 f_c^2 (1-f_c), \quad p_{+2} = f_c^3 . \tag{3.21}$$
The space of populations is a line, and the state of the avalanche is given by na.
Then, at every time step the avalanche moves by −1, 0, +1 or +2 sites with
probabilities (3.21). This is a random walk with zero average step, starting from na = 1. It is
a well known result of random walk theory that the probability for such a walk to reach
a particular site (in particular na = 0) for the first time after a time s decreases as $s^{-3/2}$. Therefore,
τ = 3/2.
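That the walk defined by (3.21) is unbiased exactly at fc = 1/3 can be checked with exact rational arithmetic. The short sketch below uses hypothetical helper names (`step_probabilities`, `mean_step`) and simply evaluates the probabilities and the mean displacement per step.

```python
from fractions import Fraction


def step_probabilities(fc):
    """Probabilities (3.21) for the change of the number of active sites."""
    return {
        -1: (1 - fc) ** 3,
        0: 3 * fc * (1 - fc) ** 2,
        +1: 3 * fc ** 2 * (1 - fc),
        +2: fc ** 3,
    }


def mean_step(fc):
    """Average displacement of the walk per time step."""
    return sum(step * prob for step, prob in step_probabilities(fc).items())
```

The mean step equals 3 fc − 1 (three renewed sites, each active with probability fc, minus the removed minimum), so it vanishes only at fc = 1/3; below that value the walk drifts toward na = 0 and avalanches are cut off.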
The exponents τa and τf are extremely difficult to obtain analytically in this ap-
proximation, although arguments analogous to the preceding ones could still be used.
3.2.5 An exact Master Equation
The Bak-Sneppen model is clearly a Markovian system: the state of the system at time
t + 1 depends only on the state of the system at time t. This being the case, it
should be possible to write an exact master equation for the model in the form [11]:
$$P(x_1, x_2, \ldots, x_N; t+1) = \sum_{i=1}^{N} \int_0^1 T(\{y_j\} \to \{x_j\}; i)\, P(y_1, y_2, \ldots, y_N; t)\, dy_1\, dy_2 \cdots dy_N \tag{3.22}$$
where T ({yj} → {xj}; i) is the transition probability to go from state {yj} to state
{xj} given that the minimum fitness is at site i. This probability does not depend on
time.
The explicit form of T can be written as
$$T(\{y_j\} \to \{x_j\}; i) = \prod_{j \neq i} \theta(y_j - y_i) \prod_{j \neq i,\, i \pm 1} \delta(y_j - x_j) \tag{3.23}$$
where the first product on the r.h.s. ensures that yi is the minimal fitness and the second
one that all fitnesses apart from yi and yi±1 do not change in the transition.
By substituting (3.23) in (3.22) we have an exact master equation for the model.
Still, we are not able to solve it exactly. We have thus to resort to approximations.
In particular we assume the fitness variables to be mutually independent. Then the
probability P(x1, ..., xN) can be written as $P(x_1, \ldots, x_N) = \prod_i p(x_i)$, and substituted
back in the master equation. After some algebra the result is again (3.17). So, in the
mean-field approximation, as expected, correlations are not taken into account. The
possible advantage of this formulation is that in principle the approximation could be
systematically improved by treating exactly, for example, pairs of sites.
3.2.6 Toward a solution: the Maslov equation
We have seen that there are relations among the exponents that allow us to reduce
the number of independent exponents to three. Still, looking more closely at the
dynamics of the model we can further reduce that number to two. Every f-avalanche
of duration s is composed of smaller f′-avalanches with f′ < f. It is possible to use
this property to write a recursive equation for the avalanche probability distribution
that relates µ and τ.
Given an f-avalanche of duration s, what is the probability that it continues into
an (f + df)-avalanche? It is necessary that at least one of the $n \sim s^{\mu}$ sites updated by
the f-avalanche has a new fitness between f and f + df. This feature can be used to
write a relation between f- and (f + df)-avalanches [12]:
$$P(s, f+df) = P(s,f) - \frac{df}{1-f}\, n(s)\, P(s,f) + \frac{df}{1-f} \sum_{s'=1}^{s-1} n(s')\, P(s',f)\, P(s-s',f) \tag{3.24}$$
The second term on the r.h.s. represents the probability that an f-avalanche of duration s
continues into a longer (f + df)-avalanche, therefore not contributing to P(s, f + df);
the third term represents the probability that two f-avalanches of the right durations
glue together to give an (f + df)-avalanche of duration s thanks to a site with fitness between f
and f + df left behind by the first avalanche. Suitable rearrangements lead, in the
limit df → 0, to
$$\frac{\partial P(s,y)}{\partial y} = -s^{\mu} P(s,y) + \sum_{s'=1}^{s-1} (s')^{\mu} P(s',y)\, P(s-s',y) \tag{3.25}$$
where $f = 1 - e^{-y}$. The initial condition for this differential equation is $P(s, 0) = \delta_{s,1}$,
that is, 0-avalanches can last at most 1 time step.
Equation (3.25) cannot be solved exactly, but it can be solved numerically (using
standard numerical integration techniques) and the result is shown in
Fig. 3.8 [13]. As can be seen, the agreement with the
numerical results in d = 1 and 2 and with the RNN model is excellent.
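To give an idea of what such a numerical integration may look like, here is a deliberately naive sketch: an explicit Euler scheme in y, with durations truncated at a hypothetical cutoff `s_max`. It is not the method used in reference [13]; the function name and all parameters are assumptions made for illustration, and n(s) is taken to be exactly s^µ.

```python
def integrate_maslov(mu, y_max, s_max=50, dy=1e-3):
    """Euler integration of dP(s,y)/dy = -s^mu P(s,y) + sum_{s'<s} s'^mu P(s',y) P(s-s',y).

    Returns a list p with p[s] approximating P(s, y_max) for 1 <= s <= s_max
    (index 0 is unused and stays 0).
    """
    weights = [0.0] + [s ** mu for s in range(1, s_max + 1)]
    p = [0.0] * (s_max + 1)
    p[1] = 1.0  # initial condition P(s, 0) = delta_{s,1}
    for _ in range(round(y_max / dy)):
        new_p = p[:]
        for s in range(1, s_max + 1):
            # Gain: two shorter avalanches glued together into one of duration s.
            gain = sum(weights[k] * p[k] * p[s - k] for k in range(1, s))
            # Loss: an avalanche of duration s continues into a longer one.
            new_p[s] = p[s] + dy * (gain - weights[s] * p[s])
        p = new_p
    return p
```

For s = 1 the equation reduces to dP/dy = −P, so P(1, y) = e^{−y} = 1 − f, which offers a simple consistency check of the scheme. Probability that flows to durations above `s_max` is lost, so the cutoff has to be large compared with the durations of interest.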
This is a completely new kind of exponent relation, that allows us to further reduce
the number of independent exponents.
3.3 Conclusions, and perspectives for Self-Organized Criticality
Over the years some progress has been made toward a full understanding of the Bak-
Sneppen model. Still we are lacking fundamental results such as an exact solution in
d = 1 (if such a solution exists), or the knowledge of the upper critical dimension,
above which the mean-field behavior should be recovered. Still all these open prob-
Figure 3.8: Exponent τ vs. exponent µ.
lems are relevant for the sake of science, but are quite irrelevant for the problem that
originally prompted the Bak-Sneppen model: is evolution SOC?
None of the variations of the Bak-Sneppen model has ever recovered the exponents
that come out of the fossil record. From that point of view the model has been
highly unsuccessful. Yet, it has originated many other works where other models have
been shown to reproduce some of the features of evolution, such as the exponent 2 of
mass extinctions. But it is our opinion that the true contribution of the Bak-Sneppen
model is to show that the basic rules of evolution, that is, mutation and selection, can lead a
system to be critical, in qualitative agreement with the paleontological record. And
indeed mutation and selection are the basic rules common to all these models.
What about the catastrophe theory, then? We are not in a position to claim validity
for one rather than for the other. What we can say, instead, is that critical systems are
extremely sensitive to perturbations, and therefore even if the internal criticality of an
evolving system is not directly responsible for mass extinctions, it prepares the system
in a sensitive state for catastrophes. The two mechanisms in our view do not compete,
but cooperate.
What about SOC in general, what about its successes and its future? Actually, in
time, the scientific community is recognizing that either we apply the SOC definition to
a too large set of systems, or we drop it altogether. Indeed there are too many systems
that show spontaneous criticality to believe that a few simple mechanisms can explain
them all. But at the same time we have to thank SOC if the last ten years have seen the
physics community become much more aware of systems and problems coming from
other disciplines, sometimes contiguous, such as geology or biology, sometimes very
far, such as economics and even sociology.
Fifteen years after the first SOC paper appeared, the scenario for statistical physics
has greatly changed and the SOC community has surely been at the heart of this
mutation.
3.4 Bibliography
[1] B. Gutenberg and C.F. Richter, Ann. Geofis. 9, 1 (1956).
[2] S. Field, J. Witt, F. Nori and X. Ling, Phys. Rev. Lett. 74, 1206 (1995).
[3] S.J. Gould and N. Eldredge, Paleobiology 3, 114 (1977).
[4] R.V. Solé, S.C. Manrubia, M. Benton and P. Bak, Nature 388, 764 (1997).
[5] P. Bak and K. Sneppen, Phys. Rev. Lett. 71, 4083 (1993).
[6] P. Grassberger, Phys. Lett. A 200, 277 (1995).
[7] M. Paczuski, S. Maslov and P. Bak, Phys. Rev. E 53, 414 (1996), and references
therein.
[8] M. Paczuski, S. Maslov and P. Bak, Europhys. Lett. 27, 97 (1994).
[9] W. Feller, An Introduction to Probability Theory and Its Applications, vol. 1 (J.
Wiley and Sons, New York, 1968).
[10] H. Flyvbjerg, K. Sneppen and P. Bak, Phys. Rev. Lett. 71, 4087 (1993).
[11] M. Vendruscolo, P. De Los Rios and L. Bonesi, Phys. Rev. E 54, 6053 (1996).
[12] S. Maslov, Phys. Rev. Lett. 77, 1182 (1996).
[13] M. Marsili, P. De Los Rios and S. Maslov, Phys. Rev. Lett. 80, 1457 (1998).
Part Four
Bio-inspired systems, evolutionary algorithms
and neural networks
Marco Tomassini
Institut d’informatique, Université de Lausanne
Collège Propédeutique, CH-1015 Lausanne
Chapter 4
Evolving Cellular Automata
4.1 What Are Cellular Automata?
Cellular automata (CA) were originally conceived by Ulam and von Neumann in the
1940s to provide a formal framework for investigating the behavior of complex, ex-
tended systems [52]. CAs are dynamical systems in which space and time are discrete.
A cellular automaton consists of an array of cells, each of which can be in one of a
finite number of possible states, updated synchronously in discrete time steps, accord-
ing to a local, identical interaction rule. The state of a cell at the next time step is
determined by the current states of a surrounding neighborhood of cells [34,46,55].
The cellular array (grid) is n-dimensional, where n = 1, 2, 3 is used in practice;
in this volume we shall concentrate on n = 1, 2, i.e., one- and two-dimensional grids.
The identical rule contained in each cell is essentially a finite state machine, usually
specified in the form of a rule table (also known as the transition function), with an
entry for every possible neighborhood configuration of states. The cellular neighbor-
hood of a cell consists of the surrounding (adjacent) cells. For one-dimensional CAs,
a cell is connected to r local neighbors (cells) on either side, as well as to itself, where
r is a parameter referred to as the radius (thus, each cell has 2r + 1 neighbors). For
two-dimensional CAs, two types of cellular neighborhoods are usually considered: 5
cells, consisting of the cell along with its four immediate nondiagonal neighbors, and
9 cells, consisting of the cell along with its eight surrounding neighbors. When consid-
ering a finite-sized grid, spatially periodic boundary conditions are frequently applied,
resulting in a circular grid for the one-dimensional case, and a toroidal one for the
two-dimensional case.
As an example, let us consider the parity rule (also known as the XOR rule) for a
2-state, 5-neighbor, two-dimensional CA. Each cell is assigned a state of 1 at the next
time step if the parity of its current state and the states of its four neighbors is odd,
and is assigned a state of 0 if the parity is even (alternatively, this may be considered a
modulo-2 addition). The rule table consists of entries of the form:
      0
    1 1 0  →  1
      1

This means that if the current state of the cell is 1 and the states of the north, east,
south, and west cells are 0, 0, 1, 1, respectively, then the state of the cell at the next
time step will be 1 (odd parity). The rule is completely specified by the rule table
given in Table 4.1. Figure 4.1 demonstrates patterns that are produced by the parity
CA.

Table 4.1: Parity rule table. CNESW denotes the current states of the center, north,
east, south, and west cells, respectively. Snext is the cell’s state at the next time step.

CNESW  Snext    CNESW  Snext    CNESW  Snext    CNESW  Snext
00000    0      01000    1      10000    1      11000    0
00001    1      01001    0      10001    0      11001    1
00010    1      01010    0      10010    0      11010    1
00011    0      01011    1      10011    1      11011    0
00100    1      01100    0      10100    0      11100    1
00101    0      01101    1      10101    1      11101    0
00110    0      01110    1      10110    1      11110    0
00111    1      01111    0      10111    0      11111    1
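As an illustration, one synchronous update of the parity rule can be written as follows. This is a minimal sketch, assuming the grid is stored as a list of lists of 0/1 values with toroidal (periodic) boundaries; the function name `parity_step` is a hypothetical choice.

```python
def parity_step(grid):
    """One synchronous update of the 5-neighbor parity (XOR) rule on a torus."""
    rows, cols = len(grid), len(grid[0])
    new_grid = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = (
                grid[r][c]                   # center
                + grid[(r - 1) % rows][c]    # north
                + grid[r][(c + 1) % cols]    # east
                + grid[(r + 1) % rows][c]    # south
                + grid[r][(c - 1) % cols]    # west
            )
            # Next state is 1 for odd parity, 0 for even parity.
            new_grid[r][c] = total % 2
    return new_grid
```

All new states are written to a separate grid, so every cell sees the old states of its neighbors; this is what makes the update synchronous.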
4.2 Formal Definitions
A d-dimensional CA consists of a finite or infinite d-dimensional grid of cells, each
of which can take on a value from a finite, typically small, set of integers. The value
of each cell at time step t is a function of the values of a small local neighborhood of
cells at time t − 1. The cells update their states simultaneously according to a given
local rule.
Formally, a cellular automaton A is a quadruple
A = (S, G, d, f),
where S is a finite set of states, G is the cellular neighborhood, $d \in \mathbb{Z}^+$ is the dimension
of A, and f is the local cellular interaction rule, also referred to as the transition
function.
Given the position of a cell, i, $i \in \mathbb{Z}^d$, in a regular d-dimensional uniform lattice,
or grid (i.e., i is an integer vector in a d-dimensional space), its neighborhood G is
defined by:
$$G_i = \{i,\ i + r_1,\ i + r_2,\ \ldots,\ i + r_n\},$$
where n is a fixed parameter that determines the neighborhood size, and $r_j$ is a fixed
vector in the d-dimensional space.
Figure 4.1: Patterns produced by the parity rule, starting from a 20 × 20 rectangular
pattern. White squares represent cells in state 0, black squares represent cells in state
1. (a) after 30 time steps (t = 30), (b) t = 60, (c) t = 90, (d) t = 120.
The local transition rule f,
$f : S^n \to S$,
maps the state $s_i \in S$ of a given cell i into another state from the set S, as a function
of the states of the cells in the neighborhood $G_i$. In uniform CAs f is identical for
all cells, whereas in non-uniform ones f may differ between different cells, i.e., f
depends on i and is written $f_i$.
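The neighborhood definition can be illustrated with a short Python sketch. The helper name `neighborhood`, the offset list `VON_NEUMANN`, and the periodic wrap-around are assumptions introduced for this example only.

```python
def neighborhood(i, offsets, shape):
    """Return G_i = [i, i + r_1, ..., i + r_n] on a periodic d-dimensional lattice.

    i       -- position of the cell, a tuple of d integers
    offsets -- the fixed vectors r_1, ..., r_n (tuples of d integers)
    shape   -- lattice size along each dimension (periodic wrap is assumed)
    """
    cells = [tuple(i)]
    for r in offsets:
        cells.append(tuple((ik + rk) % nk for ik, rk, nk in zip(i, r, shape)))
    return cells


# Offsets of the 5-cell neighborhood: north, east, south, west as (row, column).
VON_NEUMANN = [(-1, 0), (0, 1), (1, 0), (0, -1)]

corner = neighborhood((0, 0), VON_NEUMANN, (4, 4))
```

In a non-uniform CA one would additionally keep one rule per position, for example a dictionary mapping each cell i to its own rule f_i.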
For a finite-size CA of size N (such as those treated in this book) a configuration
of the grid at time t is defined as
$C(t) = (s_0(t), s_1(t), \ldots, s_{N-1}(t))$,
where $s_i(t) \in S$ is the state of cell i at time t. The progression of the CA in time is
then given by the iteration of the global mapping F
$F : C(t) \to C(t + 1), \quad t = 0, 1, \ldots$
through the simultaneous application in each cell of the local transition rule f. The
global dynamics of the CA can be described as a directed graph, referred to as the
CA’s phase space [55].
An oft-explored system is that of one-dimensional CAs with two possible states
per cell, i.e., $S = \{0, 1\}$. In this case f is a function $f : \{0,1\}^n \to \{0,1\}$ and the
neighborhood size n is usually taken to be n = 2r + 1 such that:
$s_i(t + 1) = f(s_{i-r}(t), \ldots, s_i(t), \ldots, s_{i+r}(t))$,
where $r \in \mathbb{Z}^+$ is a parameter, known as the radius, representing the standard
one-dimensional cellular neighborhood. Considering the r = 1 case one obtains so-called
elementary CAs, for which the neighborhood size is n = 3:
$f : \{0,1\}^3 \to \{0,1\}$,
$s_i(t + 1) = f(s_{i-1}(t), s_i(t), s_{i+1}(t))$.
The domain of f is the set of all $2^3$ 3-tuples, which gives rise to $2^8 = 256$ distinct
elementary rules. It is common to use Wolfram’s decimal numbering convention for
describing these rules [55] (see footnote 1). For two-state CAs a configuration of a size-N
grid at time t is a binary sequence $C(t) \in \{0,1\}^N$. For finite-size grids, spatially periodic
boundary conditions are frequently assumed, resulting in a circular grid; formally, this
implies that cellular indices are computed modulo N.
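The following Python sketch, offered as an illustration rather than as part of the original text, shows how an elementary rule table maps to its Wolfram number and how the global map F is applied with indices computed modulo N. The function names are assumptions of this example.

```python
def rule_number(f):
    """Wolfram number of an elementary rule.

    f maps (left, center, right) to 0 or 1; the outputs for 111, 110, ..., 000
    are read as the bits of a binary number, most significant bit first.
    """
    return sum(f[(a, b, c)] << (4 * a + 2 * b + c)
               for a in (0, 1) for b in (0, 1) for c in (0, 1))


def rule_table(number):
    """Inverse of rule_number: build the rule table from a number in 0..255."""
    return {(a, b, c): (number >> (4 * a + 2 * b + c)) & 1
            for a in (0, 1) for b in (0, 1) for c in (0, 1)}


def step(config, f):
    """Global map F: apply f to every cell simultaneously, indices modulo N."""
    n = len(config)
    return [f[(config[(i - 1) % n], config[i], config[(i + 1) % n])]
            for i in range(n)]


# The rule given in the footnote: f(111)=1, f(110)=0, f(101)=1, f(100)=1,
# f(011)=1, f(010)=0, f(001)=0, f(000)=0.
footnote_rule = {(1, 1, 1): 1, (1, 1, 0): 0, (1, 0, 1): 1, (1, 0, 0): 1,
                 (0, 1, 1): 1, (0, 1, 0): 0, (0, 0, 1): 0, (0, 0, 0): 0}
print(rule_number(footnote_rule))  # 184
```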
4.3 Cellular automata as complex and computational systems
As noted above, the CA model was originally introduced in the late 1940s by Ulam and
von Neumann and used extensively by the latter to study issues related to the logic of
life [52]. In particular, von Neumann asked whether we can use purely mathematical-
logical considerations to discover the specific features of biological automata that make
them self-replicating.
Von Neumann used two-dimensional CAs with 29 states per cell and a 5-cell
neighborhood. He showed that a universal computer can be embedded in such cellular space,
namely, a device whose computational power is equivalent to that of a universal Turing
machine [19]. He also described how a universal constructor may be built, namely, a
machine capable of constructing, through the use of a “constructing arm,” any
configuration whose description can be stored on its input tape. This universal constructor
is therefore capable, given its own description, of constructing a copy of itself, i.e.,
of self-replicating (Figure 4.2). The terms ‘machine’ and ‘tape’ refer here to
configurations, i.e., patterns of states (as defined in Section 4.2). The mechanisms von
Neumann proposed for achieving self-replicating structures within a cellular automaton
bear strong resemblance to those employed by biological life, discovered during
the following decade. Von Neumann’s universal computer-constructor was simplified
by [7] who used an 8-state, 5-neighbor cellular space.

Footnote 1: For example, f(111) = 1, f(110) = 0, f(101) = 1, f(100) = 1, f(011) = 1,
f(010) = 0, f(001) = 0, f(000) = 0, is denoted rule 184 (the decimal equivalent of 10111000).
[Figure 4.2 labels: PARENT (UC, TAPE, CONSTRUCTING ARM); OFFSPRING (UC, TAPE).]
Figure 4.2: A schematic diagram of von Neumann’s self-replicating cellular automa-
ton. The system is a universal constructor (UC), namely, a machine capable of con-
structing, through the use of a “constructing arm,” any configuration whose description
can be stored on its input tape. This universal constructor is therefore capable, given its
own description, of constructing a copy of itself, i.e., of self-replicating. (The machine
is not drawn to scale.)
Over the years CAs have been applied to the study of general phenomenological
aspects of the world, including communication, computation, construction, growth,
reproduction, competition, and evolution (see, e.g., [4,34,46]). One of the best-known
CA rules, the “game of life,” was conceived by Conway in the late 1960s and
was shown by him to be computation-universal [2]. For a review of
computation-theoretic CA results refer to [9].
The question of whether cellular automata can model not only general phenomeno-
logical aspects of our world, but also directly model the laws of physics themselves was
raised by [14,43]. A primary theme of this research is the formulation of computational
models of physics that are information-preserving, and thus retain one of the most
fundamental features of microscopic physics, namely, reversibility [14,26,44]. This
approach has been used to provide extremely simple models of common differential
equations of physics, such as the heat and wave equations [45] and the Navier-Stokes
equation [15]. CAs also provide a useful model for a branch of dynamical systems
theory which studies the emergence of well-characterized collective phenomena, such
as ordering, turbulence, chaos, symmetry-breaking, and fractality, in discrete systems
[6,50].
The systematic study of CAs in this latter context was pioneered by Wolfram, who
studied them extensively [55]. He investigated CAs and their relationships to
dynamical systems, identifying the following four qualitative classes of CA behavior,
with analogs in the field of dynamical systems (the latter are shown in parentheses; see
also [24]):
1. Class I relaxes to a homogeneous state (limit points).
2. Class II converges to simple separated periodic structures (limit cycles).
3. Class III yields chaotic aperiodic patterns (chaotic behavior of the kind associ-
ated with strange attractors).
4. Class IV yields complex patterns of localized structures, including propagating
structures (very long transients with no apparent analog in continuous dynamical
systems).
Figure 4.3 demonstrates these four classes using one-dimensional CAs (as studied by
Wolfram). It should be noted that Wolfram’s classification is purely phenomenologi-
cal, based on experimental evidence gleaned via computer simulation of CAs. As such,
one should regard it as a rough guide – for example, it provides no clue as to how a CA
of given behavior can be constructed. Gutowitz [18], among others, proposed a more
rigorous, statistical classification of CA rules, based on Markov fields and mean-field
theory. Finally, we also remark that CAs have been used in biological modeling [12].
We have seen above that CAs have been used as a formal model for studying phe-
nomena of interest in several scientific fields, including physics, biology, and computer
science. In recent years there has been growing interest in the utilization of CAs as actual
computing devices. CAs exhibit three notable features: massive parallelism, locality
of cellular interactions, and simplicity of basic components (cells). They perform com-
putations in a distributed fashion on a spatially extended grid. As such they differ from
the standard approach to parallel computation in which a problem is split into indepen-
dent sub-problems, each solved by a different processor, later to be combined in order
to yield the final solution. CAs suggest a new approach in which complex behavior
arises in a bottom-up manner from non-linear, spatially extended, local interactions.
This is often referred to as emergent computation, meaning the appearance of global
information processing capabilities that are not explicitly represented in the system’s
elementary components or in their local interconnections [13]. The CA’s properties
greatly facilitate its implementation as electronic hardware [35]. CAs also suggest
a possible approach to attaining novel computational architectures at the nanometer
scale [1].
When considering CAs that perform computations two possibilities manifest
themselves: (1) Embedding a universal Turing machine within the CA, or (2) using the CA
in a direct, parallel manner: the input to the computation is encoded as an initial
configuration, the output is the configuration after a certain number of time steps, and the
intermediate steps that transform the input to the output are considered to be the steps
in the computation. In this latter case, the “program” emerges through “execution” of
the CA rule in each cell.

Figure 4.3: Wolfram classes. One-dimensional CAs are shown, where the horizontal
axis depicts the configuration at a certain time t and the vertical axis depicts successive
time steps (increasing down the page). CAs are binary (2 states per cell) with radius
r = 2 (two neighbors on both sides of the cell). (a) Class I. (b) Class II. (c) Class III.
(d) Class IV.
4.4 Variations on the Original Model
In this section we briefly outline a number of variations of the original, classic CA
model, presented above. These variations concern the cellular rules, the connectivity
architectures, temporal considerations, and determinism.
4.4.1 Non-Uniform Cellular Automata
Non-uniform cellular automata function in the same way as uniform ones, the only
difference being in the cellular rules that need not be identical for all cells. Note that
non-uniform CAs share the basic “attractive” properties of uniform ones (simplicity,
parallelism, locality). From a hardware point of view we observe that the resources re-
quired by non-uniform CAs are identical to those of uniform ones since a cell in both
cases contains a rule. Although simulations of uniform CAs on serial computers may
optimize memory requirements by retaining a single copy of the rule, rather than have
each cell hold one, this in no way detracts from our argument. Indeed, one of the pri-
mary motivations for studying CAs stems from the observation that they are naturally
suited for hardware implementation with the potential of exhibiting extremely fast and
reliable computation that is robust to noisy input data and component failure.
Non-uniform CAs have been investigated by [51] who discuss a 1-D CA in which
a cell probabilistically selects one of two rules at each time step. They showed that
complex patterns characteristic of class IV behavior appear. Garzon [16] presented two
generalizations of cellular automata, namely, discrete neural networks and automata
networks. These were compared to the original model from a computational point of
view which considers the classes of problems such models can solve.
4.4.2 Non-Standard Architectures
Another possible variation concerns the connectivity pattern of the cells, the archi-
tecture, which is standard and homogeneous in the original CA. One can consider
so-called non-standard connectivity architectures, where each cell has a small, identi-
cal number of connections, yet not necessarily from its most immediate neighboring
cells. It can be shown that such architectures are computationally more efficient than
standard architectures in solving certain computational tasks. Furthermore, one can
successfully evolve non-standard architectures using evolutionary computation tech-
niques.
4.4.3 Asynchronous Cellular Automata
One of the prominent features of the CA model is its synchronous mode of opera-
tion, meaning that all cells are updated simultaneously. A preliminary study of asyn-
chronous CAs, where one cell is updated at each time step, was carried out by [21],
where the different dynamical behavior of synchronous and asynchronous CAs was
compared; the authors argued that some of the apparent self-organization of CAs is an
artifact of the synchronization of the clocks. Wolfram [55] noted that asynchronous
updating makes it more difficult for information to propagate through the CA and that,
furthermore, such CAs may be harder to analyze. Asynchronous CAs have also been
discussed in [29,3].
4.4.4 Probabilistic Cellular Automata
In a deterministic cellular automaton, for any given input, the system always goes
through the same trajectory of states, ending with the same output. For a
nondeterministic, or probabilistic, CA the same input may result in different trajectories, and
possibly different outputs. Nondeterminism may be inherent to the system’s
functional definition or it may result from faults. As an example, consider a two-state CA,
where a cell updates its state in a nondeterministic manner, setting it at the next time
step to that specified in the rule table, with probability $1 - p_f$, or the complementary
state, with probability $p_f$. The value $p_f$ can be regarded as the probability that a cell
will malfunction (this type of fault was studied, e.g., by [38]).
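The faulty-update example can be sketched in Python as follows. This is only an illustration under stated assumptions: a one-dimensional, r = 1 grid with periodic boundaries, a majority rule table chosen arbitrarily as `MAJORITY`, and the hypothetical function name `faulty_step`.

```python
import random

# An arbitrary example rule table (majority of the three cells), keyed by
# (left, center, right); any two-state rule table could be used instead.
MAJORITY = {(a, b, c): int(a + b + c >= 2)
            for a in (0, 1) for b in (0, 1) for c in (0, 1)}


def faulty_step(config, table, p_f, rng):
    """One synchronous update in which each cell malfunctions with probability p_f.

    With probability 1 - p_f the cell takes the state given by the rule table;
    with probability p_f it takes the complementary state.
    """
    n = len(config)
    new = []
    for i in range(n):
        state = table[(config[(i - 1) % n], config[i], config[(i + 1) % n])]
        if rng.random() < p_f:   # the cell malfunctions
            state = 1 - state
        new.append(state)
    return new


rng = random.Random(42)
noisy = faulty_step([1, 1, 0, 0, 1, 0], MAJORITY, p_f=0.1, rng=rng)
```

Setting `p_f=0` recovers the deterministic CA, while `p_f=1` flips every output bit.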
4.5 Artificial Evolution of Cellular Automata
The idea of applying the biological principle of natural evolution to artificial systems,
introduced more than three decades ago, has seen impressive growth in the past few
years. Usually grouped under the term evolutionary algorithms or evolutionary com-
putation, one finds such diverse domains as genetic algorithms, evolution strategies,
evolutionary programming, and genetic programming. Central to all these different
methodologies is the idea of solving problems by evolving an initially random pool
of possible solutions, through the application of “genetic” operators, such that in time
“fitter” (i.e., better) solutions emerge.
Research in these areas has traditionally centered on proving theoretical aspects,
such as convergence properties, effects of different algorithmic parameters, and so on,
or on making headway in new application domains, such as constraint optimization
problems, image processing, neural network evolution, and more. The implementa-
tion of an evolutionary algorithm, an issue which usually remains in the background,
is quite costly in many cases, since populations of solutions are involved, possibly cou-
pled with computation-intensive fitness evaluations. One possible solution is to par-
allelize the process, an idea which has been explored to some extent in recent years.
While posing no major problems in principle, this may require judicious modifications
of existing algorithms or the introduction of new ones in order to meet the constraints
of a given parallel machine.
Here a different approach is taken; rather than ask ourselves how to better imple-
ment a specific algorithm on a given hardware platform, we pose the more general
question of whether machines can be made to evolve. While this idea finds its ori-
gins in the cybernetics movement of the 1940s and 1950s, it has recently resurged
in the form of the nascent field of bio-inspired systems and evolvable hardware [31].
The field draws on ideas from evolutionary computation as well as on recent hardware
developments.
Our evolving machines are based on the cellular automata model described in sec-
tion 4.1. A one-dimensional CA is illustrated in Figure 4.4 (based on [27]).
Rule table:
neighborhood:  111  110  101  100  011  010  001  000
output bit:     1    1    1    0    1    0    0    0

Grid:
t = 0:  0 1 1 0 1 0 1 1 0 1 1 0 0 1 1
t = 1:  1 1 1 1 0 1 1 1 1 1 1 0 0 1 1
Figure 4.4: Illustration of a one-dimensional, 2-state CA. The connectivity radius is
r = 1, meaning that each cell has two neighbors, one to its immediate left and one to
its immediate right. Grid size is N = 15. The rule table for updating the grid is shown
on top. The grid configuration over one time step is shown at the bottom. Spatially
periodic boundary conditions are applied, meaning that the grid is viewed as a circle,
with the leftmost and rightmost cells each acting as the other’s neighbor.
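The single time step shown in Figure 4.4 can be checked with a few lines of Python. The names `RULE` and `step` are assumptions of this sketch; the rule table and the t = 0 grid are those of the figure.

```python
# Rule table of Figure 4.4, keyed by (left, center, right).
RULE = {(1, 1, 1): 1, (1, 1, 0): 1, (1, 0, 1): 1, (1, 0, 0): 0,
        (0, 1, 1): 1, (0, 1, 0): 0, (0, 0, 1): 0, (0, 0, 0): 0}


def step(grid, rule):
    """Synchronous update with spatially periodic boundary conditions."""
    n = len(grid)
    return [rule[(grid[(i - 1) % n], grid[i], grid[(i + 1) % n])]
            for i in range(n)]


t0 = [0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1]   # N = 15
t1 = step(t0, RULE)
print(t1)  # [1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1]
```

The leftmost cell illustrates the periodic boundary: its left neighbor is the rightmost cell (state 1), so its neighborhood is 101 and its next state is 1.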
CAs exhibit three notable features, namely, massive parallelism, locality of cellu-
lar interactions, and simplicity of basic components (cells). As such they are naturally
suited for hardware implementation, with the potential of exhibiting extremely fast
and reliable computation that is robust to noisy input data and component failure. A
major impediment preventing ubiquitous computing with CAs stems from the diffi-
culty of utilizing their complex behavior to perform useful computations. Designing
CAs to exhibit a specific behavior or to perform a particular task is highly compli-
cated, thus severely limiting their applications. This results from the local dynamics
of the system, which renders the design of local rules to perform global computational
tasks extremely arduous. Automating the design (programming) process would greatly
enhance the viability of CAs [28,34].
The model investigated here is an extension of the CA model, termed non-uniform
cellular automata. Such automata function in the same way as uniform ones, the only
difference being in the cellular rules that need not be identical for all cells. Our main
focus is on the evolution of non-uniform CAs to perform computational tasks, using
the cellular programming approach. The input to the computation is encoded as an
initial configuration and the output is the configuration after a certain number of time
steps. We shall first introduce the algorithm, followed by several problems to which
it has been applied. We then study a number of related issues, including the evolution
of connectivity architectures, asynchronous CAs, evolving ware (evolware), and faulty
CAs. Uniform CAs have been evolved previously for performing computational tasks
(see, for example, [27,28]). Since our focus here is on non-uniform CAs, these uniform
evolved CAs will be treated briefly in section 4.9.1.
4.6 The Cellular Programming Algorithm
We study 2-state, non-uniform CAs, in which each cell may contain a different rule.
A cell’s rule table is encoded as a bit string (the “genome”), containing the next-state
(output) bits for all possible neighborhood configurations, listed in lexicographic
order; e.g., for CAs with r = 1, the genome consists of 8 bits, where the bit at position 0
is the state to which neighborhood configuration 000 is mapped, and so on until bit 7,
corresponding to neighborhood configuration 111. Rather than employ a population of
evolving, uniform CAs, as with genetic algorithm approaches, our algorithm involves a
single, non-uniform CA of size N, with cell rules initialized at random. Initial
configurations are then generated at random, in accordance with the task at hand, and for each
one the CA is run for M time steps. Each cell’s fitness is accumulated over C = 300
initial configurations, where a single run’s score is 1 if the cell is in the correct state
after M iterations, and 0 otherwise. After every C configurations, evolution of rules
occurs by applying crossover and mutation. This evolutionary process is performed in
a completely local manner, where genetic operators are applied only between directly
connected cells. It is driven by $nf_i(c)$, the number of fitter neighbors of cell i after c
configurations. The pseudo-code of the algorithm is delineated in Figure 4.5.
Crossover between two rules is performed by selecting at random (with uniform
probability) a single crossover point and creating a new rule by combining the first
rule’s bit string before the crossover point with the second rule’s bit string from this
point onward. Mutation is applied to the bit string of a rule with probability 0.001 per
bit.
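The genome encoding and the two genetic operators can be sketched in Python as below. This is an illustrative sketch, not the authors' code: the function names are hypothetical, and restricting the crossover point to the interior of the string (so that both parents contribute) is an assumption, since the text only states that the point is chosen uniformly at random.

```python
import random


def next_state(genome, left, center, right):
    """Look up the output bit: position 0 encodes 000, position 7 encodes 111."""
    return genome[4 * left + 2 * center + right]


def crossover(rule_a, rule_b, rng):
    """Single-point crossover: head of rule_a joined with tail of rule_b.

    Assumption: the point is drawn uniformly from 1 .. len - 1.
    """
    point = rng.randrange(1, len(rule_a))
    return rule_a[:point] + rule_b[point:]


def mutate(genome, rng, p=0.001):
    """Flip each bit independently with probability p (0.001 in the text)."""
    return [1 - bit if rng.random() < p else bit for bit in genome]


rng = random.Random(0)
majority = [0, 0, 0, 1, 0, 1, 1, 1]          # genome of rule 232
child = mutate(crossover([0] * 8, [1] * 8, rng), rng)
```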
There are two main differences between the cellular programming algorithm and
the standard genetic algorithm: (a) The latter involves a population of evolving, uni-
form CAs; all CAs are ranked according to fitness, with crossover occurring between
any two individuals in the population. Thus, while the CA runs in accordance with
a local rule, evolution proceeds in a global manner. In contrast, the cellular program-
ming algorithm proceeds locally in the sense that each cell has access only to its locale,
not only during the run but also during the evolutionary phase, and no global fitness
ranking is performed. (b) The standard genetic algorithm involves a population of in-
dependent problem solutions; the CAs in the population are assigned fitness values
independent of one another, and interact only through the genetic operators in order
to produce the next generation. In contrast, our CA coevolves since each cell’s fit-
ness depends upon its evolving neighbors. This may also be considered a form of
symbiotic cooperation, which falls, as does coevolution, under the general heading of
“ecological” interactions (see [27], pages 182-183).
for each cell i in CA do in parallel
    initialize rule table of cell i
    f_i = 0 { fitness value }
end parallel for
c = 0 { initial configurations counter }
while not done do
    generate a random initial configuration
    run CA on initial configuration for M time steps
    for each cell i do in parallel
        if cell i is in the correct final state then
            f_i = f_i + 1
        end if
    end parallel for
    c = c + 1
    if c mod C = 0 then { evolve every C configurations }
        for each cell i do in parallel
            compute nf_i(c) { number of fitter neighbors }
            if nf_i(c) = 0 then rule i is left unchanged
            else if nf_i(c) = 1 then replace rule i with the fitter neighboring rule,
                followed by mutation
            else if nf_i(c) = 2 then replace rule i with the crossover of the two fitter
                neighboring rules, followed by mutation
            else if nf_i(c) > 2 then replace rule i with the crossover of two randomly
                chosen fitter neighboring rules, followed by mutation
                (this case can occur if the cellular neighborhood includes
                more than two cells)
            end if
            f_i = 0
        end parallel for
    end if
end while
Figure 4.5: Pseudo-code of the cellular programming algorithm.
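A serial Python sketch of the pseudo-code in Figure 4.5 is given below for the one-dimensional, r = 1 case, using the density task as the example target. It is a sketch under stated assumptions rather than the authors' implementation: the stopping criterion (a fixed number of generations), the way initial configurations are sampled, the interior crossover point, and all function names are assumptions of this example.

```python
import random


def run_ca(rules, config, steps):
    """Run a non-uniform r = 1 CA; rules[i] is the 8-bit genome of cell i."""
    n = len(config)
    for _ in range(steps):
        config = [rules[i][4 * config[(i - 1) % n] + 2 * config[i] + config[(i + 1) % n]]
                  for i in range(n)]
    return config


def random_config(n, rng):
    """Assumed sampling: draw the number of 1s uniformly, then shuffle."""
    ones = rng.randint(0, n)
    config = [1] * ones + [0] * (n - ones)
    rng.shuffle(config)
    return config


def density_target(config):
    """Density task: all 1s if more than half the cells are 1, else all 0s."""
    bit = 1 if 2 * sum(config) > len(config) else 0
    return [bit] * len(config)


def crossover(a, b, rng):
    point = rng.randrange(1, len(a))      # assumed: interior crossover point
    return a[:point] + b[point:]


def mutate(genome, rng, p):
    return [1 - g if rng.random() < p else g for g in genome]


def cellular_programming(n, m, c_per_gen, generations, rng, p_mut=0.001):
    """Evolve one rule per cell; returns the list of evolved genomes."""
    rules = [[rng.randint(0, 1) for _ in range(8)] for _ in range(n)]
    fitness = [0] * n
    for c in range(1, generations * c_per_gen + 1):
        init = random_config(n, rng)
        final = run_ca(rules, init, m)
        goal = density_target(init)
        for i in range(n):
            fitness[i] += int(final[i] == goal[i])
        if c % c_per_gen == 0:            # evolve every C configurations
            new_rules = []
            for i in range(n):
                fitter = [j for j in ((i - 1) % n, (i + 1) % n)
                          if fitness[j] > fitness[i]]
                if not fitter:            # nf_i(c) = 0: keep the rule
                    new_rules.append(rules[i])
                elif len(fitter) == 1:    # nf_i(c) = 1: copy, then mutate
                    new_rules.append(mutate(rules[fitter[0]], rng, p_mut))
                else:                     # nf_i(c) >= 2: crossover, then mutate
                    a, b = rng.sample(fitter, 2)
                    new_rules.append(
                        mutate(crossover(rules[a], rules[b], rng), rng, p_mut))
            rules = new_rules
            fitness = [0] * n
    return rules
```

All new rules are computed from the old rules and old fitness values before any cell is overwritten, which mirrors the "in parallel" semantics of the pseudo-code. A call such as `cellular_programming(n=149, m=150, c_per_gen=300, generations=10, rng=random.Random(0))` uses the parameter values quoted in the text; the number of generations is arbitrary.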
This latter point comprises a prime difference between our algorithm and parallel
genetic algorithms, which have attracted attention over the past few years. These aim
to exploit the inherent parallelism of evolutionary algorithms, thereby decreasing com-
putation time and enhancing performance. A number of models have been suggested,
among them coarse-grained, island models [8,41,42], and fine-grained, grid models
[25,47]. The latter resemble our system in that they are massively parallel and local;
however, the coevolutionary aspect is missing. As we wish to attain a system dis-
playing global computation, the individual cells do not evolve independently as with
genetic algorithms (be they parallel or serial), i.e., in a “loosely coupled” manner, but
rather coevolve, thereby comprising a “tightly coupled” system.
4.7 Applications of Cellular Programming
In this section we study some of the computational tasks for which non-uniform CAs
have been evolved: density, synchronization, ordering, and random number
generation. Minimal cellular spaces are used: 2-state, r = 1 for the one-dimensional case
and 2-state, 5-neighbor for the two-dimensional one. Spatially periodic boundary
conditions are applied, resulting in a circular grid for the one-dimensional case, and a
toroidal one for the two-dimensional case. The total number of initial configurations
per evolutionary run was in the range $[10^5, 10^6]$. Performance values reported
hereafter represent the average fitness of all grid cells after C configurations, normalized
to the range [0, 1]; these are obtained during execution of the cellular programming
algorithm.
4.7.1 The Density Task
The one-dimensional density task is to decide whether or not the initial
configuration contains more than 50% 1s, relaxing to a fixed-point pattern of all 1s if the initial
density of 1s exceeds 0.5, and all 0s otherwise. As noted by [28], the density task
comprises a non-trivial computation for a small-radius CA ($r \ll N$, where N is the grid
size). Density is a global property of a configuration whereas a small-radius CA relies
solely on local interactions. Since the 1s can be distributed throughout the grid,
propagation of information must occur over large distances (i.e., O(N)). The minimum
amount of memory required for the task is O(log N) using a serial-scan algorithm,
thus the computation involved corresponds to recognition of a non-regular language.
Note that the density task cannot be perfectly solved by a uniform, two-state CA, as
proven by [23]. (This result applies to the above statement of the problem, where the
CA’s final pattern (i.e., output) is specified as a fixed-point configuration. Interestingly,
it has recently been proven that by changing the output specification, namely, the fi-
nal pattern toward which the system should converge, a two-state, r = 1 uniform CA
exists that can perfectly solve the density problem [5].)
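The O(log N) memory figure for a serial scan can be made concrete with a short Python sketch. It is an illustration only: the function names are hypothetical, and the single signed counter is one simple way to realize a serial scan, not necessarily the one the cited authors had in mind.

```python
def density_by_serial_scan(config):
    """Classify density with one pass and a single signed counter.

    The counter stays within [-N, N], so it needs only O(log N) bits.
    Returns 1 if more than half of the cells are 1, otherwise 0.
    """
    counter = 0
    for bit in config:
        counter += 1 if bit else -1
    return 1 if counter > 0 else 0


def counter_bits(n):
    """Bits needed to distinguish the 2N + 1 possible counter values."""
    return (2 * n).bit_length()


print(density_by_serial_scan([1, 1, 0, 1, 0]))  # 1
print(counter_bits(149))                         # 9
```

A CA has no such central counter, which is why the task requires information to propagate across the whole grid.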
We studied this task in [32] using non-uniform, one-dimensional, minimal radius
r = 1 CAs of size N = 149. The search space involved is extremely large; since each
cell contains one of $2^8$ possible rules this space is of size $(2^8)^{149} = 2^{1192}$. In contrast,
the size of uniform, r = 1 CA rule space is small, consisting of only $2^8 = 256$ rules.
This enabled us to test each and every one of these rules on the density task, a feat not
possible for larger values of r. One of our major results is that evolved non-uniform,
r = 1 CAs outperform any possible uniform, r = 1 CA [32].
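The sizes of the two search spaces follow from simple counting, which can be verified with Python's arbitrary-precision integers; the variable names are assumptions of this sketch.

```python
N = 149                      # grid size used in the text
rules_per_cell = 2 ** 8      # 8 output bits per r = 1 rule table

uniform_space = rules_per_cell           # one rule shared by all cells
non_uniform_space = rules_per_cell ** N  # an independent rule in every cell

# (2^8)^149 = 2^(8 * 149) = 2^1192
print(uniform_space)                          # 256
print(non_uniform_space == 2 ** (8 * N))      # True
```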
For the cellular programming algorithm we used randomly generated initial con-
figurations, uniformly distributed over densities in the range [0, 1], with the CA being
run for M = 150 time steps (thus, computation time is linear with grid size). We found
that non-uniform CAs had coevolved that exhibit performance values as high as 0.93
(in comparison, the maximal performance of uniform r = 1 CAs is 0.83). Furthermore,
these consist of a grid in which one rule dominates, a situation referred to as
quasi-uniformity. Basically, in a quasi-uniform CA the number of distinct rules is extremely
small with respect to rule-space size; furthermore, the rules are distributed such that a
subset of dominant rules occupies most of the grid.
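Quasi-uniformity can be quantified by counting how often each rule occurs in the grid. The Python sketch below uses the rule counts reported in the text for the CA of Figure 4.6 (146 cells with rule 226, 2 with rule 224, 1 with rule 234); the positions of the non-dominant rules in the list are arbitrary placeholders, and the function name is an assumption.

```python
from collections import Counter


def rule_distribution(rule_numbers):
    """Return (rule, count) pairs sorted from most to least frequent."""
    return Counter(rule_numbers).most_common()


# Rule counts from the text; the ordering within the list is a placeholder.
grid_rules = [226] * 146 + [224] * 2 + [234]

distribution = rule_distribution(grid_rules)
dominant_rule, dominant_count = distribution[0]
dominant_fraction = dominant_count / len(grid_rules)
print(distribution)  # [(226, 146), (224, 2), (234, 1)]
```

Here 3 distinct rules out of 256 are present, and the dominant rule occupies roughly 98% of the grid.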
Figure 4.6: One-dimensional density task: Operation of a coevolved, non-uniform,
r = 1 CA. Grid size is N = 149. White squares represent cells in state 0, black
squares represent cells in state 1. The pattern of configurations is shown through time
(which increases down the page). Initial configurations were generated at random. Top
figures depict space-time diagrams, bottom figures depict rule maps. (a) Initial density
of 1s is 0.40. (b) Initial density of 1s is 0.60. The CA relaxes in both cases to a fixed
pattern of all 0s or all 1s, correctly classifying the initial configuration.
Figure 4.6 demonstrates the operation of one such coevolved CA along with a
rule map, depicting the distribution of rules by assigning a unique gray level to each
distinct rule. In this example the grid consists of 146 cells containing rule 226, 2
cells containing rule 224, and 1 cell containing rule 234 (see footnote 2). The non-dominant
rules act as “buffers,” preventing information from flowing too freely, and making local
corrections to passing signals.

Footnote 2: Rule numbers are given in accordance with Wolfram’s convention [53,55],
representing the decimal equivalent of the binary number encoding the rule table. For
example, the rule depicted in Figure 4.4 is rule 232.

The density task can be extended in a straightforward manner to 2-D grids, an
investigation of which we have carried out, attaining notably higher performance than
the one-dimensional case, with values of 0.99; furthermore, computation time, i.e., the
number of time steps taken by the CA until convergence to the correct final pattern, is
shorter.

4.7.2 The Synchronization Task

The one-dimensional synchronization task was introduced by [10] and studied by us
in [17,33] using non-uniform CAs. In this task the CA, given any initial configuration,
must reach a final configuration, within M time steps, that oscillates between all 0s
and all 1s on successive time steps. As with the density task, synchronization also
comprises a non-trivial computation for a small-radius CA.
We studied non-uniform, one-dimensional, minimal radius r = 1 CAs of size
N = 149. As for the density task, all possible uniform, r = 1 CAs were first tested on
this task. For the cellular programming algorithm we used randomly generated initial
configurations, uniformly distributed over densities in the range [0, 1], with the CA
being run for M = 150 time steps. We found that quasi-uniform CAs had coevolved
that exhibit near-perfect performance, which surpasses any possible uniform, r = 1
CA. Figure 4.7 depicts the operation of two CAs: a high-performance uniform CA and
a coevolved, non-uniform CA. We have also experimented with two-dimensional grids,
obtaining highly successful results, as in the one-dimensional case.

4.7.3 The Ordering Task

In this task, the one-dimensional CA, given any initial configuration, must reach a final
configuration in which all 0s are placed on the left side of the grid and all 1s on the right
side (thus the final density equals the initial one; however, the configuration consists of
a block of 0s on the left followed by a block of 1s on the right). It is interesting in
that the output is not a uniform configuration of all 0s or all 1s as with the density and
synchronization tasks. Cellular programming yielded quasi-uniform CAs with fitness
values as high as 0.93, one of which is depicted in Figure 4.8. As with the previous
tasks we were able to ascertain that this performance level is better than any possible
uniform, r = 1 CA.
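The desired outputs of the synchronization and ordering tasks described above can be written down directly, which is what a per-cell fitness evaluation would compare against. The Python sketch below is illustrative; the function names are assumptions.

```python
def ordering_target(config):
    """Ordering task: all 0s on the left, all 1s on the right, same density."""
    ones = sum(config)
    return [0] * (len(config) - ones) + [1] * ones


def is_synchronized(previous, current):
    """Synchronization task: two successive configurations are uniform and
    opposite, i.e. the grid flips between all 0s and all 1s."""
    return (len(set(previous)) == 1 and len(set(current)) == 1
            and previous[0] != current[0])


print(ordering_target([1, 0, 1, 0, 0]))          # [0, 0, 0, 1, 1]
print(is_synchronized([0, 0, 0], [1, 1, 1]))     # True
```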
[Space-time diagrams, time increasing down the page: panels (a) and (b).]
Figure 4.7: One-dimensional synchronization task: Operation of two r = 1 CAs. Grid
size is N = 149. Initial configurations were generated at random. (a) Uniform rule
31 (one of the best-performance uniform CAs for this task). (b) A coevolved, non-
uniform, r = 1 CA.
[Space-time diagrams, time increasing down the page: panels (a) and (b).]
Figure 4.8: One-dimensional ordering task: Operation of a coevolved, non-uniform,
r = 1 CA. (a) Initial density of 1s is 0.315, final density is 0.328. (b) Initial density of
1s is 0.60, final density is 0.59.
4.7.4 Random Number Generation
Random numbers are needed in a variety of applications, yet finding good random
number generators is a difficult task [30]. To generate a random sequence on a digital
computer, one starts with a fixed-length seed, then iteratively applies some
transformation to it, progressively extracting as long a random sequence as possible. Such
numbers are usually referred to as pseudo-random, as distinguished from true random
numbers resulting from some natural physical process. In the last decade cellular
automata have been used to generate random numbers [20,22,54].
In [36,37] we applied the cellular programming algorithm to evolve random
number generators (RNGs). Essentially, the cell’s fitness score for a single configuration
(refer to Figure 4.5) is the entropy of the temporal bit sequence of that cell, with higher
entropy implying better fitness. This fitness measure was used to drive the evolutionary
process, after which standard tests were applied to evaluate the quality of the evolved
CAs. The results obtained suggest that good generators can indeed be evolved; these
exhibit behavior at least as good as that of previously described CAs, with notable
advantages arising from the existence of a “tunable” algorithm for obtaining random
number generators. Figure 4.9 shows the temporal behaviour of one of our more recent
evolved RNGs. For more details, see references [48,49].
Figure 4.9: One-dimensional random number generator: Operation of a coevolved,
non-uniform, r = 1 CA. Essentially, each cell’s sequence of states through time is a
pseudo-random bit stream.
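As an illustration of the entropy-based fitness, the following minimal sketch computes the Shannon entropy of a cell's temporal bit sequence. It assumes that the entropy is measured over overlapping windows of h consecutive bits; the window length h, its default value and the function name window_entropy are assumptions made for illustration and are not taken from [36,37].

```python
import math
from collections import Counter

def window_entropy(bits, h=4):
    """Shannon entropy (in bits) of the overlapping length-h windows of a 0/1 sequence."""
    windows = [tuple(bits[i:i + h]) for i in range(len(bits) - h + 1)]
    counts = Counter(windows)
    total = len(windows)
    # Sum of p * log2(1/p) over the observed windows.
    return sum((c / total) * math.log2(total / c) for c in counts.values())

constant = [0] * 64          # a cell that never changes state
alternating = [0, 1] * 32    # a cell that flips at every time step
print(window_entropy(constant), window_entropy(alternating))
```

Under this measure a constant sequence scores zero and a strictly alternating one scores at most one bit out of a possible h, so both would be rated as poor generators.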
4.8
Asynchronous Cellular Automata
One of the prominent features of the CA model is its synchronous mode of operation,
meaning that all cells are updated simultaneously. In [39,40] we investigated the issue
of evolving asynchronous CAs to perform the density and synchronization tasks. The
grid is partitioned into blocks in which synchronous updating takes place (i.e., all cells
within a block are updated simultaneously), while the blocks themselves are updated
asynchronously (rather than have all blocks updated at once); thus, intra-block updat-
ing is synchronous while inter-block updating is asynchronous. The number of blocks
per grid, #b, is a tunable parameter, entailing a scale of asynchrony, ranging from
complete synchrony (#b = 1) to complete asynchrony (#b = N ). There are two main
differences between our investigation and previous ones: (1) rather than consider only
complete asynchrony (#b = N ), we introduced the above scale; (2) asynchronous
CAs were previously studied from a more abstract point of view, whereas we were
interested in evolving them to perform a veritable computation.
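The block scheme can be sketched as follows for a non-uniform, r = 1 CA with periodic boundaries. The particular update order used here (one block drawn at random per step) is only an assumed example and is not necessarily one of the models studied in [39,40]; the function name and the representation of each cell's rule by its rule number are likewise illustrative choices.

```python
import random

def step_block_async(state, rules, n_blocks, rng):
    """Update one randomly chosen block synchronously; all other cells keep their state."""
    n = len(state)
    block = rng.randrange(n_blocks)
    lo = block * n // n_blocks
    hi = (block + 1) * n // n_blocks
    new_state = list(state)
    for i in range(lo, hi):
        # Neighbourhood (left, self, right) is read from the OLD state,
        # so all cells inside the block are updated simultaneously.
        idx = 4 * state[(i - 1) % n] + 2 * state[i] + state[(i + 1) % n]
        new_state[i] = (rules[i] >> idx) & 1   # bit idx of the cell's rule number
    return new_state

rng = random.Random(0)
state = [0, 1, 1, 0, 1, 0, 0, 1]
rules = [232] * len(state)                     # majority rule in every cell
print(step_block_async(state, rules, n_blocks=4, rng=rng))
```

With n_blocks = 1 the function reduces to the usual synchronous update, while n_blocks = N updates a single cell per step.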
We introduced three models of asynchrony, previously unstudied in this context,
finding that asynchronous CAs can be evolved to perform the computational tasks in
question. We concluded that asynchrony presents a more difficult case for evolution,
though it is premature to draw any definitive conclusions at this point, since we have
only considered two problems, using relatively small-size grids. We feel that success-
ful asynchronous CAs can be evolved, though this will probably entail larger grids
(coupled with larger blocks).
4.9
Fault Tolerance
Most classical software and hardware systems, especially parallel ones, exhibit a very
low level of fault tolerance, i.e., they are not resilient in the face of errors. Indeed,
where software is concerned, even a single error can often bring an entire program to
a grinding halt. Future computing systems may contain thousands or even millions
of computing elements (e.g., [11]). For such large numbers of components, the issue
of resilience can no longer be ignored, since faults will occur with high
probability.
Networks of automata exhibit a certain degree of fault tolerance. As an example,
one can cite artificial neural networks, many of which show graceful degradation in
performance when presented with noisy input. Moreover, the malfunction of a neuron
or damage to a synaptic weight causes but a small change in the system’s overall
behavior, rather than bringing it to a complete standstill. Cellular computing systems,
such as CAs, may be regarded as a simple and convenient framework within which
to study the effects of such errors. Another motivation for studying this issue derives
directly from the work presented in the previous section concerning the firefly machine.
We wish to learn how robust such a machine is when operating under faulty conditions.
In [38,39] we performed a study of fault-tolerance in our evolved CAs, asking how
they perform in the face of errors. The CAs in question were those that had evolved to
solve either the density or synchronization tasks, with our fault-tolerance investigation
picking up upon termination of the evolutionary process. We focused on one type
of error where a cell updates its state in a non-deterministic manner: at each time
step, the cell’s next state is that specified in the rule table, with probability 1 − pf,
or the complementary one with probability pf ; pf is denoted the fault probability,
representing the probability that a cell will incorrectly update its state. Figure 4.10
depicts the operation of two faulty CAs.
Figure 4.10: One-dimensional synchronization task: Operation of a coevolved, non-
uniform, r = 1 CA, with probability of fault pf > 0. Grid size is N = 149. Initial
configurations were generated at random. (a) pf = 0.0001. (b) pf = 0.001.
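The fault model described above can be sketched as a single synchronous update step in which each cell, independently, takes the complement of its correct next state with probability pf. The sketch assumes an r = 1 CA with periodic boundaries and one rule number per cell; the function name faulty_step is hypothetical.

```python
import random

def faulty_step(state, rules, p_f, rng):
    """One synchronous step in which each cell's update is inverted with probability p_f."""
    n = len(state)
    new_state = []
    for i in range(n):
        idx = 4 * state[(i - 1) % n] + 2 * state[i] + state[(i + 1) % n]
        correct = (rules[i] >> idx) & 1          # next state given by the rule table
        faulty = rng.random() < p_f              # does this cell err at this step?
        new_state.append(1 - correct if faulty else correct)
    return new_state

rng = random.Random(1)
state = [rng.randint(0, 1) for _ in range(149)]
rules = [232] * 149                              # placeholder rules, not an evolved CA
state = faulty_step(state, rules, p_f=0.001, rng=rng)
```

Setting p_f = 0 recovers the fault-free CA, which makes it easy to compare faulty and fault-free runs started from the same initial configuration.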
Our results showed that the evolved systems exhibit graceful degradation in per-
formance, able to tolerate a certain level of faults. Furthermore, we identified a fault-
tolerant range of pf values, where “good” computational behavior is exhibited, and
introduced a number of measures to fine-tune our understanding of the faulty CAs’
operation. We studied the error level as a function of time and space, as well as the
recuperation time needed to recover from faults.
4.9.1
Evolving Uniform Cellular Automata
In reference [28] and subsequent work, M. Mitchell, J. Crutchfield and coworkers
described computational tasks for finite, one-dimensional, two-state CAs. The density
problem and the synchronization problem were studied in particular. For the density
task, the seven-neighbour cellular automaton rule of Gacs, Kurdyumov and Levin
(GKL) classifies a substantial fraction of randomly generated initial configurations
correctly.
The authors carried out a set of experiments in which GAs were used to evolve CA
rules for the above-described computational task. As we saw before, a one-dimensional
CA rule can easily be encoded as a binary string by recording, in a fixed order, the
(binary) next state corresponding to each possible combination of neighborhood
states.
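The encoding can be made concrete with a short sketch. It assumes the common convention in which neighbourhoods are listed from 11...1 down to 00...0, so that the resulting bit string read as a binary number is the rule number (rule 232 for r = 1 is used as the example); the function names are illustrative.

```python
from itertools import product

def rule_string(rule_number, radius=1):
    """Bit string listing the next states for all neighbourhoods, from 11...1 down to 00...0."""
    n_entries = 2 ** (2 * radius + 1)
    return format(rule_number, f"0{n_entries}b")

def next_state(rule_bits, neighbourhood):
    """Look up the next state of a cell from its neighbourhood, given as a tuple of 0/1."""
    index = int("".join(map(str, neighbourhood)), 2)
    # The string is written highest neighbourhood first, so entry `index` is mirrored.
    return int(rule_bits[len(rule_bits) - 1 - index])

bits = rule_string(232)
print(bits)  # 11101000
for nb in product((0, 1), repeat=3):
    print(nb, "->", next_state(bits, nb))
```

For this rule the lookup returns the majority state of the three cells, and the same two functions apply unchanged to seven-neighbour rules by passing radius=3.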
With seven neighbours and two possible states, one has rules of length $2^7 = 128$
and the number of possible rules is huge: $2^{128}$. Starting with a population of random
CA rules, the authors have used as a fitness measure for a rule the number of correct
classifications after a given number of CA steps over 100 initial random configurations
chosen with uniform probability. As usual, the strings (rules) that performed better
were selected to survive and randomly paired to produce new rules by crossover, the
offspring being subject to a small mutation rate. It is to be noted that the evolutionary
algorithm used differs from the cellular programming algorithm in that it uses a popu-
lation of uniform CA genomes, while cellular programming evolves a single, possibly
non-uniform grid.
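A fitness evaluation of this kind might be sketched as follows. Several details are assumptions made for illustration and are not taken from [28]: each initial bit is drawn with a fair coin, a configuration counts as correctly classified only if the CA has reached the all-0s or all-1s state matching the initial majority, and the rule is stored as a list in which entry k is the output for the neighbourhood whose bits, read left to right, form the binary number k.

```python
import random

def run_ca(rule_bits, state, steps, radius=3):
    """Iterate a uniform CA with periodic boundaries for a given number of steps."""
    n = len(state)
    for _ in range(steps):
        nxt = []
        for i in range(n):
            k = 0
            for d in range(-radius, radius + 1):
                k = 2 * k + state[(i + d) % n]   # neighbourhood read as a binary number
            nxt.append(rule_bits[k])
        state = nxt
    return state

def density_fitness(rule_bits, n_cells=149, n_configs=100, steps=300, rng=None):
    """Fraction of random initial configurations that are classified correctly."""
    if rng is None:
        rng = random.Random()
    correct = 0
    for _ in range(n_configs):
        init = [rng.randint(0, 1) for _ in range(n_cells)]
        target = 1 if 2 * sum(init) > n_cells else 0   # n_cells odd, so no ties
        final = run_ca(rule_bits, init, steps)
        correct += all(cell == target for cell in final)
    return correct / n_configs

random_rule = [random.randint(0, 1) for _ in range(128)]
print(density_fitness(random_rule, n_cells=21, n_configs=20, steps=10))
```

The small grid and step count in the usage line only keep the example fast; a GA would call density_fitness once per rule in the population at every generation.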
Computational capabilities and general patterns of rule strategies were found to
emerge automatically from the simulated evolutionary process, although in no case were
the GA-evolved classification strategies superior to the GKL rule. However,
some evolved rules performed remarkably well, close to the GKL rule,
which is a good result for a system of this complexity.
4.10
Concluding Remarks
We described the cellular programming approach used to evolve parallel cellular ma-
chines, and demonstrated its viability by applying it to the solution of several compu-
tational problems. We then studied a number of related issues, including asynchronous
CAs, and faulty CAs.
Evolving cellular machines hold potential both scientifically, as vehicles for studying
phenomena of interest in areas such as complex systems and artificial life, and
practically, with a range of potential future applications arising from the construction
of adaptive systems. The preceding discussion has shed light on the possibility of
computing with such machines, and demonstrated the feasibility of their programming
by means of coevolution.
Acknowledgment.
Part of the material presented here has been contributed by my
colleague M. Sipper.
4.11
Bibliography
[1] S. C. Benjamin and N. F. Johnson. A possible nanometer-scale computing device
based on an adding cellular automaton. Applied Physics Letters, 70(17):2321–
2323, April 1997.
[2] E. R. Berlekamp, J. H. Conway, and R. K. Guy. Winning Ways for your Math-
ematical Plays, volume 2, chapter 25, pages 817–850. Academic Press, New
York, 1982.
[3] H. Bersini and V. Detours. Asynchrony induces stability in cellular automata based
models. In R. A. Brooks and P. Maes, editors, Artificial Life IV, pages 382–387,
Cambridge, Massachusetts, 1994. The MIT Press.
[4] A. Burks, editor. Essays on Cellular Automata. University of Illinois Press,
Urbana, Illinois, 1970.
[5] M. S. Capcarrere, M. Sipper, and M. Tomassini. Two-state, r=1 cellular automa-
ton that classifies density. Physical Review Letters, 77(24):4969–4971, December
1996.
[6] B. Chopard and M. Droz. Cellular Automata Modeling of Physical Systems.
Cambridge University Press, Cambridge, UK, 1997. (to appear).
[7] E. F. Codd. Cellular Automata. Academic Press, New York, 1968.
[8] J. P. Cohoon, S. U. Hegde, W. N. Martin, and D. Richards. Punctuated equilibria:
A parallel genetic algorithm. In J. J. Grefenstette, editor, Proceedings of the
Second International Conference on Genetic Algorithms, page 148. Lawrence
Erlbaum Associates, 1987.
[9] K. Culik II, L. P. Hurd, and S. Yu. Computation theoretic aspects of cellular
automata. Physica D, 45:357–378, 1990.
[10] R. Das, J. P. Crutchfield, M. Mitchell, and J. E. Hanson. Evolving globally syn-
chronized cellular automata. In L. J. Eshelman, editor, Proceedings of the Sixth
International Conference on Genetic Algorithms, pages 336–343, San Francisco,
CA, 1995. Morgan Kaufmann.
[11] K. E. Drexler. Nanosystems: Molecular Machinery, Manufacturing and Compu-
tation. John Wiley, New York, 1992.
[12] G. B. Ermentrout and L. Edelstein-Keshet. Cellular automata approaches to bio-
logical modeling. Journal of Theoretical Biology, 160:97–133, 1993.
[13] S. Forrest, editor. Emergent Computation: Self-organizing, Collective, and Co-
operative Phenomena in Natural and Artificial Computing Networks. The MIT
Press, Cambridge, MA, 1991.
[14] E. Fredkin and T. Toffoli. Conservative logic. International Journal of Theoreti-
cal Physics, 21:219–253, 1982.
[15] U. Frisch, B. Hasslacher, and Y. Pomeau. Lattice-gas automata for the Navier-
Stokes equation. Physical Review Letters, 56:1505–1508, 1986.
[16] M. Garzon. Models of Massive Parallelism: Analysis of Cellular Automata and
Neural Networks. Springer-Verlag, Berlin, 1995.
[17] M. Goeke, M. Sipper, D. Mange, A. Stauffer, E. Sanchez, and M. Tomassini.
Online autonomous evolware. In T. Higuchi, M. Iwata, and W. Liu, editors,
Proceedings of The First International Conference on Evolvable Systems: From
Biology to Hardware (ICES96), volume 1259 of Lecture Notes in Computer Sci-
ence, pages 96–106. Springer-Verlag, Heidelberg, 1997.
[18] H. Gutowitz. A hierarchical classification of cellular automata. In H. Gutowitz,
editor, Cellular Automata: Theory and Experiment, Proceedings of a Workshop
Sponsored by the Center for Nonlinear Studies, Los Alamos National Laboratory,
Los Alamos, volume 45, Nos. 1-3 of Physica D, pages 136–156. 1990.
[19] J. E. Hopcroft and J. D. Ullman. Introduction to Automata Theory Languages
and Computation. Addison-Wesley, Redwood City, CA, 1979.
[20] P. D. Hortensius, R. D. McLeod, and H. C. Card. Parallel random number genera-
tion for VLSI systems using cellular automata. IEEE Transactions on Computers,
38(10):1466–1473, October 1989.
[21] T. E. Ingerson and R. L. Buvel. Structure in asynchronous cellular automata.
Physica D, 10:59–68, 1984.
[22] J. R. Koza. Genetic Programming. The MIT Press, Cambridge, Massachusetts,
1992.
[23] M. Land and R. K. Belew. No perfect two-state cellular automata for density
classification exists. Physical Review Letters, 74(25):5148–5150, June 1995.
[24] C. G. Langton. Life at the edge of chaos. In C. G. Langton, C. Taylor, J. D.
Farmer, and S. Rasmussen, editors, Artificial Life II, volume X of SFI Studies
in the Sciences of Complexity, pages 41–91, Redwood City, CA, 1992. Addison-
Wesley.
[25] B. Manderick and P. Spiessens. Fine-grained parallel genetic algorithms. In J. D.
Schaffer, editor, Proceedings of the Third International Conference on Genetic
Algorithms, page 428. Morgan Kaufmann, 1989.
[26] N. Margolus. Physics-like models of computation. Physica D, 10:81–95, 1984.
[27] M. Mitchell. An Introduction to Genetic Algorithms. MIT Press, Cambridge,
MA, 1996.
[28] M. Mitchell, J. P. Crutchfield, and P. T. Hraber. Evolving cellular automata to
perform computations: Mechanisms and impediments. Physica D, 75:361–391,
1994.
[29] M. A. Nowak, S. Bonhoeffer, and R. M. May. Spatial games and the maintenance
of cooperation. Proceedings of the National Academy of Sciences USA, 91:4877–
4881, May 1994.
[30] S. K. Park and K. W. Miller. Random number generators: Good ones are hard to
find. Communications of the ACM, 31(10):1192–1201, October 1988.
[31] E. Sanchez and M. Tomassini, editors. Towards Evolvable Hardware, volume
1062 of Lecture Notes in Computer Science. Springer-Verlag, Heidelberg, 1996.
[32] M. Sipper. Co-evolving non-uniform cellular automata to perform computations.
Physica D, 92:193–208, 1996.
[33] M. Sipper. Designing evolware by cellular programming. In T. Higuchi,
M. Iwata, and W. Liu, editors, Proceedings of The First International Conference
on Evolvable Systems: From Biology to Hardware (ICES96), volume 1259 of
Lecture Notes in Computer Science, pages 81–95. Springer-Verlag, Heidelberg,
1997.
[34] M. Sipper. Evolution of Parallel Cellular Machines: The Cellular Programming
Approach. Springer-Verlag, Heidelberg, 1997.
[35] M. Sipper, E. Sanchez, D. Mange, M. Tomassini, A. Pérez-Uribe, and A. Stauf-
fer. A phylogenetic, ontogenetic, and epigenetic view of bio-inspired hardware
systems. IEEE Transactions on Evolutionary Computation, 1(1):83–97, April
1997.
[36] M. Sipper and M. Tomassini. Co-evolving parallel random number generators.
In H.-M. Voigt, W. Ebeling, I. Rechenberg, and H.-P. Schwefel, editors, Paral-
lel Problem Solving from Nature - PPSN IV, volume 1141 of Lecture Notes in
Computer Science, pages 950–959. Springer-Verlag, Heidelberg, 1996.
[37] M. Sipper and M. Tomassini. Generating parallel random number generators by
cellular programming. International Journal of Modern Physics C, 7(2):181–
190, 1996.
[38] M. Sipper, M. Tomassini, and O. Beuret. Studying probabilistic faults in evolved
non-uniform cellular automata. International Journal of Modern Physics C,
7(6):923–939, 1996.
[39] M. Sipper, M. Tomassini, and M. S. Capcarrere. Designing cellular automata us-
ing a parallel evolutionary algorithm. Nuclear Instruments & Methods in Physics
Research, Section A, 389(1-2):278–283, 1997.
[40] M. Sipper, M. Tomassini, and M. S. Capcarrere. Evolving asynchronous and
scalable non-uniform cellular automata. In Proceedings of International Con-
ference on Artificial Neural Networks and Genetic Algorithms (ICANNGA97).
Springer-Verlag KG, Vienna, 1997. (to appear).
[41] T. Starkweather, D. Whitley, and K. Mathias. Optimization using distributed
genetic algorithms. In H.-P. Schwefel and R. Männer, editors, Parallel Problem
Solving from Nature, volume 496 of Lecture Notes in Computer Science, page
176, Heidelberg, 1991. Springer-Verlag.
[42] R. Tanese. Parallel genetic algorithms for a hypercube. In J. J. Grefenstette, edi-
tor, Proceedings of the Second International Conference on Genetic Algorithms,
page 177. Lawrence Erlbaum Associates, 1987.
[43] T. Toffoli. Cellular automata mechanics. Technical Report 208, Comp. Comm.
Sci. Dept., The University of Michigan, 1977.
[44] T. Toffoli. Reversible computing. In J. W. De Bakker and J. Van Leeuwen, edi-
tors, Automata, Languages and Programming, pages 632–644. Springer-Verlag,
1980.
[45] T. Toffoli. Cellular automata as an alternative to (rather than an approximation
of) differential equations in modeling physics. Physica D, 10:117–127, 1984.
[46] T. Toffoli and N. Margolus. Cellular Automata Machines. The MIT Press, Cam-
bridge, Massachusetts, 1987.
[47] M. Tomassini. The parallel genetic cellular automata: Application to global func-
tion optimization. In R. F. Albrecht, C. R. Reeves, and N. C. Steele, editors,
Proceedings of the International Conference on Artificial Neural Networks and
Genetic Algorithms, pages 385–391. Springer-Verlag, 1993.
[48] M. Tomassini, M. Sipper, and M. Perrenoud. On the generation of high-quality
random numbers by two-dimensional cellular automata. IEEE Transactions on
Computers, 49(10):1146–1151, October 2000.
[49] M. Tomassini, M. Sipper, M. Zolla, and M. Perrenoud. Generating high-quality
random numbers in parallel by cellular automata. Future Generation Computer
Systems, 16:291–305, 1999.
[50] G. Vichniac. Simulating physics with cellular automata. Physica D, 10:96–115,
1984.
[51] G. Y. Vichniac, P. Tamayo, and H. Hartman. Annealed and quenched inhomoge-
neous cellular automata. Journal of Statistical Physics, 45:875–883, 1986.
[52] J. von Neumann. Theory of Self-Reproducing Automata. University of Illinois
Press, Illinois, 1966. Edited and completed by A. W. Burks.
[53] S. Wolfram. Statistical mechanics of cellular automata. Reviews of Modern
Physics, 55(3):601–644, July 1983.
[54] S. Wolfram. Random sequence generation by cellular automata. Advances in
Applied Mathematics, 7:123–169, June 1986.
[55] S. Wolfram. Cellular Automata and Complexity. Addison-Wesley, Reading, MA,
1994.
Chapter 5
Evolutionary Algorithms
5.1
Introduction
Evolutionary algorithms (EAs) are a broad class of stochastic optimization algorithms,
inspired by biology and in particular by those biological processes that allow popula-
tions of organisms to adapt to their surrounding environment: genetic inheritance and
survival of the fittest. These concepts were introduced in the 19th century by Charles
Darwin [8] and are still today widely acknowledged as valid, even though comple-
mented with further details [10].
The first proposals in that direction date back to the mid-1960s, when John
Holland, of the University of Michigan, introduced genetic algorithms (GAs) [18],
Lawrence Fogel and his colleagues, of the University of California in San Diego,
started their experiments on evolutionary programming [14], and Ingo Rechenberg, of
the Technical University of Berlin, independently began to work on evolution strategies
[33]. Their pioneering work eventually gave rise to a broad class of optimization
methods particularly well suited for hard problems where little is known about the
underlying search space. The most recent development of this research thread is so-called
genetic programming, introduced by John Koza, of Stanford University [22], at the
beginning of the 1990s.
Recent texts of reference and synthesis in the field of evolutionary algorithms are
[25,3].
An evolutionary algorithm maintains a population of candidate solutions for the
problem at hand, and makes it evolve by iteratively applying a (usually quite small) set
of stochastic operators, known as mutation, recombination, and selection.
Mutation randomly perturbs a candidate solution; recombination decomposes two
distinct solutions and then randomly mixes their parts to form novel solutions; and
selection replicates the most successful solutions found in a population at a rate pro-
portional to their relative quality.
The initial population may be either a random sample of the solution space or
may be seeded with solutions found by simple local search procedures, if these are
available.
The resulting process tends to find, given enough time, globally optimal solutions
to the problem much in the same way as in nature populations of organisms tend to
adapt to their surrounding environment.
5.2
Genetic Algorithms
We start by providing a qualitative description of binary-coded evolutionary algo-
rithms, using two simple examples to present the main elements of these techniques.
5.2.1
The Metaphor
Essentially, evolutionary algorithms make use of a metaphor whereby an optimization
problem takes the place of the environment; feasible solutions are viewed as individu-
als living in that environment and an individual’s degree of adaptation to its surround-
ing environment is the counterpart of the objective function evaluated on a feasible
solution. In the same way, a set of feasible solutions takes the place of a population of
organisms. This optimization setting of evolutionary algorithms is useful in applica-
tions, but alternative views related to decision theory and machine learning have been
proposed [18]. A further interpretation comes from Artificial Life quarters, where
evolutionary algorithms are seen as artificial counterparts of natural evolution [24].
In evolutionary algorithms selection operates on computer data structures and, in
time, their functionalities evolve in a way substantially analogous to how populations
of living organisms evolve in a natural setting.
Although the computer model introduces sharp simplifications with respect to the
real biological mechanisms, evolutionary algorithms have proved capable of making
surprisingly complex and interesting structures emerge. Each structure, or individual,
may be viewed as a representation, according to an appropriate encoding, of a particu-
lar solution to a problem, of a strategy to play a game, of a picture, or even of a simple
computer program.
5.2.2
Representation
In genetic algorithms, individuals are just strings of binary digits. As computer
memory is made up of an array of bits, anything that can be stored in a computer can also
be encoded by a bit string of sufficient length. In a sense, representing solutions to
a problem as bit strings is the most general encoding that can be thought of.
5.2.3
The Evolutionary Cycle
An evolutionary algorithm starts with a population of randomly generated individuals,
although it is also possible to use a previously saved population, or a population of
individuals encoding solutions provided by a human expert or by another heuristic
algorithm. In the case of genetic algorithms the initial population will be made up of
random bit strings.
Once an initial population has been created, an evolutionary algorithm enters a
loop. At the end of each iteration a new population will have been created by applying
a certain number of stochastic operators to the previous population. One such iteration
is referred to as a generation.
The first operator to be applied is selection. Its aim is to simulate the Darwinian
law of “survival of the fittest”. In genetic algorithms, this law is enforced by so-called
fitness proportionate selection: in order to create a new intermediate population of
n “parents”, n independent extractions of an individual from the old population are
performed, where the probability of each individual being extracted is proportional
to its fitness. Therefore, above-average individuals are expected to have more
copies in the new population, while below-average individuals risk extinction.
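Fitness proportionate selection can be sketched with a cumulative sum of fitness values, which plays the role of a roulette wheel. The function name roulette_select and the choice of passing the random generator explicitly are ours; fitness values are assumed to be non-negative with a positive total.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def roulette_select(population, fitnesses, rng):
    """Draw len(population) parents, each with probability fitness / total fitness."""
    cumulative = list(accumulate(fitnesses))     # right edge of each sector
    total = cumulative[-1]
    parents = []
    for _ in population:
        ball = rng.random() * total              # where the ball stops on the wheel
        index = min(bisect_right(cumulative, ball), len(population) - 1)
        parents.append(population[index])
    return parents

rng = random.Random(3)
print(roulette_select(["A", "B", "C"], [1, 2, 7], rng))
```

Each of the n extractions is independent, so a fit individual can be drawn several times and an unfit one may not be drawn at all.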
Once the population of parents, that is of individuals that have been selected for re-
production, has been extracted, the individuals for the next generation will be produced
through the application of a number of reproduction operators, which can involve just
one parent (thus simulating asexual reproduction), in which case we speak of muta-
tion, or more parents (thus simulating sexual reproduction), in which case we speak of
recombination. In genetic algorithms two reproduction operators are used: crossover
and mutation.
To apply crossover, couples are formed with all parent individuals; then, with a
certain probability, called the crossover rate pcross, each couple actually undergoes
crossover: the two bit strings are cut at the same random position and the segments
following the cut are swapped between the two individuals, thus yielding two novel
individuals, each containing characters from both parents.
After crossover, all individuals undergo mutation. The purpose of mutation is to
simulate the effect of transcription errors that can happen with a very low probability
when a chromosome is duplicated. This is accomplished by flipping each bit in
every individual with a very small probability pmut, called the mutation rate. In other
words, each “0” has a small probability of being turned into a “1” and vice versa.
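The two reproduction operators can be sketched as follows for individuals stored as strings of "0" and "1". The function names are illustrative, and the sketch assumes an even number of parents that are paired in the order in which they appear.

```python
import random

def one_point_crossover(a, b, rng):
    """Cut both strings at the same random position and swap the segments after the cut."""
    point = rng.randint(1, len(a) - 1)           # cut strictly inside the string
    return a[:point] + b[point:], b[:point] + a[point:]

def crossover_population(parents, p_cross, rng):
    """Pair parents in order; each couple is recombined with probability p_cross."""
    offspring = []
    for a, b in zip(parents[0::2], parents[1::2]):
        if rng.random() < p_cross:
            a, b = one_point_crossover(a, b, rng)
        offspring += [a, b]
    return offspring

def mutate(individual, p_mut, rng):
    """Flip each bit independently with probability p_mut."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < p_mut else bit
        for bit in individual
    )

rng = random.Random(7)
children = crossover_population(["0000000000", "1111111111"], p_cross=0.6, rng=rng)
print([mutate(child, p_mut=0.01, rng=rng) for child in children])
```

Crossover only rearranges existing bits, so the number of ones in a couple is conserved; mutation is the only operator that can introduce a bit value absent from the whole population at a given position.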
In principle, the above-described loop is infinite, but it can be stopped when a given
termination condition specified by the user is met. Examples of termination conditions
are:
• a pre-determined number of generations or time has elapsed;
• a satisfactory solution has been found;
• no improvement in solution quality has taken place for a pre-determined number
of generations.
Each of these termination conditions can be appropriate, depending on the context
in which the evolutionary algorithm is used.
The evolutionary cycle can be summarized by the following pseudo-code:
generation = 0
Seed Population
while not termination condition do
    generation = generation + 1
    Calculate Fitness
    Selection
    Crossover(pcross)
    Mutation(pmut)
end while
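The pseudo-code translates almost line by line into a runnable loop. In the sketch below the operators are passed in as functions, so the loop itself stays generic; the termination condition is simply a fixed number of generations, and all names, the parameter values and the compact MAXONE operators in the usage part are illustrative assumptions.

```python
import random

def evolve(population, fitness, select, recombine, mutate, max_generations):
    """Generational loop mirroring the pseudo-code; operators are supplied by the caller."""
    generation = 0
    while generation < max_generations:                    # termination condition
        generation += 1
        scores = [fitness(ind) for ind in population]      # Calculate Fitness
        parents = select(population, scores)               # Selection
        offspring = recombine(parents)                     # Crossover(pcross)
        population = [mutate(ind) for ind in offspring]    # Mutation(pmut)
    return population

# Usage on bit strings, with fitness equal to the number of ones.
rng = random.Random(42)
L, N, P_CROSS, P_MUT = 10, 6, 0.6, 0.01

def select(pop, scores):
    # The tiny offset only guards against a population whose fitness is all zero.
    return rng.choices(pop, weights=[s + 1e-9 for s in scores], k=len(pop))

def recombine(parents):
    out = []
    for a, b in zip(parents[0::2], parents[1::2]):
        if rng.random() < P_CROSS:
            cut = rng.randint(1, L - 1)
            a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
        out += [a, b]
    return out

def mutate(ind):
    return "".join(("1" if c == "0" else "0") if rng.random() < P_MUT else c for c in ind)

seed = ["".join(rng.choice("01") for _ in range(L)) for _ in range(N)]
final = evolve(seed, lambda s: s.count("1"), select, recombine, mutate, max_generations=50)
print(final)
```

Other termination conditions from the list above can be obtained by replacing the test in the while statement, for instance by stopping as soon as the best score reaches a target value.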
5.2.4
A First Example
An example will illustrate the workings of genetic algorithms and show how a few sim-
ple concepts borrowed from natural evolution can give rise to a powerful optimization
technique. The following example is based on the MAXONE problem.
Suppose that we want to maximize the number of ones in a string of l binary digits.
At first sight this might look like a trivial problem, just because we know the solution
in advance: a string of l ones. However, if we imagine being faced with l yes/no
answers to an equal number of difficult questions, the problem of maximizing the
number of correct answers becomes less straightforward. But then, we can transform
that problem into our MAXONE problem simply by assuming that, for each question,
the right answer, be it yes or no, is encoded by 1 and the wrong one is encoded by 0.
The fitness of a candidate solution to the MAXONE problem is the number of ones
in its genetic code, the string of l binary digits.
We start with a population of n random strings. Suppose that l = 10 and n = 6:
we toss a fair coin 60 times and we get the following initial population:
$$
\begin{array}{ll}
s_1 = 1111010101 & f(s_1) = 7 \\
s_2 = 0111000101 & f(s_2) = 5 \\
s_3 = 1110110101 & f(s_3) = 7 \\
s_4 = 0100010011 & f(s_4) = 4 \\
s_5 = 1110111101 & f(s_5) = 8 \\
s_6 = 0100110000 & f(s_6) = 3
\end{array}
\tag{5.1}
$$
where f is the fitness function, which associates with every binary string s its fitness
f (s).
Next we apply fitness proportionate selection with the roulette wheel method: we
sum up the fitness of individuals in the population, getting 34. We equate the full
circumference of a roulette wheel to the total fitness, 34, and we divide it into sec-
tors proportional to each individual’s fitness, then we simulate throwing a ball into it.
Therefore, when we spin the wheel, the ball will have a $7/34 = 0.2059$ probability of
stopping in string s1’s sector, and only a $3/34 = 0.0882$ probability of stopping in string
s6’s sector.
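The fitness values of population (5.1) and the corresponding roulette-wheel probabilities can be checked with a few lines; the variable and function names are illustrative.

```python
population = ["1111010101", "0111000101", "1110110101",
              "0100010011", "1110111101", "0100110000"]

def maxone(s):
    """MAXONE fitness: the number of ones in the string."""
    return s.count("1")

fitnesses = [maxone(s) for s in population]
total = sum(fitnesses)
probabilities = [f / total for f in fitnesses]

for i, (f, p) in enumerate(zip(fitnesses, probabilities), start=1):
    print(f"s{i}: fitness {f}, selection probability {p:.4f}")
print("total fitness:", total)
```

The six probabilities sum to one, and the expected number of copies of each string among the six parents is six times its probability, e.g. about 1.41 for s5 and about 0.53 for s6.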
We repeat the extraction using this method six times, that is as many times as the
individuals we need to complete our parent population. Suppose that, after performing
selection, we get the following population:
$$
\begin{array}{ll}
s_1 = 1111010101 & (s_1) \\
s_2 = 1110110101 & (s_3) \\
s_3 = 1110111101 & (s_5) \\
s_4 = 0111000101 & (s_2) \\
s_5 = 0100010011 & (s_4) \\
s_6 = 1110111101 & (s_5)
\end{array}
\tag{5.2}
$$
We note that string s5 was extracted twice, while string s6 was never extracted, thus
being replaced by a copy of string s5.
Next we mate strings for crossover. Since the strings in the parent population have
been extracted in a random order, we can just mate s1 with s2, s3 with s4 and so on. For
each couple thus formed, we decide according to crossover probability (for instance
0.6) whether to actually perform crossover or not. Suppose that we decide to actually
perform crossover only for couples (s1, s2) and (s5, s6). For each couple, we randomly
extract a crossover point, for instance 2 for the first couple and 5 for the second couple.
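Therefore, for couple (s1, s2), we will have
$$
\begin{array}{l}
s_1 = 11 \cdot 11010101 \\
s_2 = 11 \cdot 10110101
\end{array}
\tag{5.3}
$$
before crossover, and
$$
\begin{array}{l}
s_1 = 11 \cdot 10110101 \\
s_2 = 11 \cdot 11010101
\end{array}
\tag{5.4}
$$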
after crossover. We notice that in this particular case no new genetic material is pro-
duced, since the two offspring are equal to their parents.
For couple (s5, s6), we will have
$$
\begin{array}{l}
s_5 = 01000 \cdot 10011 \\
s_6 = 11101 \cdot 11101
\end{array}
\tag{5.5}
$$
before crossover, and
$$
\begin{array}{l}
s_5 = 01000 \cdot 11101 \\
s_6 = 11101 \cdot 10011
\end{array}
\tag{5.6}
$$
after crossover: this time, individuals s5 and s6 are novel, each retaining characters
from both parents.
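The two crossovers of the example can be replayed directly on the parent population (5.2), using the crossover points 2 and 5 quoted in the text; the helper name cross_at is illustrative.

```python
def cross_at(a, b, point):
    """One-point crossover at a given position: swap everything after the cut."""
    return a[:point] + b[point:], b[:point] + a[point:]

parents = ["1111010101", "1110110101", "1110111101",
           "0111000101", "0100010011", "1110111101"]

s1, s2 = cross_at(parents[0], parents[1], 2)
s5, s6 = cross_at(parents[4], parents[5], 5)
print(s1, s2)   # 1110110101 1111010101
print(s5, s6)   # 0100011101 1110110011
```

For the first couple the two parents agree on their first two bits, which is why the cut at position 2 merely exchanges the two strings.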
The final step to produce the population for the next generation is to apply random
mutation: for each bit that we are to copy to the new population we allow a small
probability of error (for instance 0.1). Since we have 60 bits overall to transcribe, we
expect that on average 6 of them will end up being flipped, for instance according to
the following pattern (bits that will be flipped are marked with a bar on top):
$$
\begin{array}{l}
s_1 = 11101\bar{1}0101 \\
s_2 = 1111\bar{0}1010\bar{1} \\
s_3 = 11101\bar{1}11\bar{0}1 \\
s_4 = 0111000101 \\
s_5 = 0100011101 \\
s_6 = 11101100\bar{1}1
\end{array}
\tag{5.7}
$$
Mutations do not have to be evenly distributed over the individuals. In this example,
individuals s2 and s3 were particularly “unlucky”, while individuals s4 and s5 passed
this stage untouched. Looking at the example more closely, we observe
that of the six transcription errors, four turn a “1” into a “0”, thus worsening the string
in which they occur. This should come as no surprise: as a population adapts to
its environment, “good” genes tend to be more frequent than “bad” genes; therefore,
transcription errors, which strike blindly at random, are much more likely to disrupt good
genes and only occasionally introduce a fortuitous improvement.
After applying mutation, we end up with the following new population:
$$
\begin{array}{ll}
s_1 = 1110100101 & f(s_1) = 6 \\
s_2 = 1111110100 & f(s_2) = 7 \\
s_3 = 1110101111 & f(s_3) = 8 \\
s_4 = 0111000101 & f(s_4) = 5 \\
s_5 = 0100011101 & f(s_5) = 5 \\
s_6 = 1110110001 & f(s_6) = 6
\end{array}
\tag{5.8}
$$
In one generation, the total fitness of the population rose from 34 to 37, an
improvement of almost 9%. At this point, we go through the same process all over again,
getting populations for generation 2, 3, . . ., until a stopping criterion is met.
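The mutation step and the resulting totals can be replayed as a check. The six flip positions below are read off by comparing the population after crossover with population (5.8); positions are counted from 1, starting at the left end of each string.

```python
after_crossover = ["1110110101", "1111010101", "1110111101",
                   "0111000101", "0100011101", "1110110011"]

# (string number, bit position) of the six transcription errors
flips = [(1, 6), (2, 5), (2, 10), (3, 6), (3, 9), (6, 9)]

new_population = [list(s) for s in after_crossover]
for string, position in flips:
    bit = new_population[string - 1][position - 1]
    new_population[string - 1][position - 1] = "0" if bit == "1" else "1"
new_population = ["".join(s) for s in new_population]

old_total = 34
new_total = sum(s.count("1") for s in new_population)
print(new_population)
print(new_total, f"{(new_total - old_total) / old_total:.1%}")   # 37 8.8%
```

Four of the six flips turn a one into a zero, yet the total still rises, because selection had already replaced the weakest string by a copy of the fittest one.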
5.2.5
A Second Example
In this section we present a second example of the operation of the genetic algorithm,
this time involving real function optimization. This should bring things closer to the
actual use of GAs, although the problem is purely of illustrative value and can in fact
be solved by hand.
The unconstrained function minimization problem can be cast as follows. Given
a function f(x) and a domain $D \subseteq \mathbb{R}^n$, find $x^*$ such that:
$$ f(x^*) = \min \{ f(x) \mid x \in D \} $$
where $x = (x_1, x_2, \ldots, x_n)^T$.
Let us consider the following function (see Figure 5.1):
$$ f(x) = -\left| x \sin\left( \sqrt{|x|} \right) \right| + C $$
Figure 5.1: Graph of f (x), x ∈ [−512, 512].
The problem is to find x∗ in the interval [−512, 512] which minimizes f. Since
f (x) is symmetric, studying it in the positive portion of the x axis will suffice.
Let us examine in turn the components of the genetic algorithm for solving the
given problem.
The initial population will be formed by 50 randomly chosen trial points in the
interval [0, 512]. Therefore, one individual is a value of the real variable x.
A binary string will be used to represent the values of x. However, this time, unlike
in the previous example, a decoding process from bit strings to real values is needed.
The length of the string is a function of the required precision: the
longer the string, the better the precision. For example, if each point x is represented
by 10 bits, then 1024 different values are available for covering the interval [0, 512],
which gives a granularity of about 0.5 for x; i.e., the genetic algorithm will
be able to sample points no less than about 0.5 apart from each other.
The strings 0000000000 and 1111111111 will represent respectively the lower and
upper bounds of the search interval. Any other 10-bit string will be mapped to an
interior point. In order to map the binary string to a real number, the string is first
converted to a decimal number and then to the corresponding real x. Note that our use
of 10-bit strings is only for illustrative purposes; in real applications, finer granularities
and therefore longer strings are often needed.
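The decoding step just described can be sketched in a few lines of Python. The function name `decode` and its default bounds are illustrative choices, not part of the original text; the mapping simply sends the all-zeros string to the lower bound and the all-ones string to the upper bound.

```python
def decode(bits, lower=0.0, upper=512.0):
    """Map a bit string to a real number in [lower, upper]."""
    n = len(bits)
    as_int = int(bits, 2)  # binary string -> non-negative integer
    # Scale so that 00...0 maps to lower and 11...1 maps to upper.
    return lower + as_int * (upper - lower) / (2 ** n - 1)


print(decode("0000000000"))  # lower bound of the interval
print(decode("1111111111"))  # upper bound of the interval
print(decode("0000000001"))  # smallest step, i.e. the granularity
```

With longer strings only the length of `bits` changes; the scaling factor adapts automatically.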
The fitness of each sample point x is simply the value of the function at that point.
Since we want to minimize f , the lower the value of f (x) the fitter is x.
As in the MAXONE example, we apply fitness proportionate selection with the
roulette wheel method. We saw that with this selection method fitter members are
more likely to be reproduced; furthermore, strings can be selected more than once. As
before, once the new population has been produced, strings are paired at random and
recombined through crossover and the offspring replace their parents in the population
of the next generation. After crossover, mutation is applied to population members.
To measure the quality of our solutions we record both the average population fitness
and the fitness of the best individual at a given generation. As an example consider the
following table, showing the results of a particular evolutionary run.
Generation    Best       Average
 0            1.0430     268.70
 3            1.0430      78.61
 9            0.00179     32.71
18            0.00179     14.32
26            0.00179      5.83
36            0.00179      2.72
50            0.00179      1.77
69            0.00179      0.15
As generation 0 consists of randomly generated individuals, we find, as expected,
that both the average and the best fitness are poor, i.e., the corresponding function
values are high. We observe that fairly rapid improvement ensues, with the minimum
already found at generation 9. However, the average population fitness continues to
improve until the population becomes homogeneous and fitness values level off. This
behavior is in fact characteristic of evolutionary algorithms in general. Note that in
our simple example the probability of getting “stuck” in a local minimum is practically
zero. In harder problems, a compromise must be reached between exploitation of “good”
regions of the search space (i.e., local improvement) and further exploration of this
space, in order to find possibly better extrema.
One final remark is in order; genetic algorithms are stochastic, thus their perfor-
mance varies between different runs (unless the same random number generator with
the same seed is used). Thus, the average performance taken over several runs is a
more useful indicator of their behavior than a single run.
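The whole procedure of this section can be put together in a short Python sketch. Several details are assumptions made only for illustration and are not given in the text: the function is taken to be f(x) = −|x sin(√|x|)| + C with C = 419, the roulette wheel uses the weight "worst value in the population minus f(x)" so that lower function values get larger slices, and the crossover and mutation rates are 0.6 and 0.01. The seed argument makes a run reproducible, as noted in the final remark.

```python
import math
import random

C = 419.0       # assumed offset, chosen so that the minimum of f is close to 0
N_BITS = 10     # string length used in the text
POP_SIZE = 50   # population size used in the text


def f(x):
    """Assumed form of the function to be minimized."""
    return -abs(x * math.sin(math.sqrt(abs(x)))) + C


def decode(bits):
    """Map a bit string to a point of [0, 512]."""
    return int(bits, 2) * 512.0 / (2 ** N_BITS - 1)


def mutate(bits, p_mut, rng):
    """Flip each bit independently with probability p_mut."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[b] if rng.random() < p_mut else b for b in bits)


def run_ga(seed, generations=70, p_cross=0.6, p_mut=0.01):
    """Return a list of (best, average) function values, one per generation."""
    rng = random.Random(seed)
    pop = ["".join(rng.choice("01") for _ in range(N_BITS))
           for _ in range(POP_SIZE)]
    history = []
    for _ in range(generations + 1):
        values = [f(decode(s)) for s in pop]
        history.append((min(values), sum(values) / len(values)))
        # Roulette wheel: smaller f(x) must get a larger weight (assumed scaling).
        worst = max(values)
        weights = [worst - v + 1e-9 for v in values]
        parents = rng.choices(pop, weights=weights, k=POP_SIZE)
        # One-point crossover on consecutive pairs (the sample order is random).
        children = []
        for a, b in zip(parents[0::2], parents[1::2]):
            if rng.random() < p_cross:
                cut = rng.randint(1, N_BITS - 1)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            children += [a, b]
        pop = [mutate(s, p_mut, rng) for s in children]
    return history


for generation, (best, average) in enumerate(run_ga(seed=1)):
    if generation % 10 == 0:
        print(generation, round(best, 5), round(average, 2))
```

Calling `run_ga` with several different seeds and averaging the histories gives the kind of multi-run indicator recommended above; the numbers will not match the table in the text, which comes from a different run and implementation.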
The problem presented above is an easy one for GAs, as well as for any other opti-
mization method. GAs have been shown to be effective in solving hard mathematical
optimization problems, involving multimodal functions of several variables [28].
5.3
Theoretical Background of Genetic Algorithms
This section gives an account of the theoretical background of simple evolutionary
algorithms with binary representation, the genetic algorithms.
5.3.1
Notation
The most general formulation of an optimization problem [29] is as follows:
\[ \min c(s) \tag{5.9} \]
subject to the constraint
\[ s \in S. \tag{5.10} \]
S is called the feasible set of the problem domain, c is a cost function and s ∈ S
is called a feasible solution to the problem at hand.
Let Γ be the space of genotypes. A genotype is an arbitrary data structure that
encodes a certain solution to the problem. The space of genotypes is just the collection
of all such data structures. We shall assume in the sequel that a genotype can only
encode a feasible solution; this hypothesis can be relaxed in various ways that will be
dealt with in Section 5.7.3.
Which solution a given genotype encodes is established by a function M : Γ → Φ,
where Φ ⊆ S is the space of phenotypes, by analogy with natural genetics, in which
the phenotype is the set of characters shown by an actual realization of a genotype in a
living organism. Defining the function M amounts to choosing a particular encoding
for solutions. Of course, different genotypes may encode the same solution, or
phenotype, but the reverse does not hold: each genotype encodes exactly one phenotype.
The concept that relates to the cost function c(·) is that of fitness. The fitness
function is a function f : Γ → [0, +∞), that depends on the cost of a solution through
some appropriate transformation function F ,
\[ f(\gamma) = F[c(M(\gamma))], \tag{5.11} \]
such that a larger fitness value corresponds to a better (lower cost) solution: for all
γ, κ ∈ Γ,
\[ f(\gamma) > f(\kappa) \quad \text{if and only if} \quad c(M(\gamma)) < c(M(\kappa)). \tag{5.12} \]
A population is a collection of individuals, each having its genotype and, as a
consequence, the corresponding phenotype. Let us denote by Γ∗ the space of all pop-
ulations consisting of any number of individuals. More than one individual in a pop-
ulation can share the same genotype and what distinguishes two populations of equal
size is only the number of individuals in which each genotype occurs. Therefore, a
convenient way to think of a population is as a multi-set, or bag, of genotypes, that is,
the equivalence class of n-tuples identical up to a permutation of their elements.
Every population x ∈ Γ∗ is in a 1-to-1 correspondence with a share function
qx : Γ → [0,1] giving the fraction qx(γ) of individuals in x that have genotype γ.
The size of population x can be conveniently denoted as |x|.
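The view of a population as a multi-set described by its share function can be illustrated with a small Python sketch. The helper name `share_function` and the sample genotypes are hypothetical.

```python
from collections import Counter


def share_function(population):
    """Return a dict mapping each genotype to the fraction of individuals carrying it."""
    counts = Counter(population)
    size = len(population)
    return {genotype: n / size for genotype, n in counts.items()}


x = ["101", "101", "010", "111"]
y = ["111", "010", "101", "101"]  # same individuals, listed in a different order

q_x = share_function(x)
print(q_x)
print(share_function(x) == share_function(y))  # the order of individuals is irrelevant
```

Two lists that differ only by a permutation yield the same share function, which is what makes the multi-set description convenient.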
5.3.2
Main Theoretical Results
The Schema Theorem
An important concept for the analysis of genetic algorithms
is that of schema.
A schema is a subset S ⊆ Γ represented by a template string consisting of l symbols
in {0, 1, *}, where “*” plays the role of “wild card”: a schema thus contains all
strings that match its template string in all positions that are not marked by the “*”
symbol.
There are in Γ exactly $3^l$ distinct schemata. Each schema induces a bipartition of
Φ and M(S) ⊆ Φ is called the hyperplane defined by S.
The order o(S) of schema S is defined as the number of fixed positions (0 or 1) in
the template string representing it. The cardinality of schema S is related to its order
by the relationship $|S| = 2^{l - o(S)}$. On the other hand, a string of length l is matched
by $2^l$ schemata.
The absolute fitness of schema S is
\[ f(S) \equiv \frac{1}{|S|} \sum_{\gamma \in S} f(\gamma). \tag{5.13} \]
This quantity represents the expected fitness of an individual randomly extracted with
uniform probability from S.
The relative fitness of schema S with respect to population x ∈ Γ∗ is
\[ f_x(S) \equiv \frac{1}{q_x(S)} \sum_{\gamma \in S} q_x(\gamma) f(\gamma). \tag{5.14} \]
This quantity represents the expected fitness of an individual randomly extracted from
population x, given that it belongs to schema S.
The defining length δ(S) of schema S is the distance between the first and last fixed
position in its template string. The defining length can be interpreted as a measure of
information “compactness” in a schema.
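The schema quantities defined above are easy to compute for short strings. The following Python sketch is only an illustration: it assumes that the wild card is written `*`, and all function names are hypothetical.

```python
from itertools import product

WILD = "*"  # assumed notation for the wild-card symbol


def matches(schema, string):
    """True if the string belongs to the schema."""
    return len(schema) == len(string) and all(
        s == WILD or s == c for s, c in zip(schema, string))


def order(schema):
    """Number of fixed (non wild-card) positions."""
    return sum(1 for s in schema if s != WILD)


def defining_length(schema):
    """Distance between the first and the last fixed position."""
    fixed = [i for i, s in enumerate(schema) if s != WILD]
    return fixed[-1] - fixed[0] if fixed else 0


def cardinality(schema):
    """Number of strings contained in the schema: 2^(l - o(S))."""
    return 2 ** (len(schema) - order(schema))


def relative_fitness(schema, population, fitness):
    """Average fitness of the individuals of the population that belong to the schema."""
    members = [g for g in population if matches(schema, g)]
    if not members:
        return None
    return sum(fitness(g) for g in members) / len(members)


schema = "1**0*"
print(order(schema), defining_length(schema), cardinality(schema))

# Count the schemata of length 3 that match the string 101.
n_matching = sum(matches("".join(t), "101") for t in product("01*", repeat=3))
print(n_matching)
```

Averaging over the list of individuals, with repetitions, is equivalent to weighting each genotype by its share, as in the definition of the relative fitness.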
Let $\{X_t\}_{t=0,1,\dots}$ be the sequence of populations generated by the genetic algorithm
at generations t = 0, 1, . . .; assuming constant the ratio
\[ c = \frac{f_{X_t}(S) - f(X_t)}{f(X_t)}, \qquad t = 0, 1, \dots, \tag{5.15} \]
we have that
\[ E[q_{X_t}(S) \mid X_0] \ge q_{X_0}(S)\,(1 + c)^t \left( 1 - p_{\mathrm{cross}} \frac{\delta(S)}{l - 1} - o(S)\, p_{\mathrm{mut}} \right)^t, \tag{5.16} \]
where $p_{\mathrm{cross}}$ and $p_{\mathrm{mut}}$ are, respectively, the crossover and mutation rates.
In other words, it is expected that short, low-order, above-average schemata get an
exponentially increasing number of instances in subsequent generations.
This result, known as the Schema Theorem, can be proved as follows. We first
calculate the conditional expectation $E[q_{X_t}(S) \mid X_{t-1}]$, taking into account the
cumulative effects of crossover and mutation. Using fitness proportionate selection,
we get
\[ E[q_{X_t}(S) \mid X_{t-1}] \ge q_{X_{t-1}}(S)\, \frac{f_{X_{t-1}}(S)}{f(X_{t-1})}\, P_{\mathrm{surv}}[S] = q_{X_{t-1}}(S)\,(1 + c)\, P_{\mathrm{surv}}[S], \tag{5.17} \]
where Psurv[S] is the probability for the fixed positions of schema S of not being
touched by crossover and mutation. Clearly, Psurv[S] ≥ Psc[S]Psm[S], where Psc[S] is
the probability that S survives crossover and Psm[S] is the probability that S survives
mutation, because it could happen that a chance mutation restores a part of S corrupted
by crossover.
Therefore, Psc[S] and Psm[S] have to be calculated. It should be clear that the
defining length of a schema plays a significant role in its probability of survival: the
crossover point being uniformly chosen among l − 1 possible points, the probability
that the chosen point does not split the fixed part of a schema is given by
$1 - \frac{\delta(S)}{l - 1}$; now, there is a probability pcross that an individual undergoes
crossover, whence it can be written
\[ P_{\mathrm{sc}}[S] = 1 - p_{\mathrm{cross}} \frac{\delta(S)}{l - 1}. \tag{5.18} \]
As for mutation, the probability that each single position in a genotype is altered is
pmut and, given that mutation in each position is independent of mutation in the others,
the probability that no fixed position is altered is
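\[ P_{\mathrm{sm}}[S] = (1 - p_{\mathrm{mut}})^{o(S)} \approx 1 - o(S)\, p_{\mathrm{mut}}, \tag{5.19} \]
since $p_{\mathrm{mut}} \ll 1$. This makes it possible to write
\[ E[q_{X_t}(S) \mid X_{t-1}] \ge q_{X_{t-1}}(S)\,(1 + c) \left( 1 - p_{\mathrm{cross}} \frac{\delta(S)}{l - 1} - o(S)\, p_{\mathrm{mut}} \right). \tag{5.20} \]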
Substituting $q_{X_{t-1}}(S)$ with $E[q_{X_{t-1}}(S) \mid X_{t-2}]$ and iterating the reasoning until we get
to $q_{X_0}(S)$ yields the thesis.
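To get a feeling for the bound of the Schema Theorem, the following Python sketch evaluates the per-generation growth factor (1 + c)(1 − p_cross δ(S)/(l − 1) − o(S) p_mut) for a short and for a long schema. The numerical values (l = 10, c = 0.2, p_cross = 0.6, p_mut = 0.01) are arbitrary assumptions chosen for illustration, and the function names are hypothetical.

```python
def survival_bound(delta, order, length, p_cross, p_mut):
    """Lower bound on the probability that a schema survives crossover and mutation."""
    return 1.0 - p_cross * delta / (length - 1) - order * p_mut


def schema_share_bound(q0, c, delta, order, length, p_cross, p_mut, t):
    """Lower bound on the expected share of a schema after t generations."""
    growth = (1.0 + c) * survival_bound(delta, order, length, p_cross, p_mut)
    return q0 * growth ** t


# Two schemata of order 2 in strings of length 10, both 20% above average.
short_growth = 1.2 * survival_bound(delta=1, order=2, length=10, p_cross=0.6, p_mut=0.01)
long_growth = 1.2 * survival_bound(delta=9, order=2, length=10, p_cross=0.6, p_mut=0.01)
print(short_growth, long_growth)

# Bound on the share of the short schema after 5 generations, starting from 5%.
print(schema_share_bound(0.05, 0.2, 1, 2, 10, 0.6, 0.01, t=5))
```

With these values the short schema has a growth factor above 1 while the long one falls well below 1, which is the sense in which short, low-order, above-average schemata are favored. The bound relies on the assumption that c stays constant, so it is only meaningful for a limited number of generations.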
The Building Blocks Hypothesis
A consequence of the schema theorem is the so-
called building blocks hypothesis [25], which states that
An evolutionary algorithm seeks near-optimal performance through
the juxtaposition of short, low-order, high performance schemata: the
building blocks.
When the building block hypothesis does not hold we have deception. The simplest
case of deception happens when, for some schema S, γ∗ ∈ S but $f(S) < f(\bar{S})$, where
γ∗ is the optimal genotype and $\bar{S}$ is the complement of S. In such cases, the genetic
algorithm is deceived by schemata that are above average, but that do not lead in the
right direction.
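A concrete instance of deception can be checked by brute force. The Python sketch below uses a 4-bit "trap" function, which is a standard illustrative example and is not taken from the text: the optimum is 1111, yet every additional 1 short of the optimum lowers the fitness. The wild card is again assumed to be written `*`.

```python
from itertools import product

L = 4  # string length of the illustrative trap function


def trap(genotype):
    """Trap function: maximal at 1111, otherwise decreasing with the number of ones."""
    u = genotype.count("1")
    return L if u == L else L - 1 - u


def schema_fitness(schema):
    """Absolute fitness of a schema: average of trap() over all its strings."""
    members = ["".join(bits) for bits in product("01", repeat=L)
               if all(s == "*" or s == b for s, b in zip(schema, bits))]
    return sum(trap(g) for g in members) / len(members)


# S = 1*** contains the optimum 1111; its complement is 0***.
print(schema_fitness("1***"), schema_fitness("0***"))
```

Here the schema containing the optimum has a lower absolute fitness than its complement, so selection on low-order schemata pushes the population away from 1111.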
Three remedies to deception have been proposed:
• if a priori knowledge of the objective function is available, one can use it to
come up with a non-deceptive encoding;
• one can introduce a new inversion operator which makes the semantics of genes
non-positional;
• genotypes might be allowed to underspecify or overspecify a solution, this being
the approach adopted by Goldberg’s messy genetic algorithms [16].
Convergence in Probability
It has been proved [1,9] that, under certain rather mild
assumptions, the process {Xt}t=0,1,2,... converges in probability to a global optimum.
This is not an impressive result: other very inefficient search algorithms, like random
search and exhaustive search, enjoy the same property. What convergence in proba-
bility means is that, provided we have enough patience, evolution will always find the
best solution.
Rate of Convergence
Although we know that an evolutionary algorithm will always,
sooner or later, find the optimal solution for every problem, we still have no clue as to
how long we will have to wait.
The general rate of convergence of evolutionary algorithms is still an open ques-
tion, and a very difficult one at that.
It is reasonable to conjecture that all problems might be characterized by the rate at
which a well designed evolutionary algorithm converges to their solution. Evolution-
ary algorithms might thus give rise to their own problem complexity classes, and these
classes might have some relationship with the traditional computational complexity
classes of computer science.
While this could be an ambitious program for future research in the field of evo-
lutionary computing, to date, to the authors’ knowledge, no significant result has been
achieved in this direction.
5.4
Introduction to Classifier Systems
Classifier systems (CS) provide another look at genetic algorithms, one which is not
based on optimization concepts but rather on machine learning ideas. Machine learn-
ing is a field of artificial intelligence which deals with the question of how to build com-
puter programs that can improve their behaviour by experiencing their environment;
that is, they should be able to change the course of calculations as the model
accumulates experience. One early and important piece of work in this field was Samuel’s
checkers-playing program, which is still significant today [36]. There exist many
specific learning techniques and we cannot discuss all of them here; the interested reader
can consult Mitchell’s book for a complete survey of the subject [26]. We will meet
several machine learning ideas especially in connection with neural networks. Here
we describe classifier systems, a machine learning technique that has deep connections
with genetic algorithms. For an in-depth treatment of the subject see Goldberg’s book
[15].
Classifier systems are rule-based systems. Rule-based systems belong to the class
of production systems, a computational class of systems consisting of a set of rules
each having a left side that determines the applicability of the rule and a right side that
describes what is to be done if the rule is applied. Each rule maps a problem state into
a new state, where the process is applied again, or a solution is found. The rules in
classifier systems are called classifiers and are of the following form:
IF condition THEN action
which is to be interpreted such that action is executed if condition is true. Unlike
classical expert systems, which need access to a substantial domain knowledge base,
usually established through human experts and translated into rules, classifier systems
may discover new rules through simulated evolution. Furthermore, rules in a classifier
system are processed in parallel as a whole, whereas traditional expert systems only
fire one rule at a time in sequential fashion.
[Diagram components: Environment; detectors (input message); effectors (output message); message list; population of classifiers; credit sharing system; rule discovery system (GA).]
Figure 5.2: Schematic architecture of a learning classifier system.
A classifier system has three main components:
• message and rule system
• credit-assignment system
• genetic algorithm
Classifier systems work roughly in the following way (see also Figure 5.2). A popu-
lation of rules is encoded as bit strings and evolves over time. The environment sends
messages representing example events to the CS which are decoded by the detectors
and placed into an internal message list. The classifier system then tries to match mes-
sages with one or more classifier conditions. If some conditions match, an action is
selected and applied by the effectors through a coding of the corresponding output
message. Classifiers are given fitness values, also called strength, through a
credit-assignment system, and new rules can be discovered in the classifier population by the
genetic algorithm. The syntax of a classifier (rule) is of the type:
condition : message
A message is simply a finite string over some alphabet, e.g. a string of length k
over the binary alphabet {0, 1}. The condition part of the rule can also be coded with
the same alphabet plus a “wild card” symbol * (see Section 5.3.2) that matches either
a 0 or a 1. Thus, for example, both the messages 0011 and 0101 would match the
condition 0**1. When a message matches a classifier condition the classifier may place
its own message part into the message list. Classifiers are all of the same length.
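The matching step can be sketched in Python as follows. The wild card is assumed to be written `*`, and the small list of classifiers, given as (condition, message) pairs, is invented for the example.

```python
WILD = "*"  # assumed notation for the wild-card symbol


def condition_matches(condition, message):
    """True if every non wild-card position of the condition equals the message bit."""
    return len(condition) == len(message) and all(
        c == WILD or c == m for c, m in zip(condition, message))


# Hypothetical classifiers of the form condition : message.
classifiers = [("0**1", "1100"), ("1***", "0001"), ("00*1", "1111")]


def candidate_messages(incoming):
    """Messages that matching classifiers could post in response to an incoming message."""
    return [msg for cond, msg in classifiers if condition_matches(cond, incoming)]


print(condition_matches("0**1", "0011"), condition_matches("0**1", "0101"))
print(candidate_messages("0011"))
```

Which of the candidate classifiers actually posts its message is decided by the bidding mechanism, which this sketch does not model.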
When classifiers are activated by a message, a market-like mechanism is employed
to choose which one will win and be able to post its message into the list and how the
partial worth of other classifiers in the activation chain is to be acknowledged. Several
solutions have been proposed for sharing the credit among classifiers, all of them being
relatively involved. For the sake of simplicity, it suffices to say here that the classifier
worth, called its bid, is proportional to its strength and that a reward system is set
up such that successful classifiers get a positive reward that increases their current
strength. However, the successful classifier recognizes its debt by sharing its bid in
some way among those classifiers that sent the messages that matched the bidding
classifier’s condition. More details on the credit-sharing problem may be found in [15]
where Holland’s original “bucket brigade” algorithm is described, as well as more
recent variations.
The genetic algorithm part of the classifier system allows it to generate new, possi-
bly superior, classifiers. New rules are built by recombining and mutating the current
rules in the population. Crossover is identical to the one used in GAs for search and
optimization purposes. Mutation is also the same but, since the wild card symbol is
also permitted, when a symbol is to be mutated it can change to one of the other two
with equal probability. Another difference with the classical GA is that in machine
learning the whole population of classifiers co-evolves and learns collectively. Replac-
ing the whole population at once after the selection phase would disrupt the learning
process. In practice, individuals are replaced more gently and taking into account their
degree of similarity.
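The three-symbol mutation described above can be sketched as follows in Python. The wild card is assumed to be written `*`; the function name and the mutation probabilities used in the example are hypothetical.

```python
import random

ALPHABET = "01*"  # binary symbols plus the assumed wild-card notation


def mutate_condition(condition, p_mut, rng):
    """Mutate each symbol with probability p_mut into one of the two other symbols."""
    out = []
    for symbol in condition:
        if rng.random() < p_mut:
            # The two alternatives are equally likely.
            symbol = rng.choice([a for a in ALPHABET if a != symbol])
        out.append(symbol)
    return "".join(out)


rng = random.Random(0)
print(mutate_condition("0*1*", 0.25, rng))
```

The message part of a classifier contains no wild cards, so under the description in the text it would be mutated with the ordinary bit-flip operator.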
The above schematic presentation of classifier systems corresponds to the original
model proposed by Holland (see for instance [19] and [15] for an historical perspective
on the development of CS). A more recent version of classifier systems is the one
presented by Wilson [41], called the XCS classifier system. XCS differs from the
standard CS model in various respects, especially in the way actions are selected after
a set of classifiers that match a given message has been formed. The apportionment of
credit algorithm is also different and is based on reinforcement learning ideas.
For the sake of completeness, it is to be noted that classifier systems are not the only
evolutionary approach to machine learning. In the so-called Pitt approach, instead of
having a population of co-evolving rules that are to be considered as an integrated set,
each individual represents a set of rules which are complete solutions to the problem
(see [25]). Genetic programming, which is the subject of the next section, can also be
considered as an evolutionary machine learning technique.
5.5
Genetic Programming
Genetic programming (GP) is a new evolutionary approach which extends the genetic
model of learning to the space of programs. It is a major variation of genetic algorithms
in which the evolving individuals are themselves computer programs instead of fixed
length strings from a limited alphabet of symbols. Genetic programming is a form
of program induction that can be used to automatically discover programs that solve
or approximately solve a given task. The present form of GP is principally due to J.
Koza [22].
Individual programs in GP might be expressed in principle in any current program-
ming language. However, the syntax of most languages is such that GP operators
would create a large percentage of syntactically incorrect programs. For this reason,
Koza chose a syntax in prefix form analogous to LISP and a restricted language with
an appropriate number of variables, constants and operators defined to fit the problem
to be solved. In this way syntax constraints are respected and the program search space
is limited. The restricted language is formed by a user-defined function set F and ter-
minal set T . The functions chosen are those that are a priori believed to be useful for
the problem at hand, and the terminals are usually either variables or constants. In ad-
dition, each function in the function set must be able to accept as arguments any other
function return value and any data type in the terminal set T , a property that is called
syntactic closure. Thus, the space of possible programs is constituted by the set of all
possible compositions of functions that can be recursively formed from the elements
of F and T .
As an example, suppose that we are dealing with simple arithmetic expressions in
four variables. In this case, suitable function and terminal sets might be defined as:
F = {+, −, ∗, /} and T = {A, B, C, D}
and the following are legal programs: (+ (* A B) (/ C D)), and (* (- (+ A C) B) A).
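To make the example concrete, the two programs can be represented as nested tuples and evaluated recursively, as in the Python sketch below. Division is "protected" (it returns 1 when the divisor is zero), a convention commonly used in GP so that every function accepts any argument; this convention and the sample variable values are assumptions of the sketch.

```python
import operator


def protected_div(a, b):
    """Division that never fails, so that closure holds (assumed convention)."""
    return a / b if b != 0 else 1.0


FUNCTIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": protected_div,
}


def evaluate(tree, env):
    """Evaluate a program tree: a terminal name or a tuple (function, left, right)."""
    if isinstance(tree, str):
        return env[tree]
    op, left, right = tree
    return FUNCTIONS[op](evaluate(left, env), evaluate(right, env))


prog1 = ("+", ("*", "A", "B"), ("/", "C", "D"))   # (+ (* A B) (/ C D))
prog2 = ("*", ("-", ("+", "A", "C"), "B"), "A")   # (* (- (+ A C) B) A)
env = {"A": 2.0, "B": 3.0, "C": 8.0, "D": 4.0}

print(evaluate(prog1, env), evaluate(prog2, env))
```

Because every function returns a number and accepts numbers, any composition of elements of F and T is a legal program.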
It is important to note that GP does not need to be implemented in the LISP lan-
guage (though this was the original implementation). Any language that can represent
programs internally as parse trees is adequate. Thus, most GP packages today are
written in C, C++ or Java rather than LISP. GP representation is not restricted to trees
however. Other program representations have been proposed, such as linear and graph
representations [6].
For the sake of simplicity and generality, we will depict programs as trees with
ordered branches in which the internal nodes are functions and the leaves are the ter-
minals of the problem. Thus, the examples given above would give rise to the trees in
Figure 5.3.
Figure 5.3: Two GP trees corresponding to the LISP expressions in the text.
Evolution in GP is similar to GAs, except that different individual representation
and genetic operators are used. Once suitable functions and terminals have been de-
termined for the problem at hand, an initial random population of trees (programs)
is constructed. From there on the population evolves as with a GA where fitness is
assigned after actual execution of the program (individual) and with genetic operators
adapted to the tree representation. Fitness calculation is a bit different for programs. In
GP we would like to discover a program that satisfies a given number N of predefined
input/output relations: these are called the fitness cases. For a given program $p_i$ its
fitness $f_j(p_i)$ on the j-th fitness case represents the difference between the output $g_j$
produced by the program and the correct answer $G_j$ for that case. The total fitness
$f(p_i)$ is the sum over all N fitness cases of some norm of the cumulated difference:
\[ f(p_i) = \sum_{k=1}^{N} \| g_k - G_k \|. \tag{5.21} \]
Obviously, a better program will have a lower fitness under this definition, and a perfect
one will score 0 fitness.
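Equation (5.21) can be illustrated with a few lines of Python. In this sketch a program is simply a Python callable, the norm is the absolute value, and the fitness cases are generated from the hypothetical target x² + 1; all of these are assumptions made for the example.

```python
def total_fitness(program, fitness_cases):
    """Sum over the fitness cases of |program output - correct answer|."""
    return sum(abs(program(x) - target) for x, target in fitness_cases)


# Hypothetical fitness cases: input/output pairs of the target x^2 + 1.
cases = [(x, x * x + 1) for x in range(-3, 4)]


def perfect(x):
    return x * x + 1


def approximate(x):
    return x * x


print(total_fitness(perfect, cases), total_fitness(approximate, cases))
```

The perfect program scores 0, while the approximate one is off by 1 on each of the 7 cases.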
The crossover operation starts by selecting a random crossover point in each parent
tree and then exchanging the sub-trees, giving rise to two offspring trees, as shown in
Figure 5.4. The crossover site is usually chosen with non-uniform probability, in order
to favor internal nodes with respect to leaves. Mutation is implemented by randomly
removing a subtree at a selected point and replacing it with a randomly generated
subtree, although this operator is seldom used.
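Subtree crossover can be sketched in Python with trees stored as nested lists `[function, left, right]` and terminals stored as strings. For simplicity the crossover points are chosen uniformly among all nodes, whereas, as noted above, actual systems usually favor internal nodes; all names are illustrative.

```python
import copy
import random


def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) of every node of the tree."""
    yield prefix
    if isinstance(tree, list):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))


def get(tree, path):
    """Return the subtree found by following the path from the root."""
    for i in path:
        tree = tree[i]
    return tree


def replace(tree, path, subtree):
    """Return a copy of the tree in which the node at path is replaced by subtree."""
    if not path:
        return copy.deepcopy(subtree)
    new = copy.deepcopy(tree)
    get(new, path[:-1])[path[-1]] = copy.deepcopy(subtree)
    return new


def crossover(p1, p2, rng):
    """Exchange two randomly chosen subtrees, giving two offspring trees."""
    path1 = rng.choice(list(paths(p1)))
    path2 = rng.choice(list(paths(p2)))
    return (replace(p1, path1, get(p2, path2)),
            replace(p2, path2, get(p1, path1)))


def size(tree):
    """Number of nodes of the tree."""
    return sum(1 for _ in paths(tree))


parent_1 = ["+", ["*", "A", "B"], ["/", "C", "D"]]
parent_2 = ["*", ["-", ["+", "A", "C"], "B"], "A"]
child_1, child_2 = crossover(parent_1, parent_2, random.Random(0))
print(child_1)
print(child_2)
```

Since the offspring are built on copies, the parents are left untouched, and the total number of nodes is conserved by the exchange.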
One problematic step in GP is the choice of the appropriate language for a given
problem. In general, the problem itself suggests a reasonable set of functions and
terminals but this is not always the case. Although experimental evidence has shown
that good results can be obtained with slightly different choices of F and T , it is clear
that the choice of language has an influence on how hard the problem will be to solve
with GP. For the time being, there is no guideline for estimating this dependence nor
for choosing suitable terminal and function sets. Self-adaptation or co-evolution (see
Section 5.7.5) of functions and terminals might help in finding good language building
blocks.
Figure 5.4: Example of crossover of two genetic programs.
Another controversial issue has to do with the size of the GP trees. The depth of
the trees can in principle increase without limits under the influence of crossover, a
phenomenon that goes under the name of “bloating”. The increase in size is often
accompanied by a stagnant population fitness. Most GP systems have a parameter
that prevents trees from becoming too deep, which would otherwise fill all the available
memory and require longer evaluation times. To further avoid bloating, a common approach is
to introduce a size-penalty term into the fitness expression, possibly in a self-adapting
way. There is still some debate among practitioners in the field as to whether one
should let the trees breed and grow until the maximum depth or whether to edit and
simplify them along the way in order to obtain shorter programs. The argument for
larger trees is that the often redundant genetic material has a richer set of breeding
possibilities and may lead to increased diversity in successive populations. On the
other hand, the use of parsimony through Minimum Description Length principles or
size penalties may give rise to compact and efficient solutions in some cases [42]. The
issue is difficult to settle due to our currently limited knowledge about the dynamics
of the evolution of program populations.
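A size penalty of the kind mentioned above can be sketched as follows in Python. The linear form of the penalty and the coefficient `alpha` are assumptions made for illustration; as in Equation (5.21), a lower fitness value is taken to be better.

```python
def size(tree):
    """Number of nodes of a tree given as nested tuples/lists (function, child, ...)."""
    if isinstance(tree, (list, tuple)):
        return 1 + sum(size(child) for child in tree[1:])
    return 1


def penalized_fitness(error, tree, alpha=0.01):
    """Error-based fitness plus a penalty proportional to the size of the program."""
    return error + alpha * size(tree)


small = ("+", "A", "B")
large = ("+", ("*", "A", "B"), ("/", "C", "D"))
# With equal error, the smaller program gets the better (lower) value.
print(penalized_fitness(0.5, small), penalized_fitness(0.5, large))
```

A larger `alpha` pushes the search towards compact programs at the risk of discarding useful genetic material, which is the trade-off discussed in the text.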
Plain GP works well for problems that are not too complex and that give rise to
relatively short programs. To extend GP to more complex problems some hierarchical
principle has to be introduced. In any problem-solving activity hierarchical consider-
ations are needed to produce economically viable solutions. This is true in classical
top-down design where some form of divide-and-conquer strategy is routinely used to
decompose the problem into manageable subproblems. The same considerations are
also useful when working bottom-up, as in artificial evolutionary methods. It has been
observed by several researchers that during evolution some subtrees appear repeatedly
within the population as parts of successful individuals. Those trees that seem to per-
form a useful function might be identified, encapsulated into modules, and reused as
single units in the evolutionary process. Methods for automatically identifying and
extracting useful modules within GP have been discussed by Koza under the name of
Automatically Defined Functions (ADF) [23], by Angeline and Kinnear in [11], and
by Rosca in [34].
GP is particularly useful for program discovery, i.e. the induction of programs that
correctly solve a given problem with the assumption that the form of the program is
unknown and that only the desired behavior is given, e.g. by specifying input-output
relations. Genetic programming has been successfully applied to a wide variety of
problems from many fields, described in [22,23,6] and, more recently, in [2,11]. In
conclusion, GP has been empirically shown to be quite a powerful automatic or semi-
automatic program-induction and machine learning methodology.
5.6
Evolution Strategies and Evolutionary Programming
This section provides an overview of evolution strategies and evolutionary program-
ming along with some relevant theoretical results.
5.6.1
Evolution Strategies
Evolution strategies [37,33,38] approach function optimization problems in the l-
dimensional real space by exploiting a real encoding of the objective function param-
eters.
Phenotypes are l-dimensional vectors, i.e. $\Phi = \mathbb{R}^l$. A genotype is made up of the
same vector as the associated phenotype, plus up to l variances $c_{ii} = \sigma_i^2$, with
i = 1, . . . , l, and up to l(l − 1)/2 covariances $c_{ij}$, with i, j = 1, . . . , l and i ≠ j, of
the l-dimensional normal joint distribution having vector 0 as its expectation and density
function, for all $z \in \mathbb{R}^l$,
\[ p(z) = \sqrt{\frac{\det C^{-1}}{(2\pi)^l}}\; e^{-\frac{1}{2} z^T C^{-1} z}, \tag{5.22} \]
where $C = (c_{ij})$ is the variance/covariance matrix. The choice of a normal
distribution, that will be used to perturb the genotypes, is obviously arbitrary. Overall, an
individual will contain k ≤ l(l + 1)/2 parameters relevant to the “strategy” besides
the l parameters relevant to the object problem, whence in general $\Gamma = \mathbb{R}^{k+l}$; often,
however, only variances are considered, whereas sometimes it is sufficient to consider
one variance for all the object problem parameters.
The fitness of an individual γ ∈ Γ is obtained by scaling the objective function
to make it positive and, sometimes, by adding a random noise described as a random
variable W ,
\[ f(\gamma) = F\{c[M(\gamma)], W\}. \tag{5.23} \]
Mutation
In its most general form, the mutation operator perturbs a genotype γ =
(z, C) by first randomly modifying C and then z according to the new probability
distribution provided by C, thus producing a new individual γ′ = (z′, C′), where
\[ z' = z + N(0, C'), \tag{5.24} \]
where N(0, C′) denotes a random vector with normal joint distribution with mean 0
and variance/covariance matrix C′.
This mutation mechanism allows the algorithm to autonomously evolve the param-
eters relevant to its strategy while searching for the solution: the resulting evolutionary
process has been called self-adaptation [39].
In practice, instead of modifying C directly, contemporary evolution strategies
use the decomposition $C = (ST)^T (ST)$, where S is the diagonal matrix of standard
deviations ($s_{ii} = \sigma_i$), and
\[ T = \prod_{i=1}^{l-1} \prod_{j=i+1}^{l} R_{ij}(\alpha_{ij}) \tag{5.25} \]
is the product of l(l − 1)/2 elementary rotation matrices¹ $R_{ij}$ with angles $\alpha_{ij} \in (0, 2\pi]$.
The l(l − 1)/2 rotation angles and the l standard deviations can be directly used to
generate a joint normal vector deviate
\[ \Delta z = T^T S^T N(0, I), \]
with the same probability density function as in Equation 5.22.
On the basis of this, mutation in its most general form is a three-stage operation,
consisting of
1. perturbing every rotation angle α according to $\alpha' = \alpha + \varphi N(0, 1) \pmod{2\pi}$;
2. updating the standard deviations for each variable by the lognormal
self-adaptation method, whereby $\sigma_i' = \sigma_i e^{\tau N(0,1)}$;
3. applying Equation 5.24 in the form
\[ z' = z + T^T(\alpha')\, S^T(\sigma')\, N(0, I). \]
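A simplified version of this mutation can be sketched in Python. The sketch deliberately drops the rotation angles (it assumes T = I, i.e. uncorrelated mutations with one standard deviation per variable), so only the second and third stages are shown; the value of `tau` and the function name are assumptions.

```python
import math
import random


def mutate(z, sigma, tau, rng):
    """Self-adaptive mutation without rotations (T = I is assumed).

    First each standard deviation is rescaled by a lognormal factor,
    then the object variables are perturbed using the new deviations.
    """
    new_sigma = [s * math.exp(tau * rng.gauss(0.0, 1.0)) for s in sigma]
    new_z = [zi + si * rng.gauss(0.0, 1.0) for zi, si in zip(z, new_sigma)]
    return new_z, new_sigma


rng = random.Random(42)
z, sigma = [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]
new_z, new_sigma = mutate(z, sigma, tau=0.5, rng=rng)
print(new_z, new_sigma)
```

Because the lognormal factor is always positive, the standard deviations remain positive whatever the random draws are.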
Recombination
Evolution strategies utilize various recombination mechanisms,
which in the simplest case produce one child individual from a couple of parents or,
in the global case, can form the new individual by combining all the individuals in the
population (orgy).
¹ An elementary rotation matrix R with an angle α is obtained from an identity matrix by replacing
four entries, identified by indices i and j, as follows:
$r_{ii} = r_{jj} = \cos\alpha$, $r_{ij} = -r_{ji} = -\sin\alpha$.
The most widely used recombination mechanisms are discrete and intermediate
recombination: in discrete recombination each component is copied from one of the
parents at random; in intermediate recombination the value of each component of the
child individual is a linear combination of the corresponding component of all the
parents participating in the operation.
It has been observed that the best results are obtained by applying discrete re-
combination to the object problem parameters and intermediate recombination to the
strategy parameters. Furthermore, it has been proved that recombination of the latter
is required for self-adaptation phenomena to take place.
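The two recombination mechanisms can be sketched as follows in Python. For intermediate recombination the linear combination is taken to be the arithmetic mean of the parents, which is one possible choice and an assumption of this sketch; the function names are illustrative.

```python
import random


def discrete_recombination(parents, rng):
    """Each component of the child is copied from a randomly chosen parent."""
    length = len(parents[0])
    return [rng.choice(parents)[i] for i in range(length)]


def intermediate_recombination(parents):
    """Each component of the child is the mean of the parents' components."""
    n = len(parents)
    return [sum(p[i] for p in parents) / n for i in range(len(parents[0]))]


rng = random.Random(0)
object_a, object_b = [1.0, 2.0], [3.0, 6.0]   # object problem parameters
sigma_a, sigma_b = [0.5, 0.5], [1.5, 2.5]     # strategy parameters

child_object = discrete_recombination([object_a, object_b], rng)
child_sigma = intermediate_recombination([sigma_a, sigma_b])
print(child_object, child_sigma)
```

The usage follows the observation in the text: discrete recombination for the object problem parameters, intermediate recombination for the strategy parameters.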
Selection
Selection in evolution strategies is deterministic, according to two alterna-
tive schemes, which define two classes of strategies, (n, m) and (n + m). In (n, m)
strategies, from a population of n individuals, m > n offspring are produced and the
best n of them are kept for the next generation. The n parents are always discarded to
make room for the best offspring. In (n + m) strategies, on the contrary, the best n
individuals among the m offspring and the n parents survive into the next generation:
an (n + m) strategy never discards the best solutions so-far (elitism), thus guarantee-
ing a monotone improvement of the population; on the other hand such a strategy has
a hard time reacting to problems that change in time and does not support evolution
of strategy parameters in a satisfactory way, in particular for small populations. For
these reasons (n, m) strategies are nowadays preferred, where experiments point to an
optimal fraction n/m ≈ 1/7 [39].
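The two selection schemes differ only in the pool from which survivors are taken, as the following Python sketch shows. Individuals are plain numbers here and "best" is taken to mean lowest cost; both choices, like the function names, are assumptions made for the example.

```python
def comma_selection(parents, offspring, n, cost):
    """(n, m) selection: the best n offspring survive, the parents are discarded."""
    return sorted(offspring, key=cost)[:n]


def plus_selection(parents, offspring, n, cost):
    """(n + m) selection: the best n among parents and offspring survive."""
    return sorted(parents + offspring, key=cost)[:n]


parents = [1.0, 2.0]
offspring = [3.0, 1.5, 4.0, 2.5]

print(comma_selection(parents, offspring, 2, cost=abs))
print(plus_selection(parents, offspring, 2, cost=abs))
```

In the example the best parent survives under plus selection but is lost under comma selection, which illustrates the elitism of (n + m) strategies.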
Theoretical Results
An early theoretical result was the so-called 1/5-success rule,
stated by Rechenberg [33], which provides a method for controlling the standard
deviation on the basis of the observed frequency of mutations resulting in an improvement
of the individuals undergoing them (success):
The optimal fraction of success over all mutations is 1/5. If it is greater than
1/5, increase the standard deviation; if it is less, decrease it.
There are two problems concerning the use of this rule:
• sometimes the success rate remains below 1/5 even when the standard deviation is
decreased to zero;
• the rule gives no suggestion as to how single deviations should be treated indi-
vidually and therefore does not allow the scaling of the average mutation step
along distinct axes of the coordinate systems in a different way.
It follows that this result is really useful only for strategies with one standard deviation
and not using recombination and self-adaptation. It has to be said that this rule was
never intended to be used with anything other than (1 + 1) evolution strategies.
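A minimal sketch of a (1 + 1) strategy controlled by the 1/5-success rule is given below. It assumes a minimization problem; the observation window of 20 mutations and the adjustment factor 0.85 are illustrative choices, not values prescribed by the text.

```python
import random

def one_plus_one_es(f, x, sigma, generations=200, window=20, factor=0.85, seed=0):
    """Minimise f with a (1 + 1) strategy and the 1/5-success rule.

    window and factor are illustrative assumptions.
    """
    rng = random.Random(seed)
    fx = f(x)
    successes = 0
    for g in range(1, generations + 1):
        child = [xi + rng.gauss(0.0, sigma) for xi in x]
        fc = f(child)
        if fc < fx:                      # successful mutation: child replaces parent
            x, fx = child, fc
            successes += 1
        if g % window == 0:              # adapt sigma once per window of mutations
            rate = successes / window
            if rate > 0.2:
                sigma /= factor          # too many successes: take larger steps
            elif rate < 0.2:
                sigma *= factor          # too few successes: take smaller steps
            successes = 0
    return x, fx, sigma

sphere = lambda v: sum(vi * vi for vi in v)
best, best_f, final_sigma = one_plus_one_es(sphere, [5.0, -3.0], sigma=1.0)
```

A single standard deviation is shared by all coordinates, which is exactly the limitation pointed out in the second item above.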
The general expression for the convergence rate, i.e. the expected rate of improve-
ment for the average fitness in a population ϕt = E[f (Xt+1)/f (Xt)|Xt] for (n, m)
and (n + m) strategies using one standard deviation and without recombination and
self-adaptation was obtained by Schwefel [38].
A recent result for evolution strategies is convergence in probability for the (1 + 1)
strategy proven by Günter Rudolph [35]. His proof can be extended to the general
(n + m) case, but not to (n, m) strategies.
5.6.2
Evolutionary Programming
Evolutionary programming [13,14] is an approach to Artificial Intelligence making
use of finite-state automata. Intelligent behavior requires both the capability of
predicting the environment and mechanisms to translate those predictions into reactions
appropriate for reaching a goal. At the highest level of generality, the environment is
described as a sequence of symbols from a given finite alphabet. The task is therefore
to evolve an algorithm operating on a sequence of observed symbols and producing an
output symbol so as to maximize its performance with respect to the next environment
symbol according to a well-defined reward function.
A finite-state automaton [20] is a five-tuple ⟨Q, Σ, Z, δ, ω⟩, where Q is the set of
internal states, Σ is the input alphabet, that is the environment, Z is the output alphabet,
δ : Σ × Q → Q is the transition function and ω : Σ × Q → Z is the output function.
In evolutionary programming the space of phenotypes Φ is the set of finite-state
automata with Q, Σ, and Z fixed according to the problem approached. The space
of genotypes Γ consists of descriptions of the functions δ and ω, typically in the form of
Q × Σ tables having a pair (q, a), with q ∈ Q the next state and a ∈ Z the output
symbol, in each cell; more compact encodings are often used when the problem
structure makes it possible.
Fitness of automata in a population is calculated by applying them to an observed
sequence of symbols. One symbol at a time is fed into each automaton and its output
compared with the next symbol in the sequence; the accuracy of prediction is measured
according to the reward function. When all the symbols have been read, the fitness of
each automaton is given by a function of the single rewards (e.g. the average per-
symbol reward).
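The fitness evaluation just described can be sketched as follows. The dictionary encoding of δ and ω, the function name and the two-state example automaton are assumptions made for illustration.

```python
def predict_fitness(delta, omega, start, sequence, reward):
    """Average per-symbol reward of an automaton predicting the next symbol.

    delta[(symbol, state)] gives the next state,
    omega[(symbol, state)] gives the output symbol.
    """
    state, total = start, 0.0
    for current, nxt in zip(sequence, sequence[1:]):
        guess = omega[(current, state)]      # output is the prediction of the next symbol
        total += reward(guess, nxt)
        state = delta[(current, state)]
    return total / (len(sequence) - 1)

# Hypothetical two-state automaton over the alphabet {0, 1}: it always
# outputs the opposite of the symbol it has just read.
delta = {(0, 'A'): 'B', (1, 'A'): 'B', (0, 'B'): 'A', (1, 'B'): 'A'}
omega = {(0, 'A'): 1, (1, 'A'): 0, (0, 'B'): 1, (1, 'B'): 0}
hit = lambda guess, actual: 1.0 if guess == actual else 0.0
score = predict_fitness(delta, omega, 'A', [0, 1, 0, 1, 0, 1], hit)
```

On the alternating sequence this automaton predicts every symbol correctly; on a constant sequence it would be wrong every time.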
New automata are generated by random mutation from each automaton already
in the population: typically each parent produces one child. There are five kinds of
mutation suggested by the description of an automaton:
1. replacing an output symbol;
2. replacing a state in the transition function definition;
3. adding a new internal state;
4. removing an internal state;
5. changing the initial state.
These operations are subject to constraints on the maximum and minimum number
of internal states; mutation operates according to an assigned probability distribution,
typically uniform; furthermore the number of mutations of an individual can be in turn
governed by a probability law, for example a Poisson distribution.
In its original formulation evolutionary programming does not provide for a re-
combination operator.
The selection process can be performed according to one of several general tech-
niques, including:
1. the best n solutions are retained to become the parents for the next generation
(truncation selection);
2. each individual is compared against k randomly chosen other individuals, and
the n individuals with the best win records are chosen (a sort of tournament
selection);
3. standard fitness proportionate selection, as in genetic algorithms.
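The second technique, selection by win records, might look like the sketch below. It assumes that a "win" means having fitness at least as high as the opponent's and that ties in the win count are broken by position in the population; both are choices made for the example.

```python
import random

def ep_tournament(population, fitness, n, k, seed=0):
    """Keep the n individuals with the most wins against k random opponents."""
    rng = random.Random(seed)
    wins = []
    for idx, ind in enumerate(population):
        others = population[:idx] + population[idx + 1:]
        opponents = rng.sample(others, k)            # k distinct opponents
        wins.append(sum(fitness(ind) >= fitness(o) for o in opponents))
    ranked = sorted(range(len(population)), key=lambda i: wins[i], reverse=True)
    return [population[i] for i in ranked[:n]]

population = [1, 2, 3, 4, 5, 6]
survivors = ep_tournament(population, fitness=lambda x: x, n=2, k=5)
```

With k equal to the number of other individuals, every comparison is made and the scheme coincides with truncation selection; smaller values of k make it noisier.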
Theoretical Results
David B. Fogel [12] has proved global convergence in proba-
bility for evolutionary programming.
5.7
Advanced Topics
This section gives an idea of the full range of evolutionary techniques, discussing the
main issues relevant to their application in practice.
5.7.1
Selection Methods and Reproduction Strategies
Exploration vs. Exploitation
The purpose of selection in evolutionary algorithms
is to concentrate the use of the available computational resources in promising regions
of the search space.
There is a relationship of reciprocity between the aspects of exploration and ex-
ploitation of the search space and, clearly, the stronger the pressure exerted by se-
lection toward a concentration of the computational effort, the smaller the fraction of
resources utilized to explore other possibilities. At one extreme, as selective pressure
decreases and, as a consequence, the resources employed for exploration increase, an
evolutionary algorithm tends to behave just like a raw Monte Carlo method, randomly
sampling the space of feasible solutions; at the other extreme, as the selective pres-
sure increases, the evolutionary algorithm degenerates into a local “gradient descent”
search method.
These two extremes correspond precisely to two types of objective functions:
on one side, gradient descent methods ensure fast convergence to the
global optimum for unimodal objective functions; on the other side, the only
algorithm ensuring almost certain convergence to the global optimum for highly
irregular objective functions (almost everywhere discontinuous, multimodal, noisy, etc.) is,
apart from exhaustive search, which is impractical for almost all problems, random
sampling.
Evolutionary algorithms can thus be regarded as a trade-off between these extremes
and selection is the instrument to adjust it.
Fitness Proportionate Selection
Fitness proportionate selection was derived by
Holland as the optimal trade-off between exploration and exploitation using an analogy
with the k-armed bandit [18].
Despite being rigorously derived and having a deep justification in Decision The-
ory, fitness proportionate selection has some drawbacks.
For instance, consider two fitness functions f1(γ) and f2(γ) = f1(γ) + c, for all
γ ∈ Γ. Since they appear substantially equivalent, one would expect the behavior of
an evolutionary algorithm not to change by replacing one with the other; however, if
the selection scheme is fitness proportionate this is obviously false.
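A small numerical example shows the effect. The three fitness values and the constant c = 100 are arbitrary illustrative numbers.

```python
def proportionate_probabilities(fitnesses):
    """Selection probability of each individual under fitness proportionate selection."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]

f1 = [1.0, 2.0, 3.0]
c = 100.0
f2 = [f + c for f in f1]                 # same ranking, shifted by a constant

p1 = proportionate_probabilities(f1)     # best individual gets 3/6 = 0.5
p2 = proportionate_probabilities(f2)     # best individual gets 103/306, about 0.337
```

Under f1 the best individual is three times as likely to be selected as the worst; under f2 the probabilities are nearly uniform and selection pressure almost vanishes.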
Another difficulty is represented by so-called superindividuals. A superindividual
in population x is an individual γ ∈ Γ such that $f(\gamma) \gg \bar f(x)$, where
$\bar f(x)$ denotes the average fitness of population x. A superindividual
is allocated by fitness proportionate selection a prominent slice of copies in the next
generation and, in a few generations, it ends up overwhelming any other genotypes
initially in the population, causing convergence. If the superindividual corresponds
to the problem’s global optimum, this is exactly what is desired, but if otherwise it
is associated with a local optimum, this leads to a failure of the algorithm, called
premature convergence.
The list of problems does not end here. The push toward improvement provided
by fitness proportionate selection asymptotically tends to zero as the individuals in a
population together approach the optimum. While this in fact reflects what is observed
in nature, in the framework of optimization it amounts to an actual drawback.
In order to solve these difficulties two approaches are possible: either appropri-
ately modifying the fitness function or, more simply, resorting to alternative selection
schemes.
Linear Ranking Selection
Linear ranking selection [5] is based on a sorting of in-
dividuals by decreasing fitness. The probability for the ith individual in the ranking of
being extracted is thus defined, for i = 1, . . . , n, as
\[ p(i) = \frac{1}{n}\left[\beta - 2(\beta - 1)\,\frac{i - 1}{n - 1}\right], \tag{5.26} \]
where 0 ≤ β ≤ 2 is a parameter that can be interpreted as the expected sampling rate
of the best individual across n independent extractions with re-insertion.
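The selection probabilities can be tabulated with a few lines of code. The sketch assumes the usual form of linear ranking, p(i) = [β − 2(β − 1)(i − 1)/(n − 1)]/n with rank 1 assigned to the fittest individual; the function name is hypothetical.

```python
def linear_ranking_probabilities(n, beta):
    """p(i) for i = 1..n, where rank 1 is the fittest individual."""
    return [(beta - 2.0 * (beta - 1.0) * (i - 1) / (n - 1)) / n
            for i in range(1, n + 1)]

p = linear_ranking_probabilities(n=5, beta=1.5)
expected_copies_of_best = p[0] * 5       # equals beta
```

The probabilities decrease linearly with rank and sum to one for any β, and depend only on the ranking, not on the fitness values themselves.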
Local Tournament Selection
Local tournament selection [7] extracts k individuals
from the population with uniform probability but without re-insertion and makes them
play a “tournament”, which is won, in the deterministic case, by the fittest individual
among the participants. The tournament may be probabilistic as well, in which case
the probability for an individual to win it is generally proportional to its fitness.
Selective pressure increases with the number k of participants in a tournament:
for k = n, the population size, deterministic local tournament selection
degenerates into truncation selection with parameter τ = 1/n (see below), whereas
probabilistic local tournament selection degenerates into fitness proportionate selection.
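Both variants of the tournament fit in one short function. This is a sketch under stated assumptions: fitness values are positive (needed for the probabilistic case), and the function name and the sample population are hypothetical.

```python
import random

def tournament_select(population, fitness, k, rng, probabilistic=False):
    """Draw k distinct individuals and return the tournament winner."""
    contestants = rng.sample(population, k)          # without re-insertion
    if not probabilistic:
        return max(contestants, key=fitness)         # deterministic: fittest wins
    weights = [fitness(c) for c in contestants]      # assumes positive fitness values
    return rng.choices(contestants, weights=weights, k=1)[0]

rng = random.Random(1)
population = [3, 8, 1, 9, 4]
winner_full = tournament_select(population, lambda x: x, k=len(population), rng=rng)
```

With k equal to the population size the deterministic winner is always the best individual, which illustrates the degenerate case described above.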
Truncation Selection
Truncation selection [27] has its inspiration in the science of
breeding, a branch of applied statistics, and the main concepts on which it relies are
the correlation between parent and offspring and the inheritance coefficient.
As Mühlenbein and Schlierkamp-Voosen point out, there is no major difference
between breeding natural organisms and solutions to a problem. A minor difference
is that in the latter case it is possible to control the genetic operators of mutation and
recombination and modify them in order to get the greatest advantage out of them.
The most interesting aspect of selection for a breeder is the response to selection
R, defined as the difference between the average fitness in two subsequent generations:
\[ R_t = \bar f(x_{t+1}) - \bar f(x_t). \tag{5.27} \]
Breeders measure selection through the selective differential S, defined as the differ-
ence between the average fitness of the individuals selected for reproduction and the
average fitness of the entire population:
\[ S_t = \bar f(\bar x_t) - \bar f(x_t), \qquad \bar x_t \subseteq x_t. \tag{5.28} \]
Truncation selection consists in selecting for reproduction just the best individuals
and discarding the rest. The selective differential depends on the proportion τ of the
population that gets selected: the smaller τ , the greater St.
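The dependence of the selective differential on τ can be checked numerically. In this sketch the population is the list of integers 1 to 10 with fitness equal to the value itself; the function names and the rounding of τ times the population size are assumptions.

```python
def mean(values):
    return sum(values) / len(values)

def truncation_select(population, fitness, tau):
    """Keep the best fraction tau of the population (at least one individual)."""
    count = max(1, round(tau * len(population)))
    return sorted(population, key=fitness, reverse=True)[:count]

def selection_differential(population, fitness, tau):
    """Mean fitness of the selected parents minus mean fitness of the whole population."""
    selected = truncation_select(population, fitness, tau)
    return mean([fitness(x) for x in selected]) - mean([fitness(x) for x in population])

population = list(range(1, 11))          # fitness equals the value itself
identity = lambda x: x
s_half = selection_differential(population, identity, 0.5)    # 8.0 - 5.5 = 2.5
s_fifth = selection_differential(population, identity, 0.2)   # 9.5 - 5.5 = 4.0
```

Lowering τ from 0.5 to 0.2 raises the selective differential from 2.5 to 4.0, in line with the statement above.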
Classification of Selection Schemes
A possible taxonomy of selection schemes is
the following, whereby a selection scheme can be classified according to at least four
independent axes [4]:
• dynamic – static, depending on whether the selection probabilities depend on the
fitness values actually present in the population, and thus vary across generations:
fitness proportionate selection is thus dynamic, while linear ranking, deterministic
local tournament and truncation selection are static;
• preservative – extinctive, depending on whether it guarantees to every individual
a non-zero probability of being selected: thus truncation and deterministic local
tournament selection are extinctive, fitness proportionate selection is preservative
and linear ranking selection is preservative for β < 2 and extinctive for β = 2;
• elitist – pure, depending on whether it guarantees the survival of the best
individual unchanged into the next generation;
• generational – steady-state, depending on whether the set of parents is deter-
mined once and remains fixed until a new population of offspring has been pro-
duced, or the parents are extracted at different times and their offspring are in-
troduced into the population as they are produced.
5.7.2
Specialized Representations and Genetic Operators
The recent trend in the effective application of evolutionary algorithms to real world
problems is to abandon general encoding schemes like bit strings and rely on the fol-
lowing key ideas:
• use a data structure as close as possible to the natural representation suggested
by the object problem;
• write appropriate genetic operators as needed;
• if possible, ensure that all genotypes correspond to feasible solutions;
• if possible, ensure that genetic operators preserve feasibility.
At the same time, it is advisable to represent solutions in such a way that the
“genes” which encode them be as orthogonal as possible. Orthogonality here
means that the semantics of each gene depends only on its allele (or value) and is
independent of other genes. When this property does not hold, and there are interactions
among genes, especially interactions in which one gene suppresses the expression of
another, geneticists speak of epistasis.
Although it is difficult to formulate a general recipe that makes it possible, given
an object problem, to come up with the best encoding scheme for it, it is often use-
ful to compare one’s problem with others that have been successfully solved using
evolutionary algorithms in the scientific literature, hoping to find one that is roughly
similar.
However, it is possible to give a very coarse classification of problems which can
serve as a guide for the choice of the representations that are most likely to work well.
A partial attempt to list such big classes follows, along with suggestions for appropriate
encodings.
Pie Problems
Problems that involve finding a (constrained) weight assignment,
which can effectively be represented by a pie chart, may be referred to as pie prob-
lems. The solution can be encoded as a vector of integer or floating-point numbers and
the decoder can be written in such a way as to ensure that no constraint is violated.
For example, if we are given a finite amount of money and we are to find the best
way to invest it in N assets, we have the following constraints:
\[ \sum_{i=1}^{N} w_i = 1 \quad \text{and, for all } i,\ w_i \ge 0, \tag{5.29} \]
where $w_i$ is the relative weight of the amount invested in the ith asset with respect to
the total. These two constraints are easy to enforce when we encode solutions as an
unconstrained array of N integers between 0 and a positive constant $g_{\max}$, simply by
defining the decoding function $M : \Gamma \to [0, 1]^N$ as follows:
\[ w_i = \frac{\gamma_i}{\sum_{j=1}^{N} \gamma_j}, \tag{5.30} \]
where w = M (γ) and γi is the ith integer of the genotype. This is just an example,
but it suggests a general approach to handling constraints in “pie” problems.
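A decoder of this kind takes only a few lines. The sketch below normalizes each gene by the sum of all genes; the handling of the all-zero genotype, for which the normalization is undefined, is an assumption added for the example.

```python
def decode_weights(genotype):
    """Map unconstrained non-negative integers to weights that sum to 1."""
    total = sum(genotype)
    if total == 0:
        # The normalisation is undefined here; equal weights are an assumed fallback.
        return [1.0 / len(genotype)] * len(genotype)
    return [g / total for g in genotype]

weights = decode_weights([3, 0, 5, 2])   # 30 %, 0 %, 50 % and 20 % of the budget
```

Any mutation or recombination of the integer genotype still decodes to a feasible weight assignment, so the genetic operators need no knowledge of the constraints.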
Parameter Optimization Problems
These are problems that involve finding opti-
mal values for a set of parameters of a predetermined mathematical formula: engi-
neering design and numerical regression fall into this class of problems. Evolution
strategies (see Section 5.6.1) provide very effective techniques for dealing with this
kind of problem.
Permutation Problems
These are problems that involve finding a (possibly con-
strained) permutation of elements: this class comprises some famous problems from
operations research like the Traveling Salesman Problem, where an order in which
to visit a number of cities is sought. A wealth of encoding schemes have been
proposed for this kind of problem.
There are at least five sensible ways of encoding a permutation of N elements.
Suppose N = 9; we will identify the nine elements with numbers 1, 2, . . . , 9. For
instance, we want to represent permutation
1 − 2 − 4 − 3 − 8 − 5 − 9 − 6 − 7.
(5.31)
These are some possibilities:
• adjacency representation: the genotype is made up of N integers between 1 and
N included; a number j in position i indicates that j is the element that comes
after element i in the permutation; therefore, the permutation in equation (5.31)
would be encoded as (2, 4, 8, 3, 9, 7, 1, 5, 6);
• ordinal representation: the genotype is again made up of N integers, but here
we imagine starting with all N elements sorted in ascending order; a number j
in position i tells us to extract the jth element among the remaining N − i + 1
and use it as the ith element of the permutation; according to this scheme, the
permutation in equation (5.31) would be encoded as (1, 1, 2, 1, 4, 1, 3, 1, 1);
• path representation: here representation is direct: our sample permutation would
be simply encoded as (1, 2, 4, 3, 8, 5, 9, 6, 7);
• matrix representation: the genotype is an N × N matrix of binary digits; a 1 in
position (i, j) means that element i comes before element j in the permutation;
the permutation in Equation (5.31) would be encoded as
0 1 1 1 1 1 1 1 1
0 0 1 1 1 1 1 1 1
0 0 0 0 1 1 1 1 1
0 0 1 0 1 1 1 1 1
0 0 0 0 0 1 1 0 1
0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 1 1 1 0 1
0 0 0 0 0 1 1 0 0 ;
• sorting representation: the genotype is an array of N reals or integers, and the
permutation is obtained by sorting these numbers in increasing order; a possible
representation of our sample permutation would be
(−23,−6,2,0,19,32,85,11,25).
Note that according to this encoding scheme, there are many different genotypes
that encode for the same permutation.
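The encodings listed above can be checked against the sample permutation with a few decoding routines. The function names are hypothetical, and elements are numbered from 1 as in the text.

```python
def decode_adjacency(genes, start=1):
    """genes[i-1] is the element that follows element i; walk the cycle from start."""
    tour, current = [], start
    for _ in genes:
        tour.append(current)
        current = genes[current - 1]
    return tour

def decode_ordinal(genes):
    """genes[i] is the 1-based position to extract from the remaining sorted elements."""
    remaining = list(range(1, len(genes) + 1))
    return [remaining.pop(j - 1) for j in genes]

def decode_sorting(keys):
    """Element i carries keys[i-1]; the permutation lists elements by increasing key."""
    return sorted(range(1, len(keys) + 1), key=lambda e: keys[e - 1])

def encode_matrix(tour):
    """m[i-1][j-1] is 1 exactly when element i comes before element j."""
    pos = {e: p for p, e in enumerate(tour)}
    n = len(tour)
    return [[1 if pos[i] < pos[j] else 0 for j in range(1, n + 1)]
            for i in range(1, n + 1)]

target = [1, 2, 4, 3, 8, 5, 9, 6, 7]     # the sample permutation
from_adjacency = decode_adjacency([2, 4, 8, 3, 9, 7, 1, 5, 6])
from_ordinal = decode_ordinal([1, 1, 2, 1, 4, 1, 3, 1, 1])
from_sorting = decode_sorting([-23, -6, 2, 0, 19, 32, 85, 11, 25])
matrix = encode_matrix(target)
```

All three decoders return the sample permutation. Only the ordinal representation stays valid under ordinary one-point crossover, which is one reason it was proposed despite being hard to read.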
Mapping Problems
Problems that require finding a mapping from a set of inputs to
a set of outputs might be termed mapping problems: this class includes, among
others, symbolic regression, series prediction, system modeling, and control. Solutions to
these problems can be conveniently represented as mathematical formulas or simple
programs, for which genetic programming (see Section 5.5) offers a well-established
set of specific operators and techniques; alternative representations include neural
networks and (fuzzy) rule sets, in which case we fall into the class of problems involving
parameter optimization. Most of the rest of the present book will be dedicated to the
interaction of these techniques and to their applications.
Specialized Genetic Operators
For each encoding scheme like the ones briefly out-
lined above specific genetic operators have been defined, that operate at the semantic
level of the representation. In other words, these specialized genetic operators ma-
nipulate elements of a solution in a hopefully sensible way, preserving feasibility of
solutions and syntax of the representations.
A detailed treatment of the specialized genetic operators that have been proposed
in the literature goes beyond the scope and length of this course. What is important
for the reader to keep in mind is that the issues of representation and specialized ge-
netic operators are intimately tied and they are one of the frontiers of research on the
application of evolutionary algorithms to problems of practical relevance.
5.7.3
Handling Constraints
Three techniques have been proposed and used in order to deal with non-trivial con-
straints:
• the use of penalty functions;
• the use of decoders or repair algorithms;
• the design of appropriate encodings and specialized genetic operators.
Penalty functions are appropriate functions associated with each constraint in the
object problem, which measure the degree to which their constraint is violated by a
solution. As their name indicates, these functions are combined with the objective
function to calculate the fitness of an individual, providing a penalty for each con-
straint violation. While penalty functions are very general and their application is
straightforward, by using them one risks spending most of the time evaluating infea-
sible solutions, eventually sticking with the first feasible solution found, or finding an
infeasible solution that scores better than all feasible solutions.
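The last risk is easy to reproduce. In the sketch below, fitness is the objective minus a weighted sum of constraint violations; the toy problem, the additive form of the penalty and the two weights are assumptions chosen to make the effect visible.

```python
def penalized_fitness(objective, penalties, weight=1.0):
    """Build a fitness function: objective value minus weighted constraint violations."""
    def fitness(x):
        return objective(x) - weight * sum(p(x) for p in penalties)
    return fitness

# Hypothetical problem: maximise x + y subject to x + y <= 4 and x >= 0.
objective = lambda v: v[0] + v[1]
penalties = [
    lambda v: max(0.0, v[0] + v[1] - 4.0),   # violation of x + y <= 4
    lambda v: max(0.0, -v[0]),               # violation of x >= 0
]
weak = penalized_fitness(objective, penalties, weight=0.5)
strong = penalized_fitness(objective, penalties, weight=10.0)

feasible, infeasible = (2.0, 2.0), (5.0, 5.0)
```

With the weak penalty the infeasible point (5, 5) scores 7 and beats the feasible optimum (2, 2), which scores 4; with the strong penalty the order is reversed. Choosing the weight is therefore part of the problem.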
Decoders are used to translate a genotype into a feasible solution, even when the
natural decoding would produce an infeasible solution. Repair algorithms are applied
to new individuals generated by mutation or recombination to check whether they
violate any constraints and, if that is the case, to “repair” the damage, by transforming
their genotype in such a way that it corresponds to a feasible solution. Decoders and
repair algorithms are often computationally intensive and have to be tailored to the
particular application.
The design of appropriate encodings such that all possible genotypes encode for
feasible solutions and/or specialized genetic operators that preserve feasibility is more
of an art, requiring some insights into the structure of the object problem. It goes
without saying that when such an approach can be taken it results in a much better
performance than the other two and a greater elegance.
5.7.4
Hybrid Evolutionary Algorithms
Three methods whereby an evolutionary algorithm can be hybridized with available
heuristics can be distinguished:
• seed the population with solutions provided by some heuristics;
• use local optimization algorithms as genetic operators;
• encode parameters of a heuristic instead of a solution and then use the heuristic
to decode the genotype into its corresponding phenotype.
The first technique is also the simplest. In almost all domains in which an opti-
mization problem arises some sort of preliminary, sub-optimal solution is available,
either because an expert has developed it by hand or because some local optimization
algorithm has already been used. Care must be taken, however, when seeding evolu-
tionary algorithms with prefabricated solutions, in that the search process might thus
be misled into a local optimum and miss better but completely different solutions.
The main motivation behind the second technique is that, as was observed in
Section 5.2.4, as the population starts adapting to the problem, mutation most likely
disrupts good solutions rather than further improving them: improving mutations be-
come rarer and rarer. The case for using an available local optimization algorithm as
a mutation-type genetic operator is made by the need to improve the algorithm perfor-
mance by artificially making “good” mutations more frequent. If the locally improved
genotype is kept in the new population then the process is commonly called Lamarck-
ian mutation. It is also possible to just keep track of the new and better search point
(i.e. phenotype) without incorporating the change into the original genotype.
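The two ways of using a local optimizer as a mutation operator can be contrasted in code. This is only one possible reading of the distinction: the coordinate-wise hill climber, the step size and the function names are assumptions, and fitness is taken to be maximized.

```python
def hill_climb(x, fitness, step=0.1, iterations=50):
    """Simple local optimiser: change one coordinate by +/- step while fitness improves."""
    x = list(x)
    for _ in range(iterations):
        improved = False
        for k in range(len(x)):
            for delta in (step, -step):
                trial = x[:k] + [x[k] + delta] + x[k + 1:]
                if fitness(trial) > fitness(x):
                    x, improved = trial, True
        if not improved:
            break
    return x

def local_search_mutation(genotype, fitness, lamarckian=True):
    """Return (genotype, fitness value) after local optimisation."""
    improved = hill_climb(genotype, fitness)
    if lamarckian:
        return improved, fitness(improved)       # improved genotype replaces the original
    return list(genotype), fitness(improved)     # credit the gain, keep the genotype

peak_at_one = lambda v: -(v[0] - 1.0) ** 2
g_lam, f_lam = local_search_mutation([0.0], peak_at_one, lamarckian=True)
g_keep, f_keep = local_search_mutation([0.0], peak_at_one, lamarckian=False)
```

Both calls report the same fitness, but only the Lamarckian variant writes the improved point back into the genotype.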
The third technique is the most sophisticated. It actually implies a total change
of perspective – and representation. The search space of parameters of an available
heuristic is likely to be much smaller than the space of all solutions to the object
problem. The kind of heuristics that are typically available can be characterized as greedy
algorithms. These algorithms usually rely on weights assigned to the elements of a
solution in order to combine them in the locally best way. If this is the case, those weights
can be viewed as parameters of the heuristic: provided that they are assigned in an
appropriate way, the greedy algorithm will be able to construct the globally optimal
solution. The problem is then shifted to finding an appropriate parameter assignment,
rather than, directly, a ready-made solution to the object problem.
5.7.5
Co-evolution
In co-evolutionary algorithms two or more populations constantly evolve and interact.
This is in contrast with the customary evolutionary paradigm where a single population
evolves under the selection pressure of a given fixed fitness function that plays the role
of the environment. Indeed, in nature the environment of a given population is actually
comprised of the physical environment, which normally changes very slowly, and of
the other biological populations which are simultaneously adapting. Interactions be-
tween (evolving) populations are omnipresent; consider, for example, prey-predator or
host-parasite relationships. Under these conditions, it is best to think of evolution as
being a co-evolutionary process where changes in a certain species (population) influ-
ence the other ones, i.e., the environment is altered. Thus, a kind of “arms race” de-
velops in which evolutionary changes in one species trigger counter-adaptive changes
in other species, and vice versa.
These observations have only recently been exploited for creating more robust
artificial evolutionary algorithms. One advantage of co-evolutionary methods is that one
need not necessarily specify a global fitness function; only relative fitness is needed.
This can be useful since sometimes providing an adequate fitness function for a given
problem can be difficult or even impossible, for example, in complex games or when
the suite of test cases is very large.
The methods based on co-evolution can roughly be classified as being either com-
petitive or cooperative. Hillis presented one of the first successful competitive co-
evolutionary approaches to optimization problems [17]. The problem consisted in
evolving a sorting network for 16 integers involving a minimum number of exchanges.
Hillis used both a classical evolutionary approach and a co-evolutionary one. In the
latter there are two populations, the first consisting of sorting networks, the second of
sorting problems. These problems are permutations of integers that are to be used as
test cases by the sorting networks of the first population; this second population can
be viewed as an opportunistic or parasitic one. Both populations co-evolve on a two-
dimensional grid in parallel, with selection and mating being carried out locally. The
fitness for the sorting networks is defined as how well they sort the numbers of the
immediate neighbor parasites; the parasites are scored according to their capacity for
producing difficult problems for the sorting networks. In this way, the best networks
learn to sort increasingly difficult sets of numbers, thus avoiding the need for testing
all 16! possible permutations of numbers. The co-evolutionary approach outperformed
the standard one for this problem and Hillis was able to evolve a nearly optimal sort-
ing network with 61 exchanges, the best known hand-designed solutions having 60
exchanges.
Potter and De Jong [31,32] and Husbands [21] have proposed related co-
evolutionary models that are also based on species cooperation rather than competi-
tion. The methods differ in their implementation but they share the notion of multiple
species cooperating so as to attain a common objective. We briefly describe the first
approach, referring the reader to the original papers for more details.
In this model, given a problem to be solved, multiple populations are evolved in-
dependently by a standard genetic algorithm. Each subpopulation evolves a species of
individuals that represent (hopefully) useful components in the solution of the global
problem. Species are then combined into full solutions and evaluated on the common
global task. Credit is assigned to the species according to how well they collaborate to
solve the common problem. In this way, selection pressure favors cooperation rather
than competition between species, although within a single species evolution is still
competition-based.
Potter and De Jong obtained good results, often better than standard GA-based
ones, on simple function optimization problems and on neural network evolutionary
design.
Paredis [30] has also studied co-evolutionary dynamics recently, both in the com-
petitive predator-prey setting as well as using a symbiotic approach, which is co-
operative in nature, since success on one side improves chances of survival of the
other party.
Another highly parallel, local, co-evolutionary algorithm has recently been used
by Sipper for evolving non-uniform cellular automata to perform computational tasks.
This model belongs to the cooperative class of co-evolutionary algorithms, since the
individual units must work in unison to attain a global goal. The methodology, called
the cellular programming method, is described in detail in [40].
Acknowledgment.
Part of the material presented here has been contributed by my
colleague A. Tettamanzi.
5.8
Bibliography
[1] E. H. L. Aarts, A. E. Eiben, and K. M. van Hee. A General Theory of Genetic
Algorithms. Computing Science Notes. Eindhoven University of Technology,
Eindhoven, 1989.
[2] P.J. Angeline and K.E. Kinnear Jr. (Eds.). Advances in Genetic Programming 2.
The MIT Press, Cambridge, Massachusetts, 1996.
[3] T. Bäck. Evolutionary algorithms in theory and practice. Oxford University
Press, Oxford, 1996.
[4] Th. Bäck and F. Hoffmeister. Extended selection mechanisms in genetic algo-
rithms. In R. K. Belew and L. B. Booker, editors, Proceedings of the Fourth
International Conference on Genetic Algorithms, pages 92–99, San Mateo, CA,
1991. Morgan Kaufmann.
[5] J. Baker. Adaptive selection methods for genetic algorithms. In J. J. Grefenstette,
editor, Proceedings of the First International Conference on Genetic Algorithms,
pages 101–111, Hillsdale, NJ, 1985. Lawrence Erlbaum Associates.
[6] W. Banzhaf, P. Nordin, R. E. Keller, and F. D. Francone. Genetic programming,
An Introduction. Morgan Kaufmann, San Francisco CA, 1998.
[7] A. Brindle. Genetic algorithms for function optimization. Technical Report TR81-2,
Department of Computer Science, University of Alberta, Edmonton, 1981.
[8] C. Darwin. The Origin of Species. John Murray, London, 1859.
[9] T. E. Davis and J. C. Principe. A simulated annealing like convergence theory
for the simple genetic algorithm. In R. K. Belew and L. B. Booker, editors, Pro-
ceedings of the Fourth International Conference on Genetic Algorithms, pages
174–181, San Mateo, CA, 1991. Morgan Kaufmann.
[10] R. Dawkins. The Blind Watchmaker. W.W. Norton and Company, 1986.
[11] K.E. Kinnear Jr. (Ed.). Advances in Genetic Programming. The MIT Press,
Cambridge, Massachusetts, 1994.
[12] D. B. Fogel. Evolving Artificial Intelligence. PhD thesis, University of California,
San Diego, 1992.
[13] L. J. Fogel. On the Organization of Intellect. PhD thesis, University of California,
Los Angeles, 1964.
[14] L. J. Fogel, A. J. Owens, and M. J. Walsh. Artificial Intelligence through Simu-
lated Evolution. John Wiley & Sons, New York, 1966.
[15] D. E. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learn-
ing. Addison-Wesley, 1989.
[16] D. E. Goldberg. Messy genetic algorithms: Motivation, analysis and first results.
Complex Systems, 3:493–530, 1989.
[17] W. D. Hillis. Co-evolving parasites improve simulated evolution as an optimiza-
tion procedure. In C. G. Langton, C. Taylor, J. D. Farmer, and S. Rasmussen,
editors, Artificial Life II, volume X of SFI Studies in the Sciences of Complexity,
pages 313–324, Redwood City, CA, 1992. Addison-Wesley.
[18] J. H. Holland. Adaptation in Natural and Artificial Systems. The University of
Michigan Press, Ann Arbor, Michigan, 1975.
[19] J. H. Holland. Adaptation in Natural and Artificial Systems. The MIT Press,
Cambridge, Massachusetts, second edition, 1992.
[20] J. E. Hopcroft and J. D. Ullman. Formal Languages and Their Relation to Au-
tomata. Addison-Wesley series in Computer Science and Information Process-
ing. Addison-Wesley, Reading, MA, 1969.
[21] P. Husbands. An ecosystem model for integrated production planning. Journal
of Computer Integrated Manufacturing, 6:74–86, 1993.
[22] J. R. Koza. Genetic Programming. The MIT Press, Cambridge, Massachusetts,
1992.
[23] J. R. Koza. Genetic Programming II. The MIT Press, Cambridge, Massachusetts,
1994.
[24] C. G. Langton, C. Taylor, J. D. Farmer, and S. Rasmussen, editors. Artificial
Life II, volume X of SFI Studies in the Sciences of Complexity. Addison-Wesley,
Redwood City, CA, 1992.
[25] Z. Michalewicz. Genetic Algorithms + Data Structures = Evolution Programs.
Springer-Verlag, Heidelberg, third edition, 1996.
[26] T. M. Mitchell. Machine Learning. McGraw-Hill, New York, 1997.
[27] H. Mühlenbein and D. Schlierkamp-Voosen. The science of breeding and its
application to the breeder genetic algorithm (bga). Evolutionary Computation,
1(4):335–360, Winter 1993.
[28] H. Mühlenbein, M. Schomisch, and J. Born. The parallel genetic algorithm as a
function optimizer. Parallel Computing, 17:619–632, 1991.
[29] C. H. Papadimitriou and K. Steiglitz. Combinatorial Optimization, Algorithms
and Complexity. Prentice-Hall, Englewood Cliffs, 1982.
[30] J. Paredis. Coevolutionary computation. Artificial Life, 2(4):355–375, 1995.
[31] M. Potter and K. De Jong. A cooperative coevolutionary approach to function
optimization. In Procs. of the Third Conference on Parallel Problem Solving from
Nature, Y. Davidor and H.-P. Schwefel (Eds.), pages 249–257. Lecture Notes in
Computer Science Vol. 866, Springer-Verlag, 1994.
[32] M. Potter and K. De Jong. Evolving neural networks with collaborative species.
In Procs. of the 1995 Summer Computer Simulation Conference, pages 340–345.
The Society for Computer Simulation, Ottawa, Canada, 1995.
[33] I. Rechenberg. Evolutionsstrategie: Optimierung technischer Systeme nach
Prinzipien der biologischen Evolution. Frommann-Holzboog Verlag, Stuttgart,
1973.
[34] J. Rosca and D. Ballard. Discovery of subroutines in genetic programming. In P.J.
Angeline and K.E. Kinnear Jr., editors, Advances in Genetic Programming
2, pages 177–201. The MIT Press, Cambridge, Massachusetts, 1996.
[35] G. Rudolph. Parallel approaches to stochastic global optimization. In W. Joosen
and E. Milgrom, editors, Parallel Computing: from Theory to sound Practice.
Proceedings of the European Workshop on Parallel Computing, pages 256–267,
Amsterdam, 1992. IOS Press.
[36] A. L. Samuel. Some studies in machine learning using the game of checkers. In
E. A. Feigenbaum and J. Feldman, editors, Computers and Thought, New York,
1959. McGraw-Hill.
[37] H.-P. Schwefel. Kybernetische Evolution als Strategie der experimentellen
Forschung in der Strömungstechnik. Master’s thesis, Technische Universität
Berlin, Berlin, 1965.
[38] H.-P. Schwefel. Numerical optimization of computer models. Wiley, Chichester,
New York, 1981.
[39] H.-P. Schwefel. Collective phenomena in evolutionary systems. In Preprints
of the 31st Annual Meeting of the International Society for General System Re-
search, Budapest, 1987.
[40] M. Sipper. Evolution of Parallel Cellular Machines: The Cellular Programming
Approach. Springer-Verlag, Heidelberg, 1997.
[41] S. W. Wilson. Classifier systems based on accuracy. Evolutionary Computation,
3(2):149–175, 1995.
[42] B. Zhang and H. Mühlenbein. Balancing accuracy and parsimony in genetic
programming. Evolutionary Computation, 3:17–38, 1995.
Chapter 6
Artificial Neural Networks
6.1 Introduction
The functioning of the brain has always fascinated people. The human brain, and
also the brain of some animals, is indeed capable of astonishing achievements such as
remembering, recognizing patterns, and associating, among many others. The way in
which these tasks are performed appears to be quite different in nature from standard
computation as we know it, that is, in the Turing sense [18]. Indeed, the brain is a
massively parallel, highly connected assemblage of an astronomical number of slow
processing units that collectively work on these difficult tasks and allow us to function
smoothly and effortlessly. These units or cells are called neurons. They are of several
different types, and they work in an analog way by propagating electrical currents
of chemical origin along connections. The details of how neurons function are very
intricate and need not concern us here, but the main points are simple and worth some
study. The neuron has three main components: the soma, the dendrites, and the axon
(Figure 6.1).
The soma is the cell body containing the cell’s nucleus. The axon is a single fiber
extending from the cell body and then branching out progressively. The dendrites are
shorter and they are arranged in a tree-like fashion around the cell body. The axon
branches terminate at the surface of other neurons or on the dendrites. These contact
points are called synapses. Neurons interact through electrical pulses that propagate
along the axons and are transmitted to other neurons via the synaptic connections.
If enough incoming pulses arrive at a given neuron in a certain interval of time, the
neuron “fires”, transmitting a new electrical pulse down its axon. It is this compar-
atively slow but massively parallel and fault-tolerant signal propagation phenomenon
that accounts for the unique properties of the nervous system. The brain is also a
low-consumption system, dissipating several orders of magnitude less power than any
known digital technology for comparably elementary operations.
The operational properties of the brain arise from several different processes, on
different time scales. First of all, the brain has evolved over millions of years and its
main structures are coded in the genome. However, given that the human brain has
some $10^{11}$ neurons as well as some $10^{15}$ connections, the genetic code cannot specify
everything from the start. Therefore, the brain shows “plasticity”, that is, it continues
to modify its connectivity patterns and to adapt during its whole life, as a function
of the external stimuli, although plasticity diminishes with age. Especially impressive
is the neuron “fanout”, that is, the number of direct connections a neuron has with other
neurons: this number is typically between $10^{3}$ and $10^{5}$. Just for
comparison, it is of the order of 10 for electronic circuits. No doubt, the
highly interconnected nature of the brain plays a major role in the way it functions.
The self-organizing and self-adapting capabilities of the brain are also evident when
some area of the brain is impaired. This also points to another remarkable
property of the brain, i.e., its capability to tolerate faults and incomplete information.
This fault-tolerance is difficult to obtain in artificial computational systems and the
brain is a precious example to follow in this respect.
Figure 6.1: A highly stylized form of a biological neuron with its main components
highlighted. The arrows show the direction of the electrical signal flow.
[Labels in the figure: dendrite, axon, nucleus, synapse.]
Artificial neural networks (ANN) have their origin in the attempt to simulate by
mathematical means an idealized form of the elementary processing units in the brain
and of their interconnections, signal processing, and self-organization capabilities. The
emphasis here is on these models seen as computing systems of a different kind. There
also exist computational models of real neurons that are widely used in what is called
“computational neuroscience”. Computational neuroscience tries to build models that
approximate the behavior of actual neurons so as to be able to simulate relatively large
pieces of the nervous system. This is useful to complement electrophysiological and
other in vivo experiments that are difficult and lengthy to perform. But although these
models are very interesting and useful, they fall outside the scope of our subject mat-
ter. Instead, we intend to present here the behavior and properties of highly simplified
mathematical abstractions of neurons and of their interactions. We will first outline
the simplest formal models of neurons, study their properties and limitations as single
units, and then investigate the behavior of networks of such simple processing ele-
ments. We will see that neural networks solve problems in very different ways from
those we are accustomed to in classical computer science insofar as neural networks
approximate answers rather than deterministically compute them. In fact, the usual
way to solve a given problem on a digital computer is to provide a series of precise
instructions to be followed by the machine, i.e., a “program”. On the other hand, a
neural network may be viewed as an adaptive system that progressively self-organizes
in order to approximate the solution. In other words, neural networks free the problem
solver from the need to accurately and unambiguously specify the steps towards the
solution. This problem-solving philosophy can be either an advantage or a drawback:
it all depends on the application and on the objectives.
Most importantly, we will also see that neural networks have the ability to progres-
sively improve their performance on a given task by somehow “learning” how to do
the task better, if given some way of evaluating their current performance, a process in
which programming is replaced by learning through examples.
Artificial neural networks are at their best for problems in which there is little or
incomplete understanding, so that building a faithful mathematical model is difficult or
even impossible, but abundant data is available, since they are data-driven. Problems
of this kind are very common in pattern classification, non-linear function approxima-
tion and system modeling, control, associative memory, and system prediction among
others. ANNs of the kind studied here are related to traditional mathematical and
statistical models, as we will see below.
Although here we provide a reasonably detailed account of the most common ANN
types and give an adequate background for the rest of the book, the account is far from
exhaustive. The subject of ANNs has grown into quite a large field of study and more
complete descriptions can be found in a number of books. Among those we can
recommend Gurney’s book [11] at an elementary level and Hassoun’s [12] for a deeper
and more complete presentation.
6.2 Artificial Neurons
Before studying assemblages of artificial neurons it is sensible to describe the
properties of isolated units. The first model of a formal neuron was proposed by McCulloch
and Pitts in their landmark 1943 paper [17] and is called a Threshold Logic Unit (TLU)
or a Linear Threshold Gate. Figure 6.2 gives a graphical representation of such a unit
with n real-valued inputs $x_i$, each input being associated with a parameter $w_i$. The
parameter $w_i$ is also known as a “synaptic weight”, or simply “weight”, in analogy with
biological synapses, the functional contacts between two nerve cells. A TLU performs
a weighted sum operation followed by a non-linear thresholding operation, or step
function, such that if the value of the sum is greater than or equal to the threshold θ then
the output y of the unit is 1, otherwise it is 0:
$$y(x) = \begin{cases} 1 & \text{if } \sum_{i=1}^{n} w_i x_i \ge \theta, \\ 0 & \text{otherwise} \end{cases} \qquad (6.1)$$
Figure 6.2: The thresholding logic unit of McCulloch and Pitts.
[Diagram labels: input ($x_1, x_2, \ldots, x_n$), synaptic weight ($w_1, w_2, \ldots, w_n$), summation node Σ, threshold θ, output y.]
In other words, the neuron will “fire”, that is, it will emit an instantaneous “1”
signal, if the threshold is reached or exceeded; otherwise, it will do nothing. The weighted sum
of equation 6.1, also called the neuron activation, can be expressed more concisely as
the scalar product $w \cdot x$ of the weight vector w and the input vector x.
Figure 6.3: Linear separations of input space corresponding to the AND and OR logic
functions.
[Two panels, AND and OR, each showing the binary inputs in the ($x_1$, $x_2$) plane with a separating straight line.]
Thus, a TLU performs a mapping $\mathbb{R}^n \to \{0, 1\}$ from the reals to Boolean values.
If the inputs are binary, that is, if the n-component vector $x \in \{0, 1\}^n$, then a TLU
becomes a Boolean function. It is then natural to ask about the computational
power of TLUs: can a TLU realize all the $2^{2^n}$ possible Boolean functions of n
inputs? The answer is that a single TLU can only realize a subset of those
functions (for more details see [12]). The functions that can be realized by a TLU
are called linearly separable, which means that they categorize the inputs into
two classes that are separated by a hyperplane in the n-dimensional input space.
For instance, among the familiar Boolean functions with two inputs, AND and OR are
linearly separable, whereas Boolean equality (XNOR) and inequality (also called XOR or
exclusive OR) are not. This can be seen pictorially in Figure 6.3 and Figure 6.4. In
two dimensions a hyperplane becomes a straight line, and there is no way of drawing
a single line that separates the two classes in the XOR case. However, it can
also be seen in this last case that a non-linear separator or a pair of straight lines can
perform the classification task.
XOR truth table:
x1  x2  y
0   0   0
0   1   1
1   0   1
1   1   0
Figure 6.4: XOR implementation with (a) a nonlinear separation of the input space,
and (b) two linear separations.
[Panels (a) and (b) show the four inputs in the ($x_1$, $x_2$) plane, marked according to the output class y = 0 or y = 1.]
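To make linear separability concrete, the following Python sketch implements the TLU of Equation 6.1 and scans a coarse grid of weights and thresholds for two-input Boolean functions. The function names, the particular weights chosen for AND and OR, and the grid are illustrative assumptions that do not come from the text. The scan only illustrates, and does not prove, that no parameter setting produces XOR.

```python
from itertools import product

def tlu(weights, theta, inputs):
    """Threshold logic unit of Equation 6.1: 1 if w . x >= theta, else 0."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    return 1 if activation >= theta else 0

def truth_table(weights, theta):
    """Outputs of a two-input TLU on (0,0), (0,1), (1,0), (1,1), in that order."""
    return tuple(tlu(weights, theta, x) for x in product((0, 1), repeat=2))

AND = (0, 0, 0, 1)
OR = (0, 1, 1, 1)
XOR = (0, 1, 1, 0)

# Hypothetical parameter choices: any line separating the two classes would do.
and_table = truth_table((1.0, 1.0), 1.5)
or_table = truth_table((1.0, 1.0), 0.5)

# Brute-force scan over a coarse grid of weights and thresholds in [-3, 3].
grid = [k / 4 for k in range(-12, 13)]
realized = {truth_table((w1, w2), th) for w1 in grid for w2 in grid for th in grid}

print("AND realized:", and_table == AND)
print("OR realized:", or_table == OR)
print("XOR found on the grid:", XOR in realized)
```

The same scan also misses the equality function (1, 0, 0, 1), in agreement with the statement that neither equality nor inequality is linearly separable.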
6.3 Networks of Artificial Neurons
As we have seen, artificial neurons in isolation are not very impressive. However, the
living example of the nervous system shows that large assemblages of simple cells
give rise to astonishingly complex emergent behaviour. Likewise, it is only when
many artificial simple units are brought together to form a network that useful new
computational capabilities appear. In this section we will study such artificial neural
networks and some of their properties.
Although we have just seen that a single TLU is unable to represent all the Boolean
functions, it can nevertheless realize the NAND and NOR gates. Since NAND (and
NOR) are universal logic gates, it follows that TLUs are universal building blocks: any Boolean
function can be realized by a suitable network of TLUs. This result had already been
recognized in the classic paper by McCulloch and Pitts [17]. There exist many kinds of
neural network architectures, some of which are depicted in Figure 6.7. Among these
the feedforward multilayer networks have received much attention due to their relative
simplicity and computational capabilities. In feedforward networks an input pattern
is transformed to an output pattern through a finite series of layers of nodes, some
of which may have no connections to the input or output, and there are no feedback
signals i.e., signals only travel forward (see Figure 6.7(b)).
For instance, it can be shown that the feedforward two-layer network of Figure 6.5
solves the XOR classification problem by implementing two linear decision
boundaries, as depicted in Figure 6.4. The nodes of the intermediate layers are called hidden nodes
because they are not directly connected to the external world through the inputs and
the outputs.
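Figure 6.5: Multilayer feedforward network.
[Diagram: inputs $x_1$ and $x_2$, a hidden layer of two threshold units, and one output unit y; connection weights of 1 and −1 and thresholds of 0.5 are labeled on the diagram.]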
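The following Python sketch shows how such a two-layer network of threshold units can compute XOR. The exact wiring is an assumption: it is one arrangement consistent with the weights (1 and −1) and thresholds (0.5) that label Figure 6.5, and the function names are ours.

```python
def step(activation, theta):
    """Threshold unit: 1 if the activation reaches theta, else 0."""
    return 1 if activation >= theta else 0

def xor_net(x1, x2):
    # Assumed wiring: each hidden unit implements one linear decision boundary.
    h1 = step(1 * x1 + (-1) * x2, 0.5)   # fires only for input (1, 0)
    h2 = step((-1) * x1 + 1 * x2, 0.5)   # fires only for input (0, 1)
    # The output unit fires if either hidden unit fires (an OR of h1 and h2).
    return step(1 * h1 + 1 * h2, 0.5)

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, xor_net(x1, x2))
```

Each hidden unit carves out one half-plane, and the output unit combines the two, which is the "two linear separations" solution of Figure 6.4(b).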
Until now we have been mainly talking about Boolean functions as the use of a
threshold function limits the output to be 0 or 1. However, it is possible to make the
artificial neuron emit a continuous signal. This can be obtained by using a contin-
uous transfer function following the weighted sum instead of the discontinuous step
function, as represented in the following equation:
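$$y(x) = g\Big(\sum_{i=0}^{n} w_i x_i\Big), \qquad (6.2)$$
where g is called the activation function (the sum now starts at i = 0 so that a bias
weight $w_0$ on a constant input $x_0$ can take over the role of the threshold θ). A
convenient functional form for g is the so-called sigmoid, which is graphically
represented in Figure 6.6 and whose analytic form is:
$$y(x) = \frac{1}{1 + e^{-(bx - c)}} \qquad (6.3)$$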
Other commonly used continuous transfer functions are linear, hyperbolic tangent,
and gaussian. The sigmoid is suitable because it is differentiable and it saturates, i.e.
it tends asymptotically to 0 and 1 at the extremes. By varying the parameter b of the
exponential one can control the steepness of the curve, and in the limit of large b it reduces to
the step function of a TLU. Note that still other functions can be used provided that they are continuous,
monotonically increasing, and take values between 0 and 1.
Figure 6.6: A typical sigmoid activation function.
[Plot: horizontal axis x from −4 to 4, vertical axis from 0 to 1.]
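As a small numerical illustration of Equation 6.3, the Python sketch below evaluates the sigmoid for increasing values of the steepness parameter b. The default values b = 1 and c = 0 and the sample points are arbitrary choices made for this example.

```python
import math

def sigmoid(x, b=1.0, c=0.0):
    """Equation 6.3: 1 / (1 + exp(-(b*x - c)))."""
    return 1.0 / (1.0 + math.exp(-(b * x - c)))

# As b grows, the values on either side of x = c / b approach 0 and 1,
# so the curve looks more and more like the step function of a TLU.
for b in (1.0, 5.0, 50.0):
    values = [round(sigmoid(x, b), 3) for x in (-1.0, -0.1, 0.1, 1.0)]
    print("b =", b, values)
```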
Networks of neurons with real-valued inputs and a sigmoid transfer function can
be used to approximate mathematical functions. In fact, it is possible to think of a
network of such units as implementing a mathematical function of its inputs. This
allows the parameterization of an unknown real-valued function which is very useful
in situations where the exact form of the functional relation (if any) is unknown, such
as time-series prediction or system identification. More generally, let us suppose that
we want to approximate a continuous real-valued function $F(x_1, x_2, \ldots, x_m)$. A
general result (see for instance [12] and references therein) says that any such
function can be approximated to any desired accuracy, over a bounded region of the
input space, by a feed-forward network with a single hidden layer. That is, for any
$\epsilon > 0$ there exist real constants $\alpha_k$, $w_{kj}$ and a
monotone-increasing continuous function g (e.g. the sigmoid function) such that
$$f(x_1, x_2, \ldots, x_m) = \sum_{k=0}^{p} \alpha_k \, g\Big(\sum_{j=0}^{m} w_{kj} v_j\Big) \qquad (6.4)$$
satisfies
$$| f(x_1, x_2, \ldots, x_m) - F(x_1, x_2, \ldots, x_m) | < \epsilon, \qquad (6.5)$$
where the $v_j$ denote the values fed forward by the input units, and m and p are the
number of units in the input and hidden
layers respectively. Being only an existence result, the theorem leaves unspecified the
optimum number of hidden layers and the number of units in the hidden layers in any
given case. A related result is that single-hidden-layer nets with sigmoidal activation
units and a single linear output node are universal classifiers, i.e. they can correctly
assign any given input pattern to one of a finite number of distinct classes. In practice,
networks used for classification usually employ several output units, each of which
represents a unique class. This setting is more natural, as it avoids the use of a large
number of hidden units.
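To give a feeling for how a sum of sigmoids as in Equation 6.4 can shape a function, the Python sketch below evaluates a network with one input, two hidden sigmoid units, and a linear output. The weights are hand-picked assumptions rather than learned values, and the bias of each hidden unit is written explicitly instead of through a constant input $v_0$. With these choices the network produces a "bump" that is close to 1 on the interval (−1, 1) and close to 0 far outside it.

```python
import math

def g(z):
    """Sigmoid activation used by the hidden units."""
    return 1.0 / (1.0 + math.exp(-z))

def network(x, alphas, weights, biases):
    """One input, one hidden layer, linear output: sum_k alpha_k * g(w_k * x + b_k)."""
    return sum(a * g(w * x + b) for a, w, b in zip(alphas, weights, biases))

# Hypothetical hand-set parameters: the difference of two shifted sigmoids.
alphas = [1.0, -1.0]
weights = [20.0, 20.0]
biases = [20.0, -20.0]

for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(x, round(network(x, alphas, weights, biases), 3))
```

Adding more hidden units of this kind, with different positions and heights, lets the output approximate more complicated functions, which is the intuition behind the existence result quoted above.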
Thus, ANNs possess interesting computational capabilities. But the remaining
problem is that of designing the networks in such a way that a given computational
task can be realized as efficiently and economically as possible. How should we go
about designing a neural network? How many units and layers should we use? How
to choose the connections and their weights? There are no easy answers to these
questions and no standard design recipes exist for designing neural networks for a
given problem. In general, neural network researchers prefer to let the network design
itself, so to speak, rather than imposing a pre-existing architecture. This is indeed a
very peculiar way of tackling a computational problem and it will be dealt with in the
following section.
Figure 6.7: Typical neural network topologies.
[Panels: a) single-layer perceptron; b) multi-layer perceptron; c) Hopfield network; d) Elman recurrent network; e) competitive networks; f) self-organizing maps.]
6.4 Hopfield Networks
The type of network model designed by Hopfield (see [15] and Figure 6.7 c) played
an important role in the “resurgence” of artificial neural networks in the early eighties.
Hopfield networks can be used as associative memories and for optimization, both of
which are important applications. Hopfield had the insight of considering the evolu-
tion of the network as a dynamical system. In general terms, the time evolution of
a dynamical system can be described as a trajectory in a state space through a state
vector x(t) containing, for instance, the positions and the velocities of each system
component through time. If the dynamics has stable limit points x1, x2, . . ., then these
locally stable points may be considered as an information content or storage implicitly
described by the system dynamics. Since they are stable, they can be recovered if the
system starts its dynamics sufficiently close to one of those points. Hopfield showed
that his networks had a similar property and, moreover, that this could be made use of
in neural computation.
As in McCulloch-Pitts networks, neurons in a Hopfield net produce a binary output
when a given threshold of activation is reached (see Equation 6.1). However, the net
is completely connected and there is no distinction between input, internal or output
units: all units are the same. Moreover, the net is constrained by the fact that the con-
nection weights are symmetric (wij = wji), and units are not connected to themselves
(wii = 0). Furthermore, in contrast with McCulloch-Pitts networks, units are updated
asynchronously, one at a time in random order; the system is recurrent since neuron
outputs are reinjected into the net. For technical reasons, and without loss of gener-
ality, it is customary to choose {−1, +1} instead of {0, 1} as neural states, and the
threshold θ is taken to be 0.
The network evolves in time from a given initial state according to the following
simple dynamics: at each time step choose one unit i at random. Compute the sum of
the weighted inputs of all the other units impinging on the given cell i:
$$S = \sum_{j} w_{ij} s_j$$
where $s_j$ is the current state of unit j. Now, if S ≥ 0 then the state $s_i$ of unit i
takes the value +1, otherwise it takes the value −1. Then
another unit is chosen at random and the process is repeated until a stable state
is reached, i.e., until no unit can change state.
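The update dynamics just described can be sketched in Python as follows. The three-unit weight matrix is a made-up example (symmetric, with zero diagonal), and the function names are ours. As a small variation on the text, the units are visited in a freshly shuffled order on each sweep, which makes it easy to detect that no unit can change state any more.

```python
import random

def local_field(weights, state, i):
    """Weighted sum S of the states of all other units feeding unit i."""
    return sum(weights[i][j] * state[j] for j in range(len(state)) if j != i)

def run_until_stable(weights, state, rng):
    """Asynchronous updates in random order until no unit changes state."""
    state = list(state)
    while True:
        changed = False
        order = list(range(len(state)))
        rng.shuffle(order)
        for i in order:
            new_value = 1 if local_field(weights, state, i) >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            return state

# Hypothetical 3-unit net: symmetric weights, no self-connections.
W = [[0, 1, -1],
     [1, 0, -1],
     [-1, -1, 0]]

final_state = run_until_stable(W, [1, -1, -1], random.Random(0))
print(final_state)
```

For this small matrix the only stable states are (+1, +1, −1) and (−1, −1, +1); which of the two is reached from a given start can depend on the order in which the units are visited.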
Reasoning that his idealized neural net is formally analogous to a paradigmatic
physical model called the Ising spin model, Hopfield identified the quantity
$$E = -\frac{1}{2} \sum_{i,j} w_{ij} s_i s_j \qquad (6.6)$$
with the “energy” function of the net [15]. Of course, this is only a formal analogy
since no energy is implied in the physical sense. Here, as usual, wij are the connec-
tion weights between neurons i and j, while si and sj are neuron states belonging to
{−1,+1}. However, this analogy allows the net dynamics to be seen as a trajectory in
state space such that the energy is minimized. In fact, Hopfield was able to show that
when a formal neuron updates its state, the energy E either stays constant or dimin-
ishes, i.e., ∆E ≤ 0 (the details of the proof can be found in [14]). Since the energy has
a lower bound, this means that the net dynamics will always lead to a local minimum
i.e., one of the stable points of the system.
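The fact that the energy never increases can be checked in a few lines. The derivation below is only a sketch and not the proof given in [14]; it assumes symmetric weights with zero diagonal, and it writes $s_i'$ for the new state of the updated unit i and S for the weighted input sum of that unit, notation introduced here for convenience.

```latex
% Terms of E (Equation 6.6) that contain s_i, using w_{ij} = w_{ji} and w_{ii} = 0:
E = -s_i \sum_{j} w_{ij} s_j + (\text{terms independent of } s_i)

% Updating unit i from s_i to s_i' while all other units stay fixed:
\Delta E = -(s_i' - s_i) \sum_{j} w_{ij} s_j = -(s_i' - s_i)\, S

% If S \ge 0 then s_i' = +1, so s_i' - s_i \ge 0 and \Delta E \le 0.
% If S < 0  then s_i' = -1, so s_i' - s_i \le 0 and \Delta E \le 0.
```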
Having recognized that the system dynamics is such that it always leads to a stable
equilibrium point, the reason why Hopfield nets can act as associative memories now
becomes clear. One just has to find a way of encoding patterns to be remembered
as the net stable points, or attractors, and this can be done by appropriately setting
the net’s weight matrix W = (wij) (see [14] for details). Even more interesting, if
a slightly corrupted binary pattern is presented to the net (by setting the initial states
of the neurons to those of the pattern), the net dynamics will converge to the corre-
sponding stored pattern, provided that the initial faulty pattern falls into the basin of
attraction of the corresponding energy minimum.
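The retrieval of a stored pattern from a corrupted version can be tried out with the Python sketch below. The text defers the choice of the weight matrix to [14]; as an assumption, the sketch uses the common Hebbian outer-product prescription, in which each weight is the sum over stored patterns of the product of the two unit states, divided by N. The two stored patterns, the helper names, and the limit on the number of sweeps are likewise illustrative choices.

```python
import random

def hebbian_weights(patterns):
    """Assumed storage rule: w_ij = (1/N) * sum over patterns of p_i * p_j, with w_ii = 0."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w

def energy(w, s):
    """Energy of Equation 6.6."""
    n = len(s)
    return -0.5 * sum(w[i][j] * s[i] * s[j] for i in range(n) for j in range(n))

def recall(w, start, rng, sweeps=20):
    """Asynchronous updates; records the energy after every state change."""
    s = list(start)
    n = len(s)
    energies = [energy(w, s)]
    for _ in range(sweeps):
        changed = False
        for i in rng.sample(range(n), n):
            field = sum(w[i][j] * s[j] for j in range(n))
            new_value = 1 if field >= 0 else -1
            if new_value != s[i]:
                s[i] = new_value
                changed = True
                energies.append(energy(w, s))
        if not changed:
            break
    return s, energies

p1 = [1, 1, 1, 1, -1, -1, -1, -1]
p2 = [1, -1, 1, -1, 1, -1, 1, -1]
W = hebbian_weights([p1, p2])

corrupted = [-1] + p1[1:]          # p1 with its first component flipped
recalled, energies = recall(W, corrupted, random.Random(1))
print(recalled == p1, energies)
```

With only two patterns stored on eight units, the corrupted input lies well inside the basin of attraction of the first pattern, and the recorded energies form a non-increasing sequence.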
How many patterns can be stored in the net? Hopfield’s analysis shows that, if N
is the number of neurons in the net, then between about 0.10N and 0.15N patterns can be stored
safely. As the number of patterns becomes higher, errors start to appear i.e., sometimes
a wrong pattern is retrieved. But the degradation is not “graceful”: as more and more
patterns are stored it is found that the network suddenly experiences a kind of phase
transition and the retrieved patterns become random, making the memory useless.
Our presentation of Hopfield networks has been limited to the basics. The in-
terested reader will find a complete and rigorous statistical mechanical discussion,
including extensions such as Boltzmann machines, in reference [14].
6.5 Neural Learning
We saw that one of the distinctive features of artificial neural networks is the absence
of a pre-defined set of instructions to follow. Instead, neural networks represent a
different conceptual approach to computation, one that progressively improves an
input-output relationship by a process in many ways analogous to statistical
function fitting and extrapolation. In this way, the network adapts itself to
perform a given task. In the neural network literature such a process goes under the
name of “learning” or “training”. Terms such as “learning” are perhaps misleading,
since they may suggest that a phenomenon similar in complexity to human or animal
learning is taking place. Nothing of the sort actually occurs when a network
is trained to perform a given task: the process is entirely of a formal nature and
can be described by well-defined mathematical algorithms. Nevertheless, network and
machine learning is such a widespread concept that the term will be used here in this
restricted sense.
In its most common meaning, learning by a neural network implies an adaptive
procedure in which the weights of the network are incrementally modified so as to improve
a prespecified performance criterion, i.e., an objective function, over time. Such
a procedure is called a learning rule or learning algorithm and the adaptation may take
place in a supervised or an unsupervised way. In supervised learning the net is presented
with a set of known input/output pattern pairs: this set of values is called the
training set. The learning process consists in updating the weights at each training
step so that, for a given input, an error measure between the network’s output and
the known target value is reduced. This is also known as learning with a teacher or
associative learning.
In unsupervised learning there is still an input/output relationship that the network
must learn to reproduce but no feedback is provided indicating whether a given asso-
ciation is correct or not. In other words, the similarities among patterns and features
of a training set must be discovered by the network itself by clustering similar cases in
a self-organizing manner.
A third commonly used form of training a network makes use of the concept of
reinforcement learning. In this case the “teacher” signal for some training input/output
pair is not a measure of the difference between the given output and the expected
value, as in supervised learning, but only an evaluation of the result as “right” or
“wrong”.
In the following sections some of the most important supervised and unsupervised
learning algorithms will be briefly described. Since the subject is a large one, we
cannot hope for complete coverage: for an in-depth treatment of the subject, the reader
can consult for instance reference [12].
6.6 Supervised Learning
Supervised learning algorithms are based on error correction rules, that is, an error
value is generated from the actual response of the network and the desired response.
Following that, the weights are modified such that the error is gradually reduced.
We start by describing some simple single-neuron training rules and then go on to
training algorithms for networks of units.
6.6.1 Perceptron Learning Algorithm
A classical and very simple example of supervised learning is Rosenblatt’s perceptron
learning algorithm [21]. For our purpose here, a perceptron is a binary unit similar to
the linear threshold gate of Figure 6.2. The outline of the learning algorithm is given
in Figure 6.8.
Historically, the algorithm is called the perceptron learning rule because it was first
used by Rosenblatt with a variant of the TLU called a perceptron. In this algorithm,
the positive parameter η is called the learning rate or step size: it dictates the size of
the weight change and thus it controls how fast the learning takes place.
1. Initialize weights and threshold randomly.
2. Present an input vector to the neuron.
3. Evaluate the output of the neuron.
4. Evaluate the error of the neuron and update the weights according to:
   $$w_i^{t+1} = w_i^{t} + \eta (d - y) x_i,$$
   where d is the desired output, y is the actual output of the neuron, and η (0 < η < 1)
   is a parameter called the step size.
5. Go to step 2 for a certain number of iterations, or until the error is less than a
   prespecified value.
Figure 6.8: Perceptron learning algorithm.
The rule has a straightforward geometric interpretation in terms of vector algebra
in $\mathbb{R}^n$ but we do not have space here to go into the details. To give a feeling for
how the algorithm works, let us consider a decision problem in which the inputs must
be partitioned into two classes. In the perceptron context, learning is equivalent to
changing the vector of weights in such a way that the decision surface moves until a
hyperplane is found that correctly separates all training inputs into the two
classes. The interested reader can consult [11] for a more formal explanation.
Rosenblatt proved that if the two classes of input patterns are linearly separable,
then the perceptron algorithm converges after a finite number of iterations.
This is known as the perceptron convergence theorem. An important property of
the perceptron is that whatever it can compute, it can learn with the perceptron learning
algorithm. However, if a given problem is not linearly separable, the algorithm never
settles, and it is not guaranteed to find a low-error solution, even if one exists.
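The perceptron learning algorithm and its behaviour on separable and non-separable problems can be tried out with the Python sketch below. Several details are assumptions made for the example: the initialization range, the step size η = 0.2, the cap on the number of epochs, and the treatment of the threshold as a weight attached to a constant input of −1 (so that it is updated with the opposite sign). Training stops as soon as a full pass over the training set produces no error.

```python
import random

def predict(weights, theta, x):
    """TLU output for input x."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) >= theta else 0

def train_perceptron(samples, eta=0.2, max_epochs=200, seed=0):
    """Returns (weights, theta, epochs_used); epochs_used is None if no error-free pass occurred."""
    rng = random.Random(seed)
    n = len(samples[0][0])
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n)]
    theta = rng.uniform(-0.5, 0.5)
    for epoch in range(max_epochs):
        errors = 0
        for x, d in samples:
            y = predict(weights, theta, x)
            if y != d:
                errors += 1
                weights = [w + eta * (d - y) * xi for w, xi in zip(weights, x)]
                theta -= eta * (d - y)   # threshold seen as a weight on a constant input -1
        if errors == 0:
            return weights, theta, epoch + 1
    return weights, theta, None

AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

w_and, theta_and, epochs_and = train_perceptron(AND_SAMPLES)
w_xor, theta_xor, epochs_xor = train_perceptron(XOR_SAMPLES)
print("AND converged after epochs:", epochs_and)
print("XOR converged after epochs:", epochs_xor)
```

On the linearly separable AND problem an error-free pass is reached, as the convergence theorem guarantees, whereas on XOR the weights keep changing until the epoch limit is hit.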
The perceptron learning rule is of historical interest but it fails on nonlinearly
separable problems. We have seen in Section 6.3 that such nonlinearly separable problems
can be solved in principle by multi-layer nets of TLUs. However, it turns out that
multi-layer nets cannot be trained effectively by the perceptron learning algorithm.
For training purposes, it has been found that instead of using straight weight
adjustment, it is more practical to use gradient descent on an appropriate differentiable
criterion function. For instance, we could use as an objective function to be minimized
the sum of the differences between the actual neuron output and the target training
output values. This amounts to finding the minimum of the sum of errors over the training
set as a function of the free weight parameters. Normally, the sum of the squares of the
errors is used in order to work with positive quantities. Because of the discontinuous
nature of the output step function, the criterion function is itself discontinuous in the
weights, so its gradient cannot be used directly. This is easy to fix by just considering the neuron
activation, i.e., the weighted sum of inputs, instead of the output value. As we will see,
this form of training is suitable for multi-neuron, multi-layer networks, allowing those
networks to solve arbitrary classification problems (see also Section 6.3).
6.6.2 LMS Rule and Delta Rule
The LMS (Least Mean Square) algorithm by B. Widrow and M. Hoff [25] was de-
veloped to provide “learning” to a McCulloch-Pitts-like element called ADALINE for
ADAptive LINear Element. Such nodes are identical to TLUs except that the output
signals are {−1, +1} instead of {0, 1}. It uses a hard-limiting function with a bias
weight w0 controlling the threshold level of such a function. The learning rule tries to
reduce the mean square error (MSE), averaged over the training set, using the gradient
descent method. Here the criterion function to be minimized is:
$$E(w) = \frac{1}{2} \sum_{i=1}^{m} (d_i - y_i)^2 \qquad (6.7)$$
That is, E is half the sum of squared errors over m training pairs, where w is the vector of
weights. Taking the derivatives gives us the iterative form of the rule:
$$w^{t+1} = w^{t} + \mu \, (d_k^{t} - y_k^{t}) \, x_k^{t}, \qquad (6.8)$$
where µ is a parameter that controls stability and rate of convergence, x and w are the
input and weight vectors respectively, y is the output value, d is the expected value,
the subscript k labels the training pair presented at step t, and
t and t + 1 are the current and next step in the iteration respectively.
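The step from the criterion function of Equation 6.7 to the update rule of Equation 6.8 can be spelled out as follows. This is a sketch under the assumption, suggested at the end of Section 6.6.1, that y denotes the linear activation $w \cdot x$ rather than the thresholded output.

```latex
% Assumption: linear activation y_i = w \cdot x_i for the training pair (x_i, d_i).
\frac{\partial E}{\partial w}
  = \frac{\partial}{\partial w}\,\frac{1}{2}\sum_{i=1}^{m} (d_i - w \cdot x_i)^2
  = -\sum_{i=1}^{m} (d_i - y_i)\, x_i

% Steepest descent with step size \mu:
w^{t+1} = w^{t} - \mu\,\frac{\partial E}{\partial w}
        = w^{t} + \mu \sum_{i=1}^{m} (d_i - y_i)\, x_i

% Estimating the sum by the single pair k presented at step t gives Equation 6.8:
w^{t+1} = w^{t} + \mu\,(d_k^{t} - y_k^{t})\, x_k^{t}
```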
In the LMS algorithm the weights are updated using an estimate of the steepest
descent of the mean square error function E(w) in weight space. E is a quadratic
function of the weights and is therefore convex and has a unique (global) minimum. If
the chosen positive constant µ is sufficiently small, the gradient descent search will
asymptotically converge toward the solution regardless of the initial weight values.
Widrow extended his ADALINE units to multiple Adaline networks and provided one
of the first trainable layered networks.
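As an illustration, the iterative rule of Equation 6.8 can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not Widrow and Hoff's original implementation: the function names, the AND training set coded in {−1, +1}, the value µ = 0.05 and the number of epochs are all chosen for the example. The bias weight w0 is handled by prepending a constant input equal to 1.

```python
import random

def lms_train(samples, mu=0.05, epochs=200, seed=0):
    """Train one linear unit with the LMS rule of Eq. (6.8).

    samples: list of (x, d) pairs; x is a list of inputs, d the desired value.
    mu, epochs and seed are illustrative choices, not values from the text.
    """
    rng = random.Random(seed)
    n = len(samples[0][0]) + 1            # +1 for the bias weight w0
    w = [rng.uniform(-0.5, 0.5) for _ in range(n)]
    for _ in range(epochs):
        for x, d in samples:
            xb = [1.0] + list(x)          # constant input feeding the bias weight
            # Use the activation (weighted sum), not the thresholded output.
            y = sum(wi * xi for wi, xi in zip(w, xb))
            err = d - y
            w = [wi + mu * err * xi for wi, xi in zip(w, xb)]
    return w

def adaline_output(w, x):
    """Hard-limited output in {-1, +1}, applied only after training."""
    a = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1 if a >= 0 else -1

# Logical AND with inputs and targets coded as -1 / +1 (assumed toy data).
data = [([-1, -1], -1), ([-1, 1], -1), ([1, -1], -1), ([1, 1], 1)]
weights = lms_train(data)
```

For this toy problem the least-squares solution is w = (−0.5, 0.5, 0.5), and with a constant µ the weights settle in a small neighborhood of it rather than exactly on it.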
The so-called Delta rule is similar to the LMS rule: it also works by minimizing
an error residual but now the activation function of the neuron unit is a continuous,
differentiable, non-linear curve such as a sigmoid. Gradient methods for minimization
can be used with the Delta rule, and the output value can appear in the quadratic
criterion function since the transfer function is differentiable. The convergence speed of
the Delta algorithm depends on the slope (first derivative) of the transfer function,
which can be rather flat at the extremes. In these regions progress is very slow, but
convergence can be improved in various ways (see for instance [12]).
The importance of the Delta rule is due to the fact that it extends naturally to the
training of multilayer networks by a method called error backpropagation. Backprop-
agation was worked out early in the 1970s by Paul J. Werbos. The next section gives
an account of this important technique.
6.6.3 Backpropagation Algorithm
The absence of a practical method for training multilayer networks nearly stopped
work on ANNs for several years. The invention of the backpropagation method, at-
tributed to Paul J. Werbos [24], is one of the main reasons for the renewed interest in
artificial neural networks toward the end of the 1970s.
The algorithm is by far the most popular method for supervised learning in
feedforward networks. It requires units with a continuous, differentiable activation
function, such as those described above for the Delta rule, so that derivatives can be
used to calculate the gradient. It has been used in countless applications
of ANNs to many different kinds of problems. Error backpropagation is essentially
a search procedure that attempts to minimize a whole-network error function such as
the sum E of the squared errors of the network outputs over an ensemble of training
input/output pairs:
$$E = \frac{1}{2} \sum_{j=1}^{m} (d_j - y_j)^2, \qquad (6.9)$$
where dj is the desired jth output and yj is the actual jth output of the network.
The name of the algorithm is due to the fact that the weight modifications dictated
by the learning rule propagate “backwards” from the output layer to the input layer.
In fact, the algorithm can be intuitively visualized as a series of forward and backward
waves of activity. In the forward phase the network produces its output for the inputs of
the given input/output pairs; comparing it with the desired outputs yields the global error. In the backward
phase, the weights of the output nodes and then those of the hidden nodes back to the
input, are modified layer by layer according to the learning rule in order to reduce the
error. The details of the algorithm are mathematically simple but require some space
to be presented in an orderly way. We give here only an outline of the basic procedure
(Figure 6.9). An exhaustive description can be found in [12]. An excellent pictorial
explanation is contained in [20].
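The procedure outlined in Figure 6.9 can be sketched in plain Python as follows. This is a minimal sketch under several assumptions that are not in the text: a single hidden layer, bias terms implemented as a constant input of 1 in each layer, online (pattern-by-pattern) updates, a toy training set (logical OR), and hypothetical function names and parameter values. The δ terms and the weight update follow the conventions of steps 4 to 6 of the figure.

```python
import math
import random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def forward(x, w_hid, w_out):
    """Step 3: propagate one pattern forwards (constant 1.0 = bias input)."""
    xin = list(x) + [1.0]
    h = [sigmoid(sum(w * v for w, v in zip(row, xin))) for row in w_hid]
    hin = h + [1.0]
    y = [sigmoid(sum(w * v for w, v in zip(row, hin))) for row in w_out]
    return xin, hin, y

def total_error(patterns, w_hid, w_out):
    """Sum of squared errors over the training set, as in Eq. (6.9)."""
    err = 0.0
    for x, d in patterns:
        y = forward(x, w_hid, w_out)[2]
        err += 0.5 * sum((dj - yj) ** 2 for dj, yj in zip(d, y))
    return err

def train(patterns, w_hid, w_out, eta=0.5, epochs=2000):
    n_hidden, n_out = len(w_hid), len(w_out)
    for _ in range(epochs):                      # step 7
        for x, d in patterns:                    # step 2
            xin, hin, y = forward(x, w_hid, w_out)
            # Step 4: output deltas, delta_k = y_k - d_k.
            delta_out = [yk - dk for yk, dk in zip(y, d)]
            # Step 5: hidden deltas, using the old output weights.
            delta_hid = [
                sum(w_out[k][j] * y[k] * (1.0 - y[k]) * delta_out[k]
                    for k in range(n_out))
                for j in range(n_hidden)
            ]
            # Step 6: w_ij <- w_ij - eta * y_i * y_j (1 - y_j) * delta_j.
            for k in range(n_out):
                g = y[k] * (1.0 - y[k]) * delta_out[k]
                for j in range(len(hin)):
                    w_out[k][j] -= eta * hin[j] * g
            for j in range(n_hidden):
                g = hin[j] * (1.0 - hin[j]) * delta_hid[j]
                for i in range(len(xin)):
                    w_hid[j][i] -= eta * xin[i] * g

# Step 1: random initial weights for a 2-2-1 network (sizes are assumptions).
rng = random.Random(1)
w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)]
w_out = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(1)]
patterns = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [1])]

initial_error = total_error(patterns, w_hid, w_out)
train(patterns, w_hid, w_out)
final_error = total_error(patterns, w_hid, w_out)
```

Comparing `initial_error` with `final_error` shows the reduction of E achieved by the repeated forward and backward passes.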
Remarks on the Backpropagation Algorithm
Today there exist many public domain or commercial packages implementing some
form of feedforward ANNs with backpropagation as the learning algorithm. There
is nothing wrong with this “canned” software, as many of these packages are of high quality
and the method is quite standard and well-known. Indeed, in view of the number of
applications that use supervised neural learning, it would not make much sense to
reinvent the algorithm each time. Nevertheless, there are a number of subtle points in
1. Initialize weights randomly,
2. Present an input vector pattern to the network,
3. Evaluate the outputs of the network by propagating signals forwards,
4. For all output neurons calculate $\delta_j = (y_j - d_j)$, where $d_j$ is the desired output of neuron j and $y_j$ is its current output:
   $y_j = g\left(\sum_i w_{ij} x_i\right) = \left(1 + e^{-\sum_i w_{ij} x_i}\right)^{-1}$,
   assuming a sigmoid activation function,
5. For all other neurons (from last hidden layer to first), compute $\delta_j = \sum_k w_{jk}\, g'(x)\, \delta_k$, where $\delta_k$ is the $\delta_j$ of the succeeding layer, and $g'(x) = y_k (1 - y_k)$,
6. Update the weights according to:
   $w_{ij}(t+1) = w_{ij}(t) - \eta\, y_i\, y_j (1 - y_j)\, \delta_j$,
   where $\eta$ is a parameter called the learning rate.
7. Go to step 2 for a certain number of iterations, or until the error is less than a prespecified value.
Figure 6.9: Backpropagation learning algorithm.
training ANNs that make life less pleasant and that should at least be briefly mentioned.
Local and Global Minima
First of all, consider that backpropagation essentially
does a gradient descent search through the multidimensional space of possible weights
trying to reduce the error E between the training output values and the network outputs.
Until now, we have not explicitly mentioned the fact that the search hypersurface
will almost surely be multi-modal, i.e., it may present a number of minima. In minimizing
E(w) for such a function we would like to strive for the global minimum, that
is the value w∗ of the weight vector for which E(w∗) ≤ E(w) for any w in the search
domain. Gradient descent is a minimization algorithm that will find the closest local
minimum with respect to the starting point of the search, and if the search landscape
is complex enough, this minimum is very unlikely to be the global one. What does
this mean in terms of the learning process? Two things can be said in general about
this state of affairs: first, the problem of the search getting stuck in a local minimum
is not as important in practice as one might think. Network weights determined in this
way have proved to be good enough in many applications. Secondly, there exist other
optimization methods that can help the search to escape local minima. For example,
evolutionary algorithms, which can be seen as biased stochastic optimizers, can be
used to train neural networks of various kinds. This is a first example of the synergy
between different soft computing methodologies which is one of the leading themes
of our book. Other global optimization methods, such as stochastic gradient search or
simulated annealing can also be used. Another problem with the standard backpropagation
algorithm is that it can be very slow. However, researchers have found a number of
ways to speed up and enhance backpropagation learning (see for instance [12]).
Learning Topologies
Another issue which we have largely ignored until now is the
question of network topology. That is, backpropagation or similar learning algorithms
are customarily used simply for adjusting the weights in an otherwise fixed feedforward
network structure. But it is clear that the interconnection of the units and their
number play an important role. In fact, lack of knowledge in determining the appropriate
topology for a problem, including the number of layers, the number of neurons
per layer, and the interconnection scheme, often results in slow learning speed and
poor performance. For example, if a network is too small, it does not accurately ap-
proximate the desired input to output mapping. If a network is too large it requires
longer training times and may not perform well on unseen data. A variety of methods
have been proposed to dynamically grow or prune the number of neurons and inter-
connections in order to improve the performance of the network. For example, the
cascade-correlation algorithm starts with a network with no hidden units and keeps
adding units one at a time, choosing their weights such that the residual error is minimized
[9]. One can also go the other way around, starting with a complex network
and shrinking it by pruning units away when it is found that certain connections have a
small influence on the network error. Some of the methods in this class alter the topol-
ogy while at the same time using backpropagation of the output errors, some do not;
but we do not have space here to go into the details, for which the reader can consult
[3] for example.
Generalization and Overfitting
Generalization means the ability of a learning sys-
tem to correctly map new inputs that were not previously used in the training phase.
If we let the network adapt too well to the training data by giving it too many degrees
of freedom, then the residual error will be very small but the network is likely to fail
to correctly map new, previously unseen input data of the same class i.e., it will have
poor generalization capabilities (see Figure 6.10). This phenomenon is known in the
statistical and ANN literature as overfitting the data and should obviously be avoided
as far as possible. The problem is that the net adapts too much to the noise and fine
behaviour present in the data while ignoring the underlying general trends.
What can be done to improve the generalization ability of a neural network? One
well-known technique comes from regression analysis in statistics and consists in dividing
the training data into two sets: a training set proper and a validation set. The
training set is used for learning in the usual way, but from time to time the validation set
is used to test out-of-sample performance. This cross-validation procedure continues
until the error on the validation set starts to increase: beyond that point the network
begins to fit the noise in the training data and its generalization performance
deteriorates. Ideally, the best model is the one at the minimum
[Plot: error versus time for the in-sample (training) and out-of-sample (testing) curves; regions labeled underfitting and overfitting.]
Figure 6.10: Illustration of overfitting. Too much training may lead to small residual
error but with bad generalization performance.
of the validation set error curve. Cross validation is a useful empirical technique that
allows us to distinguish between signal and noise in the absence of further information,
and its proper use is the subject of much discussion.
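The validation procedure described above can be sketched as a generic training loop. This is a minimal sketch, not a prescribed method: the function name, the two callables and in particular the `patience` parameter (how many consecutive epochs without improvement are tolerated before stopping) are assumptions introduced for the example. The validation curve used in the demonstration is synthetic, with its minimum placed at epoch 20.

```python
def train_with_early_stopping(train_one_epoch, validation_error,
                              max_epochs=1000, patience=5):
    """Stop training once the validation error has stopped improving.

    train_one_epoch: callable performing one pass over the training set.
    validation_error: callable returning the current error on the validation set.
    patience: hypothetical tolerance, in epochs, before giving up.
    """
    best_err = float("inf")
    best_epoch = 0
    waited = 0
    for epoch in range(1, max_epochs + 1):
        train_one_epoch()
        err = validation_error()
        if err < best_err:
            # New minimum of the validation curve; a real implementation
            # would also save a copy of the network weights here.
            best_err, best_epoch, waited = err, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch, best_err

# Synthetic stand-ins for a real network (assumed for illustration only).
state = {"epoch": 0}

def fake_train_one_epoch():
    state["epoch"] += 1

def fake_validation_error():
    # U-shaped validation curve with its minimum at epoch 20.
    return (state["epoch"] - 20) ** 2 / 400.0 + 0.1

best_epoch, best_err = train_with_early_stopping(
    fake_train_one_epoch, fake_validation_error)
```

The loop runs a few epochs past the minimum before stopping, which is why the weights at the best epoch, and not the last ones, should be retained.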
The appropriate number of hidden units and the number of training data are also
related and play a role in the quality of the resulting network. There are no rigorous
results here but a rule of thumb says that the number of hidden units needed will
increase as the number of training data increases for a given level of performance. The
techniques outlined in the previous section may help in finding a nearly optimal or
satisfactory net topology for a given task.
6.6.4 Applications of Supervised Learning Networks
Supervised neural networks, especially feedforward nets trained by backpropagation,
have been used in a wide range of applications including signal and image process-
ing, speech recognition, system identification, medical diagnosis, financial prediction,
detection of events in high-energy physics, etc.
It would be inappropriate to treat representative applications in detail in such a
short space. However, a brief description of a couple of typical areas in which ANNs
have been successful should help the reader by giving a flavor of the work that has
been achieved.
An Illustrative Example: Character Recognition
Fast and reliable character
recognition, especially of hand-written characters is extremely important in practice.
Think for instance of pen-based palm organizers, of automatic address reading and
routing of postal mail and of check signatures, just to mention a few. There have been
several attempts to solve this problem by using feedforward ANNs. It is very easy
to find plenty of training data in this field but the task is a very difficult one even for
human recognizers, not to speak of computer programs. Given the difficulties of the
problem, no classification system is expected to achieve a success rate of 100 percent.
A hundred percent success rate is not really needed for the system to be useful, but too low a
recognition rate, say under 95%, will result in a system that is not reliable enough to be
usable in practical applications. An example of this was the Apple Newton, a palmtop
computer that employed handwriting input and recognition, but in which error rates
were sufficiently high as to seriously limit the user acceptance of the device.
The main ideas in using feedforward nets for character recognition are simple
enough: once one has obtained the hand-written data, the characters are scanned and
digitized as a gray-scale pattern represented as a grid of pixels of a certain size. Two
examples of such a grid are schematically depicted in Figure 6.11. Note, though, that
actual grids are likely to have many more pixels in them.
Figure 6.11: An array of pixels representation of two digits. Left hand side: the ideal-
ized digit 7. Right hand side: a more likely example of a handwritten or noisy 7.
For characters to be interpreted on personal digital assistants, the recording is made
online using a special pad which is able to sense the pen’s pressure and velocity. The
data are then divided into a learning and a test set. The learning set is presented to the
network one image at a time and the error in classification is progressively reduced by
using backpropagation. When an acceptable error level has been attained and misclas-
sifications are rare, the net is used to classify the yet unseen examples. The structure of
an idealized ANN architecture for decimal digit recognition is shown in Figure 6.12.
The inputs are the pixels representing a single character and the outputs are the classes
0 − 9 into which the given character is to be classified.
For example, in a well-known application to handwritten numerical ZIP codes
appearing on U.S. mail, a feedforward net was trained to learn to recognize hand-
written digits [7]. The ANN learned on several thousand samples and was tested on
about two thousand unseen cases with good results. The original network architecture
is actually more complex than what one could imagine from our short description.
There were a total of 1000 units of which ten were the output units corresponding
to the classification of the digits into one of ten groups (0 − 9). The network input
was a 16 × 16 array containing the gray-level pixel image of a particular handwritten
digit. There were three hidden layers of which the first two were able to extract
smaller image features, while the third was a standard fully connected layer. The
[Diagram: input units, a hidden layer, and ten output units labeled 0 to 9.]
Figure 6.12: A schematic network architecture for character classification.
net was trained by an accelerated version of backpropagation. As a refinement of
the above work, LeCun et al. [7] did weight pruning using statistical methods in
order to make a nearly optimal selection of weights with the goal of increasing the
generalization capability of the network. This resulted in better overall performance
of the system.
For more recent discussions of neural and fuzzy approaches to
handwriting recognition see [10,5].
A Second Example: Financial Markets Forecasting
The dynamics of financial
markets is poorly understood. Although market data are generally and publicly avail-
able, nobody really knows how market prices will evolve in spite of claims to the
contrary. The fact is that financial markets are very complex evolving systems of
many interacting agents that have their own beliefs, expectations, and trading strate-
gies. Apart from textbooks describing extremely simple and idealized cases, there is
absolutely no market model available. Again, when there are no models but empirical
data abound, ANNs, among other techniques, could be employed to approximate the
unknown relationships underlying the observed values.
Being able to predict market evolution, at least the trend during a specific time
window, is obviously a definite advantage for traders. Some people believe that the fu-
ture behaviour of a given market is somehow “implied” by the current and past market
data and that the careful study of these data patterns will reveal future trends to some
extent. Others think that only the fundamental economic indicators such as a nation’s
GNP (Gross National Product), unemployment rate, and the like have an influence on
markets. Yet others think that market indicators such as the yen/dollar ratio are the re-
ally important data to take into account. Finally, there are those for whom the market
is not predictable at all, not even in principle. For them, prices essentially follow a
random walk and any information is immediately absorbed into the current prices and
cannot be made use of.
Today, there are technical reasons to believe that markets are not entirely random,
at least on some time scales, although the question is far from settled [8]. If one accepts
this view that there are opportunities for predicting future prices, what are the possible
approaches? One could use econometric models such as auto-regressive moving averages
of prices plus random shocks to try to forecast future behaviour. But these models are
normally linear ones, while non-linear and even chaotic behaviour has been reported
for financial markets [8]. Thus, it would seem that feedforward multilayer ANNs might
be useful, since they are able to map any well-behaved function of the inputs thanks to
their universal approximation capabilities (see Section 6.3).
Feedforward networks trained by backpropagation can be employed to achieve
non-linear forecasting of time series. The general idea exploits a result of Takens
[12] and it goes as follows: let
$$p(t),\; p(t-1),\; p(t-2),\; \ldots,\; p(t-m) \qquad (6.10)$$
be the current and past m + 1 values of an observation such as a price or the sampled
value of a signal. Using these values as an input pattern to a neural network and the
observed value p(t + 1) at time step t + 1 as a target for many values of t, we can train
a feedforward network to make one-step ahead prediction of the unknown future value
of the time series. Prediction n steps ahead can also be done but the quality of the
prediction will degrade as n increases. Once the net has learned to make the prediction
on the training values, it can be used out-of-sample for predicting future values of
the series. What the network learns implicitly is of course an approximation of the
unknown functional relationship between the next value and the m past samples of the
variable. Reported results using this prediction method are at least as good as several
other methods for non-linear forecasting [23]. This univariate, pure time-lagged model
works well for physical systems showing well-defined deterministic chaos. In the
less precise world of economical prediction one could also exploit the multivariate
approach that consists in taking into account related time series to the one for which a
predictor is sought and also some important market indicators as input values. Such an
approach has been implemented with good effect in [1,2], where a feedforward neural
network is used to find a mapping f of the form:
$$x(t) = f(I_1, I_2, \ldots, I_n, x(t - \Delta_i)) \qquad (6.11)$$
where x(t−∆i) are lagged values of the variable to be predicted and the Ij are financial
indicators, possibly lagged, that might have an influence on x according to economic
theory. Among the many possible indicators, the most influential ones are chosen by
using statistical sensitivity analysis methods. For instance, in a prediction study of the
Swiss stock market index (SPI) [1], it was found that the most relevant indicators are
the monthly differences of long interest rates in Germany and Japan and the money
exchange rate between USA and Japan. These indicators, together with the past sam-
ples of the SPI time series are fed as inputs to a two-layer network. The output is the
one-step ahead prediction of the SPI value.
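The univariate time-lagged scheme built on the window of Equation 6.10 amounts to turning a series into input/target pairs before training. The following minimal sketch shows only this data preparation step; the function name and the toy series are assumptions, and the resulting pairs could then be fed to any feedforward network trained by backpropagation.

```python
def make_lagged_dataset(series, m):
    """Build one-step-ahead training pairs from a time series.

    Each input is [p(t), p(t-1), ..., p(t-m)] (m + 1 values, Eq. 6.10)
    and the corresponding target is p(t+1).
    """
    inputs, targets = [], []
    for t in range(m, len(series) - 1):
        inputs.append([series[t - k] for k in range(m + 1)])
        targets.append(series[t + 1])
    return inputs, targets

# Toy series, assumed for illustration only.
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X, y = make_lagged_dataset(series, m=2)
```

For the multivariate form of Equation 6.11, each input row would simply be extended with the (possibly lagged) indicator values I1 . . . In.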
Fairly good results have been obtained with ANNs in the financial markets, in
general better than those obtained with most other methodologies (see for instance
[8] and references therein). However, one should be aware that there are no “magic”
solutions in this difficult field, only promising techniques. The phenomena are not
well understood and the ANN approach, even when it works, has a limited temporal
validity due to the evolutionary character of the markets. Besides, the well-known
black-box character of ANNs calls for using them together with other methodologies
having more explanatory power.
6.7 Unsupervised Learning
In previous sections we discussed networks of units in which the nets were trained
to perform an unknown input/output mapping by presenting the network with examples of
input/output pairs. In unsupervised learning there is no feedback from the environment
as to the correctness of the mapping; in other words, there is no “teacher”. Instead, the
network must be able to discover by itself any categories, patterns, or features possibly
present in the data. Networks that are able to infer pattern relationships without being
supervised are also called self-organizing. Among the unsupervised learning rules, we
will briefly discuss here Hebbian learning for single units and networks, competitive
learning, and Kohonen’s self-organizing feature maps.
6.7.1 Hebbian Learning
The rule named after Donald O. Hebb was proposed in his seminal
work “The Organization of Behavior” [13]. Hebb’s rule makes the change in weight strength
proportional to the product of the firing rates of the two interconnected neurons. That
is, when two connected neurons fire at the same time and repeatedly, the synapse’s
strength is increased. This biologically-motivated rule can be expressed for a single
unit as:
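[Diagram: neuron i, with output $x_i(t)$, connected to neuron j, with output $y_j(t)$, through the weight $w_{ij}(t)$.]
$$w_{ij}^{t+1} = w_{ij}^{t} + \rho\, y_j^{t}\, x_i^{t}, \qquad (6.12)$$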
where $x_i^t$ and $y_j^t$ are the output values of neurons i and j at time t, $w_{ij}^t$ is the current
interconnection weight between neurons i and j, and ρ is a parameter called the learning
rate. $w_{ij}^{t+1}$ is the future value of the synaptic weight being updated during learning.
Hebb’s rule, which has inspired a large number of learning algorithms, has the important
characteristic that it is local: the update of a weight depends only on the activities of
the two neurons it connects. The original Hebb rule is divergent, progressively
driving the weight to an infinite magnitude. Several modified Hebbian rules that do
not have this undesirable property have been devised [12].
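The divergence of the plain Hebb rule can be observed numerically. The sketch below applies Equation 6.12 to a single linear unit, y = w · x, fed with random inputs; the linear unit, the Gaussian inputs, the value of ρ and the function name are assumptions made for the illustration. For such a unit each update can only increase (or leave unchanged) the norm of the weight vector.

```python
import math
import random

def hebb_step(w, x, rho):
    """One application of Eq. (6.12) for a single linear unit y = w . x."""
    y = sum(wi * xi for wi, xi in zip(w, x))
    return [wi + rho * y * xi for wi, xi in zip(w, x)]

rng = random.Random(0)
w = [0.1, -0.2]                      # arbitrary initial weights (assumed)
norm_before = math.hypot(w[0], w[1])
for _ in range(200):
    x = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
    w = hebb_step(w, x, rho=0.05)
norm_after = math.hypot(w[0], w[1])  # keeps growing with more iterations
```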
Hebbian learning can be applied to networks of interacting units. One of the most
studied approaches is called Principal Component Analysis (PCA). PCA is a standard
statistical technique whose goal it is to extract m normalized orthogonal vectors in the
input space that account for as much of the data’s variance as possible. By projecting
the n-dimensional input vectors onto the m (m < n) orthogonal directions, we can
achieve dimensionality reduction with minimum loss in information content. Such
a transformation is achieved by rotating the coordinate system with standard linear
algebra operations. E. Oja [19] demonstrated that an artificial neural network is able
to compute in parallel, and on-line, the PCA transform. The PCA network is a layer
of parallel linear artificial neurons. Oja’s algorithm is a modification of Hebb’s rule
in which a weight-decay term proportional to y2 has been added. In Oja’s network
PCA takes place as a consequence of unsupervised neural learning with a Hebb-like
learning rule in which weight updating for a network is done as follows:
$$w_{ij}^{t+1} = w_{ij}^{t} + \rho \left( x_j^{t} - \sum_{k=1}^{m} w_{kj}^{t}\, y_k^{t} \right) y_i^{t}, \qquad (6.13)$$
where ρ is a learning constant and the other symbols have the usual meaning. There is
not room to further develop these ideas here but the interested reader can consult [12]
for instance.
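For a single output unit (m = 1), Equation 6.13 reduces to w_j ← w_j + ρ y (x_j − y w_j). The following minimal sketch applies this special case to synthetic two-dimensional data whose variance lies mostly along the direction (1, 1); the data, the value of ρ, the number of iterations and the function name are assumptions made for the illustration. In contrast with the plain Hebb rule, the weight vector stays bounded, its norm approaches 1, and it aligns with the first principal direction.

```python
import math
import random

def oja_step(w, x, rho):
    """Eq. (6.13) for one linear output unit: Hebbian term minus weight decay."""
    y = sum(wj * xj for wj, xj in zip(w, x))
    return [wj + rho * y * (xj - y * wj) for wj, xj in zip(w, x)]

rng = random.Random(0)
w = [0.3, -0.1]                      # arbitrary initial weights (assumed)
for _ in range(5000):
    s = rng.gauss(0.0, 1.0)
    # Both components share the signal s, plus a little independent noise,
    # so the principal direction is (1, 1) / sqrt(2).
    x = [s + 0.1 * rng.gauss(0.0, 1.0), s + 0.1 * rng.gauss(0.0, 1.0)]
    w = oja_step(w, x, rho=0.01)

norm = math.hypot(w[0], w[1])
# Absolute cosine of the angle between w and the direction (1, 1).
alignment = abs(w[0] + w[1]) / (math.sqrt(2.0) * norm)
```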
6.7.2 Competitive Learning
Competitive learning is an unsupervised learning procedure in which the neurons of
a network learn to recognize clusters of similar input vectors. The network detects
regularities and correlations among the input vectors and adapts the future response
of the units to similar inputs. In competitive networks output units compete among
themselves for activation. The simplest competitive learning network consists of a
single layer of output neurons to which all inputs are connected. All the units are
presented with a given input vector x but only one output neuron is activated at any
given time: the so-called winner neuron. The winner unit i is the one with the largest
activation value (see Section 6.2):
$$\mathbf{w}_i \cdot \mathbf{x} \ge \mathbf{w}_k \cdot \mathbf{x}, \quad \forall k \ne i \qquad (6.14)$$
If the input vectors are normalized ($\|\mathbf{x}_k\| = 1$, for all k = 1 . . . n), then the unit i
with the smallest distance to the input:
[Diagram: inputs x1, x2, x3 connected to output units y1, y2, y3.]
Figure 6.13: A single layer competitive learning network. The solid lines indicate
excitatory connections whereas the dashed lines indicate inhibitory connections.
$$\|\mathbf{w}_i - \mathbf{x}\| \le \|\mathbf{w}_k - \mathbf{x}\|, \quad \forall k \ne i \qquad (6.15)$$
that is, the unit with normalized weight closest to the input vector, becomes the winner.
The winner-take-all operation is implemented by connecting the outputs to the
other neurons by means of so-called lateral inhibitions, and also by means of self-
excitatory connections as depicted in Figure 6.13. Reference [12] and the references
cited therein give details of how inhibition and self-excitation are effected.
In practice, there is no need for the above net to be implemented explicitly: the
winning neuron can be found by simple search of the maximum activation. The neu-
ron having the maximum activation updates its weight while the weights of the other
neurons remain unchanged according to the iterative application of the following rule:
$$\Delta w_{ij} = \begin{cases} \rho\,(x_j - w_{ij}) & \text{if } i \text{ is the winning unit} \\ 0 & \text{otherwise} \end{cases} \qquad (6.16)$$
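The search for the winner followed by the update of Equation 6.16 can be sketched as below. This is a minimal sketch with hypothetical function names; the winner is selected by smallest Euclidean distance as in Equation 6.15, and no normalization of inputs or weights is performed here, which is a simplification.

```python
def find_winner(weights, x):
    """Index of the unit whose weight vector is closest to x (Eq. 6.15)."""
    return min(
        range(len(weights)),
        key=lambda i: sum((w - xj) ** 2 for w, xj in zip(weights[i], x)),
    )

def competitive_step(weights, x, rho):
    """Move only the winner's weights toward x (Eq. 6.16); others stay put."""
    i = find_winner(weights, x)
    weights[i] = [w + rho * (xj - w) for w, xj in zip(weights[i], x)]
    return i

# Two units and one input vector, assumed for illustration.
demo_weights = [[0.0, 0.0], [1.0, 1.0]]
won = competitive_step(demo_weights, [0.9, 1.1], rho=0.5)
```

Here the second unit wins and its weight vector moves halfway toward the input, to approximately (0.95, 1.05), while the first unit is left unchanged.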
The training process can be seen as a progressive tilt of the weight vector of the
winning unit towards the direction of the current input vector. Normalized input
vectors (see above) can be depicted as points on the surface of a unit sphere (or
circle). At the beginning the unit’s weight vectors and the input data vectors are not
aligned, if one assumes that both the initial weights as well as the input values are
drawn from a random distribution. With time, the competitive learning rule tends to
associate certain units with neighboring input vectors, also called data clusters. This
phenomenon is geometrically represented in Figure 6.14. When a stable solution is
found then the weight vector for each cluster represents in some sense the “center of
gravity” of that cluster, a kind of typical vector for this class of data. Since the number
of data clusters is not known in advance, some units in the network, that happen to be
far from any input vector, will turn out to be useless, in the sense that their activation
is small or zero for the data of the training set. This is what happens to vector w4
in Figure 6.14. These “spare” units could still be useful if the distribution of input
vectors changes in time, making the system more robust. Otherwise, there are several
ways to turn these units into “useful” units during the learning process (see [12]).
[Diagram: two circles carrying the training data points, with the weight vectors w1 to w4 drawn before (left) and after (right) training.]
Figure 6.14: A schematic representation of unsupervised competitive learning. The
points on the circle surface represent training data. On the left hand side of the figure
initial unit weights are represented as random vectors. After training, the weight
vectors orient themselves toward the data clusters (right hand side figure).
Vector Quantization
An important use of competitive learning networks is vector
quantization which has applications in data compression. In this scheme the input
space is divided into disjoint regions in such a way that any input vector x falling into
one of the regions is represented by a single label characterizing that class. The class
label encoding can then be used later instead of the vector itself, leading to efficient
compression of data for storage and transmission purposes. The classes are defined by
a number of prototype vectors and the class to which a given input belongs is calculated
by taking the nearest prototype vector using a Euclidean distance metric.
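The encoding step just described can be sketched as follows. The codebook of prototype vectors is assumed to be given (for example, obtained by competitive learning); its values, the sample inputs and the function names are hypothetical and chosen only for the example.

```python
def encode(x, prototypes):
    """Return the label (index) of the prototype nearest to x (Euclidean)."""
    return min(
        range(len(prototypes)),
        key=lambda c: sum((p - xj) ** 2 for p, xj in zip(prototypes[c], x)),
    )

def decode(label, prototypes):
    """Reconstruct an approximation of the input from its class label."""
    return prototypes[label]

# Assumed codebook with three prototype vectors.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
inputs = [[0.1, -0.1], [0.9, 0.8], [0.2, 0.9]]
labels = [encode(x, codebook) for x in inputs]
```

Only the integer labels need to be stored or transmitted; the price is the quantization error between each input and its prototype.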
There also exist supervised versions of vector quantization, such as Kohonen’s
learning vector quantization [12].
6.7.3 Self-Organizing Feature Maps
Biological neural networks often show an architecture that depends on their function.
For example, in the visual region of the mammalian cortex the receptive zones are
arranged in such a way that neighboring light photons will stimulate areas of the cor-
tex that are also physically close to each other. This is called a topographic map and
is quite a common arrangement of neurons in regions of the cortex that are respon-
sible for processing sensory information such as visual and audio stimuli. Several
observations point to the fact that such topographic maps are not entirely genetically
determined. Rather, they seem to be created during the individual development by a
sort of unsupervised learning process.
Teuvo Kohonen invented an ANN model directly inspired by the existence of such
topographic maps in the brain [16]. Kohonen’s self-organizing map (SOM) is an un-
supervised learning network in which the neurons have a spatial arrangement, i.e., the
neurons are typically organized in a line or a plane (see Figure 6.15). This kind of net-
work is able to encode proximity features among the data. In other words, nodes that
are neighbors in the network encode patterns that are adjacent in some sense in pattern
space. Indeed, a self-organizing map has the property of topology preservation, that
is, nearby input patterns should activate nearby output neurons on the map. A network
that performs such a mapping is also called a feature map. In self-organized maps the
weights of locally interacting units are modified in response to input vectors according
to a certain rule until a global ordering emerges.
[Diagram with inputs x1, x2, x3.]
Figure 6.15: Self-Organizing Map.
An early model of self-organization was developed by Willshaw and von der Mals-
burg in the mid 1970s [22]. They used excitatory lateral connections with neighbor-
ing neurons, and inhibitory connections with distant neurons, instead of using a strict
winner-take-all mechanism, as shown in Figure 6.16. The function that defines the
form of such lateral connection weights is known as the Mexican hat.
[Plot: lateral connection weight Wij as a function of i-j, with regions marked as excitatory connections (+) and inhibitory connections (-).]
Figure 6.16: Mexican hat form of the lateral connection weights.
Teuvo Kohonen [16] described a self-organizing map learning algorithm that ap-
proximates the effect of a Mexican hat form of lateral connection weights by taking
into account neighboring units for each output neuron. Neighborhoods introduce the
notion of topology in neural learning and make the neurons’ position with respect to
each other a significant aspect of the net. The neighborhood can be square, rectangular,
circular, etc. During training, all weight vectors associated with the winner neuron and
its neighbors are updated.
SOM can be seen as an extension to the simple competitive network of the previous
section. Given an input vector x, the winner unit is determined in the same way as in
Equation 6.15. Weight adaptation is then performed according to a learning rule very
similar to the competitive rule of Equation 6.16:
$$\Delta w_{ij} = \begin{cases} \rho\,N(i)\,(x_j - w_{ij}) & \text{for all units in the neighborhood} \\ 0 & \text{otherwise.} \end{cases} \qquad (6.17)$$
The difference with the simple competitive rule is the presence of a “neighbor-
hood” function N (i) of unit i. This function is usually symmetric and monotonically
decreasing with the distance. At the beginning of the learning process N defines
a large neighborhood and all the units in the neighborhood are updated for any in-
put vector x. As time goes by and learning progresses, the neighborhood is shrunk
down while the learning rate ρ is also being iteratively decreased in order to achieve
convergence. With respect to competitive training, learning now takes place over an
extended neighborhood and it is this feature that allows the emergence of a topographic
map. Eventually, a topographic map will be obtained in which nodes not only repre-
sent clusters of data but they are also arranged in the map such that neighboring units
encode clusters that are “close” to each other in some sense in the input pattern space.
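The update rule of Equation 6.17, together with the shrinking neighborhood and the decreasing learning rate, can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used to produce the figures: the winner is taken to be the unit with the closest weight vector (assumed to match Equation 6.15), the neighborhood function is a Gaussian on the grid rather than a hard cut-off, and both decay schedules are linear. The name `train_som` and all default values are hypothetical.

```python
import math
import random

def train_som(rows=5, cols=6, n_steps=1000, rho0=0.5, radius0=3.0, seed=0):
    """Train a small 2-D SOM on points drawn uniformly from [-1, 1]^2.

    Returns a dict mapping each grid position (r, c) to its weight vector [w1, w2].
    Gaussian neighborhood and linear decay schedules are assumptions.
    """
    rng = random.Random(seed)
    weights = {(r, c): [rng.uniform(-1, 1), rng.uniform(-1, 1)]
               for r in range(rows) for c in range(cols)}
    for t in range(n_steps):
        frac = 1.0 - t / n_steps              # decays from 1 towards 0
        rho = rho0 * frac                     # decreasing learning rate
        radius = max(radius0 * frac, 0.5)     # shrinking neighborhood width
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        # Winner: the unit whose weight vector is closest to the input.
        winner = min(weights,
                     key=lambda u: sum((xj - wj) ** 2
                                       for xj, wj in zip(x, weights[u])))
        for unit, w in weights.items():
            grid_d2 = (unit[0] - winner[0]) ** 2 + (unit[1] - winner[1]) ** 2
            n_i = math.exp(-grid_d2 / (2.0 * radius ** 2))   # neighborhood N(i)
            for j in range(2):
                w[j] += rho * n_i * (x[j] - w[j])            # Equation 6.17
    return weights

som = train_som()
print(som[(0, 0)], som[(4, 5)])
```

Plotting the returned weight vectors and joining grid neighbors gives the kind of weight-space picture described in the example that follows.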
An example of this process is a two-dimensional self-organizing map of 30 neurons
used to classify random two-dimensional input vectors whose components lie between -1
and 1. The map is a five-by-six grid of neurons. Since each unit has only
two inputs, it is possible to visualize the representation of this net in weight space, in
which a point corresponds to each node’s weight vector. The progressive organization
of the weights according to the topology of the input space is depicted in Figure 6.17.
At the beginning the weight vectors are randomly placed. However, after about 300
cycles one can see that the map has begun to organize itself and, as time passes and the
neighborhood shrinks, the map is more and more evenly distributed across the input
space. Finally, after 1000 cycles, the net has become ordered with the correct topology.
Since the input vectors covered the square uniformly, the final map is correspondingly
an almost regular grid in this domain. In this sense it can be said that the map has been
able to “learn” the topology of the input space.
Feature maps have been applied in many areas including speech recognition, motor
control, robot sensing, finance, data compression, and combinatorial optimization.
Figure 6.17: A 6 × 5 planar array of units mapping input vectors uniformly distributed
in the interval [-1,1], shown in weight space (horizontal axis W(i,1), vertical axis W(i,2),
both from -1 to 1). (a) After 100 iterations. (b) After 300 iterations. (c) After
500 iterations. (d) Final configuration after 1000 iterations. These images have been
obtained with the Matlab software.
6.8 Fault Tolerance
Soft-computing methodologies enjoy a certain amount of robustness in the face of
errors, and this aspect is one of their strong points when applied to noisy and imprecise
environments. Although ANNs are inspired by the nervous system, we have seen that
they are but a pale imitation of actual biological neural networks. However, they retain
some of the outstanding features of the latter such as collective computation abilities
and resilience with respect to noise and errors. The fault-tolerant aspects of neural
networks are all the more appreciable if one considers how faults may affect classical
computing devices. Many computational procedures are strictly deterministic: given
certain input values, they will always output the same results, using the same state
transitions. This fact is comforting but what happens if errors appear somewhere?
Change a single bit in that lovely fast sorting program and chances are that it won’t
work anymore. The fact is that engineered systems, as well as classical algorithms, are
built in such a way that errors always have serious or fatal consequences, unless much
redundancy is built into the system. Thus, the standard engineering design process
produces systems that are optimized in some sense but that are also very fragile.
In contrast, the human brain, in spite of its enormous complexity, is much more
reliable and less prone to catastrophic failure. Individual neurons are constantly dying
without being replaced, but the brain manages to compensate for this loss over a
lifetime, unless the damage is serious. Natural systems have not been designed in
the customary engineering way; rather, they are the result of the evolutionary process.
As such, they have been able to survive in all kinds of hostile environments by select-
ing and reproducing the structures that were the fittest at a given moment. At the same
time, new structures were being created continuously by mutation and recombination
and tried in new environments. These largely random processes have been remarkably
successful, although they have taken millions of years. The human brain is as much
the product of evolution as it is of learning. In fact, the detailed “wiring” of the brain
is too complex to be determined by the genetic code alone. Together with nerve cell
maturation and morphogenesis, learning shapes and fine-tunes the cell connections, al-
lowing the brain to become the marvelous machine that allows us to function properly
in a continuously changing and complex environment. We will see that ANNs too can
be designed by similar processes of evolution and learning, but these processes will
obviously be artificial and usually simulated by a computer program. Artificial neural
networks too can withstand a certain number of errors, and units may fail without
necessarily bringing the whole system to a grinding halt. This property is called graceful
degradation and is obviously a very desirable property for machines as well as for humans.
Most of the ANN systems that have been described here are not very sensitive
to small amounts of noise or to the loss of a unit here and there. Weight noise and unit
misfiring have even been used purposefully during ANN training in order to enhance
the fault-tolerance properties of the nets [6]. If the artificial neural system is a physical
machine and not just a simulation program, this feature is even more useful, since
failures are possible and very real in this case.
6.9 Artificial Neural Nets and Statistics
In many fields such as social science, finance, and economics there is often a lot of
data, but explanatory theories and models are either lacking or not realistic
enough to guide prediction and decision. In data-rich but model-poor environments,
people normally use statistical inference techniques to build system understanding and
to estimate models. Parametric and non-parametric, linear and non-linear regression
models, such as autoregressive moving average (ARMA) models, are well-known examples
of statistical inference methods. Non-linear, non-parametric methods in particular
make few a priori assumptions about the underlying system and, as a consequence,
have a considerable number of degrees of freedom and explore a large space of functions
to fit the observations.
It is clear that there are close links between neural network models and these statistical
analysis techniques. Table 6.1 gives a minimal glossary of terms describing
similar concepts in the two fields.
Neural Networks          Statistics
learning                 model estimation
supervised learning      non-linear regression
unsupervised learning    cluster analysis
weights                  parameters
inputs                   independent variables
outputs                  dependent variables
Table 6.1: Artificial neural networks and statistics: glossary of corresponding terms.
Indeed, from the formal statistical point of view, ANN models such as supervised
feedforward nets can be seen as non-linear regression models [4], while other models
are similar or identical to clustering, discriminant analysis techniques, or to stochastic
approximation theory. Thus, the question is: are these ANN models really new and
what do they have to offer with respect to established statistical methods?
The answer (see [4] and the comments by other authors therein) is not entirely
clear. As noted, ANN formalisms can for the most part be reduced to some known
statistical model. But ANNs are certainly “sexier” and more attractive
than equivalent statistical techniques for many non-specialists. Also, statisticians
have concentrated for the most part on linear models and on models with comparatively
few parameters. Neural networks provide us with tractable multivariate
non-linear models that are easy to implement and that enjoy suggestive pictorial
interpretations, avoiding the somewhat “dry” aspects of an equivalent statistical approach.
There are other advantages as well: they are simple to represent
and can easily be tuned to particular problems. Besides,
they can be implemented in hardware, with the obvious performance gains due to
direct execution and their intrinsic parallelism. Another advantage is that ANNs can
be used as “modules” in hybrid systems. Statistics, on the other hand, may give ANN
researchers a firm footing for evaluating and analysing new ANN techniques and for
somewhat relieving the lack of explanatory power of ANNs. Neural networks and
statistics should not be seen as competing methods. Combinations of ANN and purely
statistical methodologies are of great interest and are being actively pursued [4].
Acknowledgment.
Many of the figures have been provided by my colleague A.
Pérez, who also kindly read the manuscript.
6.10 Bibliography
[1] T. Ankenbrand and M. Tomassini. Multivariate time series modelling of financial
markets with artificial neural networks. In D. W. Pearson, N. C. Steele, and
F. Albrecht, editors, Proceedings of International Conference on Artificial Neural
Networks and Genetic Algorithms (ICANNGA95), pages 257–260, Vienna, 1995.
Springer-Verlag KG.
[2] T. Ankenbrand and M. Tomassini. Predicting multivariate financial time series
using neural networks: the Swiss bond case. In Proceedings of the IEEE/IAFE
Conference on Computational Intelligence for Financial Engineering, pages 27–
33, New York, 1996. IEEE.
[3] T. Ash and G. Cottrell. Topology-modifying neural network algorithms. In
Michael A. Arbib, editor, Handbook of Brain Theory and Neural Networks, pages
990–993. MIT Press, 1995.
[4] B. Cheng and D. M. Titterington. Neural networks: a review from a statistical
perspective. Statistical Science, 9:2–54, 1994.
[5] S. Cho. Neural-network classifiers for recognizing totally unconstrained hand-
written numerals. IEEE Transactions on Neural Networks, 8:43–53, 1997.
[6] J. D. Cowan. Fault tolerance. In Michael A. Arbib, editor, Handbook of Brain
Theory and Neural Networks, pages 390–395. MIT Press, 1995.
[7] Y. Le Cun, B. Boser, J. Denker, D. Henderson, R. Howard, W. Hubbard, and
L. Jackel. Handwritten digit recognition with a backpropagation network. In
D. S. Touretzky, editor, Advances in Neural Information Processing Systems 2,
pages 396–404, San Mateo, CA, 1990. Morgan Kaufmann.
[8] A.-P. Refenes (Ed.). Neural Networks in the Capital Markets. John Wiley, Chich-
ester, 1995.
[9] S.E. Fahlman and C. Lebiere. The Cascade-Correlation learning architecture. In
NIPS2, pages 524–532, 1990.
[10] P. D. Gader, J. M. Keller, R. Krishnapuram, J. Chiang, and M. Mohamed. Neural
and fuzzy methods in handwriting recognition. IEEE Computer, pages 79–86,
February 1997.
[11] K. Gurney. An introduction to neural networks. UCL Press, London, 1997.
[12] M. H. Hassoun. Fundamentals of artificial neural networks. MIT Press, Cam-
bridge, MA, 1995.
[13] D. O. Hebb. The Organization of Behavior. Wiley, New York, 1949.
[14] J. Hertz, A. Krogh, and R. G. Palmer. Introduction to the theory of neural com-
putation. SFI Studies in the Sciences of Complexity. Addison-Wesley, Redwood
City, CA, 1991.
[15] J. Hopfield. Neural networks and physical systems with emergent collective computational
abilities. Proceedings of the National Academy of Sciences USA,
79:2554–2558, 1982.
[16] T. Kohonen. Self-Organizing Maps, volume 30. Springer Series in Information
Sciences, April 1995.
[17] W. S. McCulloch and W. Pitts. A logical calculus of the ideas immanent in nervous
activity. Bulletin of Mathematical Biophysics, 5:115–133, 1943.
[18] M. L. Minsky. Computation: Finite and Infinite Machines. Prentice-Hall, Engle-
wood Cliffs, New Jersey, 1967.
[19] E. Oja. Principal component analysis. In Michael A. Arbib, editor, Handbook of
Brain Theory and Neural Networks, pages 753–756. MIT Press, 1995.
[20] R. Rojas. Theorie der Neuronalen Netze. Springer, Heidelberg, 1993.
[21] F. Rosenblatt. Principles of Neurodynamics: Perceptrons and the Theory of Brain
Mechanisms. Spartan Books, Washington D.C., 1962.
[22] C. von der Malsburg. Self-organization of orientation sensitive cells in the striate
cortex. Kybernetik, 14:85–100, 1973.
[23] A. S. Weigend and N. A. Gershenfeld (Eds.). Time Series Prediction: Forecasting
the Future and Understanding the Past. Addison-Wesley, Reading, MA,
1994.
[24] P. J. Werbos. The Roots of Backpropagation: From Ordered Derivatives to Neural
Networks and Political Forecasting. John Wiley and Sons, New York, 1994.
[25] B. Widrow and M. Lehr. Perceptrons, Adalines, and Backpropagation. In
Michael A. Arbib, editor, Handbook of Brain Theory and Neural Networks, pages
719–724. MIT Press, 1995.
Chapter 7
Evolutionary Design of Artificial Neural Networks
7.1 Introduction
We saw that artificial neural networks (ANN) are biologically-inspired computational
models that have the capability of somehow “learning” or “self-organizing” to ac-
complish a given task. They are particularly efficient when the nature of the task is
ill-defined and the input/output mapping largely unknown. However, many aspects
may affect the performance of an ANN on a given problem. Among them, the most
important are the structure of the neuron connections (i.e., the topology of the net), the
connection weights, the details of the learning rules and of the neural activation function,
and the data sets to be used for learning. There are guidelines for picking or
finding reasonable values for all of these network parameters, but most are rules of
thumb with little theoretical background and without any relationship to each other.
Artificial evolution can be used in conjunction with neural networks and it has
the potential for addressing several current network design problems. A bio-inspired
rationale for this approach comes from the study of nervous systems. Today we know
that the brain is as much a product of evolution as it is of development and learning. Its
overall structure is determined by the information stored in the genotype (the DNA);
during development, which is a mapping from genotype to phenotype, this information
gives rise to the actual material structure of the brain. But this process is strictly
intertwined with cell and connection modifications determined by learning. Thus, by
using artificial evolutionary techniques together with the adaptive capabilities of ANN,
we can somehow “imitate” an abstract form of the natural processes that give rise to the
mature nervous system. Automatically designing ANNs through artificial evolution
has advantages over manual design as the complexity of the ANN increases. The
evolutionary engineering approach is a more integrated and rational way of designing
ANNs since it allows single aspects of the design to be taken into account as well
as several interacting aspects at once and does not always require expert knowledge.
Thus, artificial evolution is the only known way of designing ANNs that covers all
aspects of the design problem as an integrated whole.
Evolutionary algorithms can be applied to neural networks in several ways. The
most important are: setting the weights in a fixed topology network, determining net-
work topologies, evolving learning rules, and input feature selection. Several systems
allow the simultaneous evolution of the architecture and the weights and even of the
node transfer function. This field of research is only ten years old and has not yet been
systematized. In the following sections we will describe these applications using the
recently published literature. Two good review articles on the subject are [40,37].
7.2 Evolving Weights in a Predefined Network Architecture
Let us consider again feedforward, multilayer neural networks. It is well-known that
the standard algorithm for finding suitable weights in the supervised learning regime is
backpropagation. However, backpropagation has a number of problems, one of which
is the tendency to get stuck at local optima in weight space. Stochastic training meth-
ods could be a good alternative to gradient-based ones for network training. Global op-
timization heuristics, such as simulated annealing and evolutionary algorithms should
be more effective for finding suitable weights in multi-modal, non-convex complex
search spaces, since they are less prone to get trapped in local optimum points. Here
we will concentrate on genetic algorithms, another advantage of which is that they also
work for nondifferentiable activation functions such as threshold logic units and for
recurrent and arbitrarily interconnected networks, which pose problems for gradient-
based techniques. In fact, rather than adapting weights based on local improvement
only, GAs evolve weights based on the whole network fitness. Early work in the field
has been done by Montana and Davis [27] and by Whitley and Hanson [38].
The GA chromosome representing the network can be a list of weights. Usually,
each weight in the list would be represented as a binary string to be decoded into real
values between, say, −1 and +1. However, binary encoding has a couple of drawbacks
for this application (and for many others as well). As the number of connections
increases, and it is not uncommon for ANNs to possess hundreds of them, the length
of the string increases to a size that severely slows down the evolutionary process.
Besides poor scalability, if higher precision is needed in the weight values, more bits
have to be added to their binary representation, slowing the search even further. To
relieve this problem, real-coded strings can be used instead to directly represent the
weight values of a given network. Real-valued chromosomes have many advantages
but require specialized genetic operators. Montana and Davis [27] chose this kind
of representation. They also needed a convention for the ordering of weights in the
chromosome: successive weights in the list were assigned to an individual network
from top to bottom and left to right. Figure 7.1 depicts this encoding for a small
example network (networks with 126 connection weights were used in [27]).
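As an illustration of such a real-coded representation, the sketch below decodes a flat list of weights into one weight matrix per layer. The exact layout of the example network is not recoverable here, so a 2-2-1 layered net (six weights) is assumed, and the ordering convention (all weights entering one unit are adjacent, units taken layer by layer) is also an assumption chosen to mirror the description above. The function name `decode` is hypothetical.

```python
def decode(chromosome, layer_sizes):
    """Split a flat list of weights into one weight matrix per layer.

    weights[k][j][i] is the weight from unit i of layer k to unit j of layer k+1.
    Weights entering the same unit are adjacent in the chromosome (assumed order).
    """
    expected = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    if len(chromosome) != expected:
        raise ValueError(f"expected {expected} genes, got {len(chromosome)}")
    weights, pos = [], 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        layer = []
        for _ in range(n_out):
            layer.append(list(chromosome[pos:pos + n_in]))
            pos += n_in
        weights.append(layer)
    return weights

individual = [0.2, -0.3, 0.6, -0.5, 0.4, 0.7]   # the example string of Figure 7.1
net = decode(individual, [2, 2, 1])              # assumed 2-2-1 layout
print(net)   # [[[0.2, -0.3], [0.6, -0.5]], [[0.4, 0.7]]]
```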
[Figure: a small feedforward network whose connections carry the weights listed below.]
Individual: (0.2, -0.3, 0.6, -0.5, 0.4, 0.7)
Figure 7.1: GA individual encoding. The weights are represented by a list of real
numbers in a predefined order (see text).
After decoding an individual into the corresponding network, its fitness is calculated
by computing the cumulated error on the training data. Mutation consists of
randomly choosing a non-input neuron and altering the weights of its incoming connections by random values. The
recombination operator takes two parent weight vector strings and constructs a single
offspring by selecting at random, for each connection entering a non-input unit, the
weight of one of the two parents. Figure 7.2 illustrates this strategy, which is a kind
of uniform crossover. Putting the weights of the connections to the same hidden or
output node together in the individual representation is beneficial from the point of
view of the recombination operator, since otherwise useful evolved feature detectors
might systematically be destroyed by crossover. The whole evolutionary process can
be described by the following pseudo-code:
generation = 0
Assign random weight vectors to the initial population of networks
while not termination condition do
    generation = generation + 1
    For each genotype, construct the corresponding network
    Compute the fitness of each network on the training data
    Select and reproduce networks according to fitness
    Recombine and mutate selected networks
end while
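The loop above can be sketched in runnable form as follows. This is a toy illustration under stated assumptions, not Montana and Davis's implementation: the task (XOR), the 2-2-1 network with biases, the tanh units, tournament selection, elitism, and all parameter values are choices made only for the example. Genes are ordered so that the weights entering the same unit are adjacent.

```python
import math
import random

TRAIN = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]   # XOR, a toy task
N_GENES = 9   # 2-2-1 net: two hidden units and one output unit, 2 weights + bias each

def forward(genes, x):
    """Output of the fixed 2-2-1 network decoded from the flat gene list."""
    h1 = math.tanh(genes[0] * x[0] + genes[1] * x[1] + genes[2])
    h2 = math.tanh(genes[3] * x[0] + genes[4] * x[1] + genes[5])
    return math.tanh(genes[6] * h1 + genes[7] * h2 + genes[8])

def error(genes):
    """Cumulated squared error on the training data (lower is better)."""
    return sum((forward(genes, x) - y) ** 2 for x, y in TRAIN)

def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from a randomly chosen parent."""
    return [rng.choice(pair) for pair in zip(a, b)]

def mutate(genes, rng, scale=0.5):
    """Add a random value to one randomly chosen gene."""
    child = list(genes)
    child[rng.randrange(len(child))] += rng.uniform(-scale, scale)
    return child

def evolve(pop_size=30, generations=200, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(N_GENES)] for _ in range(pop_size)]
    history = [min(error(g) for g in pop)]
    for _ in range(generations):
        ranked = sorted(pop, key=error)
        new_pop = [ranked[0]]                    # elitism: keep the best network
        while len(new_pop) < pop_size:
            # Tournament selection of two parents according to fitness.
            p1 = min(rng.sample(pop, 3), key=error)
            p2 = min(rng.sample(pop, 3), key=error)
            new_pop.append(mutate(crossover(p1, p2, rng), rng))
        pop = new_pop
        history.append(min(error(g) for g in pop))
    return min(pop, key=error), history

best, history = evolve()
print(f"initial best error {history[0]:.3f}, final best error {history[-1]:.3f}")
```

Because the best individual is always copied unchanged into the next generation, the best error in `history` can never increase; whether it reaches zero depends on the run.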
a) (0.2, -0.3, 0.6, -0.5, 0.4, 0.7)
b) (-0.2, 0.4, 0.8, -0.7, 0.1, -0.3)
c) (-0.2, -0.3, 0.6, -0.7, 0.1, -0.3)
Figure 7.2: Montana and Davis’s crossover. Weights for the connections of a single
offspring c) are chosen at random from either parent a) or b) (see text).
Figure 7.3 is a schematic illustration of the relationships between the population
of weight vector strings and the trained fixed architectures that correspond to this particular
encoding. With this technique, Montana and Davis obtained good results on
a problem of classification of sonar data as compared to standard backpropagation.
This need not always be the case. It has been found that genetic algorithms are rather
slow with respect to fast versions of gradient-based methods for supervised applications.
When there is gradient information, hybrid methods may be effective. Indeed,
evolutionary training can be sped up by incorporating a local
gradient-descent operator. GAs are usually good at locating promising broad regions,
while local search may add the needed fine-tuning ability. This process can be seen
as a synergistic interaction between evolution and learning [19,5] and has given better
results than using GAs alone on some applications.
Another factor that makes the GA not entirely suitable for training feedforward
nets is the so-called permutation problem, a topological phenomenon that limits the
power of the standard GA crossover operator. The problem stems from
the fact that different linear genotypes may give rise to functionally equivalent networks.
This can be seen for example in Figure 7.4. In general, any permutation of
the hidden nodes will produce equivalent networks in terms of network function and,
of course, also in terms of the fitness measure. GAs that use standard crossover are
likely to create offspring that contain repeated components. That is, because the strings
of weights (genotypes) are ordered differently, crossover between such individuals is
likely to produce an offspring that contains multiple copies of the same hidden node
and omits other hidden nodes. Highly fit children are thus difficult to obtain, since
the offspring network will be computationally less able than either parent because useful
feature detectors associated with hidden nodes during evolution will be lost.
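A small numerical illustration of the permutation problem follows. The 2-2-1 network without biases, the weight values, and the choice of one-point crossover are assumptions made only for this example.

```python
import math

def forward(genes, x):
    """2-2-1 net: genes = (h1 weights, h2 weights, output weights)."""
    h1 = math.tanh(genes[0] * x[0] + genes[1] * x[1])
    h2 = math.tanh(genes[2] * x[0] + genes[3] * x[1])
    return genes[4] * h1 + genes[5] * h2

def swap_hidden(genes):
    """Same network with hidden units 1 and 2 listed in the opposite order."""
    return [genes[2], genes[3], genes[0], genes[1], genes[5], genes[4]]

parent_a = [0.2, -0.3, 0.6, -0.5, 0.4, 0.7]
parent_b = swap_hidden(parent_a)          # different genotype, same function

inputs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (-0.5, 2.0)]
same_function = all(math.isclose(forward(parent_a, x), forward(parent_b, x))
                    for x in inputs)
print("genotypes differ:", parent_a != parent_b, "| same outputs:", same_function)

# One-point crossover after the first hidden unit's incoming weights:
# the child inherits the same hidden unit twice and loses the other one.
child = parent_a[:2] + parent_b[2:]
print("child:", child)
```

Here the child's two hidden units carry identical incoming weights, so the feature detector encoded by the other hidden unit of the parents is lost.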
[Figure: strings of weights (w1, w2, w3, w4, w5, w6) are decoded into networks; training yields fitness values that drive the evolution operators.]
Figure 7.3: Illustration of the artificial evolution of a population of chromosomes encoding
the connection weights of a fixed network architecture. The left part of the
figure shows the population of strings coding for the weights on which the evolutionary
operators are applied. The right part depicts the decoded networks that are subject
to fitness evaluation.
Since crossover seems to be at the root of several difficulties when evolving weights
for fixed structure networks, some researchers have proposed the use of other evolutionary
algorithms that do not rely on recombination of individuals. Evolutionary
programming (EP) is such a technique. The sole source of variation in EP is a sophisticated
form of mutation which, by its local character, ensures that individual modifications
will not be disruptive. By avoiding crossover, feature detectors are conserved
and each network keeps its individual functionality with smooth variation. Fogel et
al. [16] applied EP to the evolution of connection weights by training a population of
networks on a set of standard problems, and several subsequent studies demonstrated
the viability of the approach. In all these works the mutation operator randomly alters
the weights according to a Gaussian distribution with variance proportional to some
measure of the network error on the task. Evolution strategies, being similar to evolutionary
programming in their emphasis on advanced mutation operators, are likewise
useful for the weight search problem.
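The error-scaled Gaussian mutation described above can be sketched as follows. The proportionality constant `alpha` and the function name `ep_mutate` are hypothetical; the cited works may define the error measure and the scaling differently.

```python
import random

def ep_mutate(weights, network_error, rng, alpha=0.1):
    """Perturb every weight with Gaussian noise of variance alpha * network_error.

    A poorly performing network (large error) is changed a lot;
    a network with zero error is left unchanged.
    """
    sigma = (alpha * network_error) ** 0.5
    return [w + rng.gauss(0.0, sigma) for w in weights]

rng = random.Random(0)
parent = [0.2, -0.3, 0.6, -0.5, 0.4, 0.7]
print(ep_mutate(parent, network_error=0.05, rng=rng))   # small changes
print(ep_mutate(parent, network_error=2.0, rng=rng))    # larger changes
```

Since no recombination is involved, each offspring is a perturbed copy of a single parent, which is how this scheme keeps existing feature detectors largely intact.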
7.2.1 Genetic Algorithms and Reinforcement Learning Networks
Figure 7.4: Illustration of the permutation problem. The networks are topologically
equivalent but their encodings differ.
While the usefulness of evolutionary algorithms is questionable for setting weights in
cases where established and fast methods already exist, they can be effective when
gradient-based methods are not appropriate or cannot be applied. This is the case
for reinforcement learning. To understand the principles, consider an ideal neural
network-based robot that can learn. It has sensors to perceive the state of its environment
and it can perform a set of actions through its neurocontrol system, such as
avoiding obstacles, picking up objects, recharging its batteries, and so on. Such a sensorimotor
agent may learn to control its behavior by experimenting in its environment.
The robot performs certain actions as a function of the sensory information it gathers
along the way. The learning system gives a reinforcement signal: either a positive reward,
if the action is considered beneficial, or a punishment, if the action decreases
the fitness of the agent with respect to the environment. The basic principle is that a
GA evolves the synaptic weights such that a positive response causes strengthening of
active connections, while a negative response weakens them. Genetic algorithms can
be used here even in the absence of precise target output values because they can rely
on a relative performance measure for each set of weights. Genetic algorithms have
been successfully applied in finding good weights for neurocontrol problems such as
the inverted pendulum or some autonomous robot learning tasks [14]. The case study
at the end of this document deals with the evolution of autonomous adaptive robots
in detail, and the reader is referred to that section for an in-depth discussion. Reinforcement
learning is described, for example, in an article by Barto [4]. A discussion
of evolutionary algorithms in the context of reinforcement learning networks can be
found in [37].
7.3 Evolving Network Architectures
Selection of a suitable network structure is a very important step towards the success
of an ANN approach to a given task. If there are too many degrees of freedom, overfit-
ting and poor generalization may result. Thus, a compromise must be found between
providing sufficient degrees of freedom for the problem to be well represented and
the generalization ability on the task. Choosing a suitable topology implies determin-
ing the appropriate number of nodes and their interconnection patterns. Practitioners
usually design the network architecture in an unsystematic manner by guesswork and
trial and error. But it is true that there are also more refined constructive or pruning
heuristics. Overall, these methods approach the network construction problem in a
constrained manner: the space of possible architectures, which is enormous and struc-
turally complex, is searched in a very limited way, slightly changing a given archi-
tecture through predefined structural modifications. Clearly, evolutionary computation
with its global search capabilities of multimodal, discontinuous spaces is a promising
methodology for the architecture induction problem. There are two major ways in
which EAs have been used for searching network topologies: either all aspects of a
network architecture are encoded into an individual or a compressed description of the
network is evolved. In the first case we speak of direct encoding schemes, while the
latter leads to so-called grammatical, morphogenetical, or simply indirect encodings.
7.3.1 Direct Encoding
In this scheme, the connection topology is represented by means of an adjacency matrix;
that is, an N-node architecture is represented by an N × N matrix A, where a_ij = 1
means that there is a connection from unit i to unit j and a_ij = 0 stands for no connection.
An individual in a population of architectures is simply the string resulting
from the concatenation of successive rows of the matrix. This encoding is depicted in
Figure 7.5. Actually, the entries a_ij may also represent connection weights, in which
case both the architecture and the weights can be evolved at the same time. There are
several examples of work done with this kind of representation; see for instance references
[26,39]. This encoding is easy to understand and to implement. For example,
setting architectural constraints on the network types that will be evolved, such as strict
feedforward nets or arbitrarily connected recurrent ones, is straightforward.
[Figure: a six-unit network with units 1, 2, 3 in the bottom layer, units 4, 5 in the middle layer and unit 6 at the top, shown next to its connectivity matrix.]
    1 2 3 4 5 6
1   0 0 0 1 1 0
2   0 0 0 1 0 1
3   0 0 0 0 1 0
4   0 0 0 0 0 1
5   0 0 0 0 0 1
6   0 0 0 0 0 0
Figure 7.5: A direct encoding representation scheme. The left part of the figure represents
the network architecture, while the right part of the figure shows the corresponding
connectivity or adjacency matrix. A 1 at the crosspoint between row i and column
j means that there is a connection going from unit i to unit j. A 0 entry stands for no
connection between the corresponding units.
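The direct encoding of the matrix of Figure 7.5 can be sketched as follows. The helper names are hypothetical, and the feedforward test relies on the assumption that units are numbered from the input side to the output side, so that a strictly upper-triangular matrix contains no feedback connections.

```python
ADJACENCY = [            # connectivity matrix of Figure 7.5 (row i, column j: i -> j)
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
]

def encode(matrix):
    """Concatenate successive rows of the matrix into one bit string."""
    return "".join(str(bit) for row in matrix for bit in row)

def decode(chromosome, n):
    """Rebuild the n x n connectivity matrix from the bit string."""
    return [[int(chromosome[i * n + j]) for j in range(n)] for i in range(n)]

def is_feedforward(matrix):
    """True if all entries on and below the diagonal are 0 (assumed unit numbering)."""
    return all(matrix[i][j] == 0 for i in range(len(matrix)) for j in range(i + 1))

def connections(matrix):
    """List of (from_unit, to_unit) pairs, with units numbered from 1."""
    return [(i + 1, j + 1) for i, row in enumerate(matrix)
            for j, bit in enumerate(row) if bit]

chromosome = encode(ADJACENCY)
print(chromosome)                     # 36 bits for a 6-unit network
print(connections(decode(chromosome, 6)))
print(is_feedforward(ADJACENCY))
```

A feedforward constraint can thus be enforced simply by evolving only the entries above the diagonal, which also shortens the chromosome.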
The evolutionary algorithm works according to the following pseudocode:
generation = 0
Initialise the population of individuals
while not termination condition do
    generation = generation + 1
    For each individual, decode its representation into
        the corresponding architecture
    Compute the fitness of each architecture by training
        it on the training data
    Select and reproduce networks according to fitness
    Recombine and mutate selected networks
end while
Figure 7.6 schematically depicts the relationships between genetic evolution of the
population of genotypes and the decoding and training process of the actual networks
(phenotypes).
[Figure: chromosome strings are decoded into architectures; training yields fitness values that drive the evolution operators.]
Figure 7.6: In the direct encoding scheme, architectures are fully described by their
chromosome string representation. Decoded architectures are trained by some form
of supervised learning in order to attribute a fitness value to each one of them. The
evolutionary algorithm selects and evolves encoded architectures based on these fitness
values.
The genetic operators can be implemented in several ways. For example, Miller
et al. [26] used low-probability bit mutation, and crossover was done by swapping a
randomly chosen row between the two selected parents. Once decoded into a network,
each individual’s fitness is calculated by training it with some variant of backpropagation
learning, as usual. However, it is sometimes useful to include network complexity measures,
such as the number of nodes and connections, in the fitness, in order to create selective
pressure towards smaller networks. Training time is another fitness criterion that
has often been used. We mention here in passing that there is practically no limitation
as to how the network’s fitness should be defined, since continuity and differentiability
are not required.
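The operators just described can be sketched on connectivity matrices as follows. This is an illustration in the spirit of the row-swap crossover and bit mutation mentioned above, not a reproduction of the cited implementation; the function names, the mutation probability, and the linear form of the complexity penalty are assumptions.

```python
import random

def row_swap_crossover(parent_a, parent_b, rng):
    """Swap one randomly chosen row between two connectivity matrices."""
    row = rng.randrange(len(parent_a))
    child_a = [list(r) for r in parent_a]
    child_b = [list(r) for r in parent_b]
    child_a[row], child_b[row] = child_b[row], child_a[row]
    return child_a, child_b

def bit_mutation(matrix, rng, p=0.01):
    """Flip each connectivity bit independently with low probability p."""
    return [[1 - bit if rng.random() < p else bit for bit in row]
            for row in matrix]

def penalized_fitness(training_error, matrix, penalty=0.01):
    """Higher is better: negative error minus a cost per connection (assumed form)."""
    n_connections = sum(sum(row) for row in matrix)
    return -training_error - penalty * n_connections

rng = random.Random(0)
a = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
b = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]
child_a, child_b = row_swap_crossover(a, b, rng)
print(child_a, child_b)
print(bit_mutation(child_a, rng, p=0.1))
print(penalized_fitness(0.5, a))
```

Here `training_error` stands for whatever error the decoded network reaches after training; the penalty term is what creates the selective pressure towards smaller networks.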
But the direct encoding scheme also has a major drawback: it does not scale well
since an N-node network potentially has on the order of N^2 connections, leading to
very long chromosomes. Although the size of the matrices can be reduced by using
constraints coming from previous knowledge on the architectures to be evolved, the
scaling properties of the representation are poor. The permutation problem that was
described in the previous section is still there and, moreover, training a whole popula-
tion of networks by backpropagation or similar methods can be extremely slow. Direct
encoding is thus only useful for small architectures.
As we noted at the beginning of the section, there is the possibility of evolving the
architecture and the weights of the network at the same time [7,2]. In this context,
the work of Angeline et al. [2] is interesting because it avoids some of the problems
related to recombination and can evolve a wide spectrum of ANN architectures. In
[2] Evolutionary Programming (EP) was used. In EP, the representation evaluated by
the fitness function is directly submitted to a single evolutionary operator: mutation. A
form of structural mutation is used for architecture evolution and, at the same time, the
connection weights are also altered by parametric mutations. Parametric mutations are
similar to those described in the previous section on the evolution of weights in a fixed
architecture [16]. Structural mutations alter the number of hidden nodes and the con-
nectivity between all nodes, with some restrictions concerning input and output nodes.
Structural mutations attempt to preserve the behavior of a network by making changes
as smooth as possible. This is different from genetic algorithms where recombination
may strongly disrupt the continuity between parent and child, often destroying useful
building blocks. The methodology has been demonstrated on a number of test cases
with good results and little restriction on the architecture typology.
Another effort in this direction is the recent work of Yao and Liu [42] in which the
evolutionary system called EP-Net is proposed. EP-Net is based on evolutionary pro-
gramming with several different sophisticated mutation operators. The operators are
designed in such a way as to maintain close links between parents and their offspring.
EP-Net tries to evolve the behavior of the ANN and simultaneously determines the
architecture and the weights, including biases. Although an explicit measure of parsi-
mony is not used in EP-Net, the system favors simpler and more effective networks by
preferring node or connection deletion rather than node or connection addition, if the
performance is not lowered by the move. The system has been tested on a number of
standard benchmark problems with good results.
The same authors recently proposed making explicit use of population information
[43]. Instead of just picking the best ANN in the last generation as the final result,
Yao and Liu suggest forming the result by combining in a certain way the individuals
in the last generation in order to exploit all the information contained in the whole
population. The idea is that a population contains more information than any isolated
individual and that such information can be suitably used to improve generalization of
the learning system. The approach has been tested on real-world data sets, showing
that there are simple linear combination methods of individuals in the last generation
that always produce results outperforming the best individual.
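As an illustration of what a linear combination of the last generation may look like, the following Python sketch forms a weighted sum of the outputs of all networks for one input pattern. The two weighting schemes shown (uniform averaging and rank-proportional weights) and the function names are assumptions chosen for simplicity; they are not claimed to be the exact methods of [43].

```python
def combine(outputs, weights=None):
    """Weighted linear combination of the output vectors produced by the
    networks of the last generation for a single input pattern.
    With no weights given, a simple average is used."""
    if weights is None:
        weights = [1.0 / len(outputs)] * len(outputs)
    if len(weights) != len(outputs):
        raise ValueError("one weight per network is required")
    return [sum(w * out[k] for w, out in zip(weights, outputs))
            for k in range(len(outputs[0]))]

def rank_weights(errors):
    """Weights proportional to rank: the network with the lowest error
    gets the largest weight; weights sum to 1 (assumed scheme)."""
    n = len(errors)
    worst_first = sorted(range(n), key=lambda i: errors[i], reverse=True)
    total = n * (n + 1) / 2
    weights = [0.0] * n
    for rank, i in enumerate(worst_first, start=1):
        weights[i] = rank / total
    return weights
```

For example, `combine(outputs, rank_weights(errors))` lets every individual contribute while still favoring the fitter networks.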
Genetic programming has been used by Koza as a flexible system for finding both
the architecture and the weights of a neural network [21]. Koza and coworkers used
trees as building blocks for representing neural networks. Although they were able
to evolve neural networks for solving a few standard tasks, the tree representation ap-
peared not to be flexible enough for evolving general purpose ANNs. In the next sec-
tion we will see other more useful ways of using genetic programming-like techniques
for discovering neural network architectures and weights for a given task.
Most of the applications of evolutionary algorithms to date have been made on
multilayer feedforward or recurrent architectures. A notable exception is the work of
Polani and Uthmann [31], who applied a GA to find improved topologies for Kohonen’s
feature maps. In this study, GAs were applied to create self-organizing maps
able to adapt to a given input space without imposing a predefined topology. It was
found that supposedly optimal “flat” topologies do not always give the best results.
If these findings are confirmed, the technique could thus be useful for constructing
network topologies for Kohonen’s maps with better convergence speed and adaptation
properties.
7.3.2 Indirect Encoding
In view of the scalability problems brought about by direct encoding methods and
of their consequences in terms of performance, several researchers focused their ef-
forts on techniques for developing or growing neural networks, rather than looking
for a complete network description at the individual level. In general terms, these so-
called grammatical encodings are variations or extensions of Lindenmayer’s L-system
[22,32], a string-rewriting system which was originally introduced as a mathematical
model of plant development. The purported advantages of grammatical encodings are
better scalability and the possibility of finding building blocks of general utility and of
reusing developmental rules for general classes of problems. Moreover, the destructive
effect of crossover should be less relevant for developmental rules than for direct en-
codings. In a way, this approach is the closest in spirit to the growth and development
of biological neural networks, although at a much simpler and more abstract level.
The early work of Kitano [20] is representative of this methodology and will be
described in some detail. In Kitano’s methodology, a graph-generation grammar is
encoded into a chromosome. A standard GA is used to evolve a population of such
network-generating grammars, fitness being measured after “morphological develop-
ment”, whereby an actual ANN architecture is generated from its grammar description.
A grammar consists of one or more productions. A production is a rewriting rule
that associates a left-hand side called a head to a right-hand side called a body. The
two sides are separated by the metasymbol →. The symbol → indicates that the
category to the left of it can be composed of the category to the right; that is, it “can be
rewritten as” the category to the right. For example, the following productions define
a number grammatically:
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
number → digit
number → number digit
The symbols 0, ..., 9 are called terminals. Terminals cannot appear on the left-hand side
of a production. The symbol | stands for “or”: when applying the production, exactly
one of the alternatives is chosen. To construct a number, let us start with the third
production and substitute 2 for digit: this gives us number → number 2. Now rewrite
number on the right, using the same production again with 4 as digit: number →
number 42. Finally, use the second production with digit = 7 in place of number on
the right: number → 742. That is, the numeric string 742 has been generated. Since all
the symbols on the right are terminals, the generation is complete.
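The derivation of 742 can be replayed mechanically. The Python sketch below stores the three productions in a dictionary and applies a list of rewriting steps to the leftmost occurrence of each head; the names `GRAMMAR`, `rewrite`, `derive`, and `STEPS_742` are illustrative choices, not part of any standard library.

```python
# Productions: each head maps to the list of its alternative bodies.
GRAMMAR = {
    "number": [["digit"], ["number", "digit"]],
    "digit": [[d] for d in "0123456789"],
}

def rewrite(form, head, body):
    """Rewrite the leftmost occurrence of `head` in the sentential form."""
    if body not in GRAMMAR[head]:
        raise ValueError(f"{head} -> {' '.join(body)} is not a production")
    i = form.index(head)
    return form[:i] + body + form[i + 1:]

def derive(start, steps):
    """Apply the (head, body) steps in order; fail if non-terminals remain."""
    form = [start]
    for head, body in steps:
        form = rewrite(form, head, body)
    if any(symbol in GRAMMAR for symbol in form):
        raise ValueError("derivation incomplete: non-terminals remain")
    return "".join(form)

# The steps used in the text: 2 is fixed first, then 4, then 7.
STEPS_742 = [
    ("number", ["number", "digit"]), ("digit", ["2"]),
    ("number", ["number", "digit"]), ("digit", ["4"]),
    ("number", ["digit"]), ("digit", ["7"]),
]

print(derive("number", STEPS_742))  # prints 742
```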
Kitano applied the above idea to generate connection matrices for networks by
means of a set of graph-generation rules. Figure 7.7 shows an example of a graph-generation
grammar. Each production rule consists of a left-hand side, which is a non-terminal,
and a right-hand side, which is a 2 × 2 matrix. The entries of this matrix can be
either non-terminals or terminals; in the latter case the matrix is binary, i.e., its
elements can only be 0 or 1. By successively applying the productions from a
given start symbol one ends up with a binary array which can be interpreted as the
adjacency matrix of a directed graph representing a neural network. Since each non-
terminal symbol in the grammar has only one right-hand side, the generated network is
unique. The example shown in the Figure depicts the development of a neural network
which is able to compute the exclusive-or function if suitable weights are provided.
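The development process can be sketched in Python as repeated matrix rewriting. The rule table below is our reading of the grammar of Figure 7.7 (the lower-case rules in particular are reconstructed and should be treated as an assumption), and the function names are hypothetical. Interpreting only the entries above the diagonal as feedforward connections is likewise an assumption of this sketch.

```python
# Grammar as read from Figure 7.7 (reconstruction, see the note above).
RULES = {
    "S": [["A", "B"], ["C", "D"]],
    "A": [["c", "p"], ["a", "c"]],
    "B": [["a", "a"], ["a", "e"]],
    "C": [["a", "a"], ["a", "a"]],
    "D": [["a", "a"], ["a", "b"]],
    "a": [[0, 0], [0, 0]],
    "b": [[0, 0], [0, 1]],
    "c": [[1, 0], [0, 1]],
    "e": [[0, 1], [0, 1]],
    "p": [[1, 1], [1, 1]],
}

def rewrite(matrix):
    """Replace every symbol by its 2 x 2 body, doubling both dimensions."""
    result = []
    for row in matrix:
        top, bottom = [], []
        for symbol in row:
            body = RULES[symbol]
            top.extend(body[0])
            bottom.extend(body[1])
        result.extend([top, bottom])
    return result

def develop(start="S", steps=3):
    """Three rewriting steps turn the start symbol into an 8 x 8 binary matrix."""
    matrix = [[start]]
    for _ in range(steps):
        matrix = rewrite(matrix)
    return matrix

def feedforward_edges(adjacency):
    """Connections i -> j read from the entries above the diagonal."""
    n = len(adjacency)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if adjacency[i][j] == 1]
```

With this grammar, `feedforward_edges(develop())` yields the connections 0→2, 0→3, 1→2, 1→3, 2→7, and 3→7, that is, two inputs, two hidden nodes, and one output node, while nodes 4 to 6 remain unconnected.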
A GA was used for evolving a population of individuals consisting of a grammar
representation of networks. Each individual has a number of separate components
which correspond to the production rules of the grammar. Each rule is represented in
five positions. The first rule begins with the start symbol S and is followed by four
symbols chosen from the set A–Z. This is always required to get the development
process started. All the other rules have the left-hand side of the rule in the first
position, followed by four symbols chosen from the set a–p. The rules that transform a
symbol in a–p into a binary 2 × 2 matrix are fixed and do not appear in the individual
representation. Figure 7.8 graphically depicts the structure of the chromosome. The GA
used in [20] was a rather standard one, with fitness-proportionate reproduction, elitism,
variable mutation rate, and single- and multi-point crossover. The fitness of an individual
is evaluated by first building a network from the grammar encoded in the individual
and then training the network by backpropagation on a given task. As in other studies,
the sum of the squared errors was used as the quality criterion. As Yao and Liu
[42] indicated, this method of evaluating fitness is very noisy and inaccurate because
the fitness obtained depends on the random initial weights.
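The five-position layout of the rules suggests a simple decoding step before development. The following Python sketch splits a chromosome string into rules; the function name is hypothetical, and the policy of keeping the first rule when a head occurs more than once is an assumption made here so that each non-terminal has a single right-hand side.

```python
def decode_rules(chromosome):
    """Split a chromosome into five-symbol rules: one head followed by the
    four body symbols, laid out row by row in a 2 x 2 matrix."""
    if len(chromosome) % 5 != 0:
        raise ValueError("chromosome length must be a multiple of 5")
    rules = {}
    for i in range(0, len(chromosome), 5):
        head = chromosome[i]
        b = chromosome[i + 1:i + 5]
        # Assumption: if a head occurs twice, the first rule wins.
        rules.setdefault(head, [[b[0], b[1]], [b[2], b[3]]])
    return rules

rules = decode_rules("SABCDAcpacBaaaeCaaaaDaaab")
print(rules["A"])  # [['c', 'p'], ['a', 'c']]
```

The fixed rules for the symbols a–p would be added to this table before the matrix is developed, since they are not part of the individual.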
Kitano applied his grammatical approach to various sizes of encoder/decoder prob-
lems, comparing the results with those obtained from direct encoding methods. He
[Figure 7.7, image not reproduced.
Part (1), grammar: S → [A B; C D]; A → [c p; a c]; B → [a a; a e]; C → [a a; a a]; D → [a a; a b];
fixed rules: a → [0 0; 0 0], b → [0 0; 0 1], c → [1 0; 0 1], e → [0 1; 0 1], p → [1 1; 1 1].
Part (2), development: S → [A B; C D] → [c p a a; a c a e; a a a a; a a a b] → 8 × 8 binary matrix with rows
1 0 1 1 0 0 0 0
0 1 1 1 0 0 0 0
0 0 1 0 0 0 0 1
0 0 0 1 0 0 0 1
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1
and the resulting network with nodes 0, 1, 2, 3, and 7.]
Figure 7.7: Example of the use of graph-generation production rules for the develop-
ment of a feedforward exclusive-or network. The grammar is depicted in the upper
part (1). The lower part (2) of the figure shows the successive application of the gram-
mar rules starting from symbol S and producing a network for the computation of the
exclusive-or Boolean function. The figure is adapted from Kitano’s article [20].
[Figure 7.8, image not reproduced. The chromosome is shown as a sequence of symbols: ... S A B C D A c p a c ...]
Figure 7.8: Schematic illustration of a graph-generation grammar encoded into a GA
individual. The figure is adapted from Kitano’s article [20].
concluded that the grammar encoding method converges much faster, generates more
regular network structures, and scales better with respect to the network size. However,
Kitano’s test cases were very simple, and more recent work by Siddiqi and
Lucas [34] has shown that Kitano’s encoding scheme does not provide an advantage
over the direct encoding scheme. Although no general conclusions as to whether the
approach is actually superior can be drawn yet, grammatical encoding is an interest-
ing idea that has been further pursued by other researchers. F. Gruau, in particular,
developed a grammar based developmental technique called cellular encoding [17] in
which genetic programming is used to evolve the architecture of neural networks. Cel-
lular encoding is a powerful but rather complicated method that cannot be reasonably
explained succinctly. Reference [37] has a readable introductory description of the
technique. Using cellular encoding, Gruau and coworkers have been able to grow ar-
tificial neural network architectures and weights for various difficult tasks, including
recurrent network architectures (see [37] and references therein). At first the weights
could only be +1 or −1, that is, only Boolean networks could be developed. Later,
real-valued weights were included in the evolutionary process, as well as biases and
thresholds and a form of automatic function definition that allows reuse of subtrees
and results in more modular architectures.
7.4 Evolution of Learning Rules
An interesting application of evolutionary algorithms to the design of neural networks
is the discovery of learning rules for adjusting the connection weights. There exist
several standard learning rules. However, when there is little knowledge about the
most suitable architecture for a given problem, the automatic and possibly dynamic
adaptation of the learning rule becomes very useful. Evolutionary algorithms are em-
inently suited to the task. One of the first studies in this direction was conducted by
Chalmers [9]. In this work, Chalmers’s aim was to see if the well-known delta rule,
or a fitter variant, could be evolved automatically by a GA. A number of assumptions
were made at the outset in order to restrict the form of the learning algorithm to linear
functions of the relevant parameters i.e., the input, output, and target values, as well
as the weight change and a scale parameter similar to the learning rate constant. The
pairwise products of the parameters were also used in the linear combination. The
network architecture was feedforward with input and output layers only, which can
only learn mappings that are linearly separable. With a suitable chromosome encoding
and using a number of linearly separable mappings for training, Chalmers was able to
evolve a rule analogous to the delta rule, as well as some of its variants. Although this
study was limited to somewhat constrained network and parameter spaces, it paved the
way for further progress. It is still unknown whether artificial evolution can discover
efficient learning algorithms for complex networks. However, it is apparent that genetic
programming should prove particularly suitable for the task of discovering new rule
forms and, of course, coevolution of architectures, weights, and learning rules could
take place simultaneously, although the search space would be enormous in this case.
Reference [40] presents a summary of the work done in the field in the last few years.
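To make the idea of an evolvable learning rule concrete, here is a minimal Python sketch of a rule family of the kind described above: the weight change is a scale factor times a linear combination of local quantities and their pairwise products, and the coefficient vector is what a GA would evolve. The sketch assumes that the four local quantities are the current weight w, the input a, the output o, and the target t; the coefficient layout, the sigmoid output unit, the OR task, and all names are illustrative assumptions rather than the exact setup of [9].

```python
import math

def weight_change(k, w, a, o, t):
    """k[0] is the scale; k[1:] weighs the four local quantities and
    their six pairwise products, in the order listed in `terms`."""
    terms = [w, a, o, t, w * a, w * o, w * t, a * o, a * t, o * t]
    return k[0] * sum(c * x for c, x in zip(k[1:], terms))

# Delta rule, dw = eta * a * (t - o): coefficient -1 on a*o and +1 on a*t.
DELTA_RULE = [0.5, 0, 0, 0, 0, 0, 0, 0, -1, 1, 0]

def train(k, patterns, epochs=500):
    """Train a network with no hidden layer and one sigmoid output unit."""
    weights = [0.0] * len(patterns[0][0])
    for _ in range(epochs):
        for inputs, target in patterns:
            net = sum(w * a for w, a in zip(weights, inputs))
            out = 1.0 / (1.0 + math.exp(-net))
            weights = [w + weight_change(k, w, a, out, target)
                       for w, a in zip(weights, inputs)]
    return weights

def rule_fitness(k, patterns):
    """Fraction of patterns classified correctly after training with rule k."""
    weights = train(k, patterns)
    hits = 0
    for inputs, target in patterns:
        net = sum(w * a for w, a in zip(weights, inputs))
        hits += int((net > 0) == (target == 1))
    return hits / len(patterns)

# Logical OR, with a constant bias input in first position (linearly separable).
OR_TASK = [((1, 0, 0), 0), ((1, 0, 1), 1), ((1, 1, 0), 1), ((1, 1, 1), 1)]
```

A GA would evaluate `rule_fitness` for each candidate coefficient vector, ideally averaged over several linearly separable tasks, so that rules are selected for their ability to learn rather than for solving one particular mapping.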
7.5 ANN Input Data Selection
Statistical methods, such as principal components analysis, are customarily used to
select input features to a neural network in order to reduce or combine the data into
a statistically significant and effective sample. However, in many works using neural
networks, no particular effort is made to select appropriate input features, which
reduces efficiency by slowing down training and may lead to poor generalization
capabilities. Evolutionary algorithms are an attractive alternative to statistical
methods for dimensionality reduction of the input data set. Several studies have shown
that searching the space of input data for optimal or nearly optimal compact subspaces
can be done effectively with evolutionary algorithms, substantially reducing the size
of the input data without a loss in performance. Essentially, in an EA encoding of the
problem, each individual in the population represents a portion of the input data. The
ANN is trained with these vectors and the result is part of the individual’s fitness. Two
references in the field are [10,8].
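One common way to encode such a search, shown here only as a hedged Python sketch, is a bit mask with one bit per input feature. The reading of “a portion of the input data” as a subset of features, the size penalty, and the `train_and_score` callable (which stands for training the ANN on the reduced vectors and returning a performance score) are all assumptions of this illustration.

```python
def select_features(mask, vectors):
    """Keep only the components of each input vector whose mask bit is 1."""
    return [[x for bit, x in zip(mask, vector) if bit] for vector in vectors]

def fitness(mask, vectors, targets, train_and_score, size_penalty=0.01):
    """Score to maximize: performance of a network trained on the reduced
    vectors, minus a charge per retained feature (assumed form).
    `train_and_score` is a hypothetical user-supplied callable."""
    reduced = select_features(mask, vectors)
    return train_and_score(reduced, targets) - size_penalty * sum(mask)

print(select_features([1, 0, 1], [[1, 2, 3], [4, 5, 6]]))  # [[1, 3], [4, 6]]
```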
A related issue is the partitioning of the input data of a network into a training and a
validation set. This partitioning is almost always done quite arbitrarily, although it may
influence the network performance in a significant manner. Reeves and Taylor [33] re-
cently applied genetic algorithms to select training sets for radial basis function neural
networks. They found better generalization performance for an artificial benchmark
problem and for a real-world classification problem over randomly selected training
data. Since a fixed architecture was used here as well as in [10,8], this raises the issue
of the mutual influence between the architecture and the training data. When the cor-
rect architecture is not known, it may well be that different architectures would work
best with different training sets on a given problem. The possibility of co-evolution of
training sets and network architectures has been investigated by Mayer [24] by using
a symbiotic approach in which independent populations of ANNs and training sets are
co-evolved by a GA. The fitness of an ANN is shared between the net and the data set
it has been trained with. The conclusions were not clear-cut but the idea, which is an
extension of previous works by Hillis and Paredis [18,30], is interesting.
7.6 Evolution of Neural Machines
Artificial neural networks are usually simulated on a conventional general-purpose
computer. However, they actually represent attractive alternative computational models,
and it would make sense to develop dedicated hardware for neurocomputing, especially
for on-board and special-purpose applications. In fact, if executed on
appropriate hardware, these machines would be much faster than the corresponding
simulations and they would reach their real potential, being closer architecturally to
the inspiring biological systems and enjoying the far superior circuit speeds of mod-
ern electronics or other technologies. A number of neural chips have been designed
and built to date. However, quite often special-purpose machines are the victims of
technological advances, in the sense that today’s fast specialized hardware will prob-
ably be slow compared to tomorrow’s simulations on state-of-the-art general-purpose
hardware. Nevertheless, dedicated machines are often required in many applications
and neuro-hardware has been implemented using different technologies: digital and
analog circuits, optical architectures and, more recently, even biological and chemi-
cal chips. However, in the context of evolutionary neural networks, it is worth noting
that promising hardware technologies exist that do not require the system to be de-
signed and wired-up once and for all, thus allowing for real-world applications that
may change in real time. This new research domain, called evolvable hardware, which
is less than ten years old, is based on the idea of using a particular kind of circuit whose
architecture and functions may evolve dynamically and autonomously as a function of
task requirements. Thanks to the existence of reconfigurable hardware devices such as
field-programmable gate arrays (FPGAs), this seemingly odd idea can be exploited in
practice. An FPGA is an array of logic cells (see Figure 7.9) whose functions and in-
terconnections may be programmed by means of a string of configuration bits. FPGAs
can be reconfigured at will and very quickly by just downloading a new configuration
string into the chip.
The key idea of evolvable hardware is to consider the bit string as a genetic al-
gorithm chromosome. The genetic algorithm evolves a population of bit strings such
that, in time, better individuals (i.e., bit strings) are likely to emerge. Given a properly
designed fitness function, suitable hardware structures for a given task can then be
evolved by evaluating each individual chromosome of a population: the corresponding
circuit is configured and its adequacy to the task is measured. This straightforward
concept is illustrated in Figure 7.10. There are a number of subtle points lurking
behind the scenes in the realization of this neat idea. First of all, configured devices
should not crash, or damage their measuring environment when meaningless bit con-
figurations are sent to them. Moreover, reconfiguration and fitness measure should be
very fast, in order not to slow down the evolutionary process. Modern FPGAs, such
as the Xilinx 6200 series, do indeed possess these features to some extent and can
[Figure 7.9, image not reproduced. Labels: logic cell, I/O cell, programmable functions, programmable interconnections, configuration.]
Figure 7.9: Schematic representation of an FPGA.
be used, although existing reconfigurable devices were not designed with evolution in
mind from the start.
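The loop of configuring a device and measuring its fitness can be sketched in Python as follows. This is a minimal illustration, not the procedure of any particular system: `simulated_device` is a hypothetical stand-in for downloading a bit string into a reconfigurable chip and measuring the resulting circuit, and the GA details (binary tournament selection, one-point crossover, elitism, the parameter values) are assumptions.

```python
import random

def evolve(measure_fitness, n_bits, pop_size=20, generations=40,
           mutation_rate=0.02, seed=0):
    """Evolve configuration bit strings; returns the best string found and
    the best fitness of each generation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # One "download and measure" per individual and per generation.
        scored = [(measure_fitness(ind), ind) for ind in population]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        history.append(scored[0][0])
        next_gen = [scored[0][1][:]]  # elitism: keep the best unchanged
        while len(next_gen) < pop_size:
            # Binary tournaments pick the two parents.
            mother = max(rng.sample(scored, 2), key=lambda p: p[0])[1]
            father = max(rng.sample(scored, 2), key=lambda p: p[0])[1]
            cut = rng.randrange(1, n_bits)  # one-point crossover
            child = mother[:cut] + father[cut:]
            child = [1 - b if rng.random() < mutation_rate else b
                     for b in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=measure_fitness), history

def simulated_device(bits):
    """Hypothetical stand-in for the hardware: fitness is the number of bits
    that agree with a hidden target configuration."""
    target = [0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0]
    return sum(b == t for b, t in zip(bits, target))

best, history = evolve(simulated_device, n_bits=14)
```

On real hardware the call to `measure_fitness` dominates the running time, which is why fast reconfiguration and fast fitness measurement matter so much.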
When speaking of evolvable machines, it is useful to distinguish between offline
and online evolution. The simplest case of offline evolution is evolutionary circuit
design, where all operations are carried out in software, with the resulting final solution
possibly loaded onto a real circuit. Though a potentially useful design methodology,
this falls completely within the realm of traditional evolutionary techniques.
In the case depicted in Figure 7.10, although a real circuit is used during the evo-
lutionary process, we still speak of offline hardware evolution since most operations
are carried out offline in software. In effect, the population is stored in an external
computer, which also controls the evolutionary process. Ideally, in online evolution
one would have all operations (selection, crossover, mutation), as well as fitness
evaluation, carried out in hardware. Building machines that would evolve and even adapt
autonomously during their lifetime is an enticing prospect, but all practical systems
to date have been evolved using the scheme of Figure 7.10.
Evolvable hardware has been used to automatically design a number of computing
machines including cellular automata, data compression and decompression devices,
analog circuits, and autonomous robot controllers. The interested reader can find articles
covering basic issues and current research in the field in references [3,23,41]. Here
we are interested in the hardware evolution of neural networks, two prominent examples
of which are the works of Higuchi et al. [28] and of de Garis et al. [11]. We mention
in passing that both de Garis and Higuchi played a leading role in the establishment of
the foundations of the field in the early 1990s.
[Figure 7.10, image not reproduced. A GA population of configuration bit strings (e.g., 0 1 0 0 0 1 0 1 0 0 1 0 1 0) undergoes evaluation, selection, recombination, and mutation; each string is downloaded as configuration bits into an FPGA or another reconfigurable computing device, which returns a fitness value.]
Figure 7.10: Illustration of the evolution of configuration bit strings by a GA. Each
individual is downloaded into an FPGA where the fitness of the resulting circuit is
evaluated and sent back to the GA.
7.6.1 Evolvable Neural-Network Hardware
Higuchi et al. observe that most industrial applications of artificial neural networks
are limited to systems in which learning takes place offline, before the network is ac-
tually used for recognition tasks on unseen data. This obviously prevents the system
from making real-time adjustments according to changing requirements and therefore
causes a lack of online adaptation capabilities. We have seen that evolutionary methods
are useful for the overall ANN design process. However, until now these methodolo-
gies have been mostly limited to computer simulations of the actual nets. Genetic
algorithms and reconfigurable hardware are the key ingredients for truly adaptable and
fast networks. Thus, dynamically reconfigurable ANNs possess the potential to be
useful for embedded applications where compact, adaptable, and fast machines are
preferable to either simulations or specialized ANN hardware.
The GRD (Genetic Reconfiguration of Digital signal processors) chip [28] is an
evolvable hardware system for neural network applications designed with optimality,
adaptability, and efficiency issues in mind. The system consists of a 32-bit RISC pro-
cessor and a binary tree of 15 DSPs (Digital Signal Processors). The GA runs on the
RISC processor thus avoiding the need for a host machine for this task. With rapidly
declining hardware costs, it will be common in the future to have a lot of processing
power available, even for special-purpose applications. Both the topology and the hid-
den layer node transfer functions of the neural net can be dynamically reconfigured
[Figure 7.11, image not reproduced. Panels (a) and (b): chromosomes made of genes of the form (G, a_i, b_i, w_i) for Gaussian nodes and (S, c_i, d_i, w_i) for sigmoid nodes, with the corresponding networks over the inputs x1, x2, ..., xr; genetic operations evolve (a) into (b). Panels (c) and (d): each network is downloaded onto the GRD chip (RISC processor and DSP tree), which is dynamically reconfigured. Legend: Gaussian; Gaussian and summation; sigmoid function; sigmoid function and summation.]
Figure 7.11: Illustration of hardware evolution of the GRD chip for neural network
online applications. Figure reproduced by permission of the authors [28].
using a GA. Figure 7.11 schematically illustrates how the configuration and learning
take place. A GA individual (or chromosome) represents a particular network archi-
tecture, including the choice of the transfer function which can be either a Gaussian or
a sigmoid. The upper part of the figure shows the architecture of two evolved networks.
These networks are mapped onto the GRD chips which are programmable function
units, that is, their function can be changed in real time by just downloading another
configuration string into them, a fact that is illustrated in the lower portion of the figure.
New networks are obtained through the GA by applying the usual genetic operators. In
the figure, the network on the left with three nodes is finally evolved into the 15-node
network on the right. The fitness of a network is calculated as the sum of the squared
error over the training set by using local learning with steepest descent. The binary
tree architecture allows the parallel calculation of the sum of node activations and is
thus useful for applications that require high performance. Alternatively, a single DSP
can be used by time slicing.
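The chromosome layout of Figure 7.11 can be read as a list of genes, each selecting a transfer function, two function parameters, and an output weight. The Python sketch below evaluates such a chromosome in software under simplifying assumptions that are ours, not those of [28]: a single scalar input, a and b interpreted as the centre and width of the Gaussian, c and d as the slope and offset of the sigmoid, and the network output taken as the weighted sum of the node outputs.

```python
import math

def node_output(gene, x):
    """gene = (kind, p, q, w); the meaning of p and q is an assumption."""
    kind, p, q, _ = gene
    if kind == "G":  # Gaussian with centre p and width q
        return math.exp(-((x - p) ** 2) / (2.0 * q ** 2))
    if kind == "S":  # sigmoid with slope p and offset q
        return 1.0 / (1.0 + math.exp(-p * (x - q)))
    raise ValueError(f"unknown node type: {kind}")

def network_output(chromosome, x):
    """Weighted sum of the node outputs (the summation done by the DSP tree)."""
    return sum(gene[3] * node_output(gene, x) for gene in chromosome)

def sum_squared_error(chromosome, samples):
    """Fitness criterion: sum of squared errors over (input, target) pairs."""
    return sum((t - network_output(chromosome, x)) ** 2 for x, t in samples)

chromosome = [("G", 0.0, 1.0, 2.0), ("S", 1.0, 0.0, 4.0)]
print(network_output(chromosome, 0.0))  # 2.0 * 1.0 + 4.0 * 0.5 = 4.0
```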
The GRD chip has been employed in test applications of digital mobile commu-
nications with excellent results. The planned use includes a number of applications
whose environments vary over time and have real-time constraints.
7.6.2 Evolving Digital Brains
Standard ANN systems do not comprise a very large number of neurons: a few tens
to a few hundreds of them being the rule. On one hand, statistical and computational
properties are more difficult to analyse for large networks, unless there is a lot of
regularity in the interconnections, such as in cellular neural networks. On the other
hand, if the system is to be simulated on a conventional computer, which is often the
case, a large number of units may slow down learning to unacceptable levels. Small
artificial neural networks have proved very useful in many fields and today they are
standard components of common devices and appliances in industry and the consumer
market. However, if one could afford to directly build adaptive neural machines in
hardware, then it would be worthwhile to self-assemble a large number of computing
units by evolution and learning to see if some higher-level properties could emerge that
are a better match to the biological systems they imitate. Of course, with millions, or
even billions, of self-assembling units, we will be forced to abandon to some extent the
idea of being able to analyze how the machine works in detail, a prospect which can be
upsetting for some people. This is not necessarily as frightening as it sounds: after all,
nobody really understands how the brain works in detail; nevertheless, we all make use
of it daily without ever noticing. Statistical physics is another example of a perfectly
valid description of natural systems in which only average quantities over zillions of
atoms and molecules can actually be observed, the fine behavior of a single atom being
totally irrelevant. de Garis’s CAM-Brain project, an ambitious and visionary research
endeavor, is structured along these lines. In de Garis’s own words:
The original (perhaps over-ambitious) aim of our “CAM-Brain Project”,
as stated at its beginning in 1993, was to build an artificial brain with a bil-
lion artificial neurons, by the year 2001, using evolved cellular automata
(CA) based neural circuit modules. In reality, 6 years later, this number
will be maximum 40 million neurons and 32,000 modules. These CA
based neural network modules grow and evolve at electronic speeds in-
side special FPGA based hardware called a CAM-Brain Machine (CBM),
which can update CA cells at a rate of 150 billion a second, and can evolve
a neural net module in about 1 second. This speed should make brain
building practical. Tens of thousands and more of these evolved modules
can be assembled into humanly defined artificial brain architectures.
The evolved CA based circuit modules are downloaded into a large RAM
space and updated by the CBM fast enough for real time control of a kitten
robot.
The CAM-Brain machine (CAM stands for Cellular Automata Machine) is a re-
search tool for the simulation of huge assemblages of artificial neurons, “artificial
brains” in de Garis’s language. A small sample of an evolved portion of a neural
circuit is depicted in Figure 7.12.
Figure 7.12: Small enlarged region of a CAM-BRAIN 2D 10,000,000 neuron circuit
sample. Reproduced by permission of the author.
The CBM machine is specially designed using dynamically reconfigurable FPGA
chips to support the growth and signalling of neural networks in two and three dimen-
sions. The originality of the approach lies in the fact that the neural modules are un-
usually large and they are not designed in the customary way; rather, they are evolved
directly in hardware using genetic algorithms, in the manner previously outlined in the
section. Figure 7.13 is a photograph of the newly released CBM machine.
This is a large-scale ongoing project. Results on simple learning tasks to date
are encouraging (see for instance [11]). The current research aims at controlling the
Figure 7.13: The CAM-Brain Machine (CBM) is a piece of specialized “evolvable
hardware” for fast evolution of cellular automata-based neural network circuit mod-
ules. The shape is supposed to represent a slice of cortex, and its colors (grey and
white) represent the outer “grey matter” of the cortex (i.e. the neurons) and its inner
“white matter” (the interconnecting axons). Figure reproduced by permission of the
author.
kitten-robot which will be used as a demonstration vehicle to show the capabilities of
the evolved artificial brain. The robot will be life-size and its “brain” will be offline,
controlling a whole range of sensor and motor behaviors via a radio link.
7.7 A Case Study: Evolutionary Autonomous Robots
An autonomous robot is a mechanical device that can operate without being attached
to a power supply or an external computer. Ideally, the aim is for the robot to be able
to adapt to unpredictable sources of change. This is a tall order and there are as yet
few experimental autonomous robots that are able to function correctly in a restricted
but changing environment.
Evolutionary robotics is advocated as an automatic method to discover efficient
controllers for robots that operate in real environments. The situated nature of the
evolutionary approach is such that often evolved controllers find surprisingly simple,
yet efficient, solutions that capitalize upon unexpected invariants of the interaction
between the robot and its environment. The remarkable simplicity and efficiency of
these solutions are a clear advantage for the fast, real-time operation required of
autonomous robots, but they raise the issue of robustness when environmental conditions
change. Environmental changes can also be a problem for other approaches (e.g.,
programming or learning) to the extent that the sources of change have not been
considered during system design, but they are even more so for evolved systems because
these often rely on environmental aspects that are not predictable by an external
observer.
A useful approach consists of combining evolution and learning “during life” of
the individual (see [6] for a comprehensive review of the combination of evolution
and learning). This strategy not only can improve the search properties of artificial
evolution, but can also make the controller more robust to changes that occur faster
than the evolutionary time scale (i.e., changes that occur during the life of an individ-
ual) [29]. This is typically achieved by evolving neural controllers that learn with an
off-the-shelf algorithm, such as reinforcement learning or back-propagation, starting
from synaptic weights specified on the genetic string of the individual [1]. Only initial
synaptic weights are evolved. A limitation of this approach is the “Baldwin effect”
(see [19] for an example of the Baldwin effect in a computational model), whereby the
evolutionary costs associated with learning give a selective advantage to the genetic
assimilation of learned properties and consequently reduce the plasticity of the system
over time.
In previous work Floreano and Mondada [12] have suggested evolving the adaptive
characteristics of a controller instead of combining evolution with off-the-shelf
algorithms. The method consists of encoding on the genotype a set of four local Hebb
rules for each synapse, but not the synaptic weights, and letting the synapses use these
rules to adapt their weights online, always starting from random values at the beginning
of life. Since the synaptic weights are not encoded on the genetic string,
there cannot be genetic assimilation of abilities developed during life. In other words,
these controllers can rely less on genetically-inherited invariants and must develop
in real time the connection weights necessary to achieve the task. When comparing
evolution of genetically-determined weights with evolution of adaptive controllers on
a simple navigation task, it has been shown that the latter approach generates similarly
good performances in fewer generations [13] by taking advantage of the combined
search methods, and that evolutionary adaptive controllers can adapt to environmental
changes that involve new sensory characteristics and new spatial relationships of the
environment [36].
A set of experiments to test the robustness of this approach to environmental
changes is presented here, addressing two important types of change for robot
controllers: the transfer of evolved controllers from a simulated robot to a physical robot
(Khepera) and across different robots. The results are compared to those obtained from
evolution of genetically-determined weights and evolution of noisy synaptic weights
(control condition). Evolutionary adaptive controllers not only report significantly bet-
ter performances, but also display qualitatively different ways of coping with the task
at hand. More details can be found in the original article [35].
Figure 7.14: The neural controller is a fully-recurrent discrete-time neural network
composed of 12 neurons, giving a total of 12 × 12 = 144 synapses (represented in the
figure as small squares of the unfolded network). Ten sensory neurons (IRl, IRf, IRr,
IRb, Ll, Lf, Lr, Vl, Vf, Vr) receive additional input from one corresponding pool of
sensors positioned around the body of the robot shown on the left (l = left; r = right;
f = front; b = back). IR = infrared proximity sensors; L = ambient light sensors;
V = vision photoreceptors. Two motor neurons M (Ml, Mr) do not receive sensory input;
their activation sets the speed of the wheels (Mi > 0.5 forward rotation; Mi < 0.5
backward rotation).
7.7.1 Method
The controller that was used in the experiments is a fully-recurrent discrete-time neural
network (Figure 7.14). It has access to three types of sensory information from the
robot:
1. Infrared light: the active infrared sensors positioned around the robot (Figure
7.15, a) measure the distance from objects. Their values are pooled into four
pairs and the average reading of each pair is passed to a corresponding neuron.
2. Ambient light: the same sensors are also used to measure ambient light. These
readings are pooled into three groups and the average values are passed to the
corresponding three light neurons.
3. Vision: the vision module (Figure 7.15, b) consists of an array of 64 photoreceptors
covering a visual field of 36° (Figure 7.15, center). The visual field is
divided into three sectors and the average value of the photoreceptors (256 gray
levels) within each sector is passed to the corresponding vision neuron.
Two motor neurons are used to set the rotation speed of the wheels (Figure 7.15,
c). Neurons are updated every 100 ms according to the following equation:
$$ y_i \leftarrow \sigma\Big( \sum_{j=0}^{N} w_{ij}\, y_j \Big) + I_i , $$
Figure 7.15: The Khepera robot used in the experiments. Infrared sensors (a) measure
object proximity and light intensity. The linear vision module (b) is composed
of 64 photoreceptors covering a visual field of 36° (center). The output of the
controller generates the motor commands (c) for the robot. The right figure shows the
sensory layout of the Khepera robot. [Figure labels: panels a, b, c; sensors numbered
0 to 7; center plot axes: Receptor, Activation; scale bar 1 cm.]
where $y_i$ is the activation of the $i$th neuron, $w_{ij}$ is the strength of the synapse
between presynaptic neuron $j$ and postsynaptic neuron $i$, $N$ is the number of neurons in
the network, $0 \le I_i < 1$ is the corresponding external sensory input, and
$\sigma(x) = (1 + e^{-x})^{-1}$ is the sigmoidal activation function. $I_i = 0$ for the
motor neurons.
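The update equation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: all neurons are updated synchronously from the previous activations, the sum runs over the 12 neurons of the network, synaptic signs are omitted, and the weights and sensory inputs are random placeholder values rather than data from the experiments. The names `sigmoid`, `update_neurons` and `sensory` are hypothetical.

```python
import math
import random


def sigmoid(x):
    """Logistic activation: sigma(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def update_neurons(y, w, sensory):
    """One update step: y_i <- sigma(sum_j w[i][j] * y[j]) + I_i.

    Assumption: every neuron reads the activations of the previous step
    (synchronous update); the source does not specify the update order.
    """
    n = len(y)
    return [
        sigmoid(sum(w[i][j] * y[j] for j in range(n))) + sensory[i]
        for i in range(n)
    ]


# Hypothetical 12-neuron controller: 10 sensory neurons followed by 2 motor neurons.
N = 12
rng = random.Random(0)
w = [[rng.random() for _ in range(N)] for _ in range(N)]  # placeholder strengths in [0, 1)
sensory = [rng.random() for _ in range(10)] + [0.0, 0.0]  # I_i = 0 for the motor neurons
y = update_neurons([0.0] * N, w, sensory)
```

Starting from zero activations, each neuron receives sigma(0) = 0.5 plus its sensory input, so the two motor neurons start exactly at the 0.5 threshold that separates forward from backward wheel rotation.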
Each synaptic weight wij is randomly initialized at the beginning of the individual’s
life and can be updated after every sensory-motor cycle (100 ms),
$$ w_{ij}^{t} = w_{ij}^{t-1} + \eta\, \Delta w_{ij} , $$
where 0.0 < η < 1.0 is the learning rate and ∆wij is one of the four modification rules
specified in the genotype which may co-exist within the same network:
1. Plain Hebb rule: strengthens the synapse proportionally to the correlated activity
of the two neurons.
2. Postsynaptic rule: behaves as the plain Hebb rule, but in addition it weakens the
synapse when the postsynaptic node is active but the presynaptic is not.
3. Presynaptic rule: weakening occurs when the presynaptic unit is active but the
postsynaptic is not.
4. Covariance rule: strengthens the synapse whenever the difference between the
activations of the two neurons is less than half their maximum activity, otherwise
the synapse is weakened.
Synaptic strength is maintained within a range [0, 1] (notice that a synapse cannot
change sign) by adding to the modification rules a self-limiting component inversely
proportional to the synaptic strength itself.
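A minimal sketch of the weight update for the plain Hebb rule follows. The text does not give the explicit form of the modification rules, so the expression used here, (1 - w) * pre * post, is an assumption chosen only to show a Hebbian term with a self-limiting factor that vanishes as the weight approaches 1; the function names and the final clamp to [0, 1] are likewise hypothetical.

```python
def plain_hebb_delta(w, pre, post):
    """Assumed plain Hebb rule: grows with correlated activity, slows as w -> 1."""
    return (1.0 - w) * pre * post


def update_weight(w, pre, post, eta, rule=plain_hebb_delta):
    """w_t = w_{t-1} + eta * delta_w, kept inside the range [0, 1]."""
    w_new = w + eta * rule(w, pre, post)
    return min(1.0, max(0.0, w_new))  # safety clamp (assumption)


# Repeated co-activation of both neurons drives the weight towards 1 without exceeding it.
w = 0.2
for _ in range(50):
    w = update_weight(w, pre=0.9, post=0.8, eta=0.3)
```

The other three rules could be plugged in through the `rule` argument once their explicit form is taken from the original article [35].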
Genotype   Bit 1   Bits 2-3     Bits 4-5
A          sign    strength (bits 2-5)
B          sign    Hebb rule    rate
C          sign    strength     noise
Table 7.1: Genetic encoding of synaptic parameters (bits for one synapse or node) for
synapse encoding and node encoding. In the latter case the sign encoded on the first bit
is applied to all outgoing synapses whereas the properties encoded on the remaining four
bits are applied to all incoming synapses. A: genetically determined controllers;
B: adaptive synapse controllers; C: noisy synapse controllers.
Figure 7.16: A mobile robot Khepera equipped with a vision module gains fitness by
staying on the gray area only when the light is on. The light is normally off, but it
can be switched on if the robot passes over the black area positioned on the other side
of the arena. The robot can detect ambient light and the color of the wall, but not the
color of the floor.
Two types of genetic (binary) encoding are considered (see Table 7.1):
1. Synapse Encoding: also known as direct encoding [40]; every synapse is individually
coded on five bits, the first bit representing its sign and the remaining
four bits its properties (either the weight strength or its adaptive rule).
2. Node Encoding: each node is characterized by five bits, the first bit representing
its sign and the remaining four bits the properties of all its incoming synapses
(consequently, all incoming synapses to a given node have the same properties).
Synapse encoding allows a detailed definition of the controller, but for a fully
connected network of N neurons the genetic length is proportional to N^2. Instead node
encoding requires a much shorter genetic length (proportional to N), but it allows
only a rough definition of the controller. In recent work Floreano and Urzelai [15]
showed that the evolutionary adaptive approach does not need a lengthy representa-
tion because the actual weights of the synapses are always shaped at run-time by the
genetically specified rules. However, this is not possible in the traditional approaches
where it is necessary to assign good initial weights to the controller. Therefore, the ex-
periments reported here compare evolution of genetically-determined networks using
synapse encoding with evolution of adaptive networks using node encoding.
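The difference in genetic length can be made concrete for the 12-neuron controller of Figure 7.14. The short sketch below simply multiplies the five bits per gene by the number of genes under each encoding; the function name `genome_length` is hypothetical.

```python
BITS_PER_GENE = 5  # one sign bit plus four property bits


def genome_length(n_neurons, encoding):
    """Genotype length in bits for a fully connected network of n_neurons."""
    if encoding == "synapse":
        return BITS_PER_GENE * n_neurons ** 2  # one gene per synapse
    if encoding == "node":
        return BITS_PER_GENE * n_neurons       # one gene per node
    raise ValueError("encoding must be 'synapse' or 'node'")


synapse_bits = genome_length(12, "synapse")  # 5 * 144 = 720 bits
node_bits = genome_length(12, "node")        # 5 * 12 = 60 bits
```

For this network, node encoding is therefore twelve times shorter than synapse encoding.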
What is encoded on the remaining four bits depends on the evolutionary condition
chosen, namely:
1. Genetically-determined : 4 bits encode the synaptic strength. This value is con-
stant during “life”.
2. Adaptive synapses: 2 bits encode 4 adaptive rules and 2 bits the learning rate.
Synaptic weights are always randomly initialized at the beginning of an individ-
ual’s life and then updated according to their own adaptation rule.
3. Noisy synapses: 2 bits encode the weight strength and 2 bits a noise range. The
synaptic strength is genetically determined at birth, but a random value extracted
from the noise range is freshly computed and added after each sensory motor cy-
cle. This latter condition is used as a control condition to check whether the ef-
fects of Hebbian adaptation (condition above) are equivalent to random synaptic
variability.
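The three evolutionary conditions can be illustrated by decoding a single five-bit gene. This is a sketch under several assumptions that are not in the text: a first bit equal to 1 is read as a positive sign, bit groups are read as unsigned binary numbers, strengths are scaled to [0, 1], and the tables `LEARNING_RATES` and `NOISE_RANGES` contain purely hypothetical values. Only the bit layout (sign; rule and rate; strength and noise) follows Table 7.1.

```python
RULES = ("plain Hebb", "postsynaptic", "presynaptic", "covariance")
LEARNING_RATES = (0.2, 0.4, 0.6, 0.8)  # hypothetical values inside (0, 1)
NOISE_RANGES = (0.0, 0.1, 0.2, 0.3)    # hypothetical noise amplitudes


def bits_to_int(bits):
    """Read a sequence of 0/1 values as an unsigned binary number."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def decode_gene(bits, condition):
    """Decode one five-bit gene according to the evolutionary condition."""
    if len(bits) != 5:
        raise ValueError("a gene has exactly five bits")
    sign = 1 if bits[0] == 1 else -1  # assumption: 1 means a positive sign
    props = bits[1:]
    if condition == "genetically-determined":
        return {"sign": sign, "strength": bits_to_int(props) / 15}
    if condition == "adaptive":
        return {
            "sign": sign,
            "rule": RULES[bits_to_int(props[:2])],
            "rate": LEARNING_RATES[bits_to_int(props[2:])],
        }
    if condition == "noisy":
        return {
            "sign": sign,
            "strength": bits_to_int(props[:2]) / 3,
            "noise": NOISE_RANGES[bits_to_int(props[2:])],
        }
    raise ValueError("unknown condition")


gene = decode_gene([1, 0, 1, 1, 0], "adaptive")
```

With node encoding, the decoded properties would be shared by all incoming synapses of the node, while the sign would apply to its outgoing synapses.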
7.7.2 A Sequential Task
This set of experiments was designed to compare the performance of evolutionary
adaptive controllers with respect to the evolution of synaptic weights and the evolution
of noisy synapses in a sequential task that is complex enough to require non-trivial
solutions. A mobile robot Khepera equipped with a vision module is positioned in the
rectangular environment shown in Figure 7.16. A light bulb is attached on one side of
the environment. This light is normally off, but it can be switched on when the robot
passes over a black-painted area on the opposite side of the environment. A black stripe
is painted on the wall over the light-switch area. Each individual of the population is
tested on the same robot, one at a time, for 500 sensory motor cycles, each cycle lasting
100 ms. At the beginning of an individual’s life, the robot is positioned at a random
position and orientation and the light is off.
The fitness function is given by the number of sensory motor cycles spent by the
robot on the gray area beneath the light bulb when the light is on divided by the total
number of cycles available (500). In order to maximize this fitness function, the robot
should find the light-switch area, go there in order to switch the light on, and then move
towards the light as soon as possible, and stand on the gray area. Since this sequence
of actions takes time (several sensory motor cycles), the fitness of a robot will never be
1.0. Also, a robot that cannot manage to complete the entire sequence will be scored
with 0.0 fitness.
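The fitness computation described above can be sketched as follows. The per-cycle logs `light_on` and `on_gray_area` are hypothetical inputs (in the experiments this information comes from the floor sensor and the host computer), and the timings in the example are invented for illustration.

```python
def fitness(on_gray_area, light_on, total_cycles=500):
    """Fraction of cycles spent on the gray area while the light is on."""
    rewarded = sum(1 for gray, lit in zip(on_gray_area, light_on) if gray and lit)
    return rewarded / total_cycles


# Hypothetical run: the light is switched on at cycle 200 and the robot
# reaches the gray area at cycle 300, where it stays until the end.
light_on = [t >= 200 for t in range(500)]
on_gray_area = [t >= 300 for t in range(500)]
score = fitness(on_gray_area, light_on)  # 200 rewarded cycles out of 500

# A robot that never switches the light on scores zero, wherever it stands.
never_lit = fitness([True] * 500, [False] * 500)
```

Because cycles spent reaching the switch and then the light are never rewarded, the score stays strictly below 1.0 for any run that starts with the light off.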
A light sensor placed under the robot is used to detect the color of the floor (white,
gray, or black); its reading is passed to a host computer in order to switch on the
light bulb and compute fitness values. The output of this sensor is not given as input
to the neural controller. After 500 sensory motor cycles, the light is switched off and
the robot is repositioned by applying random speeds to the wheels for 5 seconds. The
experiments have been carried out in simulations sampling sensor activation and adding
5% uniform noise to these values [25]. All experimental conditions have also been
repeated on the physical robot and yielded similar results [15].
Figure 7.17: Behaviors of the two best individuals (from the last generation) with
adaptive synapses and node encoding (left; f = 0.422, <f> = 0.499) and with
genetically-determined synapses and synapse encoding (right; f = 0.260, <f> = 0.302).
When the light is turned on, the trajectory line becomes thick. The corresponding
fitness value f is printed on the top of each box along with the average fitness <f> of
the same individual tested ten times from different positions and orientations.
The fitness results show that individuals with adaptive synapses and node encoding
perform much better than individuals with genetically-determined synapses and synapse
encoding in that:
1. Both the fitness of the best individuals and that of the population reach higher
values (0.6 against 0.5). The performance difference measured on the best individuals
of the last generation is statistically significant.
2. They reach the best value obtained by genetically-determined individuals in fewer
than half as many generations (40 against more than 100).
Individuals with genetically-determined synapses and node encoding never managed
to complete the task in any of the ten simulations performed.
Figure 7.17 shows the behaviors of the two best individuals evolved with adap-
tive synapses and node encoding (left) and with genetically-determined weights and
synapse encoding (right). In both cases individuals aim at the area with the light switch
and, once the light is turned on, they move towards the light and remain there. The
better fitness of the adaptive controllers (given on the top of each box, see figure
caption) results from straighter and faster trajectories showing a clear behavioral change between
the first phase where they go towards the switching area and the second phase where
they become attracted by the light. Instead, genetically-determined individuals always
display the same looping trajectories around the environment with some attraction to-
wards the stripe and the light. This minimalist behavior that depends on invariant
geometrical relations of the environment gives them a chance to accomplish the task
but with a lower performance.
From Simulations to Real Robots
One way of measuring the adaptive abilities of evolved controllers is to transfer them
from simulated to real robots. Since physical robots and environments often have
characteristics different from simulations, solutions evolved in simulation typically
fail when tested on real robots. The best individuals of the last generation for each
of the 10 populations evolved in simulation have been transferred to a physical Khepera
robot. The performance of adaptive individuals is less affected by the transfer to the
physical environment than that of genetically-determined individuals. Individuals with
noisy synapses are not affected by the transfer because their behavior is always random
and ineffective in both simulated and physical environments.
Figure 7.18: The Koala robot used in the experiments. Infrared sensors (a) measure
object proximity and light intensity. The linear vision module (b) is the same as used
in the experiments with the Khepera robot. The localization module (c) provides the
position of the robot at every time step. The right figure shows the sensory layout of
the Koala robot (sensors labelled L0 to L7 and R0 to R7; scale bar 5 cm). Only 8
equally-spaced sensors are selected as input to the network.
Some performance loss in adaptive individuals is caused by the fact that in some
cases the robot performs looping trajectories around the fitness area without coming
to rest on it. Instead, the two major reasons for failure of genetically-determined in-
dividuals are that they often get stuck on the walls and they cannot manage to move
efficiently towards the light. These failures are due to differences between simulated
and real sensors. Since the weights are fixed, genetically-determined individuals can-
not accommodate these changes as adaptive individuals do.
Cross-platform Evolution
Cross-platform transfer is a very useful feature, but it is very difficult to transfer
a control system across different robots without changes. One may develop (or evolve)
control systems for a sturdy desktop robot like the Khepera and then download them to
larger and consequently more fragile robots. Obviously, the two robots must share some
characteristics, such as the type of sensors and actuators used, that allow a suitable
interfacing of the control system. In previous work it was shown that this can be
achieved by using incremental evolution of genetically-determined networks [13].
However, even for a simple reactive navigation behavior it took an additional 20
generations to re-adapt to the new robot.
Here the adaptive properties of the evolutionary adaptive strategy are tested by
transferring to a physical Koala robot (Figure 7.18, left) the best individuals of the
last generation evolved in each of the 10 simulations of the experiment presented in
Section 7.7.2. The Koala robot has six wheels driven by two motors (one on each side)
and 16 infrared sensors (Figure 7.18, right) with a different and stronger detection
range. A mobile robot Koala equipped with a vision module is positioned in the
rectangular environment shown in Figure 7.19. As in the previous experiment with the
Khepera robot, the Koala robot must find the light-switching area, go there in order to
switch the light on, and then move towards the light as soon as possible and stay there
in order to score fitness points.
Figure 7.19: A mobile robot Koala equipped with a vision module gains fitness by
staying near the lamp (right side) only when the light is on. The light is normally off,
but it can be switched on if the robot passes near the black stripe (left side)
positioned on the other side of the arena. The position of the robot is tracked by an
external positioning system and passed to the computer in order to control the light and
to compute the fitness.
An external positioning system emitting laser beams at predefined angles and
frequencies is positioned on top of the environment, and the Koala robot is equipped
with an additional turret capable of detecting the laser beams and computing the robot
displacement in real time. This information is used in order to control the light and to compute the
fitness. The performance of adaptive individuals is not affected by the transfer from the
Khepera robot to the Koala robot, whereas genetically-determined individuals report a
significant fitness loss. Individuals with noisy synapses are not affected by the transfer
because their behavior is always random and not effective in both Khepera and Koala
robots.
Individuals evolved in simulation for the Khepera robot display a satisfactory be-
havior when tested on the Koala robot. They correctly approach the light-switching
area and they are clearly attracted by light (Figure 7.20, left). As in the case of a real
Khepera robot, once it has arrived under the light the Koala robot moves around the
fitness area while remaining close to it until the testing time is over.
On the other hand, genetically-determined individuals (right) perform spiralling
trajectories around the environment and do not display any attraction to the black
stripe or the light. They eventually manage to pass through the light-switching area,
turn the light on, and occasionally score fitness points passing through the fitness
area. In several cases, genetically-determined individuals get stuck on the walls of
the environment (behaviors not shown). Individuals with noisy synapses score a low
performance because their strategy is based on random navigation.
Figure 7.20: Behaviors of individuals with adaptive synapses (left; f = 0.302,
<f> = 0.322) and genetically-determined synapses (right; f = 0.018, <f> = 0.071).
Individuals belong to the last generation evolved in simulation for the Khepera robot.
Conclusions
The experiments presented here show that evolution of adaptive
synapses provides better adaptation capabilities than standard evolution of synaptic
weights in the transfer from simulations to physical robots and across different robotic
platforms. Evolutionary adaptive controllers can autonomously modify their synaptic
weights and behavior online to the new environmental conditions without requiring
additional evolution or ad-hoc manipulation of the evolutionary conditions.
The evolutionary technique proposed in [35] represents a significant step towards
making evolutionary robotics applicable to real-world autonomous robotics. In scenarios
such as robots probing an asteroid surface or robots interacting with a handicapped
person, it is impossible to evolve the control system on the spot (not even
incrementally). However, one might reproduce the working conditions in the laboratory
to some degree of approximation and evolve the adaptive controller there. The
controller would then be transferred to the final robot and left free to adapt to the
actual working conditions in a few seconds. This adaptive strategy will also be useful
for evolving more complex and powerful control architectures. In current methods there
is a trade-off between the complexity of the genotype/phenotype mapping and the
evolvability of such systems, which is partly due to the fact that the phenotype largely
depends on genetic instructions. By evolving the adaptive characteristics along with
other high-level parameters of the controller (e.g., position and type of nodes), one
may obtain simpler genetic encodings and a higher tolerance to mutations. This would
make the evolved controllers more viable, add neutrality to the genetic landscape, and
ultimately improve evolvability.
7.8 Bibliography
[1] D. H. Ackley and M. L. Littman. Interactions between learning and evolution.
In C.G. Langton, J.D. Farmer, S. Rasmussen, and C. Taylor, editors, Artificial
Life II: Proceedings Volume of Santa Fe Conference, volume X, pages 487–509.
Addison Wesley: series of the Santa Fe Institute Studies in the Sciences of Com-
plexities, Redwood City, CA, 1992.
[2] P. Angeline, G. Saunders, and J. Pollack. Complete induction of recurrent neural
networks. In A. V. Sebald and L. J. Fogel, editors, Proceedings of the Third
Conference on Evolutionary Programming, pages 1–8. World Scientific, 1994.
[3] Various authors. Special issue on evolvable hardware. Communications of the
ACM, 42:46–79, April 1999.
[4] A. G. Barto. Reinforcement learning. In Michael A. Arbib, editor, Handbook of
Brain Theory and Neural Networks, pages 804–809. MIT Press, 1995.
[5] R. Belew, J. McInerney, and N. N. Schraudolph. Evolving networks: Using
the genetic algorithm with connectionist learning. Technical Report CS90-174,
University of California, San Diego, 1990.
[6] R. K Belew and M. Mitchell, editors. Adaptive Individuals in Evolving Popula-
tions. Models and Algorithms. Addison-Wesley, Redwood City, CA, 1996.
[7] H. Braun. Evolving neural networks for application oriented problems. In D. B.
Fogel and W. Atmar, editors, Proceedings of the Second Conference on Evo-
lutionary Programming, pages 62–71, La Jolla, California, 1993. Evolutionary
Programming Society.
[8] F. Z. Brill, D. E. Brown, and W. N. Martin. Fast genetic selection of features for
neural network classifiers. IEEE Transactions on Neural Networks, 3(2):324–
328, 1992.
[9] D. J. Chalmers. The evolution of learning: An experiment in genetic connection-
ism. In D. S. Touretzky, J. L. Elman, and G. E. Hinton, editors, Connectionist
Models: Proceedings of the 1990 Summer School, pages 81–90, San Mateo, Cal-
ifornia, 1990. Morgan Kaufmann.
[10] E. J. Chang and R. P. Lippmann. Using genetic algorithms to improve pattern
classification performance. In R. P. Lippmann, J. E. Moody, and D. S. Touretzky,
editors, Advances in neural information processing 3, pages 797–803, San
Mateo, California, 1991. Morgan Kaufmann.
[11] H. de Garis, M. Korkin, F. Gers, E. Nawa, and M. Hough. Building an artificial
brain using an FPGA based CAM-Brain machine. Applied Mathematics and
Computation Journal, 111:163–192, 2000. Special issue on Artificial Life and
Robotics, Artificial Brain, Brain Computing and Brainware.
[12] D. Floreano and F. Mondada. Evolution of plastic neurocontrollers for situ-
ated agents. In P. Maes, M. Mataric, J-A. Meyer, J. Pollack, H. Roitblat, and
S. Wilson, editors, From Animals to Animats IV: Proceedings of the Fourth Inter-
national Conference on Simulation of Adaptive Behavior, pages 402–410. MIT
Press-Bradford Books, Cambridge, MA, 1996.
[13] D. Floreano and F. Mondada. Evolutionary neurocontrollers for autonomous
mobile robots. Neural Networks, 11:1461–1478, 1998.
[14] D. Floreano and J. Urzelai. Evolution and learning in autonomous robotic agents.
In D. Mange and M. Tomassini, editors, Bio-Inspired Computing Machines:
Towards Novel Computational Architectures, pages 317–346. Presses Polytech-
niques et Universitaires Romandes, Lausanne, 1998.
[15] D. Floreano and J. Urzelai. Evolution of Neural Controllers with Adaptive
Synapses and Compact Genetic Encoding. In D. Floreano, J.D. Nicoud, and
F. Mondada, editors, Advances In Artificial Life: Proceedings of the 5th European
Conference on Artificial Life (ECAL’99), pages 183–194. Springer Verlag,
Berlin, 1999.
[16] D. B. Fogel, L. J. Fogel, and V. W. Porto. Evolving neural networks. Biological
Cybernetics, 63:487–493, 1990.
[17] F. Gruau. Genetic synthesis of boolean neural networks with a cell rewriting
developmental process. In D. Whitley and J. D. Schaffer, editors, Proceedings of
the Workshop on Combinations of Genetic Algorithms and Neural Networks, Los
Alamitos, CA, 1992. IEEE Computer Society Press.
[18] W. D. Hillis. Co-evolving parasites improve simulated evolution as an optimiza-
tion procedure. In C. G. Langton, C. Taylor, J. D. Farmer, and S. Rasmussen,
editors, Artificial Life II, volume X of SFI Studies in the Sciences of Complexity,
pages 313–324, Redwood City, CA, 1992. Addison-Wesley.
[19] G. E. Hinton and S. J. Nowlan. How learning can guide evolution. Complex
Systems, 1:495–502, 1987.
[20] H. Kitano. Designing neural networks by genetic algorithms using graph gener-
ation systems. Complex Systems, 4:461–476, 1990.
[21] J. R. Koza. Genetic Programming. The MIT Press, Cambridge, Massachusetts,
1992.
[22] A. Lindenmayer. Mathematical models for cellular interaction in development,
Parts I and II. Journal of Theoretical Biology, 18:280–315, 1968.
[23] D. Mange and M. Tomassini (Eds). Bio-Inspired Computing Machines: Towards
Novel Computational Architectures. Presses Polytechniques et Universitaires Ro-
mandes, Lausanne, 1998.
[24] H. A. Mayer. Symbiotic coevolution of artificial neural networks and training
data sets. In A. Eiben, T. Bäck, M. Schoenauer, and H.-P. Schwefel, editors,
Parallel Problem Solving from Nature - PPSN V, volume 1498 of Lecture Notes
in Computer Science, pages 511–520. Springer-Verlag, Heidelberg, 1998.
[25] O. Miglino, H. H. Lund, and S. Nolfi. Evolving Mobile Robots in Simulated and
Real Environments. Artificial Life, 2:417–434, 1996.
[26] G. F. Miller, P. M. Todd, and S. U. Hegde. Designing neural networks using ge-
netic algorithms. In J. D. Schaffer, editor, Proceedings of the Third International
Conference on Genetic Algorithms, pages 379–384. Morgan Kaufmann, 1989.
[27] D. Montana and L. Davis. Training feedforward neural networks using genetic
algorithms. In Proceedings of the 11th International Conference on Artificial
Intelligence, pages 762–767. Morgan Kaufmann, 1989.
[28] M. Murakawa, S. Yoshizawa, I. Kajitani, X. Yao, N. Kajihara, M. Iwata, and
T. Higuchi. The GRD chip: Genetic reconfiguration of DSPs for neural network
processing. IEEE Transactions on Computers, 48(6):628–639, June 1999.
[29] S. Nolfi and D. Floreano. Learning and evolution. Autonomous Robots, 7(1),
1999. to appear.
[30] J. Paredis. Coevolutionary computation. Artificial Life, 2(4):355–375, 1995.
[31] D. Polani and T. Uthmann. Training Kohonen feature maps in different topologies:
an analysis using genetic algorithms. In S. Forrest, editor, Proceedings of
the Fifth International Conference on Genetic Algorithms, pages 326–333. Morgan
Kaufmann Publishers, San Mateo, California, 1993.
[32] P. Prusinkiewicz and A. Lindenmayer. The Algorithmic Beauty of Plants.
Springer-Verlag, New York, 1990.
[33] C. R. Reeves and S. J. Taylor. Selection of training data for neural networks by
a genetic algorithm. In A. Eiben, T. Bäck, M. Schoenauer, and H.-P. Schwefel,
editors, Parallel Problem Solving from Nature - PPSN V, volume 1498 of Lecture
Notes in Computer Science, pages 633–642. Springer-Verlag, Heidelberg, 1998.
[34] A. A. Siddiqi and S. M. Lucas. A comparison of matrix rewriting versus direct
encoding for evolving neural networks. In Proceedings of 1998 IEEE Inter-
national Conference on Evolutionary Computation (ICEC’98), pages 392–397.
IEEE Press, 1998.
[35] J. Urzelai and D. Floreano. Evolutionary robotics: coping with environmental
change. In Proceedings of the Genetic and Evolutionary Computation Confer-
ence GECCO 2000, San Francisco, 2000. Morgan Kaufmann. to appear.
[36] J. Urzelai and D. Floreano. Evolutionary robots with fast adaptive behavior in
new environments. In T. C. Fogarty, J. Miller, A. Thompson, and P. Thomson,
editors, Third International Conference on Evolvable Systems: From Biology to
Hardware (ICES2000), Berlin, 2000. Springer-Verlag.
[37] D. Whitley. Genetic algorithms and neural networks. In G. Winter,
J. Périaux, M. Galán, and P. Cuesta, editors, Genetic Algorithms in Engineering and
Computer Science, pages 203–216. John Wiley, 1995.
[38] D. Whitley and T. Hanson. Optimizing neural networks using faster, more ac-
curate genetic search. In J. D. Schaffer, editor, Proceedings of the Third Inter-
national Conference on Genetic Algorithms, pages 391–396. Morgan Kaufmann,
1989.
[39] D. Whitley, T. Starkweather, and C. Bogart. Genetic algorithms and neural net-
works: optimizing connections and connectivity. Parallel Computing, 14:347–
361, 1990.
[40] X. Yao. Evolving artificial neural networks. Proceedings of the IEEE,
87(9):1423–1447, 1999.
[41] X. Yao and T. Higuchi. Promises and challenges of evolvable hardware. IEEE
Transactions on Systems, Man, and Cybernetics, Part C, 29(1):87–97, 1999.
[42] X. Yao and Y. Liu. A new evolutionary system for evolving artificial neural
networks. IEEE Transactions on Neural Networks, 8:694–713, 1997.
[43] X. Yao and Y. Liu. Making use of population information in evolutionary artifi-
cial neural networks. IEEE Transactions on Systems, Man, and Cybernetics, Part
B, 28(3):417–425, 1998.
Part Five
Complexity in Biology and Bioinformatics
Robin Gras
Institut Suisse de Bioinformatique, Université de Genève
1, rue Michel Servet, CH-1211 Genève 4
Chapter 8
Complexity in Biology and Bioinformatics
8.1 Introduction
This course deals with the notion of complexity in bioinformatics. We first present
what complexity means in biology by studying, in a first part, how the living world is
organized from a macroscopic point of view, and then in more detail from the standpoint
of molecular biology. In a second part we give an introduction to bioinformatics and
show how it handles the complex problems of molecular biology. Finally, we show a few
examples of the use of complex adaptive systems, which provide an original approach to
the highly combinatorial problems developed previously.
8.2 The complexity of the living world
The complexity of the biological world is most apparent in the diversity of species and
in the functioning of ecosystems. It is still difficult to estimate precisely the number of
species on Earth (between 3 and 30 million), and their variety ranges from extremely
simple unicellular organisms such as bacteria (or even viruses, at the boundary of what
is considered living) to complex multicellular organisms such as humans, which have
developed a nervous system allowing them to apprehend their environment. Higher-level
complex systems also arise from interactions between more or less simple individuals.
These systems produce emergent behaviors, such as those of social insect colonies or
human societies, whose complexity is difficult to model with a centralized system. This
complexity rests on a hierarchical "stack" of levels, within each of which a large number
of combinations of interactions is possible between the elements belonging to that level.
Each level allows elementary building blocks to be associated, and their combinations
provide the building blocks of the next level. Thus ecosystems arise from interactions
between groups or societies of individuals, which are themselves based on interactions
between individuals, which are made up of an assembly of cells built from macromolecules,
themselves made of elementary molecules... Molecular biology, which is the main concern
of bioinformatics, deals with the levels ranging from the cell down to the elementary
molecules, and these are therefore the levels we describe in more detail.
8.3 Elementary molecules
The elementary molecules that make up all living beings can, for practical purposes, be
limited to water, carbohydrates (sugars), lipids, nucleotides and amino acids. Water plays
an essential role in a large number of the chemical reactions required for organisms to
function, as well as in the formation of complex molecular structures. Sugars are a source
of energy reserves, a component of the structure of macromolecules, and are involved in
molecular recognition processes. Lipids are also energy reserves, but they are in addition
the main components of cell membranes. Nucleotides are the building blocks of genetic
molecules such as DNA and RNA, and also act as constituents of intra- and intercellular
signals. Finally, amino acids are the building blocks of proteins, neurotransmitters,
enzymes, signalling molecules...
8.4 Genomic information
We then present the two main types of genomic molecules, DNA and RNA, together with
the biological mechanisms associated with them. DNA is the molecule that carries genetic
information in the form of genes. It consists of a succession of genes and non-coding
regions, and the genes themselves may be made up of coding parts (exons) and non-coding
parts (introns). Genes are then "expressed", that is, they are used to produce one (or
several) corresponding RNA molecules during the process of transcription. The RNAs are
then translated into proteins, which take part in the various biological activities needed
for living beings to function. The complex process of transcription makes it possible,
through a combinatorial mechanism called splicing, to produce several different RNAs from
the same gene. Finally, we give a few examples of how genetic diversity is created through
the evolution of genomes, illustrated by the mechanisms of mutation, transposition,
crossover and horizontal transfer.
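To give a feel for the combinatorial nature of splicing mentioned above, here is a minimal Python sketch. It rests on a strong simplifying assumption that is not in the course text: every non-empty subset of exons, kept in genomic order, is treated as a possible transcript. Real splicing is far more constrained, and the function name `splice_isoforms` and the toy exon strings are hypothetical.

```python
from itertools import combinations


def splice_isoforms(exons):
    """Enumerate transcripts obtained by keeping a non-empty subset of exons.

    Assumption (illustration only): any subset is allowed and the exon order
    of the gene is always preserved. With n distinct exons this yields
    2**n - 1 candidate transcripts.
    """
    isoforms = []
    for size in range(1, len(exons) + 1):
        # combinations() preserves the original order of the exons
        for kept in combinations(exons, size):
            isoforms.append("".join(kept))
    return isoforms


if __name__ == "__main__":
    toy_exons = ["AUG", "GCC", "UAA"]  # hypothetical exon sequences
    for transcript in splice_isoforms(toy_exons):
        print(transcript)
```

Even under this crude model the number of candidate transcripts doubles with each additional exon, which is the sense in which the mechanism is combinatorial.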
8.5 Proteins
Proteins are the "acting" molecules of biological mechanisms. Their functions are very
diverse: structural molecule, cellular constituent, intra- and intercellular messenger,
transport molecule... A protein is described at four hierarchical levels, each combining
the elements of the previous level. The primary level is the succession of amino acids
that makes up the protein sequence. The secondary level is the local folding of the
primary structure into "alpha" and "beta" structures. The tertiary level is the
three-dimensional folding of the protein through the association of "alpha" and "beta"
structures to form domains. Finally, the quaternary level is the combination of domains
to form the overall structure of the protein, which determines its function.
We then describe post-translational modifications, an essential property of proteins that
underlies their functional diversity and dynamics. They consist of specific molecules that
attach to certain amino acids depending on various environmental conditions. They thereby
modify the structure and function of proteins, increasing the diversity of the functions
performed and creating a dynamic mechanism that regulates biological processes according
to the environment (genomic regulation mechanisms are also frequently involved).
Finally, we present the notion of the proteome, the protein counterpart of the genome,
which is the set of proteins expressed at a given time in a given tissue. This notion
characterizes the functional state of an organism much more finely than the genome does,
and it is now an essential tool in pharmacological and medical studies aimed at
discovering new effective drugs.
8.6 Classical bioinformatics methods
By way of introduction, we give some elements of formal language theory that are useful
for the analysis of biological sequences. We present the Chomsky hierarchy, which
describes the four main classes of languages: regular languages, context-free languages,
context-sensitive languages and phrase-structure (unrestricted) languages. For each of
the first three classes we give classical examples and their biological counterparts.
A large part of the work in bioinformatics is based on the notion of sequence similarity,
the hypothesis being that two similar sequences will have similar functions. Sequence
alignment, which puts two sequences into the best possible character-by-character
correspondence, is the main tool for similarity searching. We present the exact dynamic
programming algorithm for computing these alignments (Smith and Waterman) as well as the
two heuristic algorithms (FASTA and BLAST) generally used in bioinformatics.
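The dynamic programming idea behind local alignment can be sketched in a few lines of Python. This is a minimal illustration in the spirit of the Smith and Waterman algorithm, not the version presented in the course: the scoring values (match +2, mismatch -1, linear gap penalty -2) are assumptions chosen only for the example, whereas real tools use substitution matrices and affine gap penalties.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for one best local alignment.

    Scoring parameters are illustrative assumptions (simple match/mismatch
    scores and a linear gap penalty).
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Traceback from the best cell until a cell with score 0 is reached
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(smith_waterman("TTACGTT", "GGACGGG"))
```

The table has (len(a)+1) x (len(b)+1) cells, so time and memory grow with the product of the two sequence lengths; this cost is what motivates heuristic tools when searching large databases.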
Multiple alignment is the extension of the alignment principle to an arbitrary number of
sequences. It is recognized as a hard problem, because its complexity is exponential in
the number of sequences to be processed. We give an overview of the most commonly used
heuristics, which rely on the set of pairwise alignments of all the sequences.
The notion of a motif is also central in bioinformatics. It rests on the fact that
portions of sequences involved in important biological processes must be conserved by
evolution and are therefore present, in a more or less similar form, in a large number of
sequences. We present the different representations of motifs in use, as well as the
classical methods for discovering them within a set of sequences.
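As an illustration of one possible motif representation, the sketch below converts a pattern written in a PROSITE-like notation into a Python regular expression and scans a sequence with it. The choice of this notation is an assumption made for the example (the course text does not say which representations it covers), only a subset of the notation is handled, and the pattern and the toy sequence are hypothetical.

```python
import re

# One pattern element: a residue, a set [..], an exclusion {..} or the
# wildcard x, optionally followed by a repeat count (n) or (n,m).
_ELEMENT = re.compile(r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a PROSITE-like pattern (subset of the syntax) to a regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, lo, hi = m.groups()
        if residue == "x":                 # x means any residue
            residue = "."
        elif residue.startswith("{"):      # {AB} means any residue except A or B
            residue = "[^" + residue[1:-1] + "]"
        if lo is not None:                 # (n) or (n,m) repeat counts
            residue += "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(residue)
    return "".join(parts)


def find_motif(sequence, pattern):
    """Return (start, end) positions of non-overlapping matches of the pattern."""
    regex = re.compile(prosite_to_regex(pattern))
    return [m.span() for m in regex.finditer(sequence)]


if __name__ == "__main__":
    pattern = "C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H"  # illustrative pattern
    toy = "MK" + "C" + "AA" + "C" + "AAA" + "L" + "A" * 8 + "H" + "AAA" + "H" + "GG"
    print(prosite_to_regex(pattern))
    print(find_motif(toy, pattern))
```

Such patterns are regular expressions, so they can state which residues are allowed at each position but cannot express dependencies between distant positions.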
All these approaches work on the class of regular languages and are therefore unable to
represent the correlations within a sequence from which structural properties can be
deduced. Other methods predict the secondary structure of proteins, but they are still
far from being completely reliable.
8.7 Application of bio-inspired systems in proteomics
This section begins with an example of a recent bio-inspired algorithm, called swarm
intelligence, modeled on the functioning of social insect colonies. The idea is that a
set of entities, each governed by a set of very simple rules and interacting with one
another, will produce a complex global emergent behavior. These algorithms have proved
particularly effective for solving hard combinatorial problems such as the traveling
salesman problem.
We then present an area of bioinformatics that lends itself well to analysis by a
bio-inspired approach because of its high combinatorial complexity: proteomics. A first
step in proteomic analysis is protein identification. The approach considered consists of
separating the proteins by two-dimensional electrophoresis, followed by a mass
spectrometry measurement and a comparison between the experimental data and a database of
protein sequences. We give the example of an automatic identification algorithm based on
a score learned by a genetic algorithm. An extension of genetic algorithms that supports
classification has been developed so that scores specialized by protein category (mass,
charge...) can be learned. Another example, identification by MS/MS mass spectrometry,
shows the use of swarm intelligence to traverse a graph that models the set of amino acid
sequences that may correspond to a given spectrum.
The second (and most difficult) step of proteomic analysis is characterization, in which
one seeks to predict the function of a new protein from its sequence and studies the
interactions between the different proteins. The main bioinformatics approach to
characterization is sequence comparison and motif searching. We present an evolutionary
algorithm for inferring motifs from a set of sequences. Compared with classical methods,
this approach allows a better exploration of the search space of all the positions
conserved in the set of sequences.
Other work using bio-inspired algorithms for bioinformatics is currently under way, and
such algorithms will become increasingly necessary as experimental data and knowledge of
biological phenomena continue to grow.
Part Six
Complexity and self-organization in social insects
Jean-Louis Deneubourg
Centre d’étude des phénomènes non linéaires et des systèmes complexes
Université Libre de Bruxelles
CP 231, Boulevard du Triomphe, B-1050 Bruxelles
Chapter 9
Introduction
9.1 Introduction
Over the last thirty years, the study of social behavior has been dominated by the
identification of the ultimate causes underlying social strategies, in particular those
involved in reproduction and conflict resolution [1]. In social insects, work has focused
mainly on the origins of eusociality and on reproductive conflicts. Studies of the
different forms of cooperation, of information processing (at the individual level) and
of the associated collective dynamics have remained marginal. In recent years, however,
these questions have begun to attract the attention both of biologists (and not only
specialists in social insects) and of other disciplines, in particular the nonlinear
sciences and the information sciences. Thus various behavioral algorithms identified in
social insects form the basis of new procedures for tackling classical problems in
operations research [2].
The joint use of experimental and theoretical methods is necessary to understand these
collective behaviors, and in particular the link between the behavior of individuals and
that of the group. The problems addressed in the course were intended to illustrate not
only the major questions of the field, but also the different approaches that are needed
and the important contribution that the physicist can make.
The course dealt with the experimental study of the social dynamics of model social
systems that rely heavily on amplifying communication: various activities of insect
societies, ants in particular (food collection, construction, ...), and of gregarious
species such as cockroaches.
In insect societies, many (all?) collective decisions and structures result from a
multitude of local interactions that do not explicitly contain the elements of the
collective response.
In the course of its activities, the insect colony faces situations that offer a large
number of choices, each characterized by various parameters (for a food source, for
example, its productivity, its size, its distance from the nest or from other sites, the
presence of competitors, ...). The «collective» part deals with the organization of the
insects, and its dynamics, in space and over the different sites offered. The individual
part aims to identify the behavioral algorithms. Modeling, whose tools are largely
borrowed from nonlinear physics, makes it possible to establish a link between the two
levels [3].
The articles reproduced below illustrate these questions and the approaches that allow
them to be addressed for two important activities: foraging and construction in the broad
sense of the term. Many studies have been devoted to the collective choice of paths or
sources. These experiments have shown how trail recruitment and the modulation of
individual behaviors allow the colony to select the most rewarding sources or the
shortest paths or networks. The models show that a diversity of patterns can be produced
without any modulation of individual behaviors.
These successes sometimes lead one to underestimate the behavioral complexity of
individuals. A body of experimental and theoretical work has led to a reassessment of the
level of individual complexity needed to reach an efficient collective response [3,4]. In
these dynamics, an essential role is played by the physics of the problem and by the use
of «intelligent» cues, which constitute a pre-processing of information for the animal.
Identifying these cues and the behavioral algorithms requires a detailed, quantitative
analysis of individual behaviors (e.g. speed and trajectory, probability of emitting a
signal, of responding to the trail, ...). The cues that modulate individual behavior have
been identified for certain situations such as the exploitation of prey (decision to
recruit according to prey size, organization of cooperative transport [5]). This aspect
is only briefly addressed in the articles reproduced here. Today, however, this question
is beginning to attract the attention of a growing number of theoreticians and
experimentalists, as is the question of the synergy between these individual capacities
and the collective dynamics in problem solving.
9.2 Bibliography
[1] For a recent review see, for example, S. Aron & L. Passera, Les sociétés animales,
De Boeck Université, Bruxelles (2000).
[2] For a review aimed at a general scientific audience see E. Bonabeau & G. Theraulaz,
Scientific American (March 2000) or Pour la Science (April 2000).
[3] S. Camazine, J. L. Deneubourg, N. Franks, J. Sneyd, E. Bonabeau & G. Theraulaz,
Self-organization in biological systems, Princeton University Press, Princeton (2001).
[4] Cl. Detrain, J. L. Deneubourg & J. M. Pasteels (eds), Information Processing in
Social Insects, Birkhäuser Verlag, Basel (1999).
[5] Cl. Detrain, J. L. Deneubourg & J. M. Pasteels, Decision-making in foraging
by social insects, in Information Processing in Social Insects, eds Cl. Detrain,
J. L. Deneubourg & J. M. Pasteels, Birkhäuser, 331-354 (1999).
[6] Cl. Detrain & J. L. Deneubourg, Prey scavenging by Pheidole pallidula. A key for
understanding decision-making systems in ants, Animal Behaviour 53, 537-547
(1997).
[7] A. C. Mailleux, Cl. Detrain & J. L. Deneubourg, How do ants assess food volume,
Animal Behaviour 59, 1061-1069 (2000).
Chapter 10
Optimality of Collective Choices: a
Stochastic Approach
S. C. Nicolis1,4, C. Detrain2, D. Demolin3 and J. L. Deneubourg1
1Center for Nonlinear Phenomena and Complex Systems, Université Libre de Bruxelles,
Campus Plaine, CP231, 1050 Brussels, Belgium
2Biologie animale et cellulaire, Université Libre de Bruxelles, Campus Solbosh, CP160/12
1050 Brussels, Belgium
3Laboratoire de Phonologie, Université Libre de Bruxelles, Campus Solbosh, CP175, 1050
Brussels, Belgium
4Author to whom correspondence should be addressed: snciolis@ulb.ac.be
Abstract
Amplifying communication is a characteristic of group-living animals. This
study is concerned with food recruitment by chemical means, known to be associated
with foraging in most ant colonies but also with defense or nest moving. A
stochastic approach to collective choices made by ants faced with different sources
is developed to account for the fluctuations inherent to the recruitment process. It
has been established that ants are able to optimize their foraging by selecting the
most rewarding source. Our results suggest that the selection is the result of a trail
modulation according to food quality and an intrinsic capacity of individuals to lay
a certain quantity of pheromone. We show the existence of an optimal quantity of
laid pheromone for which the selection of a trail is at its maximum, whatever the
difference between the two sources might be. In terms of colony size, large colonies
more easily focus their activity on one source. Moreover, trail selection is sharper
if many individuals lay small quantities of pheromone, instead of a small group of
individuals laying a larger amount of trail pheromone. These properties of optimality
and efficiency are generic and can be extended to other social phenomena in which
competition between pieces of information occurs.
Key Words: trail-recruitment, optimization, collective choice.
10.1 Introduction
Amplifying communication is a characteristic of group-living animals, such as social
arthropods [1,2,3,4,5,6,7], one common type of such communication being recruitment.
The nature of the interactions involved in this phenomenon depends on the species
and can involve chemical means and/or physical contacts [2,8,9,10,11,12]. Mathematical
modeling of these amplifying processes then leads to coupled nonlinear differential
equations linking the characteristics of individual behavior to the collective
response [13,14].
The present study is concerned with food recruitment by chemical means, known
to be associated with foraging in most ant colonies but also with defense or nest
moving [5,15]. In particular, we study cases of competition between food sources
leading to trail selection and choice of a particular source by the colony. In this respect
ants are able to optimize their foraging behavior by selecting the most rewarding
source, due merely to a modulation of the quantity of pheromone laid on a trail [5,15].
The study of the mechanisms at the origin of such a modulation is hindered by a
number of practical difficulties. In particular, it is not clear how one can identify,
experimentally or by mathematical modeling based on the traditional mean-field
approach, the parameters that can optimize the selection of a source. It is therefore
important to resort to other methods. In the present paper a stochastic method of
simulation of the trail system is developed to account for the fluctuations inherent to
the trail recruitment process due, for example, to the variable frequency of individuals
leaving the nest and/or arriving at the choice point between the trails. By taking this
kind of fluctuation into account, we complement the mean-field type of analysis and obtain
key information of a statistical nature, such as the frequency with which an ant visits
one or the other trail or the selection rate of a particular trail i, which is
inaccessible in the traditional approach. Furthermore, fluctuations in the number of
individuals are expected to be important in small colonies.
The model is developed in Section 10.2. Section 10.3 is devoted to our main results.
In the last section (Section 10.4) we discuss how one can gain further insights from
experiments available on this process.
10.2 The model
10.2.1 Mean field formulation
The model describes the evolution of the concentration of trail pheromone and, as a
consequence, the traffic of the ants over each trail. The analytical formulation at the
mean-field level and its experimental validation have already been carried out in the
case where two sources are present [16] (for other models, see [17,18]), while a general
model accounting for a large number of sources has been developed by Nicolis &
Deneubourg (1999) [14].
The differential equations describing the time evolution of the pheromone
concentration (ci) on the trails contain two terms. The first, positive term reflects the
«birth» of trail i, and the second, negative term describes the «death» of trail i
through the progressive disappearance of the pheromone by evaporation, −νci. The flux
of foragers from the nest to the trails (φ) is related to the colony size. The quantity
of pheromone laid on trail i (qi) is related to the richness of source i, and ν is the
evaporation rate of the pheromone. The function Fi describes the relative attractiveness
of trail i over the others. The form taken here is [19]
$$F_i = \frac{(k + c_i)^l}{\sum_{j=1}^{s} (k + c_j)^l}, \tag{10.1}$$
k acting like a concentration threshold beyond which the choice of a trail begins to be
effective. The parameter l stands for the sensitivity of the choice of a particular trail
to the pheromone concentrations ci present. In what follows it is fixed at l = 2, a value
drawn from experiments on the species Lasius niger [20,21].
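As a short worked illustration of eq. (10.1), consider two trails with k = 6 (the value used in the figure captions) and a single deposit on trail 1. The deposit size, one unit, is an assumption chosen only for the arithmetic, so that c_1 = 1 and c_2 = 0. With l = 2 the first ant's deposit already biases the next choice:

$$F_1 = \frac{(6 + 1)^2}{(6 + 1)^2 + 6^2} = \frac{49}{85} \approx 0.58, \qquad F_2 = \frac{36}{85} \approx 0.42,$$

whereas with l = 1 the same deposit gives the weaker bias

$$F_1 = \frac{7}{7 + 6} = \frac{7}{13} \approx 0.54.$$

The exponent l therefore controls how strongly a given difference in pheromone concentration is amplified, and k sets the concentration scale below which the two trails remain nearly equally attractive.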
The model equations can now be written in the form
$$\frac{dc_i}{dt} = \phi\, q_i\, \frac{(k + c_i)^l}{\sum_{j=1}^{s} (k + c_j)^l} - \nu c_i, \qquad i = 1, \ldots, s, \tag{10.2}$$
s being the number of sources present.
Fig. 10.1 summarizes the main results of analytical work previously performed
on these equations in the case where two sources are present (for a discussion of s
sources, see [14]). It shows the bifurcation diagram of c1/(c1 + c2) with respect to
the parameter q1. As can be seen, when q1 = q2 (Fig. 10.1(a)) we have a typical
pitchfork bifurcation diagram, meaning that the homogeneous state (equal exploitation
of the two sources) becomes unstable at a particular value of the parameter
(q1 = 2νk/φ). As q1 becomes different from q2, one witnesses the breaking of the pitchfork
bifurcation. In particular, for increasing differences between the two food sources
(Fig. 10.1(b), (c), (d)), the colony is led to exploit preferentially one particular source,
since only one stable inhomogeneous solution subsists in a wide region of parameter
values. Moreover, in the domain of coexistence of two states, the attraction basin
around one of the inhomogeneous solutions is greater than that of the other.
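The loss of stability of the homogeneous state can be checked numerically with a minimal Python sketch that integrates eqs. (10.2) for two equal sources by the explicit Euler method. The parameter values k = 6, φ = 0.01 s⁻¹ and ν = 1/2400 s⁻¹ are those quoted in the caption of Fig. 10.1, for which the threshold q1 = 2νk/φ equals 0.5. The integration scheme, the time step, the total duration and the slightly asymmetric initial condition are assumptions made for this illustration, not part of the original analysis.

```python
def attractiveness(c, k=6.0, l=2.0):
    """Relative attractiveness F_i of each trail, eq. (10.1)."""
    weights = [(k + ci) ** l for ci in c]
    total = sum(weights)
    return [w / total for w in weights]


def integrate_mean_field(q, phi=0.01, nu=1.0 / 2400.0, k=6.0, l=2.0,
                         c0=None, dt=10.0, t_end=2.0e5):
    """Explicit Euler integration of eqs. (10.2); returns final concentrations.

    Assumptions of this sketch: time step dt (seconds), duration t_end and a
    tiny initial bias towards trail 1 that breaks the symmetry.
    """
    c = list(c0) if c0 is not None else [0.01] + [0.0] * (len(q) - 1)
    for _ in range(int(t_end / dt)):
        F = attractiveness(c, k, l)
        # dc_i/dt = phi * q_i * F_i - nu * c_i
        c = [ci + dt * (phi * qi * Fi - nu * ci) for ci, qi, Fi in zip(c, q, F)]
    return c


def fraction_on_trail_1(q, **params):
    """Order parameter c1 / (c1 + c2) plotted in the bifurcation diagram."""
    c = integrate_mean_field(q, **params)
    return c[0] / sum(c)


if __name__ == "__main__":
    print(fraction_on_trail_1([0.25, 0.25]))  # below the threshold q = 0.5
    print(fraction_on_trail_1([1.0, 1.0]))    # above the threshold q = 0.5
```

Below the threshold the initial bias decays and both trails end up equally marked; above it the bias is amplified and, under these assumptions, the fixed point of the equations puts roughly 93% of the pheromone on one trail.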
We investigate the effects of stochasticity in the process and, in particular, how
fluctuations in the number of individuals at the choice point (φ) and in the choice itself
influence the frequency of collective selection.
Figure 10.1: Bifurcation diagrams of the steady state solutions of eqs. (10.2) as a function of q1
in the case q2/q1 = 1 (a); q2/q1 = 0.75 (b); q2/q1 = 0.5 (c); q2/q1 = 0.25 (d). Parameter values
k = 6, φ = 0.01 s⁻¹ and ν = 1/2400 s⁻¹.
10.2.2 Principle and implementation of a Monte Carlo simulation
In order to sort out the main effects arising from the fluctuations we resort to Monte
Carlo simulations. The advantage of this type of approach is that one can simulate
directly the process of interest rather than solve master type equations [22] modeling
it at a probabilistic level. In such a numerical experiment, the random aspects of the
process are thus automatically incorporated. We can summarize the different steps as
follows.
a. Initial conditions
The pheromone concentrations and numbers of ants over each trail are set to
zero.
b. Decision process
- The first decision is whether or not an ant arrives at the choice point.
The probability of arrival is given by the normalized value of the flux
parameter. A random number is sampled from a uniform distribution between
0 and 1. If its value is less than or equal to φ, an ant comes to the choice
point.
- The second decision is the choice of the trail. The trails initially have
the same probability of being followed, but the probabilities differ as soon
as at least one individual has adopted a trail and laid a quantity of
pheromone. The choice of a specific trail is governed by the function (10.1)
used in the analytical formulation of the model. It is implemented by sampling
a second random number from a uniform distribution. If it is less than or
equal to F1 (eq. (10.1) for i = 1), the ant follows trail 1 and lays pheromone
on it. If not, but it is less than or equal to F1 + F2, the ant follows and
marks trail 2, and so on with the cumulative sums of the Fi.
c. Time evolution
When an ant chooses a trail i, it lays a quantity qi of pheromone, which then
gradually evaporates at the rate ν. Hence, the probabilities represented
by function (10.1) are updated at each simulation step according to the actual
pheromone concentrations. The process is repeated for a number of steps
sufficient to reach the stationary state, where the total quantity of pheromone over
both trails is constant.
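The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' program: one simulation step is taken to represent one second (so that φ in s⁻¹ is used directly as the arrival probability per step), evaporation is applied as a multiplicative factor (1 − ν) per step, and the number of steps and of realizations are kept small so that the example runs quickly (the text uses 30000 realizations). The function names are hypothetical.

```python
import random


def choice_probabilities(c, k=6.0, l=2.0):
    """Probability F_i of choosing each trail, eq. (10.1)."""
    weights = [(k + ci) ** l for ci in c]
    total = sum(weights)
    return [w / total for w in weights]


def simulate_colony(q, phi=0.1, nu=1.0 / 2400.0, k=6.0, l=2.0,
                    steps=20000, rng=None):
    """One realization; returns the F_i at the end of the run.

    Assumptions of this sketch: one step = one second, evaporation applied as
    c -> c * (1 - nu) at every step.
    """
    if rng is None:
        rng = random.Random()
    c = [0.0] * len(q)                       # step a: no pheromone initially
    for _ in range(steps):
        if rng.random() <= phi:              # step b, first decision: arrival
            F = choice_probabilities(c, k, l)
            u, cumulative = rng.random(), 0.0
            chosen = len(q) - 1              # fallback against rounding error
            for i, Fi in enumerate(F):       # step b, second decision
                cumulative += Fi
                if u <= cumulative:
                    chosen = i
                    break
            c[chosen] += q[chosen]           # step c: deposit q_i on the trail
        c = [ci * (1.0 - nu) for ci in c]    # step c: evaporation
    return choice_probabilities(c, k, l)


def selection_index(q, realizations=20, seed=0, **params):
    """Average final F of the richest source over several realizations."""
    rng = random.Random(seed)
    richest = max(range(len(q)), key=lambda i: q[i])
    total = 0.0
    for _ in range(realizations):
        total += simulate_colony(q, rng=rng, **params)[richest]
    return total / realizations


if __name__ == "__main__":
    print(selection_index([1.0, 0.5], realizations=20, seed=0))
```

Seeding the generator makes a run reproducible; averaging over many realizations is what turns the random outcome of a single colony into a selection frequency.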
The simulations are run for 30000 realizations and we calculate the mean selection
percentage over these simulations, that is, the average value of the fraction associated
with the richest source (eq. (10.1)) at the stationary state (in the following, this will
be referred to as the index R). The choice of other indicators (ratio of the total number
of passages, ratio of the quantity of pheromone deposited, food retrieved on the different
paths, ...) leads to similar results.
10.3 Results
We consider three distinct cases. First we study the role of the colony size in the
selection of the richest source. Next, we show the existence of an «optimum» absolute
value of qi in the selection of the richest source and the corresponding choice of a
foraging path. Finally we study the relative role of the distribution of the total
quantity of pheromone among the individuals of the colony. The following results
correspond to the situation where two sources are simultaneously offered to the colony.
10.3.1 The role of the colony size
We are interested in how the parameter φ (the ant flux, known to be related to the size
of the colony) affects the selection rate (ratio of the frequencies in eq. (10.1))
of the richest source. Fig. 10.2 gives the selection rate of the trail leading to the
richest source with respect to φ, for different absolute values of q1 and for q2/q1 fixed
to the value 0.75. One sees that, for small values of q1, most individuals in large
colonies focus on the trail leading to the richest source, even if the selection rate is
less sensitive for small values of the flux. In other words, ants from small colonies have
to lay large quantities of trail pheromone to reach a good selection rate, while
individuals from large colonies can lay smaller quantities per passage and reach a better
global selection rate.
Figure 10.2: Selection rate versus parameter φ for different values of q1 and q1/q2 = 0.5. Parameter
values are k = 6 and ν = 1/2400 s⁻¹.
10.3.2 Optimization of the selection
We now study the influence of the absolute values of q1 and q2 on the selection rate for a
given colony size and ratio q2/q1. As seen in Fig. 10.3, there exists an optimal value
of q1 (and thus of q2) for which the selection of the richest source reaches a maximum at
the stationary state. The maximum is higher if the difference between the two sources
(measured by the ratio q2/q1) is larger. This can be intuitively understood, since an
increasing difference between sources leads to less marked competition between trails
(inducing the selection of the richest source). Indeed, larger differences in trail
modulation according to food quality imply a higher determinism in the choice of the
richest source.
Figure 10.3: Selection rate (ratio of the frequencies (eq. (10.1)) at the end of the process) versus
parameter q1 for different values of q2/q1 with φ = 1/600 s⁻¹ (a); and φ = 1/10 s⁻¹ (b). Parameter
values as in Fig. 10.2.
We also see that, for increasing values of the flux parameter, the maximum is
shifted to smaller absolute values of q1. This is related to our previous results
in Sect. 10.3.1 showing how large colonies are able to reach a high selection rate
with small values of q1. Remarkably, the notion of optimized selection of the richest
source holds true in the transient regime as well. This is shown in Fig. 10.4, where the
selection after one hour of exploitation is plotted versus the parameter q1.
These results mean that the optimized selection of a trail leading to the richest
source is due not only to the relative modulation of trail-laying according to food
quality (q2/q1) but also to the intrinsic capability of individuals to lay a certain
quantity q of trail pheromone. A natural question is whether the existence of such a
maximum is due to the bifurcation (multistationarity) underlying the process of choice
(Fig. 10.1), but there is currently no evidence of a one-to-one link between the two
phenomena. Still, it is worth noting that in a situation where the parameter l is equal to
1 (applicable to bees, see e.g. [11]), meaning that there is no bifurcation and that the
probability of selecting one source is directly proportional to the relative number of
recruiters, the maximum remains but has a very weak amplitude (Fig. 10.5a). On the
other hand, Fig. 10.5b depicts the hypothetical case of l equal to 5, implying a stronger
nonlinearity (but otherwise giving rise to the same bifurcation properties as l = 2), that
is to say a higher sensitivity to the pheromone, associated with a marked amplification
effect. The maximum again persists but is now sharper than for smaller values of the
parameter l. Moreover, there is a value of the parameter l for which the selection is
optimized. Fig. 10.5c gives the selection rate of the richest source (q2/q1 = 0.5) for two
different (given) values of q1 and q2 with respect to the parameter l. It should be noted
that the l values which optimize the selection lie between 1.9 and 2.6 (this range
includes the experimental value of the parameter for the species Lasius niger, l ≈ 2).
Moreover, it can be shown that if we decrease the flux of individuals, the maximum is
shifted to higher values of l, suggesting that if a colony possesses a small number of
individuals, the ants need to have a more deterministic behavior (close to an all-or-none
response).
Figure 10.4: Selection rate versus parameter q1 for q2/q1 = 0.5 after one hour of exploitation.
Parameter values are k = 6, ν = 1/2400 s⁻¹ and φ = 0.1 s⁻¹.
Figure 10.5: Selection rate versus parameter q1 for different values of q2/q1 in the case of parameter
value l = 1 (a) and l = 5 (b). (c) stands for the selection rate versus parameter l for q2/q1 = 0.5 for two
different absolute values of q
1
1
1 and q2. Parameter values are φ = 1/10s− , k = 6 and ν = 1/2400s− .
10.3.3 High trail-laying vs. being numerous
Fig. 10.6 shows the optimal selection (maxima in Figs. 10.3) versus the parameter φ.
The corresponding q1 are explicitly indicated. We see that the optimal value is always
larger for high values of φ and small values of q1. These results thus clearly show that
it is more efficient for an ant colony to have more individuals who lay small quantities
of pheromone than to have few individuals laying large quantities of pheromone.
Figure 10.6: Optimal selection rate versus parameter φ. The numbers written next to the points
stand for the values of q1 which optimize the selection.
We have so far considered a simplified case with only one behavioral category of ants,
all engaged in trail laying. However, in nature, ant colonies are composed of different
groups of individuals, some of which are specialized in trail laying. Hence we need to
address the situation in which the colony is composed of active (laying) ants and
inactive (non-laying) ones. In this case, the parameter φ stands only for the flux of
active ants, the size of the colony being constant. In this respect, for a given total
quantity of trail pheromone, ant societies have the choice of allocating this task to a
large number of individuals or restricting this activity to a small number of individuals
laying a large quantity q of trail pheromone. More specifically, we are interested in how
the "distribution" of the total quantity of pheromone in the colony can lead to a higher
selection of the richest source. We thus consider the situation where the product of the
parameters φ (flux of active ants) and q1 is constant but the absolute values of the
parameters change, meaning that there is a trade-off between the number of laying ants
and the quantity of pheromone laid down per individual. Fig. 10.7 shows the selection
rate with respect to the parameter φ (a) and q1 (b). While in a mean field approach all
these situations would lead to identical global responses, we see that, in the presence of
fluctuations, the selection of one trail becomes more marked for large φ and low q1.
This suggests that colonies with more laying individuals, each laying less pheromone,
are better able to focus on one trail than colonies with fewer active ants. The figures
also show that the selection is better when the difference between the sources is larger
(parameter q2/q1). Notice that since in our setting the final choice is only sensitive to
the concentration of pheromone and thus to the effect of active ants, our previous
results remain valid.
Figure 10.7: Selection rate versus φ (a) and q1 (b) for different values of q1/q2, the product φq1 being
constant and equal to 0.001. Parameter values as in Fig. 10.3.
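To make the comparison at constant φq1 concrete, here is a minimal Python sketch of a stochastic two-trail simulation. It is not the authors' simulation code: the discrete time step, the choice rule borrowed from the function (k + c)^l used in the analytical model, the 75% traffic criterion used to define a "selection", and the names `simulate_choice` and `selection_rate` are all assumptions made for illustration. Source 1 is taken to be the richer one (q1 > q2).

```python
import random


def simulate_choice(phi, q1, q2, k=6.0, l=2.0, nu=1.0 / 2400.0,
                    duration=3600.0, dt=1.0, rng=None):
    """One stochastic run; returns the number of ants that chose each trail."""
    if rng is None:
        rng = random.Random()
    c1 = c2 = 0.0          # pheromone concentration on each trail
    n1 = n2 = 0            # ants that took trail 1 / trail 2
    for _ in range(int(round(duration / dt))):
        # Evaporation over one time step.
        c1 -= nu * c1 * dt
        c2 -= nu * c2 * dt
        # With probability phi*dt one laying ant leaves the nest in this step.
        if rng.random() < phi * dt:
            w1 = (k + c1) ** l
            w2 = (k + c2) ** l
            if rng.random() < w1 / (w1 + w2):
                c1 += q1
                n1 += 1
            else:
                c2 += q2
                n2 += 1
    return n1, n2


def selection_rate(phi, q1, q2, runs=200, threshold=0.75, seed=0, **kwargs):
    """Fraction of runs in which trail 1 carried at least `threshold` of the traffic.

    The threshold criterion is an assumed definition, chosen for illustration.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        n1, n2 = simulate_choice(phi, q1, q2, rng=rng, **kwargs)
        if n1 + n2 > 0 and n1 / (n1 + n2) >= threshold:
            hits += 1
    return hits / runs


if __name__ == "__main__":
    # Same product phi*q1 = 0.001 split in two different ways (q2/q1 = 0.5).
    print(selection_rate(phi=0.1, q1=0.01, q2=0.005))
    print(selection_rate(phi=0.01, q1=0.1, q2=0.05))
```

No output is quoted here, since the numbers depend on the seed and on the assumed selection criterion. The point of the sketch is only that two parameter sets with the same mean-field input φq1 can be compared run by run once fluctuations are included.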
10.4 Discussion
An approach accounting for the fluctuations in the number of foragers and the
stochasticity in the decision process has been carried out here for the case where an ant
colony is confronted with the possibility of following trails leading to food sources of
different quality. The simulation extends the analytical mean field formulation
previously studied, by giving access to some additional results of a statistical nature.
We first showed the existence of a preferred quantity of laid pheromone for which
the selection of a trail is at its maximum, whatever the difference between the two
sources might be. Moreover, in terms of colony size, we saw that large colonies
can more easily focus on one trail but also that focusing is sharper if the individuals lay
small quantities of pheromone. This is especially clear from the results of Sect. 10.3.3,
where we kept the product of the parameters φ and q1 constant while varying each of
them. It strongly suggests that the selection results not only from a modulation
of trail-laying according to food quality but also from the intrinsic capacity of individuals
to lay a certain quantity of pheromone. Furthermore, small colonies (or small groups of
ants specialized in trail-laying) are less able to take advantage of trail recruitment
than large colonies (or large groups of trail-laying foragers).
It is well known that trail recruitment in ants mainly occurs in large societies.
Different hypotheses have been formulated to explain the positive correlation between
cooperativity through trail recruitment and colony size [23,5,24]. Our results provide
further insight into this matter by showing that large numbers of trail-laying ants
enhance the optimality and the efficiency of collective choices.
Our results also suggest that optimal responses are reached when, during recruitment,
the majority of the foragers are involved in trail-laying. Experimental results
on the mass-recruiting ant Lasius niger seem to agree with our prediction [25].
These authors show that close to 90% of the foragers lay down a trail pheromone at
the beginning of the recruitment. One can expect the decisions made at the beginning
of the recruitment to play an essential role in the final collective choice. This is
confirmed by the analysis and the dynamics of our model (e.g. after ten minutes, in the
case of q1 = q2 = 0.1 (0.2) and a flux equal to 0.1 sec−1, 75% (85%) of the simulations
have already made their choice). This result could mean that a high percentage of laying
individuals is needed to optimize the selection of a source. Later, when the choice is
made, the percentage of trail-laying ants may decrease without affecting the foraging
efficiency of the colony, as experimentally observed through the extinction of trail-laying
behavior over successive trips [25,26].
As pointed out in the Introduction, the results obtained in this paper in the specific
biological context of trail recruitment can be generalized to other decision processes
involving different competing options [1,13]. For instance, aggregation can be described
by similar mathematical models when individuals of a colony have the choice between
sites of different relative attractiveness in which to aggregate [27,28,29]. It can therefore
be expected that, since the mechanisms underlying this phenomenon (and more
generally all phenomena implying competition) are similar to recruitment, the same kind
of fluctuations are at work and there exists an optimal value of amplification and
interaction between animals.
Acknowledgments.
This study was supported by the Belgian Fund for Joint Basic
Research (grant no. 2.4510.01). C. Detrain and J. L. Deneubourg are research
associates of the Belgian National Fund for Scientific Research. S. C. Nicolis
thanks the Action de Recherche Concertée and the Fondation David et Alice Van
Buuren for their support. The authors are grateful to Nigel Franks for helpful discussions
regarding this work.
10.5 Bibliography
[1] E. Bonabeau, M. Dorigo, and G. Theraulaz, Swarm Intelligence: From natural to
artificial Systems, Oxford University Press (1999).
[2] S. Camazine, and J. Sneyd, A model of collective nectar source selection by
honeybees. Self-organization through simple rules, J. Theor. Biol. 149, 547-571
(1991).
[3] C. Detrain, J. L. Deneubourg, and J. M. Pasteels, Information Processing in So-
cial Insects, Birkhäuser Verlag, Basel (1999).
[4] T. D. Fitzgerald, The tent caterpillars, Cornell University Press, Ithaca (1995).
[5] B. Hölldobler, and E. O. Wilson, The ants, Springer Verlag, Berlin (1991).
[6] T. D. Seeley, The wisdom of the hive, Harvard University Press, Cambridge
(1995).
[7] G. Theraulaz, and F. Spitz, Auto-organisation et comportement, Hermes Editions
(1997).
[8] J. T. Costa, and R. W. Louque, Group foraging and trail following behavior of
the red-headed pine sawfly Neodiprion lecontei (Fitch) (Hymenoptera: Symphyta:
Diprionidae), Annals of the Entomological Society of America 94, 480-489 (2001).
[9] D. M. Miller, and P. G. Koehler, Trail-Following Behavior in the German
Cockroach (Dictyoptera: Blattellidae), J. of Econom. Entomol. 93, 1241-1246 (2000).
[10] C. Ruf, J. T. Costa, and K. Fiedler, Trail-based communication in social
caterpillars of Eriogaster lanestris (Lepidoptera: Lasiocampidae), J. Insect Behav. 14,
231-245 (2001).
[11] T. D. Seeley, S. Camazine, and J. Sneyd, Collective decision-making in honey-
bees: how colonies choose among nectar sources, Behav. Ecol. and Sociobiol. 28,
277-290 (1991).
[12] P. K. Visscher, and S. Camazine, Collective decisions and cognition in bees,
Nature 397, 400 (1999).
[13] S. Camazine, J. L. Deneubourg, N. R. Franks, J. Sneyd, E. Bonabeau, and
G. Theraulaz, Self-organized Biological Superstructures, Princeton University
Press, Princeton (2001).
[14] S. C. Nicolis and J. L. Deneubourg, Emerging Patterns and Food Recruitment
in Ants: an Analytical Study, J. Theor. Biol. 198, 575-592 (1999).
[15] J. F. A. Robson, and S. K. Traniello, Trail and territorial communication in social
insects, in: W. J. Bell and R. Cardé (eds), The Chemical Ecology of Insects, vol.
II, Chapman & Hall (1995).
[16] R. Beckers, J. L. Deneubourg, and S. Goss, Trail laying behavior during food
recruitment in the ant Lasius niger (L.), Ins. Soc. 39, 59-72 (1992).
[17] L. Edelstein-Keshet, Simple models for trail-following behavior: Trunk trails
versus individual foragers, J. of Math. Biol. 32, 303-328 (1994).
[18] T. Stickland, N. F. Britton, and N. R. Franks, Complex trails and simple algo-
rithms in ant foraging, Proceedings of the Royal Society London (B) 260, 53-58
(1995).
[19] J. L. Deneubourg, S. Aron, S. Goss, and J. M. Pasteels, The self-organizing
exploratory pattern of the Argentine ant, J. Insect. Behav. 3, 159-168 (1990).
[20] R. Beckers, J. L. Deneubourg, and S. Goss, Trails and U-turns in the selection of
a path by the ant Lasius niger, J. of Theor. Biol. 159, 397-415 (1992).
[21] R. Beckers, J. L. Deneubourg, and S. Goss, Modulation of trail laying in the ant
Lasius niger (Hymenoptera: Formicidae) and its role in the collective selection
of a food source, J. of Insect Behav. 6, 751-759 (1993).
[22] N. G. Van Kampen, Stochastic processes in physics and chemistry, North Hol-
land, Amsterdam (1981).
[23] R. Beckers, S. Goss, J. L. Deneubourg, and J. Pasteels, Colony size,
communication and ant foraging strategy, Psyche 96, 239-256 (1989).
[24] C. Anderson and D. W. McShea, Individual versus social complexity,
Biol. Rev. (Camb.) 76, 211-237 (2001).
[25] A. C. Mailleux, C. Detrain, and J. L. Deneubourg, How do ants assess food
volume?, Anim. Behav. 59, 1061-1069 (2000).
[26] O. Geissles, and F. Roces, Crop-loading dynamics in the nectar-feeding ant Cam-
ponotus rufipes: Effects of foraging experience, food quality and colony starva-
tion, Proceedings of the meeting UIEIS (European Sections) 27 (2001)
[27] A. Lioni, C. Sauwens, G. Theraulaz, and J. L. Deneubourg, Chain formation in
Oecophylla longinoda, J. Insect Behav. 14, 679-696 (2001).
[28] C. Rivault, G. Theraulaz, A. Cloarec, and J. L. Deneubourg, Auto-organization et
reconnaissance coloniale: le modèle de l’agrégation des blattes, Actes Colloques
Insectes Sociaux Section française, Albi (1998).
[29] Rasse et al., in preparation.
[30] D. T. Gillespie, Markov processes, Academic Press, San Diego (1992).
Chapter 11
Emerging Patterns and Food Recruitment in Ants: an Analytical Study
S. C. Nicolis (1) and J. L. Deneubourg (1)
(1) Center for Nonlinear Phenomena and Complex Systems, Université Libre de Bruxelles,
Campus Plaine, CP231, 1050 Brussels, Belgium
Abstract
A model of food recruitment by social insects accounting for the competition
between trails in the presence of an arbitrary number of sources is developed and
analyzed in detail. Both the case of identical environmental characteristics and the
case where one source and the corresponding trail are different from the others are
considered. Different collective responses depending on the environmental conditions,
and without change of individual behavior, are shown to exist, associated with the
possibility that the colony may be led to exploit one source or a group of sources
preferentially. The full bifurcation diagram of steady state solutions is constructed
from which the dominant exploitation patterns are identified. The biological relevance
of the results is discussed and suggestions are made for their experimental testing
in connection with the recruitment behavior of species using trail recruitment. The
same phenomenological model can be used for different trail-laying species since the
predictions are generic and not restricted to a given species, except for the parameter
values used.
Running title: Food recruitment patterns in ants
11.1 Introduction
Amplifying communication occupies an important place in the organization of animal
societies and, most particularly, in social insects. From a theoretical point of view,
amplification implies interaction between at least two individuals and is therefore
expected to introduce a nonlinear element into the dynamics [1] that could in principle
be manifested at the level of the society as a whole in the form of complex collective
spatio-temporal phenomena. Since nonlinearity generally implies multiplicity of
solutions, these phenomena cannot in principle be predicted without appealing to
mathematical modeling linking the characteristics of individual behavior to the
collective response [2,3,4,5,6,7].
One of the most intensely studied cases of communication is food recruitment.
Depending on the species, different processes may be involved. In bees, it implies direct
interactions between individuals [8,9,10]. In ants, recruitment is primarily ensured by
chemical means [11,12,13,14,15]. In the present paper we will be concerned with
recruitment in ant societies associated with foraging. Here a scout that has discovered a
food source returns to the nest, laying a pheromone trail which stimulates the inactive
foragers waiting in the nest (Fig. 11.1). These recruits can become recruiters in their
turn.
Roughly speaking, two types of phenomena are at work during this type of recruit-
ment:
- A first mechanism, where the recruiter and/or the trail stimulate the inactive
foragers waiting in the nest to leave it (worker ant recruitment),
- A second mechanism, only due to the trail which guides the recruited ants and
transmits the information concerning the location of the food source (trail re-
cruitment).
As recruits can become recruiters in their turn, both mechanisms are gradually
amplified.
It has been shown that a pheromone trail alone is able to stimulate forager ants
to leave their nest [14], but in most cases both recruiters and trail are involved.
On the other hand, as far as orientation is concerned, the situation is much
simpler: the trail alone is involved and the recruiter is not needed to guide the
recruitees.
An important step in the understanding of recruitment behavior has been the de-
sign of experiments in deliberately idealized situations, in which many of the com-
plications present in the real world can be eliminated. This has also facilitated the
development of mathematical models in which the parameters can be determined di-
rectly from the experiment. In this context, detailed experimental studies reported by
several authors [12,16,17,18] showed unexpected behavior when two food sources, or
two paths, were simultaneously made available. In particular, the competition between
the two chemical trails leading to the sources gave rise to a bifurcation phenomenon,
in which one of the trails attracted most of the population and predominated clearly
over the other.
Figure 11.1: Schematic representation of the recruitment process: (1) discovery of the source, (2)
return to the nest while laying a trail, (3) trail stimulates individuals to leave the nest toward the source.
Several models aiming to study the influence of different parameters on this behav-
ior have been proposed in the literature. Different methods of analysis have been ap-
plied, such as numerical solutions of the differential equations describing the evolution
of the relevant variables or Monte Carlo type of simulations (see e.g. [19,20,21,22,23]).
In the present paper a model capable of accounting for the competition between trails
in the presence of an arbitrary number of sources is proposed and analyzed in detail.
The model extends that previously proposed and tested experimentally by [24,25,26]
for species using trail laying recruitment, which was limited to two food sources or
two paths. In particular, the different types of steady state solutions are identified and
their stability properties studied. This allows us to construct the full bifurcation dia-
gram. As will be shown, the colony may be led to exploit only a subset of sources. The
model accounts not only for the environmental features such as the number and quality
of sources but also for social and physical parameters such as the flux of individuals,
the quantity of pheromone deposited on a trail, or its rate of evaporation. We will be
interested both in the case of a homogeneous environment and in the case where one
of the sources is different from the others.
The model is introduced in Sect. 11.2 where the problem of stability is also formu-
lated in the most general case. In Sect. 11.3 the steady state solutions and their stability
are studied in the case where the physical and chemical characteristics of the sources
and trails leading to them are identical. In Sect. 11.5, the case where the characteris-
tics of one of the sources and of the corresponding trail are different from the others
is considered. On the basis of the results obtained, a number of concrete, quantitative
predictions are made in these two sections for recruitment behavior of ants using trail
recruitment, using, as an example, experimentally determined parameter values for the
species Lasius niger. The procedure also applies to other ant species undergoing sim-
ilar types of recruitment, the only difference being in the parameter values. The main
conclusions are summarized in Sect. 11.7.
11.2 The model
As stressed in the Introduction, we concern ourselves only with the case of trail
recruitment (the second mechanism described there). We analyze the traffic
among these trails and, in particular, identify those that will be followed in a preferred
manner. One may reasonably expect that the direct interaction between individuals is
then superseded by their response to the pheromone concentration present in a given
trail. Thus, in the framework of such a "macroscopic" description, the principal
variables will be the pheromone concentrations rather than the numbers of individuals
present on the various trails at a given time. A schematic representation of the set-up
and the associated processes is given in Fig. 11.2a. The outgoing flux from the nest
(φ(R, c)), which depends on the nest size, is a function of the quantity of trail pheromone
(c) and the number of recruiters (R) inside the nest. However, these two variables are
strongly correlated, because a recruiter that comes back automatically adds pheromone
to the trail, and we shall consider that the flux is a function of the pheromone
concentration only, φ = φ(c). An empirical equation which seems to fit the
experimental data well is [24]:
\[
\phi(c) = \frac{a + b c^2}{d + b c^2}. \tag{11.1}
\]
Beckers et al. observed that ants lay a trail only after ingesting food. The
recruited ants thus do not lay a trail on their first trip between the nest and the food
source, but ants which leave the nest for subsequent foraging trips also lay a trail
between the nest and the food source. To complicate the problem further,
after a certain number of trips ants may stop laying a trail even though they are still
foraging [27]. It follows that between a departure from the nest and an increase of
pheromone there is a time delay τ (one round trip plus time in the nest plus loading
time). The increase of trail concentration at time t is then proportional to the flux at
time t − τ. As it turns out, taking this delay into account does not change the basic
features of the dynamics but may merely give rise to damped oscillations under certain
conditions (see e.g. [28]). Since the time delays (minutes) are negligible compared with
the time scale of the dynamics (hours), in the rest of this paper we will assume that the
flux parameter φ is a constant. One may check that this simplification has no effect on
the stationary solutions and their stability.
Under natural conditions or in experimental setups, ants have a choice between
trails (Fig. 11.2b). The model is mainly devoted to this orientation choice, for which
the trail alone is involved and the recruiter is not needed. Let ci be the pheromone con-
centration on trail i = 1, . . . , s. The rate of change of ci with time can be decomposed
into two parts. A first, positive contribution reflecting the "birth" of trail i through
the deposition of pheromone by the individual provided the food source is suitable;
and a second, negative contribution describing the "death" of the trail via progressive
disappearance of the pheromone through evaporation
\[
\frac{dc_i}{dt} = \phi\,\sigma_i\,F_i(\{c_i\}) - \nu_i c_i, \qquad i = 1, \ldots, s. \tag{11.2}
\]
Here φ is the total ant flow from the nest toward the trails (taken to be a constant
as discussed above), σi the quantity of pheromone deposited on trail i (in turn, an
increasing function of the quality of the food source), νi the corresponding evaporation
rate and Fi ({ci}) a function describing the relative attractiveness of trail i over the
others. It is reasonable to assume that this function increases with increasing values of
ci and eventually saturates at a plateau value as ci gets very large. The particular form
chosen here is [25]
\[
F_i = \frac{(k + c_i)^l}{\sum_{j=1}^{s} (k + c_j)^l}. \tag{11.3}
\]
Here s is the number of sources and k a concentration threshold beyond which the
pheromone laid on a trail begins to be effective. The parameter l measures the
sensitivity of the process of choice of a particular trail to the pheromone concentration
ci present, and will therefore be referred to hereafter as the "cooperativity parameter".
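As a quick illustration of how the cooperativity parameter shapes the choice function (11.3), the following minimal Python sketch evaluates Fi for a given set of trail concentrations. The function name `choice_probabilities` and the example concentrations are illustrative assumptions, not part of the original model description; the defaults k = 6 and l = 2 are the Lasius niger values quoted in the text.

```python
def choice_probabilities(c, k=6.0, l=2.0):
    """Return F_i = (k + c_i)^l / sum_j (k + c_j)^l for each trail (eq. 11.3)."""
    weights = [(k + ci) ** l for ci in c]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical concentrations: trail 1 carries pheromone, trail 2 is unmarked.
    for l in (1.0, 2.0, 5.0):
        print(l, choice_probabilities([12.0, 0.0], l=l))
```

With these hypothetical numbers the marked trail is chosen with probability 0.75 for l = 1 and 0.9 for l = 2, which shows how a larger l sharpens the response to the same difference in concentration.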
Figure 11.2: (a) Set-up proposed for experimental test of the model. The sources are placed around
the nest and are accessible through paths of identical texture. (b) Schematic representation of the choice
between two identical paths, once the individuals have left the nest.
This function has been applied and quantified in the case of different species, in
particular Lasius niger [16,24,27], Linepithema humile [25,26] and army ants [29], has
also been used for Messor pergandei [30], and has been tested in different situations. It
is a generic function describing the choice between paths in terms of the concentration
of pheromone. We recall that the interactions among recruitees on the trails do not
figure in the model.
The model accounts for three types of feedback:
- A positive, nonlinear feedback of trail i onto itself through the function Fi,
- A negative, linear, feedback of trail i onto itself through the evaporation of the
pheromone,
- A negative, nonlinear feedback of trail j onto trail i associated with competition.
It is convenient to introduce scaled variables and parameters through the
transformation
\[
C_i = \frac{c_i}{k}, \qquad q_i = \frac{\sigma_i}{k}, \qquad \Phi = \frac{\phi}{\nu}.
\]
Furthermore we assume that the environment and the substrate are homogeneous. We
thus subsequently set all νi equal to a common value ν which, by normalizing time,
can be set to ν = 1. Eqs. (11.2)-(11.3) then reduce to the system
\[
\frac{dC_i}{dt} = \Phi q_i \frac{(1 + C_i)^l}{\sum_{j=1}^{s} (1 + C_j)^l} - C_i. \tag{11.4}
\]
In the sequel the parameter l will be fixed to a value l = 2 compatible with ex-
periment [16,27]. The solutions of eq. (11.4) will therefore depend entirely on the
parameters Φqi and on s.
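Before turning to the steady states, the behavior of eq. (11.4) can be explored numerically. The sketch below integrates the scaled system with an explicit Euler scheme; the step size, the integration time, the initial condition and the function names `rhs` and `integrate` are assumptions chosen for illustration, not part of the original analysis.

```python
def rhs(C, phi_q, l=2.0):
    """Right-hand side of eq. (11.4): dC_i/dt for every trail."""
    weights = [(1.0 + Ci) ** l for Ci in C]
    total = sum(weights)
    return [pq * w / total - Ci for pq, w, Ci in zip(phi_q, weights, C)]


def integrate(C0, phi_q, l=2.0, dt=0.01, t_end=200.0):
    """Explicit Euler integration of eq. (11.4) from C0 up to t_end."""
    C = list(C0)
    for _ in range(int(round(t_end / dt))):
        C = [Ci + dt * dCi for Ci, dCi in zip(C, rhs(C, phi_q, l))]
    return C


if __name__ == "__main__":
    start = [0.1, 0.0, 0.0, 0.0]           # slight initial advantage for trail 1
    print(integrate(start, [2.0] * 4))      # weak recruitment, four identical sources
    print(integrate(start, [6.0] * 4))      # strong recruitment, four identical sources
```

For four identical sources, the weak-recruitment run should settle on equal concentrations on all trails, whereas the strong-recruitment run should end with trail 1 carrying most of the pheromone.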
We shall be especially interested in the steady state solutions. Setting the time
derivative to zero and dividing the equations as applied to the trails i and j one obtains
\[
\frac{C_i}{C_j} = \frac{q_i (1 + C_i)^2}{q_j (1 + C_j)^2} \tag{11.5}
\]
and
\[
\sum_{j=1}^{s} (1 + C_j)^2 = \Phi q_i \frac{(1 + C_i)^2}{C_i}. \tag{11.6}
\]
Typically these equations admit multiple solutions {Ci,st}. In what follows we
shall therefore be led to test the stability of these different branches in order to deter-
mine the state that will actually be chosen by the system. Setting Ci = Ci,st + δCi and
linearizing eq.(11.4) with respect to the perturbations δCi, we obtain the following set
of equations
\[
\frac{d\,\delta C_i}{dt} = \sum_k A_{ik}\,\delta C_k \tag{11.7}
\]
where
\[
A_{ik} = -\frac{2 C_i^2 (1 + C_k)}{\Phi q_i (1 + C_i)^2} + \frac{C_i - 1}{1 + C_i}\,\delta^{\mathrm{kr}}_{ik} \tag{11.8}
\]
where we used eqs.(11.5), (11.6). For simplicity, we have dropped the index "st" from
Ci since it is understood that all these coefficients are to be evaluated in the steady
state.
As is well known [31], the characteristic exponents determining the stability of the
stationary state are given by the eigenvalues of the Jacobian matrix {Aik} or,
equivalently, by the characteristic equation
\[
\det\left(A_{ij} - \omega\,\delta^{\mathrm{kr}}_{ij}\right) = 0. \tag{11.9}
\]
The stability condition is Re(ωi) < 0 for all i. Notice that for l = 1 there is only one
steady state solution of eq.(11.4), which is always stable.
11.3 The case of identical sources and trails
We shall first deal with the case where the food sources presented to the colony are
identical and the trails leading to them have the same physical characteristics, qi = q
for all i. Eqs. (11.4) admit then a first type of stationary solution, the homogeneous
solution, in which all sources are exploited in an identical manner
\[
C_i = \frac{\Phi q}{s} = C. \tag{11.10}
\]
To test the stability of this solution we introduce these relations along with qi = q into
eq. (11.8). One checks then easily that the elements of the Jacobian matrix {Aik} in
eq. (11.8) can only take two different values
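\[
A_{ii} = a = -\frac{2\Phi q}{s(\Phi q + s)} + \frac{\Phi q - s}{\Phi q + s}, \qquad
A_{ij} = b = -\frac{2\Phi q}{s(\Phi q + s)} \quad (i \neq j).
\]
The characteristic equation (11.9) takes the form
\[
\{\omega - [a + (s - 1)b]\}\,\{\omega - (a - b)\}^{s-1} = 0.
\]
We obtain a first solution
\[
\omega_1 = a + (s - 1)b = -1 < 0 \tag{11.11}
\]
which is always stable, and a group of s − 1 degenerate ones
\[
\omega_2 = a - b = \frac{\Phi q - s}{\Phi q + s}. \tag{11.12}
\]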
We conclude that the homogeneous solution may lose its stability, the instability
condition being Φq > s. When this happens new stationary solutions must take over,
which are necessarily non-homogeneous.
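The two eigenvalues of the homogeneous state can be checked numerically by building the Jacobian (11.8) and applying it to the corresponding perturbations. The following Python sketch does this for a hypothetical case of s = 4 identical sources with Φq = 6; the helper names `jacobian` and `matvec` are assumptions introduced for illustration.

```python
def jacobian(C, phi_q):
    """Jacobian A_ik of eq. (11.8), valid for l = 2 at a steady state C."""
    s = len(C)
    A = []
    for i in range(s):
        row = []
        for k in range(s):
            a = -2.0 * C[i] ** 2 * (1.0 + C[k]) / (phi_q[i] * (1.0 + C[i]) ** 2)
            if i == k:
                a += (C[i] - 1.0) / (1.0 + C[i])
            row.append(a)
        A.append(row)
    return A


def matvec(A, v):
    """Matrix-vector product A v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


if __name__ == "__main__":
    s, phi_q = 4, 6.0
    C = [phi_q / s] * s                         # homogeneous steady state, eq. (11.10)
    A = jacobian(C, [phi_q] * s)
    print(matvec(A, [1.0, 1.0, 1.0, 1.0]))      # uniform perturbation
    print(matvec(A, [1.0, -1.0, 0.0, 0.0]))     # perturbation favouring trail 1 over trail 2
```

The uniform perturbation is multiplied by −1, as in eq. (11.11), while the perturbation that favours one trail over another is multiplied by (Φq − s)/(Φq + s) = 0.2 > 0, so the homogeneous state is unstable in this example, in line with the condition Φq > s.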
It is easy to show from eqs. (11.4) that the only such solutions are
semi-inhomogeneous ones in which j trails having a concentration C1 are exploited in a
different manner with respect to the other s − j ones having a concentration C2. These
steady state concentrations are given by
\[
C_1^{\pm} = \frac{\Phi q}{2j} \pm \frac{1}{2}\sqrt{\left(\frac{\Phi q}{j}\right)^2 - 4\,\frac{s - j}{j}} \tag{11.13}
\]
with
\[
C_2 = \frac{1}{C_1^{\pm}} = \frac{\Phi q - j C_1^{\pm}}{s - j}, \qquad
j = 1, \ldots, s/2 \ \text{for } s \text{ even or } (s + 1)/2 \ \text{for } s \text{ odd}. \tag{11.14}
\]
Here the superscripts + and − correspond, respectively, to a trail that is more heavily
or less heavily marked by the individuals. These solutions exist as long as
\[
\Phi q \geq 2\sqrt{(s - j)\,j}. \tag{11.15}
\]
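The branches (11.13)-(11.14) and the existence condition (11.15) are simple enough to evaluate directly. The sketch below is one way to do so in Python; the function names `threshold` and `branches`, and the example values s = 4 and Φq = 6, are assumptions for illustration.

```python
import math


def threshold(s, j):
    """Smallest value of Phi*q for which the branches exist, eq. (11.15)."""
    return 2.0 * math.sqrt((s - j) * j)


def branches(phi_q, s, j):
    """Return [(C1+, C2), (C1-, C2)] from eqs. (11.13)-(11.14), or [] if none exist.

    Assumes 1 <= j < s.
    """
    disc = (phi_q / j) ** 2 - 4.0 * (s - j) / j
    if disc < 0.0:
        return []
    out = []
    for sign in (1.0, -1.0):
        c1 = phi_q / (2.0 * j) + sign * 0.5 * math.sqrt(disc)
        out.append((c1, 1.0 / c1))
    return out


if __name__ == "__main__":
    s = 4
    for j in (1, 2):
        print(j, threshold(s, j))
    for c1, c2 in branches(6.0, s, 1):
        # Consistency check: j*C1 + (s - j)*C2 should equal Phi*q.
        print(c1, c2, 1 * c1 + (s - 1) * c2)
```

For s = 4 the thresholds are 2√3 ≈ 3.46 for j = 1 and 4 for j = 2. The last printed column checks that each branch satisfies jC1 + (s − j)C2 = Φq, which follows from the second equality in eq. (11.14).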
To determine stability we consider separately the cases j = 1 and j > 1.
11.3.1 The case j = 1
The elements of the Jacobian matrix {Aik} in eq. (11.7) can now take five distinct
values
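\[
\begin{aligned}
A_{11} &= a_1 = \frac{2(s - 1)}{\Phi q (1 + C_1)} - 1, \\
A_{1j} &= b_1 = -\frac{2 C_1}{\Phi q (1 + C_1)}, \\
A_{ii} &= a_2 = \frac{2\left(C_1 + \frac{s - 2}{C_1}\right)}{\Phi q (1 + C_1)} - 1, \\
A_{i1} &= b_2 = \frac{b_1}{C_1}, \\
A_{ij} &= c = -\frac{2}{\Phi q (1 + C_1) C_1}, \qquad i, j > 1,\ i \neq j.
\end{aligned} \tag{11.16}
\]
The characteristic equation takes the form
\[
(\omega - a_2 + c)^{s-2}\left[\omega^2 - \{a_1 + a_2 + (s - 2)c\}\,\omega + a_1 a_2 + a_1 c (s - 2) - (s - 1) b_1 b_2\right] = 0. \tag{11.17}
\]
A numerical evaluation of the roots of this equation reveals that the high concentration
branch $C_1^+$ is always stable while the low concentration branch $C_1^-$ is always unstable.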
11.4 The case j > 1
There are now six distinct elements of the Jacobian matrix {Aik} in eq. (11.7)
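\[
\begin{aligned}
A_{ii} &= a_1 = -\frac{2 C_1^2}{\Phi q (1 + C_1)} + \frac{C_1 - 1}{C_1 + 1}, \\
A_{ik} &= b_1 = -\frac{2 C_1}{\Phi q (1 + C_1)}, \\
A_{ii'} &= c_1 = -\frac{2 C_1^2}{\Phi q (1 + C_1)}, \\
A_{kk} &= a_2 = -\frac{2}{\Phi q\, C_1 (1 + C_1)} + \frac{1 - C_1}{C_1 + 1}, \\
A_{ki} &= b_2 = -\frac{2}{\Phi q (1 + C_1)}, \\
A_{kk'} &= c_2 = -\frac{2}{\Phi q\, C_1 (1 + C_1)},
\end{aligned} \tag{11.18}
\]
with $1 \le i, i' \le j$, $i \neq i'$ and $j + 1 \le k, k' \le s$, $k \neq k'$.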
After some manipulations of the characteristic determinant one finds that the charac-
teristic equation possesses the following two j and s − j degenerate roots
\[
\begin{aligned}
\omega_1 &= a_1 - c_1 = \frac{C_1 - 1}{C_1 + 1}, \\
\omega_2 &= a_2 - c_2 = -\frac{C_1 - 1}{C_1 + 1}.
\end{aligned} \tag{11.19}
\]
The stability conditions ω < 0 for each of these roots are, clearly, mutually incompat-
ible. We conclude that at least one root of the characteristic equation is positive and
hence that the semi-inhomogeneous solutions with j > 1 are always unstable.
In summary, the homogeneous solution (eq. (11.10)) loses its stability at a threshold
value of the parameter Φq = s. As regards the inhomogeneous solutions, only the
branch $C_1^+$ and the corresponding branch $C_2^-$ of the case j = 1 are stable. These
solutions emerge at critical parameter values given by eq. (11.15). For j = 1 and s > 2,
each of these values corresponds to a limit point bifurcation, one branch of which goes
in the (C, q) diagram through the critical point Φq = s. For s even and j = s/2, there
is a pitchfork bifurcation of unstable branches emerging from Φq = s, otherwise all
other bifurcations are limit point ones originating on the left of this criticality.
As an illustration, Fig. 11.3a depicts the bifurcation diagram in the case of s = 4.
The homogeneous state loses its stability at (Φq)c = 4, which is also the locus of a
pitchfork bifurcation of two unstable branches (branches corresponding to j = 2). The
limit points are located at (Φq)c = 3.5, generating four semi-inhomogeneous solutions,
two of which are stable (the upper and lower branches). In the domain 3.5 < Φq < 4
one therefore has three simultaneously stable solutions whereas for Φq > 4 only two
states are stable.
Figure 11.3: (a) Bifurcation diagram of the steady state solutions of eqs. (11.4) in the case of four sources.
The two critical values (Φq)c marked on the diagram are the pitchfork bifurcation point, where the
homogeneous branch loses its stability, and the limit point bifurcation. (b) State diagram representing
the parameter regions of different modes of exploitation of resources in the case of Φ = 10.
Fig. 11.3b depicts the "state diagram" of the system, showing the range of param-
eter values corresponding to one or multiple solutions. In the upper part of the graph
the homogeneous solution is stable and the semi-inhomogeneous ones do not exist,
which means that one has equal exploitation of multiple resources. In the lower part
the homogeneous solution is unstable and the semi-inhomogeneous ones exist, in other
words, the colony exploits preferentially one resource of the s available; the others be-
ing equally visited but nevertheless overwhelmed by the dominant one. Finally, the
middle part corresponds to coexistence of multiple source and one preferred source
exploitation modes.
11.4.1 Biological relevance
We now propose to exploit the results of Secs. 11.3.1 and 11.4 to give ideas of exper-
iments that could lead to the observation of the patterns predicted and summarized in
Figs. 11.3.
It has been shown [32] that Messor rufitarsis changes its mode of exploitation
with the number of sources and their distribution. Both a single heavily marked trail
and a more diffuse exploitation can be observed when several sources are present.
These results are in qualitative agreement with our predictions, but the situation to
which they refer is not identical to ours. It would therefore be important to undertake
laboratory experiments creating deliberately the idealized conditions stipulated in the
model. A first question to be asked would then be whether there exists a critical number
sc marking the passage from uniform to inhomogeneous exploitation. In terms of the
original parameters and variables, this number is given by (cf. eq. (11.12))
$$ s_c = \frac{\phi\sigma}{\nu k} $$
For the values characteristic of Lasius niger [27] (k = 6, 1/ν = 1500 sec, σ = 1
(for a food concentration in sucrose of 1 M), φ = 0.1 sec⁻¹) this gives sc = 25,
indicating that the inhomogeneous exploitation will prevail up to this number. In terms
of Fig. 11.3a, this value of sc is the intersection of the abscissa Φq = 25 with the lower
curve. It marks the limit between an exploitation of a single preferred source and a
mixed exploitation (exploitation of a single preferred source or an equal exploitation
of multiple sources). The intersection of the same abscissa with the upper curve of the
figure gives (see eq. (11.13))
$$ s_c = \left(\frac{\phi\sigma}{2\nu k}\right)^2 + 1 \approx 157, $$
which is very high. We conclude
that for these values of Φq, the transition between exploitation of a single source and
mixed exploitation is experimentally accessible, whereas the transition between mixed
exploitation and equal exploitation of multiple sources is not. Conversely, for a given
number of sources, the transition is facilitated by decreasing the size of the colony
(parameter φ) or the quality of the sources (parameter σ): For parameter values φ =
0.05 sec⁻¹ and σ = 0.5 (which correspond to a food source concentration in sucrose
of 0.1M), we see that the corresponding transition values are respectively sc = 6 and
sc ≈ 11. This last value, which corresponds to the transition towards the homogeneous
exploitation, becomes now experimentally accessible.
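The two estimates above are simple arithmetic and can be reproduced as follows. This is a small sketch using only the two expressions quoted in the text for the critical number of sources; the function name is illustrative.

```python
def critical_source_numbers(phi, sigma, nu, k):
    """Return (s_c, s_c_upper) from the two expressions quoted in the text.

    s_c       = phi*sigma/(nu*k)              (lower curve of the state diagram)
    s_c_upper = (phi*sigma/(2*nu*k))**2 + 1   (upper curve)
    """
    phi_q = phi * sigma / (nu * k)
    return phi_q, (phi_q / 2.0) ** 2 + 1.0


if __name__ == "__main__":
    nu, k = 1.0 / 1500.0, 6.0  # Lasius niger values quoted in the text
    for phi, sigma in ((0.1, 1.0), (0.05, 0.5)):
        print(phi, sigma, critical_source_numbers(phi, sigma, nu, k))
```

For the first parameter set this gives about 25 and 157; for the second about 6.25 and 10.8, which the text rounds to 6 and 11.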
Let us now comment in some detail on mixed exploitation (the region between (Φq)c′
and (Φq)c). According to Fig. 11.3b, mixed exploitation occurs when Φq lies between the limit
point bifurcation and the point of loss of stability of the homogeneous solution. For these
critical points we have (eqs. (11.10)-(11.14))

loss of stability of the homogeneous solution:
$$ C = \frac{\Phi q}{s}, \qquad (\Phi q)_c = s $$
onset of the semi-inhomogeneous solutions:
$$ C_1 = \frac{\Phi q}{2}, \qquad C_2 = \frac{2}{\Phi q}, \qquad (\Phi q)_{c'} = 2\sqrt{s-1} $$
We observe that as s increases, C decreases, while C1 and C2 remain unchanged.
This implies that the distance between the homogeneous branch and the high-concentration
semi-inhomogeneous branch increases, while the distance to the lower
semi-inhomogeneous branch decreases. On the other hand, as s increases, (Φq)c (the
pitchfork point) increases faster than (Φq)c′ (the limit point). As a result, the range of
values of Φq where the two modes of exploitation coexist increases. Nevertheless, as
pointed out above, the homogeneous branch and the lower semi-inhomogeneous one tend
to become indistinguishable. In other words, the passage between exploitation of one
preferred source and equal exploitation is blurred. In conclusion, in the case of Lasius
niger, homogeneous exploitation should rarely be observed except in small colonies and
with food sources that induce a low trail-laying activity.
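The growth of the coexistence range with s can be made explicit. The sketch below uses the two critical values quoted above, the pitchfork point s and the limit point 2√(s − 1); their difference equals (√(s − 1) − 1)², which increases with s. The function name is illustrative.

```python
import math


def coexistence_window(s):
    """Width of the phi_q interval between the limit point and the pitchfork point."""
    return s - 2.0 * math.sqrt(s - 1)


if __name__ == "__main__":
    for s in (3, 4, 5, 10, 20):
        limit_point = 2.0 * math.sqrt(s - 1)
        print(s, round(limit_point, 3), s, round(coexistence_window(s), 3))
```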
11.5 One of the sources is different
We now turn to the more realistic situation where the food sources and/or the physical
characteristics of the trails leading to them are not identical. To identify the new
features brought about by this change, we limit our analysis to the case where only one of
the trails (say trail 1) is different, all other s − 1 trails being identical: q2 = ... = qs = q,
q1 ≠ q. It then follows straightforwardly from eqs. (11.5), (11.6) that two different
types of pheromone concentrations in the s − 1 identical trails can be envisaged, just as
in Sec. 11.3, eqs. (11.10) and (11.13), (11.14): either all of them are identical, with
common value C2; or j trails (trails 2, ..., j + 1) have concentration C2 and the
remaining s − j − 1 trails (trails j + 2, ..., s) have concentration C3, with C3 = 1/C2.
We hereafter analyze these two cases separately.
11.5.1 The case C2 = ... = Cs
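We obtain after some straightforward manipulations a cubic equation for C1,
$$
\left[1 + \frac{q^2}{(s-1)\,q_1^2}\right] C_1^3
+ \left[2 - \Phi q_1 - \frac{2\Phi q^2}{(s-1)\,q_1} - \frac{2q}{q_1}\right] C_1^2
+ \left[\frac{\Phi^2 q^2}{s-1} + 2\Phi q + s - 2\Phi q_1\right] C_1
- \Phi q_1 = 0
\tag{11.20}
$$
with, for C2,
$$
C_2 = \dots = C_s = \frac{q}{s-1}\left(\Phi - \frac{C_1}{q_1}\right).
\tag{11.21}
$$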
Regarding stability, one sees, not unexpectedly, that the Jacobian matrix has just
five different elements, as in the case j = 1 of Sec. 11.3:
$$
\begin{aligned}
A_{11} &= a_1 = \frac{C_1 - 1}{C_1 + 1} - \frac{2C_1^2}{\Phi q_1 (C_1 + 1)} \\
A_{1k} &= b = -\frac{2(C_2 + 1)\,C_1^2}{\Phi q_1 (C_1 + 1)^2} \\
A_{kk} &= a_2 = \frac{C_2 - 1}{C_2 + 1} - \frac{2C_2^2}{\Phi q (C_2 + 1)} \\
A_{k1} &= c = -\frac{2(C_1 + 1)\,C_2^2}{\Phi q (C_2 + 1)^2} \\
A_{kj} &= d = -\frac{2C_2^2}{\Phi q (C_2 + 1)}, \qquad k, j > 1,\ k \neq j
\end{aligned}
\tag{11.22}
$$
The characteristic equation has the same structure as in eq. (11.17). No general
statement concerning its solutions can be made, and one has to resort to a numerical
evaluation in which the explicit value of C1 given by eq. (11.20) is inserted in the
expressions (11.22) from which the ω’s can be computed. This can be done explicitly
for s = 3 and 4, but for s > 4 integration of the full equations in time is also performed,
as discussed further below. In all cases considered only the solution branch in which
C1 is dominant turns out to be stable.
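The numerical evaluation described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used by the author: the coefficients are those of eq. (11.20), C2 follows from eq. (11.21), and the Jacobian elements are those of eq. (11.22). The root search (sign-change scan plus bisection) and the reduction of the eigenvalue problem are choices made here. For a matrix with the structure of (11.22), vectors that vanish on trail 1 and sum to zero over the other trails give the eigenvalue a2 − d, and the remaining two eigenvalues come from the 2×2 block [[a1, (s − 1)b], [c, a2 + (s − 2)d]]; this may differ in form from eq. (11.17).

```python
import math


def cubic_coefficients(phi, q1, q, s):
    """Coefficients (a3, a2, a1, a0) of eq. (11.20) for C1."""
    a3 = 1.0 + q ** 2 / ((s - 1) * q1 ** 2)
    a2 = 2.0 - phi * q1 - 2.0 * phi * q ** 2 / ((s - 1) * q1) - 2.0 * q / q1
    a1 = phi ** 2 * q ** 2 / (s - 1) + 2.0 * phi * q + s - 2.0 * phi * q1
    a0 = -phi * q1
    return a3, a2, a1, a0


def physical_roots(phi, q1, q, s, n_grid=20000):
    """Real roots C1 in (0, phi*q1), found by a sign-change scan plus bisection."""
    a3, a2, a1, a0 = cubic_coefficients(phi, q1, q, s)

    def poly(x):
        return ((a3 * x + a2) * x + a1) * x + a0

    hi = phi * q1  # C2 >= 0 in eq. (11.21) requires C1 <= phi*q1
    xs = [hi * i / n_grid for i in range(n_grid + 1)]
    roots = []
    for left, right in zip(xs[:-1], xs[1:]):
        if poly(left) * poly(right) < 0.0:
            a, b = left, right
            for _ in range(80):
                mid = 0.5 * (a + b)
                if poly(a) * poly(mid) <= 0.0:
                    b = mid
                else:
                    a = mid
            roots.append(0.5 * (a + b))
    return roots


def stability_exponents(phi, q1, q, s, c1):
    """Real parts of the eigenvalues for the state C2 = ... = Cs (assumes s >= 3)."""
    c2 = q * (phi - c1 / q1) / (s - 1)  # eq. (11.21)
    a1 = (c1 - 1.0) / (c1 + 1.0) - 2.0 * c1 ** 2 / (phi * q1 * (c1 + 1.0))
    b = -2.0 * (c2 + 1.0) * c1 ** 2 / (phi * q1 * (c1 + 1.0) ** 2)
    a2 = (c2 - 1.0) / (c2 + 1.0) - 2.0 * c2 ** 2 / (phi * q * (c2 + 1.0))
    c = -2.0 * (c1 + 1.0) * c2 ** 2 / (phi * q * (c2 + 1.0) ** 2)
    d = -2.0 * c2 ** 2 / (phi * q * (c2 + 1.0))
    # 2x2 block acting on (trail 1, common value of the other trails)
    trace = a1 + a2 + (s - 2) * d
    det = a1 * (a2 + (s - 2) * d) - (s - 1) * b * c
    disc = trace ** 2 - 4.0 * det
    if disc >= 0.0:
        pair = [(trace + math.sqrt(disc)) / 2.0, (trace - math.sqrt(disc)) / 2.0]
    else:
        pair = [trace / 2.0, trace / 2.0]
    return [a2 - d] + pair


if __name__ == "__main__":
    phi, q1, s = 10.0, 1.0, 3
    for eps in (0.5, 1.0, 1.5):
        for c1 in physical_roots(phi, q1, eps * q1, s):
            omegas = stability_exponents(phi, q1, eps * q1, s, c1)
            print(eps, round(c1, 4), "stable" if max(omegas) < 0.0 else "unstable")
```

A branch is reported as stable when all real parts are negative; scanning ε in this way traces out a diagram of the kind shown in Fig. 11.4I.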
Figs. 11.4Ia-d depict, for the parameter value q1 = 1, the bifurcation diagrams of
C1 as a function of the parameter ε = q/q1 for the values s = 3, 5, 10 and 20. We
observe an s-shaped curve but no hysteresis, since only one branch (the upper one)
is stable. As s increases further the s-shape disappears, and one obtains a monotonic
dependence of C1 on ε: this unique branch of solutions is stable for all values of ε, im-
plying that the richer source always takes over. Curiously, under the same conditions,
the value of C1 at a given ε steadily decreases. A closer analysis shows that there is no
optimal value of s in which the richer source both takes over and is visited by a sizable
part of the total population.
Figure 11.4: Bifurcation diagrams of C1 (eqs. (11.20), (11.23), (11.24)) as a function of the parameter
ε = q/q1 for 3 sources (a); 5 sources (b); 10 sources (c); and 20 sources (d). The case C2 = C3 (I) and
the case C2 = 1/C3 (II) are shown in the left and the right panels respectively. Parameter Φq is equal
to 10.
11.5.2 The case C3 = 1/C2
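We now obtain a quartic equation, which is most easily expressed in terms of C2 rather
than C1,
$$
\begin{aligned}
& j\left(j + \frac{q^2}{q_1^2}\right) C_2^4
- \left[\frac{\Phi q^3}{q_1^2} + 2j\,\frac{q}{q_1}\left(1 + \Phi q_1 - \frac{q}{q_1}\right)\right] C_2^3 \\
&\quad + \left[\frac{q^2}{q_1^2}\,s + q^2\Phi^2 + 2j(s - j - 1) + 2\Phi\left(\frac{q^2}{q_1} - \frac{q^3}{q_1^2}\right)\right] C_2^2 \\
&\quad - \left[\frac{\Phi q^3}{q_1^2} + 2(s - j - 1)\,\frac{q}{q_1}\left(1 + \Phi q_1 - \frac{q}{q_1}\right)\right] C_2 \\
&\quad + (s - j - 1)\left(s - j - 1 + \frac{q^2}{q_1^2}\right) = 0
\end{aligned}
\tag{11.23}
$$
with, for C1,
$$
C_1 = q_1\left(\Phi - \frac{j\,C_2}{q} - \frac{s - j - 1}{q\,C_2}\right).
\tag{11.24}
$$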
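Regarding stability, there are now eleven different elements of the Jacobian matrix,
given by:
$$
\begin{aligned}
A_{11} &= a_1 = -\frac{2C_1^2}{\Phi q_1 (C_1 + 1)} + \frac{C_1 - 1}{C_1 + 1} \\
A_{1k} &= b_1 = -\frac{2C_1^2 (C_2 + 1)}{\Phi q_1 (C_1 + 1)^2}, && 2 \le k \le j + 1 \\
A_{1k} &= b_2 = -\frac{2C_1^2 (C_2 + 1)}{\Phi q_1 (C_1 + 1)^2\, C_2}, && j + 2 \le k \le s \\
A_{k1} &= c_1 = -\frac{2C_2^2 (C_1 + 1)}{\Phi q (C_2 + 1)^2}, && 2 \le k \le j + 1 \\
A_{k1} &= c_2 = -\frac{2 (C_1 + 1)}{\Phi q (C_2 + 1)^2}, && j + 2 \le k \le s \\
A_{kk} &= a_2 = -\frac{2C_2^2}{\Phi q (C_2 + 1)} + \frac{C_2 - 1}{C_2 + 1}, && 2 \le k \le j + 1 \\
A_{kk} &= a_3 = -\frac{2}{\Phi q\, C_2 (C_2 + 1)} + \frac{1 - C_2}{C_2 + 1}, && j + 2 \le k \le s \\
A_{kl} &= d_1 = -\frac{2C_2^2}{\Phi q (C_2 + 1)}, && 2 \le k, l \le j + 1,\ k \neq l \\
A_{kl} &= d_2 = -\frac{2}{\Phi q\, C_2 (C_2 + 1)}, && j + 2 \le k, l \le s,\ k \neq l \\
A_{kl} &= e_1 = -\frac{2C_2}{\Phi q (C_2 + 1)}, && 2 \le k \le j + 1,\ j + 2 \le l \le s \\
A_{kl} &= e_2 = -\frac{2}{\Phi q (C_2 + 1)}, && j + 2 \le k \le s,\ 2 \le l \le j + 1
\end{aligned}
\tag{11.25}
$$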
As in Sec. 11.3, one now has to distinguish between j > 1 and j = 1. In the
first case, by performing the same type of transformation as in Sec. 11.4, one identifies
two roots of opposite sign of that characteristic equation, entailing that this type of
solution is always unstable. We may therefore restrict our attention to the case j =
1. In this situation, we have zero, two, or four physically acceptable solutions (real
positive). Furthermore one can identify, by similar manipulations as above, one root
of the characteristic equation equal to
$$ \omega = a_3 - d_2 = \frac{1 - C_2}{1 + C_2}. $$
This implies that the branches with C2 < 1 and hence with large C1 are always unsta-
ble, but does not suffice to guarantee that the branches with C2 > 1 are stable. One
must therefore resort again to numerical evaluations, showing that of all the solution
branches available only the one with the smallest C1 (the greatest C2) is stable.
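The search for physically acceptable solutions of the quartic can be sketched in the same spirit. The code below is an illustration, not the author's implementation: the coefficients are those of eq. (11.23) written with ε = q/q1, C1 follows from eq. (11.24), and a solution is kept when both concentrations are positive. The logarithmic scan with bisection and the parameter values (Φ = 10, q1 = 1, ε = 3, s = 5, j = 1) are choices made here for the example.

```python
def quartic_coefficients(phi, q1, q, s, j=1):
    """Coefficients (c4, c3, c2, c1, c0) of eq. (11.23) for C2, with eps = q/q1."""
    eps = q / q1
    m = s - j - 1  # number of trails carrying C3 = 1/C2
    common = 1.0 + phi * q1 - eps
    return (
        j * (j + eps ** 2),
        -(phi * q * eps ** 2 + 2.0 * j * eps * common),
        eps ** 2 * s + (phi * q) ** 2 + 2.0 * j * m + 2.0 * phi * q * eps * (1.0 - eps),
        -(phi * q * eps ** 2 + 2.0 * m * eps * common),
        m * (m + eps ** 2),
    )


def poly(coeffs, y):
    """Evaluate a polynomial given highest-degree-first coefficients (Horner)."""
    value = 0.0
    for c in coeffs:
        value = value * y + c
    return value


def physical_solutions(phi, q1, q, s, j=1, n_grid=4000):
    """Return the (C1, C2) pairs with C1 > 0 and C2 > 0 solving (11.23), (11.24)."""
    coeffs = quartic_coefficients(phi, q1, q, s, j)
    m = s - j - 1
    lo, hi = 1e-4, phi * q / j  # C1 > 0 requires j*C2 < phi*q
    ys = [lo * (hi / lo) ** (i / n_grid) for i in range(n_grid + 1)]
    solutions = []
    for left, right in zip(ys[:-1], ys[1:]):
        if poly(coeffs, left) * poly(coeffs, right) < 0.0:
            a, b = left, right
            for _ in range(80):
                mid = 0.5 * (a + b)
                if poly(coeffs, a) * poly(coeffs, mid) <= 0.0:
                    b = mid
                else:
                    a = mid
            c2 = 0.5 * (a + b)
            c1 = q1 * (phi - j * c2 / q - m / (q * c2))  # eq. (11.24)
            if c1 > 0.0:
                solutions.append((c1, c2))
    return solutions


if __name__ == "__main__":
    for c1, c2 in physical_solutions(10.0, 1.0, 3.0, 5):
        omega = (1.0 - c2) / (1.0 + c2)  # the root a3 - d2 quoted in the text
        print(round(c1, 4), round(c2, 4), round(omega, 4))
```

The sign of ω = (1 − C2)/(1 + C2) only rules out the branches with C2 < 1; as the text notes, stability of the branches with C2 > 1 still has to be checked numerically.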
Figs. 11.4IIa-d depict, for the parameter value q1 = 1, four representative bifur-
cation diagrams illustrating this behavior corresponding respectively to s = 3, 5, 10
and 20. As s increases further the bifurcation diagram reduces to a single limit point
bifurcation starting at ε > 1 with the upper branch unstable and the lower one stable.
Figure 11.5: State diagram representing the parameter regions of different modes of exploitation of the
resources in the case of Φ = 5 (a); Φ = 10 (b); Φ = 15 (c); Φ = 20 (d). The region to the right
of the curves corresponds to the exploitation of multiple sources and the region to the left to the
exploitation of a preferred source.
Fig. 11.5 shows the "state diagram" of the system with respect to the parameters
s and ε. The four curves, drawn for four different values of Φ, separate the region of
one solution of the cubic equation (eq. (11.20)) and no solution of the quartic equation
(eq. (11.23)) (left part of the curves) from the region of one solution of the cubic
equation and physically acceptable solutions of the quartic equation (right part of the curves).
In more physical terms, these regions correspond respectively to exploitation of a pre-
ferred source (a single stable solution) and to exploitation of multiple resources (two
coexisting stable solutions). We see that as Φ increases, the region of exploitation of a
preferred source shifts towards increasing values of ε.
Figure 11.6: Time evolution of pheromone concentrations as obtained from integration of eqs. (11.4)
in the case of s = 5. The left graph corresponds to the case C2 = C3 (ε = 0.5) and the right graph
to the case C3 = 1/C2 (ε = 3).
As mentioned above, to confirm the stability of the high and low C1 branches of
Figs. 11.4(I)-(II) in the case of several sources, integration of the full set of eqs. (11.4)
in time is also necessary. Fig. 11.6 illustrates two typical outcomes of such an
integration for s = 5, carried out using a second-order Runge-Kutta method. One starts
with initial conditions in which the values of C1, C2 and C3 are close. In the left-hand
figure, the parameter ε is such that the solution C2 = C3 is favored. We see that after
a short lapse of time C1 takes over and tends to a high value, whereas C2 and C3 both
tend to the same low value after a slight overshoot at intermediate times. In contrast,
in the right-hand figure the parameter ε is such that C3 = 1/C2 is favored. We see that
C2 now takes over whereas C1 decreases toward a very small value, even smaller than the
value attained by C3 after an overshoot at intermediate times. These results show that
selection is very sharp, in the sense that there is no induction period during which the
choice of the trails is undecided.
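A time integration of this kind can be sketched in a few lines. The code below is a minimal illustration under explicit assumptions. Eqs. (11.4) are taken in the dimensionless form dC_i/dτ = Φ q_i (1 + C_i)² / Σ_j (1 + C_j)² − C_i, which is the form consistent with eqs. (11.21), (11.22) and (11.24). The midpoint variant of the second-order Runge-Kutta method, the step size, and the initial conditions (all trails close, trail 2 slightly ahead) are choices made here, and Φ = 10, q1 = 1 are taken from the values used for Fig. 11.4.

```python
def rhs(c, phi, q):
    """Right-hand side of the assumed dimensionless form of eqs. (11.4)."""
    total = sum((1.0 + ci) ** 2 for ci in c)
    return [phi * qi * (1.0 + ci) ** 2 / total - ci for ci, qi in zip(c, q)]


def rk2_step(c, dt, phi, q):
    """One midpoint (second-order Runge-Kutta) step."""
    k1 = rhs(c, phi, q)
    mid = [ci + 0.5 * dt * ki for ci, ki in zip(c, k1)]
    k2 = rhs(mid, phi, q)
    return [ci + dt * ki for ci, ki in zip(c, k2)]


def integrate(c0, phi, q, t_end=100.0, dt=0.01):
    """Integrate from c0 up to dimensionless time t_end and return the final state."""
    c = list(c0)
    for _ in range(int(round(t_end / dt))):
        c = rk2_step(c, dt, phi, q)
    return c


def run_case(eps, s=5, phi=10.0, q1=1.0):
    """Source 1 has deposition q1, the other s - 1 sources have q = eps*q1."""
    q = [q1] + [eps * q1] * (s - 1)
    c0 = [0.5, 0.55] + [0.5] * (s - 2)  # hypothetical initial condition
    return integrate(c0, phi, q)


if __name__ == "__main__":
    for eps in (0.5, 3.0):
        print(eps, [round(ci, 4) for ci in run_case(eps)])
```

With ε = 0.5 trail 1 takes over and the other trails settle on a common low value; with ε = 3 trail 2 takes over, the remaining identical trails settle at 1/C2, and C1 ends up below them, as described for Fig. 11.6.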
In the above analysis we have taken the parameter q1 = 1, a value corresponding,
in the identical sources case, to the exploitation of a single preferred source (inhomo-
geneous solutions: eqs. (11.13), (11.14)). Actually, eqs. (11.20) and (11.23) display
not only the ratio q/q1 but also q1 itself. One may therefore wonder how the situation
would change by shifting q1 toward the range of values corresponding, in the identi-
cal sources case, to a mixed exploitation, i.e. exploitation of a single source (semi-
inhomogeneous solutions, eq. (11.13), (11.14)) and exploitation of multiple sources
(homogeneous solution, eq. (11.10)).
Fig. 11.7a shows the bifurcation diagram in the case of three sources under these
conditions. We have taken the parameter q1 = 0.291 and varied q in the vicinity of
Figure 11.7: (a) Bifurcation diagram of C1 (eqs. (11.20), (11.23), (11.24)) as a function of parameter
ε = q/q1 for s = 3 and q1 = 0.291. (b) As in (a) but for s = 5 and q1 = 0.167 (corresponding to the
species Lasius niger).
this value. One observes a situation similar to the identical sources case: a coexistence
of three stable solutions, although the range is now more limited than in Sec. 11.3,
Fig. 11.3.
As q1 decreases further the situation again changes. To be specific, consider the
value associated to Lasius niger in the presence of a source of 1M in sucrose. The nu-
merical solution of eqs. (11.20) and (11.23) complemented by stability analysis leads
for s = 5 to a bifurcation diagram depicted in Fig. 11.7b. We observe that the high C1
branch continues in a stable way well beyond q = q1. For the critical value of ε corre-
sponding to the loss of stability, one observes a subcritical bifurcation of an unstable
solution which is stabilized through a limit point bifurcation. In the region between the
limit point and the instability point of the high C1 branch one has two simultaneously
stable state solutions. Notice that the limit point is shifted toward higher values of ε as
s increases.
So far the bifurcation diagrams of Figs. 11.4-11.7 have been obtained by keeping
Φ at a fixed value Φ = 10. In Lasius niger, where ν is equal to 1/1500 sec⁻¹, this
corresponds to a physical flux φ = Φν = 0.0067 sec⁻¹ (one ant per 150 sec), a very low
flux indeed. Increasing this value to about 0.1 sec⁻¹ leads to a modification of Fig. 11.7b,
resembling the situation described in Figs. 11.4I-II.
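The conversion between the dimensionless and the physical flux can be written out explicitly. The relations Φ = φ/ν and q1 = σ/k used below are inferred from the numbers quoted in the text (φ = Φν, and q1 = 0.167 ≈ 1/6 for k = 6 and σ = 1) rather than stated in this section, so they should be read as assumptions:

```latex
\Phi = \frac{\phi}{\nu}, \qquad q_1 = \frac{\sigma}{k}
\quad\Longrightarrow\quad
\phi = \Phi\,\nu = \frac{10}{1500\ \mathrm{s}} \approx 0.0067\ \mathrm{s^{-1}},
\qquad \Phi q_1 = \frac{10}{6} \approx 1.67 ,
\\[4pt]
\phi = 0.1\ \mathrm{s^{-1}}:
\qquad \Phi = 0.1 \times 1500 = 150,
\qquad \Phi q_1 = \frac{150}{6} = 25 .
```

The second line recovers the value Φq = 25 used for Lasius niger in Sec. 11.4.1.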
11.6 Biological relevance
One of the predictions coming out of the analysis of Sec. 11.5.1 and 11.5.2 is that
a high C1 branch can remain stable even for parameter values for which source 1 is
poorer and that a low C1 branch can remain stable even for parameter values for which
source 1 is richer. There are, however, limits to this inertia, defined by critical values
of s and Φq (Figs. 11.4I-II and Fig. 11.7a). This prediction should also be amenable
to experimental testing. Notice, however, that it concerns only the final state of the
system. Before reaching this state, a variety of transient exploitation patterns may be
possible. They could be observed under suitable initial conditions corresponding to
various distributions of the ants in the trails.
Let us now focus on the biologically relevant case where one source is richer than
the others. The prediction that for the parameter values of Lasius niger there exist dif-
ferent modes of exploitation of the sources present as the flux of individuals is varied,
is directly amenable to experimental testing. We have seen (Fig. 11.7b) that for small
flux values the colony has no choice but to exploit preferentially the richest source. To
show this experimentally it would suffice to present to a small colony a few sources
poor in sucrose along with a rich one. The prediction is, then, that, whatever the time
sequence in which the sources are introduced, the rich source will be overwhelmingly
exploited.
On the other hand, we have seen that if the flux is increased different solutions
can be reached. The colony has the choice between a preferred exploitation of the
rich or of one of the poor sources. To observe this, we suggest introducing a delay
in the discovery of some of the sources. Specifically, first let the colony exploit for
some time (typically some minutes) a few poor sources. We next introduce the rich
source. The prediction is, then, that one of the poor sources will still be exploited in
a preferential manner. If one further introduces a few poor sources, one will observe
after a transient period, in view of the comments at the end of Sec. 11.5.1 on the role
of the parameter s, switching to the other stable mode, the preferential exploitation of
the rich source. Notice that in this latter experiment the time sequence in which the
sources are introduced is crucial, contrary to the previous case.
11.7 Conclusions
In this work a mathematical model of food recruitment in the presence of an arbitrary
number of sources, applicable to trail laying ants, was developed and analyzed in de-
tail. Two cases were considered: all the sources are identical; and one of the sources is
different. In both instances it has been possible to cast the problem in terms of two key
parameters, the rate of pheromone deposition on the trail and the number of sources.
We have identified the role of these parameters on the global behavior, particularly in
connection with the possibility of different modes of exploitation of resources by the
colony.
Another important parameter turned out to be the flux Φ, providing a measure
of the size of the colony. In actual fact, as pointed out in the introduction φ should
be allowed to be time dependent, owing to the existence of a time lag between the
discovery of the source and the recruitment at the nest. But if one limits oneself to the
steady state regime one is entitled to replace Φ by a constant value, the only difference
being a slight modification of the transient behavior prior to the establishment of the
steady state (see Introduction). Among the other parameters present, the pheromone
disappearance rate ν was shown to play a "trivial role", serving merely to normalize the
time scale. The model can be applied to all trail laying species, provided the parameter
values are appropriately adapted. As an example we have illustrated our predictions
for the specific set of parameters of the species Lasius niger.
In the case where all sources are equal, the most unexpected result was undoubt-
edly the possibility of competition between trails, leading to a semi-inhomogeneous
distribution of pheromone between them. Interestingly, it turned out that under the
assumptions of the model, at most two groups of sources could be exploited in a dif-
ferent manner, one consisting of a single preferred source and a second consisting of
all other sources exploited on equal footing. These results show that the colony may
focus its activity preferentially on one particular source rather than on another, even if
the sources are identical. This type of collective response may partially explain some
aspects of the complex foraging patterns of species present in large colonies and being
individually good trail layers, such as Solenopsis, Pheidole, Messor, ... [11,33].
Confronted with a multiplicity of choices, these species have the capacity to focus on one
activity.
Such strategies are expected to be selected for in species which need a cooperation
between foragers or possess different specialized individuals such as minor and major
as in the case of Pheidole (see e. g. [17]).
Throughout our analysis, we have assumed that the food sources are not exhausted
during one recruitment. Some sources in natural conditions, such as aphid colonies
or Lycaenid caterpillars [11], are close to this situation. Lycaenid caterpillars provide
food to ants in return for protection from predators [34]. It is also known that Lycaenid
butterflies prefer to oviposit on bushes already containing caterpillars and their
attendant ants. Our model suggests that a local ant colony may select, with a certain
probability, one of the aphid colonies or bushes (that are all identical) and that this
probability increases with the population of aphids or caterpillars. As a consequence,
small colonies may remain ignored and exposed to predators. Moreover, the productivity
of ant-attractive or ant-nutritive substances may increase with the number of
insects, as has been shown in Lycaenid caterpillars [35]. Such situations, where
the productivity of the source depends on the number of ants, introduce a new positive
feedback, which accentuates the phenomenon.
Our result provides also a partial justification of the idea advanced in the literature
that the use of a limited number of preferred paths rather than of several diffuse ones
is likely to have an adaptive significance. First, a strong heavily marked pheromone
trail is easier to follow and hence fewer ants lose their way [18]. Moreover, there is
likely to be safety in numbers: competition from other ants may influence the effective
profitability of a food source, hence defense against potentially interloping foragers
from other colonies may prove necessary at the food source [36,37]. A single dense traffic
column is also probably better able to defend itself against predators than a sparse
column, since isolated ants can easily fall prey to predators. The defense of ants on a
trail or the defense of a food source may thus be viewed as a cooperative phenomenon,
and a semi-inhomogeneous exploitation of identical sources may facilitate such a
cooperative defense (the probability of winning increases with the number of individuals
involved).
Similarly to defense, attack may also be viewed as an example of inhomogeneous
exploitation of resources induced by cooperativity. An example is provided by army
ants [29], which lay trails extensively and use trail recruitment to food sources [38,39].
They are present in large colonies and are capable of attacking and immobilizing large prey,
thereby focusing collectively on one activity. Another situation in which focalization
on one activity is observed is the case of species capable of attacking colonies of social
insects [11]: one can argue that their capacity to attack efficiently is due to their strong
cooperativity.
Intuitively, one expects a strong correlation between the ability to change defense
strategy and the environmental parameters. Our results corroborate this idea by
showing that as the number of sources increases, the colony switches to a homogeneous
exploitation (all the sources are exploited identically), this mode being observed
also for a small colony. In other words, the colony changes its exploitation strategy as
an environmental or a social parameter changes; Hahn and Maschwitz [32] indeed showed
that Messor rufitarsis changes its mode of exploitation when the number of sources
increases. When the colony is small or the number of sources present is large, dispersion
is a better way to defend and to avoid conflicts: defense is itself a cooperative process,
and a small colony cannot muster sufficient cooperativity to defend itself. Our model
accounts for these different situations without invoking changes in the individual
behavior. We can then expect that natural selection will favor species able to defend and
thus enhance asymmetric exploitation.
In the situation where one source is different, strong selection rules were also
shown to hold. For s not very large, either the concentrations of pheromone in all s − 1
trails other than the one leading to the differentiated food source (say 1) are identical,
in which case the stable state corresponds to a high C1 concentration; or there is one
trail among those leading to the s − 1 identical sources that has different pheromone
concentration, in which case, the stable state corresponds to a low C1 concentration.
This holds true both if the differentiated source is richer or poorer than the other s − 1
sources. The origin of this behavior lies in the occurrence of an s-shaped bifurcation
diagram, as a result of which the stable solution branch continues until the closest limit
point bifurcation.
On the other hand, beyond some value of s the rich source always takes over: The
model predicts then that the increase of the number of poor sources will facilitate the
exploitation of one richer source. This result allows us to understand the situations
where there is a large choice between different sources and yet, the colony focuses on
one source [40].
This preference results from the amplification of the trail: a small difference be-
tween the parameters of pheromone deposition q and q1, strongly correlated in our
model to the richness of the sources, is sufficient to focus the activity. We notice that
other characteristics affecting the pheromone deposition (decreasing the number of
trail-laying individuals returning to the nest, returning and discovering time, ...) can
lead to the same result.
We saw in this paper a typical example of competition among trails induced by
amplifying interactions, but many other phenomena found in insect societies, such as
nest-moving or building behaviour [41], follow a similar logic. Furthermore,
spatial aggregation can be viewed as the result of a competition between different,
possibly identical attractive sites. The selection of one site depends on the size of the
population, the number of sites present and their characteristics. In particular, if the
size of the colony is large, one site will be selected. On the contrary if it is small, the
colony will not have the ability to be cooperative and will be dispersed, no single site
being preferred.
A major theoretical problem that remains open is to account more properly for
the variability of the system. This variability may be of internal origin, the discovery
of the sources being random rather than deterministic. External factors constitute an
additional source of variability in space (complex terrain, ...) as well as in time
(temperature or humidity variabilities, predation, ...). One way to account for these effects
is to view the process as a probabilistic game and assign transition probabilities as-
sociated with the choice of each trail. This could be implemented numerically by a
Monte Carlo type of simulation. Another possibility would be to augment eqs. (11.2)
or (11.4) by noise terms. In the white noise limit this would lead to a Fokker-Planck
equation for the probability density [40] whose stationary solution would provide the
relative probabilities of occupation of the various trails.
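A Monte Carlo treatment of this kind could look as follows. This is a hypothetical sketch, not a method taken from the text: it assumes that each ant chooses trail i with probability (1 + C_i)² / Σ_j (1 + C_j)², which is the choice rule implied by the deterministic equations, that one ant leaves the nest every 1/Φ units of dimensionless time, and that it deposits q_i on the chosen trail while the pheromone decays exponentially between departures. The function names, the discretisation and the seed are illustrative choices.

```python
import math
import random


def choice_probabilities(c):
    """Probability of choosing each trail, proportional to (1 + C_i)^2 (assumed rule)."""
    weights = [(1.0 + ci) ** 2 for ci in c]
    total = sum(weights)
    return [w / total for w in weights]


def simulate(phi, q, n_ants, seed=0):
    """Stochastic trail choice with one ant departing every 1/phi time units.

    Between departures the pheromone decays by exp(-1/phi); the departing ant
    then picks trail i at random and deposits q[i] on it.
    Returns the final concentrations and the number of visits per trail.
    """
    rng = random.Random(seed)
    s = len(q)
    c = [0.0] * s
    visits = [0] * s
    decay = math.exp(-1.0 / phi)
    for _ in range(n_ants):
        c = [ci * decay for ci in c]
        i = rng.choices(range(s), weights=choice_probabilities(c))[0]
        c[i] += q[i]
        visits[i] += 1
    return c, visits


if __name__ == "__main__":
    # Four identical sources with phi*q = 10; repeat with different seeds to
    # estimate how often each trail ends up being the preferred one.
    for seed in range(5):
        c, visits = simulate(10.0, [1.0] * 4, 2000, seed=seed)
        print(seed, [round(ci, 2) for ci in c], visits)
```

On average this scheme reproduces the deposition and evaporation terms of the deterministic model, while repeated runs with different seeds give an empirical distribution over which trail is selected.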
As noticed above, the pheromone disappearance rate ν plays a trivial role in the
present model. This may be an oversimplification since, after all, the nature of the
substrate should definitely influence the overall process. Probabilistic analysis and
Monte Carlo simulations should, again, provide the adequate framework for tackling
this problem.
In natural conditions, the parameters pertaining to each of the food sources are,
typically, different. On the other hand, our simplified model (all sources are identical
or one is different) is easy to test in the laboratory and allows one to sort out the underlying
mechanisms and analyze the influence of the parameters. One may reasonably expect
that in the real world (i. e. when the parameters associated to the different sources are
different) the foraging patterns, though more complex, will be built according to the
same mechanisms.
Finally, throughout this study it has been assumed tacitly that increased pheromone
concentration has, invariably, an attractive effect and that the trail laying is constant for
each source. There is evidence that many other effects influence the trail laying dur-
ing recruitment, such as crowding around the food source [15,28], suppression of trail
laying after a certain number of foraging trips [24] or saturation effects [42].
Nevertheless, these negative feedback mechanisms mainly affect the recruitment in its late
stage. They slow down the amplifying action of the pheromone and their main contribution
is to reduce the range of parameters or the conditions for which heterogeneities
or asymmetrical exploitations are observed. Incorporating these possibilities would
make the dynamics more complex than those found in the present work, and
would undoubtedly be worth attempting in the future.
Acknowledgments.
We thank the referee for his thorough analysis of the manuscript
and his insightful comments. This research is supported in part by the Interuniversity
Attraction Pole Program of the Belgian Federal Office of Scientific, Technical and
Cultural Affairs.
11.8 Bibliography
[1] G. Nicolis, and I. Prigogine, Self-organization in nonequilibrium systems, Wiley,
New York (1977).
[2] E. Bonabeau, G. Theraulaz, J. L. Deneubourg, S. Aron, and S. Camazine,
Self-organization in social insects, TREE 12, 188-193 (1997).
[3] E. Bonabeau, J. L. Deneubourg, and G. Theraulaz, Within-Brood Competition
and the Optimal Partitioning of Parental Investment, The American
Naturalist 152(3): 419-427 (1998).
[4] S. Camazine, J. L. Deneubourg, N. R. Franks, J. Sneyd, E. Bonabeau, and
G. Theraulaz, Self-organized Biological Superstructures, Princeton University
Press, in press.
[5] J. L. Deneubourg, and S. Goss, Collective patterns and decision making,
Ethol. Ecol. Evol. 1, 295-311 (1989).
[6] C. Detrain, J. L. Deneubourg, and J. M. Pasteels, in press, Information Processing
in Social Insects, Birkhäuser Verlag.
[7] G. Theraulaz, and F. Spitz, Auto-organisation et comportements, Hermès, Paris
(1997).
[8] S. Camazine, and J. Sneyd, A model of collective nectar source selection by
honey bees. Self-organization through simple rules, J. Theor. Biol. 149, 547-571
(1991).
[9] T. D. Seeley, The wisdom of the hive, Harvard University Press, Cambridge
(1995).
[10] T. D. Seeley, S. Camazine, and J. Sneyd, Collective decision making in honey
bees : how colonies choose among nectar sources, Behav. Ecol. Sociobiol 28,
277-290 (1991).
[11] B. Hölldobler, and E. O. Wilson, The ants, Springer Verlag, Berlin (1991).
[12] J. F. A. Robson, and S. K. Traniello, Trail and territorial communication in social
insects, in: W. J. Bell and R. Cardé (eds), The Chemical Ecology of Insects vol
II, Chapman & Hall (1995).
[13] J. H. Sudd, Communication and recruitment in Monomorium pharaonis,
Anim. Behav. 5, 104-109 (1957).
[14] E. O. Wilson, Chemical communication among workers of the fire-ant Solenop-
sis saevissima (Fr. Smith). 1. The organization of mass foraging. 2. An informa-
tion analysis of the odour trail. 3. The experimental induction of social response,
Anim. Behav. 10, 134-164 (1962).
[15] E. O. Wilson, The insect societies. Harvard University Press, Cambridge, MA
(1971).
[16] R. Beckers, J. L. Deneubourg, and S. Goss, Trails and U-turns in the selection
of a path by the ant Lasius niger, Journal of Theoretical Biology 159, 397-415
(1992).
[17] C. Detrain, and J. L. Deneubourg, Scavenging by Pheidole pallidula: a key
for understanding decision-making systems in ants, Anim. Behav. 53, 537-547
(1997).
[18] J. M. Pasteels, J. L. Deneubourg, and S. Goss, Self-organisation mechanisms
in ant societies (I): the example of food recruitment, in: J. M. Pasteels and
J. L. Deneubourg (eds), From individual characteristics to collective organisation
in social insects, Birkhäuser, Basel, pp 155-175, Experientia Supplementum
54 (1987).
[19] L. Edelstein-Keshet, J. Watmough, and G. B. Ermentrout, Trail following in so-
cial insects: Individual Properties determine population behaviour, Behavioral
Ecology and Sociobiology 36(2): 119-133 (1995).
[20] T. R. Stickland, N. F. Britton, and N. R. Franks, Complex trails and simple algo-
rithms in ant foraging, Proc. Roy. Soc. Lond. B 260, 53-58 (1995).
[21] T. R. Stickland, C. Tofts, and N. R. Franks, A path choice algorithm for ants,
Naturwissenschaften 79, 567-572 (1992).
[22] T. R. Stickland, C. Tofts, and N. R. Franks, Algorithms for ant foraging,
Naturwissenschaften 80, 427-430 (1993).
[23] J. Watmough, and L. Edelstein-Keshet, Modelling the Formation of trail net-
works by foraging ants, J. Theor. Biol. 176, 357-371 (1995).
[24] R. Beckers, J. L. Deneubourg, and S. Goss, Trail laying behaviour during food
recruitment in the ant Lasius niger (L.) , Insectes Sociaux 39, 59-72 (1992).
[25] J. L. Deneubourg, S. Aron, S. Goss, and J. M. Pasteels, The self-organizing ex-
ploratory pattern of the Argentine ant , J. Insect. Behav. 3, 159-168 (1990).
[26] S. Goss, S. Aron, J. L. Deneubourg, and J. M. Pasteels, Self-organised short cuts
in the Argentine ant, Naturwissenschaften 76, 579-581 (1989).
[27] R. Beckers, J. L. Deneubourg, and S. Goss, Modulation of trail laying in the ant
Lasius niger (Hymenoptera: Formicidae) and its role in the collective selection
of a food source, Journal of Insect Behavior 6, 751-759 (1993).
[28] J. C. Verhaeghe, and J. L. Deneubourg, Experimental study and modelling of
food recruitment in the ant Tetramorium impurum (Hym. Form.), Insectes Sociaux
30, 347-360 (1983).
[29] N. R. Franks, Teams in social insects: group retrieval of prey by army ants
(Eciton burchelli, Hymenoptera: Formicidae), Behavioral Ecology and Sociobiology
18(6): 425-429 (1986).
[30] S. W. Rissing, and J. Wheeler, Foraging responses of Veromessor pergandei to
changes in seed production, Pan-Pacific Entomologist 52, 63-72 (1976).
[31] G. Nicolis, Introduction to non-linear science, Cambridge University Press, Cam-
bridge (1995).
[32] M. Hahn, and U. Maschwitz, Foraging strategies and recruitment behaviour in
the European harvester ant Messor rufitarsis, Oecologia 68(1): 45-51 (1985).
[33] R. Beckers, S. Goss, J. L. Deneubourg, and J. M. Pasteels, Colony size, commu-
nication and ant foraging strategy, Psyche 96, 239-256 (1990).
[34] N. E. Pierce, R. L. Kitching, R. C. Buckley, M. F. J. Taylor, and K. F. Benbow,
The costs and benefits of cooperation between the Australian lycaenid butterfly,
Jalmenus evagoras, and its attendant ants, Behavioral Ecology and Sociobiology
21, 237-248 (1987).
[35] O. Leimar, and A. H. Axen, Strategic behaviour in an interspecific mutualism: in-
teraction between lycaenid larvae and ants, Anim. Behav. 46, 1172-1182 (1993).
[36] J. H. Hunt, Foraging and morphology in ants. The role of vertebrate predators as
agents of natural selection, Social insects in the tropics, Ed. Paris, vol. 2 (1983).
[37] N. R. Franks, and L. W. Partridge, Lanchester battles and the evolution of combat
in ants, Anim. Behav. 45(1): 197-199 (1993).
[38] R. Chadab, and C. W. Rettenmeyer, Mass recruitment by army ants, Science 188,
1124-1125 (1974).
[39] N. R. Franks, N. Gomez, S. Goss, and J. L. Deneubourg, The blind leading
the blind in army ant raid patterns: testing a model of self-organization
(Hymenoptera: Formicidae), J. Insect Behav. 4, 583-607 (1991).
[40] N. G. Van Kampen, Stochastic processes in physics and chemistry, North Hol-
land, Amsterdam (1981).
[41] P. Rasse, Etude sur la régulation de la taille et sur la régulation du nid souterrain
de la fourmi Lasius niger, PhD dissertation, Université Libre de Bruxelles (1999).
[42] S. Aron, J. L. Deneubourg, S. Goss, and J. M. Pasteels, Functional self-
organisation illustrated by inter-nest traffic in the argentine ant Iridomyrmex
humilis, in Biological Motion, eds. W. Alt and G. Hoffman, Lecture Notes in
Biomathematics, Springer Verlag 533-547 (1989).
[43] J. L. Deneubourg, D. Fresneau, J. S. Goss, P. Lachaud, and J. M. Pasteels, A
simple model to explain the organisation of individual foraging in Neoponera
apicalis, in Chemistry and Biology of Social Insects, Eds. J. Eder & H. Rembold,
Verlag Peperny, Munich, 527-528 (1987).
[44] S. Goss, and J. L. Deneubourg, The self-organising clock pattern of Messor per-
gandei , Social Insects 36, 339-346 (1989).
[45] H. L. Vasconcelos, Foraging activity of an Amazonian leaf-cutting ant:
responses to changes in the availability of woody plants and to previous plant
damage, Oecologia 112, 370-378 (1997).
Chapter 12
Dynamics of Nest Excavation and Nest
Size Regulation of Lasius niger
(Hymenoptera: Formicidae)
P. H. Rasse¹* and J. L. Deneubourg¹
¹Center for Nonlinear Phenomena and Complex Systems, Université Libre de Bruxelles,
Campus Plaine, CP231, 1050 Brussels, Belgium
*Author to whom correspondence should be addressed: prasse@ulb.ac.be
Abstract
The adaptation of nest size to its population is one of the most common processes, but
little is known about the dynamics of nest building and enlargement in a social context.
Furthermore, the mechanisms involved remain entirely unknown. We present here the
first results on such dynamics in the context of nest excavation by Lasius niger. Using
an artificial but standardized method, we find a strong positive correlation between
the number of ants and both the final nest volume and the digging rate. Both grow almost
proportionally to the population. When the number of individuals in a nest is artificially
increased (even slightly), the nest dimension is systematically adjusted in the same way
as during the initial excavation. In this process, digging acts as a negative feedback that
controls nest enlargement. Experiments revealed that this negative control is directly due
to the volume of the nest as well as to a physiological or behavioral modification of the
ants after digging. Finally, an amplification of activity was observed during the
enlargement phase, suggesting the possible involvement of self-organized processes in the
volume control mechanism.
KEY WORDS: ants, Lasius niger, nest, digging, size regulation, dynamics.
12.1 Introduction
Among ants, nests are produced by the majority of species and are highly variable in
form and size [1,2]. Nonetheless, within a particular species and context (soil condi-
tions, for example), the relation between nest size and population is relatively constant.
This has been verified in epigeous ant nests [3,4,5] as well as in other groups of social
insects such as termites [6,7] and bees [8]. Very little is known about the regulation
of nest size except for simplified situations [1,9]. Even less is known about subter-
ranean nest structures. Nevertheless, the study of digging presents the advantage that
the volume excavated can be estimated at any time by weighing soil deposited at the
surface.
In this context, our objectives are to characterise the dynamics of nest excavation
as well as the adjustment of the volume dug as a function of the number of workers. To
achieve this, Lasius niger is a convenient insect to use because this ant builds mainly
subterranean nests, because much is known about its biology [10,11], and because it can
easily be collected and maintained in the laboratory.
We have developed standardised procedures for systematically examining the reg-
ulation of nest excavation.
12.2 Material and Methods
Ants were collected in April (1 colony) and in October 1995 (1 colony), and for
controls in April 1996 (2 colonies). Because of possible differences between these groups,
the experiments (begun in May and December 1995 for the first set) have been kept
systematically separate. The colonies (without queens and brood) were installed in boxes
with tubes, within which the ants could live. The photoperiod in the lab was about 12
hours, with light beginning at 08:00. The food consisted of pieces of a solid preparation
[12]. The group activity was followed for several weeks. In these experiments,
the ant groups were initially placed in a Petri dish (∅ = 15 cm) and had access to a
sand bucket through a single hole in the centre of the dish (∅ = 0.5 cm). In this humid
Brusselian sand (tertiary deposit, 5 kg humidified at 15% in mass), a group of workers
could establish their nest. In each experiment, we measured the daily weight of
excavated sand brought to the surface by the ants and deposited in the Petri dish (1 g of
dried sand = 0.67 cm³ ± 0.0001 (s.d.)).
12.3 Experimental procedures
12.3.1 Dynamics of digging by groups of different sizes
As our reference group, we first studied the digging activity of 50 workers (n = 11:
5 experiments in May and 6 in December 1995). We then studied the excavation of
groups composed of 25 (n = 11: 7 in May and 4 in December 1995), 100 (n = 6: 4 in May
and 2 in December 1995) and 150 ants (n = 1, May 1995).
12.3.2 Effect of population increases on nest volume
After several weeks, the nest volume stabilised. We then artificially increased the
population and followed the resulting volume of material excavated. We added groups
of different numbers of ants (25, 50, 100, 150 ants of the same colony) to the initial
groups (25, 50, 100 and 150 ants). To obtain a global view of the phenomenon, we
favoured a diversity of combinations rather than repetitions of a few (Table 12.1).
Table 12.1: The different combinations of initial and added populations in the nest. The number of
repetitions is indicated for each experiment.
Moreover, after the addition of ants to the initial groups, we followed the activity
of both groups: the initial one (experiments with 50 workers, not marked) and the
added individuals (marked). The combinations studied were 50 ants + 10, 25, 50, 100
and 150. Among these, two were repeated twice (50 + 50 and 50 + 150). We marked
the ants the day before their introduction into the nest with scale-model paint
(Humbrol Enamel) applied to their abdomen. Measurements were taken from video
recordings of the nest entrance made at regular intervals before and after the ants were
introduced, each recording lasting 10 minutes. To focus on the activation of digging,
recordings were made every hour during the first 8 hours, then every 3 hours for 10
days.
To check whether these measurements were compatible with the quantities of sand
weighed daily (corresponding to the effective activity), we estimated from the video
recordings the volume dug in 10 days. To do so, we extrapolated the mass excavated
during one transport (3.13 × 10⁻⁴ g; 2336 excavated grains weigh 0.73 g) to the 10 days
of the experiment, averaging day by day the number of transports counted during the
10 minutes of recording every 3 hours. In the 7 experiments, the ratio between the
weighed sand and the calculated quantity excavated was close to 1 (0.89 ± 0.32 (s.d.)),
validating the method.
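The extrapolation described above can be sketched as follows. This is only an illustrative reconstruction, not the authors' procedure: the function name `estimated_mass_g`, the assumption that the sampled transport rate holds for the whole day, and the daily counts in `example_counts` are hypothetical. Only the calibration (2336 grains weigh 0.73 g) and the 10-minute recording length come from the text.

```python
# Mass carried in one transport, from the calibration quoted in the text:
# 2336 excavated grains weigh 0.73 g.
GRAINS_WEIGHED = 2336
MASS_OF_GRAINS_G = 0.73
MASS_PER_TRANSPORT_G = MASS_OF_GRAINS_G / GRAINS_WEIGHED  # about 3.13e-4 g

RECORDING_MIN = 10          # length of each video sample (from the text)
MINUTES_PER_DAY = 24 * 60


def estimated_mass_g(mean_transports_per_recording_by_day):
    """Extrapolate sampled transports to a total excavated mass in grams.

    The argument holds one entry per day: the mean number of transports
    counted per 10-minute recording on that day.
    Assumption (not from the source): the sampled rate holds all day.
    """
    total = 0.0
    for mean_count in mean_transports_per_recording_by_day:
        transports_per_day = mean_count / RECORDING_MIN * MINUTES_PER_DAY
        total += transports_per_day * MASS_PER_TRANSPORT_G
    return total


# Hypothetical daily means for a 10-day run (illustrative values only).
example_counts = [12, 10, 9, 7, 6, 5, 4, 3, 3, 2]
video_estimate = estimated_mass_g(example_counts)
```

The ratio reported in the text would then be the weighed mass divided by an estimate of this kind.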
12.3.3 Examination of factors responsible for the digging dynamics
Three series of experiments allowed us to evaluate the effects of previous experience
and to assess the influence of a previously dug nest on naïve ants. The experimental
set-up was similar to the previous one except for the number of workers (10), the
diameter of the Petri dish (∅ = 8.5 cm) and the sand bucket, which contained only 250 g
of moist sand.
SERIES 1 consisted of experiments after which ants were removed from the dug nest
and placed back in the initial condition (in the Petri dish of a new setup) (n = 8, July 1996).
SERIES 2 consisted of naïve ants placed in a container in which a nest had previously
been dug by another group, which had been removed beforehand (n = 8, July 1996).
SERIES 3 consisted of experiments after which ants were also extracted from their
nest, but then placed either in another dug nest (n = 8, August 1996) or back in their own
nest (n = 4).
12.4 Results
12.4.1 Dynamics of digging
For the reference group of 50 ants, the nest volume grows monotonically, reaching
a plateau value (volume VS) after a few dozen days, while the digging rate reaches its
maximum on the first day and decreases as the volume V approaches the plateau value
(Fig. 12.1). With other group sizes, the dynamics are similar to those seen with 50
individuals (Fig. 12.1), while quantitatively the maximal rate (α) and the plateau volume
(VS) grow with the population.
Such dynamics suggest that the digging rate (dV/dt) decreases as the volume
V approaches VS: the excavated volume acts as a negative feedback on the dynamics.
They further suggest that the relationship between the digging rate and the excavated
volume is:
$$\frac{dV}{dt} = \alpha\left(1 - \frac{V}{V_S}\right). \tag{12.1}$$
Which, when solved, gives:
$$V = V_S\left(1 - e^{-\alpha t / V_S}\right). \tag{12.2}$$
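For readers who want the intermediate step, equation (12.2) follows from equation (12.1) by separation of variables. The derivation below assumes that the excavated volume is zero when digging starts, V(0) = 0, a condition that the text does not state explicitly but that is consistent with (12.2):

```latex
\begin{aligned}
\frac{dV}{dt} &= \alpha\left(1 - \frac{V}{V_S}\right)
\;\Longrightarrow\;
\frac{dV}{V_S - V} = \frac{\alpha}{V_S}\,dt, \\
-\ln\!\left(\frac{V_S - V}{V_S}\right) &= \frac{\alpha t}{V_S}
\;\Longrightarrow\;
V(t) = V_S\left(1 - e^{-\alpha t / V_S}\right).
\end{aligned}
```

In this form α is the digging rate at t = 0 (the maximal rate) and VS/α is the characteristic time over which the plateau is approached.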
The fits yield high values of the regression coefficient (see Table 12.2),
which supports the model. Next, to characterise quantitatively the relationship
between group size (N) and both the plateau volume (VS) and the maximal rate (α), we
first fit the following equation:
$$V_S = \gamma N^{\beta} \tag{12.3}$$
where γ is the volume dug per individual and β characterises the relation between group
size (N) and VS (the relationship is proportional if β = 1). The fitted β is smaller than 1
(0.77) but not statistically different from 1 (Fig. 12.2A, Table 12.3).
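A fit of equation (12.3) on logarithmic scales can be sketched as an ordinary least-squares regression of ln(VS) on ln(N). The code below is a minimal illustration, not the authors' analysis: the function name `fit_power_law` is hypothetical, natural logarithms are assumed (as in the caption of Table 12.3), and the data are synthetic, generated without noise from the May 1995 line quoted in the caption of Fig. 12.2A (slope 0.77, intercept 1.88).

```python
import math


def fit_power_law(sizes, volumes):
    """Least-squares fit of ln(V) = beta * ln(N) + ln(gamma).

    Returns (gamma, beta, r2) for the power law V = gamma * N**beta.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(v) for v in volumes]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    beta = sxy / sxx
    ln_gamma = mean_y - beta * mean_x
    r2 = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return math.exp(ln_gamma), beta, r2


# Synthetic, noise-free values (not measured data) built from the fitted
# May 1995 line of Fig. 12.2A: ln(V_S) = 0.77 * ln(N) + 1.88.
group_sizes = [25, 50, 100, 150]
synthetic_vs = [math.exp(1.88) * n ** 0.77 for n in group_sizes]
gamma, beta, r2 = fit_power_law(group_sizes, synthetic_vs)
```

With real, noisy measurements the recovered exponent would come with an uncertainty, which is what allows the statement that β is not statistically different from 1.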
Figure 12.1: A) Time evolution of the accumulated mass excavated by three groups of different sizes:
25, 50 and 100 workers: May 1995. B) Dynamics resulting from nest enlargement after population
increases: 25 + 25, 25 + 50 and 25 + 100 workers.
Table 12.2: For each population (pop.), we present the number of experimental repetitions (n), the
mean final volume excavated (VS) ± s.d., the maximal rate (α) and its minimal r2.
The maximal digging rate can be treated in the same way as the volume, using the
same kind of equation:
$$\alpha = \delta N^{\varepsilon}. \tag{12.4}$$
It appears that α is also almost proportional to the group size, although it grows
more slowly than the number of ants (ε = 0.73). Nevertheless, ε cannot be
considered statistically different from 1 (Fig. 12.2C, Table 12.3).
Seasonal or colonial effects: the experiments of May and December 1995, carried out
with two different colonies, show significant differences (Fig. 12.2A,C). The maximal
digging rate and the plateau volume are higher in May. One can also notice that,
while the season or the colony does not affect β and ε, it does affect the individual
volume γ (which is 4 times smaller in December) and the individual rate δ (which is 8
times smaller in December) (Fig. 12.2A,C; Table 12.3).
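As an illustrative cross-check of the factor quoted for γ, one can compare the intercepts of the two fitted lines given in the caption of Fig. 12.2A. This assumes that those intercepts are ln γ in natural logarithms (as in the caption of Table 12.3) and neglects the small difference between the two slopes (0.77 and 0.75):

```latex
\frac{\gamma_{\mathrm{May}}}{\gamma_{\mathrm{Dec}}}
  \approx \frac{e^{1.88}}{e^{0.51}} = e^{1.37} \approx 3.9
```

Under these assumptions the ratio agrees with the factor of about 4 stated for the individual volume.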
Figure 12.2: A) Log-log representation of nest size (VS) as a function of population (N), for May
(y = 0.77x + 1.88; r² = 0.93) and December 1995 (y = 0.75x + 0.51; r² = 0.55). B) Log-log
representation of the nest size as a function of total population after population increases, for May
(y = 0.78x + 1.64; r² = 0.87) and December 1995 (y = 0.84x + 0.24; r² = 0.79). C) Log-log
representation of the maximal digging rate (α) as a function of population (N), for May
(y = 0.73x + 0.52; r² = 0.77) and December 1995 (y = 0.86x − 2.64; r² = 0.44). D) Log-log
representation of the maximal digging rate during the nest adaptation as a function of total
population after population increase, for May (y = 0.56x + 0.88; r² = 0.48) and December 1995
(y = 1.22x − 4.57; r² = 0.48).
Table 12.3: Values of the parameters characterizing the final volume (VS) and the maximal rate (α)
as function of population. The parameters are obtained by fitting ln(Vs) and the ln(α) as function of
ln(pop.).
Therefore these experiments indicate that: 1) the volume excavated exerts a negative
feedback on the activity; 2) the final volume VS and the maximal digging
rate α are almost proportional to the group size, although there is a tendency for both
to grow more slowly than the number of workers.
12.4.2 Effect of population increases on nest volume
Experiments in which we increased the number of workers show that: 1) digging is
triggered immediately, starting on the first day (Fig. 12.1B), even when the number added
is low (10%) compared to the initial group size; 2) the dynamics of the nest enlargement
are the same as for the initial excavation, which means that the activity is maximal at
the beginning and decreases afterwards (Fig. 12.1B); 3) both volume and speed grow
as a function of the total number of ants (Fig. 12.2B, D).
In this situation the model (equation (12.1)) is adapted into equation (12.5), taking
into account that the initial volume is now equal to VS and that digging leads to a new
final volume VStot, corresponding to the new population Ntot = N + Nadd:
$$\frac{dV_{tot}}{dt} = \alpha_{rec}\left(1 - \frac{V_{tot}}{V_{Stot}}\right). \tag{12.5}$$
Which gives:
$$V_{tot} = V_{Stot} - (V_{Stot} - V_S)\, e^{-\alpha_{rec} t / V_{Stot}}. \tag{12.6}$$
In these equations, time 0 corresponds to the increase in population. We then
proceed as we did for the initial phase and fit the dynamics of the experiments to
equation (12.6). The different fits confirm the feedback hypothesis of the model (all
r² > 0.80; s.d. = 0.08 for both seasons). Once again the final volume (VStot) is almost
proportional to the total population in the nest (Ntot), with β = 0.77 (Fig. 12.2A, B;
equation (12.7)). The individual volumes (γ) obtained after enlargement remain constant,
confirming the model.
$$V_{Stot} = \gamma N_{tot}^{\beta} \tag{12.7}$$
$$\alpha_{rec} = \delta N_{tot}^{\varepsilon}. \tag{12.8}$$
The estimation of the non-linearity for the maximal rate αrec, obtained by fittings
(equation (12.8)), shows that the tendency observed during the initial excavation is not
so clear any more, ε being equal to 0.56 (r2 = 0.48).
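The closed form (12.6) can be checked numerically against the differential equation (12.5). The sketch below is an illustration only: the function names and all parameter values are hypothetical, and a simple forward-Euler scheme is used in place of any fitting procedure.

```python
import math


def enlarged_volume(t, v_s, v_s_tot, alpha_rec):
    """Closed form (12.6): volume at time t after the population increase."""
    return v_s_tot - (v_s_tot - v_s) * math.exp(-alpha_rec * t / v_s_tot)


def integrate_12_5(t_end, v_s, v_s_tot, alpha_rec, dt=0.001):
    """Forward-Euler integration of (12.5), starting from V_tot(0) = V_S."""
    steps = int(round(t_end / dt))
    v = v_s
    for _ in range(steps):
        v += dt * alpha_rec * (1.0 - v / v_s_tot)
    return v


# Hypothetical parameter values, chosen only for illustration.
V_S, V_S_TOT, ALPHA_REC = 100.0, 170.0, 15.0
closed = enlarged_volume(10.0, V_S, V_S_TOT, ALPHA_REC)
numeric = integrate_12_5(10.0, V_S, V_S_TOT, ALPHA_REC)
```

At t = 0 the closed form returns VS, and for large t it tends to VStot, which are the two boundary conditions described in the text.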
Seasonal or colonial effects: there is still no difference between May and December
for the exponent β (0.77 and 0.84), while the difference remains for γ, its value being
4 times higher in May. The exponent ε is equal to 0.56 and 1.22 for May and December
1995, respectively, but the difference is not statistically significant
(Fig. 12.2D). Indeed, the maximal digging rate behaves erratically, even more so
when we compare the initial values (α) with those after the population increase
(αrec) (Table 12.3).
Thus, these experiments reveal that: 1) ants adapt their nest volume to the new
population, even when the increase is small; 2) as for the initial digging, the
total volume grows almost proportionally to the population (although somewhat less); and
3) the maximal rate again grows approximately in proportion to the number of ants, but
the adjustment of the volume VS is more stable.
The recorded activities of the two groups (initial, non-marked, and added, marked)
show that, immediately after the addition of workers, ants in both groups work almost
in proportion to their representation in the total population.
Nevertheless, the fit reveals a tendency for the initial ants to be less active when the
number of added ants increases (Fig. 12.3).
The initial workers are thus still active, so the decrease of their digging rate over
time is not caused by exhaustion alone. These results are moreover compatible with
the measurements of mortality, since no more than 10% to 20% of the ants were found
dead and rejected on the surface. Furthermore, the destruction of two nests after 120 days
showed that both initial populations (50) were only slightly affected (42 and 48 ants alive).
If we focus on the first hours (20 hours) after the addition, we observe that the added
ants accelerate their activity more when they are more numerous (Fig. 12.4). This
acceleration, defined as the increase in transports/ant/minute, is however not
observed in the initial group. We suppose that the aggregation of the initial ants in the
nest and their spatial distribution, which contrast with the mobility and spreading of
the added ants, are responsible for this erratic activation.
In conclusion, there is an adjustment of the nest size to the population,
corresponding to a new steady volume (VStot). We now turn to the
mechanisms behind this decrease in activity.
Figure 12.3: Ratio of sand grains carried by the initial and added groups of ants as a function of the
ratio of initial and added workers in the nest population. The number of such transports is an average
calculated from 10-minute recordings, 8 times a day during the 10 days after the population increase:
y = 0.40x + 0.33 (r² = 0.99).
Figure 12.4: Linear approximation of the transport increase per added ant as a function of the number
of ants added, during the first 18 hours after the population increase. It is computed for the added
workers: y = −5 × 10⁻⁴ x − 8 × 10⁻⁴ (r² = 0.96), December 1995.
12.4.3 Factors responsible for the digging dynamics
First, the initial excavation by groups of 10 ants in smaller sand buckets remains similar
(Fig. 12.5A-a and 12.5B-a), the nest volume reaching a plateau after about 20 days in
both series. Fitting the model (equation (12.1)) gives the final volume VS and
the maximal rate α (Table 12.4). When compared with the values obtained in May 1995,
the volumes VS per individual appear comparable (Tables 12.2 and 12.4).
Figure 12.5: A) Mean dynamics (n = 8 experiments) of initial excavation by 10 ants (a) and when
they are put back in the initial conditions (new setup) (c). We also measured this evolution when naive
workers are introduced into an already dug nest (b). B) Mean dynamics (n = 12) of initial excavation by
10 ants (a) and when they are extracted and reintroduced into their own nest (b) or into another nest (c).
Table 12.4: Parameters estimated from the initial digging by 10 workers in small sand buckets (±
s.d.). Also presented are those estimated from excavation after placing experienced ants in new setups
(SERIES 1) and naive ants in already dug setups (SERIES 2). In parallel we present the results for ants
that have dug once and are introduced into other dug nests (SERIES 3) or reintroduced into their own
(SERIES 4).
For SERIES 1, the volume excavated by ants after having already dug is about
48% lower than for the first time (taking the mortality into account) (Fig. 12.5A-c).
The maximal rate (α) is also lower, reduced by 82% on average (Table 12.4). This
shows that ants are affected directly by digging (physiological or behavioural change).
In SERIES 2, when 10 naive workers are introduced into an empty, previously dug nest,
they dig less than during the initial nest excavation (Fig. 12.5A-b), for both VS (54%
less) and α (75% less) (Table 12.4). This reveals that the volume already excavated is
an inhibitor. Nevertheless, the inhibition is not absolute since the naive ants still excavate.
In SERIES 3, when ants are extracted from their initial nest and introduced in
another nest (emptied), they dig very weakly (95% weaker) (Fig. 12.5B-b). We obtain
a similar result in SERIES 4, when, as a control, ants are placed back in their original
nest (99% weaker) (Fig. 12.5B-c). In this case, the absence of activity results from the
combination of both effects (previous experience and volume).
In addition, no correlation was found between the mass initially excavated and either
the mass dug in the new set-up (r² = 0.11) or the mortality. This means that, although
digging affects the ants, the effect may also depend heavily on the general experimental
conditions.
In conclusion, this part of our work reveals that the decrease of activity in the
digging dynamics is mostly generated by the combination of two effects: modification
of the workers' state and inhibition by the volume. The latter is nevertheless not absolute,
since we detect a tendency for naive workers to dig although the nest has reached the
proper size.
12.5 Discussion
This study of nest digging regulation has highlighted that: 1) digging activity always
decreases through time, the nest reaching a plateau volume. The rate of digging is
initially high and then decreases after a few days, becoming negligible; 2) both this
plateau (VS) and maximal digging rate (α) are closely proportional to group size; 3)
nests are systematically enlarged when the ant population is increased. This enlarged
nest volume is adapted to the new group size, while the dynamics of excavation re-
mains similar to the initial one.
The quantitative variation between the different groups of experiments may be due to
colonial differences, but our results suggest that the seasonal effect dominates: the final
volume per ant is systematically higher for experiments with ants collected in spring
(1995 and 1996, tested in spring and summer) than in autumn (1995, tested in winter).
Such an effect is in agreement with the enlargement and repair of nests under natural
conditions, which occur in spring and summer. It could be due either to weather
conditions, which in this period could stimulate the activity level of individuals, or to
the physiological state of the ants, those collected in autumn being less active
before their hibernation.
During the enlargement, ants of the initial population and added ants were observed
to dig at the same maximal rate. This means that the initial ants can still excavate but,
as we have also shown, digging activity is reduced after a previous experience, pointing
to the possibility of physiological [13] or behavioral changes. This contribution
is not exclusive, since we have shown that volume inhibits digging. Experimental
and theoretical results suggest that the digging rate decreases proportionally as the
volume approaches the plateau (VS).
If we now look at the final volume (VS), which must be considered, for each set of
experiments (December and May; initial digging and enlargement), as statistically
proportional to the population, we observe that all the fits show the same tendency, which
suggests that the final volume grows more slowly than the number of workers (as
population^0.75).
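To give a sense of scale for this sublinear trend, the following worked example assumes the power law of equation (12.3) with an illustrative exponent β = 0.75 (the rounded value quoted above). The group sizes 25 and 100 are chosen only because they were used in the experiments:

```latex
\frac{V_S}{N} = \gamma N^{\beta - 1}
\quad\Longrightarrow\quad
\frac{(V_S/N)_{N=100}}{(V_S/N)_{N=25}} = 4^{\,\beta - 1} = 4^{-0.25}
  = \frac{1}{\sqrt{2}} \approx 0.71
```

Under this assumption, a fourfold increase in population multiplies the plateau volume by 4^0.75 ≈ 2.8, so the volume available per ant falls by roughly 30%.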
The relation between the maximal digging rate and the population size is also nearly
linear, but shows a higher variability (particularly for the nest enlargements) than
that linking the excavated volume and group size. This may be because the
rate is a daily value while the volume is the sum of all these daily activities. Thus, the
rate (and its calculation) is more sensitive to perturbations from outside or inside the
nest. When new ants are added, the spatial distribution of workers within the nest could
thus be the origin of the variability observed during the enlargement. Such an effect would
be present in particular if recruitment to digging is involved. These results lead us to
the mechanisms of this regulation and the information processed by the ants. Two types of
mechanisms can be hypothesized:
In the first hypothesis, the individual volume directly triggers the digging activity.
This could occur if the ants perceive a particular cue that varies in proportion to the
density of workers within the nest. The workers modulate their building activities,
always with respect to a given individual volume, as a function of the number of ants and
the volume available [14]. After the enlargement, the decrease in population density will
be detected and excavation will stop. The regulation of the Leptothorax nest
and the size regulation of the royal cell in Macrotermes termites [1,5,14,15,16]
are examples of such a process.
One possibility for such a cue would be a chemical signal whose concentration
increases with the group size and decreases with the nest enlargement. The higher
the chemical concentration, the more individuals would be stimulated to dig. As a
consequence of the digging activity, the concentration and then the stimulation would
decrease when the nest reaches a particular volume. Possible chemical cues could be
a pheromone or a metabolic by-product of the ants’ activity, such as carbon dioxide.
CO2 would then be a global indicator of worker activity, physiological state
and population size. In particular, its production depends on the physiological state
of the workers and on colonial and seasonal conditions [17,18,19]. In connection with this
hypothesis, L. Gallé [20] has shown that oxygen consumption in a group increases,
as the volume does, in proportion to (group size)^0.75. It is then reasonable to suppose
that if the consumption decreases, the metabolism and its by-products are affected
too. Nevertheless, the same non-linear tendency suggested by our measurements must be
highlighted, as well as the influence of CO2 on Lasius niger. Other signals could also
be involved in this volume regulation, such as the frequency of contacts [21].
In the second hypothesis, the population density does not directly affect the digging
behavior of the ants. Owing to digging amplification processes such as
recruitment to excavation activity, the propagation of excavation itself (i.e. collective
enlargement) depends on population density. Our results strongly suggest the presence
of such recruitment. Indeed, the measurements during population increase show an
acceleration of the activity with the number of ants added. Besides, the exponential
dynamics of excavation during the first hours (initial phase) constitute the typical
signature of such positive feedback [22]. Moreover, in experiments where ants are given the
opportunity to dig at two sites (Rasse and Deneubourg, in prep), groups systematically
focused their activity on one of them, which can only be interpreted in terms of
amplification processes [16,23,24]. Finally, such recruitment is well known in the building
activities of many social species (e.g. [16,25,26,27,28,29]), and occurs in Lasius niger.
In nest regulation, a global nest enlargement will result from the propagation of digging
activity at the collective level, favored by a high worker density. The decrease in
population density resulting from this excavation will weaken this recruitment until the
activity stops. Thus, in this regulation model, such spatial amplification mechanisms
could play an essential role in the activation and coordination of the digging activity of
workers.
These results suggest that the regulation, or part of it, would be based on an indirect
inhibition of the positive feedback, without involving an explicit coding of nest size as in
the previous scenario [14]. Population density would not be measured but would be a
condition for the initiation, propagation and collective nest enlargement. The nest size
would then be implicitly coded within the dynamics of the propagation itself and these
mechanisms would contain their own inhibition.
In light of our results, the first hypothesis is not verified, while the second would
be plausible only when two opposing feedbacks are considered. The first is the
amplification of digging, which acts as a short-term positive feedback. It would be inhibited
afterwards by the volume effect. This long-term inhibiting effect could hide another
positive feedback, antagonistic to recruitment: self-aggregation. The tendency of
individuals to self-aggregate would be an example of such an inhibitor, one that would be
favored by the nest enlargement. From this point of view, the quantitative description
of the long-term activation by our theoretical model, and its inability to describe
the short-term activation, confirm these two scales of action.
Acknowledgments.
This research was generously supported by the Institut pour la
Recherche Scientifique dans l’Industrie et l’Agronomie, by the Fonds pour la forma-
tion à la Recherche dans l’Industrie et dans l’Agriculture, the Future & Emerging
Technologies Program of the European Commission and by the Fondation David et
Alice van Buuren. We also thank the referees and especially Scott Camazine for his
suggestions and the time he spent in editing the paper.
12.6 Bibliography
[1] M. V. Brian, Social insects: ecology and behavioural biology, Chapman and Hall,
London (1983).
[2] M. H. Hansell, Animal architecture and building behavior, Longman Press, Lon-
don (1984).
[3] W. R. Tschinkel, Sociometry and sociogenesis of colonies of the fire ant Solenop-
sis invicta during one annual cycle, Ecological Monographs 63(4): 425-457
(1993).
[4] W. R. Tschinkel, Sociometry and sociogenesis of colonies of the Florida har-
vester ant (Hymenoptera: Formicidae) , Ann. Entomol. Soc. Am. 92(1): 80-89
(1999).
[5] N. R. Franks, A. Wilby, B. Silverman, and C. Tofts, Self-organizing nest
construction in ants: Sophisticated building by blind bulldozing, Anim. Behav. 44,
357-375 (1992).
[6] M. Lepage, Les termites d’une savane saharienne (Ferlo septentrional, Sénégal)
peuplement, population, consommation, rôle dans l’écosystème, Thèse sciences,
Université de Dijon (1974).
[7] N. M. Collins, Population; age structure and survivorship of colonies of
Macrotermes bellicosus (Isoptera, Macrotermitinae) , J. Anim. Ecol. 50, 293-311
(1981).
[8] R. Darchen, Les techniques de construction chez Apis mellifica, Thèse sciences,
Paris (1959).
[9] N. R. Franks, and J. L. Deneubourg, Self-organizing nest construction in ants:
individual workers behaviour and nest’s dynamics, Anim. Behav. 54, 779-796
(1997).
[10] A. Lenoir, Le comportement alimentaire et la division du travail chez Lasius
niger, Bull. Bio. de la France et de la Belgique 13, 2-3 (1979).
[11] B. Hölldobler, and E. O. Wilson, The Ants, Springer-Verlag, Berlin (1990).
[12] A. P. Bhatkar, and W. H. Whitcomb, Artificial diet for rearing various species
of ants, Florid. Ent. 53(4): 229-232 (1970).
[13] J. H. Fewell, and R. E. Page, The emergence of division of labour in forced asso-
ciations of normally solitary ant queens, Evol. Ecol. Res. 1(5): 537-548 (1999).
[14] J. L. Deneubourg, and N. R. Franks, Collective control without explicit coding:
the case of communal nest excavation, J. Ins. Behav. 4, 417-432 (1995).
[15] E. Bonabeau, G. Theraulaz, J. L. Deneubourg, N. R. Franks, O. Rafelsberger,
J. L. Joly, and S. Blanco, The emergence of pillars, walls and royal chambers in
termite nests, Philosophical Transactions of the Royal Society of London Series
B (in press) (1998).
[16] S. Camazine, J. L. Deneubourg, N. R. Franks, J. Sneyd, E. Bonabeau, and
G. Theraulaz, Self-organization in biological systems, In press, Princeton Uni-
versity Press (2001).
[17] W. Hangartner, Carbon dioxide, a releaser for digging behavior in Solenopsis
geminata (Hymenoptera: Formicidae), Psyche 76(1): 58-67 (1969).
[18] C. Kleineidam, and J. Tautz, Perception of carbon dioxide and other
"air-condition" parameters in the leaf cutting ant Atta cephalotes, Naturwis-
senschaften 83, 566-568 (1996).
[19] J. F. Burkhardt, Individual flexibility and tempo in the ant, Pheidole dentata, the
influence of group size, J. Ins. Behav. 11(4): 493-505 (1998).
[20] L. Gallé, Respiration as one of the manifestation of the group effect in ants, Acta
Biologica Szegd. 24(1-4): 111-114 (1978).
[21] D. M. Gordon, R. E. H. Paul, and K. Thorpe, What is the function of encounter
patterns in ant colonies? , Anim. Behav. 45, 1083-1100 (1993).
[22] P. Rasse, and J. L Deneubourg, Collective decision-making during nest gallery
excavation by the ant Lasius niger (Hymenopterae, Formicidae) , (in prep.)
[23] J. L. Deneubourg, and S. Goss, Collective patterns and decision making,
Ethol. Ecol. Evol. 1, 295-311 (1989).
[24] E. Bonabeau, G. Theraulaz, and J. L. Deneubourg, Quantitative study of the
fixed threshold model for the regulation of division of labour in insect societies,
Proc. R. Soc. Lond. B 263, 1565-1569 (1996).
[25] P.-P. Grassé, La reconstruction du nid et les coordinations inter-individuelles chez
Bellicositermes natalensis et Cubitermes sp. La théorie de la stigmergie: Essai
d’interprétation du comportement des termites constructeurs, Ins. Soc. 6, 41-83
(1959).
[26] O. H. Bruinsma, An analysis of building behaviour of the termite Macroter-
mes subhyalinus (PhD thesis), Landbouwhogeschool te Wageningen, Netherlands
(1979).
[27] V. Skarka, J. L. Deneubourg, and M. R. Belic, Mathematical model of building
behavior of Apis mellifera, J. Theor. Biol. 147, 1-16 (1990).
[28] J. L. Deneubourg, Application de l’ordre par fluctuations à la description de cer-
taines étapes de la construction du nid chez les termites, Ins. Soc. 24, 117-130
(1977).
[29] E. Bonabeau, G. Theraulaz, J. L. Deneubourg, S. Aron, and S. Camazine, Self-
organization in social insects, Trends Ecol. Evol. 12, 188-193 (1997).
Chapter 13
Spatial Patterns in Ant Colonies
Guy Theraulaz1, Eric Bonabeau2,3, Stamatios C. Nicolis1,
Ricard V. Solé2,5, Vincent Fourcassié1, Stéphane Blanco6,
Richard Fournier6, Jean-Louis Joly6, Pau Fernández5, Anne Grimal1,
Patrice Dalle7, and J. L. Deneubourg4
1Laboratoire d’Ethologie et Cognition Animale, CNRS FRE 2382, Université Paul Sabatier,
118 route de Narbonne, 31062 Toulouse Cédex 4, France
2Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA
3Eurobios, 9 rue de Grenelle, 75007 Paris, France
4Center for Nonlinear Phenomena and Complex Systems, Université Libre de Bruxelles,
Campus Plaine, CP231, 1050 Brussels, Belgium
5Complex Systems Group, Dept. Física i Enginyeria Nuclear, Universitat Politècnica de
Catalunya, Sor Eulàlia d’Anzizu s/n, Campus Nord, Mòdul B4, 08034 Barcelona, Spain
6Equipe Modélisation des Systèmes Fortement Couplés (ZOOM), LESETH, Université Paul
Sabatier, 118 route de Narbonne, 31062 Toulouse Cédex 4, France
7Equipe Traitement et Compréhension d’Images, IRIT, Université Paul Sabatier, 118 route de
Narbonne, 31062 Toulouse Cédex 4, France
Abstract
The origins of large-scale spatial patterns in biology have been an important source of
theoretical speculation since the pioneering work by Turing (1952) on the chemical
basis of morphogenesis. Understanding the mechanisms involved in the emergence of
these patterns and their functional role is important to our understanding of
the evolution of biocomplexity. However, conclusive evidence for this type of
mechanism in real biological systems has so far been elusive, in spite of strong
indications in its favor. Here a well-defined experimental and theoretical analysis
of the pattern formation dynamics exhibited by clustering behavior in ant colonies
is presented. These experiments and a simple mathematical model show that these
colonies do indeed use this type of mechanism. All microscopic variables have been
measured and provide the first evidence for this type of self-organized behavior in
complex biological systems, supporting early conjectures about its role in the
organization of insect societies.
13.1
Introduction
Many biological systems display large-scale features involving some characteristic
scale that is much larger than the size of their individual components [1]. These structures
are observed in a broad range of systems and scales, from animal coats [2], shell
patterns [3,4] and neural structures [5] to the spatial distribution of individuals in
ecosystems [6]. In many cases, they reflect functionality and adaptation, and in all of them
they provide clues to the underlying rules that generate them. In most cases, it is clear
that the information available to individual units is gathered from a local neighborhood
much smaller than the resulting structures, suggesting some type of amplification
mechanism that relies on collective behavior.
The first theoretical explanation of these types of structures was suggested in 1952
by Alan Turing [1,7,8]. The basic mechanism at work involves local amplification of
fluctuations (activation) and long-range inhibition and actually falls within a general
class of mechanisms [9,10,11,12]. These mechanisms have been identified in physi-
cal [13] and chemical [14] systems, in ecosystems [6,10,12,15,16,17,18] and morpho-
genesis [3,4,5,11,12,19,20,21,22,24,25,27]. In the slime mold [27,28] the evidence is
also strong. Critics have argued that a proof requires the identification and measure-
ment of the microscopic mechanisms at work, and this is obviously a rather difficult
task in biology.
In this context, it was suggested early on that social insects might actually use these
types of mechanisms to build their nests [29,30] and produce a wide variety of spa-
tiotemporal structures [31,32,33,34]. Here we use social insects and their behavioral
patterns of organization as our reference system. We follow a standard approach, us-
ing a well-defined and controlled experimental setup in which the whole set of pa-
rameters can be measured and therefore all the microscopic rules can be identified.
We show that the formation of cemeteries in ants [35,36,37,38] falls within the fam-
ily of local activation-long range inhibition (LALI) processes originally suggested by
Gierer and Meinhardt [9], the inhibition resulting from the depletion of the substrate.
In experiments carried out with the ant Messor sancta, we confirm the presence of
self-organization dynamics as responsible for the regular structures generated by the
clustering process and a mathematical model is presented, consistently reproducing
the experimental observations.
13.2
Methods
13.2.1
Colony Collection and Ant Maintenance
Experiments were carried out with colonies of the ant Messor sancta. Ants were
collected in the southwest of France, near Narbonne, and then reared in the laboratory
at 25 °C with a 12 h light / 12 h dark cycle. Colonies were housed in several glass
test tubes placed in 27 x 27 cm plastic boxes whose sides were coated with Fluon to
prevent ants from escaping. Ants were provided with water in the form of moist cotton
and fed ad libitum with a mixture of seeds and, twice a week, with bits of crickets.
13.2.2
Experiments
The experimental arena is a circular structure (of two possible diameters ∅ = 25 cm or
50 cm) below which the nest-box is located. The ants can access the arena by climbing
on a wood rod placed in a hole at the center of the arena and randomly walk to the
periphery. The experimental setup was designed in order to reduce the problem to a
one-dimensional system with periodic boundary conditions: as the ants exhibit strong
thigmotactism (a tendency to follow the inner walls), their paths can be considered to
be confined to one dimension. Corpses are initially homogeneously distributed along
the periphery, close to the inner wall (Fig. 13.1a). Two different initial numbers of
corpses are used in each arena size: 100 and 200 for the small arena, 200 and 400 for
the large arena, corresponding to densities of 127 and 255 corpses m−1, respectively.
The average size of the corpses is 3 mm, the initial mean distance between them being
4.9 mm and 0.9 mm for the low and high density, respectively. The duration of the
experiments was set to 24 hours with the small arena and 48 hours with the large arena.
Fifteen replications were performed for each density with the small arena and 25
replications with the large arena. Another set of 10 experiments was performed with the
large arena and a small initial number of corpses, corresponding to 13 corpses m−1, in
order to test the existence of a critical density of corpses. The duration of these
experiments was set to 24 hours. The floor of the arena was washed with diluted alcohol
and hexane before each experiment.
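As a quick consistency check, the linear densities and mean gaps quoted above follow directly from the arena perimeter π∅. The short Python sketch below recomputes them; the function names are ours, and the gap is obtained under the assumption that the corpses are evenly spaced along the wall.

```python
import math

CORPSE_LENGTH_MM = 3.0  # average corpse size quoted in the text


def linear_density(n_corpses, diameter_m):
    """Number of corpses per metre of arena perimeter."""
    return n_corpses / (math.pi * diameter_m)


def mean_gap_mm(n_corpses, diameter_m):
    """Mean free space between consecutive corpses, assuming even spacing."""
    spacing_mm = 1000.0 * math.pi * diameter_m / n_corpses
    return spacing_mm - CORPSE_LENGTH_MM


# The four experimental conditions: (number of corpses, arena diameter in m).
for n, d in [(100, 0.25), (200, 0.25), (200, 0.50), (400, 0.50)]:
    print(f"{n} corpses, diameter {d} m: "
          f"{linear_density(n, d):.0f} corpses/m, gap {mean_gap_mm(n, d):.1f} mm")
```

The two densities come out the same in both arenas, which is what allows the effect of arena size to be separated from the effect of density.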
Recording and data analysis. The experiments were videotaped by means of a
SONY DCR-VX1000E high-definition camera allowing the regular sampling of the
aggregation process. Two seconds of images were recorded every ten minutes. A
video analysis was then performed with specially designed software that calculated
the position and the size of the piles at each time interval. Two neighboring corpses
are considered to belong to the same pile when the distance between them is less than
1.5 mm (half the average size of a corpse). A pile is defined as a cluster of at least 5
corpses. The individual behavior of ants was studied with a separate set of experiments.
The spontaneous probabilities for an ant to drop a corpse or to make a U-turn while
walking were estimated by calculating the regression line of the survivorship curves
of these events. The probabilities of picking up and dropping a corpse as a function of the
size of the pile encountered by an ant were estimated in a series of experiments during
which piles of predefined sizes were created. The size of the piles was kept constant
during these experiments. Ants’ trajectories were digitized using a GrafBar GP-7 sonic
digitizer (Science Accessories Corporation, Southport, Connecticut 06490, USA). We
put a glass plate over the active area of the digitizer and placed behind it a 13-inch
video monitor. As an ant moved on the screen, it was followed with the digitizer cursor
and its path was input into a microcomputer as a series of X-Y Cartesian coordinates
at a rate of 5 points per second. As the speed at which the ants were moving on the
screen was relatively slow, ants could be followed with the videotapes played at normal
speed. Digitized trajectories were used to compute the running velocity of ants, defined
as the ratio of total trajectory length over the time the animal spent moving during the
trajectory.
Figure 13.1: An example of aggregation dynamics observed for an arena of ∅ = 50 cm and with
N = 400 corpses. a. at t = 0. b. after 6h. c. after 12h. d. after 45h.
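The pile criterion used in the video analysis (neighbors closer than 1.5 mm belong to the same pile, and a pile holds at least 5 corpses) can be sketched as follows. This is only an illustration of the rule, not the software actually used: the function name is ours, corpses are reduced to one coordinate along the circular periphery, and "distance" is simply taken as the separation of those coordinates, since the text does not say whether it was measured edge to edge or center to center.

```python
def find_piles(positions_mm, perimeter_mm, max_gap_mm=1.5, min_size=5):
    """Group corpses lying on a ring into piles and return the pile sizes."""
    pts = sorted(p % perimeter_mm for p in positions_mm)
    n = len(pts)
    if n == 0:
        return []
    # gaps[i] is the separation between corpse i and the next one round the ring.
    gaps = [(pts[(i + 1) % n] - pts[i]) % perimeter_mm for i in range(n)]
    # A "break" follows corpse i when the next corpse is too far to share a pile.
    breaks = [i for i in range(n) if gaps[i] >= max_gap_mm]
    if not breaks:
        groups = [n]  # every corpse is chained to the next one
    else:
        groups = []
        for k, b in enumerate(breaks):
            prev = breaks[k - 1]          # for k = 0 this wraps to the last break
            size = (b - prev) % n or n    # a single break means one group of n
            groups.append(size)
    return [g for g in groups if g >= min_size]


# Hypothetical coordinates (mm) on the small arena, perimeter about 785.4 mm:
# six corpses chained across the origin, two isolated ones, and a group of four.
example = [783, 784, 785, 0, 1, 2, 100, 200, 300, 301, 302, 303]
print(find_piles(example, perimeter_mm=785.4))
```

Working on the sorted ring and looking for breaks, rather than scanning from an arbitrary origin, keeps a pile that straddles the origin from being counted twice.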
13.3
Results
13.3.1
Clustering behavior: collective and individual levels
After having reached the arena, workers pick up corpses and drop them to form piles.
After a few hours, several clusters are formed. Over time, some clusters grow and
others disappear, leading to an apparent steady state with a stable number of clusters
over the duration of the experiment (Fig. 13.1b-d). The sigmoidal growth of surviving
clusters, an illustration of which is given in figure 13.2, suggests that cluster formation
is auto-catalytic. The number of clusters initially grows to reach a maximum after
about three hours, and then decreases and stabilizes.
Figure 13.2: An example of growth of a surviving cluster of corpses for an arena of ∅ = 25 cm and
with N = 100 corpses.
The above results suggest a LALI mechanism: since the addition of corpses to a
cluster is more likely as the cluster increases in size, cluster growth is locally self-
enhancing; and cluster growth is inhibited by the depletion of corpses in the cluster’s
neighborhood. This type of LALI model, coined "activator-substrate" [9], has been
suggested in the formation of certain sea shell patterns [4]. In order to confirm this
conjecture, the underlying microscopic rules have to be identified. Observation of the
ants’ behavior shows that workers pick or drop corpses with a probability that de-
pends on the local density (c) of corpses. Picking and dropping probabilities and their
functional form have been estimated from experimental data (Fig. 13.3a,b). Unladen
ants pick up corpses with a probability that decreases with cluster size, while corpse-
carrying ants drop corpses with a probability that increases with cluster size. The latter
ants are also characterized by a spontaneous dropping probability that has been esti-
mated from experimental data (Fig. 13.3c). Trajectory measurements show that the
ants move randomly along the arena’s periphery (one-dimensional random walk) and
allowed the identification of two additional microscopic characteristics: individual
velocity and mean free path. The mean velocity of ants is 1.6 ± 0.7 cm s−1 (N = 25) and,
for this parameter range, the random walk can be shown to be only weakly influenced by
the velocity distribution. The discussion below therefore assumes a constant walking
velocity equal to the average value. Ants are also characterized by a constant
probability per unit of time of making a U-turn during their walk (0.10 s−1), and the
corresponding mean free path (l = 15.8 cm) is significantly smaller than the size of the
arena’s periphery (78.5 cm and 157.1 cm for the arena sizes used in the experiments).
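The link between these microscopic quantities can be checked with a few lines of Python. The sketch below assumes the simplest possible picture, a walker moving at constant speed v and reversing direction at a constant rate (a telegraph process); the relations l = v / rate and D = v l / 2 are the standard results for that idealized walk, and the function name is ours. The small difference between the 16 cm obtained here and the 15.8 cm quoted in the text presumably comes from rounding of the measured values.

```python
import math
import random

V = 1.6e-2        # mean walking speed, m/s
TURN_RATE = 0.10  # U-turn probability per unit time, 1/s

mean_free_path = V / TURN_RATE       # m
diffusion = V * mean_free_path / 2   # m^2/s, i.e. v^2 / (2 * rate)
print(f"mean free path: {100 * mean_free_path:.1f} cm")
print(f"diffusion coefficient: {diffusion:.2e} m^2/s")
for diameter in (0.25, 0.50):
    ratio = math.pi * diameter / mean_free_path
    print(f"perimeter / mean free path for diameter {diameter} m: {ratio:.1f}")


def simulated_msd(v, turn_rate, t_end, n_walkers, rng):
    """Mean squared displacement of walkers that reverse at a constant rate."""
    total = 0.0
    for _ in range(n_walkers):
        x, t, direction = 0.0, 0.0, rng.choice((-1, 1))
        while True:
            dt = rng.expovariate(turn_rate)   # waiting time until the next U-turn
            if t + dt >= t_end:
                x += direction * v * (t_end - t)
                break
            x += direction * v * dt
            t += dt
            direction = -direction
        total += x * x
    return total / n_walkers
```

Calling `simulated_msd(V, TURN_RATE, 200.0, 2000, random.Random(1)) / (2 * 200.0)` gives a Monte Carlo estimate of the diffusion coefficient that can be compared with the closed-form value.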
13.3.2
Model description
These estimates of the microscopic behavioral parameters and the response functions have
been used to build a macroscopic mathematical model that falls within the activator-
substrate class of LALI models, which confirms our earlier conjecture. The
model involves two variables: the density of corpse-carrying ants a(x, t) and the den-
sity of corpses c(x, t), where x and t stand for space and time, respectively. ρ is
the density of non-carrying ants. At any given time, their proportion in experiments
is large (ρ/(a + ρ) = 0.94 ± 0.07, estimated over 135 observations; mean density
ρ ± SD = 200 ± 7 m−1). Owing to the diffusion process resulting from the random
walk of non-carrying ants, ρ is assumed to remain uniform and constant over time in the
model. Ants’ behavior can then be approximated by the following reaction-diffusion
equations:
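\[
\frac{\partial c}{\partial t} = \Omega(c, a) \tag{13.1}
\]
\[
\frac{\partial a}{\partial t} = -\Omega(c, a) + D \frac{\partial^2 a}{\partial x^2}, \tag{13.2}
\]
where Ω(c, a) is the sum of three terms:
\[
\Omega(c, a) = v \left[ \underbrace{k_d\, a}_{\mathrm{I}}
  + \underbrace{\frac{\alpha_1 a\, \phi_c}{\alpha_2 + \phi_c}}_{\mathrm{II}}
  - \underbrace{\frac{\alpha_3 \rho\, c}{\alpha_4 + \phi_c}}_{\mathrm{III}} \right]. \tag{13.3}
\]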
Figure 13.3: Density-dependent probabilities of dropping (a) and picking up (b) a corpse, as estimated
from experiments, and theoretical fits of the dropping and picking rates (continuous lines). The total
number of ants dropping and picking up corpses for each pile size is indicated in brackets. The
theoretical fit is obtained using equations (13.1)-(13.3). A pile of corpses is introduced in the
theoretical set-up to reproduce the experimental procedure. The fraction of corpse-carrying ants crossing
the pile and dropping their load gives the rate of dropping for this pile. This fraction is computed for
different pile sizes. The comparison between this theoretical fraction and the corresponding experimental
one provides an estimate of the parameters of the dropping function, α1 and α2. The same procedure
is used to adjust the picking rate (α3 and α4), for which the fraction of laden ants leaving the cluster
was measured. Adjusted values α1 = 31.75 m−1, α2 = 1000 m−1, α3 = 3.125 m−1 and α4 = 50 m−1
were obtained with kd = 0.75 m−1, ρ = 40/(π∅) m−1, ∆ = 1 cm, v = 1.6 × 10⁻² m s−1, l = 15.8 × 10⁻² m
and D = vl/2 = 1.3 × 10⁻³ m2 s−1. c. The natural log of the proportion of ants (N = 127) still
carrying a corpse as a function of the distance covered since they picked it up. The relationship is
best described by ln(proportion of ants that have not yet dropped the corpse they carry) = 0.026 − kd x,
with kd = 0.75 m−1 (r2 = 0.975; x is the distance in m).
In equation (13.3), v is the linear velocity of the ants, part I represents spontaneous
dropping (with kd the spontaneous dropping rate per laden ant), and parts II and III
represent density-dependent dropping and picking, respectively. I and II are propor-
tional to the density of corpse-carrying ants (a), and III is proportional to the density of
non-carrying ants (α1, α2, α3, and α4 are empirical constants). φc is a non-local term which
introduces a short-range interaction between workers and corpses:
\[
\phi_c(x) = \frac{1}{2\Delta} \int_{x-\Delta}^{x+\Delta} c(z)\, dz
\]
where ∆ is a small radius of perception within which workers can detect corpses (ded-
icated experimental measurements lead to a characteristic radius of 0.5 cm < ∆ < 1.0
cm). The dropping rate per laden ant (II) increases with φc and reaches the asymp-
totic value vα1. The picking rate (III) results from the presence
of non-carrying ants picking up available corpses. It decreases when φc increases. There-
fore, according to III, cluster size acts as a negative feedback on the picking rate,
since φc is a local indicator of cluster size. As a result of II and III, clusters form and
their growth inhibits the further growth of other clusters. A standard stability analysis,
where a perturbation around the unique homogeneous steady state (cs, as) is introduced
(c = cs + δc0 e^(ωt+iλx); a = as + δa0 e^(ωt+iλx)), leads to the characteristic equation:
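\[
\omega^2 + (-\Gamma + \Phi + D\lambda^2)\,\omega - \Gamma D \lambda^2 = 0 \tag{13.4}
\]
where
\[
\Gamma = \frac{\sin(\lambda\Delta)}{\lambda\Delta}
  \left[ \frac{\alpha_1 \alpha_2 a_s}{(\alpha_2 + c_s)^2}
       + \frac{\alpha_3 \rho\, c_s}{(\alpha_4 + c_s)^2} \right]
  - \frac{\alpha_3 \rho}{\alpha_4 + c_s},
\qquad
\Phi = k_d + \frac{\alpha_1 c_s}{\alpha_2 + c_s}.
\]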
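For readers who wish to check equation (13.4), the following sketch retraces the linearization. It rests on our own reading that Γ and Φ are the derivatives of the reaction term Ω with respect to c and a at the steady state, any constant prefactor such as v being absorbed into them.

```latex
% Perturbation of the non-local term, with c = c_s + \delta c_0 e^{\omega t + i\lambda x}:
\[
\phi_c - c_s
  = \frac{\delta c_0\, e^{\omega t}}{2\Delta} \int_{x-\Delta}^{x+\Delta} e^{i\lambda z}\, dz
  = \frac{\sin(\lambda\Delta)}{\lambda\Delta}\; \delta c_0\, e^{\omega t + i\lambda x}.
\]
% Linearized equations (13.1)-(13.2), writing \delta\Omega = \Gamma\,\delta c + \Phi\,\delta a:
\[
\omega\, \delta c_0 = \Gamma\, \delta c_0 + \Phi\, \delta a_0, \qquad
\omega\, \delta a_0 = -\Gamma\, \delta c_0 - \Phi\, \delta a_0 - D\lambda^2\, \delta a_0 .
\]
% A non-trivial solution requires the determinant of this linear system to vanish:
\[
(\omega - \Gamma)(\omega + \Phi + D\lambda^2) + \Gamma\Phi = 0
\;\Longrightarrow\;
\omega^2 + (-\Gamma + \Phi + D\lambda^2)\,\omega - \Gamma D \lambda^2 = 0 .
\]
```

The factor sin(λ∆)/(λ∆) in Γ is thus the signature of the finite perception radius, and setting λ = 0 leaves ω(ω + Φ − Γ) = 0, which always has the root ω = 0.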
Solving equation (13.4) for ω yields the rate of growth ω(λ) of the perturbation for
a given wave number λ. Here ω(λ) exhibits a finite range of unstable modes that
includes the marginally stable mode ω(0) = 0 (Fig. 13.4). This is a well-known
property of systems involving a conservation law. Furthermore, as is usual with such
models, the most unstable wave number, that is, the one for which ω(λ) is maximum,
is proportional to corpse density. In other words, the analysis predicts (1) that in the
vicinity of the homogeneous state, doubling corpse density should lead to twice as
many piles (this situation may change over time as the system relaxes away from the
homogeneous state, since other unstable wave numbers may become amplified); (2) that
doubling the arena’s diameter while keeping the density constant should lead to twice
as many piles; (3) that a critical density of corpses cc (in corpses m−1) exists below which
no aggregation occurs.
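The growth rate ω(λ) can be evaluated numerically along the following lines. This is a minimal sketch under several assumptions of ours, not a reproduction of figure 13.4: Γ and Φ are multiplied by v so that equation (13.4) is dimensionally consistent with D in m2 s−1; the steady state is found by requiring Ω = 0 with the total corpse density c0 shared between the ground and the carrying ants (cs + as = c0); and ρ is read from the caption of figure 13.3 as 40 ants spread over the perimeter of the small arena. The function names are hypothetical.

```python
import math

# Parameter values from the caption of figure 13.3, in SI units.
V = 1.6e-2                                    # walking speed, m/s
KD = 0.75                                     # spontaneous dropping rate, 1/m
A1, A2, A3, A4 = 31.75, 1000.0, 3.125, 50.0   # alpha_1 .. alpha_4, 1/m
DELTA = 0.01                                  # perception radius, m
D = 1.3e-3                                    # diffusion coefficient, m^2/s
RHO = 40 / (math.pi * 0.25)                   # assumed reading of the caption, 1/m


def steady_state(c0, rho):
    """Homogeneous state (cs, as) with Omega = 0 and cs + as = c0, by bisection."""
    def carried(cs):
        return (A3 * rho * cs / (A4 + cs)) / (KD + A1 * cs / (A2 + cs))
    lo, hi = 0.0, c0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid + carried(mid) < c0:
            lo = mid
        else:
            hi = mid
    cs = 0.5 * (lo + hi)
    return cs, c0 - cs


def growth_rate(lam, c0, rho):
    """Largest root omega of equation (13.4) for the wave number lam (1/m)."""
    cs, a_s = steady_state(c0, rho)
    x = lam * DELTA
    s = 1.0 if x == 0 else math.sin(x) / x
    gamma = V * (s * (A1 * A2 * a_s / (A2 + cs) ** 2 + A3 * rho * cs / (A4 + cs) ** 2)
                 - A3 * rho / (A4 + cs))
    phi = V * (KD + A1 * cs / (A2 + cs))
    b = -gamma + phi + D * lam ** 2
    disc = b * b + 4.0 * gamma * D * lam ** 2
    return 0.5 * (-b + math.sqrt(max(disc, 0.0)))


c0 = 100 / (math.pi * 0.25)   # small arena, 100 corpses
lam_star = max((0.5 * k for k in range(1, 601)),
               key=lambda lam: growth_rate(lam, c0, RHO))
print("most unstable wave number (1/m):", lam_star)
print("corresponding number of piles:", lam_star * math.pi * 0.25 / (2 * math.pi))
```

Scanning the wave number on a grid is crude but sufficient to locate the maximum of ω(λ) and to see that the unstable band is bounded from above.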
Comparison of the model’s predictions with experimental results. As shown in
figure 13.5, the time course of the average number of piles and the time at
which the maximum number of piles is reached, as given by the model, are in close
agreement with the experiments in the four conditions studied. In particular, the
predictions of the stability analysis are confirmed in the initial phase (up to the
maximum pile number): (1) doubling the density leads to a doubling of the number of
piles; (2) doubling the arena’s diameter while keeping the same density also leads to
twice as many piles; (3) in experiments performed with an initial density of corpses
(13 corpses m−1) below the critical density cc, no stable clusters were observed. In
situations where several piles coexist after 24 or 48 hours (far from the homogeneous
state), although no strict regularity may be noticed, a critical distance exists between
two consecutive piles below which only one of them can "survive" in the long term, as
shown in figure 13.6. After 24 hours, with the small arena and whatever the initial
density of corpses, the presence of two consecutive piles within 20 cm of each other is
very unlikely. In any case, the distance between piles is never less than 10 cm. The
most frequent distribution, with piles located on opposite sides of the arena, is
observed in 50% of the cases. The corresponding theoretical distribution is not
significantly different from the experimental one, and both distributions
differ significantly from a random distribution (Fig. 13.6).
Figure 13.4: Stability analysis of the steady states. Solution of the characteristic equation as a
function of the wave number λ for the experimental conditions ∅ = 25 cm, 100 corpses and 200 corpses.
The parameter values are those given in the caption of figure 13.3.
Figure 13.5: Evolution as a function of time of the mean number of clusters (a cluster contains at
least 5 corpses) obtained from 20 integrations of the model equations (full lines) and of the number of
clusters obtained experimentally (average and SD are given for 6 experiments per condition) in four
experimental conditions (a: ∅ = 25 cm, 100 corpses; b: ∅ = 25 cm, 200 corpses; c: ∅ = 50 cm,
200 corpses; d: ∅ = 50 cm, 400 corpses). The initial conditions (spatial distribution of corpses) are
randomly set around the value cs. The parameter values used in the model are those given in the
caption of figure 13.3.
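A numerical integration of equations (13.1)-(13.3) of the kind summarized in figure 13.5 can be sketched as follows. The scheme is our own choice (explicit Euler steps with a centered finite-difference Laplacian on a periodic grid); the text does not say which method was used. The grid size, time step, noise amplitude, the choice of starting with all corpses on the ground, and the reading ρ = 40/(π∅) are likewise assumptions made for illustration, and φc is approximated by a plain average over the cells lying within ∆.

```python
import math
import random

V, KD = 1.6e-2, 0.75                          # m/s, 1/m
A1, A2, A3, A4 = 31.75, 1000.0, 3.125, 50.0   # alpha_1 .. alpha_4, 1/m
DELTA, D = 0.01, 1.3e-3                       # m, m^2/s


def simulate(n_corpses, diameter, t_end, n_cells=157, dt=0.004, noise=0.01, seed=0):
    """Integrate the model on a ring; return (c, a, dx) at time t_end (seconds)."""
    rng = random.Random(seed)
    perimeter = math.pi * diameter
    dx = perimeter / n_cells
    assert D * dt / dx ** 2 <= 0.5, "explicit diffusion step would be unstable"
    m = max(1, round(DELTA / dx))      # half-width of the perception window, in cells
    rho = 40.0 / perimeter             # assumed density of non-carrying ants
    c0 = n_corpses / perimeter
    c = [c0 * (1.0 + noise * rng.uniform(-1.0, 1.0)) for _ in range(n_cells)]
    a = [0.0] * n_cells                # assumption: no corpse is being carried at t = 0
    for _ in range(int(t_end / dt)):
        phi = [sum(c[(i + k) % n_cells] for k in range(-m, m + 1)) / (2 * m + 1)
               for i in range(n_cells)]
        omega = [V * (KD * a[i] + A1 * a[i] * phi[i] / (A2 + phi[i])
                      - A3 * rho * c[i] / (A4 + phi[i])) for i in range(n_cells)]
        lap = [(a[i - 1] - 2.0 * a[i] + a[(i + 1) % n_cells]) / dx ** 2
               for i in range(n_cells)]
        c = [c[i] + dt * omega[i] for i in range(n_cells)]
        a = [a[i] + dt * (-omega[i] + D * lap[i]) for i in range(n_cells)]
    return c, a, dx


c, a, dx = simulate(100, 0.25, t_end=1.0)
print("total number of corpses:", (sum(c) + sum(a)) * dx)
```

Because whatever leaves c enters a, the total number of corpses is conserved by construction, which gives a convenient sanity check. Reaching the 24-hour horizon of the experiments with this time step takes tens of millions of iterations, so a compiled or vectorized implementation would be needed in practice. Counting piles in the resulting density profile requires a thresholding convention that the text does not specify, so it is not attempted here.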
Figure 13.6: Comparison between theoretical (N = 47) and experimental (N = 21) distributions of
distances between two consecutive clusters in the conditions ∅ = 25 cm, 100 corpses and ∅ = 25 cm,
200 corpses in the case where only two clusters remain after 24 hours. There is no statistical difference
between experimental and theoretical distributions (Kolmogorov-Smirnov test performed on distances,
P > 0.334, Z = 0.945) and both distributions are statistically different from a random distribution,
N = 5 × 10³ (Kolmogorov-Smirnov test performed on distances, P < 0.05, Z = 1.543 and P < 0.001,
Z = 3.636 respectively). The random distribution is generated as follows: the positions of the two
clusters are independent of each other, except that they cannot overlap. The probability that the distance
is less than the length of the pile (L = 4 cm) is 0, and the probability density P(l) of a distance l
greater than L and smaller than 0.5π∅ is P(l) = 1/(0.5π∅ − L).
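The random reference distribution described in this caption is uniform between L and 0.5π∅, so a Kolmogorov-Smirnov distance to it can be computed directly from its cumulative distribution. The sketch below is an illustration only: it uses the analytic cumulative distribution (a one-sample statistic) rather than a generated sample of 5 × 10³ distances, so its output is not expected to match the Z values quoted above, and the distances passed to it are made up.

```python
import math


def null_cdf(l, diameter_m, pile_len_m=0.04):
    """Cumulative distribution of the random distance described in the caption."""
    half = 0.5 * math.pi * diameter_m
    if l <= pile_len_m:
        return 0.0
    if l >= half:
        return 1.0
    return (l - pile_len_m) / (half - pile_len_m)


def ks_statistic(distances, diameter_m):
    """One-sample Kolmogorov-Smirnov distance between data and the null CDF."""
    xs = sorted(distances)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = null_cdf(x, diameter_m)
        # Compare the null CDF with the empirical CDF just after and just before x.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Hypothetical distances (m) between two surviving piles in the small arena.
made_up = [0.39, 0.37, 0.38, 0.30, 0.39, 0.35, 0.39, 0.33]
print("KS distance to the random distribution:", ks_statistic(made_up, 0.25))
```

A sample concentrated near half the perimeter, that is, piles sitting on opposite sides of the arena, gives a distance close to 1, whereas a sample spread evenly between L and 0.5π∅ gives a distance close to 0.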
13.4
Discussion
The observation of cemetery formation in ant colonies suggests a LALI mechanism
based on individual worker behavior. It is a peculiar example of such mechanisms
in that it involves animal behavior and not physical and chemical morphogens. All
the behavioral parameters of the corresponding model were quantified in dedicated
experiments. When loaded with the experimental parameter values, the model not only
leads to the formation of patterns that reproduce the properties of cemetery formation,
but also predicts how the pattern is affected by such experimental characteristics as
corpse density and arena size. Experiments aimed at testing the model’s predictions
show that the predictions are indeed satisfied. This is a strong indication that the
formation of cemeteries in ants is an example of LALI morphogenesis, which makes
it one of the first convincing documented biological examples and certainly the first
involving higher organisms. Our work should encourage researchers to look for such
mechanisms in other collective behavioral patterns such as network formation [33,34],
nest construction [29,30,31,39] or herd patterns [40,41], where it could be easier to
identify the underlying activation and inhibition mechanisms than in other systems.
Acknowledgments.
This work was supported by the Santa Fe Institute, by a grant
from the Conseil Régional Midi-Pyrénées and a grant from the Groupement d’Intérêt
Scientifique "Sciences de la Cognition". We thank S. Foucaud and F. Villeneuve-
Séguier for technical assistance and discussions and P. Borckmans, G. Dewel and R.
Lefever for helpful comments on an earlier draft.
13.5
Bibliography
[1] P. Ball, The Self-Made Tapestry (Oxford, New York) (1998).
[2] J. D. Murray, Mathematical Biology (Springer, Berlin) (1990).
[3] B. Ermentrout, J. Campbell, and G. F. Oster, Veliger 28, 369-388 (1986).
[4] H. Meinhardt, The Algorithmic Beauty of Sea Shells (Springer, Berlin) (1995).
[5] N. V. Swindale, Proc. Roy. Soc. London B 208, 243-264 (1980).
[6] J. Bascompte, and R. V. Solé, Trends Ecol. Evol. 13, 173-174 (1998).
[7] A. Turing, Phil. Trans. Roy. Soc. London B 237, 37-72 (1952).
[8] P. Glansdorff, and I. Prigogine, Thermodynamics of Structure, Stability, and
Fluctuations (Wiley, London) (1971).
[9] A. Gierer, and H. Meinhardt, Kybernetik 12, 30-39 (1972).
[10] L. A. Segel, and J. L. Jackson, J. theor. Biol. 37, 545-549 (1972).
[11] H. Meinhardt, Models of biological pattern formation, (Academic Press, Lon-
don) (1982).
[12] G. F. Oster, Math. Biosc. 90, 265-286 (1988).
[13] P. Manneville, Dissipative structures and weak turbulence (Academic Press, San
Diego) (1990).
[14] V. Castets, E. Dulos, J. Boissonade, and P. De Kepper, Phys. Rev. Lett. 64, 2953-
2957 (1990).
[15] R. Lefever, and O. Lejeune, Bull. Math. Biol. 59, 263-294 (1997).
[16] P. Kareiva, and G. Odell, Am. Nat. 130, 233-270 (1987).
[17] P. Kareiva, in Mathematical Ecology, S. Levin and T. Hallam eds., Springer Lec-
ture Notes in Biomathematics 54, 368-389 (1984).
[18] P. Kareiva, Nature 326, 388-390 (1987).
[19] M. Freeman, Nature 408, 313-319 (2000).
[20] H. Meinhardt, Int. J. Dev. Biol. 45, 177-188 (2001).
[21] W. J. Buikema, and R. Haselkorn, Proc. Natl. Acad. Sci. 98, 2729-2734 (2001).
[22] H. S. Yoon, and J. W. Golden, Science 282, 935-938 (1998).
[23] E. F. Keller, and L. A. Segel, J. theor. Biol. 30, 235-248 (1971).
[24] H. F. Nijhout, Proc. Roy. Soc. London B 239, 81-113 (1990).
[25] A. Kondo, and R. Asai, Nature 376, 765-768 (1995).
[26] K. J. Painter, P. K. Maini, and H. G. Othmer, Proc. Natl. Acad. Sci. 96, 5549-5554
(1999).
[27] E. F. Keller, and L. A. Segel, J. theor. Biol. 26, 399-415 (1970).
[28] S. Sawai, Y. Maeda, and Y. Sawada, Phys. Rev. Lett. 85, 2212-2215 (2000).
[29] J. L. Deneubourg, Ins. Soc. 24, 117-130 (1977).
[30] E. Bonabeau, G. Theraulaz, J. L. Deneubourg, N. Franks, O. Rafelsberger,
J. L. Joly, and S. Blanco, Phil. Trans. R. Soc. Lond. B 353, 1561-1576 (1998).
[31] S. Camazine, J. Sneyd, M. J. Jenkins, and J. D. Murray, J. theor. Biol. 147, 553-
571 (1990).
[32] B. J. Cole, and D. Cheshire, Am. Nat. 148, 1-15 (1996).
[33] L. Edelstein-Keshet, J. Math. Biol. 32, 303-328 (1994).
[34] L. Edelstein-Keshet, J. Watmough, and G. B. Ermentrout, Behav. Ecol. Sociobiol. 36,
119-133 (1995).
[35] C. P. Haskins, and E. F. Haskins, Psyche 81, 258-267 (1974).
[36] D. F. Howard, and W. R. Tschinkel, Behaviour 56, 157-180 (1976).
[37] H. Ataya, and A. Lenoir, Ins. Soc. 31, 20-33 (1984).
[38] D. Gordon, J. Chem. Ecol. 9, 105-111 (1983).
[39] V. Skarka, J. L. Deneubourg, and M. R. Belic, J. theor. Biol. 147, 1-16 (1990).
[40] S. Gueron, and S. A. Levin, J. theor. Biol. 165, 541-552 (1993).
[41] J. K. Parrish, and L. Edelstein-Keshet, Science 284, 99-101 (1999).
Part Seven
Synthesis and Conclusions
Michel Droz
Département de Physique Théorique, Université de Genève
24, quai Ernest Ansermet, CH-1211 Genève 4
Chapter 14
Conclusion
14.1
Synthesis
The lectures given by the various speakers have shown us that complexity can take
very diverse forms and that each class of problems has its own specific features.
Nevertheless, some unifying concepts emerge.
Let us first consider the problem of simulating physical systems in terms of
cellular automata, introduced by B. Chopard. As we have seen, conceptually very simple
rules lead to complex behavior. The Game of Life is a typical example. The synchronous
application of a local rule that is particularly simple to state gives rise to a
Turing machine [1]. There is an abundant literature on the Game of Life, and many
authors have discovered initial configurations which, under iteration, solve nontrivial
problems. Fairly complete information on the subject can be found on the internet.
Software running under various operating systems and able to simulate many automaton
rules can be downloaded [2]. Among the initial configurations leading to the solution
of nontrivial problems, the one that generates all the prime numbers is particularly
interesting. Another remarkable cellular automaton is Langton's, which leads to
self-reproducing patterns [3].
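To make concrete how simple the local rule is, here is a minimal Python sketch of one synchronous update of the Game of Life, using the standard rule (a dead cell with exactly three live neighbors is born, a live cell with two or three live neighbors survives). The representation of the grid as a set of live cells and the function name are our own choices.

```python
from collections import Counter


def life_step(alive):
    """One synchronous update of the Game of Life on an unbounded grid.

    `alive` is a set of (row, col) pairs; a new set is returned.
    """
    # Count, for every cell, how many live neighbours it has.
    counts = Counter((r + dr, c + dc)
                     for r, c in alive
                     for dr in (-1, 0, 1)
                     for dc in (-1, 0, 1)
                     if (dr, dc) != (0, 0))
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in alive)}


# A glider: after four updates the same shape reappears one cell further
# along the diagonal.
glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}
state = glider
for _ in range(4):
    state = life_step(state)
print(sorted(state))
```

All cells are updated from the same old configuration, which is what "synchronous" means here; updating cells one at a time in place would define a different automaton.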
A difficulty inherent in this approach is that there is a very large number of
possible rules and that most of them are of no interest. Selecting the rules that
lead to nontrivial behavior is itself a complex problem. It is at this level that an
approach of the genetic-algorithm type can prove useful.
The notion of symmetry (or conservation law) plays a fundamental role in natural
phenomena, and respecting symmetry properties imposes important constraints on the
choice of “reasonable” rules when building a model. Yet this is not sufficient, as the
study of lattice gas models simulating hydrodynamics has shown. To make automaton
rules credible, one must show through an (often difficult) theoretical analysis that
the correct behavior is reproduced at coarser scales [4].
The cellular automata approach thus has its limits and its weaknesses. Nevertheless,
the “spirit” of the method is very important for understanding complex phenomena.
This approach often makes it possible to identify the essential processes governing a
complex phenomenon without necessarily describing all of its details. This way of
proceeding is well summarized by the thought of A. Einstein, “Everything should be
made as simple as possible, but not simpler”.
Another aspect of complexity emerged in the study of self-organized critical (SOC)
systems presented by P. De Los Rios. It is a fact that many natural systems exhibit
power-law behavior [5] (avalanches, earthquakes, financial markets, critical
phenomena...). These phenomena appear spontaneously, that is, without any need to tune
control parameters to particular values. SOC systems behave in a complex way, and it
is reasonable to hope that studying them will contribute important elements to the
understanding of complexity.
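To make the idea of avalanches without parameter tuning concrete, here is a minimal
sketch of the Bak-Tang-Wiesenfeld sandpile, a standard toy model of SOC. It is offered
as an illustration only and is not necessarily the model treated in the lectures; the
function name `drive_sandpile`, the grid size, and the number of grains are arbitrary
assumptions.

```python
import random

# Illustrative sketch: names and parameter values are hypothetical.
def drive_sandpile(size=20, grains=5000, seed=0):
    """Drop grains one at a time; return (final grid, avalanche sizes)."""
    rng = random.Random(seed)
    grid = [[0] * size for _ in range(size)]
    avalanches = []
    for _ in range(grains):
        i, j = rng.randrange(size), rng.randrange(size)
        grid[i][j] += 1
        topplings = 0
        unstable = [(i, j)]
        while unstable:
            x, y = unstable.pop()
            if grid[x][y] < 4:
                continue
            # A site holding 4 or more grains topples: one grain to each neighbor.
            grid[x][y] -= 4
            topplings += 1
            if grid[x][y] >= 4:
                unstable.append((x, y))
            for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                # Grains pushed over the edge are lost (open boundary).
                if 0 <= nx < size and 0 <= ny < size:
                    grid[nx][ny] += 1
                    if grid[nx][ny] >= 4:
                        unstable.append((nx, ny))
        avalanches.append(topplings)  # avalanche size = number of topplings
    return grid, avalanches

grid, avalanches = drive_sandpile()
```

The only driving is the slow addition of grains; no threshold or rate is adjusted
during the run. A histogram of `avalanches` on logarithmic axes would be the natural
way to look for the power-law behavior discussed above.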
Unfortunately, few exact results have been obtained for such systems, and no generic
mechanism has been identified.
The SOC concept has nevertheless been very useful in exporting the concepts of
“scaling”, or “scaling laws”, to disciplines outside physics. The simple act of
analyzing well-known phenomena in terms of well-chosen variables (such as scaling
variables) makes it possible to deduce generic properties that are not visible in a
more naive analysis.
Another important class of complex problems consists of those for which one seeks an
“optimal solution”, that is, a solution that minimizes a given cost function. The
difficulty is that the system has an astronomically large number of possible states,
so they cannot all be enumerated in order to pick the best one. Moreover, the various
minimum-cost states are not “neighbors” in state space. One must therefore resort to
stochastic algorithms that explore, in an “intelligent” way, a restricted fraction of
the possible states and lead to a good approximate solution of the problem. Such
problems are common in the physics of disordered systems [6], in operations
research [7], and in the theory of evolution, for example.
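One well-known stochastic algorithm of this kind is simulated annealing, the subject
of reference [7]. The sketch below applies it to a small travelling-salesman instance,
where the cost function is the tour length. The function names, the geometric cooling
schedule, the segment-reversal move, and all parameter values are assumptions made for
this illustration, not a reproduction of the procedure in [7].

```python
import math
import random

# Illustrative sketch: names, schedule and parameters are hypothetical choices.
def tour_length(tour, pts):
    """Cost function: total length of the closed tour."""
    return sum(math.dist(pts[tour[k]], pts[tour[(k + 1) % len(tour)]])
               for k in range(len(tour)))

def anneal(pts, steps=20000, t_start=1.0, t_end=1e-3, seed=0):
    rng = random.Random(seed)
    tour = list(range(len(pts)))
    cost = tour_length(tour, pts)
    best, best_cost = tour[:], cost
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / steps)
        # Trial move: reverse a randomly chosen segment of the tour.
        i, j = sorted(rng.sample(range(len(pts)), 2))
        candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
        new_cost = tour_length(candidate, pts)
        delta = new_cost - cost
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            tour, cost = candidate, new_cost
            if cost < best_cost:
                best, best_cost = tour[:], cost
    return best, best_cost

rng = random.Random(1)
cities = [(rng.random(), rng.random()) for _ in range(30)]
best, best_cost = anneal(cities)
```

Only a tiny fraction of the 30! possible orderings is ever visited. Accepting some
uphill moves at high temperature is what lets the search move between low-cost states
that are not neighbors in state space.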
As M. Tomassini showed, genetic algorithms are of this type. In this case complexity
does not emerge from the dynamics of many interacting agents, but from the selection,
among a set of algorithms, of the candidates best suited to solving a given problem.
This strategy, based on the principles of biological evolution, uses the concepts of
mutation, recombination, and selection. Such an approach is therefore applicable to a
very broad class of problems. Nevertheless, it has weaknesses. Implementing the method
is often highly heuristic and therefore poorly controlled. The convergence rate can be
very low, and the evolution therefore very slow.
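The three ingredients named above (mutation, recombination, selection) can be
sketched on a deliberately trivial problem: maximizing the number of 1 bits in a bit
string ("OneMax"). Here the candidates are bit strings rather than algorithms, and the
tournament selection, one-point crossover, elitism, and all parameter values are
assumptions of this sketch rather than the scheme presented in the lectures.

```python
import random

# Illustrative sketch: `evolve` and its parameters are hypothetical.
def evolve(n_bits=40, pop_size=30, generations=60, p_mut=0.02, seed=0):
    rng = random.Random(seed)
    fitness = sum  # OneMax: the fitness is the number of 1 bits
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = [max(map(fitness, pop))]
    for _ in range(generations):
        # Elitism: the current best individual is copied unchanged.
        children = [max(pop, key=fitness)[:]]
        while len(children) < pop_size:
            # Selection: each parent wins a two-candidate tournament.
            mum = max(rng.sample(pop, 2), key=fitness)
            dad = max(rng.sample(pop, 2), key=fitness)
            # Recombination: one-point crossover.
            cut = rng.randrange(1, n_bits)
            child = mum[:cut] + dad[cut:]
            # Mutation: flip each bit with probability p_mut.
            child = [b ^ 1 if rng.random() < p_mut else b for b in child]
            children.append(child)
        pop = children
        history.append(max(map(fitness, pop)))
    return max(pop, key=fitness), history

best, history = evolve()
```

`history` records the best fitness per generation. The heuristic character noted
above is visible in the number of free choices (`pop_size`, `p_mut`, the selection
scheme), none of which is dictated by the problem.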
Neural networks are another class of bio-inspired systems. M. Tomassini discussed
so-called “layered” networks at length. There are other types of neural networks,
called “associative memory” networks [8], which are also very interesting and lend
themselves better to theoretical study than layered networks. There are strong
analogies with the theory of “spin glasses” familiar to physicists [6].
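A minimal sketch of an associative memory of the Hopfield type may help here: patterns
of ±1 "spins" are stored in symmetric couplings through a Hebbian rule, and a
corrupted pattern is restored by repeatedly aligning each unit with its local field.
The function names `train` and `recall`, the two stored patterns, and the sequential
update order are choices made for this illustration.

```python
# Illustrative sketch: `train` and `recall` are hypothetical helper names.
def train(patterns):
    """Hebbian couplings w[i][j] = sum over patterns of p[i] * p[j], zero diagonal."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w

def recall(w, state, sweeps=5):
    """Sequentially align each unit with the sign of its local field."""
    s = list(state)
    for _ in range(sweeps):
        for i in range(len(s)):
            field = sum(w[i][j] * s[j] for j in range(len(s)))
            if field != 0:
                s[i] = 1 if field > 0 else -1
    return s

# Two mutually orthogonal patterns of eight units.
p1 = [1, 1, 1, 1, -1, -1, -1, -1]
p2 = [1, -1, 1, -1, 1, -1, 1, -1]
w = train([p1, p2])

noisy = p1[:]
noisy[0] = -noisy[0]          # corrupt one unit of the first pattern
restored = recall(w, noisy)   # the stored pattern p1 is recovered
```

The couplings are symmetric and the units take values ±1, which is the formal link
with spin-glass models mentioned in the text.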
Here again, the study of these bio-inspired systems shows how important it is to be
able, at times, to step outside one's own field of specialization and borrow proven
tools from other disciplines.
Then came the immense problem of studying the genome [9] and proteomics, via
bioinformatics. As R. Gras showed us, some problems (gene recognition in DNA, for
example) belong to the class of minimum-cost solution searches discussed above.
The problems posed in genomics seem to me not only complex but also “complicated”. For
every rule stated, one immediately finds a large number of exceptions. Are these
difficulties intrinsic to the subject, or does this situation reflect the fact that
genomics is still a young science?
Are there different levels of complexity in living organisms? Is a human more complex
than a bean, even though the bean's DNA is much larger than that of a human? I have no
answers to these questions.
The last examples of complexity were provided by J.-L. Deneubourg. The study of the
behavior of social insects has the advantage of bringing together experimental and
theoretical methods. The various examples discussed showed in spectacular fashion how
agents that obey simple rules and know nothing of the global dynamics of the society
can produce rich emergent behavior. Other examples of surprising collective behavior
can be observed every day in the flight of flocks of birds, for which simple models
have been proposed [10].
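As an illustration of how a purely local rule can yield collective order, the sketch
below implements a Vicsek-type alignment model: each agent adopts the mean heading of
the agents within a fixed radius, plus a random perturbation. This is one simple
flocking rule among several and is not necessarily one of the models described in
[10]; all names and parameter values are assumptions of this sketch.

```python
import cmath
import math
import random

# Illustrative sketch: names and parameter values are hypothetical.
def wrap(d, box):
    """Shortest signed separation on a periodic box of side `box`."""
    return (d + box / 2) % box - box / 2

def flock_step(pos, theta, rng, radius=1.0, speed=0.05, noise=0.1, box=5.0):
    """One synchronous update of all positions and headings."""
    new_theta = []
    for xi, yi in pos:
        total = 0j
        for (xj, yj), tj in zip(pos, theta):
            if wrap(xi - xj, box) ** 2 + wrap(yi - yj, box) ** 2 <= radius ** 2:
                total += cmath.exp(1j * tj)  # neighbor's heading as a unit vector
        # Adopt the mean neighbor heading, perturbed by a random angle.
        new_theta.append(cmath.phase(total)
                         + noise * rng.uniform(-math.pi, math.pi))
    new_pos = [((x + speed * math.cos(t)) % box, (y + speed * math.sin(t)) % box)
               for (x, y), t in zip(pos, new_theta)]
    return new_pos, new_theta

def order_parameter(theta):
    """Equals 1 when all headings agree; small when headings are random."""
    return abs(sum(cmath.exp(1j * t) for t in theta)) / len(theta)

rng = random.Random(0)
pos = [(rng.uniform(0, 5), rng.uniform(0, 5)) for _ in range(50)]
theta = [rng.uniform(-math.pi, math.pi) for _ in range(50)]
for _ in range(100):
    pos, theta = flock_step(pos, theta, rng)
alignment = order_parameter(theta)
```

No agent has access to `alignment`; it is a global quantity computed by the observer,
which is the sense in which the ordered motion is emergent.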
As we have seen during these lectures, complexity is a vast subject that takes many
forms and is therefore, by its nature, not precisely defined. It is nonetheless a
universal subject, in which an ever-growing community is taking an interest. The
(non-exhaustive) list below of a few general books [11,12,13,14] and of major internet
sites [15] devoted to complexity attests to this.
In conclusion, these lectures have shown us that, whatever one's field of research, it
is good from time to time to look up and take advantage of the achievements made in
other fields.
14.2
Bibliography
[1] E. R. Berlekamp, J. H. Conway and R. K. Guy, Winning Ways for your Mathematical
Plays, Academic Press, New York (1982).
[2] There are a great many interesting sites. For example:
http://www.math.com/students/wonders/life/life.html
http://radicaleye.com/lifepage.
[3] C. G. Langton, Self-reproduction in CA, Physica D 34, 259 (1984).
[4] B. Chopard and M. Droz, Cellular Automata and Modeling of Physical Systems,
Cambridge University Press (1998).
[5] Per Bak, How Nature Works: The Science of Self-Organized Criticality,
Springer-Verlag (1996).
[6] K. Binder and A. P. Young, Spin glasses: Experimental facts, theoretical
concepts, and open questions, Rev. Mod. Phys. 58, 801-976 (1986).
[7] S. Kirkpatrick, C. D. Gelatt and M. P. Vecchi, Optimization by Simulated
Annealing, Science 220, 671 (1983).
[8] B. Müller and J. Reinhardt, Neural Networks: An Introduction, Springer-Verlag
(1990). This book includes various programs that simulate neural networks.
[9] A good introduction for the non-specialist can be found in: T. Brown, Genomes,
BIOS Scientific Publishers, Oxford (2002).
[10] G. W. Flake, The Computational Beauty of Nature, MIT Press, Cambridge
(1998). Many numerical simulations of the complex systems described in this book
can be downloaded from
http://mitpress.mit.edu/flake/.
[11] G. A. Cowan, D. Pines and D. Meltzer (eds.), Complexity: Metaphors, Models and
Reality, Addison-Wesley (1994).
[12] Y. Bar-Yam, Dynamics of Complex Systems, Addison-Wesley (1997).
[13] S. Y. Auyang, Foundations of Complex-Systems Theories, Cambridge University
Press (1998).
[14] T. Daussiox and M. Droz (eds.), Physique de la Complexité, Edt. Frontières
(1995).
[15] http://physics.pdx.edu/~semuraj/complex1.htm