
Natural and Artificial Evolution

Evolution is the biological theory that animals and plants have their origin in other types, and that the distinguishable differences are due to modifications in successive generations [1]. Natural evolution is not a random process: it is based on random variations, but some are rejected while others are preserved according to objective evaluations. Only changes that are beneficial to the individuals are likely to spread into subsequent generations. Darwin, in On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life (1859) [2], called this principle “natural selection”: a quite simple process in which random variations merely “afford materials”.

When natural selection causes variations to accumulate in one specific direction, the result strikingly resembles the outcome of an intelligent optimization process. However, the process only requires the ability to assess the effect of random changes, not the ability to design intelligent modifications. Several researchers independently tried to replicate this characteristic in order to solve difficult problems more efficiently. Accordingly, an evolutionary algorithm (EA) is an algorithm loosely inspired by the theory of evolution, and evolutionary computation (EC) is the offshoot of computer science focusing on such algorithms. The definition is deliberately vague, since the boundaries of the field are not, and cannot be, sharply defined. Evolutionary computation is a branch of computational intelligence, and it is also included in the broader framework of bio-inspired heuristics.

Evolutionary computation does not have a single recognizable origin. Some scholars identify its starting point in 1950, when Alan Turing wrote “Computing machinery and intelligence” [3], drawing attention to the similarities between learning and evolution. Others point to inspiring ideas that appeared later in that decade, even though the lack of computational power impaired their diffusion in the broader scientific community [4]. More commonly, the birth of evolutionary computation is set in the 1960s, with the appearance of three independent research lines: John Holland’s genetic algorithms (GA), Lawrence Fogel’s evolutionary programming (EP), and Ingo Rechenberg’s and Hans-Paul Schwefel’s evolution strategies (ES). These three paradigms monopolized the field until the 1990s, when John Koza entered the arena with genetic programming (GP).

In evolutionary algorithms, a single candidate solution is termed an individual; the set of all candidate solutions that exist at a particular time is called the population. Evolution proceeds through discrete steps called generations. In each generation, the population is first expanded and then collapsed, mimicking the processes of breeding and struggling for survival. Some evolutionary algorithms do not store a collection of distinct individuals; instead, evolution is represented by the variation of the statistical parameters that describe the population.
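The expand-and-collapse structure of a generation can be sketched in a few lines of Python. The snippet below is a minimal, purely illustrative example on a toy problem (maximizing the number of ones in a bit string); all constants and helper functions are hypothetical and do not refer to any specific framework.

```python
import random

# Minimal sketch of the generational loop: expand the population with
# offspring, then collapse it back to a fixed size by keeping the fittest.
# All constants and helpers below are illustrative, not from any library.

POPULATION_SIZE = 20
OFFSPRING_SIZE = 40
GENERATIONS = 100
GENOME_LENGTH = 16

def random_individual():
    # a fixed-length bit string
    return [random.randint(0, 1) for _ in range(GENOME_LENGTH)]

def fitness(individual):
    # toy problem ("OneMax"): maximize the number of ones
    return sum(individual)

def generate_offspring(population, amount):
    # placeholder reproduction: copy a random parent and flip one bit
    offspring = []
    for _ in range(amount):
        child = random.choice(population)[:]
        locus = random.randrange(len(child))
        child[locus] = 1 - child[locus]
        offspring.append(child)
    return offspring

population = [random_individual() for _ in range(POPULATION_SIZE)]
for generation in range(GENERATIONS):
    population += generate_offspring(population, OFFSPRING_SIZE)   # expand
    population.sort(key=fitness, reverse=True)
    population = population[:POPULATION_SIZE]                      # collapse
```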

Most of the jargon of evolutionary computation mimics the precise terminology of biology. The ability of an individual to solve the target problem is measured by the fitness function, which influences the likelihood that a solution will propagate its characteristics to the next generation. In some approaches individuals may die of old age, while in others they remain in the population until replaced by fitter ones.
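How exactly fitness biases propagation depends on the selection scheme. As one possible illustration, the sketch below implements fitness-proportional (“roulette wheel”) selection; tournament or rank-based selection are equally common alternatives, and the snippet assumes non-negative fitness values that are not all zero.

```python
import random

def select_parent(population, fitness_function):
    # Fitness-proportional selection: an individual is chosen with a
    # probability proportional to its fitness (assumed non-negative,
    # and not all zero).
    weights = [fitness_function(individual) for individual in population]
    return random.choices(population, weights=weights, k=1)[0]
```

With this scheme, an individual with twice the fitness of another is twice as likely to be picked as a parent.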

The word genome denotes the whole genetic material of the organism, although its actual implementation strongly differs from one approach to another. The gene is the functional unit of inheritance, or, operatively, the smallest fragment of the genome that may be modified during evolution. Genes are located at specific positions of the genome called loci. The alternative genes that may occur at a given locus are called alleles.
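One possible, purely illustrative way to map this terminology onto a plain data structure is sketched below; the choice of symbols used as alleles is arbitrary.

```python
import random

ALLELES = ['A', 'C', 'G', 'T']   # the alternative genes for each locus
GENOME_LENGTH = 10

# the genome is the whole sequence of genes
genome = [random.choice(ALLELES) for _ in range(GENOME_LENGTH)]

# gene   -> a single element, e.g. genome[3]
# locus  -> the position (index) of that element, here 3
# allele -> any of the alternative values that may appear at that locus
```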

Biologists need to distinguish between the genotype and the phenotype: the former is the complete genetic constitution of an organism; the latter is the set of observable properties produced by the interaction between the genotype and the environment. In many implementations, EC practitioners do not require such a precise distinction. The single numerical value representing the fitness of an individual is sometimes assimilated to its phenotype.
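When the distinction is kept, it usually takes the form of a decoding step between the genotype and the structure that is actually evaluated. The sketch below is a hypothetical example: a bit string (genotype) is decoded into a real number in [0, 1] (phenotype), which the fitness function then scores.

```python
def decode(genotype):
    # interpret the bit string as an integer, then rescale it to [0, 1]
    value = int(''.join(str(bit) for bit in genotype), 2)
    return value / (2 ** len(genotype) - 1)

def fitness(genotype):
    x = decode(genotype)            # the phenotype
    return -(x - 0.5) ** 2          # toy objective: prefer x close to 0.5

genotype = [1, 0, 1, 1, 0, 0, 1, 0]
print(decode(genotype), fitness(genotype))
```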

To generate the offspring, evolutionary algorithms implement both sexual and asexual reproduction. The former is named recombination; it involves two or more participants, and allows the offspring to inherit different characteristics from different parents. When recombination is achieved through an exchange of genetic material between the parents, it often takes the name of crossover. Asexual reproduction may be named replication, to indicate that a copy of an individual is created, or, more commonly, mutation, to stress that the copy is not exact. All operators exploited during reproduction can be collectively called evolutionary operators, or genetic operators, because they act at the genotype level. Almost no evolutionary algorithm takes gender into account; hence, individuals do not have distinct reproductive roles.
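For fixed-length list genomes, the two kinds of operators can be sketched as follows. One-point crossover and per-gene mutation are common textbook variants chosen purely for illustration, not the operators of any particular algorithm.

```python
import random

def one_point_crossover(parent_a, parent_b):
    # recombination: the offspring inherits a prefix from one parent
    # and the remaining suffix from the other
    cut = random.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]

def mutate(parent, alleles, rate=0.1):
    # asexual reproduction: copy the parent and randomly alter some genes
    child = parent[:]
    for locus in range(len(child)):
        if random.random() < rate:
            child[locus] = random.choice(alleles)
    return child
```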

An evolutionary algorithm typically performs better than a pure random search. This rather simple consideration is probably the main reason why EAs are sometimes exploited in the industrial world. Nevertheless, they have shown other interesting characteristics.

Unfortunately, several hidden and rather obscure details may significantly impair the efficacy of evolutionary algorithms. This may explain their relatively slow acceptance compared to other bio-inspired heuristics, such as simulated annealing or artificial neural networks [5].