Genetic programming

Updated on Sep 24, 2024

In artificial intelligence, genetic programming (GP) is a technique in which computer programs are encoded as sets of genes that are then modified (evolved) using an evolutionary algorithm, often a genetic algorithm (GA). It can be viewed as an application of such algorithms in which the space of solutions consists of computer programs. The results are computer programs that perform well on a predefined task. The methods used to encode a computer program in an artificial chromosome and to evaluate its fitness with respect to the predefined task are central to GP and remain the subject of active research.
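The encode, evaluate, and evolve cycle described above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not a reference implementation: the function set, the terminal set, truncation selection, the mutation-only variation (crossover is omitted for brevity), and every name in the block are choices made for this example.

```python
import math
import operator
import random

# Illustrative function and terminal sets (assumptions for this sketch).
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
TERMINALS = ["x", 1.0, 2.0]

def random_tree(rng, depth):
    """Grow a random expression tree with at most `depth` operator levels."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(sorted(FUNCTIONS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(node, x):
    """Evaluate a tree for a given value of the variable x."""
    if isinstance(node, tuple):
        op, left, right = node
        return FUNCTIONS[op](evaluate(left, x), evaluate(right, x))
    return x if node == "x" else node

def error(tree, cases):
    """Fitness: sum of absolute errors over the cases (lower is better)."""
    total = sum(abs(evaluate(tree, x) - y) for x, y in cases)
    return total if math.isfinite(total) else float("inf")

def mutate(rng, node, depth=2):
    """Replace one randomly chosen subtree with a freshly grown one."""
    if not isinstance(node, tuple) or rng.random() < 0.3:
        return random_tree(rng, depth)
    op, left, right = node
    if rng.random() < 0.5:
        return (op, mutate(rng, left, depth), right)
    return (op, left, mutate(rng, right, depth))

def evolve(cases, generations=30, population_size=50, seed=0):
    """Keep the better half each generation and refill with mutated copies."""
    rng = random.Random(seed)
    population = [random_tree(rng, 3) for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=lambda tree: error(tree, cases))
        survivors = population[: population_size // 2]
        children = [mutate(rng, rng.choice(survivors)) for _ in survivors]
        population = survivors + children
    return min(population, key=lambda tree: error(tree, cases))

# Predefined task for this sketch: approximate y = x*x + 1 on a few points.
cases = [(float(x), float(x * x + 1)) for x in range(-3, 4)]
best = evolve(cases)
print(best, error(best, cases))
```

Because the better half of the population is always carried over unchanged, the best error never gets worse from one generation to the next. A practical GP system would normally add crossover and some control over tree size.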

History

In 1954, pioneering work on what is today known as artificial life was carried out by Nils Aall Barricelli using some of the earliest computers. In the 1960s and early 1970s, evolutionary algorithms became widely recognized as optimization methods. Ingo Rechenberg and his group were able to solve complex engineering problems through evolution strategies, as documented in his 1971 PhD thesis and the resulting 1973 book. John Holland was highly influential during the 1970s. Once evolutionary algorithms were established in the scientific community, the first concrete steps toward studying the GP idea became possible.

In 1964, Lawrence J. Fogel, one of the earliest practitioners of the GP methodology, applied evolutionary algorithms to the problem of discovering finite-state automata. Later GP-related work grew out of the learning classifier system community, which developed sets of sparse rules describing optimal policies for Markov decision processes. In 1981 Richard Forsyth evolved tree rules to classify heart disease. The first statement of modern "tree-based" genetic programming (that is, procedural languages organized in tree-based structures and operated on by suitably defined GA-operators) was given by Nichael L. Cramer (1985). This work was later greatly expanded by John R. Koza, a main proponent of GP who has pioneered the application of genetic programming in various complex optimization and search problems. Gianna Giavelli, a student of Koza's, later pioneered the use of genetic programming as a technique to model DNA expression.

In the 1990s, GP was mainly used to solve relatively simple problems because it is very computationally intensive. More recently, thanks to improvements in GP technology and the exponential growth in CPU power, GP has produced many novel and outstanding results in areas such as quantum computing, electronic design, game playing, cyberterrorism prevention, sorting, and searching. These results include the replication or development of several post-year-2000 inventions. GP has been applied to evolvable hardware as well as to computer programs.

Developing a theory for GP has proved very difficult, and so in the 1990s GP was considered something of an outcast among search techniques.

Program representation

GP evolves computer programs, traditionally represented in memory as tree structures. Trees can be easily evaluated in a recursive manner. Every internal node holds an operator function and every terminal (leaf) node holds an operand, making mathematical expressions easy to evolve and evaluate. For this reason GP has traditionally favored programming languages that naturally embody tree structures (for example, Lisp; other functional programming languages are also suitable).
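As a minimal sketch of recursive tree evaluation, the Python fragment below stores an expression as nested tuples, with operators at the internal nodes and operands at the leaves. The tuple layout, the `FUNCTIONS` table, and the `evaluate` name are assumptions chosen for illustration; GP systems differ in how they store trees.

```python
import operator

# Operators allowed at internal nodes (an illustrative function set).
FUNCTIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
}

def evaluate(node, env):
    """Recursively evaluate an expression tree.

    A node is either a terminal (a number, or a variable name looked up
    in `env`) or a tuple (op, left, right) whose children are nodes.
    """
    if isinstance(node, tuple):
        op, left, right = node
        return FUNCTIONS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(node, str):
        return env[node]  # variable terminal
    return node           # constant terminal

# The tree for (x * x) + (3 - y)
tree = ("+", ("*", "x", "x"), ("-", 3, "y"))
print(evaluate(tree, {"x": 2, "y": 1}))  # prints 6
```

The nested-tuple form mirrors a Lisp S-expression, which is one reason Lisp-like languages fit tree-based GP well.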

Non-tree representations have been suggested and successfully implemented, such as linear genetic programming, which suits the more traditional imperative languages (see, for example, Banzhaf et al. (1998)). The commercial GP software Discipulus uses automatic induction of binary machine code ("AIM") to achieve better performance. µGP uses directed multigraphs to generate programs that fully exploit the syntax of a given assembly language.
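To make the contrast with trees concrete, here is a small sketch of a linear program in the imperative style: a flat list of instructions that read and write numbered registers. The instruction format `(op, dest, src1, src2)`, the register count, and the convention that register 0 holds the output are assumptions of this example, not the format used by any particular system named above.

```python
import operator

# Illustrative instruction set for this sketch.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def run_linear(program, inputs, n_registers=4):
    """Execute instructions in order on a register bank; r0 is the output."""
    regs = [0.0] * n_registers
    for i, value in enumerate(inputs):
        regs[i] = value  # inputs are loaded into the first registers
    for op, dest, src1, src2 in program:
        regs[dest] = OPS[op](regs[src1], regs[src2])
    return regs[0]

program = [
    ("mul", 2, 0, 0),  # r2 = r0 * r0
    ("add", 3, 1, 1),  # r3 = r1 + r1 (never read afterwards)
    ("sub", 0, 2, 1),  # r0 = r2 - r1
]
print(run_linear(program, [3.0, 1.0]))  # prints 8.0
```

In this representation, mutation and crossover can act directly on the instruction list, for example by changing one field of an instruction or by exchanging segments between two programs.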

Most non-tree representations contain structurally noneffective code (introns). Such non-coding genes may seem useless, because they have no effect on the performance of any one individual. However, experiments seem to show faster convergence with program representations that allow such non-coding genes, such as linear genetic programming and Cartesian genetic programming, than with tree-based program representations that have none.
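One way to see what structurally noneffective code means is to scan a straight-line register program backwards from its output and mark which instructions can still influence it; the rest are structural introns. The sketch below assumes a hypothetical instruction format `(op, dest, src1, src2)`, no branching, and register 0 as the output. These are illustrative conventions, not those of a specific GP system.

```python
def effective_instructions(program, output_register=0):
    """Return the indices of instructions that can influence the output.

    Walks the program backwards while tracking the set of registers whose
    values are still needed. An instruction is effective only if it writes
    to a register that is needed at that point.
    """
    needed = {output_register}
    effective = []
    for index in range(len(program) - 1, -1, -1):
        _op, dest, src1, src2 = program[index]
        if dest in needed:
            needed.discard(dest)         # this write satisfies the need
            needed.update((src1, src2))  # its operands are now needed
            effective.append(index)
    return sorted(effective)

program = [
    ("mul", 2, 0, 0),  # r2 = r0 * r0
    ("add", 3, 1, 1),  # r3 = r1 + r1, never read again: an intron
    ("sub", 0, 2, 1),  # r0 = r2 - r1
]
print(effective_instructions(program))  # prints [0, 2]
```

Instruction 1 can be changed or removed without affecting the output, so variation applied to it is neutral with respect to fitness.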

Other approaches

The basic ideas of genetic programming have been modified and extended in a variety of ways:

Extended compact genetic programming (ECGP)

Embedded Cartesian genetic programming (ECGP)

Probabilistic incremental program evolution (PIPE)

Strongly typed genetic programming (STGP)

Meta-genetic programming

Meta-genetic programming is the proposed meta-learning technique of evolving a genetic programming system using genetic programming itself. It suggests that chromosomes, crossover, and mutation were themselves evolved and therefore, like their real-life counterparts, should be allowed to change on their own rather than being determined by a human programmer. Meta-GP was formally proposed by Jürgen Schmidhuber in 1987. Doug Lenat's Eurisko is an earlier effort that may be the same technique. It is a recursive but terminating algorithm, allowing it to avoid infinite recursion.

Critics of this idea often say the approach is overly broad in scope. However, it might be possible to constrain the fitness criterion to a general class of results, and so obtain an evolved GP that would more efficiently produce results for sub-classes. This might take the form of a meta-evolved GP for producing human walking algorithms, which is then used to evolve human running, jumping, and so on. The fitness criterion applied to the meta-GP would simply be one of efficiency.

For general problem classes there may be no way, other than by exhaustion, to show that meta-GP will reliably produce results more efficiently than a created algorithm.

References

Genetic programming Wikipedia

(Text) CC BY-SA
