
Deception

Daniele Gravina edited this page Jun 28, 2019 · 3 revisions

[In progress]

On this page we describe the notion of deception. In particular, we review the major sources of hardness for evolutionary computation and why deception is one of the most difficult to address.

Deception

The term deception in the context of evolutionary computation was introduced by Goldberg (1987) to describe instances where highly-fit building blocks, when recombined, may guide the search away from the global optimum. Since that first mention, the notion of deception (including deceptive problems and deceptive search spaces) has been refined and expanded to describe several problems that challenge evolutionary search. Whitley (1991) argues that "the only challenging problems are deceptive". However, the debate on what makes a problem difficult is still ongoing, and the role of deception remains contentious. For instance, Mitchell et al. (1992) argue that deception is only one of many components that can make a fitness landscape difficult, and Grefenstette (1993) points out that deception is neither a necessary nor a sufficient condition for a problem to be hard, as hardness also depends on the dynamics of the sampling strategy. Admittedly, it is difficult to define what makes a problem hard. Given a generic description of an algorithm and a problem instance, no a priori measure of performance exists, and the only way to measure the algorithm's effectiveness is to run it (Rice, 1953).
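Goldberg's notion can be illustrated with the classic fully deceptive trap function (a sketch for illustration; the function below is a standard textbook example, not taken from this page). Almost everywhere in the search space, the fitness gradient points toward the all-zeros string, while the global optimum is the all-ones string:

```python
def trap_fitness(bits, k=5):
    """Fully deceptive k-bit trap: all-ones is the global optimum,
    but the fitness gradient everywhere else leads toward all-zeros."""
    u = sum(bits)  # number of ones in the chromosome
    if u == k:
        return k          # global optimum
    return k - 1 - u      # deceptive slope: fewer ones -> higher fitness

print(trap_fitness([1, 1, 1, 1, 1]))  # 5 (global optimum)
print(trap_fitness([0, 0, 0, 0, 0]))  # 4 (deceptive attractor)
print(trap_fitness([1, 1, 1, 1, 0]))  # 0 (one bit away from the optimum)
```

Flipping any 1 to a 0 increases fitness everywhere except at the optimum itself, so low-order building blocks favor zeros and recombination is guided away from the global optimum.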

Other measures of EC-hardness include sampling error (Liepins and Vose, 1990) and the ruggedness of the fitness landscape (Kauffman, 1989). In combinatorial optimization problems, the fitness landscape can hinder optimization when performing local search. Such a process assumes a high correlation between the fitness of neighboring points in the search space, and that genes in the chromosome are independent. The violation of the latter assumption is commonly referred to as epistasis (Davidor, 1991), which is a factor of GA-hardness: when epistasis is high (i.e., when many genes depend on other genes in the chromosome), the algorithm searches for a unique optimal combination of genes, but no substantial fitness improvements are noticed (Davidor, 1991). As noted, epistasis is evaluated from the perspective of the fitness function and is thus susceptible to deception; Naudts and Verschoren (1999) argue that deceptive functions cannot have low epistasis, although fitness functions with high epistasis are not necessarily deceptive. Such approaches are often based on the concept of correlation, i.e., the degree to which an individual's fitness is correlated to that of its neighbors in the search space, or on epistasis, i.e., the degree of interaction among the effects of genes.

Another interesting take on EC-hardness comes from Borenstein and Poli (2004). They argue that the analysis of the difficulty of a problem can be improved by assessing the properties of the problem's fitness distribution. Problem classification is often based on human-defined properties, which introduces a human bias, whereas the analysis of difficulty should depend less on the designer's choices. Therefore, by sampling over a set of different representations, it is possible to isolate the problem's properties from any particular representation.
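The link between epistasis and ruggedness can be made concrete with Kauffman's NK model, in which each gene's fitness contribution depends on itself plus K other genes; raising K increases epistasis and produces a more rugged landscape. Below is a minimal sketch (function names and the lazy contribution-table construction are our own choices, not from this page):

```python
import random

def make_nk_landscape(n, k, seed=0):
    """Random NK landscape (after Kauffman): gene i's contribution
    depends on its own allele plus the alleles of k other genes.
    Higher k means more epistasis and a more rugged landscape."""
    rng = random.Random(seed)
    # For each gene, pick k other genes it interacts with (epistatic links).
    neighbors = [rng.sample([j for j in range(n) if j != i], k)
                 for i in range(n)]
    tables = [{} for _ in range(n)]  # lazily filled contribution tables

    def fitness(bits):
        total = 0.0
        for i in range(n):
            key = (bits[i],) + tuple(bits[j] for j in neighbors[i])
            if key not in tables[i]:
                tables[i][key] = rng.random()  # contribution drawn once per context
            total += tables[i][key]
        return total / n  # mean contribution, in [0, 1]
    return fitness

f = make_nk_landscape(n=10, k=3)
f([0] * 10)  # some value in [0, 1], deterministic for a fixed seed
```

With k = 0 each gene contributes independently and hill-climbing works; as k grows toward n - 1, local optima proliferate and the "high correlation between neighboring points" assumption of local search breaks down.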
Jones and Forrest (1995) instead propose fitness-distance correlation, i.e., the correlation between individuals' fitness and their distance to an ideal goal, but this can lead to mispredictions, as a single correlation measure may be too simplistic. Guo and Hsu (2003) identify three key causes of problem hardness. The first is when the defined fitness lacks any information about the goal (e.g., a needle-in-a-haystack problem), so that an objective-based approach cannot help. The second is when the search algorithm intentionally does not exploit the helpful information given by the fitness function. Even if this might be considered a non-optimal use of the algorithm (it would perform worse than random search on GA-easy problems), it can be regarded as a solution for deceptive problems, the third cause of problem hardness: here the fitness function returns conflicting information about the goal, and ignoring the objective can be beneficial. In conclusion, as noted by Lehman and Stanley (2011a), most of the factors of EC-hardness seem to originate from the fitness function itself; however, poorly designed genetic operators and poorly chosen evolutionary parameters can exacerbate the problem.
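Jones and Forrest's fitness-distance correlation can be sketched as a plain Pearson correlation between sampled fitness values and distances to a known global optimum (the helper names below are our own; this assumes, as in their analysis, that the optimum is known in advance):

```python
from statistics import mean

def fitness_distance_correlation(points, fitness, distance_to_optimum):
    """Pearson correlation between fitness values and distances to the
    known global optimum, over a sample of points."""
    fs = [fitness(p) for p in points]
    ds = [distance_to_optimum(p) for p in points]
    mf, md = mean(fs), mean(ds)
    cov = sum((f - mf) * (d - md) for f, d in zip(fs, ds)) / len(points)
    sf = (sum((f - mf) ** 2 for f in fs) / len(points)) ** 0.5
    sd = (sum((d - md) ** 2 for d in ds) / len(points)) ** 0.5
    return cov / (sf * sd)

# OneMax: fitness = number of ones, distance = Hamming distance to all-ones.
pts = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]
fdc = fitness_distance_correlation(pts, sum, lambda p: 3 - sum(p))
# fdc == -1.0: fitness and distance are perfectly anti-correlated,
# which for a maximization problem signals an easy, non-deceptive landscape.
```

For maximization, an FDC near -1 indicates that fitness reliably guides search toward the optimum, while values near +1 suggest a misleading (deceptive) landscape; the mispredictions mentioned above arise because a single scalar can miss structure in the landscape.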
