Generally, branch and bound methods (in particular, interval methods, dc methods, and mixed integer methods based on piecewise linear approximations) are more reliable since, to the extent they work (which depends on the difficulty of the problem), they have built in guarantees; however, they require more or less detailed access to global information about the problem.

On the other hand, stochastic or heuristic methods are generally easier to program, depend only on black box function (and sometimes gradient) routines, and therefore can never be sure of not having missed the global optimizer.

Comparisons on relative efficiency are virtually missing, and the quality depends a lot on details of the implementation. At the present stage of knowledge, if you need global optimization you must experiment with the known methods (or packages) until you have something satisfying you. Unfortunately, this is a rather unsatisfactory state of the art...

In its original form, the simulated annealing method is provably convergent (in a probabilistic sense) but exceedingly slow; various ad hoc enhancements make it much faster. In particular, except for simple problems, success depends very much on the implementation used.

The ability to leave regions of attraction to local minimizers is, however, drastically enhanced by crossing over. This is an advantage if, with high probability, the crossing rules produce offspring of similar or even better fitness (objective function value); if not, it is a severe disatvantage. Therefore the efficiency of a genetic algorithm (compared with simulated annealing type methods) depends in a crucial way on the proper selection of crossing rules. The effect of interchanging coordinates is beneficial mainly when these coordinates have a nearly independent influence on the fitness, whereas if their influence is highly correlated (such as for functions with deep and narrow valleys not parallel to the coordinate axes), genetic algorithms have much more difficulties. Thus, unlike simulated annealing, successful tuning of genetic algorithms requires a considerable amount of insight into the nature of the problem at hand.

Both simulated annealing methods and genetic algorithms are, in their
simpler forms, easy to understand and easy to implement, features
that invite potential users of optimization methods to experiment
with their own versions. The methods often work, if only slowly,
and lacking better alternatives, they are very useful tools for
applications, where the primary interest is to find (near-)solutions
*now*, even when the reliability is uncertain and only subglobal
optima are reached.

To make simulated annealing methods and genetic algorithms efficient, clever enhancements are essential. However, theoretical work on explaining the effectiveness of useful enhancements is completely lacking. I also haven't seen careful comparisons of the various options available and their comparative evaluation on standard collections of test problems.

While it is quite possible for such a method to miss the global minimum (so that full reliability cannot be guaranteed, and is not achieved in the tests reported by their authors), a proper implementation of this idea at least gives very good local minima with a fraction of the function evaluations needed for the blind annealing and genetic methods.

A conceptually attractive smoothing technique is the *diffusion
equation method*,
where the original objective function V(x) is smeared out by
artificial diffusion. The solution V(x,t) of the diffusion equation
V_{xx}(x,t)=V_t(x,t)
with initial condition V(x,0)=V(x)
gets smoother and smoother as t gets larger; for
large enough t, it is even unimodal. Thus V(x,t) can be minimized
by local methods when t is sufficiently large, and using the
minimizer at a given t as a starting point for a local optimization
at a smaller t, a sequence of local minimizers of V(x,t) for
successively smaller t is obtained until finally, with t=0, a
minimizer of the original potential is reached.
It should be possible to increase the reliability of the method by
using a limited amount of global search in each optimization stage.

The diffusion equation can be solved by convolution with a Gaussian kernel and hence is explicitly available, e.g., when V(x) is an expolynomial, i.e., a linear combination of arbitrary products of powers of components x_i and exponentials exp(c^T x). For other partially separable functions, one has to use numerical integration, making function evaluation more expensive. For functions lacking partial separability, the method cannot be used in practice.

Again, there is no theoretical work on conditions that would ensure convergence to the global minimum.

While the above techniques are motivated by nature it is important to
remember that processes in nature need not be the most efficient ones;
at best it can be assumed to be efficient *given the conditions
under which they have to operate* (namely an uncertain and
changing environment that is potentially hazardous to those operating
in it). Indeed, much of our present
technology has vastly surpassed natural efficiency by unnatural means,
and it would be surprising if it were different in global optimization.

Even assuming that nature solves truly *global* optimization
problems (a disputable assumption), simple lower estimates for the
number of elementary steps -- roughly corresponding to function
evaluations -- available to natural processes to converge are
(in chemistry and in biology) in the range of 10^15 or even more.
Thus to be successful on the computers of today or the near future,
we must find methods that are much faster, exploring the configuration
space in a planned, intelligent way, not with a blind combination of
chance and necessity. And the challenge is to devise methods that can
be analyzed sufficiently well to guarantee reliablility and success.

In the worst case, branch and bound methods take an exponential amount of work; but many problems arising in practice are far from worst case. Many of the heuristic techniques used currently for searching the conformational space of global optimization problems can be adapted to or combined with the branch and bound approach to take advantage of the structural insights of specific applications.

Branch and bound methods will ultimately allow one not only to calculate the global minimum reliably, but also to find all local minima and saddle points within a certain margin of the global minimum. (Saddle points describe transition states in molecular modeling.) This is essentially the problem of finding all zeros of the nonlinear system V'(x)=0 that satisfy a constraint on the signature of the Hessian and on the value of V. Such problems can already be handled in moderate dimensions by branch and bound methods, combined with techniques from interval analysis. It should be possible to combine these techniques with underestimation techniques provided by dc-methods.

Global Optimization

Protein Folding

my home page (http://www.mat.univie.ac.at/~neum)

Arnold Neumaier (Arnold.Neumaier@univie.ac.at)