Nonlinear Optimization – Topic

Idea

In general, I think most people have an intuitive idea of what “optimization” and “optimal” mean: find the solution that minimizes some cost (money, time, materials, labor, energy, etc.) while respecting the problem’s constraints. But mathematically, in the general nonlinear case, it is extremely difficult to find a true global minimum on a function’s surface with iterative methods. This means that, short of more advanced techniques like reformulating the problem, we have to be content with a local minimum rather than a global one. In these cases, we have no way to mathematically assure we’ve reached a true global minimum.

To find these minima, we first construct the problem statement.

Problem Statement

In the unconstrained nonlinear program, the following problem statement exists:

\min_u c(u) \quad \text{s.t.} \quad u \in \mathbb{R}^n

or verbally stated: find the u that makes the cost function (labeled c) as low as possible, where u must be a vector of length n. We need u to be of length n so that the cost is fully defined. For example, if we had a cost defined over three dimensions but only a two-dimensional u vector, this constraint would be unsatisfied. This just keeps us from considering invalid solutions.

Some other specifics for the problem setup:

c: \mathbb{R}^n \rightarrow \mathbb{R}

And the u vector is our “decision” variable; in the case of controls engineering, it is very often a control input.

So, our goal becomes to find this minimizer:

u^* \; \text{s.t.} \; c(u^{*}) < c(u) \quad \forall \; u \neq u^{*}

In other words, find a u such that the cost is minimal at that point rather than at any other. This is the global case, which as stated above is effectively impossible to find in a nonlinear situation with iterative methods alone, without reformulation. Since this is the case, we can restrict the minimization to a local region, which is absolutely possible for us to solve.

To do this, we restrict attention to a neighborhood of u*:

\|u - u^{*}\| < \epsilon, \quad \epsilon > 0

If c(u*) ≤ c(u) for every u satisfying the above, then u* is a local minimizer of the cost function. This is the best we can hope for.

NOTE: All of this analysis depends on the cost function being smooth and differentiable in this neighborhood if not globally.
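To make the local/global distinction concrete, here is a small sketch with a made-up 1-D cost (my own illustrative example, not from any particular application). It has two minima, and a search restricted to a local region settles on the worse one:

```python
# A 1-D nonconvex cost (chosen for illustration): it has a local
# minimum near u ~ 0.96 and the global minimum near u ~ -1.04.
def c(u):
    return u**4 - 2*u**2 + 0.3*u

# Brute-force grid search over [-2, 2] (feasible only because this is 1-D).
grid = [-2 + 4*k/4000 for k in range(4001)]

u_global = min(grid, key=c)                        # search everywhere
u_local = min((u for u in grid if u > 0), key=c)   # search a local region only

# c(u_local) > c(u_global): u_local satisfies the local-minimizer condition
# within its epsilon-neighborhood, yet it is not the global minimum.
```

An iterative method initialized at any u > 0 would converge to u_local and have no way of knowing a better minimum exists on the other side of the barrier.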

Approaches

Two main types of approaches exist (though the lines between them blur):

  1. Iterative Methods
  2. Randomization Methods

Iterative Methods

Iterative methods in general exploit the smooth structure of the cost function, and its derivatives, to find a descent direction. They then take a step in that direction and repeat. Numerically, termination is usually handled by setting either a maximum step count or a tolerance on the difference between the current and previous cost values. This approach requires some knowledge of the cost function, because a vanishing first derivative could also indicate a local maximum (maximum cost) or an inflection point (neither minimum nor maximum). For a true minimum, the candidate u* must be checked against both the function and its derivatives to ensure a minimum has actually been reached.

These methods take the form of the following:

u^{+} = u - \gamma \, D c(u)

which is best known as the form of gradient descent.
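As a minimal sketch of this update rule (with a toy quadratic cost of my own choosing, and both stopping criteria mentioned above):

```python
def gradient_descent(c, dc, u0, gamma=0.1, tol=1e-8, max_steps=10_000):
    """Minimize c by stepping against its derivative dc: u+ = u - gamma*Dc(u)."""
    u = u0
    cost = c(u)
    for _ in range(max_steps):            # stopping criterion 1: max step count
        u_new = u - gamma * dc(u)         # the gradient-descent update
        cost_new = c(u_new)
        if abs(cost - cost_new) < tol:    # stopping criterion 2: cost difference
            return u_new
        u, cost = u_new, cost_new
    return u

# Toy example: c(u) = (u - 2)^2 has its minimum at u* = 2.
u_star = gradient_descent(lambda u: (u - 2)**2, lambda u: 2*(u - 2), u0=0.0)
```

The step size gamma is a tuning knob: too large and the iterates overshoot and diverge, too small and convergence crawls.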

In general these methods are sample-efficient (fewer function evaluations needed to find a minimum) but require a differentiable cost function; zeroth-order methods sidestep explicit derivatives by approximating them from function evaluations. Some examples include:

  1. Gradient Descent
  2. Newton-Raphson
  3. Zeroth Order Methods
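For contrast, here is a scalar Newton-Raphson sketch (again a toy example of mine, not a general-purpose implementation). Instead of a fixed step size, it rescales the gradient step by the second derivative, which must be positive for the stationary point to be a minimum rather than a maximum or inflection point:

```python
def newton_minimize(dc, d2c, u0, tol=1e-10, max_steps=100):
    """Scalar Newton-Raphson on the stationarity condition dc(u) = 0.
    Assumes d2c(u) > 0 near the minimizer (otherwise it may find a
    maximum or inflection point instead)."""
    u = u0
    for _ in range(max_steps):
        step = dc(u) / d2c(u)   # second derivative rescales the gradient step
        u -= step
        if abs(step) < tol:
            break
    return u

# c(u) = (u - 2)^2: dc = 2(u - 2), d2c = 2. On a quadratic cost,
# Newton-Raphson lands on u* = 2 in a single step.
u_star = newton_minimize(lambda u: 2*(u - 2), lambda u: 2.0, u0=10.0)
```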

Randomization Methods

Randomization methods draw samples from random locations along the cost function, and update their neighborhood of interest based on the results of probing the system.

They generally take the form of the following:

u^{+} = p(u)

where p is the random distribution function about the neighborhood at input u.

These methods may be faster than iterative methods, but they are sample-inefficient (many samples required) and carry no mathematical guarantee. However, it is possible that they may approximate a global minimizer u*.

Examples of these methods include:

  1. Genetic / Swarm Algorithms
  2. Simulated Annealing
  3. Markov Chain Monte Carlo
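A minimal simulated-annealing sketch (my own toy setup, with an illustrative two-minima cost). A Gaussian proposal plays the role of p(u), and worse moves are occasionally accepted so the search can escape a local basin:

```python
import math
import random

def simulated_annealing(c, u0, steps=20_000, temp0=1.0, sigma=0.5, seed=0):
    """Draw u+ = p(u) from a Gaussian around u; accept worse moves with
    probability exp(-(c(u+) - c(u)) / T), cooling T over time."""
    rng = random.Random(seed)
    u, cost = u0, c(u0)
    best_u, best_cost = u, cost
    for k in range(steps):
        temp = temp0 * (1 - k / steps) + 1e-9      # linear cooling schedule
        u_new = u + rng.gauss(0.0, sigma)          # random proposal p(u)
        cost_new = c(u_new)
        if cost_new < cost or rng.random() < math.exp((cost - cost_new) / temp):
            u, cost = u_new, cost_new              # accept the move
        if cost < best_cost:                       # track the best sample seen
            best_u, best_cost = u, cost
    return best_u

# Two-minima toy cost: started in the worse basin (u0 = 1.0), the random
# search can still hop the barrier toward the global minimum near u ~ -1.04.
def c(u):
    return u**4 - 2*u**2 + 0.3*u

u_best = simulated_annealing(c, u0=1.0)
```

Note there is still no guarantee here: whether the search crosses the barrier depends on the temperature schedule, the proposal width sigma, and luck.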

Future Notes

There’s a lot to say about this topic, and this note is only the tip of the iceberg. I’ll write some follow-up notes exploring these methods, and I’m beginning to code them up on my GitHub as this develops.

