blast burn launch light

Model Predictive Control (MPC)

Model Predictive Control (MPC) is an optimal control architecture rooted in optimization. Most often, formulations of MPC are quadratic programs (QP’s) and result in powerful predictive models across some time horizon in the near future. Where other trajectory planning methods like Sequential Quadratic Programming (SQP) might predict the entire trajectory, MPC stays local to the agent as it moves. With each step, the trajectory is recalculated according to the updated constraints at each step. While this is computationally more expensive in some cases, this is a much faster control method, and creates autonomous behaviors such as creating formations, evading obstacles, controlling approach attitude and position, or enabling rendevouz operations.

Formulation

MPC’s share a general root formulation which allows the above behavior. The root quadratic program formulates a cost objective along with several standard constraints. Additional constraints can be added to the formulation to create autonomous behaviors. Tuning of the cost matrices also affects overall behavior of the agent, preventing or encouraging desired behavior. Therefore it is the GNC engineer’s responsibility to intelligently tune cost matrices.

The root formula for the MPC algorith is the following:

such that

This root formulation forms the backbone of MPC formulation. Note that since this is formally a QP, all the terms within must be either quadratic or linear. Therefore the constraints may need to be linearized to find a viable solution.

Each reference figure may or may not be included if there exists a reference input, reference trajectory, or reference target. If these do not exist, then each of those quanities may be set to zero. This depends on the application and use case. This formulation is a receding horizon application which finds basis in the Bellman equation. So, it can be thought of as the following:

minx,u[l(xk,uk)+Vk+1(f(xk,uk)]\min_{x_,u} [l(x_k,u_k) + V_{k+1}(f(x_k,u_k)]

where l and V represent the running cost and final cost respectively.

In the formulation above, Q, R, and Qf are cost matrices which tune the agent’s behavior. N defines the horizon over which this optimal control problem will be solved.

However, this basic formulation may be vectorized for increased computational efficiency in two forms as follows:

Condensed MPC Form

Dynamics and Objective Function

The goal is to create a vectorized QP so that optimization libraries are able to efficiently find a solution. The condensed form of MPC is a strong choice for small to medium sized or simple problems. The form can be found as follows:

First we incorporate the dynamics directly into the QP.

X=Sxx0+SuUX = S_xx_0 + S_uU

Notice in this case that X, U, and their coefficient matrices have the following forms:

X=[x1,x2,...,xN]TX = [x_1, x_2, … ,x_N]^T
U=[u0,u1,...uN1]TU = [u_0, u_1, … u_{N-1}]^T
Sx=[A,A2,A3,...AN]TS_x = [A, A^2, A^3, … A^N]^T
Su=[[B,0,0,...,0],[AB,B,0,...,0],[...],[ANB,AN1B,...B]]TS_u = \begin{matrix} \left[ [B, 0, 0, …, 0], \\ [AB, B, 0, …, 0], \\ […], [A^NB, A^{N-1}B, … B] \right]^T \end{matrix}

This comes from the direct applicatino of recursively applying the dynamics. For two steps, the formulation becomes the following and can be used to find the matrices above:

xk+1=A(A(xk1)+B(uk1))x_{k+1} = A(A(x_{k-1}) + B(u_{k-1}))

With this reformulation, the objective function for the optimized control algorithm can be rewritten:

here Q hat and R hat hold all the Q’s and R’s for all timesteps within the horizon, N. We can substitute X from above into the objective function:

(Sxx0+SuUXref)TQ^(Sxx0+SuUXref)+(UUref)TR^(UUref)(S_xx_0 + S_uU – X_{ref})^T\hat{Q}(S_xx_0 + S_uU – X_{ref}) + (U – U_{ref})^T\hat{R}(U – U_{ref})

we’ll define an offset, d, as the “free response error.”

d=Sxx0Xrefd = S_xx_0 – X_{ref}
(SuU+d)TQ^(SuU+d)+(UUref)TR^(UUref)(S_uU + d)^T\hat{Q}(S_uU+d) + (U-U_{ref})^T\hat{R}(U-U_{ref})
(SuU)TQ^(SuU)+(SuU)TQ^(d)+(d)TQ^(SuU)+dTQ^d+UTR^U(U)TR^(Uref)(Uref)TR^(U)+(Uref)TR^(Uref)(S_uU)^T \hat{Q} (S_uU) + (S_uU)^T\hat{Q}(d) +(d)^T\hat{Q}(S_uU) + d^T\hat{Q}d + U^T\hat{R}U -(U)^T\hat{R}(U_{ref}) – (U_{ref})^T\hat{R}(U) +(U_{ref})^T\hat{R}(U_{ref})

Since U reference and the error are not optimized inthis formulation (which will now exclusively and more naturally depend on U) these will become 0 during the minimization and may be discarded. They have no bearing on the output.

(SuU)TQ^(SuU)+(SuU)TQ^(d)+(d)TQ^(SuU)+UTR^U(U)TR^(Uref)(Uref)TR^(U)(S_uU)^T \hat{Q} (S_uU) + (S_uU)^T\hat{Q}(d) +(d)^T\hat{Q}(S_uU) + U^T\hat{R}U -(U)^T\hat{R}(U_{ref}) – (U_{ref})^T\hat{R}(U)
UT(SuTQ^Su+R^)U+(SuU)TQ^(d)+(d)TQ^(SuU)UTR^UrefUrefTR^UU^T(S_u^T\hat{Q}S_u + \hat{R})U +(S_uU)^T\hat{Q}(d)+(d)^T\hat{Q}(S_uU) – U^T\hat{R}U_{ref} -U_{ref}^T\hat{R}U

Note that the following quantities are scalar, and equal their transpose:

UTSuTQ^d=(dTQ^SuU)=(UTSuTQ^d)TU^TS_u^T\hat{Q}d = (d^T\hat{Q}S_uU) = (U^TS_u^T\hat{Q}d)^T
UTR^Uref=(UrefTR^U)=(UTR^Uref)TU^T\hat{R}U_{ref} = (U_{ref}^T\hat{R}U) = (U^T\hat{R}U_{ref})^T

These equivalences are possible because Q and R are symmetric.

Therefore, we can write the equation as:

UT(SuTQ^Su+R^)U+2(dTQ^SuUrefTR^)U+0U^T(S_u^T\hat{Q}S_u +\hat{R})U +2(d^T\hat{Q}S_u – U_{ref}^T\hat{R})U + 0

Which we want to match to the standard quadratic form:

minU12UTHU+fTU+constant\min_{U}\frac{1}{2} U^T HU +f^TU + constant

Substituting d back into the equation we get the final form:

minU12UT(2SuTQ^Su+2R^)U+2(SuTQ^(Sxx0Xref)R^Uref)TU\min_{U}\frac{1}{2}U^T(2S_u^T\hat{Q}S_u + 2\hat{R})U +2(S_u^T\hat{Q}(S_xx_0 – X_{ref})-\hat{R}U_{ref})^TU

Constraints

The objective function now contains the dynamics of the system in a condensed form. However, the constraints above also must be reformulated into the following form:

AineqUbineqA_{ineq} U\leq b_{ineq}

Looking at the general constraints below:

UminUUmaxU_{min} \leq U \leq U_{max}
UUmaxU \leq U_{max}
UUmin,UUminU \geq U_{min}, \, \, \,-U \leq -U_{min}

Recalling our formulation of X which encodes the dynamics:

XminXXmaxX_{min} \leq X \leq X_{max}
XminSxx0+SuU,SuUXminSxx0X_{min} \leq S_xx_0 + S_uU, \, \, \, -S_uU \leq X_{min} -S_xx_0
Sxx0+SuUXmax,SuUXmaxSxx0S_xx_0 +S_uU \leq X_{max}, \, \, \, S_uU \leq X_{max} -S_xx_0

so, we an reformulate the quanities as follows for standard form:

Aineq=[I,I,Su,Su]TA_{ineq} = [I, -I, S_u, -S_u]^T
bineq=[Umax,Umin,[XmaxSxx0],[Sxx0Xmin]]Tb_{ineq} = [U_{max}, -U_{min}, [X_{max} – S_xx_0], [S_xx_0 – X_{min}]]^T

Final Condensed Equation Set

The final condensed form equations are:

Sparse MPC Form

Objective Function

Sparse MPC form is useful for larger, nonlinear, or time-varying MPC formulations. In these cases it’s more computationally efficient.

Starting with the initial formulation to create the following forms:

minz12zTHz+fTz\min_{z} \frac{1}{2} z^THz + f^Tz
Aeqz=beqA_{eq}z = b_{eq}
AineqzbineqA_{ineq}z \leq b_{ineq}

z is defined as the following:

z=[x,u]Tz = [x, u]^T

which allows a natural construction of H:

H=[[2Q1...0],[0,2Q2,...0]...[0…2QN..0],[0…0,2R1,0…0][0,...,0,2R2,0…0],...[0…2RN1]]TH = [[2Q_1 …0],[0,2Q_2,…0]…[0…2Q_N.. 0],[0… 0, 2R_1,0 …0][0,…,0,2R_2,0…0],…[0…2R_{N-1}]]^T

Where H is a matrix with N Q’s and N-1 R’s sequentially defined along the diagonal. By factoring 1/2 out from the objective function, we find we must multiply 2 along the diagonal matrix to maintain equality.

In order to find f, we need to walk through the general, vectorized equation once more.

cost=(xkxk,ref)TQ^(xkxk,ref)+(UkUk,ref)TR^(UkUk,ref)cost = (x_k-x_{k,ref})^T\hat{Q}(x_k-x_{k,ref}) + (U_k -U_{k,ref})^T\hat{R}(U_k -U_{k,ref})
xkTQ^xk2xk,refTQ^+UkTR^Uk2Uk,refTR^+constantsx_k^T\hat{Q}x_k – 2x_{k,ref}^T \hat{Q} + U_k^T\hat{R}U_k -2U_{k,ref}^T\hat{R} + constants

This depends on the same properties as the condensed derivation: Cost matrix symmetry, and defining quantities as scalars to find linear factors of 2. After we distribute this form, we can rearrange to fit the formula in z above:

f=[[2QTxk,ref],[2RTuk,ref]]Tf = [[-2Q^Tx_{k,ref}],[-2R^Tu_{k,ref}]]^T

Constraints

We can also embed the dynamics and initial position for the analysis into the equality constraint:

x(t)=x0x(t) = x_0
xk+1AkxkBkuk=0x_{k+1} – A_kx_k – B_ku_k=0

When turned into a matrix becomes the following form:

Aeqz=beqA_{eq} z = b_{eq}
Aeq=[[I,0,0,...0|0…0],[A,I,0,...,0|B,0,...,0],[0,A,I,0…|0,B,0,...,0]]TA_{eq} = [[I, 0, 0, … 0 | 0…0],[-A, I, 0, …, 0| -B, 0,…,0], [0, -A, I, 0…|0, -B,0,…,0]]^T
beq=[x(t),0,...,0]Tb_{eq} = [x(t), 0, … , 0]^T

Note that the pipe in the A matrix is to show the dimension separation, it should be a matrix of the form [A section | B section], not a mathematical operation.

Finally, we can encode state and control bounds into the inequality constraint form:

xxmaxx \leq x_{max}
xxmin,xxminx \geq x_{min}, \, \, \, -x \leq-x_{min}
uumaxu \leq u_{max}
uumin,uuminu \geq u_{min}, \, \, \, -u \leq -u_{min}
AineqzbineqA_{ineq} z \leq b_{ineq}
Aineq=[[I,0],[I,0],[0,I],[0,I]]TA_{ineq} = [[I, 0],[-I, 0],[0, I],[0, -I]]^T
bineq=[xmax,xmin,umax,umin]Tb_{ineq} = [x_{max}, x_{min}, u_{max},u_{min}]^T

Final Sparse Equation Set

This gives us the final equation set for sparse form:

where the matrices are defined as:

While not nearly as simple as the condensed form, it serves a purpose for complex, large dimension, and nonlinear system MPC derivation.

Additional Autonomy Constraints

Constraints for Behavior

With the core elements now derived and vectorized for computational efficiency, we can define autonomous behavior through the implementation of constraints on this system. Approach corridors, obstacle evasions, formations, Line-of-sight control, keep-out zones, and stay-on-side conditions, and speed/accelrations bounds are mathematically represented as constraints in MPC. this leaves a broad range of behaviors which the engineer can manager. Whie the controllers themselves need tuning in cost and application, there are a staggering number of options which may be used at any given time. This may be a good application of Reinforcement Learning as a system to autnomously manage these maneuvers and behaviors.

Input Bounds:

Speed Bounds:

Acceleration Bounds:

Keep-Out Zone:

Continuous Linearized Control Barrier Function (arbitary geometry):

Note: h is the barrier function itself encoding geometry. It could be the same as shape as keep out zones.

Line of Sight:

Note: f here references the targets forward direction. This indicates the agent must keep line of sight within a max angle from the darget’s centerline.

Approach Corridor:

Note: f here references the target’s desired approach direction. This indicates the agent must approach from the rear of the target

Formation (Multi-Agent):

Note: a of i points in the desired direction from agent i to agent j. The formula above assumes i leads

Stay on Side:

Note: n is the normal vector to the boundary that you’re interested in. In the above formation, x of k is supposed to stay away from an edge by a distance b.

Cost Formulation for Behavior

It’s worth noting that the redesign of the cost matrix (perhaps such that the final position is a linear rather than quadratic term within certain ranges) allow autonomous navigation prioritization as well. In this linear final position example, the agent may, for example, more effectively avoid softened collision constraints so as to effectively make moving quickly through an obstacle costly.

Engineers may also adjust the states and control aspects of the function. For example, to limit rapid oscillation, a quadratic or linear cost may be introduced in order to heavily penalize rapid changes in control input, accelerations, or other aspects of the system.

Sometimes the collision constraints must be softened to find a feasible solution to the convex problem. This ultimately must become an engineering decision.


Comments

Leave a Reply

Discover more from Andrew Adie

Subscribe now to keep reading and get access to the full archive.

Continue reading