Constrained Model Predictive Control (MPC) Optimization Formulation and Quadratic Programming Equations.

The purpose of this article is to provide a practical, expert-level derivation of the standard constrained Model Predictive Control (MPC) optimization problem, show how it becomes a quadratic program (QP) for linear models, and explain how to implement constraints reliably in real process control applications.

1. Why MPC is an optimization problem with constraints.

Model Predictive Control is a control method that computes the next control move by solving a finite-horizon optimization problem at every sampling instant.

The controller predicts future trajectories using a model, chooses a sequence of future manipulated-variable moves, enforces constraints explicitly, and applies only the first move before repeating the procedure at the next sample.

This receding-horizon structure is what makes MPC different from classical PID when constraints matter, because constraint handling is built into the decision step rather than added as external logic.

1.1 What “constraints” mean in MPC for real plants.

Constraints represent hard physical or safety limits, operating requirements, and actuator limitations that must be respected during transient operation.

Typical industrial constraints include manipulated variable bounds, rate-of-change limits, output limits, and sometimes internal state limits such as temperatures, pressures, or inventories.

MPC can also include soft constraints using slack variables to avoid infeasibility when disturbances or model mismatch make perfect satisfaction impossible.

Note : If constraints are implemented without consistent engineering units, scaling, and a clear definition of what is measured versus estimated, the optimizer can become numerically ill-conditioned and produce unreliable moves.

2. Discrete-time prediction model used by MPC.

Most MPC implementations use a discrete-time model with sampling time Ts, either obtained by discretizing a continuous-time model or identifying a discrete-time model directly.

A standard linear time-invariant state-space model is used to illustrate the core optimization equations.

x_{k+1} = A x_k + B u_k + B_d d_k. y_k = C x_k + D u_k.

Here x_k is the state, u_k is the manipulated variable vector, y_k is the controlled output vector, and d_k represents measured disturbances when available.

If output disturbances or unmeasured disturbances are important, an augmented model with disturbance states or an observer is typically used, but the MPC optimization structure remains the same.

2.1 Horizons and decision variables.

Define a prediction horizon Np and a control horizon Nc with 1 ≤ Nc ≤ Np.

The optimizer chooses a sequence of future control moves, often represented as either future inputs u_{k+i} or input increments Δu_{k+i}.

Using increments is common because it makes rate constraints and move suppression natural.

Δu_{k+i} = u_{k+i} - u_{k+i-1}. Decision vector: ΔU = [Δu_k^T, Δu_{k+1}^T, ..., Δu_{k+Nc-1}^T]^T.

After Nc, a move-blocking assumption is often applied, such as holding Δu_{k+i} = 0 for i ≥ Nc, which reduces decision dimension while keeping predictions over Np.

3. Standard MPC objective function for tracking and effort.

The objective is designed to track a reference trajectory while avoiding excessive control action.

A typical quadratic cost penalizes predicted output tracking error and penalizes input increments.

J(ΔU) = Σ_{i=1..Np} (y_{k+i|k} - r_{k+i})^T Q (y_{k+i|k} - r_{k+i}) + Σ_{i=0..Nc-1} (Δu_{k+i})^T R (Δu_{k+i}) + (x_{k+Np|k} - x_ref)^T P (x_{k+Np|k} - x_ref).

Q is a positive semidefinite output-weight matrix, R is a positive definite move-weight matrix, and P is a terminal weight that can improve stability and performance.

For regulation problems, r_{k+i} is often constant, and x_ref is the steady-state that matches the reference under constraints when feasible.

Note : If Q and R are not scaled relative to the engineering magnitudes of outputs and manipulated variables, tuning becomes misleading because one variable dominates the cost purely due to units.

4. Constraint sets used in constrained MPC.

MPC constraints are usually written as linear inequalities for linear MPC, but they can also be nonlinear for nonlinear MPC.

The most common constraints in process control are listed below.

Constraint type. Typical form. Common implementation detail.
Manipulated variable bounds. u_min ≤ u_{k+i} ≤ u_max. Implemented over i = 0..Np-1 or i = 0..Nc-1 depending on definition.
Rate-of-change bounds. Δu_min ≤ Δu_{k+i} ≤ Δu_max. Natural when ΔU is the decision vector.
Output bounds. y_min ≤ y_{k+i|k} ≤ y_max. Often softened with slack variables to prevent infeasibility.
State bounds. x_min ≤ x_{k+i|k} ≤ x_max. Used when states have safety meaning, or for constraint tightening with estimation uncertainty.
Terminal constraint. x_{k+Np|k} ∈ X_f. Used for recursive feasibility and stability in rigorous designs.

4.1 Hard constraints versus soft constraints.

Hard constraints must never be violated, such as actuator saturation or safety interlocks that represent non-negotiable limits.

Soft constraints allow limited violation at a penalty cost, which is essential for outputs when disturbances can make strict enforcement impossible.

Soft output constraint example with slack ε ≥ 0. y_min - ε ≤ y_{k+i|k} ≤ y_max + ε. Add penalty: ρ ||ε||_1 or ε^T S ε with large weights.
Note : Softening every constraint is not safe, because some bounds represent physical limits, so softening should be applied only to constraints that have a defined operational tolerance.

5. Stacked prediction equations for linear MPC.

To convert MPC into a standard QP, stack the predicted outputs across the horizon as an affine function of the current state and the future input increments.

Define stacked vectors Y and ΔU.

Y = [y_{k+1|k}^T, y_{k+2|k}^T, ..., y_{k+Np|k}^T]^T. ΔU = [Δu_k^T, Δu_{k+1}^T, ..., Δu_{k+Nc-1}^T]^T.

For a linear model with prediction starting from estimated state x_k, the stacked output prediction can be written as.

Y = F x_k + G ΔU + g_d.

F maps the current state to future outputs, G maps future increments to future outputs, and g_d collects known terms such as measured disturbances, previous input u_{k-1}, and reference offsets depending on the chosen formulation.

Different textbooks and software packages define F and G slightly differently depending on whether the optimization uses u or Δu, and whether D is included, but the structure remains affine in ΔU for linear models.

Symbol. Meaning. Typical dimensions.
x_k. Current estimated state. n_x × 1.
ΔU. Stacked future input increments. (N_c n_u) × 1.
Y. Stacked predicted outputs. (N_p n_y) × 1.
F. State-to-output prediction matrix. (N_p n_y) × n_x.
G. Decision-to-output prediction matrix. (N_p n_y) × (N_c n_u).

6. Quadratic program (QP) form of constrained linear MPC.

When the model is linear and the cost is quadratic with linear constraints, the optimization becomes a convex quadratic program.

Write the stacked reference vector R similarly to Y, and define the tracking error E = Y − R.

Using block-diagonal weighting matrices Q̄ and R̄ across horizons, the cost can be written in quadratic form in ΔU.

Q̄ = diag(Q, Q, ..., Q) over Np steps. R̄ = diag(R, R, ..., R) over Nc steps. J(ΔU) = (F x_k + G ΔU + g_d - R)^T Q̄ (F x_k + G ΔU + g_d - R) + ΔU^T R̄ ΔU + terminal terms.

Expanding yields the standard QP objective.

min_{ΔU} (1/2) ΔU^T H ΔU + f(x_k)^T ΔU subject to A_ineq ΔU ≤ b_ineq(x_k) and possibly A_eq ΔU = b_eq(x_k).

For the common tracking plus move-suppression cost, the Hessian and gradient are.

H = 2 (G^T Q̄ G + R̄) and f(x_k) = 2 G^T Q̄ (F x_k + g_d - R).

H is positive definite if R̄ is positive definite and the formulation is well-posed, which makes the QP strictly convex and yields a unique optimizer.

6.1 How constraints become linear inequalities in ΔU.

Rate constraints are directly bounds on ΔU.

Input bounds are linear in ΔU because u_{k+i} is the cumulative sum of increments starting from the known previous input u_{k-1}.

Output bounds are linear in ΔU because Y is affine in ΔU.

Δu_min ≤ Δu_{k+i} ≤ Δu_max. u_{k+i} = u_{k-1} + Σ_{j=0..i} Δu_{k+j}. u_min ≤ u_{k+i} ≤ u_max. y_min ≤ y_{k+i|k} ≤ y_max. y_{k+i|k} is a row block of (F x_k + G ΔU + g_d).

All of these can be stacked into a single inequality matrix A_ineq and vector b_ineq(x_k) that depends affinely on the current state estimate and known signals.

Note : If the model includes integrating states or disturbance augmentation, constraints should be applied to physically meaningful variables, which may require selecting or transforming predicted variables before constraining them.

7. Sparse versus condensed QP formulations.

There are two widely used ways to form the QP in practice.

A condensed formulation eliminates the predicted states and outputs so the decision variables are only ΔU, which gives a dense Hessian but fewer variables.

A sparse formulation keeps states and sometimes outputs as decision variables with dynamic equality constraints, which increases variable count but preserves a banded sparse structure.

Modern solvers often exploit sparsity to achieve predictable computation time for long horizons and multivariable systems.

7.1 Practical selection rules for industrial MPC.

If horizons are short and the number of inputs is small, condensed QPs are often adequate and simple.

If horizons are long, the model is large, or the controller must run at high sampling rates, structure-exploiting sparse solvers are usually preferable.

Warm-starting with the shifted previous solution is a standard technique to reduce computation time in both forms.

8. Nonlinear MPC (NMPC) formulation with constraints.

When the process model is nonlinear, MPC is formulated as a constrained optimal control problem over the horizon.

The objective is similar, but predictions require nonlinear simulation, and the optimization becomes a nonlinear program.

min_{u(·)} ∫_{t_k}^{t_k+T} ℓ(x(t), u(t), r(t)) dt + V_f(x(t_k+T)) subject to ẋ(t) = f(x(t), u(t), d(t)) and g(x(t), u(t)) ≤ 0 and x(t_k) = x̂_k.

In implementation, this continuous-time problem is transcribed into a finite-dimensional nonlinear program using direct multiple shooting or direct collocation.

Constraints can be enforced at grid points, and sometimes between points using conservative tightening if necessary.

For safety-critical constraints, additional feasibility mechanisms such as constraint back-offs and robust margins are common.

Note : In NMPC, feasibility can be lost more easily due to model mismatch and local minima, so constraint softening and good initial guesses are essential for reliable closed-loop operation.

9. A step-by-step implementation template for constrained MPC.

The workflow below is a practical template that maps directly to the equations above.

At each sampling instant k. 1) Measure plant outputs y_k and measured disturbances d_k. 2) Update state estimate x̂_k using an observer or a Kalman filter. 3) Build prediction matrices F, G and known-term vector g_d, or build sparse dynamics constraints. 4) Assemble QP matrices H and f(x̂_k), and assemble inequality constraints A_ineq and b_ineq(x̂_k). 5) Solve the constrained optimization problem to obtain optimal ΔU*. 6) Apply u_k = u_{k-1} + Δu_k* to the plant. 7) Shift horizon and warm-start next step with the shifted solution.

9.1 Common engineering details that determine success.

Reference handling should be consistent with constraints, which often requires computing a feasible steady-state target and then tracking that target rather than a raw setpoint.

Offset-free control typically requires disturbance modeling or integral action via state augmentation so that steady-state errors vanish under constant disturbances.

Constraint tightening or safety margins should be used when measurement noise, estimation error, or actuator deadbands can cause unintended violations.

Move suppression and output filtering should be tuned to avoid aggressive behavior that can excite unmodeled dynamics.

FAQ

What is the exact optimization problem that linear MPC solves at each sample.

Linear MPC typically solves a convex quadratic program where the decision variables are the future input moves or input increments over a control horizon.

The objective is a quadratic function of predicted tracking errors and move sizes, and constraints are linear inequalities representing input, rate, output, and sometimes state bounds.

Why do many MPC designs use Δu as the decision variable instead of u.

Using Δu makes rate limits and smoothness penalties straightforward because they become direct bounds and quadratic penalties on the decision variables.

It also reduces steady oscillations caused by aggressive absolute input changes because the optimizer is explicitly discouraged from large moves.

How are soft constraints added without changing the overall QP structure.

Soft constraints are added by introducing nonnegative slack variables and penalizing them in the objective with large weights.

If the slack penalty is quadratic and the constraints remain linear, the problem remains a convex QP with an expanded decision vector that includes the slack variables.

What makes an MPC optimization infeasible, and how is it handled.

Infeasibility occurs when no sequence of future inputs can satisfy all constraints given the current estimated state, disturbances, and model.

Common remedies include softening selected output constraints, adding safety margins consistently, reducing horizon aggressiveness, and ensuring the state estimator and model are aligned with the real plant.

When should nonlinear MPC be used instead of linear MPC.

Nonlinear MPC is used when important process behavior is strongly nonlinear across the operating envelope and linearization leads to unacceptable prediction error.

If constraints are tight and nonlinearities determine feasibility or safety, NMPC can significantly improve performance, but it requires careful solver setup and computational resources.

: