Daniel Porumbel
We take a first step towards showing how
the constrained optimum of a function has to satisfy
the Karush-Kuhn-Tucker (KKT) conditions. We will not prove that the presented KKT
conditions are necessary in the most general sense, because we will not discuss the regularity conditions
(Footnote 3). However, towards the end of this manuscript we will show that the KKT
conditions are necessary and sufficient for a barrier linear program used in
interior point methods.
Consider differentiable functions $f,\ h,\ g_1,\ g_2:\mathbb{R}^n\to\mathbb{R}$
and the program:

$$\begin{array}{rl}
\max & f(x) \\
\text{s.t.} & h(x) = 0 \\
 & g_1(x) \le 0 \\
 & g_2(x) \le 0
\end{array}$$
Suppose this program has a constrained maximum at a point $x^*$
such that $h(x^*)=0$, $g_1(x^*)=0$ and $g_2(x^*)<0$, i.e., the first two
constraints are active at $x^*$ and the last one is not.
We will show this point satisfies the KKT conditions by first addressing the
constraints individually and then joining the respective conditions. In fact, we
will see that the KKT conditions are verified by all local maxima of $f$ in the above
program.
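To make the setting tangible, here is a minimal numerical sketch in Python/NumPy. The concrete functions $f$, $h$, $g_1$, $g_2$ below are an assumed toy instance (not taken from this manuscript), chosen so that the maximum activates the equality and the first inequality constraint while leaving the second one inactive.

```python
# Minimal numerical sketch, under an assumed concrete instance (these particular
# functions are an illustration only, not "the" program above):
#   max f(x) = x1 + x2   s.t.  h(x) = x1^2 + x2^2 - 2 = 0,
#                              g1(x) = x2 - 0.5 <= 0,  g2(x) = -x1 <= 0.
import numpy as np
from scipy.optimize import minimize

f  = lambda x: x[0] + x[1]
h  = lambda x: x[0]**2 + x[1]**2 - 2.0
g1 = lambda x: x[1] - 0.5
g2 = lambda x: -x[0]

res = minimize(lambda x: -f(x), x0=np.array([1.0, 0.0]), method="SLSQP",
               constraints=[{"type": "eq",   "fun": h},
                            {"type": "ineq", "fun": lambda x: -g1(x)},   # g1(x) <= 0
                            {"type": "ineq", "fun": lambda x: -g2(x)}])  # g2(x) <= 0
x_star = res.x
print("x* =", x_star)          # approximately [1.3229, 0.5]
print("h(x*) =", h(x_star))    # ~ 0   (equality constraint active)
print("g1(x*) =", g1(x_star))  # ~ 0   (first inequality constraint active)
print("g2(x*) =", g2(x_star))  # < 0   (second inequality constraint inactive)
```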
1) Let us first address the equality constraint individually.
Consider any (contour) curve $r(t)$ on the surface of $h$, with $r(0)=x^*$ and
$t$ ranging over a small interval around $0$.
This curve has to satisfy $h(r(t))=0$. Differentiating this with respect to $t$,
we obtain via the chain rule (and a notation abuse discussed below):

$$\nabla h(r(t))^\top\, r'(t) = 0 \qquad (1)$$

The gradient of $h$ at $x^*$
is thus perpendicular to (the tangent of) any curve$^1$
that belongs to the surface of $h$. Consider now the
function $t\mapsto f(r(t))$ and
observe that its derivative with respect to $t$ at $0$ needs to be zero, because
otherwise one could move away from $x^*$ in some direction along $r$
and increase the value of $f$.
Using the chain rule as in (1), we obtain:

$$\nabla f(x^*)^\top\, r'(0) = 0 \qquad (2)$$
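The following sketch, on the same assumed toy instance but with the equality constraint taken alone (the maximum of $f$ over the circle $h=0$ is then at $(1,1)$), numerically checks (1) along a curve staying on the surface of $h$ and (2) at that maximum.

```python
# Sketch of point 1) on the assumed instance, taking the equality constraint alone:
# the maximum of f(x) = x1 + x2 on the circle h(x) = x1^2 + x2^2 - 2 = 0 is (1, 1).
# The curve r(t) below stays on the surface of h and passes through (1, 1) at t = 0.
import numpy as np

grad_f = lambda x: np.array([1.0, 1.0])
grad_h = lambda x: np.array([2.0 * x[0], 2.0 * x[1]])

r     = lambda t: np.sqrt(2.0) * np.array([np.cos(np.pi/4 + t), np.sin(np.pi/4 + t)])
r_dot = lambda t: np.sqrt(2.0) * np.array([-np.sin(np.pi/4 + t), np.cos(np.pi/4 + t)])

for t in [-0.3, 0.0, 0.4]:
    print(grad_h(r(t)) @ r_dot(t))     # equation (1): always ~0 on the surface of h

print(grad_f(r(0.0)) @ r_dot(0.0))     # equation (2): ~0 at the constrained maximum (1, 1)
```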
2) We now address the first inequality constraint individually.
Consider as above a curve $r(t)$
such that $g_1(r(t))\le 0$,
with $r(0)=x^*$
and $r$ defined for $t\ge 0$. We stated that we consider
$g_1(x^*)=0$. The intuition is that the curve starts at the constrained
optimum $x^*$
and then it goes inside (or on the surface of) the constraint. As
such, the right derivative of $t\mapsto g_1(r(t))$
at point $0$ satisfies:
$\frac{\mathrm{d}}{\mathrm{d}t}g_1(r(t))\big|_{t=0}\le 0$.
Using the chain rule as in (1), we obtain:

$$\nabla g_1(x^*)^\top\, r'(0) \le 0 \qquad (3)$$

Since $x^*$ is a constrained maximum, the function $t\mapsto f(r(t))$
needs to be (weakly) decreasing
as we move along $r$
from $x^*$. This means that the right
derivative at $0$ satisfies
$\frac{\mathrm{d}}{\mathrm{d}t}f(r(t))\big|_{t=0}\le 0$.
Analogously to (2), we obtain:

$$\nabla f(x^*)^\top\, r'(0) \le 0 \qquad (4)$$

This shows there exists some $\mu_1\ge 0$ such that
$\nabla f(x^*)=\mu_1\nabla g_1(x^*)$, because (up to nonnegative scaling)
$\nabla g_1(x^*)$ is the only direction for $\nabla f(x^*)$ such that
(4) holds for all feasible curves $r$ satisfying (3).
Observe we need to state $\mu_1\ge 0$
because otherwise the signs of the above inequalities would be
reversed.
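A second hedged sketch, on another assumed instance with a single active inequality constraint, checks that $\nabla f(x^*)$ is a nonnegative multiple of $\nabla g_1(x^*)$ and that (3) and (4) hold for a few feasible directions.

```python
# Sketch of point 2) with the inequality constraint treated individually, under
# another assumed instance: max f(x) = -(x1-1)^2 - (x2-1)^2  s.t.  g1(x) = x1 + x2 - 1 <= 0.
# The constrained maximum is x* = (0.5, 0.5), where g1 is active.
import numpy as np

x_star  = np.array([0.5, 0.5])
grad_f  = lambda x: np.array([-2.0 * (x[0] - 1.0), -2.0 * (x[1] - 1.0)])
grad_g1 = lambda x: np.array([1.0, 1.0])

# There is mu1 >= 0 with grad f(x*) = mu1 * grad g1(x*); here mu1 = 1.
mu1 = grad_f(x_star)[0] / grad_g1(x_star)[0]
print(mu1, np.allclose(grad_f(x_star), mu1 * grad_g1(x_star)))   # 1.0 True

# Any feasible direction d (i.e., grad g1(x*)^T d <= 0, as in (3)) cannot
# increase the objective: grad f(x*)^T d <= 0, as in (4).
for d in [np.array([-1.0, 0.0]), np.array([0.0, -1.0]), np.array([1.0, -1.0])]:
    print(grad_g1(x_star) @ d <= 0, grad_f(x_star) @ d <= 1e-12)
```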
3) We now address the last inequality constraint. Since
$g_2(x^*)<0$, this constraint remains satisfied in a whole neighbourhood of
$x^*$ (by continuity), so the evolution of $f$ around $x^*$ does not depend on
$g_2$. We can ignore $g_2$, as it plays no role in ensuring that $x^*$
is a local optimum.
Let us now join the arguments of 1), 2) and 3).
The point 1) shows that any
objective function such that $\nabla f(x^*)=\lambda\nabla h(x^*)$
(with $\lambda\in\mathbb{R}$) allows
$x^*$ to be a maximum. By moving along any curve $r$ on the level
surface of $h$, we have
$\frac{\mathrm{d}}{\mathrm{d}t}f(r(t))\big|_{t=0}=\lambda\nabla h(x^*)^\top r'(0)=0$.
The point 2) shows that any objective function such that
$\nabla f(x^*)=\mu_1\nabla g_1(x^*)$ (with
$\mu_1\ge 0$) allows
$x^*$ to be a maximum.
By moving along any curve $r$
feasible with respect to $g_1$, we
have
$\frac{\mathrm{d}}{\mathrm{d}t}f(r(t))\big|_{t=0}=\mu_1\nabla g_1(x^*)^\top r'(0)\le 0$:
by going inside the constraint, the objective
function decreases.
Consider now $\nabla f(x^*)=\lambda\nabla h(x^*)+\mu_1\nabla g_1(x^*)$. By moving from
$x^*$ along any curve $r$
with $r(0)=x^*$
that satisfies all constraints, we have
$\frac{\mathrm{d}}{\mathrm{d}t}f(r(t))\big|_{t=0}=
\lambda\nabla h(x^*)^\top r'(0)+\mu_1\nabla g_1(x^*)^\top r'(0)\le 0$,
because the first term yields
$\lambda\nabla h(x^*)^\top r'(0)=0$ given that
$r$ is feasible with respect to $h$ and
the second term yields $\mu_1\nabla g_1(x^*)^\top r'(0)\le 0$
because $r$
is feasible with respect to $g_1$.
We conclude that $x^*$ can be a constrained maximum for any
objective
function $f$
whose gradient $\nabla f(x^*)$
has the form$^3$:

$$\nabla f(x^*)=\lambda\nabla h(x^*)+\mu_1\nabla g_1(x^*),\qquad
\lambda\in\mathbb{R},\ \mu_1\ge 0.$$

The joining argument can be generalized to more functions $h_i$ ($i\in\{1,\dots,p\}$)
and $g_j$ ($j\in\{1,\dots,q\}$)
and we obtain the KKT conditions:

$$\begin{array}{l}
\nabla f(x^*)=\displaystyle\sum_{i=1}^{p}\lambda_i\nabla h_i(x^*)+\sum_{j=1}^{q}\mu_j\nabla g_j(x^*)\\[2mm]
\mu_j\ge 0,\quad \mu_j\,g_j(x^*)=0\qquad\forall j\in\{1,\dots,q\}\\[2mm]
h_i(x^*)=0\quad\forall i\in\{1,\dots,p\},\qquad g_j(x^*)\le 0\quad\forall j\in\{1,\dots,q\}
\end{array}$$
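As a sanity check on the assumed toy instance from the first sketch, the multipliers can be recovered numerically: complementary slackness forces $\mu_2=0$ because $g_2$ is inactive, and $\lambda$, $\mu_1$ solve a small linear system.

```python
# Sketch: checking the KKT conditions at the (assumed-instance) maximum found
# earlier, x* ~ (1.3229, 0.5).  Since g2 is inactive, complementary slackness
# forces mu2 = 0, and we look for lambda, mu1 such that
#   grad f(x*) = lambda * grad h(x*) + mu1 * grad g1(x*),   mu1 >= 0.
import numpy as np

x_star  = np.array([np.sqrt(1.75), 0.5])
grad_f  = np.array([1.0, 1.0])
grad_h  = np.array([2.0 * x_star[0], 2.0 * x_star[1]])
grad_g1 = np.array([0.0, 1.0])

# Solve the 2x2 linear system [grad_h | grad_g1] [lambda, mu1]^T = grad_f.
lam, mu1 = np.linalg.solve(np.column_stack([grad_h, grad_g1]), grad_f)
print("lambda =", lam, " mu1 =", mu1)                      # lambda ~ 0.378, mu1 ~ 0.622 >= 0
print(np.allclose(lam * grad_h + mu1 * grad_g1, grad_f))   # stationarity holds: True
```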
The first condition is often expressed as follows: the gradient (in $x$) of the
Lagrangian
$L(x,\lambda,\mu)=f(x)-\sum_{i}\lambda_i h_i(x)-\sum_{j}\mu_j g_j(x)$
vanishes at $x^*$, i.e., $x^*$ is a stationary point of the Lagrangian.
This condition can also be found by applying Lagrangian
duality and by writing the Wolfe dual problem.
Notice that if all functions are linear, it is superfluous to evaluate the
gradient at $x^*$, because the value of the gradient is the same at all points.
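A brief sketch of this restatement on the same assumed instance: with the multipliers recovered above, the (numerical) gradient of the Lagrangian vanishes at $x^*$.

```python
# Sketch: the first KKT condition restated via the Lagrangian of the assumed
# instance used above, L(x) = f(x) - lambda*h(x) - mu1*g1(x); its gradient
# vanishes at x* (the multipliers below are the ones recovered analytically).
import numpy as np

x_star   = np.array([np.sqrt(1.75), 0.5])
lam      = 1.0 / (2.0 * x_star[0])
mu1      = 1.0 - lam

f  = lambda x: x[0] + x[1]
h  = lambda x: x[0]**2 + x[1]**2 - 2.0
g1 = lambda x: x[1] - 0.5
L  = lambda x: f(x) - lam * h(x) - mu1 * g1(x)

# Central-difference gradient of the Lagrangian at x*: both components are ~0.
eps = 1e-6
num_grad = np.array([(L(x_star + eps * e) - L(x_star - eps * e)) / (2 * eps)
                     for e in np.eye(2)])
print(num_grad)   # ~ [0, 0]
```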
We now give a full proof of the KKT conditions for the following barrier problem
used by Interior Point Methods (IPMs) for linear programming (supposing the rows
$a_1,a_2,\dots,a_m$ of $A\in\mathbb{R}^{m\times n}$ are linearly independent,
because otherwise they can be filtered). Here $c\in\mathbb{R}^n$,
$b\in\mathbb{R}^m$ and $\mu>0$ is a fixed barrier parameter; the logarithms
implicitly require $x>0$.

$$\begin{array}{rl}
\min & c^\top x-\mu\displaystyle\sum_{j=1}^{n}\ln(x_j)\\
\text{s.t.} & Ax=b
\end{array}$$
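Here is a hedged numerical sketch that solves this barrier problem for a tiny assumed LP instance ($c$, $A$, $b$ and the value of $\mu$ below are illustrative only), using a generic solver rather than an actual IPM.

```python
# Sketch: solving the barrier problem for a tiny assumed LP instance
#   min c^T x  s.t.  Ax = b, x >= 0,   with c = [1, 2, 3], A = [[1, 1, 1]], b = [1].
# For a fixed barrier parameter mu the optimum stays strictly inside x > 0.
import numpy as np
from scipy.optimize import minimize

c  = np.array([1.0, 2.0, 3.0])
A  = np.array([[1.0, 1.0, 1.0]])
b  = np.array([1.0])
mu = 0.1

barrier_obj = lambda x: c @ x - mu * np.sum(np.log(x))
res = minimize(barrier_obj, x0=np.array([1/3, 1/3, 1/3]), method="SLSQP",
               bounds=[(1e-9, None)] * 3,
               constraints=[{"type": "eq", "fun": lambda x: A @ x - b}])
print(res.x)          # all components remain strictly positive
print(A @ res.x - b)  # ~ [0]: the equality constraints hold
```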
There is no chance of finding an optimum solution $x^*$
with some $x^*_j$ close
to zero, because the term $-\mu\ln(x_j)$ blows up when its argument approaches zero.
Since the objective function is strictly convex (each $-\mu\ln(x_j)$ is strictly
convex) and the constraints are linear, this program has a unique solution $x^*$
of minimum cost. We can prove that the KKT conditions are necessary and
sufficient to certify that $x^*$ is the optimal solution. Writing
$\nabla f(x^*)=c-\mu\left[\frac{1}{x^*_1},\,\frac{1}{x^*_2},\,\dots,\,\frac{1}{x^*_n}\right]^\top$
for the gradient of the objective, the KKT conditions
discussed above actually reduce to:

$$\nabla f(x^*)=A^\top y\ \text{ for some }y\in\mathbb{R}^m,
\qquad\text{and}\qquad Ax^*=b.$$
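These two reduced conditions form a square nonlinear system in $(x^*,y)$; the sketch below solves it directly on the same assumed tiny instance and checks that both residuals vanish.

```python
# Sketch: the reduced KKT system for the same assumed instance, solved directly.
# Unknowns: x in R^3 (strictly positive) and y in R^1; the system is
#   c_j - mu / x_j = (A^T y)_j  for all j,   and   A x = b.
import numpy as np
from scipy.optimize import fsolve

c  = np.array([1.0, 2.0, 3.0])
A  = np.array([[1.0, 1.0, 1.0]])
b  = np.array([1.0])
mu = 0.1

def kkt_residual(z):
    x, y = z[:3], z[3:]
    return np.concatenate([c - mu / x - A.T @ y, A @ x - b])

z = fsolve(kkt_residual, x0=np.array([0.8, 0.1, 0.1, 0.8]))
x_star, y_star = z[:3], z[3:]
print(x_star, y_star)                 # x matches (approximately) the barrier optimum above
print(np.abs(kkt_residual(z)).max())  # ~ 0: both reduced KKT conditions hold
```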
The necessity. Take the optimal $x^*$. The second condition is
clearly necessary from the definition of $x^*$: it has to be feasible.
We prove the first condition
by contradiction. Assume for the sake of contradiction that $\nabla f(x^*)$
can not be written as a linear combination of the vectors
$a_1$, $a_2$, $\dots$, $a_m$. This means we can write
$\nabla f(x^*)=A^\top u+d$, where $d\ne 0$
belongs to the null space of
$a_1$, $a_2$, $\dots$, $a_m$, i.e.,
$a_i^\top d=0\ \forall i\in\{1,\dots,m\}$.
Let us check what happens if one moves from $x^*$
back or forward along
direction $d$. For this, it is enough to study the function
$\phi(t)=f(x^*+td)$.
Using the chain rule, we can calculate
$\phi'(0)=\nabla f(x^*)^\top d=(A^\top u+d)^\top d=d^\top d>0$. By taking a sufficiently
small
step from $x^*$
towards $-d$, the function $f$
becomes smaller, all $x_j$ remain strictly positive, and the
constraints $a_i^\top x=b_i$
($i\in\{1,\dots,m\}$) remain valid. This is a contradiction, because $x^*$ was
supposed to be optimal.
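The following sketch replays this argument numerically on the same assumed instance: at a feasible but non-optimal point, the gradient has a nonzero null-space component $d$, and a small step towards $-d$ preserves $Ax=b$ while decreasing the objective.

```python
# Sketch of the necessity argument on the same assumed instance: at a feasible
# point x that is NOT the barrier optimum, the gradient has a nonzero null-space
# component d, and a small step towards -d keeps Ax = b while decreasing f.
import numpy as np

c, A, b, mu = np.array([1.0, 2.0, 3.0]), np.array([[1.0, 1.0, 1.0]]), np.array([1.0]), 0.1
f      = lambda x: c @ x - mu * np.sum(np.log(x))
grad_f = lambda x: c - mu / x

x = np.array([1/3, 1/3, 1/3])                  # feasible (Ax = b) but not optimal
g = grad_f(x)

# Split the gradient: A^T u is its row-space part, d its null-space part.
u = np.linalg.lstsq(A.T, g, rcond=None)[0]
d = g - A.T @ u
print(np.linalg.norm(d) > 1e-8)                # True: the first KKT condition fails at x

step  = 1e-3
x_new = x - step * d
print(A @ x_new - b)                           # ~ [0]: still feasible (d is in the null space)
print(f(x_new) < f(x))                         # True: the objective strictly decreased
```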
The sufficiency. Consider a point $\bar{x}$ that satisfies
both KKT conditions, i.e.,
$\bar{x}$ is primal feasible ($A\bar{x}=b$, $\bar{x}>0$) and its gradient can be written
$\nabla f(\bar{x})=A^\top y$ for some $y\in\mathbb{R}^m$.
For the sake of contradiction, we assume there is
some feasible $x'$
such that $f(x')<f(\bar{x})$.
Let us define the Lagrangian
$L(x,y)=f(x)-y^\top(Ax-b)$. Considering the fixed $y$
indicated above, this Lagrangian function is strictly convex in
the variables $x$. Given that
$\nabla_x L(\bar{x},y)=\nabla f(\bar{x})-A^\top y=0$, we obtain that
$\bar{x}$
has to be the unique stationary point of the Lagrangian function in $x$, and
so,
$\bar{x}$ has to be the
unique minimizer of the Lagrangian.
This is a contradiction, because we have
$L(x',y)=f(x')-y^\top(Ax'-b)=f(x')<f(\bar{x})=L(\bar{x},y)$.
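A last hedged sketch on the same assumed instance: with the multipliers $y$ of a KKT point $\bar{x}$, the Lagrangian coincides with the objective on feasible points, so the value of any feasible point is at least $f(\bar{x})$.

```python
# Sketch of the sufficiency argument on the same assumed instance: L(x, y)
# coincides with f(x) whenever Ax = b, and no feasible point beats the KKT point.
import numpy as np
from scipy.optimize import fsolve

c, A, b, mu = np.array([1.0, 2.0, 3.0]), np.array([[1.0, 1.0, 1.0]]), np.array([1.0]), 0.1
f = lambda x: c @ x - mu * np.sum(np.log(x))
L = lambda x, y: f(x) - y @ (A @ x - b)

# Recover a KKT point (x_bar, y) of the barrier problem as in the previous sketch.
kkt = lambda z: np.concatenate([c - mu / z[:3] - A.T @ z[3:], A @ z[:3] - b])
z = fsolve(kkt, x0=np.array([0.8, 0.1, 0.1, 0.8]))
x_bar, y = z[:3], z[3:]

rng = np.random.default_rng(0)
for _ in range(3):
    w = rng.uniform(0.05, 1.0, size=3)
    x_prime = w / w.sum()                          # a random strictly feasible point
    print(np.isclose(L(x_prime, y), f(x_prime)),   # L equals f when Ax' = b: True
          f(x_prime) >= f(x_bar) - 1e-9)           # hence f(x') >= f(x_bar): True
```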