\input{template}
\input{macros}
\usepackage{color, graphicx}
\usepackage{amssymb, amsmath}
\usepackage{epsfig}

\begin{document}
\lecture{12} {The Simplex Algorithm III}{ B. Aditya Prakash}

In this lecture we give the stopping condition of the Simplex algorithm and prove that the algorithm is correct.

To recap, at a given extreme point $x_{0}$:

\begin{enumerate}
 \item There are matrices $A^{'}$, $b^{'}$, $A^{''}$ and $b^{''}$ (constructed from $A$ and $b$)  s.t. $A^{'}x_{0} = b^{'}$ and $A^{''}x_{0} \leq b^{''}$.
\item The directions of the neighbouring extreme points are the columns of the matrix $-{A^{'}}^{-1}$.
\end{enumerate}
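As a concrete illustration of this recap, consider a toy instance of our own (not from the lecture): the unit square $\{x : Ax \leq b\}$, whose constraint rows are $(1,0)$, $(0,1)$, $(-1,0)$, $(0,-1)$ with $b = (1,1,0,0)$. The sketch below computes the neighbour directions at the extreme point $(1,1)$; the helper `inv2` is a hypothetical name.

```python
# Toy illustration (not from the lecture): the unit square {x : Ax <= b}.
# At the extreme point x0 = (1, 1) the tight constraints are x <= 1 and
# y <= 1, so A' = [[1, 0], [0, 1]] and b' = (1, 1).

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A_tight = [[1.0, 0.0], [0.0, 1.0]]   # A' at x0 = (1, 1)
inv = inv2(A_tight)                  # A'^{-1}

# The columns of -A'^{-1} are the directions of the neighbouring extreme
# points: here (-1, 0) and (0, -1), pointing from (1, 1) towards the
# neighbours (0, 1) and (1, 0) of the square.
directions = [(-inv[0][j], -inv[1][j]) for j in range(2)]
```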


\paragraph*{Stopping Condition}
\label{para:1}
\textit{The algorithm stops at an extreme point $x_{0}$ and returns it as optimal when the cost at every neighbouring extreme point of $x_{0}$ is no greater than the cost at $x_{0}$. }
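A minimal Python sketch of this stopping rule, assuming the neighbouring extreme points are already known (the helpers and the unit-square test points are hypothetical, not part of the lecture):

```python
# Hypothetical sketch of the stopping condition: stop at x0 when no
# neighbouring extreme point has a larger cost c^T x.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def should_stop(c, x0, neighbours):
    # x0 is returned as optimal iff every neighbour's cost is <= cost at x0
    return all(dot(c, x) <= dot(c, x0) for x in neighbours)

# On the unit square with cost c = (1, 1): at (1, 1) the algorithm stops,
# while at (0, 0) both neighbours improve the cost, so it does not.
stops_at_top = should_stop((1, 1), (1, 1), [(0, 1), (1, 0)])
stops_at_origin = should_stop((1, 1), (0, 0), [(0, 1), (1, 0)])
```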


\section{Proof of Correctness}

We now prove that the Simplex algorithm is correct, i.e., when it terminates, it has indeed found the globally optimal point.

First of all, note that this is \textit{not} the same as saying that $x_{0}$ is a local maximum and hence, by a previously proved theorem, also a global maximum. We only know that the cost at $x_{0}$ is maximal compared to its \textit{neighbours}, not compared to every point in a small enough \textit{neighbourhood} around it.

\subsection{First Approach}
One approach would be to consider a small enough neighbourhood $N$ around $x_{0}$. Suppose we could somehow prove that any point $p \in N$ can be written as a convex combination of $x_{0}$ and its neighbours $x_{1}, \ldots, x_{n}$, i.e.:
\begin{equation}
 p = \sum_{i=0}^n \lambda_{i}x_{i} 
\end{equation}
\begin{equation}
\sum_{i} \lambda_{i} = 1, \qquad \lambda_{i} \geq 0
\end{equation}

Then we are clearly done, because we know that $c^{T}x_{i} \leq c^{T}x_{0}$ for all $i$. Hence,

\begin{equation}
c^{T}p = \sum_{i} \lambda_{i}c^{T}x_{i}  \leq  c^{T}x_{0}\sum_{i} \lambda_{i} = c^{T}x_{0}
\end{equation}

Thus, $x_{0}$ is a local maximum and consequently a global maximum.

\subsection{Second Approach}

We adopt a different approach for the proof. Assume that $x_{0}$, the point where the algorithm terminates, is not optimal, and that some other point $x_{opt}$ is optimal. Therefore, $c^{T}x_{opt} > c^{T}x_{0}$.

Now, as $x_{0}$ is an extreme point, $A^{'}$ has full rank, and hence so does $-{A^{'}}^{-1}$. In other words, its $n$ columns form a basis of $\mathbb{R}^{n}$. Hence the vector $x_{opt} - x_{0}$ can be written as a linear combination of these columns, i.e.:
\begin{equation}
\label{lec12:eq}
x_{opt} - x_{0} = \sum_{i} \beta_{i} (-{A^{'}}^{-1})^{(i)}
\end{equation}

where $B^{(i)}$ denotes the $i^{th}$ column of a matrix $B$. Pre-multiplying the above equation by $A^{'}$, we get
\begin{equation}
A^{'}x_{opt} - A^{'}x_{0} = \sum_{i} \beta_{i} A^{'}(-{A^{'}}^{-1})^{(i)}
\end{equation}
Note the following in the above equation:
\begin{enumerate}
 \item As $x_{opt}$ is a feasible point, $A^{'}x_{opt} \leq b^{'}$, whereas $A^{'}x_{0} = b^{'}$. Hence $A^{'}x_{opt} - A^{'}x_{0}$ is a vector with every component $\leq 0$.
\item $A^{'}(-{A^{'}}^{-1})^{(i)}$ is the $i^{th}$ column of $-I$: it has a zero at every position except the $i^{th}$, where it is $-1$.
\end{enumerate}

These two observations imply that $\beta_{j} \geq 0$ for all $j$: the $j^{th}$ component of the right-hand side is $-\beta_{j}$, and by the first observation it must be $\leq 0$.
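This step can be checked numerically on a toy instance of our own (the unit square, not from the lecture). Pre-multiplying Equation~\ref{lec12:eq} by $A^{'}$ gives $A^{'}(x_{opt} - x_{0}) = -\beta$, i.e., $\beta = b^{'} - A^{'}x_{opt}$:

```python
# Toy check (unit square): at x0 = (0, 0) the tight constraints are
# -x <= 0 and -y <= 0, so A' = [[-1, 0], [0, -1]] and b' = (0, 0).
# Take x_opt = (1, 1). Pre-multiplying the basis expansion of
# x_opt - x0 by A' gives beta = b' - A' x_opt, component-wise >= 0.

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

A_tight = [[-1.0, 0.0], [0.0, -1.0]]   # A' at x0 = (0, 0)
b_tight = [0.0, 0.0]                   # b'
x_opt = [1.0, 1.0]

beta = [b - ax for b, ax in zip(b_tight, matvec(A_tight, x_opt))]
# Each beta_j is non-negative, as the two observations imply.
```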

Now pre-multiply Equation~\ref{lec12:eq} by $c^{T}$. We get
\begin{equation}
c^{T}x_{opt} - c^{T}x_{0} = \sum_{i} \beta_{i} c^{T}(-{A^{'}}^{-1})^{(i)} 
\end{equation}

Since the $(-{A^{'}}^{-1})^{(i)}$ are the directions of the neighbours, $(-{A^{'}}^{-1})^{(i)} = \alpha_{i}(x_{i} - x_{0})$, where the $x_{i}$ are the neighbours of $x_{0}$ and $\alpha_{i} > 0$. As the algorithm stopped at $x_{0}$, $c^{T}x_{i} - c^{T}x_{0} \leq 0$ for every neighbour $x_{i}$. Combined with the fact that $\beta_{j} \geq 0$ for all $j$, this makes the R.H.S. of the above equation $\leq 0$. Hence, we get
\begin{equation}
c^{T}x_{opt} - c^{T}x_{0} \leq 0 
\end{equation}

This contradicts our assumption that $c^{T}x_{opt} > c^{T}x_{0}$. So the assumption was wrong, and $x_{0}$ is indeed an optimal point.
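To close the loop on the toy unit-square instance (again, our own example): at the point where the algorithm stops, every neighbour direction has non-positive cost gain, so the right-hand side of the pre-multiplied equation is indeed $\leq 0$ for any non-negative $\beta_{i}$.

```python
# Toy check: at x0 = (1, 1) on the unit square with c = (1, 1) the
# algorithm stops. The neighbour directions (columns of -A'^{-1}) are
# (-1, 0) and (0, -1); each has c^T d <= 0, so sum_i beta_i c^T d_i <= 0
# for any beta_i >= 0, forcing c^T x_opt <= c^T x0.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

c = (1.0, 1.0)
directions = [(-1.0, 0.0), (0.0, -1.0)]   # columns of -A'^{-1} at (1, 1)
terms = [dot(c, d) for d in directions]   # each term is <= 0
rhs_nonpositive = all(t <= 0 for t in terms)
```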
\end{document}

