\documentclass{article}
\usepackage[margin=1in]{geometry}
\usepackage{amsmath,amsthm,amssymb,tikz,circuitikz}
\usetikzlibrary{arrows,positioning}
\usetikzlibrary{arrows.meta}
\usepackage{relsize}
\newcounter{lecnum}
\usepackage{graphicx}
\usepackage{MnSymbol}
\graphicspath{{./}}
\usepackage{caption}
\usepackage{subcaption}
\usepackage{parskip}
\setlength\parindent{0pt}
\newcommand{\abs}[1]{\lvert #1 \rvert}
\newcommand{\lecture}[4]{
   \newpage
   \setcounter{lecnum}{#1}
   \noindent

   \begin{center}
   \framebox{
      \vbox{\vspace{2mm}
    \hbox to 16cm { {\bf CS760 Topics in Computational Complexity
                        \hfill 2024-25 Sem I} }
       \vspace{4mm}
       \hbox to 16cm { {\Large \hfill Lecture #1: #2  \hfill} }
       \vspace{2mm}
       \hbox to 16cm { {\it Scribe: #4  \hfill  Lecturer: #3} }
      \vspace{2mm}}
   }
   \end{center}
   \vspace*{4mm}
}

\usepackage{amsmath,amsfonts,graphicx}
\usepackage{textcomp} % for \textquotesingle
\usepackage{listings}
\usepackage{tikz}
\usetikzlibrary{calc}
\usepackage{enumitem}
\usepackage{xcolor}
\usetikzlibrary{positioning}
%New colors defined below
\definecolor{codegreen}{rgb}{0,0.6,0}
\definecolor{codegray}{rgb}{0.5,0.5,0.5}
\definecolor{codepurple}{rgb}{0.58,0,0.82}
\definecolor{backcolour}{rgb}{0.95,0.95,0.92}

%Code listing style named "mystyle"
\lstdefinestyle{mystyle}{
  %backgroundcolor=\color{backcolour}, commentstyle=\color{codegreen},
  keywordstyle=\color{magenta},
  numberstyle=\color{codegray},
  stringstyle=\color{codepurple},
  basicstyle=\ttfamily\large,
  breakatwhitespace=false,         
  breaklines=true,                 
  captionpos=b,                    
  keepspaces=true,                 
  numbers=left,                    
  numbersep=5pt,                  
  showspaces=false,                
  showstringspaces=false,
  showtabs=false,                  
  tabsize=2
}

%"mystyle" code listing set
\lstset{style=mystyle}


\newtheorem{theorem}{Theorem}[lecnum]
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem{proposition}[theorem]{Proposition}
\newtheorem{claim}[theorem]{Claim}
\newtheorem{corollary}[theorem]{Corollary}
\newtheorem{definition}[theorem]{Definition}

% custom
\usepackage{enumitem}
\usepackage{hyperref}
\usepackage{cleveref}
\usepackage{commath}
\newcommand{\N}{\mathbb{N}}
\newcommand{\R}{\mathbb{R}}
\newcommand{\F}{\mathbb{F}}
\newcommand{\E}{\mathbb{E}}
\newcommand{\restr}[2]{\ensuremath{\left.#1\right|_{#2}}}
\newcommand{\aw}{\operatorname{aw}}

\begin{document}

\lecture{15}{27-09-2024}{Rohit Gurjar}{Rishabh RP, Anand Narasimhan}

\section{Proof that Determinant is in ${\textbf{NC}}^\textbf{2}$}

In the previous class, we obtained a formula for the determinant by representing permutations as cycle covers.

\[ \text{Det}(M) = \sum_{C \text{ is a cycle cover}} {(-1)}^{n + \text{\#cycles in } C} \prod_{(i,j) \in C} M_{ij} \]
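
As a sanity check of this formula (a small worked instance added for illustration, not part of the lecture), take $n = 2$. There are exactly two cycle covers: the two self-loops $\{(1,1),(2,2)\}$, with $2$ cycles, and the single $2$-cycle $\{(1,2),(2,1)\}$, with $1$ cycle. The formula then reads
\[ \text{Det}(M) = {(-1)}^{2+2} M_{11}M_{22} + {(-1)}^{2+1} M_{12}M_{21} = M_{11}M_{22} - M_{12}M_{21}, \]
which is the familiar $2 \times 2$ determinant.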

The problem with this formula is that it is not clear how to compute the summation efficiently, since it ranges over exponentially many cycle covers.
What we'll do now is add more terms to the summation in a clever way: the new terms will not change its value, but they will make it easier to compute.

\subsection{Clows and Clow Sequences}

A clow is a closed walk in which both edges and vertices can repeat. Note that every cycle is a clow. We are going to extend the set of cycle covers to the set of clow sequences.

We call $\mathfrak{C} = (C_1, C_2, \dots, C_k)$ (where $C_1, C_2, \dots, C_k$ are clows) a clow sequence if it satisfies the following properties:
\begin{itemize}
    \item Every clow $C_i$ has a head $h(C_i)$, which by convention is written as its first vertex. The head $h(C_i)$ is the minimum vertex of $C_i$.
    \item $h(C_1) < h(C_2) < \dots < h(C_k)$.
    \item In any clow $C_i$, the head can't repeat (except at the end of the clow).  
    That is, if the vertex sequence of the clow is $(v_1, v_2, v_3, \dots, v_\ell)$, then for any $1 < j \leq \ell$, $v_1\neq v_j$ (the last edge of the clow is $(v_\ell,v_1)$).
\end{itemize}

These restrictions are placed so that we don't double count clow sequences. The second property fixes the order of the clows, so that swapping two clows does not give a new clow sequence. The head is defined so that cyclically shifting the vertices does not give a new clow: every clow is rooted at a unique vertex. For example, we want $(1, 2, 3)$, $(2, 3, 1)$ and $(3, 1, 2)$ to all be considered the same clow, with $1$ as the head.

Note that every cycle cover is represented by a unique clow sequence. The cycles are uniquely ordered because of the second property, and every cycle is uniquely rooted at the minimum node.

We can represent every clow sequence as a list of tuples, where the minimum element of every sequence is placed at the beginning, and the beginning elements are in ascending order. We don't need to place the head at the end of every tuple as it is redundant.

\begin{figure}[ht]
\centering
\begin{tikzpicture}
    \tikzset{vertex/.style = {shape=circle,draw,minimum size=1.5em}}
    \tikzset{edge/.style = {->,> = latex'}}
    % vertices    
    \node[vertex] (1a) at (0,1) {1};
    \node[vertex] (2a) at (1,0) {2};
    \node[vertex] (3a) at (0,-1) {3};
    \node[vertex] (4a) at (-1,0) {4};
    \node[vertex] (5a) at (3,0) {5}; 
    %edges
    \draw[edge] (1a) to (2a);
    \draw[edge] (2a) to (3a);
    \draw[edge] (3a) to (4a);
    \draw[edge] (4a) to (1a);
    \draw[edge] (5a) to [loop right] ();
\end{tikzpicture}
\caption*{The clow sequence can be written as $[(1, 2, 3, 4), (5)]$}
\end{figure}

We consider the size of a clow with vertex sequence $(v_1, v_2, v_3, \dots, v_\ell)$ as $\ell$, that is, the number of edges in the clow.
The size of a clow sequence will be the sum of the sizes of the clows in the sequence. 
Let us denote the size of a clow sequence $\mathfrak{C}$ by $|\mathfrak{C}|$.
Now we claim that if we extend our determinant formula to include terms of all clow sequences, not just cycle covers, the summation remains the same!

\subsection{Involution which proves Validity of Clow Sequence Summation}

We claim the following formula to be true: 

\[ \text{Det}(M) = \sum_{|\mathfrak{C}| = n} {(-1)}^{n + \text{\#clows in } \mathfrak{C}} \prod_{(i,j) \in \mathfrak{C}} M_{ij} \]

We have introduced extra terms to our summation, but all the extra terms ``cancel out''. We shall prove this rigorously by finding a pairing function (an involution) under which the term of each clow sequence that is not a cycle cover cancels with the term of the clow sequence it is mapped to. For cycle covers, our function will just be the identity mapping.
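
To see the cancellation in a small case (an illustrative instance of our own, assuming $n = 3$), consider the clow sequences $\mathfrak{C}_1 = [(1,2,2)]$ and $\mathfrak{C}_2 = [(1,2),(2)]$. Neither is a cycle cover, since vertex $2$ is visited twice and vertex $3$ is never visited. Both have size $3$ and use the same multiset of edges $\{(1,2),(2,2),(2,1)\}$, but $\mathfrak{C}_1$ has one clow and $\mathfrak{C}_2$ has two, so their terms are
\begin{align*}
\mathfrak{C}_1 &: \quad {(-1)}^{3+1}\, M_{12}M_{22}M_{21} = +M_{12}M_{22}M_{21}, \\
\mathfrak{C}_2 &: \quad {(-1)}^{3+2}\, M_{12}M_{21}M_{22} = -M_{12}M_{21}M_{22},
\end{align*}
and they add up to zero.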

So we are looking for a function with the following properties: \\ 
$\varphi:$ set of clow sequences $\rightarrow$ set of clow sequences
\begin{itemize}
    \item $\varphi(\varphi(\mathfrak{C})) = \mathfrak{C}$
    \item $\prod_{(i,j) \in \mathfrak{C}} M_{ij} = \prod_{(i,j) \in \varphi(\mathfrak{C})} M_{ij}$ (in fact, our involution will preserve the multiset of edges)
    \item If $\mathfrak{C}$ is a cycle cover then $\varphi(\mathfrak{C}) = \mathfrak{C}$, else $\text{sgn}(\varphi(\mathfrak{C})) = -\text{sgn}(\mathfrak{C})$ (where $\text{sgn}(\mathfrak{C}) = {(-1)}^{n + \text{\#clows in } \mathfrak{C}}$)
\end{itemize}

Since we want our involution to preserve the edges but swap the parity of number of clows, our involution is going to merge/split clows in some form.
Consider the following example. 
\begin{figure}[ht]
    \centering
    \begin{tikzpicture}
        \tikzset{vertex/.style = {shape=circle,draw,minimum size=1.5em}}
        \tikzset{edge/.style = {->,> = latex'}}
        
        % vertices    
        \node[vertex] (1a) at (0,-0.5) {1};
        \node[vertex] (2a) at (1,-0.5) {2};
        \node[vertex] (3a) at (0,-2.5) {3};
        \node[vertex] (4a) at (0.5,-1.5) {4};
        \node[vertex] (5a) at (1,-2.5) {5};
        
        \node[vertex] (1b) at (4,0) {1};
        \node[vertex] (2b) at (5,0) {2};
        \node[vertex] (3b) at (4,-3) {3};
        \node[vertex] (4b) at (4.5,-1) {4};
        \node[vertex] (4c) at (4.5,-2) {4};
        \node[vertex] (5b) at (5,-3) {5};

        % edges
        \draw[red,edge] (1a) to (2a);
        \draw[red,edge] (2a) to (4a);
        \draw[red,edge] (4a) to (1a);
        \draw[blue,edge] (4a) to (3a);
        \draw[blue,edge] (3a) to (5a);
        \draw[blue,edge] (5a) to (4a);

        \draw[red,edge] (1b) to (2b);
        \draw[red,edge] (2b) to (4b);
        \draw[red,edge] (4b) to (1b);
        \draw[blue,edge] (4c) to (3b);
        \draw[blue,edge] (3b) to (5b);
        \draw[blue,edge] (5b) to (4c);

        \draw[edge] (1.5,-1.4) to [bend left] node[midway, above] {$\varphi$} (3.5,-1.4);
        \draw[edge] (3.5,-1.6) to [bend left] node[midway, above] {$\varphi$} (1.5,-1.6);
    
        
    \end{tikzpicture}
    \caption*{$\varphi([(1,2,4,3,5,4)]) = [(1,2,4), (3,5,4)]$ and  $\varphi([(1,2,4), (3,5,4)]) = [(1,2,4,3,5,4)]$}
    \end{figure}
    \pagebreak

Below is how the involution algorithm works. Let $\mathfrak{C} = (C_1, C_2, \dots, C_k)$. \footnote{In the following pseudocode we use bold to represent a sequence of vertices. Also, for cycle covers we return the clow sequence unchanged.}

\par\noindent\rule{\textwidth}{0.4pt}
\begin{lstlisting}[mathescape]
for $i$ going from $k$ to $1$:
    for vertex $v \in C_i$ (in order they appear in $C_i$): 
        if $v \in C_j$ for some $j > i$:          (Merge Case)
            Let $C_i = \mathbf{s_i}.v.\mathbf{e_i}$ (for smallest possible $\mathbf{s_i}$) and $C_j = \mathbf{s_j}.v.\mathbf{e_j}$
            Replace $C_i$ with $\mathbf{s_i}.v.\mathbf{e_j}.\mathbf{s_j}.v.\mathbf{e_i}$ 
            Delete $C_j$
            Return the updated clow sequence.
        else if $v$ already appeared in $C_i$:         (Split Case)
            Let $C_i = \mathbf{s_i}.v.\mathbf{m_i}.v.\mathbf{e_i}$ (for smallest possible $\mathbf{s_i}, \mathbf{m_i}$)
            Replace $C_i$ with $\mathbf{s_i}.v.\mathbf{e_i}$
            Insert new clow $C'_i = v.\mathbf{m_i}$ at an appropriate position
            in the clow sequence, with an appropriate head.
            Return the updated clow sequence.
Return the clow sequence.
\end{lstlisting}
\par\noindent\rule{\textwidth}{0.4pt}
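
As a hand trace of the pseudocode above (an example of our own, chosen only for illustration), let $\mathfrak{C} = [(1,2,3),(2,4),(3,5)]$, so $k = 3$.
\begin{itemize}
    \item $i = 3$: $C_3 = (3,5)$ has no repeated vertex and there is no later clow, so nothing happens.
    \item $i = 2$: $C_2 = (2,4)$ has no repeated vertex and shares no vertex with $C_3$, so nothing happens.
    \item $i = 1$: scanning $C_1 = (1,2,3)$, vertex $1$ triggers nothing, but $v = 2$ lies in $C_2$. This is the Merge Case with $\mathbf{s_1} = (1)$, $\mathbf{e_1} = (3)$, $\mathbf{s_2} = ()$ and $\mathbf{e_2} = (4)$, so $C_1$ becomes $(1,2,4,2,3)$ and $C_2$ is deleted.
\end{itemize}
Hence $\varphi(\mathfrak{C}) = [(1,2,4,2,3),(3,5)]$. Running the algorithm on this output, the scan of $(1,2,4,2,3)$ first stops at the second occurrence of $2$ (Split Case, with $\mathbf{s_1} = (1)$, $\mathbf{m_1} = (4)$ and $\mathbf{e_1} = (3)$), which produces $(1,2,3)$ and the new clow $(2,4)$, giving back $\mathfrak{C}$. The multiset of edges $\{(1,2),(2,3),(3,1),(2,4),(4,2),(3,5),(5,3)\}$ is the same throughout, while the number of clows changes from $3$ to $2$ and back.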


\begin{proposition}
\label{prop:disjoint}
    If the algorithm returns from clow $C_i$, then $C_{i+1}, C_{i+2}, \dots, C_k$ are disjoint cycles.
\end{proposition}
\begin{proof}
    The fact that the Merge Case never happened before reaching $C_i$ means that all the following clows are disjoint. The fact that the Split Case also never happened means that all the following clows are cycles.
\end{proof}

\begin{proposition}
    For any $v$, Merge Case and Split Case can't happen simultaneously in the algorithm.
\end{proposition}
\begin{proof}
    Suppose it did happen, i.e., $v$ appeared previously in $C_i$ and also appears in some later $C_j$, where $j > i$. But then the Merge Case should already have happened at the previous occurrence of $v$, which is a contradiction.
\end{proof}

\begin{proposition}
    When Merge Case happens, $C_j$ is unique and $v$ occurs only once in $C_j$.
\end{proposition}
\begin{proof}
    This follows from Proposition \ref{prop:disjoint}, that is, from the fact that all clows after $C_i$ are disjoint cycles.
\end{proof}

Now we shall prove that $\varphi$ is indeed an involution, i.e., $\varphi(\varphi(\mathfrak{C})) = \mathfrak{C}$.

\textbf{Case 1}: Merge happened when applying $\varphi$ on $\mathfrak{C}$ (say the merge happened at $C_i$). Now when running the algorithm on $\varphi(\mathfrak{C})$, neither Merge nor Split will happen before reaching $C_i$, as $C_{i+1}, \dots, C_k$ are disjoint cycles and the clows after $C_i$ in $\varphi(\mathfrak{C})$ are the same except that $C_j$ is deleted. While traversing $C_i$ of $\varphi(\mathfrak{C})$, neither Merge nor Split will happen until we reach the second occurrence of $v$.

Neither Merge nor Split will happen at $\mathbf{s_i}.v$: if it did, it would also have happened when we applied $\varphi$ on $\mathfrak{C}$. The Merge Case won't happen at $\mathbf{e_j}.\mathbf{s_j}$, as $C_j$, which was deleted, is disjoint from all the other clows after $C_i$. The Split Case won't happen there either, as $C_j$ was a cycle with no repeated vertices, and if some vertex in $\mathbf{s_i}$ appeared in $C_j$, then $C_j$ would have been merged before reaching $v$.

Once we reach the second occurrence of $v$, the Split Case will happen and the clow will split exactly the way it was merged.

\textbf{Case 2}: Split happened when applying $\varphi$ on $\mathfrak{C}$ (say the split happened at $C_i$). Note that the new clow $C'_i$ will be inserted after $C_i$, as all the vertices in $C'_i$ are greater than the head of $C_i$. We can also say that $C'_i$ is disjoint from $C_{i+1}, \dots, C_k$, as otherwise a Merge would have happened first when $\varphi$ was applied to $\mathfrak{C}$. It is also true that $C'_i$ is a cycle, as otherwise we would have had a Split before seeing the second occurrence of $v$. Since $C'_i$ together with $C_{i+1}, \dots, C_k$ are all disjoint cycles, $\varphi$ applied to $\varphi(\mathfrak{C})$ reaches a Merge/Split case only at $C_i$.

No Split Case happens at $\mathbf{s_i}$, else it would have happened when applying $\varphi$ on $\mathfrak{C}$. A Merge Case at $\mathbf{s_i}$ cannot happen with any of the clows other than $C'_i$, else it would have happened when applying $\varphi$ on $\mathfrak{C}$. It cannot happen with $C'_i$ either, as $\mathbf{s_i}, v, \mathbf{m_i}$ are disjoint (else a Split Case would have happened earlier than $v$ when applying $\varphi$ to $\mathfrak{C}$). So the first event is a Merge of $C_i$ with $C'_i$ at $v$.

This merge happens exactly the way the clow was split, so we get back $\mathfrak{C}$.

Now that we have proven that $\varphi$ is an involution, it remains to check the other properties. It preserves the multiset of edges, so the product term is preserved. For a clow sequence that is not a cycle cover, it changes the number of clows by exactly $1$, so the sign reverses.

\subsection{Computing the Summation using Dynamic Programming}

Now that we have a formula to get the determinant, we need an efficient way to sum over all clow sequences. 

Define $P(h, t, \ell)$ to be the sum of the terms\footnote{The term corresponding to a walk (where edges and vertices can repeat) is the product of the $M_{ij}$'s over the edges $(i,j)$ of the walk.} of all walks of length $\ell$ that start at $h$, end at $t$, and whose vertices other than the start are all greater than the head $h$. Every such walk to $t$ of length $\ell$ is a walk to some vertex $i$ of length $\ell-1$, followed by the edge from $i$ to $t$. So the terms of the shorter walks need to be multiplied by $M_{it}$ and summed.

\[ P(h, t, \ell) = \sum_{i=h+1}^{n} P(h, i, \ell-1)M_{it} .\]

The base cases are $P(h,t,1) = M_{ht}$ if $t > h$, and $0$ otherwise.

Now a clow with head $h$ of length $\ell$ is nothing but such a walk from $h$ to some vertex $i$ of length $\ell-1$, followed by an edge back to $h$. Let $T(h,\ell)$ denote the sum of the terms of all clows with head $h$ and length $\ell$.

\[ T(h,\ell) = \sum_{i=h+1}^{n} P(h,i,\ell-1) M_{ih}. \]

Our base case is $T(h,1) = M_{hh}$.

Now let $S(h,\ell)$ denote the signed sum of the terms of all clow sequences of size $\ell$
in which the head of the first clow is at least $h$, where each clow contributes a factor of $-1$ to the sign. We can case split based on
whether there is a clow with head $h$ or not, and if so, case split on its size.

\[ S(h,\ell) = S(h+1,\ell) - \sum_{i=1}^{\ell-1} T(h,i) S(h+1,\ell-i) - T(h,\ell). \]

Our base cases are $S(n,\ell) = -T(n,\ell)$: if the smallest head has to be at least $n$, the sequence can contain only one clow, and the negative sign comes from the parity of the number of clows.
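
As a check of the three recurrences (a worked instance for $n = 2$, added here for illustration), the only walk sum needed is $P(1,2,1) = M_{12}$, and
\begin{align*}
T(1,1) &= M_{11}, & T(1,2) &= P(1,2,1)\,M_{21} = M_{12}M_{21}, \\
T(2,1) &= M_{22}, & T(2,2) &= 0 \quad \text{(empty sum)}, \\
S(2,1) &= -T(2,1) = -M_{22}, & S(2,2) &= -T(2,2) = 0 .
\end{align*}
The recurrence for $S$ then gives
\[ S(1,2) = S(2,2) - T(1,1)\,S(2,1) - T(1,2) = M_{11}M_{22} - M_{12}M_{21}, \]
and multiplying by the remaining sign factor ${(-1)}^{n} = 1$ recovers $\text{Det}(M)$.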

\subsection{Computing Dynamic Programming as Iterated Matrix multiplication}

We now want to carry out this dynamic program using a circuit. Since we already know that matrix multiplication is in ${NC}^1$, we shall try to write everything as matrix multiplication.
The first DP can be written as follows:

\begin{equation*}
\begin{aligned}
\left[\begin{array}{c}
P(h, h+1, \ell) \\
P(h, h+2, \ell) \\
\vdots \\
P(h, n, \ell)
\end{array}\right]
=
\left[\begin{array}{cccc}
M_{h+1,h+1} & M_{h+2,h+1} & \dots & M_{n,h+1} \\
M_{h+1,h+2} & M_{h+2,h+2} & \dots & M_{n,h+2} \\
\vdots & \vdots & \ddots & \vdots \\
M_{h+1,n} & M_{h+2,n} & \dots & M_{n,n}
\end{array}\right]
\left[\begin{array}{c}
P(h, h+1, \ell-1) \\
P(h,h+2, \ell-1) \\
\vdots \\
P(h, n, \ell-1)
\end{array}\right]
\end{aligned}
\end{equation*}

So in fact, to get all terms of the form $P(h,*,\ell)$ from the base case $P(h,*,1)$, all we have to do is multiply the base case vector by this matrix $\ell-1$ times. This iterated matrix multiplication is in ${NC}^2$, as matrix multiplication is in ${NC}^1$ and iterated multiplication can be done by divide and conquer.
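
In symbols (the names $A_h$ and $\mathbf{p}_h(\ell)$ are our own shorthand, not notation from the lecture), write $A_h$ for the square matrix above and $\mathbf{p}_h(\ell)$ for the column vector ${\bigl(P(h,h+1,\ell), \dots, P(h,n,\ell)\bigr)}^{T}$. Unrolling the recurrence gives
\[ \mathbf{p}_h(\ell) = A_h\,\mathbf{p}_h(\ell-1) = A_h^{2}\,\mathbf{p}_h(\ell-2) = \dots = A_h^{\ell-1}\,\mathbf{p}_h(1). \]
The powers can be obtained by repeated squaring, using $A_h^{2m} = {(A_h^{m})}^{2}$ (and one extra multiplication by $A_h$ for odd exponents). Since $\ell \leq n$, this takes $\mathcal{O}(\log n)$ rounds of matrix products, each of depth $\mathcal{O}(\log n)$, for a total depth of $\mathcal{O}(\log^2 n)$.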

Similarly, the second DP terms can be found by a simple dot product from the terms calculated by the first DP, hence this part adds only a logarithmic depth.

The third DP can be done just like the first one, but we augment the vectors a bit in order to take care of the two terms $S(h+1, \ell)$ and $T(h,\ell)$.

\begin{equation*}
\begin{aligned}
\left[\begin{array}{c}
1 \\
S(h,1) \\
\vdots \\
S(h,n-1) \\
S(h,n)
\end{array}\right]
=
\left[\begin{array}{ccccc}
1 & 0 & \dots & 0 & 0 \\
-T(h,1) & 1 & \dots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
-T(h,n-1) & -T(h,n-2) & \dots & 1 & 0 \\
-T(h,n) & -T(h,n-1) & \dots & -T(h,1) & 1
\end{array}\right]
\left[\begin{array}{c}
1 \\
S(h+1,1) \\
\vdots \\
S(h+1,n-1) \\
S(h+1,n)

\end{array}\right]
\end{aligned}
\end{equation*}
    
The final answer we require is ${(-1)}^{n} S(1,n)$, which can be found by applying the matrices for $h = n-1, n-2, \dots, 1$ (there are $n-1$ of them, one for each $h$) in turn to the base case vector, whose entries are of the form $S(n,*)$.
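
Written out (with $B_h$ and $\mathbf{s}_h$ as our own shorthand for the lower triangular matrix above and the vector ${\bigl(1, S(h,1), \dots, S(h,n)\bigr)}^{T}$), this is the product
\[ \mathbf{s}_1 = B_1 B_2 \cdots B_{n-1}\,\mathbf{s}_n . \]
Unlike the first DP, the factors here are different matrices, but the same divide and conquer idea applies: multiply adjacent pairs in parallel and repeat. For example, with $n = 5$ we would compute
\[ \mathbf{s}_1 = \bigl((B_1 B_2)(B_3 B_4)\bigr)\,\mathbf{s}_5, \]
which uses two levels of matrix products followed by one matrix-vector product. In general there are $\mathcal{O}(\log n)$ levels.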

Overall, the depth of the circuit we need to compute this DP is $\mathcal{O} (\log^2 n) + \mathcal{O} (\log n) + \mathcal{O} (\log^2 n) = \mathcal{O} (\log^2 n)$,
so the determinant is in ${NC}^2$.

\section{Parity is not in AC$^0$}

This is a very brief overview of the next topic, which is to find and prove lower bounds on AC$^0$ circuits. The problem we focus on here is the \emph{Parity} problem.

We look at an important and non-trivial lemma, which is essential in proving that Parity is not in AC$^0$ (unbounded fan-in, constant-depth, polynomial-size circuits).

\subsection{Håstad's Switching Lemma}

This essentially says that a DNF of \emph{small width} can be converted into a CNF (with high probability), also of small width, under a random restriction of some of the variables in the formula (vice-versa also holds).

Small width here means that the number of literals in each clause is small, of order $\mathcal{O}(\log n)$.

What is a \emph{random restriction}? Given a constant $0 \leq \rho \leq 1$, a random restriction $R_{\rho}$ chooses a random subset of $\rho n$ of the variables, where $n$ is the total number of variables in the formula. These $\rho n$ variables are kept as variables, whereas each of the remaining variables is set to $0$ or $1$, each with probability $0.5$.
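
As a small illustration (the formula $f$ and the particular outcome of the restriction are our own example, with $n = 4$ and $\rho = 1/2$), take the width-$2$ DNF
\[ f = (x_1 \wedge x_2) \vee (\neg x_2 \wedge x_3) \vee (x_3 \wedge x_4), \]
and suppose the restriction keeps $x_1, x_4$ as variables and sets $x_2 = 1$ and $x_3 = 1$. Then
\[ \restr{f}{R_\rho} = (x_1 \wedge 1) \vee (0 \wedge 1) \vee (1 \wedge x_4) = x_1 \vee x_4, \]
which is a single clause, that is, a CNF of width $2$.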

The following figure describes the switching lemma.

\begin{figure}[h]
\centering
\begin{subfigure}[b]{0.3\textwidth}
\begin{tikzpicture}
\begin{scope}[every node/.style={circle,thick,draw}]
    \node (A1) at (0,0) {$x_1$};
    \node (B1) at (1,0) {$x_2$};
    \node (C1) at (2,0) {$x_3$};
    \node (D1) at (4,0) {$x_n$};
\end{scope}

\begin{scope}[every node/.style={circle,thick,draw,minimum width = 0.5cm}]
\node (A2) at (0.5,2) {$\bigwedge$};
\node (B2) at (1.5,2) {$\bigwedge$};
\node (C2) at (3.5,2) {$\bigwedge$};
\node (A3) at (2,4) {$\bigvee$};
\end{scope}

\path (C1) -- node[auto=false]{ \bf \ldots} (D1);
\path (B2) -- node[auto=false]{ \bf \ldots} (C2);


\begin{scope}[>={Stealth[black]},
              every node/.style={fill=white,circle},
              every edge/.style={draw=red,very thick}]
    \path [->] (A1) edge (A2);
    \path [->] (B1) edge (A2);
    \path [->] (B1) edge (B2);
    \path [->] (C1) edge (B2);
    \path [->] (D1) edge (B2);
    \path [->] (C1) edge (C2);
    \path [->] (D1) edge (C2);
    \path [->] (A2) edge (A3);
    \path [->] (B2) edge (A3);
    \path [->] (C2) edge (A3);
    
    
    
    % \path [->] (B) edge[bend right=60] node {$1$} (E); 
\end{scope}
\end{tikzpicture}
\caption{A small-width DNF before random restriction}
\end{subfigure}
\hfill
\begin{subfigure}[b]{0.3\textwidth}
\begin{tikzpicture}
\begin{scope}[every node/.style={circle,thick,draw}]
    \node (A1) at (0,0) {$x_1^{*}$};
    \node (B1) at (1,0) {0};
    \node (C1) at (2,0) {1};
    \node (D1) at (4,0) {$x_n^{*}$};
\end{scope}

\begin{scope}[every node/.style={circle,thick,draw,minimum width = 0.5cm}]
\node (A2) at (0.5,2) {$\bigwedge$};
\node (B2) at (1.5,2) {$\bigwedge$};
\node (C2) at (3.5,2) {$\bigwedge$};
\node (A3) at (2,4) {$\bigvee$};
\end{scope}

\path (C1) -- node[auto=false]{ \bf \ldots} (D1);
\path (B2) -- node[auto=false]{ \bf \ldots} (C2);


\begin{scope}[>={Stealth[black]},
              every node/.style={fill=white,circle},
              every edge/.style={draw=red,very thick}]
    \path [->] (A1) edge (A2);
    \path [->] (B1) edge (A2);
    \path [->] (B1) edge (B2);
    \path [->] (C1) edge (B2);
    \path [->] (D1) edge (B2);
    \path [->] (C1) edge (C2);
    \path [->] (D1) edge (C2);
    \path [->] (A2) edge (A3);
    \path [->] (B2) edge (A3);
    \path [->] (C2) edge (A3);
    
    
    
    % \path [->] (B) edge[bend right=60] node {$1$} (E); 
\end{scope}
\end{tikzpicture}
\caption{The DNF after random restriction}
\end{subfigure}
\hfill
\begin{subfigure}[b]{0.3\textwidth}
\begin{tikzpicture}
\begin{scope}[every node/.style={circle,thick,draw}]
    \node (A1) at (0,0) {$x_1^{*}$};
    \node (B1) at (1,0) {0};
    \node (C1) at (2,0) {1};
    \node (D1) at (4,0) {$x_n^{*}$};
\end{scope}

\begin{scope}[every node/.style={circle,thick,draw,minimum width = 0.5cm}]
\node (A2) at (0.5,2) {$\bigvee$};
\node (B2) at (1.5,2) {$\bigvee$};
\node (C2) at (3.5,2) {$\bigvee$};
\node (A3) at (2,4) {$\bigwedge$};
\end{scope}

\path (C1) -- node[auto=false]{ \bf \ldots} (D1);
\path (B2) -- node[auto=false]{ \bf \ldots} (C2);


\begin{scope}[>={Stealth[black]},
              every node/.style={fill=white,circle},
              every edge/.style={draw=red,very thick}]
    \path [->] (A1) edge (C2);
    \path [->] (A1) edge (A2);
    \path [->] (A1) edge (B2);
    \path [->] (B1) edge (A2);
    \path [->] (B1) edge (B2);
    \path [->] (C1) edge (B2);
    \path [->] (D1) edge (B2);
    \path [->] (C1) edge (C2);
    \path [->] (D1) edge (C2);
    \path [->] (D1) edge (A2);
    \path [->] (A2) edge (A3);
    \path [->] (B2) edge (A3);
    \path [->] (C2) edge (A3);
    
    
    
    % \path [->] (B) edge[bend right=60] node {$1$} (E); 
\end{scope}
\end{tikzpicture}
\caption{A small width equivalent CNF exists with high probability}
\end{subfigure}
\end{figure}

The main theorem to prove is:

\begin{theorem}
    Assume that $n \geq 2^{\mathcal{O}(k)}$ and $k \ll \log (n)$. Computing the parity of $n$ bits using depth-$k$ circuits requires size at least $2^{\Omega (n^{{1}/{(k-1)}})}$.
\end{theorem}

\begin{figure}[h!]
\centering
\begin{tikzpicture}
\begin{scope}[every node/.style={circle,thick,draw}]
    \node (A1) at (0,0) {$x_1$};
    \node (B1) at (2,0) {$x_2$};
    \node (C1) at (4,0) {$x_3$};
    \node (D1) at (6,0) {$x_4$};
    \node (E1) at (10,0) {$x_n$};
\end{scope}

\begin{scope}[every node/.style={circle,thick,draw,minimum width = 1cm}]
\node (A2) at (1,2) {$\bigwedge$};
\node (B2) at (4,2) {$\bigwedge$};
\node (C2) at (9,2) {$\bigwedge$};
\node (A3) at (2,4) {$\bigwedge$};
\node (B3) at (8,4) {$\bigwedge$};
\end{scope}

\begin{scope}[every node/.style={circle,thick,draw=white}]
\node (A4) at (5,6) {};
\end{scope}

\path (D1) -- node[auto=false]{\Huge \bf \ldots} (E1);
\path (B2) -- node[auto=false]{\Huge \bf \ldots} (C2);
\path (A3) -- node[auto=false]{\Huge \bf \ldots} (B3);
\path (A3) -- node[auto=false]{\Huge \bf $\udots$} (A4);
\path (B3) -- node[auto=false]{\Huge \bf $\ddots$} (A4);


\begin{scope}[>={Stealth[black]},
              every node/.style={fill=white,circle},
              every edge/.style={draw=red,very thick}]
    \path [->] (A1) edge (A2);
    \path [->] (B1) edge (A2);
    \path [->] (B1) edge (B2);
    \path [->] (C1) edge (B2);
    \path [->] (D1) edge (B2);
    \path [->] (C1) edge (C2);
    \path [->] (D1) edge (C2);
    \path [->] (E1) edge (C2);
    \path [->] (A2) edge (A3);
    \path [->] (B2) edge (A3);
    \path [->] (B2) edge (B3);
    \path [->] (C2) edge (B3);
    
    
    
    % \path [->] (B) edge[bend right=60] node {$1$} (E); 
\end{scope}
\end{tikzpicture}
\caption{An example of an AC$^0$ circuit}
\end{figure}


The idea of the proof is to take the parity function, apply a random restriction on it, and see that it remains a parity function of the remaining bits (either that or the complement, which are both computationally equally hard). 
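
For example (a worked instance of our own, with $n = 4$), fix $x_2 = 0$ and $x_3 = 1$ and keep $x_1, x_4$ free. Then
\[ x_1 \oplus x_2 \oplus x_3 \oplus x_4 \;\longmapsto\; x_1 \oplus 0 \oplus 1 \oplus x_4 = \neg (x_1 \oplus x_4), \]
the complement of parity on the two remaining variables. Had the fixed bits contained an even number of ones, we would have obtained $x_1 \oplus x_4$ itself.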


Assume there exists a circuit (of class AC$^0$) to evaluate the parity of $n$ bits, and push all the NOT gates down to the literals using the fact that $\neg(a \wedge b) \equiv (\neg a) \vee (\neg b)$, together with the dual identity for $\vee$.
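
For example (an illustrative formula of our own), applying De Morgan's laws from the top gate downwards gives
\[ \neg\bigl((x_1 \wedge x_2) \vee \neg x_3\bigr) \;\equiv\; \neg(x_1 \wedge x_2) \wedge \neg\neg x_3 \;\equiv\; (\neg x_1 \vee \neg x_2) \wedge x_3 . \]
After this step negations appear only on the inputs, and the depth of the circuit has not increased.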

Note that in such a circuit with unbounded fan-in, we may assume that the layers alternate between AND and OR layers. If they don't alternate, we can either split a layer into two (one containing only AND gates and the other only OR gates) or merge two consecutive layers of the same type (both AND or both OR). The figure below illustrates the merging step for two consecutive layers that are both AND.

Splitting layers can be done in such a way that the depth of the circuit increases by at most 2, and the size of the circuit increases by a polynomial factor.

\begin{figure}[h]
\centering
\begin{tikzpicture}
\begin{scope}[every node/.style={circle,thick,draw}]
    \node (A1) at (0,0) {$x_1$};
    \node (B1) at (2,0) {$x_2$};
    \node (C1) at (4,0) {$x_3$};
    \node (D1) at (6,0) {$x_4$};
    \node (E1) at (10,0) {$x_n$};
\end{scope}

\begin{scope}[every node/.style={circle,thin,draw=gray,minimum width = 1cm}]
\node (A2) at (1,2) {$\bigwedge$};
\node (B2) at (4,2) {$\bigwedge$};
\node (C2) at (9,2) {$\bigwedge$};
\end{scope}

\begin{scope}[every node/.style={circle,thick,draw,minimum width = 1cm}]
\node (A3) at (2,4) {$\bigwedge$};
\node (B3) at (8,4) {$\bigwedge$};
\end{scope}

\begin{scope}[every node/.style={circle,thick,draw=white}]
\node (A4) at (5,6) {};
\end{scope}

\path (D1) -- node[auto=false]{\Huge \bf \ldots} (E1);
\path (B2) -- node[auto=false]{\Huge \bf \ldots} (C2);
\path (A3) -- node[auto=false]{\Huge \bf \ldots} (B3);
\path (A3) -- node[auto=false]{\Huge \bf $\udots$} (A4);
\path (B3) -- node[auto=false]{\Huge \bf $\ddots$} (A4);


\begin{scope}[>={Stealth[black]},
              every node/.style={fill=white,circle},
              every edge/.style={draw=red,very thick}]
    \path [->] (A1) edge (A3);
    \path [->] (B1) edge[bend left=30] (A3);
    \path [->] (B1) edge (B3);
    \path [->] (C1) edge[bend left=30] (B3);
    \path [->] (D1) edge[bend left=30] (B3);
    \path [->] (B1) edge[bend right=30] (A3);
    \path [->] (C1) edge (A3);
    \path [->] (D1) edge (A3);
    \path [->] (C1)[bend right=30] edge (B3);
    \path [->] (D1)[bend right=30] edge (B3);
    \path [->] (E1) edge (B3);
    % \path [->] (A2) edge (A3);
    % \path [->] (B2) edge (A3);
    % \path [->] (B2) edge (B3);
    % \path [->] (C2) edge (B3);
    
    
    
    % \path [->] (B) edge[bend right=60] node {$1$} (E); 
\end{scope}
\end{tikzpicture}
\caption{Merged circuit from Figure 1, bypassing the second layer altogether (the duplicate edges can be removed without changing the function)}
\end{figure}

Now, the idea is as follows: look at the bottom two layers, and assume that they form a DNF (the CNF case is similar). After applying a random restriction to our parity circuit, we use the switching lemma to swap these two layers, turning the DNF into a CNF, noting that this succeeds with high probability.

We then merge the second and third layers into one (after the switch, both of them are AND layers; in the CNF case, both are OR layers) and continue this process until we get a circuit with just two layers, which is a DNF or a CNF.

Each step succeeds with high probability, so if the depth of our original circuit is constant (AC$^0$), all the steps succeed together with high probability. However, this implies that there is a depth-2 (AC$^0$) circuit computing the parity of the remaining bits, which we know is not the case. (This can be proved by contradiction using adversarial inputs, as the circuit must look at every input bit to compute the parity.)
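
To make the last step concrete (a sketch of the standard argument, stated here for a DNF on $m$ remaining variables, with $m$ our own notation), suppose some term of a DNF for $x_1 \oplus \dots \oplus x_m$ omits a variable $x_j$. Pick an input satisfying that term. Flipping $x_j$ keeps the term, and hence the DNF, equal to $1$, but changes the parity, so the DNF is wrong on one of the two inputs. Every term must therefore mention all $m$ variables. For $m = 2$ this is visible in
\[ x_1 \oplus x_2 = (x_1 \wedge \neg x_2) \vee (\neg x_1 \wedge x_2), \]
where neither term can be shortened: the term $x_1$ alone would accept the input $(1,1)$, which has even parity.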

This is only a rough outline of the proof; the actual proof will be discussed in the next scribe notes.



\end{document}