reset("0","0","1000","780","Introduction to the course") B = TextPanel("5","5","500","515","black","white","red") S = TextPanel("5","525","500","160","black","white","red") B2 = TextPanel ("510","5","475","515","black","white","blue") B2s = TextPanel ("510","5","475","375","black","white","blue") \macro(Sum,∑) \macro(le,≤) \macro(ge,≥) \macro(ne,≠) \macro(times,∏) \macro(cdot,·) \macro(times,×) \macro(infty,∞) \macro(cap,∩) \macro(m1comman,"M1, M2, ... , Mn") \macro(a0n,"a0, a1, ... , an") \macro(m1n,M1 * M2 * ... * Mn) \macro(mi,Mi) \macro(miplus1,Mi+1) B2s.hide() B!

Introduction: S! Let us start with a fundamental question: given a certain problem, - how do you solve it on a computer? S:: For many problems, it is relatively easy to design algorithms that - will somehow solve them. However, quite some cleverness is needed - in designing algorithms that are also fast, i.e. give the answers - quickly. S:: This will be the major challenge in the course. S! B- Main course goal: Study techniques for designing fast algorithms. S! Designing anything, cars, houses, clothes, and even algorithms, is - an art. However, it is still possible to identify good techniques - and principles. We will study several such techniques, as also - examples of where these techniques have worked well and given us - fast algorithms. S:: After learning these techniques, our hope is that you too will be - able to design fast algorithms for the problems that come your way. S! This is not an elementary course, and here are the prerequisites - for taking it. S! B:: Prerequisites: Experience in writing programs. Course on - Data Structures. Background in Discrete Mathematics. S! The approach of the course is analytical. We will design - algorithms on paper, and attempt to prove how fast they will - run on a computer. S! B:: Approach: Analytical. Reason about algorithms and prove how - fast they will run. B:: Overview: S! We will begin by precisely defining the general framework we are - going to use in the course. If we want to mathematically reason about - algorithms, we need to have very clear definitions of terms such as - computer, algorithm and even fast. S! B-
  • Basic framework. Model of computer, algorithm. Notion of - fast. Understanding how mathematical models relate to reality. S! Once these definitions are in place we will survey algorithm design - techniques and see how they can be used for solving problems from a - variety of areas. S! B-
  • Survey of algorithm design techniques. How they can be used to - solve elementary problems such as sorting, as well as problems from - computer algebra, geometry, optimization. S! While our technqiues are extremely successful at solving some - problems, we will also encounter problems for which we dont know how - to devise fast algorithms. S:: A fairly intricate theory has developed concerning these - hard problems. We will study this theory, this so called theory of - NP-completeness. S! B-
  • Theory of NP-completeness.
S! During the course, we will give many exercises (mostly on - algorithm design and analysis). Doing these exercises is extermely - important if you wish to benefit from the course. S! B! B!

Topic for today S! The main purpose of today's lecture is to give you an example so - that you understand the spirit of the course. S! We will consider a simple problem: given two numbers, find their - greatest common divisor. B- Problem: Given two numbers, find their greatest common - divisor. B:: Greatest common divisor (GCD): Largest integer that divides - the two numbers without leaving a remainder. S:: We will see two algorithms for GCD. - One algorithm is a natural algorithm that even school children - know. But we will review it. S:: The second algorithm is due to Euclid. Although we havent yet - precisely defined what fast means, based on your knowledge from - the prerequisites, you will easily be convinced that it is faster - than the simple algorithm. S! You will also see that Euclid's algorithm is also cleverer. S:: And thus exciting... S! B-
  • Natural algorithm based on factoring. -
  • Euclid's algorithm -
  • Demonstration that Euclid's algorithm is fast. -
  • Proof that Euclid's algorithm is correct -
  • Informal estimate of the time taken by Euclid's algorithm. -
B! B!

Simple algorithm for finding GCD B- Input: Integers m,n. -
Output: GCD of m,n. S! The simple algorithm, which is even taught in schools, is to factor - m and n. Then collect the common factors, multiply them, and print - the product. B::Simple(m,n){
  1. Factorize m, i.e. find - prime numbers m1,m2,...mj such that - m = m1 \times m2 \times ... \times mj. S! B-
  2. Factorize n, i.e. find - prime numbers n1,n2,...nk such that - n = n1 \times n2 \times ... \times nk. B-
  3. Let {p1,p2,...,pl} = - {m1,m2,...mj} \cap - {n1,n2,...nk} S! Notice that prime factors may be repeated - i.e. the sets are really multisets. - The intersection is also a multiset intersection, i.e. each element - will be picked as many times as the minimum of the number of its - occurrences in the two sets. S! B-
  4. Return p1 \times p2 \times ... \times pl.
} S! We will take an example. Suppose m=36, n=48. B2!

Example m=36, n=48. B2:: Execution of Simple:
  1. Factorization of m: 36 = 2 \times 2 \times 3 \times 3 B2-
  2. Factorization of n: 48 = 2 \times 2 \times 2 \times 2 \times 3 B2-
  3. Intersection = {2, 2, 3} B2-
  4. GCD = 2 \times 2 \times 3 = 12
S! To complete the discussion of this algorithm we need to spell out - how to factor a number. S! We will not do this too carefully for now, but we will assume that - this is done as taught in school, i.e. starting from 2 test for each - prime whether it divides the number to be factored. S! We now turn to Euclid's algorithm. S! B:: Euclid(m,n){
    -
  1. while m does not divide n -
  2.         r = n mod m -
  3.         n = m -
  4.         m = r -
  5. end -
  6. return m
-} S! It is not obvious that this algorithm will actually produce the - GCD. We will give a proof later. First, let us see how this works - on our example, m=36, n=48. S! B2-Execution of Euclid: S! In the first iteration we check whether m=36 divides n=48. B2-
    1. m=36 does not divide n=48. S! So we will enter the loop and calculate as indicated. B2-
    2. r = 48 mod 36 = 12 B2: 3. n = m = 36 B2: 4. m = r = 12. S! Values after the first iteration: m=12, n=36. S:: Now when go back to the loop beginning and check whether m divides n. B2: 1. m=12 divides n=36. S! So we exit the loop and jump to step 6. B2: 6. Return m=12. S! Thus the algorithm seems to work on this example. S! Although we havent defined formally what fast is, it should - be clear that Euclid does much less work than Simple. S:: Euclid did essentially 2 divisions: dividing 48 by 36 in the first - iteration, and then 36 by 12 in the second. S:: Simple on the other hand, would have had to do at least one - division for every factor that it constructed. S! Let us take another example. B2! B2!

    Example: m=434, n=966 B2:: Execution of Simple: -
    1. Factorization of m: 434 = 2 \times 7 \times 31 -
    2. Factorization of n: 966 = 2 \times 7 \times 139 B2: 3. Intersection = {2, 7} B2: 4. GCD = 2 \times 7 = 14 B2::Execution of Euclid: S! The execution will proceed as.. B2: 1. m=434 does not divide n=966, so enter the loop. B2: 2. r = 966 mod 434 = 98 B2: 3. n = m = 434 B2: 4. m = r = 98 S! After first iteration: m=98, n=434, and we return to the beginning. B2::1. m=98 does not divide n=434, so enter the loop. B2: 2. r = 434 mod 98 = 42 B2: 3. n = m = 98 B2: 4. m = r = 42 S! After second iteration: m=42, n=98, and we retun to the beginning. B2:: 1. m=42 does not divide n=98, so enter the loop. B2: 2. r = 98 mod 42 = 14 B2: 3. n = m = 42 B2: 4. m = r = 14 S! After third iteration: m=14, n=42, and we return to the beginning. B2:: 1. m=14 does divide n=42, so exit the loop. B2: 6. return m=14 S! So we have the correct answer again! S! Now let us compare the two algorithms. S:: Euclid does one more division than the number of iterations. Thus - for this example, it did 4 divisions. S:: Simple, however, clearly did at least as many divisions as the - factors it found. That itself makes 6 divisions. S:: However, in order to determine that 139 is a factor, and there are - no factors between 7 and 139, Simple would have to do quite some - work! If this work is included, it will be seen that Euclid will be - even faster, much faster. S! We next explain why Euclid works. S! B2!

    Proof of correctness of Euclid S! The proof lies in a simple fact about division and remainders. S! B2-Claim: If m divides n then GCD(m,n) = m, else GCD(m,n)=GCD(n, - m mod n) S! You should be able to prove this using simple facts about - divisibility. S! B2: Proof of claim: Exercise. Hint: if g=GCD(m,n), then m=ag, and n=bg - where a and b do not have a common factor. S! But now, the proof of correctness is immediate. S! B2:: Proof of correctness of Euclid: In each iteration, the new m, n - calculated are guaranteed to have the same GCD as the old m,n. B2:: Further, r = n mod m is always smaller than m, and hence the - numbers must decrease in each iteration. B2:: But the numbers m, n must stay positive. Hence in a finite number - of iterations the algorithm must terminate. S! Done! S! We will next estimate the number of iterations that Euclid will - execute. B2:: Number of iterations: S! We will assume that at the time of call to Euclid, m < n. S:: This need not be true, and Euclid will work correctly if we call - it with larger m than n, e.g. Euclid(48,36). S:: However, then the first iteration of Euclid will only exchange the - values of m and n. Thus if our assumption is false, we will only - take one extra iteration. B2- Assume Euclid is called with arguments p, q such that p < q. S! We will show that the sum of values of m and n must rapidly decrease as the - algorithm executes. Since it cannot decrease indefinitely, this will - imply that the algorithm cannot take too many iterations. S:: This is a common analysis strategy that we will use in the course. S! B2:: Initially, the variables m and n take values p and q. Let - s=p+q. If p divides q, then we will immediately return, without - executing a single iteration. B2:: If p does not divide q, then we will enter the loop. Suppose - this happens. Let p' and q' denote the values taken by m and n - after 1 iteration. Let s'=p'+q'. S! We will show that s' is substantially smaller than s. B2:: We know that p' = q mod p, hence p' < p. B2:: We know that q' = p. B2:: Further p'+q' = (q mod p) + p \le q. B2:: Collecting these facts we have:
    1. p' < p
    2. q' = - p
    3. p'+q' \le q
    B2- Now we simply add twice the third inequality and the first two, and we - will get 3p'+3q' < 2p+2q B2:: Thus (p'+q') < 2(p+q)/3. S:: Thus s' < 2s/3. B2:: But this argument can be made at each iteration. Thus the sum of the - values of m and n decrease by a factor of 3/2 in each iteration. B2:: After log3/2(p+q) iterations the sum will decrease below 1. But m,n must take integer values throughout, and hence this cannot happen. B2:: Thus the algorithm must terminate before this. In other words, the number of iterations is at most log3/2(p+q). B2:: As noted before, an extra iteration will be needed if Euclid is called with p,q and p>q. S! Done! B! B2! S! B!

    Concluding Remarks B-
    • With more rigorous analysis, Euclid's algorithm can be seen to - be even faster than what we showed. B-
    • In our analysis we focussed only on division operations; - in the next lecture we will make an abstract model of a computer - which will identify other operations which we also need to consider. S! Even with that, Euclid's algorithm will be seen to be much, much - faster than the simple algorithm we learn in school. B-
    • How to discover fast algorithms? S! We could ask how did Euclid discover his algorithm? Obviously, he was not - content to use the definition of GCD (which is what the simple - algorithm uses), but derived and used the additional mathematical - properties of GCD. B- Need to understand the problem deeper by - examining the underlying mathematics.
    S! This will be very important, and will require creativity B2!

    Exercises
      -
    1. Suppose Euclid is used to find the GCD of 9287 and 3659. Trace - the execution of Euclid. Use the result we proved to determine a - bound on the number of iterations. Does the actual number agree with - the bound? -
    2. The number of iterations can be much less than what is given by - the bound in general. This suggests that perhaps the bound can be - improved, or that we will have to look harder to find numbers which - will force Euclid to execute as many iterations as predicted. Try to - find numbers which cause this. Hint: How many iterations are - taken by Euclid to compute the GCD of the nth and n+1 th Fibonacci - number? -
    3. The simple method for factorizing a number requires dividing it by all primes until its square root (approximate). For this, it is useful if the algorithm has a table of primes until some large value N. How large will this table be, i.e. What is the number of primes smaller than N? Search the internet or other sources to get an approximate answer to this. In light of this answer, how realistic is the scheme of keeping around a table of primes?
    \scene(End)