Sorting
and Searching
Applications
of Arrays
• one
of the most common problems
• sorting
– arrange a
set of numbers or character strings in ascending order
• searching
– given a
number or string, is it contained in the set?
• telephone
directory
Sorting
• array
marks contains marks of students
• arrange
them in non-decreasing order
• once
arranged, many questions can be answered easily
– find the top
10 marks
– find the
median marks
– how many
students have marks = m
– find the
mode ( the most frequent marks)
Simple
Sorting Algorithm
• find
the smallest number
• put
it in the first position
• from
remaining numbers find the smallest
• put
it in the second position
• repeatedly
fill in a position in the sorted array by finding smallest remaining number
• repetitions
have to be done n-1 times
• called
selection sort
Selection
Sort
do
i = 1, n-1
min_pos = i
do
j = i+1, n
if
(marks(j) < marks(min_pos)) min_pos
= j
end do
! min_pos is index of minimum in marks(i:n)
temp = marks(i)
marks(i) = marks(min_pos)
marks(min_pos) = temp
end do
Loop
Invariants
• write
loop invariants for selection sort
• outer
loop
– marks(1:i-1)
is in non-decreasing order
– marks(k)
>= marks(i-1) for all i<=k<=n
• inner
loop
– marks(min_pos)
<= marks(k) for all i<=k<=j-1
• together
they imply that after n-1 outer iterations, the array is sorted
Selection
Sort
• not
very efficient
• for
sorting n values, number of operations is proportional to n2
• the
number is insensitive to the input
– same number
of operations required even if the input is already sorted
• very
simple
• no
extra array required
• movement
of data is minimum possible
Bubble
Sort
• selection
sort is based on ensuring that a(i) <= a(j) for all 1 <= i < j <= n
• a
weaker condition is sufficient for sorting
– a(i) <=
a(i+1) for 1 <= i < n
• bubble
sort based on ensuring this
• compare
adjacent numbers and swap if larger number occurs before the smaller
• repeat
till no swaps are possible
Bubble
Sort
limit = n ! limit indicates section of array
to be sorted
do
if
(limit <= 1) exit
last_swap = 0 ! rightmost position of swap
do
i = 1, limit-1
if
(a(i) > a(i+1)) then
temp =a(i) ; a(i) = a(i+1) ; a(i+1) = temp;
last_swap = i
endif
end do
limit = last_swap
end do
Bubble
Sort
• in
the worst case bubble sort can still require n2
operations – when?
• better
in some cases like already sorted input
• what
is a loop invariant for bubble sort?
• a(1:limit)
is the section still to be sorted
• a(limit+1:n)
is already sorted
• a(i)
<= a(limit+1) for 1 <= i <= limit
Searching
• how
many students have scored m marks?
• marks
stored in an array
• if
array is unsorted value of every element would have to be checked
• can
do better if it is sorted
• how?
– use binary search
• find
indices i, j such that a(i-1) < m <= a(i) and a(j-1) <= m < a(j),
j-i is the answer
Binary
Search
• search
for index i such that a(i-1)<m<=a(i)
• initially,
possible values of i are from 1 to n+1
• compare
with a((n+1)/2) ( middle element)
• if
a((n+1)/2) < m then required i > (n+1)/2 else required i <= (n+1)/2
• possible
values of i have been halved
• repeat
for appropriate interval of values
• after
log2 n + 1 comparisons, i is
found
Binary
Search
left = 1; right = n+1
! left and right are endpoints of interval
containing i
do
if
( left == right ) exit
if marks((left+right)/2) < m) then
left = (left+right)/2+1
else
right = (left+right)/2
endif
end do
Binary
Search
• loop
invariant for binary search
• marks(1:left-1)
< m
• marks(right:n)
>= m
• when
left == right, required i = left =right
• in
each iteration, right-left is halved
• number
of iterations is at most log2
n + 1
• same
method can be used for many other problems
Faster
Sorting Algorithms
• there
are sorting algorithms which take less time than selection or bubble sort
• many
different sorting algorithms are known
• merge
sort
– divide
numbers to be sorted into two parts
– sort each
part using the same method
– merge the
two sorted sequences into a single sorted sequence
Summary
• sorting
and searching are basic tasks
• many
sorting algorithms
– selection
sort
– bubble sort
– merge sort (
details later)
• binary
search
– needs sorted
arrays
– faster than
simple search