I Foundations

1 The Role of Algorithms in Computing 5

1.1 Algorithms 5

An algorithm is thus a sequence of computational steps that transform the input into the output.

1.2 Algorithms as a technology 11

lg n < sqrt(n) < n < n lg(n) < n2 < n3 < n!

2 Getting Started 16

2.1 Insertion sort 16

1
2
3
4
5
6
7
8
9
10
11
  void insertionSort(double* A, int size) {
      for (int i = 1; i < size; ++i) {
          double key = A[i];
          int index = i - 1;
          while (index >= 0 && A[index] > key) {
              A[index+1] = A[index];
              -- index;
          }
          A[i+1] = key;
      }
  }

2.2 Analyzing algorithms 23

insertion sort has a worst-case running time of Θ(n2)

2.3 Designing algorithms 29

The divide-and-conquer approach and recursive

  • Divide: left and right subarrays
  • Conquer: recursively sort left and right
  • Combine: merge by comparing two numbers from the sorted left and right pair by pair
mergeSort.cpp
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
  //Input:
  // A array
  // size size of A
  // p starting index of the span waiting to be sorted
  // r ending index of the span waiting to be sorted
  //Procedure:
  // sort left
  // sort right
  // merge by comparing two numbers from the sorted left and right pair by pair
  //Output:
  // A sorted array 
    void mergeSort(int* a, int b, int e) {
      if (b < e) {
          int mid = (b+e)/2;
          mergeSort(a, b, mid);
          mergeSort(a, mid+1, e);
          // merge
          int* left = new int[mid-b+1 + 1];
          int* right= new int[e-mid + 1];
          for (int i = b; i <= mid; i++) {
              left[i-b] = a[i];
          }
          left[mid+1-b] = INT_MAX;
          for (int i = mid+1; i <= e; i++) {
              right[i-mid-1] = a[i];
          }
          right[e-mid] = INT_MAX;
          int i = 0, j = 0;
          for (int k = b; k <= e; k++) {
              a[k] = (left[i]<=right[j])?(left[i++]):(right[j++]);
          }
          delete[] left;
          delete[] right;
      }
    }

  • Θ(n lg n) but not operate in place

3 Growth of Functions 43

3.1 Asymptotic notation 43

  • O(). When we have only an asymptotic upper bound, we use O-notation.
    • O(g(n)) = {f(n): there exist positive constants c and n0 such that 0 <= f(n) <= c*g(n) for all n >= n0}.
    • if f(n) == c*g(n) is not possible, we use o()
  • Θ(). When we have both an asymptotic upper bound and an asymptotic lower bound, we use Θ-notation.
    • Θ(g(n)) = {f(n): there exist positive constants c1, c2, and n0 such that 0 <= c1g(n) <= f(n) <= c2g(n) for all n >= n0}
  • Ω(), When we have only an asymptotic lower bound, we use omega-notation.
    • uppercase_omega(g(n)) = {f(n): there exist positive constants c and n0 such that 0 <= c*g(n) <= f(n) for all n >= n0}
    • if f(n) == c*g(n) is not possible, we use ω()

3.2 Standard notations and common functions 53

4 Divide-and-Conquer 65

  • Three ways for solving recurrences
    • substitution method
    • recursive-tree method
    • master method
      • T(n) = a*T(n/b) + f(n), a>=1, b>1. It characterizes a divide- and-conquer algorithm that creates a subproblems, each of which is 1/b the size of the original problem, and in which the divide and combine steps together take f(n) time.

4.1 The maximum-subarray problem 68

In the stock market, you need to buy low and sell high. But the maximum profit does not always start at the lowest price or end at the highest price.

A brute-force solution

Choose 2 from n pairs of (date, price) and calculate the difference. Θ(n2)

A solution using divide-and-conquer

A transformation: create an array of the daily change in price, and find the maximum subarray.

  • find the max subarray on the left
  • find the max subarray on the right
  • find the max subarray crossing the middle
    • find max left, max right, and their sum
  • take the max of the above

TODO: codes

Θ(n*lg(n))

4.2 Strassen’s algorithm for matrix multiplication 75

O(n3) to O(n2.81)

4.3 The substitution method for solving recurrences

  1. Guess the form of the solution.
  2. Use mathematical induction to find the constants and show that the solution works.

4.4 The recursion-tree method for solving recurrences 88

depth * height

4.5 The master method for solving recurrences 93

Master Method

5 Probabilistic Analysis and Randomized Algorithms 114

5.1 The hiring problem 114 5.2 Indicator random variables 118 5.3 Randomized algorithms 122 5.4 Probabilistic analysis and further uses of indicator random variables

II Sorting and Order Statistics Introduction 147

worst-case running time average-cased running time
Insertion sort Θ(n^2) Θ(n^2)
Merge sort Θ(n lg n) Θ(n lg n)
Heapsort O(n lg n) -
Quicksort Θ(n^2) Θ(n lg n) (*)
Counting sort Θ(k+n) Θ(k+n)
Radix sort Θ(d(n+k)) Θ(d(n+k))
Bucket sort Θ(n^2) Θ(n)

6 Heapsort 151

O(n lg n),Our heap data structure is not garbage-collected storage.

heap.hpp
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
  #include <iostream>
  #include <cstdlib>

  template <class T>
  class Heap
  {
  public:
      Heap():m_array(0), m_size(0), m_maxSize(0){}
      Heap(int maxSize);
      ~Heap(){delete [] m_array;}
      bool insert(T item);
      bool isEmpty();
      T fetchMax();
      void deleteMax();
      void display();
      //Transfer Array<T> to a heap
      void arrayToHeap(T* array, int size);
      //Sort the heap
      void heapSort();

  private:
      void locate_down(int start, int end);
      void locate_up(int start, int end);

  private:
      T* m_array;
      int m_size;//num of elem stored
      int m_maxSize;//allocated memory
  };

  template <class T>
  Heap<T>::Heap(int maxSize)
  {
      m_array=new T[maxSize+1];
      m_size=0;
      m_maxSize=maxSize;
  }

  template <class T>
  T Heap<T>::fetchMax()
  {
      return m_array[1];
  }

  template <class T>
  bool Heap<T>::isEmpty()
  {
      return m_size==0;
  }

  template <class T>
  void Heap<T>::locate_down(int start, int end)
  {
      int c=start*2;
      int temp;
      while(c<=end)
      {
          if((c<end) && (m_array[c]<m_array[c+1])) //Pick the larger one in both childs. ATTENTION: c==end is not approved!!
              c=c+1
          if(m_array[start]<m_array[c])
          {
              temp=m_array[start];
              m_array[start]=m_array[c];
              m_array[c]=temp;
          }
          start=c;
          c=2*c;
      }
  }

  template <class T>  //overlap the top with the bottom, decrease the size, and locate_down
  void Heap<T>::deleteMax()
  {
      m_array[1]=m_array[m_size];
      --m_size;
      locate_down(1,m_size);
  }

  template <class T>
  void Heap<T>::locate_up(int start, int end)
  {
      int loc=end;
      int parent=end/2;
      int temp;
      while(parent>=start && m_array[loc]<m_array[parent])
      {
          temp=m_array[loc];
          m_array[loc]=m_array[parent];
          m_array[parent]=temp;
          loc=parent;
          parent=loc/2;
      }
  }

  template <class T>
  bool Heap<T>::insert(T item)
  {
      if(m_size==m_maxSize)
      {
          std::cerr<<"Heap is full";
          this->~Heap();
          return 0;
      }
      ++m_size;
      m_array[m_size]=item;
      locate_up(1, m_size);
      return true;
  }

  template <class T>
  void Heap<T>::display()
  {
      for(int i=1; i<=m_size; i++)
          std::cout<<m_array[i]<<" ";
      std::cout<<std::endl;
  }

  template <class T>
  void Heap<T>::arrayToHeap(T* array, int size)
  {
      m_maxSize=size;
      m_array=new T[m_maxSize+1];
      for(int i=1; i<=size; ++i)
          this->insert(array[i-1]);
      this->heapSort();
      for(int i=1; i<size; ++i)
          array[i-1]=m_array[i];
  }

  template <class T>
  void Heap<T>::heapSort()
  {
      T temp;
      int i;
      for(i=m_size; i>=2; --i)
      {
          temp=m_array[1];
          m_array[1]=m_array[i];
          m_array[i]=temp;
          locate_down(1, i-1);
      }
  }

6.1 Heaps 151

The (binary) heap data structure is an array object that we can view as a nearly complete binary tree.

1
2
3
4
5
6
7
8
9
10
  // calculate by shifting the binary representation
  int parent(i) {
      return i/2;
  }
  int left(i) {
      return 2*i;
  }
  int right(i) {
      return 2*i + 1;
  }
  • max-heap: for every node i other than the root, A[parent(i)]>=A[i].
  • min-heap: for every node i other than the root, A[parent(i)]<=A[i].
  • height of a node: number of edges on the longest simple downward path from the node to a leaf.
  • height of the heap: height of its root. Θ(lg n)

6.2 Maintaining the heap property 154

1
2
3
4
5
6
7
8
9
10
11
12
13
  // Input:
  //         A array
  //     i index
  //   Binary trees rooted at left(i) and right(i) are max-heaps
  //   but A[i] might be smaller than its children. 
  // Procedure:
  //     exchange A[i] with the larger one between A[left[i]] and A[right[i]]
  //     maxHeapify the new sub tree
  // Output:
  //     This routine lets the value at A[i] float down in the max-heap. 
  maxHeapify(A, i) {
      TODO;
  }

T(n) <= T(2*n/3) + Θ(1), T(n) = O(lg n) = O(height)

6.3 Building a heap 156

conduct maxHeapify upon the array from the rear to the head

O(n)

6.4 The heapsort algorithm 159

build max-heap, exchange the root with the end of heap, and maxHeapify the rest of the heap.

6.5 Priority queues 162

one of the most popular applications of a heap: as an efficient priority queue. It can be used to schedule jobs on a shared computer. Highest-priority job first.

prioqueue.hpp
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
  #include <iostream>
  template <class T>
  class PrioQueue //Min Heap
  {
  public:
      PrioQueue(int mSize=20);
      ~PrioQueue(){delete [] q;}
      bool IsEmpty() const {return size==0;}
      bool IsFull() const {return size==maxSize;}
      bool Append(const T &x);
      bool Serve(T &x);
  private:
      void AdjustDown(int start, int end);
      void AdjustUp(int start, int end);
      T *q;
      int size, maxSize;
  };

  template <class T>
  PrioQueue<T>::PrioQueue(int mSize)
  {
      maxSize=mSize;
      size=0;
      q=new T[maxSize];
  }

  template <class T>
  void PrioQueue<T>::AdjustDown(int start, int end)
  {
      if(start<end)
      {
          int parent=start;
          int child=2*parent+1;
          while(child<=end)
          {
              if((child<end)&&(q[child]>q[child+1]))
                  child=child+1;
              if(q[parent]>q[child])
              {
                  T temp=q[parent];
                  q[parent]=q[child];
                  q[child]=temp;
              }
              else break;
              parent=child;
              child=2*parent+1;
          }
      }
  }

  template <class T>
  void PrioQueue<T>::AdjustUp(int start, int end)
  {
      if(start<end)
      {
          int child=end;
          int parent=(child-1)/2;
          while((parent>=start)&& (q[parent]>q[child]))
          {
                  T temp=q[parent];
                  q[parent]=q[child];
                  q[child]=temp;
                  child=parent;
                  parent=(child-1)/2;
          }
      }
  }

  template <class T>
  bool PrioQueue<T>::Append(const T &x)
  {
      if(this->IsFull())
      {
          //throw std::Overflow;
          std::cout<<"overflow"<<std::endl;
         return(0);
      }
      size++;
      q[size-1]=x;
      this->AdjustUp(0, size-1);//ATTENTION: n-1
  }

  template <class T>
  bool PrioQueue<T>::Serve(T &x)
  {
      if(this->IsEmpty())
      {
          //throw std::Underflow;
          return(0);
      }
      x=q[0];
      q[0]=q[size-1];
      q[size-1]=0;
      size--;
      this->AdjustDown(0, size-1); //ATTENTION: n-1
      std::cout<<"After AdjustDown:"<<q[0]<<q[1]<<q[2]<<q[3]<<q[4]<<std::endl;
  }

7 Quicksort 170

Despite this slow worst-case running time Θ(n2) , quicksort is often the best practical choice for sorting because it is remarkably efficient on the average: its expected running time is Θ(n lg n), and the constant factors hidden in the Θ(n lg n) notation are quite small.

7.1 Description of quicksort 170

Divide-and-conquer paradigm

  • Divide: element A[q] and subarrays on the left and right, s.t. left <= A[q] <= right
  • Conquer: recursively sort left and right
  • Combine: NULL
quickSort.cxx
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
  void qSort(int* a, int b, int e) {
      if (b < e) {
          int midVal = a[e];
          // partition
          int j = b;
          for (int i = b; i <= e; i++) {
              if (a[i] <= midVal) {
                  int tmp = a[j]; a[j] = a[i]; a[i] = tmp;
                  j++;
              }
          }
          qSort(a, b, j-2);
          qSort(a, j, e);
      }
  }

7.2 Performance of quicksort 174

  • Worst-case partitioning: the partitioning routine produces one subproblem with n-1 elements and one with 0 elements.
    • T(n) = T(n-1) + T(0) + Θ(n)
    • Θ(n2)
  • Best-case partitioning: even possible split
    • T(n) = 2T(n/2) + Θ(n)
    • Θ(n lg n)
  • Balanced partitioning: Any split of constant proportionality yields a recursion tree of depth Θ(lg n), where the cost at each level is O(n).
  • Average case: PARTITION produces a mix of “good” and “bad” splits.
    • TODO

7.3 A randomized version of quicksort 179

pick the flag in the partition routine randomly (exchange with the one at the rear)

7.4 Analysis of quicksort 180

8 Sorting in Linear Time 191

comparison sorts: merge sort, heapsort, and quicksort sharing, must make Ω(n lg n) comparisons in the worst case to sort n elements.

However, counting sort, radix sort, and bucket sort…

8.1 Lower bounds for sorting 191

8.2 Counting sort 194

counting sort assumes that the input consists of integers in a small range

Create an extra array to record the occurrence of these number and put them back into a new array as the result. We usually use counting sort when we have k = O(n), in which case the running time is Θ(n).

8.3 Radix sort 197

TODO? 感觉像counting sort,只不过是存了整个数,而不是数数的个数

Radix sort tutorial

  • source: List of bytes
  • source_n: number of bytes to sort
  • dest[256]: 256 lists of bytes. each list should have enough space to hold source_n elements.

a pseudo-code would sort the list this way:

for i=0 to source_n do
   dest[source[i]].append(source[i])

8.4 Bucket sort 200

bucket sort assumes that the input is generated by a random process that distributes elements uniformly and independently over the interval [0, 1).

Bucket sort divides the interval [0, 1) into n equal-sized subintervals, or buckets, and then distributes the n input numbers into the buckets. Since the inputs are uniformly and independently distributed over [0, 1), we do not expect many numbers to fall into each bucket. To produce the output, we simply sort the numbers in each bucket (e.g. insertion sort) and then go through the buckets in order, listing the elements in each.

Θ(n)

9 Medians and Order Statistics 213

Selection problem selects i-st smallest number among a set A of n (distinct) numbers

9.1 Minimum and maximum 214

How to find the min in A? Best way is to iterate it once (n-1 comparisons).

How to find min and max simultaneously? In fact, we can find both the minimum and the maximum using at most 3n/2 comparisons. 每次取一对,自己先比较,然后大的跟max比,小的跟min比,总共只需要比3n/2次。

9.2 Selection in expected linear time 215

randomizedSelect.cxx
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
  #include <iostream>
  using namespace std;

  int randomizedSelect(double* A, int p, int r, int i) ;
  int randomizedPatition(double* A, int p, int r) ;
  int partition(double* A, int p, int r) ;

  // ----------------------------------------------------------------------------
  // select i-st largest number from A[p] to A[r]
  int randomizedSelect(double* A, int p, int r, int i) {
      if (p == r)
          return A[p];
      int q = randomizedPatition(A, p, r);
      int k = q-p+1;
      if (i == k) // the pivot value is the answer
          return A[q];
      else if (i < k) // on the left
          return randomizedSelect(A, p, q-1, i);
      else            // on the right
          return randomizedSelect(A, q+1, r, i-k);
  }
  // ----------------------------------------------------------------------------
  // get the index of the number in the middle 
  // s.t. subarray on the left <= A[index] <= subarray on the right
  int randomizedPatition(double* A, int p, int r) {
      int i = random()%(p-r+1) + p;
      double tmp = A[i]; A[i] = A[r]; A[r] = tmp;
      return partition(A, p, r);
  }
  // ----------------------------------------------------------------------------
  // pick A[r] as the flag
  // i for the left array <= A[r]
  // j for iterating the whole array 
  // finally, exchange A[r] in the middle
  // return the new middle index
  int partition(double* A, int p, int r) {
      double x = A[r];
      int i = p - 1;
      for (int j = p; j <= r-1; ++j)
          if (A[j] <= x) {
              ++ i;
              double tmp = A[i]; A[i] = A[j]; A[j] = tmp;
          }
      double tmp = A[i+1]; A[i+1] = A[r]; A[r] = tmp;
      return i+1;
  }
  // ----------------------------------------------------------------------------
  int main() {
      double A[10] = {8,1,3,2,4,7,5,6,9,0};
      for (int i = 0; i < 10; ++i) cout << A[i];
      cout << endl;
      cout << randomizedSelect(A, 0, 9, 5) << endl;
      return 0;
  }

E[T(n)] = O(n)

9.3 Selection in worst-case linear time 220

O(n)

III Data Structures

  • Data structures: techniques for representing finite dynamic sets and manipulating them on a computer.
  • Dynamic-set operations: CR(U)D 大小前后: insert, search, delete, max, min, predecessor, successor.
  • Dictionary: a dynamic set supporting insert, delete, and test membership.
  • 2 categories of operations
    • queries
    • modifying operations

10 Elementary Data Structures 232

10.1 Stacks and queues 232

Stack
stack.h
1
2
3
4
5
6
7
8
9
10
11
12
  #include <iostream>
  template <class T>
  class Stack
  {
  public:
      virtual bool IsEmpty() const=0;
      virtual bool IsFull() const=0;
      virtual bool Top(T &x) const=0;
      virtual bool Push(T x) =0;
      virtual T Pop() =0;
      virtual void Clear() =0;
  };
Sequential Stack
seqstack.cxx
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
  #include "stack.h"

  template <class T>
  class SeqStack:public Stack<T>
  {
  public:
      SeqStack(int mSize);
      ~SeqStack() {delete []s;}
      bool IsEmpty() const { return top == -1;}
      bool IsFull() const { return top == maxTop;}
      bool Top(T &x) const;         //return the top element in x
      bool Push(T x);
      T Pop();
      void Clear() { top = -1;}
  private:
      int top;
      int maxTop;
      T *s;
  };
  template <class T>
  SeqStack<T>::SeqStack(int mSize)
  {
      maxTop = mSize - 1;
      s = new T[mSize];
      top = -1;
  }

  template <class T>
  bool SeqStack<T>::Top(T &x) const
  {
      if (IsEmpty())
      {
          std::cout<<"Empty"<<std::endl;
          return false;
      }
      x = s[top];
      return true;
  }

  template <class T>
  bool SeqStack<T>::Push(T x)
  {
      if(IsFull())
      {
          std::cout<<"Overflow"<<std::endl;
          return false;
      }
      s[++top] = x;
      return true;
  }

  template <class T>
  T SeqStack<T>::Pop()
  {
      if(IsEmpty())
      {
          std::cout<<"Underflow"<<std::endl;
          return false;
      }
      return s[top--];
  }
SeqStackTest.cxx
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
  #include "seqstack.h"
  int main()
  {
      SeqStack<int> S(10);
      std::cout<<S.IsEmpty()<<std::endl;
      std::cout << S.IsFull() << std::endl;
      for(int i = 0; i < 10; i++)
      {
          S.Push(i);
          int x;
          S.Top(x);
          std::cout<< x << " ";
      }
      std::cout << std::endl;
      std::cout << S.IsFull() << std::endl;
      for(int i = 0; i < 10; i++)
      {
          std::cout <<  S.Pop() << " " ;
      }
      std::cout << std::endl;
  }
Queue
queue.h
1
2
3
4
5
6
7
8
9
10
11
12
  #include <iostream>
  template <class T>
  class Queue
  {
  public:
      virtual bool IsEmpty() const=0;
      virtual bool IsFull() const=0;
      virtual bool Front(T& x) const=0;
      virtual bool EnQueue(T x) =0;
      virtual T DeQueue() =0;
      virtual bool Clear() =0;
  };
Sequential Queue
seqqueue.h
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
  #include "queue.h"

  template <class T>
  class SeqQueue:public Queue
  {
  public:
      SeqQueue(int mSize);
      ~SeqQueue() { delete [] q;}
      bool IsEmpty() const { return front == rear;}
      bool IsFull() const { return (rear+1) == front;}
      bool Front(T& x) const;
      bool EnQueue(T x);
      T DeQueue();
      void Clear() { front = rear = 0;}
  private:
      int front, rear;
      int maxSize;
      T *q;
  };

  template <class T>
  SeqQueue<T>::SeqQueue(int mSize)
  {
      maxSize = mSize;
      q = new T[maxSize];
      front = rear = 0;
  }

  template <class T>
  bool SeqQueue<T>::Front(T& x) const
  {
      if (IsEmpty())
      {
          std::cout << "Empty" << std::endl;
          return false;
      }
      x = q[front];
      return true;
  }

  template <class T>
  bool SeqQueue<T>::EnQueue(T x)
  {
      if (IsFull())
      {
          std::cout << "Overflow" << std::endl;
          return false;
      }
      q[(rear=(rear + 1) % maxSize)] = x;
      return true;
  }

  template <class T>
  T SeqQueue<T>::DeQueue()
  {
      if (IsEmpty())
      {
          std::cout << "Underflow" << std::endl;
          return false;
      }
      T x = q[front];
      front = (front + 1) % maxSize;
      return x;
  }

10.2 Linked lists 236

Linear List
linearlist.h
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
  template <class T>
  class LinearList
  {
  public:
      virtual bool IsEmpty() const=0;
      virtual int GetLength() const=0;
      virtual bool Find(int i, T& x) const=0;            //return element x with index i
      virtual int Search(T x) const=0;                   //if x is found, return its index, else -1
      virtual bool Insert(int i, T x)=0;                 //insert x at element Ai.
      virtual bool Delete(int i)=0;                      //delete element Ai
      virtual bool Update(int i, T x)=0;                 //update element Ai with x
      virtual void Output(std::ostream& out)const=0;     //ouput the list to the stream
  protected:
      int ElemNum;                                        //element number in the list
  };
Single List
singlelist.h
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
  #include "iostream"
  #include "linearlist.h"

  template <class T> class SingleList;

  template <class T>
  class Node
  {
      private:
          T element;
          Node<T> *next;
          friend class SingleList<T>;
  };

  template <class T>
  class SingleList:public LinearList<T>
  {
      public:
          SingleList(){ head = NULL; this->ElemNum = 0;}
          ~SingleList();
          bool IsEmpty() const;
          int GetLength() const;
          bool Find(int i, T& x) const;
          int Search(T x) const;
          bool Insert(int i, T x);
          bool Delete(int i);
          bool Update(int i, T x);
          void Clear();
          void Output(std::ostream& out) const;
      private:
          Node<T>* head;
  };

  template <class T>
  SingleList<T>::~SingleList()
  {
      Node<T> *p;
      while (head)
      {
          p = head->next;
          delete head;
          head = p;
      }
  }

  template <class T>
  bool SingleList<T>::IsEmpty() const
  {
      return this->ElemNum == 0;
  }

  template <class T>
  int SingleList<T>::GetLength() const
  {
      return this->ElemNum;
  }

  template <class T>
  bool SingleList<T>::Find(int i, T& x) const
  {
      if (i < 0 || i > this->ElemNum -1)
      {
          std::cout << "Exceed Bounds!" << std::endl;
          return false;
      }
      Node<T> *p = head;
      for (int j = 0; j < i; j++) p = p->next;
      x = p->element;
      return true;
  }

  template <class T>
  int SingleList<T>::Search(T x) const
  {
      Node<T> *p = head;
      for (int i = 0; i < this->ElemNum; i ++)
      {
          if(x == p->element)
              return i;
          else
              p = p->next;
      }
      return -1;
  }

  template <class T>
  bool SingleList<T>::Insert(int i, T x)
  {
      if (i < 0 || i > this->ElemNum )
      {
          std::cout << "Exceed Bounds!" << std::endl;
          return false;
      }
      Node<T> * r = new Node<T>(); r->element = x;
      Node<T> * p = head;
      Node<T> * q ;
      for (int j = 0; j < i; j++)
      {
          q = p;
          p = p->next;
      }
      if(i == 0)
      {
          head = r;
          r->next = p;
      }
      else
      {
          q->next = r;
          r->next = p;
      }
      (this->ElemNum) ++;
      return true;
  }

  template <class T>
  bool SingleList<T>::Delete(int i)
  {
      if (!(this->ElemNum))
      {
          std::cout<<"Underflow!"<<std::endl;
          return false;
      }
      if (i < 0 || i > this->ElemNum - 1)
      {
          std::cout << "Exceed Bounds!" << std::endl;
          return false;
      }
      Node<T> * p = head;
      Node<T> * q = head;
      if (i == 0)
      {
          head = head->next;
          delete p;
      }
      else
      {
          for (int j = 0; j < i; j++)
          {
              q = p;
              p = p->next;
          }
          q->next = p->next;
          delete p;
      }
      (this->ElemNum) --;
      return true;
  }

  template <class T>
  bool SingleList<T>::Update(int i, T x)
  {
      if (i < 0 || i > this->ElemNum-1)
      {

          std::cout << "Exceed Bounds!" << std::endl;
          return false;
      }
      Node<T> * p = head;
      for (int j = 0; j < i; j ++)
      {
          p = p->next;
      }
      p->element = x;
      return true;
  }

  template <class T>
  void SingleList<T>::Clear()
  {
      Node<T> *p;
      while (head)
      {
          p = head->next;
          delete head;
          head = p;
      }
  }

  template <class T>
  void SingleList<T>::Output(std::ostream& out) const
  {
      Node<T> *p = head;
      while (p)
      {
          out << p->element << " ";
          p = p->next;
      }
      out << std::endl;
  }
SingleListTest.cxx
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
  #include "singlelisti.h"
  #include "stdio.h"
  int main()
  {
      SingleList<int> LA;
      SingleList<int> LB;
      for (int i = 0; i < 6; i++) LA.Insert(i, i);
      for (int i = 5; i < 10; i++) LB.Insert(i-5, i);
      LB.Insert(0, 0);
      LB.Insert(4, 2);
      LB.Insert(LB.GetLength(), 4);
      LA.Output(std::cout);
      LB.Output(std::cout);
      Intersection(LA, LB);
      LA.Output(std::cout);

      return 0;
  }
Double List
doublelist.h
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
  #include "linearlist.h"

  template <class T> class DoubleList;

  template <class T>
  class Node
  {
      private:
          T element;
          Node<T> *last, *next;
          friend class DoubleList<T>;
  };

  template <class T>
  class DoubleList:public LinearList<T>
  {
      public:
          DoubleList(){ head = NULL; this->ElemNum = 0;}
          ~DoubleList();
          bool IsEmpty() const;
          int GetLength() const;
          bool Find(int i, T& x) const;
          int Search(T x) const;
          bool Insert(int i, T x);
          bool Delete(int i);
          bool Update(int i, T x);
          void Clear();
          void Output(std::ostream& out) const;
      private:
          Node<T>* head;
  };

  template <class T>
  DoubleList<T>::~DoubleList()
  {
      Node<T> *p;
      while (head)
      {
          p = head->next;
          delete head;
          head = p;
      }
  }

  template <class T>
  bool DoubleList<T>::IsEmpty() const
  {
      return this->ElemNum == 0;
  }

  template <class T>
  int DoubleList<T>::GetLength() const
  {
      return this->ElemNum;
  }

  template <class T>
  bool DoubleList<T>::Find(int i, T& x) const
  {
      if (i < 0 || i > this->ElemNum -1)
      {
          std::cout << "Exceed Bounds!" << std::endl;
          return false;
      }
      Node<T> *p = head;
      for (int j = 0; j < i; j++) p = p->next;
      x = p->element;
      return true;
  }

  template <class T>
  int DoubleList<T>::Search(T x) const
  {
      Node<T> *p = head;
      for (int i = 0; i < this->ElemNum; i ++)
      {
          if(x == p->element)
              return i;
          else
              p = p->next;
      }
      return -1;
  }

  template <class T>
  bool DoubleList<T>::Insert(int i, T x)
  {
      if (i < 0 || i > this->ElemNum )
      {
          std::cout << "Exceed Bounds!" << std::endl;
          return false;
      }
      Node<T> * r = new Node<T>(); r->element = x;
      Node<T> * p = head;
      for (int j = 0; j < i; j++)
      {
          p = p->next;
      }
      if(i == 0)
      {
          head = r;
          r->last = NULL;
          r->next = p;
          p->last = r;
      }
      else
      {
          r->last = p->last;
          p->last->next = r;
          r->next = p;
          p->last = r;
      }
      (this->ElemNum) ++;
      return true;
  }

  template <class T>
  bool DoubleList<T>::Delete(int i)
  {
      if (!(this->ElemNum))
      {
          std::cout<<"Underflow!"<<std::endl;
          return false;
      }
      if (i < 0 || i > this->ElemNum - 1)
      {
          std::cout << "Exceed Bounds!" << std::endl;
          return false;
      }
      Node<T> * p = head;
      if (i == 0)
      {
          head = head->next;
          delete p;
      }
      else
      {
          for (int j = 0; j < i; j++)
          {
              p = p->next;
          }
          p->last->next = p->next;
          p->next->last = p->last;
          delete p;
      }
      (this->ElemNum) --;
      return true;
  }

  template <class T>
  bool DoubleList<T>::Update(int i, T x)
  {
      if (i < 0 || i > this->ElemNum-1)
      {

          std::cout << "Exceed Bounds!" << std::endl;
          return false;
      }
      Node<T> * p = head;
      for (int j = 0; j < i; j ++)
      {
          p = p->next;
      }
      p->element = x;
      return true;
  }

  template <class T>
  void DoubleList<T>::Clear()
  {
      Node<T> *p;
      while (head)
      {
          p = head->next;
          delete head;
          head = p;
      }
  }

  template <class T>
  void DoubleList<T>::Output(std::ostream& out) const
  {
      Node<T> *p = head;
      while (p)
      {
          out << p->element << " ";
          p = p->next;
      }
      out << std::endl;
  }
DoubleListTest.cxx
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
  #include "singlelisti.h"
  #include "stdio.h"
  int main()
  {
      SingleList<int> LA;
      SingleList<int> LB;
      for (int i = 0; i < 6; i++) LA.Insert(i, i);
      for (int i = 5; i < 10; i++) LB.Insert(i-5, i);
      LB.Insert(0, 0);
      LB.Insert(4, 2);
      LB.Insert(LB.GetLength(), 4);
      LA.Output(std::cout);
      LB.Output(std::cout);
      LA.Delete(3);
      LA.Output(std::cout);

      return 0;
  }

10.3 Implementing pointers and objects 241

  • How do we implement pointers and objects in languages that do not provide them?
    1. A multiple-array representation of objects
      • for example, next[], key[], prev[] can store the values and indexes
    2. A single-array representation of objects
      • for example, store triple < key, next, prev > in one single array
  • Allocating and freeing objects
    • a garbage collector is responsible for determining which objects are unused.
    • We keep the free objects in a singly linked list, which we call the free list. Each object in the representation is either in list L or in the free list, but not in both. Free list acts like a stack. 在free list的head操作

10.4 Representing rooted trees 246

  • Binary trees
  • Rooted trees with unbounded branching
    • left-child, right-sibling representation
      1. x.left-child points to the leftmost child of node x
      2. x.right-sibling points to the sibling of x immediately to its right.
      3. If node x has no children, then x.left-child = NIL, and if node x is the rightmost child of its parent, then x.right-sibling = NIL.
  • Other tree representations. Many are possible.
binarytree.hpp
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
  template <class T>
  struct BTNode
  {
      BTNode()
      {
          lChild = rChild = NULL;
      }
      BTNode(const T& x)
      {
          element = x;
          lChild = rChild = NULL;
      }
      BTNode(const T& x, BTNode<T>* l, BTNode<T>* r)
      {
          element = x;
          lChild = l;
          rChild = r;
      }
      T element;
      BTNode<T>* lChild, *rChild;
  };

  template <class T>
  class BinaryTree
  {
  public:
      BinaryTree() {root = NULL;}
      ~BinaryTree() {};
      bool IsEmpty() const;
      void Clear();
      bool Root(T& x) const;
      void MakeTree(const T& x, BinaryTree<T>& left, BinaryTree<T>& right);
      void BreakTree(T& x, BinaryTree<T>& left, BinaryTree<T>& right);
      void PreOrder(void (*Visit)(T& x));
      void InOrder(void (*Visit)(T& x));
      void PostOrder(void (*Visit)(T& x));
      int Size();
      void Copy();
  protected:
      BTNode<T>* root;
  private:
      void Clear(BTNode<T>* &t);
      void PreOrder(void (*Visit)(T& x), BTNode<T> *t);
      void InOrder(void (*Visit)(T& x), BTNode<T> *t);
      void PostOrder(void (*Visit)(T& x), BTNode<T> *t);
      int Size(BTNode<T> *t);
      void Clear(BTNode<T> *t);
      friend class BIterator;
  };

  template <class T>
  bool BinaryTree<T>::Root(T& x)const
  {
      if(root)
      {
          x = root->element;
          return true;
      }
      else
          return false;
  }

  template <class T>
  void BinaryTree<T>::MakeTree(const T& x, BinaryTree<T>& left, BinaryTree<T>& right)
  {
      if (root || &left == &right) return;
      root = new BTNode<T> (x, left.root, right.root);
      left.root=right.root=NULL;
  }

  template <class T>
  void BinaryTree<T>::BreakTree(T& x, BinaryTree<T>& left, BinaryTree<T>& right)
  {
      if (!root||&left==&right||left.root||left.root) return;
      x=root->element;
      left.root=root->lChild; right.root=root->rChild;
      delete root;
      root=NULL;
  }

  template <class T>
  void BinaryTree<T>::PreOrder(void (* Visit)(T& x))
  {
      PreOrder(Visit, root);
  }
  template <class T>
  void BinaryTree<T>::PreOrder(void (* Visit)(T& x), BTNode<T>* t)
  {
      if (t)
      {
          Visit(t->element);
          PreOrder(Visit, t->lChild);
          PreOrder(Visit, t->rChild);
      }
  }

  template <class T>
  void BinaryTree<T>::InOrder(void (* Visit)(T& x))
  {
      InOrder(Visit, root);
  }
  template <class T>
  void BinaryTree<T>::InOrder(void (* Visit)(T& x), BTNode<T>* t)
  {
      if(t)
      {
          InOrder(Visit, t->lChild);
          Visit(t->element);
          InOrder(Visit, t->rChild);
      }
  }


  template <class T>
  void BinaryTree<T>::PostOrder(void (* Visit)(T& x))
  {
      PostOrder(Visit, root);
  }
  template <class T>
  void BinaryTree<T>::PostOrder(void (* Visit)(T&x), BTNode<T>* t)
  {
      if(t)
      {
          PostOrder(Visit, t->lChild);
          PostOrder(Visit, t->rChild);
          Visit(t->element);
      }
  }

  template <class T>
  int BinaryTree<T>::Size()
  {
      return Size(root);
  }
  template <class T>
  int BinaryTree<T>::Size(BTNode<T> &t)
  {
      if(!t) return 0;
      return Size(t->lChild) + Size(t->rChild) + 1;
  }

  template <class T>
  BTNode<T>* BinaryTree<T>::Copy(BTNode<T>* t)
  {
      if(!t) return NULL;
      BTNode<T>* q=new BTNode<T>(t->element);
      q->lChild=Copy(t->lChild);
      q->rChild=Copy(t->rChild);
      return q;
  }

  template <class T>
  void BinaryTree<T>::Clear()
  {
      Clear(root);
  }
  template <class T>
  void BinaryTree<T>::Clear(BTNode<T>* t)
  {
      if(t)
      {
          Clear(t->lChild);
          Clear(t->rChild);
          delete t;
      }
  }


  template <class>
  class BIterator
  {
  public:
      virtual T* GoFirst(const BinaryTree<T>& bt)=0;
      virtual T* Next()=0;
      virtual void Traverse(void (*Visit)(T& x), const BinaryTree<T>& bt);
  protected:
      BTNode<T>* r, *current;
  };

  template <class T>
  void BIterator<T>::Traverse(void (*Visit)(T& x), const BinaryTree<T>& bt)
  {
      T* p=GoFirst(bt);
      while(p)
      {
          Visit(*p);
          p=Next();
      }
  }

  template <class T>
  class IInOrder:public BIterator<T>
  {
  public:
      IInOrder(BinaryTree<T>& bt, int mSize)
      {
          r=bt.root;
          current=NULL;
          s=new SeqStack<BTNode<T>* >(mSize);
      }
      T* GoFirst(const BinaryTree<T>& bt);
      T* Next();
  private:
      SeqStack<BTNode<T>*> *s;
  };

11 Hash Tables 253

  • A hash table is an effective data structure for implementing dictionaries.
    • searching worst-case Θ(n)
    • expected O(1)
    • Instead of using the key as an array index directly, the array index is computed from the key.

11.1 Direct-address tables 254

Assume that universe U of keys is reasonably small, and no two elements have the same key. Store x at T[x.key] or at the place which T[x.key] points to. (Even storing the key is not necessary.)

slot: position in the array (direct-address table)

11.2 Hash tables 256

  • What if universe U is large? namely, K of keys stored in a dictionary is much smaller than the universe U of all possible keys?
  • hash function! element is stored in slot h(k) instead of key directly
    • h: U -> {0, 1, …, m-1}
    • An element with key k hashes to slot h(k).
  • Problem: collision: two keys may hash to the same slot.
    • Collision resolution by chaining, insert at the head
      • Given a hash table T with m slots that stores n elements, we define the load factor α for T as n/m, that is, the average number of elements stored in a chain.
      • Worst-case: Θ(n) + T(compute hash)
      • Average-case: simple uniform hashing Any given element is equally likely to hash into any of the m slots, independently of where any other element has hashed to.
        • Unsuccessful, Θ(1+α) n个元素m个slot,期望是每个slot有n/m个元素
        • Successful, Θ(1+α)

11.3 Hash functions 262

  • What makes a good hash function? less collisions
    • satisfies (approximately) the assumption of simple uniform hashing.
    • Hash values are independent of any patterns that might exist in the data.
  • Interpreting keys as natural numbers
  • Different ways:
    1. division method. h(k) = k mod m.
    2. multiplication method. h(k) = ⦏m(kA mod 1)⦎, “kA mod 1” means the fractional part of kA, that is, kA - ⦏kA⦎. (A is a constant, 0 < A < 1) Knuth suggests that A = 0.618
    3. universal hashing. select the hash function at random from a carefully designed class of functions.

11.4 Open addressing 269

  • Open addressing: all elements occupy the hash table itself. No chaining. load factor α can never exceed 1.
    • We compute the sequence of slots to be examined by probing until we find an empty slot. The probing sequence has not to be < 0, 1, …, m-1 > with Θ(n), but depends upon the key being inserted. To determine which slots to probe, we extend the hash function to include the probe number (starting from 0) as a second input. < h(k,0), h(k,1), …, h(k,m-1) >.
    • 注意,此处不能标记删除的slot为NIL,因为这样一来就无法读到存在其后面部分的散列 值了。解决方法是,用DELETED取代NIL.
    • uniform hashing: the probe sequence of each key is equally likely to be any of the m! permutations of < 0, 1, …, m-1 >.
      • linear probing
        • h(k,i) = (h’(k)+i) mod m
        • problem: primary clustering
      • quadratic probing
        • h(k,i) = (h’(k) + c1*i + c2*i2) mod m
        • problem: secondary clustering
      • double hashing
        • one of the best
        • h(k,i) = (h1(k) + i*h2(k)) mod m

11.5 Perfect hashing 277

  • Perfect hashing: O(1) memory accesses are required to perform a search in the worst case.
    • two levels of hashing with universal hashing at each level.
    • By choosing the first-level hash function well, we can limit the expected total amount of space used to O(n)
    • TODO 证明很复杂的样子

C++ std::map的使用

12 Binary Search Trees 286

  • Performance
    • Best-case Θ(lgn)
    • Average-case Θ(lgn)
    • Worst-case Θ(n)
  • Improve? red-black trees(height O(lg n)), B-trees(secondary storage)
bstree.hpp
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
  #include"dynamic.h"
  #include"binarytree.h"
  //二叉搜索树:
  //所有节点的关键字不同,此树要么为空,要么:
  //1. 左不空,则左所有<根
  //2. 右不空,则右所有>根
  //3. 左右子树均为二叉搜索树
  template <class T>
  class BSTree:public DynamicSet<T>
  {
  public:
      BSTree() {root=NULL;}
      ResultCode Search_recursion(T& x) const;
      ResultCode Search_iteration(T& x) const;
      ResultCode Insert_recursion(T& x);
      ResultCode Insert_iteration(T& x);
      ResultCode Remove(T& x);

  protected:
      BTNode<T>* root;

  private:
      ResultCode Search(BTNode<T> *p, T& x) const;
      ResultCode Insert(BTNode<T> *p, T& x);
  };

  template <class T>
  ResultCode BSTree<T>::Search_recursion(T& x) const
  {
      return Search(root, x);
  }

  template <class T>
  ResultCode BSTree<T>::Search(BTNode<T> *p, T& x) const
  {
      if(!p) return NotPresent;
      else if(x>p->element) return Search(p->rChild, x);
      else if(x<p->element) return Search(p->lChild, x);
      else return Success;
  }
  template <class T>
  ResultCode BSTree<T>::Search_iteration(T& x) const
  {
      BTNode<T> *p=root;
      while(p)
      {
          if(x<p->element) p=p->lChild;
          else if(x>p->element) p=p->rChild;
          else return Success;
      }
      return NotPresent;
  }

  template <class T>
  ResultCode BSTree<T>::Insert_recursion(T& x)//ATTENTION: situation when root is NULL
  {
      if(root) return Insert(root, x);
      else root=new BTNode<T>(x);
  }

  template <class T>
  ResultCode BSTree<T>::Insert(BTNode<T>* p, T& x) //ATTENTION: situation when root is NULL
  {
      if(x>p->element)
      {
         if(p->rChild==NULL)
         {
             p->rChild=new BTNode(x);
             return Success;
         }
         else return Insert(p->rChild, x);
      }
      else if(x<p->element)
      {
         if(p->lChild==NULL)
         {
             p->lChild=new BTNode(T);
             return Success;
         }
         else return Insert(p->lChild, x);
      }
      else return Duplicate;
  }

  template <class T>
  ResultCode BSTree<T>::Insert_iteration(T& x)
  {
      BTNode<T> *p=root, *q=NULL;
      if(!p)
      {
          q=p;
          if(x>p->element) p=p->rChild;
          else if(x<p->element) p=p->lChild;
          else return Duplicate;
      }
      p=new BTNode<T>(x);
      if(!root) root=p;
      else if(x<q->element) q->lChild=p;
      else q->rChild=p;
      return Success;
  }

  template <class T>
  ResultCode BSTree<T>::Remove(T* x)
  {
     // TODO
      }

12.1 What is a binary search tree? 286

binary-search-tree property: left <= middle <= right: Let x be a node in a binary search tree. If y is a node in the left subtree of x, then y.key <= x.key. If y is a node in the right subtree of x, then y.key >= x.key.

How to print out keys in sorted order? inorder tree walk (other walks: preorder, postorder)

1
2
3
4
5
6
7
  void inorderTreeWalk(Node* r) {
      if (r) {
          inorderTreeWalk(r->left);
          cout << r.key;
          inorderTreeWalk(r->right)};
      }
  }

Θ(n)

12.2 Querying a binary search tree 289

  • Iteration is faster in Java, C, Python… Recursion is fairly expensive compared to iteration (in general) because it requires the allocation of a new stack frame.
  • Recursion might be faster in functional programming language.
Searching
  • Recursive version: Me? search left? search right?
  • Iteration version: Use pointer p. p->key? p=p->left? p=p->right?
Min and Max

left-most leaf and right-most leaf

Successor and predecessor
  • Successor: If x has right subtree, return max(x.right), else find in parents.
  • Predecessor: If x has left subtree, return max(x.left), else find in parents.

12.3 Insertion and deletion 294

12.4 Randomly built binary search trees 299

13 Red-Black Trees 308

13.1 Properties of red-black trees 308

  • A BST with one extra color bit of storage per node.
  • Red-black properties:
    1. Every node is either red or black.
    2. The root is black.
    3. Every leaf (NIL) is black.
    4. If a node is red, then both its children are black.
    5. For each node, all simple paths from the node to descendant leaves contain the same number of black nodes.
  • sentinel T.nil for all all pointers to NIL (replacing leaves and parent of root)
  • black-height of node x: number of black nodes on the path from (not including) x to a leaf
  • black-height of a tree: black-height of its root
  • A red-black tree with n internal nodes has height at most 2*log(n+1)

13.2 Rotations 312

  • Rotation: A local operation in a search tree that preserves the binary-search-tree property.
    • left-rotation: counter-clockwise
      • set y
      • turn y’s left subtree into x’s right subtree
      • link x’s parent to y
      • put x on y’s left
    • right-rotation: clockwise

13.3 Insertion 315

  1. insert node red z into tree T (Property 5)
  2. auxiliary procedure RBinsertFixup(T,z) to re-color and rotate.Ω
    • Property 2 is violated if z is the root
    • Property 4 is violated if z’s parent is red.
      • Case 1: z’s uncle y is red: z上两层反色
      • Case 2: z’s uncle y is black and z is a right child: around z left-rotate
      • Case 3: z’s uncle y is black and z is a left child: around z’s parent right-rotate + z上两层反色
    • O(lgn)

13.4 Deletion 323

  • RBTransplant(T, u, v): replace subtree u with subtree v. mind NIL node as parent and leaves

TODO

14 Augmenting Data Structures 339

14.1 Dynamic order statistics 339

  • Rank of an element: its position in the linear order of the set
  • Order-statistic tree: RBT + every node has x.size = number of nodes in its subtree, including the NIL sentinel
    • x.size = x.left.size + x.right.size + 1

14.2 How to augment a data structure 345

  1. Choose an underlying data structure.
  2. Determine additional information to maintain in the underlying data structure.
  3. Verify that we can maintain the additional information for the basic modifying operations on the underlying data structure.
  4. Develop new operations.

14.3 Interval trees 348

RBT + operations on dynamic sets of intervals.