CSCI 4041 Algorithms and Data Structures - Spring 2025
Midterm Exam
The following Algorithms and Data Structures may be useful for answering questions..
Algorithms:
bubble_sort(A)
# Sorts by repeatedly swapping adjacent items until they are correct.
selection_sort(A)
# Sorts by finding the minimum value for each subarray.
insertion_sort(A)
# Sorts list by repeatedly inserting into a sorted list.
merge_sort(A)
# Sorts list using divide and conquer
heap_sort(A)
# Sorts by moving the top element of a max heap to the end of the array.
quicksort(A)
# Sorts by partitioning by a pivot and recursively sorting each partition.
counting_sort(A)
# Sorts by counting the times a key appears.
radix_sort(A, sortAlgorithm)
# Sorts array by sorting digits from least significant to most significant.
bucket_sort(A)
# Sorts by putting keys in bucket lists and sorting the lists.
Data Structures:
Heap(compareFunction):
# Maintains a min / max heap depending on a compareFunction.
Heap.extract_max()
# Returns and removes the top item on the heap and maintains the heap prpoerty.
Heap.size()
# Returns the size of the heap.
Heap.build_heap(A)
# Builds a heap from an array.
Heap.insert(item)
# Inserts an item into the heap in correct location.
Problem 1: Runtime (50 points)
Theory (15 points):
(3 points each) True/False - Check the box.
1. □ True □ False If f(n) = Θ(n
2
), then f(n) = O(n
3
) and n = O(f(n))
2. □ True □ False lg (n!) = O(n lg n).
3. □ True □ False n sin n = Θ(n).
4. □ True □ False n lg n = Ω(
√
n).
5. □ True □ False log3 n + lg n
3
= Θ(log7 n
3
).
Sorts (15 points):
(3 points each) Short Answer. Write the name of the sort you would use to solve each problem. Refer to Page 2 for a list of sorts.
6. Sort a list of text strings (each of length d) in linear time.
7. Sort sensor data that is uniformly distributed in linear time.
8. Sort a small array stably and in place.
9. Sort a large array of data stably in Θ(n lg n) time.
10. Sort random numbers in-place in Θ(n lg n) time.
Master Theorem (20 points):
Calculate the runtime for the following algorithm using the Master Theorem (if possible). If you cannot use the Master Theorem, explain why it cannot be used.
def blur_edges(A, p, r):
if p >= r
return
q = ⌊(r − p)/4⌋
blur_edges(A, p, p + q)
blur_edges(A, r-q, r)
for i = 2 to r-p:
A[i-1] = (A[i] + A[i-1])/2
Recall:
• log24 = 2
• log42 = 0.5
• a =
• b =
• f(n) =
• n
logba =
• Case (1,2,3 or None):
• Regulatory Condition? (if applicable)
• Runtime:
Problem 2: Correctness (50 points)
The following algorithm takes heap A, and returns the maximum path from the node at index index to a leaf node. The path is returned as a list of indexes:
// MAX - PATH (...) calculates the maximum path in heap A
// from the index to a leaf . Returns a list of indexes .
MAX - PATH (A , index )
1. if ( index > A . heap_size )
2. return []
3. L = MAX - PATH (A , LEFT ( index ))
4. R = MAX - PATH (A , RIGHT ( index ))
5. sum_L = SUM - PATH (A , L ) + A [ index ]
6. sum_R = SUM - PATH (A , R ) + A [ index ]
7. P = L
8. if sum_L < sum_R
9. P = R
10. P . append ( index )
11. return P
// SUM - PATH (...) returns the sum of items in heap A
// for a list of indexes in a path P
SUM - PATH (A , P )
1. n = P . length
1. sum = 0
2. for i = 1 to n - 1
3. sum = sum + A [ P [ i ]]
4. return sum + A [ P [ n ]]
Answer the following questions:
(a) Describe a loop invariant that would help prove the correctness of SUM-PATH(A, P).
(b) State the loop invariant before and after the k
th iteration. What does this mean about the state of the loop before the k + 1 iteration?
(c) Prove: MAX-PATH(A, index) is correct. (You may assume SUM-PATH(A, P) is correct.)
Problem 3: Sorting Application (50 points)
Consider the following scenario:
Your company specializes in analyzing remote sensor data. One of your customers would like to monitor the real-time statistics from a sensor. This involves repeatedly calculating the median from a list of measurements that are passed in via array A, and correspond to a particular time (e.g. A = [0.3, 2.0, 10.2, ...] at 110.34 minutes). To accomplish this task, you must meet the following requirements:
• Calculate the median as fast as possible without allocating memory. The median is the middle value of a sorted list.
• Measurements are not guaranteed to arrive in the correct order. For example, an early time measurement may arrive after a late time measurement. Therefore, we will use a heap to efficiently sort as measurements arrive.
• You will need to implement the following functions:
– INIT-HEAP() - Creates a heap that stores medians for efficient time ordering.
– CALC-MEDIAN(time, A, H) - Calculates the median of list A adds it to the heap H.
– GET-NEXT-MEDIAN(last time, H) - Returns the measurement with lowest time step greater than last time.
• You may use the following comparison function if needed.
min - time - measurement (A , B )
return A . time < B . time
Complete the following:
Note:
• You may call or use any of the algorithms or data structures listed on Page 2.
• You may write your code in psuedocode or any other language.
• Don’t worry too much about syntax. As long as we understand your algorithm, it is sufficient.
(a) Initialize the Medians array or data structure:
(b) Write the CALC-MEDIAN(time, A, H) algorithm described above.
(c) Write the GET-NEXT-MEDIAN(last time, H) algorithm described above.
(d) Briefly and informally describe the worst-case runtime for the following:
• INIT-MEDIANS()
• CALC-MEDIAN(...)
• GET-NEXT-MEDIAN(...)