Welcome to the Treehouse Community

The Treehouse Community is a meeting place for developers, designers, and programmers of all backgrounds and skill levels to get support. Collaborate here on code errors or bugs that you need feedback on, or asking for an extra set of eyes on your latest project. Join thousands of Treehouse students and alumni in the community today. (Note: Only Treehouse students can comment or ask questions, but non-students are welcome to browse our conversations.)

Looking to learn something new?

Treehouse offers a seven day free trial for new students. Get access to thousands of hours of content and a supportive community. Start your free trial today.

Computer Science Algorithms: Sorting and Searching Searching Names Big O Runtime of Search Algorithms

Mike Straw
Mike Straw
7,058 Points

Is there a case where it will be faster to quicksort + binary search an unsorted list instead of using linear search?

If I'm understanding the way these run, since the average quicksort is O(n log n) and binary search is O(log n), the overall efficiency is still O(n log n), which is worse than O(n).

This matches what I saw when I tested a linear search on an unsorted list, then created a combined script to run quick sort, then perform a binary search:

import sys
from load import load_strings

names = load_strings(sys.argv[1])

def merge_sort(values):
  if len(values) <= 1:
    return values
  middle_index = len(values) // 2
  left_values = merge_sort(values[:middle_index])
  right_values = merge_sort(values[middle_index:])
  sorted_values = []
  left_index = 0
  right_index = 0
  while left_index < len(left_values) and right_index < len(right_values):
    if left_values[left_index] < right_values[right_index]:
      sorted_values.append(left_values[left_index])
      left_index += 1
    else:
      sorted_values.append(right_values[right_index])
      right_index += 1
  sorted_values += left_values[left_index:]
  sorted_values += right_values[right_index:]
  return sorted_values


sorted_names = merge_sort(names)

search_names = ["Francina Vigneault", "Lucie Hansman", "Nancie Rubalcaba", "Elida Sleight", "Guy Lashbaugh", "Ginger Schlossman", "Suellen Luaces", "Jamey Kirchgesler", "Amiee Elwer", "Lacresha Peet", "Leonia Goretti", "Carina Bunge", "Renee Brendeland", "Andrew Mcgibney", "Gina Degon", "Deandra Pihl", "Desmond Steindorf", "Magda Growney", "Tawana Srivastava", "Tammi Todman", "Harley Mussell", "Iola Bordenet", "Edwardo Khela", "Myles Deanne", "Elden Dohrman", "Ira Hooghkirk", "Eileen Stigers", "Mariann Melena", "Maryrose Badura", "Ardelia Koffler", "Lacresha Kempker", "Charlyn Singley", "Lekisha Tawney", "Christena Botras", "Mike Blanchet", "Cathryn Hinkson", "Errol Shinkle", "Mavis Bhardwaj", "Sung Filipi", "Keiko Dedeke", "Lorelei Morrical", "Jimmie Lessin", "Adrianne Hercules", "Latrisha Haen", "Denny Friedeck", "Emmett Whitesell", "Sina Sauby", "Melony Engwer", "Alina Reichel", "Rosamond Shawe", "Elinore Benyard", "Sang Bouy", "Ed Aparo", "Sheri Wedding", "Sang Snellgrove", "Shaquana Sones", "Elvia Motamed", "Candice Lucey", "Sibyl Froeschle", "Ray Spratling", "Cody Mandeville", "Donita Cheatham", "Darren Later", "Johnnie Stivanson", "Enola Kohli", "Leann Muccia", "Carey Philps", "Suellen Tohonnie", "Evelynn Delucia", "Luz Kliment", "Lettie Jirjis", "Francene Klebe", "Margart Scholz", "Sarah Growden", "Glennis Gines", "Rachael Ojima", "Teofila Stample", "Narcisa Shanley", "Gene Lesnick", "Malena Applebaum", "Norma Tingey", "Marianela Mcmullen", "Rosalva Dosreis", "Dallas Heinzmann", "Sade Streitnatter", "Lea Pelzel", "Judith Zwahlen", "Hope Vacarro", "Annette Ayudan", "Irvin Cyree", "Scottie Levenstein", "Agustina Kobel", "Kira Moala", "Fawn Englar", "Jamal Gillians", "Karen Lauterborn", "Kit Karratti", "Steven Deras", "Aaron Agustine", "Zulma Tishler"]

def binary_search(collection, target):
  first = 0
  last = len(collection) - 1
  while first <= last:
    midpoint = (first + last) // 2
    if collection[midpoint] == target:
      return midpoint
    elif collection[midpoint] < target:
      first = midpoint + 1
    else:
      last = midpoint - 1
  return None

for n in search_names:
  index = binary_search(names, n)
  print(index)

My test times align with what's predicted:

Linear search: O(n^2)

real 0m0.694s
user 0m0.313s
sys 0m0.018s

Quicksort + binary search: O(n log n)

real 0m0.802s
user 0m0.388s
sys 0m0.013s

Intuitively, though, it feels like there would be a point where sorting would be more efficient. If you try to do a linear search of every string in Google, it would take an inordinate amount of time. Would a quicksort speed this up?

I definitely see the value in storing data in a sorted manner up front, but was wondering if there was something I'm missing in how to better handle an unsorted list.