I'm sharing my experience and knowledge of writing code with a friend who has begun learning to code. I recently used a single problem to expose her to several aspects of Python, her chosen programming language. Coincidentally, she chose a language I enjoy using--I have so much to show her.
Let us begin with the problem
I asked her to write a function that takes a list of numbers and returns the modes. I also asked her to not use any libraries (code that others have written for our convenience) to solve the problem.
The function should work like this:
>>> numbers = [3, 3, 3, 1, 1, 3, 4, 9, 9, 0, 0, 9, 9]
>>> modes = find_modes(numbers)
>>> print(modes)
[3, 9]
>>> numbers2 = [3, 3, 1, 1, 4, 9, 9, 0, 0, 9, 9]
>>> modes2 = find_modes(numbers2)
>>> print(modes2)
[9]
The First Solution
def find_modes(numbers):
    dictionary = {}
    unique_set = set(numbers)
    for unique in unique_set:
        dictionary[unique] = 0
        for number in numbers:
            if number == unique:
                dictionary[unique] += 1
    max_freq = max(dictionary.values())
    modes = []
    for key in dictionary:
        if dictionary[key] == max_freq:
            modes.append(key)
    return modes
The snippet above was her first solution to the problem, and it is correct. But, as Raymond Hettinger often says, "There must be a better way!"
NOTE: If you're learning Python and you don't know who Raymond is, then you need to know who Raymond is. I've learned so much about Python from him.
Lesson 1: Choose descriptive variable names
I find it common among beginner programmers to name their variables after data structures. However, it's not a practice I encourage. In statically typed languages (think C/C++ and Java), the declared type (int, float, struct, etc.) already tells you how the data is stored, so there is no need to repeat it in the name, as is done here:
Array<Int> numbersArray = [1, 2, 3];
An even worse case is shown below. In this example, we have no clue what the stored data is, only how it is stored, which is not very useful when reading the code.
Array<Char> array = ['a', 'b', 'c'];
Also, in dynamically typed languages (think Python and JavaScript), there is no guarantee that a variable named numbersArray will always be an array. Someone could easily change its value. For example:
from array import array

numbersArray = array('i', [1, 2, 3])
# ... some code
numbersArray = tuple(numbersArray)
# now you have a name that is misleading because
# numbersArray is not an array anymore
My rule of thumb is to name variables after the data they contain. Shortly after I shared my thoughts on choosing descriptive variable names with her, she refactored (changed to make better) the code and resent it to me.
def find_modes(numbers):
    num_frequencies = {}
    uniq_numbers = set(numbers)
    for uniq_number in uniq_numbers:
        num_frequencies[uniq_number] = 0
        for number in numbers:
            if number == uniq_number:
                num_frequencies[uniq_number] += 1
    max_freq = max(num_frequencies.values())
    modes = []
    for number in num_frequencies:
        if num_frequencies[number] == max_freq:
            modes.append(number)
    return modes
This function reads much better, but there was still more to be done.
Lesson 2: Using List Comprehension
I mentioned to her that the last five lines could be made into a single line if she learned to use list comprehensions. "Python has a construct called list comprehension. It allows the programmer to build lists in a concise way," I said. Then I left her to explore, and this was the outcome.
def find_modes(numbers):
    num_frequencies = {}
    uniq_numbers = set(numbers)
    for uniq_number in uniq_numbers:
        num_frequencies[uniq_number] = 0
        for number in numbers:
            if number == uniq_number:
                num_frequencies[uniq_number] += 1
    max_freq = max(num_frequencies.values())
    return [num for num, freq in num_frequencies.items() if freq == max_freq]
This is looking better, but we can still do more.
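For readers who have not met list comprehensions yet, here is a minimal sketch of the loop-to-comprehension rewrite on its own. The frequency table is hard-coded, and the names frequencies, modes_loop, and modes_comprehension are made up for this illustration.

```python
# Hypothetical frequency table, hard-coded for illustration only.
frequencies = {3: 4, 1: 2, 4: 1, 9: 4, 0: 2}
max_freq = max(frequencies.values())

# Loop version: start with an empty list, then append each matching key.
modes_loop = []
for num, freq in frequencies.items():
    if freq == max_freq:
        modes_loop.append(num)

# Comprehension version: the same filter written as a single expression.
modes_comprehension = [num for num, freq in frequencies.items() if freq == max_freq]

print(modes_loop)           # [3, 9]
print(modes_comprehension)  # [3, 9]
```

Both versions walk the dictionary in the same order, so they produce the same list. The comprehension simply folds the "create, loop, test, append" steps into one line.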
Lesson 3: Using the collections library
There is a very common pattern in this code. We are building a dictionary that has a default starting value. Take notice of this block. The starting value in the num_frequencies dictionary is always set to zero.
for uniq_number in uniq_numbers:
    num_frequencies[uniq_number] = 0
    for number in numbers:
        if number == uniq_number:
            num_frequencies[uniq_number] += 1
defaultdicts
Again, I sent her back to the Python Collections documentation, now to explore and refactor the code to use a defaultdict.
from collections import defaultdict

def find_modes(numbers):
    num_frequencies = defaultdict(int)
    for number in numbers:
        num_frequencies[number] += 1
    max_freq = max(num_frequencies.values())
    return [num for num, freq in num_frequencies.items() if freq == max_freq]
This function is shaping up quite nicely. We have gone from a dense function with a 13-line body to a neat one with a 5-line body. But we can still do better. The block below is another pattern that the Python developers have made easy for us to code.
num_frequencies = defaultdict(int)
for number in numbers:
    num_frequencies[number] += 1
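This block needs no initialisation step because a defaultdict calls its factory, here int, whenever a missing key is looked up, and int() returns 0. A small sketch comparing it with a plain dict may make that clearer. The letters data and the variable names are invented for illustration.

```python
from collections import defaultdict

letters = ["a", "b", "a"]  # made-up sample data

# With a plain dict, `+= 1` on an unseen key raises KeyError,
# so each key has to be given a starting value first.
plain_counts = {}
for letter in letters:
    if letter not in plain_counts:
        plain_counts[letter] = 0
    plain_counts[letter] += 1

# defaultdict(int) supplies int() == 0 for missing keys automatically.
default_counts = defaultdict(int)
for letter in letters:
    default_counts[letter] += 1

print(plain_counts)          # {'a': 2, 'b': 1}
print(dict(default_counts))  # {'a': 2, 'b': 1}
```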
Counters
Counting the number of items in a collection is such a common task that the Python developers decided to create a convenient way to do it. I pointed her to the Counter documentation and asked her to refactor the find_modes function to use a Counter. This was the result, an easy-to-read function with a 3-line body. Belle!
from collections import Counter

def find_modes(numbers):
    num_frequencies = Counter(numbers)
    max_freq = max(num_frequencies.values())
    return [num for num, freq in num_frequencies.items() if freq == max_freq]
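To see what the Counter actually holds, and to check this final version against the two examples from the problem statement, something like the following should work. It repeats the function so the snippet runs on its own. The prints at the bottom are additions for illustration.

```python
from collections import Counter

def find_modes(numbers):
    num_frequencies = Counter(numbers)
    max_freq = max(num_frequencies.values())
    return [num for num, freq in num_frequencies.items() if freq == max_freq]

numbers = [3, 3, 3, 1, 1, 3, 4, 9, 9, 0, 0, 9, 9]

# A Counter is a dict subclass that maps each item to how often it occurs.
print(Counter(numbers))     # Counter({3: 4, 9: 4, 1: 2, 0: 2, 4: 1})

# The two examples from the problem statement.
print(find_modes(numbers))  # [3, 9]
print(find_modes([3, 3, 1, 1, 4, 9, 9, 0, 0, 9, 9]))  # [9]
```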
That was the end of my lesson. With a single problem, she practiced choosing descriptive variable names, explored two commonly used and extremely useful classes from the collections library, and learned about list comprehensions.
I like teaching techniques that inspire learners to explore on their own and apply what they discover to a problem. The technique I presented in this article does exactly that. If you find yourself sharing your programming experience with someone, consider using this technique. If you're the learner, I truly hope you learned something reading this article.
The mind is not a vessel to be filled, but a fire to be kindled. - Plutarch