Itertools - dropwhile()
remove from the iterable until <something> happens
This is the third post about itertools functions. You can see all of them Here
dropwhile lazily loops over an iterable, but it drops values from the start for as long as a condition holds. The condition is a callable, usually a lambda function.
Cool, but how does it work?
When you create a dropwhile iterator, you are creating a pointer that will be placed in the original iterator when you start to iterate
Note: In practice there are a few caveats, but we'll see them later
To create the iterator, you pass two arguments: a callable that defines the drop condition, and the iterable you want to apply it to. As stated before, dropwhile works lazily, so at this stage you only hold a reference to the original iterable, without consuming it.
This setup also means that you can still modify the initial positions of the iterable, and the changes will be taken into account in the first loop [1].
To simplify the initial concept, let's imagine you have a list of 10 numbers, and a dropwhile that drops numbers less than or equal to 5:
from itertools import dropwhile
original = [0,1,2,3,4,5,6,7,8,9]
drop = dropwhile(lambda x: x <= 5, original)
At this point in time, in memory, you will have something like this:
Then, calling next(drop) moves the pointer forward until it reaches position 0x06, which holds the first value in the sequence that is greater than 5.
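The laziness described above can be checked with a small sketch. The named predicate at_most_five and the calls list are illustrative helpers, not part of the original example; they only record every value the predicate is asked about.

```python
from itertools import dropwhile

calls = []  # hypothetical helper: records each value the predicate receives

def at_most_five(x):
    calls.append(x)
    return x <= 5

original = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
drop = dropwhile(at_most_five, original)

calls_before = list(calls)   # nothing has been evaluated yet
first = next(drop)           # walks the list until the predicate fails
calls_after = list(calls)    # the predicate saw 0 through 6

print(calls_before)  # []
print(first)         # 6
print(calls_after)   # [0, 1, 2, 3, 4, 5, 6]
```

Creating drop does not call the predicate at all; the work only happens on the first next(drop).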
Since we still hold the reference in original, we can still modify the list values, both before and after the drop pointer. But from now on, drop won't validate values. It will just keep moving forward on each iteration, even if a number meets the drop condition.
For example, with a list that counts from 0 up to 10 and back down to 0, the output starts at 6 and includes every number after that, even those that are 5 or lower:
from itertools import dropwhile
original = [0,1,2,3,4,5,6,7,8,9,10,9,8,7,6,5,4,3,2,1,0]
drop = list(dropwhile(lambda x: x <= 5, original))
print(drop)
# [6, 7, 8, 9, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
This function allows us to remove everything we don't want at the start of an iterable. You can remove initial log lines that appear during setup, or skip empty lines at the start of a file. Another use case is finding a line that does not comply with a schema. An example? Sure: given a CSV file, how do you find which line is breaking the import? Something like:
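import csv
from itertools import dropwhile

def first_bad_row(file_path):
    with open(file_path, newline='', encoding='utf-8') as f:
        reader = csv.reader(f)
        next(reader, None)  # Skip header
        # csv.reader yields lists; drop rows while they have the expected 2 columns (assumed schema)
        return next(dropwhile(lambda row: len(row) == 2, reader), None)
What ifs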
Now that we've seen how it works, let's look at a few tests that I ran
What if I remove the original list?
from itertools import dropwhile
original = [0,1,2,3,4,5,6,7,8,9]
drop = dropwhile(lambda x: x <= 5, original)
del original
When we declare drop, it stores a reference to the same list object, and Python keeps track of how many references point to each object. When del original is executed, we only remove the name original and its reference to the list. drop still holds one, so the list will not be cleaned up by the garbage collector. drop will work as expected.
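A quick way to confirm this is to consume drop after the del. This is only a sketch of the test described above; the remaining variable is added for illustration.

```python
from itertools import dropwhile

original = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
drop = dropwhile(lambda x: x <= 5, original)
del original  # only the name is removed; drop still references the list

remaining = list(drop)
print(remaining)  # [6, 7, 8, 9]
```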
What if I change the list before getting the first value from the iterator?
from itertools import dropwhile
original = [0,1,2,3,4,5,6,7,8,9]
drop = dropwhile(lambda x: x <= 5, original)
original[0] = 7
In the case above, nothing is dropped, and drop returns the whole modified list.
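To see it, the same snippet can be extended to consume drop. The result variable is added here for illustration only.

```python
from itertools import dropwhile

original = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
drop = dropwhile(lambda x: x <= 5, original)
original[0] = 7  # changed before the first next(), so the predicate sees 7 first

result = list(drop)
print(result)  # [7, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Because 7 fails the condition immediately, the dropping phase ends on the very first element, and the values 1 through 5 that follow are never checked.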
What if I change the list after I start to iterate with dropwhile?
Let's take the original list and the drop variable again. You call next(drop) to get the first value, which drops all the previous values and leaves the pointer on the first value returned. Then you modify the original list, and finally keep pulling values from the iterator.
from itertools import dropwhile
original = [0,1,2,3,4,5,6,7,8,9]
drop = dropwhile(lambda x: x <= 5, original)
next(drop)
original[7]='A'
result = next(drop)
What would result be? The answer is 'A'. Because the pointer stays where it was, any change made further ahead is picked up when the iterator reaches that index.
Additionally, if you append elements, they will also be picked up by the iterator.
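The append case can be sketched as below. The second half goes one step beyond the tests above and shows a limit observed in CPython: once the underlying list iterator has been exhausted, later appends are no longer seen.

```python
from itertools import dropwhile

original = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
drop = dropwhile(lambda x: x <= 5, original)

first = next(drop)        # 6
original.append(10)       # appended while the iterator is still active
rest = list(drop)         # consumes everything left, including the new value

original.append(11)       # appended after exhaustion
late = next(drop, None)   # the exhausted iterator does not resume

print(first)  # 6
print(rest)   # [7, 8, 9, 10]
print(late)   # None
```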
[1] Usually via next(iterator)



