Scaling your code with Python Generator Expressions

Quick Intro

Imagine you need to iterate through 100 million entries, but you never need all of them at the same time.

Does your app really need to keep 100 million entries in memory? If you just need to go through some of the entries at a given time, then generators might be a solution for you.

In this short article, I will show you the following:

  • How to create a simple generator object using a generator expression
  • How generator objects keep track of where you are in the loop

How to create a simple generator object

This is a list:
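A minimal sketch of such a list, built with a list comprehension. The size is scaled down from 100 million to one million so it runs quickly, and starting the numbers at 1 is an assumption; number_list is the name used later in this article.

```python
# Build the whole list up front: every entry exists in memory at once.
# Assumption: one million entries (not 100 million), numbered from 1.
number_list = [n for n in range(1, 1_000_001)]

print(type(number_list))  # <class 'list'>
print(len(number_list))   # 1000000
```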

This is a generator expression:
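A minimal sketch of the equivalent generator expression. The name number_generator is hypothetical, and the scaled-down size and the start at 1 are assumptions.

```python
# Same expression as a list comprehension, but wrapped in parentheses.
# Nothing is computed here; values are produced one at a time on request.
number_generator = (n for n in range(1, 1_000_001))

first_value = next(number_generator)
print(first_value)  # 1
```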

Generator expressions use parentheses instead of square brackets.

Because the list is immediately loaded into memory, it eats up a much larger chunk of memory:
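One way to see the difference is sys.getsizeof, sketched below with a scaled-down size and the hypothetical name number_generator. Note that sys.getsizeof reports only the container itself, not the integer objects a list refers to, so the list's real footprint is even larger than the number shown. Exact byte counts vary by Python version and platform, so none are quoted here.

```python
import sys

# Assumption: one million entries instead of 100 million.
number_list = [n for n in range(1, 1_000_001)]
number_generator = (n for n in range(1, 1_000_001))

# Size of each object in bytes (container only, not the referenced integers).
list_bytes = sys.getsizeof(number_list)
generator_bytes = sys.getsizeof(number_generator)

print(f"list:      {list_bytes} bytes")
print(f"generator: {generator_bytes} bytes")
```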

Also note that a generator expression creates a generator object:
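A quick check with type(), again using the hypothetical name number_generator:

```python
number_generator = (n for n in range(1, 1_000_001))

generator_type_name = type(number_generator).__name__
print(type(number_generator))  # <class 'generator'>
```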

How generator objects keep track of where you are in the loop

It means that the 100 million entries are not loaded into memory up front. A generator object only keeps track of where you are and produces each entry when you ask for it!

I'll show it to you now...

Let's print the first 10 numbers from our list:
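A sketch of that loop, assuming the list holds the numbers from 1 upward and that the loop stops with a break in an else branch, as described later in the article:

```python
# Assumption: scaled-down list of one million numbers starting at 1.
number_list = [n for n in range(1, 1_000_001)]

for n in number_list:
    if n <= 10:
        print(n)   # prints 1 through 10
    else:
        break      # stops when n reaches 11, without printing it
```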

And now from our generator object:
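The same loop over a generator object, sketched with the hypothetical name number_generator:

```python
# Assumption: generator over one million numbers starting at 1.
number_generator = (n for n in range(1, 1_000_001))

for n in number_generator:
    if n <= 10:
        print(n)   # prints 1 through 10
    else:
        break      # 11 has been pulled from the generator but is not printed
```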

Up to now, everything looks the same, right?

Let's create another loop, this time printing numbers up to 15.

Here's the list again:
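A sketch of both list loops run back to back, under the same assumptions as before (scaled-down size, numbers starting at 1). The second_loop_values list is a hypothetical helper added only to record what the second loop prints.

```python
number_list = [n for n in range(1, 1_000_001)]

# First loop: prints 1 through 10.
for n in number_list:
    if n <= 10:
        print(n)
    else:
        break

# Second loop: a new for loop gets a fresh iterator over the list,
# so it starts from 1 again and prints 1 through 15.
second_loop_values = []
for n in number_list:
    if n <= 15:
        print(n)
        second_loop_values.append(n)
    else:
        break
```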

Notice it starts from the beginning of number_list again.

Now let's run an identical loop on the generator object:
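A sketch of both generator loops run back to back. The names number_generator and second_loop_values are hypothetical; the second one exists only to record what the second loop prints.

```python
number_generator = (n for n in range(1, 1_000_001))

# First loop: prints 1 through 10, then pulls 11 and breaks.
for n in number_generator:
    if n <= 10:
        print(n)
    else:
        break

# Second loop: the generator resumes after 11, so this prints 12 through 15.
second_loop_values = []
for n in number_generator:
    if n <= 15:
        print(n)
        second_loop_values.append(n)
    else:
        break  # 16 has been pulled from the generator but is not printed
```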

See? The generator object knows exactly where it was and continues from where it left off.

The for loop calls the next() function on it behind the scenes to keep going.

We can even move the position forward ourselves with the next() function, look at that:
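Roughly speaking, a for loop over a generator behaves like the while loop below: it keeps calling next() until the generator raises StopIteration. This is a simplified sketch with a tiny, hypothetical three-element generator, not the interpreter's actual implementation.

```python
number_generator = (n for n in range(1, 4))

seen = []
while True:
    try:
        n = next(number_generator)  # ask the generator for its next value
    except StopIteration:
        break                       # the generator is exhausted
    print(n)                        # prints 1, 2, 3
    seen.append(n)
```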
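A sketch of advancing a generator by hand, assuming the hypothetical name number_generator. Every call to next() consumes one value, whether or not you use it.

```python
number_generator = (n for n in range(1, 1_000_001))

first = next(number_generator)    # 1
second = next(number_generator)   # 2
skipped = next(number_generator)  # 3 is consumed here and never printed
print(first, second)

# The loop resumes after the values already consumed: prints 4, 5, 6.
for n in number_generator:
    if n <= 6:
        print(n)
    else:
        break
```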

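In case you're wondering, 11 and 16 were not printed in the generator loops because there is no print in the final else branch before break. The generator had already handed over those values by then, so they were consumed without being shown.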

Bonus! Generator vs List lab test!

If we add 100 million numbers to a list, everything is loaded into memory up front, so it should take much longer and use much more memory than a generator.

Because we just wanted to read the first 50 entries of the generator, it took only 0.001 seconds to do it.

For the list, it took 4 seconds because it had to load everything into memory first.

If you're curious, this was the code I used for my performance function:
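The original listing is not shown here, so the following is only a sketch of what such a function could look like. The name performance, its signature, and the use of time.perf_counter are assumptions; list_size and limit are the names mentioned in the text. The example run uses one million entries rather than 100 million, so timings will differ from the figures quoted above.

```python
import time

def performance(use_generator, list_size, limit):
    """Build numbers 1..list_size, read the first `limit` of them,
    and return (elapsed_seconds, entries_read)."""
    start = time.perf_counter()
    if use_generator:
        numbers = (n for n in range(1, list_size + 1))  # lazy
    else:
        numbers = [n for n in range(1, list_size + 1)]  # fully built first
    entries_read = 0
    for n in numbers:
        if entries_read >= limit:
            break
        entries_read += 1
    elapsed = time.perf_counter() - start
    return elapsed, entries_read

list_seconds, list_read = performance(False, 1_000_000, 50)
generator_seconds, generator_read = performance(True, 1_000_000, 50)

print(f"list:      {list_seconds:.4f} s for {list_read} entries")
print(f"generator: {generator_seconds:.4f} s for {generator_read} entries")
```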

It's literally a function that generates numbers up to list_size and iterates through them up to the limit we set.

For the list, the problem is just that we have to load it into memory first, as I have mentioned from the beginning.

Published Jul 04, 2019
Version 1.0