Build a blog with Django: Use FreezeGun to improve seed data

In the last post we used Factory Boy and Faker to seed the database with a configurable number of unpublished and published posts.

However, if you looked at the created_at, updated_at and published_at timestamps across all the posts, you'd see that they were all equal, give or take a couple of milliseconds.

Posts with all equal timestamps

It's not a huge problem but it would be better if we could spread the posts across a larger timespan to make the data seem more realistic.

A nice way to do it is with a Python library called FreezeGun. It lets you control time within a limited context while your code runs as it usually would. Whenever that code asks for the current time, it gets the time you've set up instead of the real time in the physical world.

Don't worry if I'm confusing you, you'll see what I mean in a bit when I show you an example.

But, before I do that, I just wanted to mention that FreezeGun is mainly used in automated testing but I think we have a valid use case here.

So, let's get started.

FreezeGun

Install it and start up the shell:

(venv) $ pip install freezegun
(venv) $ cd src && python manage.py shell

And, try it out:

>>> from posts.factories import PostFactory
>>> post = PostFactory.create(is_published=True)
>>> post.published_at
datetime.datetime(2017, 1, 20, 9, 17, 37, 427022, tzinfo=<UTC>)

>>> import datetime
>>> from freezegun import freeze_time

>>> past = datetime.datetime(2016, 1, 1)

>>> with freeze_time(past):
...     post = PostFactory.create(is_published=True)
>>> post.published_at
FakeDatetime(2016, 1, 1, 0, 0, tzinfo=<UTC>)  
>>> post.id
2

>>> from posts.models import Post
>>> post = Post.objects.get(pk=2)
>>> post.published_at
datetime.datetime(2016, 1, 1, 0, 0, tzinfo=<UTC>)  

First, we created a post using PostFactory and this gave us a published_at timestamp based on the current time in the physical world.

Then, we imported freeze_time from the freezegun library and used it to control the time. Within that time-altered context we created another post, and when we checked its published_at timestamp it was set to the time we wanted. Fetching the post again from the database confirmed that the frozen time was what got saved.

How cool was that?

Let's apply what we learned to our create_posts method in posts/seed.py. Open that file and edit it to contain:

from freezegun import freeze_time

from .factories import PostFactory  
from .models import Post  
from .utils.random import random_datetime


def create_posts(unpublished, published):  
    """Creates a given number of unpublished and published posts
    at most 90 days into the past.
    """

    for i in range(unpublished + published):
        is_published = i >= unpublished

        with freeze_time(random_datetime(n=90)):
            PostFactory.create(is_published=is_published)

The major difference is that I'm using freeze_time so that each post's timestamps are set to a random point within a 90-day timespan. The code for the random_datetime generator can be found here.

N.B. Let me know, in the comments below, if you need me to explain how random_datetime works.
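In the meantime, here's a minimal sketch of what a helper like random_datetime might look like. It isn't the code from the repository: the function body and the optional now parameter are my assumptions, added only to illustrate the idea of picking a random moment within the last n days.

```python
import datetime
import random


def random_datetime(n=90, now=None):
    """Return a random timezone-aware datetime within the last n days.

    Hypothetical stand-in for the helper imported in posts/seed.py;
    the real implementation may differ.
    """
    # `now` is an assumed extra parameter so the reference point can be
    # fixed when needed; by default we use the current time in UTC.
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc)

    # Pick a random offset, in whole seconds, between 0 and n days.
    max_seconds = n * 24 * 60 * 60
    offset = datetime.timedelta(seconds=random.randint(0, max_seconds))

    # Subtract the offset so the result is never in the future.
    return now - offset
```

Since freeze_time accepts datetime objects, a value produced like this can be handed to it directly, which is how create_posts uses the real helper.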

And, here's what we get when we run create_posts.

Posts with timestamps spread across a 90-day timespan

Much better :).

Wrap up

That's all I wanted to show you for now. I hope you learned something new.

Check out this commit to see all the changes necessary to get it working in yaba.

I only covered what was necessary to get the feature implemented but FreezeGun can do much more. Be sure to check out their docs to see what else it does.

In the next post, as promised, I'll show you how to write a custom management command to make seeding your database that much simpler.

P.S. Have you used FreezeGun before? Did you ever consider using it in this way? Let me know in the comments below.

P.P.S. Remember to sign up for my newsletter to be the first to be notified of new posts I publish.