@retry

Posted on
python

Background

I ran into an issue at work today where I had to retry an API call an unknown but reasonable number of times.

We’re a Python shop at my current job, and there are retry decorators available on PyPI, but I didn’t want to take on another dependency, and personally I find pip a rather fragile package manager. (There’s pipenv, which all the cool kids are using, but I haven’t tried it yet.)

I came up with a working implementation in a few minutes, and revised it a bit when I got home.

import functools
import time
import logging


logging.basicConfig(level=logging.DEBUG, format='%(message)s')
logger = logging.getLogger(__name__)


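def retry(predicate=None, max_tries=3, max_delay=64, delay_s=1, exponential=False):
    """Retry f until predicate(result) is truthy or max_tries attempts are used up."""
    if predicate is None:
        # by default, any truthy return value counts as success
        predicate = bool

    def decorator(f):
        def it():
            # yields one delay (in seconds) per attempt, capped at max_delay
            def factor(p):
                if exponential:
                    return 2 ** p
                return 1

            for p in range(max_tries):
                yield min(factor(p) * delay_s, max_delay)

        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            last_e = None
            for delay in it():
                try:
                    result = f(*args, **kwargs)
                except Exception as e:
                    # Exception, not BaseException: KeyboardInterrupt and
                    # SystemExit should propagate instead of being retried
                    last_e = e
                    logger.exception('error in retry decorator!')
                else:
                    if predicate(result):
                        return result
                # note: this sleep also runs after the final attempt
                logger.debug('retrying after {} seconds'.format(delay))
                time.sleep(delay)
            if last_e is not None:
                raise last_e
            raise Exception('No more retries!')
        return wrapper
    return decorator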

Notes

  • a generator yields the delay values, which can be configured to grow exponentially
  • there’s a cap on the retry delay
  • iteration is favored over recursion to avoid any potential stack overflow
  • all exceptions are logged, but the application can ignore the messages [1]
  • the last error is re-raised if we run out of retry attempts [1]

I could have computed the delays inline in a simple for loop, but one benefit of generators is that they decouple iteration logic from application logic. That keeps the retry loop itself cleaner.
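To make the decoupling point concrete, here is a minimal sketch of the delay schedule pulled out into a standalone generator. The name backoff_delays is mine, not part of the decorator above; the parameters mirror the ones retry takes.

```python
def backoff_delays(max_tries=3, delay_s=1, max_delay=64, exponential=False):
    """Yield one delay (in seconds) per attempt, capped at max_delay."""
    for attempt in range(max_tries):
        factor = 2 ** attempt if exponential else 1
        yield min(factor * delay_s, max_delay)


# The schedule can be inspected without calling or sleeping on anything.
print(list(backoff_delays(max_tries=5, exponential=True)))  # [1, 2, 4, 8, 16]
print(list(backoff_delays(max_tries=8, exponential=True)))  # [1, 2, 4, 8, 16, 32, 64, 64]
print(list(backoff_delays(max_tries=3)))                    # [1, 1, 1]
```

Because the schedule is just an iterable, it can be checked in isolation, and the retry loop never needs to know how the delays are computed.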

Testing

The function I used for testing was:

def flaky(n_errors=0, max_tries=5,
          predicate=lambda _: True, exponential=True):
    n = {'errors': n_errors}

    @retry(max_tries=max_tries,
           predicate=predicate, exponential=exponential)
    def f():
        if n['errors'] <= 0:
            return 'success!'
        n['errors'] -= 1
        raise Exception('boom!')
    return f

Here’s what a successful series of retries looks like:

# within retry limit
f = flaky(n_errors=3, max_tries=5)
print(f())
$ python3 retry.py
error in retry decorator!
Traceback (most recent call last):
  File "retry.py", line 30, in wrapper
    result = f(*args, **kwargs)
  File "retry.py", line 57, in f
    raise Exception('boom!')
Exception: boom!
retrying after 1 seconds
error in retry decorator!
Traceback (most recent call last):
  File "retry.py", line 30, in wrapper
    result = f(*args, **kwargs)
  File "retry.py", line 57, in f
    raise Exception('boom!')
Exception: boom!
retrying after 2 seconds
error in retry decorator!
Traceback (most recent call last):
  File "retry.py", line 30, in wrapper
    result = f(*args, **kwargs)
  File "retry.py", line 57, in f
    raise Exception('boom!')
Exception: boom!
retrying after 4 seconds
success!

Here’s what failure looks like:

# no more retries
f = flaky(n_errors=3, max_tries=3)
print(f())
error in retry decorator!
Traceback (most recent call last):
  File "retry.py", line 30, in wrapper
    result = f(*args, **kwargs)
  File "retry.py", line 57, in f
    raise Exception('boom!')
Exception: boom!
retrying after 1 seconds
error in retry decorator!
Traceback (most recent call last):
  File "retry.py", line 30, in wrapper
    result = f(*args, **kwargs)
  File "retry.py", line 57, in f
    raise Exception('boom!')
Exception: boom!
retrying after 2 seconds
error in retry decorator!
Traceback (most recent call last):
  File "retry.py", line 30, in wrapper
    result = f(*args, **kwargs)
  File "retry.py", line 57, in f
    raise Exception('boom!')
Exception: boom!
retrying after 4 seconds
Traceback (most recent call last):
  File "retry.py", line 66, in <module>
    print(f())
  File "retry.py", line 40, in wrapper
    raise last_e
  File "retry.py", line 30, in wrapper
    result = f(*args, **kwargs)
  File "retry.py", line 57, in f
    raise Exception('boom!')
Exception: boom!

Note that in practice, one should be able to configure the list of exceptions to trap.
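As a rough sketch of what that could look like, here is a stripped-down variant that takes a tuple of exception types. The name retry_on and its exceptions parameter are hypothetical, and the predicate and backoff logic are left out to keep it short. It also skips the sleep after the final attempt.

```python
import functools
import time


def retry_on(exceptions=(Exception,), max_tries=3, delay_s=1):
    """Retry only when the call raises one of `exceptions`; anything else propagates."""
    def decorator(f):
        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_tries + 1):
                try:
                    return f(*args, **kwargs)
                except exceptions:
                    if attempt == max_tries:
                        raise  # out of attempts: re-raise the last error
                    time.sleep(delay_s)
        return wrapper
    return decorator


calls = {'n': 0}


@retry_on(exceptions=(ConnectionError,), max_tries=3, delay_s=0)
def fetch():
    # fails twice with a retryable error, then succeeds
    calls['n'] += 1
    if calls['n'] < 3:
        raise ConnectionError('transient')
    return 'ok'


print(fetch())  # ok
```

Anything not listed, such as a ValueError from bad input, is raised on the first attempt instead of being retried.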

Delays between retries should also be jittered in order to avoid synchronized request stampedes.
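One common way to do this is "full jitter": pick each delay uniformly at random between zero and the capped exponential delay. The sketch below is an illustration, not part of the decorator above; jittered_delays and its rng parameter are names I made up for it.

```python
import random


def jittered_delays(max_tries=3, delay_s=1, max_delay=64, rng=random):
    """Yield exponential delays with full jitter: uniform in [0, capped delay]."""
    for attempt in range(max_tries):
        capped = min((2 ** attempt) * delay_s, max_delay)
        yield rng.uniform(0, capped)


# A seeded Random instance makes the schedule reproducible for testing.
for delay in jittered_delays(max_tries=4, rng=random.Random(42)):
    print('would sleep {:.2f}s'.format(delay))
```

Since each client draws its own random delay, clients that failed at the same moment no longer retry at the same moment.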


  1. Thanks to my former manager Terry for pointing this out.