Interrupting and Resuming via Checkpoints

It is important to be able to interrupt an optimization and continue right where you left off. Reasons include scheduling on shared resources, branching the optimization with different settings, or protecting yourself against crashes in long-running processes.

Note

Currently, this is not supported by all optimizers. It is supported by gradient descent, rmsprop, adadelta and rprop.

Climin makes part of this possible and leaves the rest to the user. More specifically, the user is responsible for serializing the parameter vector (i.e. wrt), the objective function and its derivatives (e.g. fprime) and the data (i.e. args). The reason is that no generic procedure can be built for these: the data might depend on an open file descriptor, and only a subset of Python functions can be serialized, namely those defined at the top level of a module.
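The restriction on serializable functions can be checked directly. The following is a minimal sketch, not part of climin: the names top_level_fprime, make_nested_fprime and is_picklable are made up for illustration, and it uses the standard pickle module of Python 3 (cPickle offers the same interface on Python 2).

```python
import pickle


def top_level_fprime(wrt, x):
    """Gradient of 0.5 * (wrt - x) ** 2 with respect to wrt."""
    return wrt - x


def make_nested_fprime():
    # A function defined inside another function has no importable name,
    # so pickle cannot store a reference to it.
    def nested_fprime(wrt, x):
        return wrt - x
    return nested_fprime


def is_picklable(obj):
    """Return True if pickle.dumps succeeds on obj, False otherwise."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True
```

When this is run as a script, is_picklable(top_level_fprime) succeeds, while is_picklable(make_nested_fprime()) and is_picklable(lambda w, x: w - x) fail. This is because pickle stores a function by its importable name rather than by value.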

Saving the state to disk (or somewhere else)

The idea is that the info dictionary, which is the result of each optimization step, carries all the information necessary to resume. Thus, a recipe for writing your state to disk is as follows.

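import numpy as np
import cPickle
from climin.gd import GradientDescent

# make_pars, make_fprime and make_data are assumed to be given.
pars = make_pars()
fprime = make_fprime()
data = make_data()
opt = GradientDescent(pars, fprime, args=data)
for info in opt:
    # Open the file in binary mode, which is the safe choice for pickles.
    with open('state.pkl', 'wb') as fp:
        cPickle.dump(info, fp)
    # The parameter vector is stored separately from the info dictionary.
    np.savetxt('parameters.csv', pars)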

This snippet first generates the necessary quantities from library functions which we assume to be given. We then create a GradientDescent object over which we iterate to optimize. In each iteration, we pickle the info dictionary to disk and save the parameter vector to a CSV file.
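If the process is interrupted while a checkpoint is being written, the file on disk may be left incomplete. One common way to guard against this, sketched here as an assumption rather than as something climin provides, is to write to a temporary file and rename it afterwards. The helper name dump_atomically is hypothetical, and the code targets Python 3 (os.replace is not available on Python 2).

```python
import os
import pickle
import tempfile


def dump_atomically(obj, path):
    """Pickle obj to a temporary file, then rename it onto path."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the target directory so that the final
    # rename stays on one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'wb') as fp:
            pickle.dump(obj, fp, protocol=pickle.HIGHEST_PROTOCOL)
        # Replace the previous checkpoint in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial temporary file and re-raise the error.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Inside the optimization loop, dump_atomically(info, 'state.pkl') would take the place of the open and dump calls. The sketch does not flush to the physical disk, so it protects against a crashing process rather than against power loss.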

Note

Pickling an info dictionary directly to disk might be a bad idea in many cases. For example, it will contain the current data element or a gnumpy array, which is not picklable. It is the user's responsibility to take care of that.
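One possible way to take care of this is to drop the entries that cannot be pickled before writing the dictionary. The following is only a sketch under stated assumptions: picklable_items is a hypothetical helper, not a climin function, and the keys of the example dictionary are made up for illustration. It uses the standard pickle module of Python 3.

```python
import pickle


def picklable_items(info):
    """Return a copy of info that keeps only the entries pickle can handle."""
    kept = {}
    for key, value in info.items():
        try:
            pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL)
        except Exception:
            # Skip values such as open files, lambdas or device arrays.
            continue
        kept[key] = value
    return kept


# Hypothetical info dictionary; the lambda stands in for a data element
# that cannot be pickled.
info = {
    'n_iter': 12,
    'step_rate': 0.1,
    'args': (lambda: None,),
}
clean = picklable_items(info)
```

Here clean keeps n_iter and step_rate and drops args. Each value is pickled once for the check and again for the actual dump, so for large arrays it may be cheaper to exclude known keys by name. Anything dropped this way has to be recreated by the user before resuming.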

Loading the state from disk

We will now load the info dictionary from file, create an optimizer object and initialize it with values from the info dictionary.

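import numpy as np
import cPickle
from climin.gd import GradientDescent

# Restore the parameter vector and recreate the function and the data.
pars = np.loadtxt('parameters.csv')
fprime = make_fprime()
data = make_data()
# Open the pickle in binary mode, matching how it was written.
with open('state.pkl', 'rb') as fp:
    info = cPickle.load(fp)

opt = GradientDescent(pars, fprime, args=data)
# Initialize the optimizer with the values stored in the info dictionary.
opt.set_from_info(info)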

We can now continue the optimization right from where we left off.