Week 4

This week we dug deeper into the implementation of our framework and made a few changes along the way.

After merging #4230, the Perceptron is ready to implement its own on_pause_impl().

To start this off, I wrote some simple code to serialize a machine whenever the user chooses to pause after pressing CTRL+C. The idea was simply to let a user serialize the model to a CFile* member of the StoppableObject class. A user who wanted to do something else on pause could override the on_pause_impl() method and do it there.

This had a few problems:

Most of the time the file member would remain unused, which is bad design.
Problems with choosing the file_name and with overwriting existing files.
Tests: We had earlier worked on reusing the serialization tests and adding a test for whether a model is stoppable. However, we would also need a test to check that a model updates its state properly in each iteration. We came up with one, but it had its own limitations.
Interfaces like Python cannot directly override the on_pause_impl() method. The user has no choice but to write C++ code to do anything else on pause.
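
For illustration only, here is the rough shape of that first design, transliterated to Python. In Shogun the hook lives in C++, which is exactly why a Python user could not override it. The get_state() helper, the JSON format and the LoggingPerceptron subclass are assumptions made for this sketch. A generic file-like object stands in for the CFile* member.

```python
import io
import json


class StoppableObject:
    """Hypothetical Python rendering of the first design; not Shogun's code."""

    def __init__(self):
        # Stand-in for the CFile* member: stays None unless the user sets it.
        self.pause_file = None

    def get_state(self):
        """Assumed helper returning whatever should be serialized."""
        return {}

    def on_pause_impl(self):
        # Default behaviour on pause: serialize to the file member, if any.
        if self.pause_file is not None:
            json.dump(self.get_state(), self.pause_file)


class Perceptron(StoppableObject):
    def __init__(self):
        super().__init__()
        self.weights = [0.0, 0.0]
        self.bias = 0.0

    def get_state(self):
        return {"weights": self.weights, "bias": self.bias}


class LoggingPerceptron(Perceptron):
    """Anything other than serialization needs a subclass and an override."""

    def __init__(self):
        super().__init__()
        self.pause_log = []

    def on_pause_impl(self):
        self.pause_log.append(list(self.weights))


# Usage: the user has to attach a file before training to get a dump on pause.
buffer = io.StringIO()
model = Perceptron()
model.pause_file = buffer
model.on_pause_impl()  # in the real design this is triggered by choosing "pause"
print(buffer.getvalue())
```

Even in this small form the drawbacks show: every object carries a pause_file that is usually None, and any custom behaviour requires a new subclass.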

With this in mind, the approach needed more thought. So we decided to introduce the IterativeMachine class to Shogun. It lets the user simply cancel the computation, do whatever is needed with the intermediate model, and then, if desired, continue training. This approach solves all of the problems above.

We no longer need to “predict” and prepare for what the user might want to do, so StoppableObject needs no extra members. File names are now entirely under the user's control, which makes things a lot more transparent.

Implementing this means changing all iterative algorithms so that they expose a method that runs a single iteration rather than the whole training at once. As a result, a model cannot train successfully unless it updates its state in each iteration, so we no longer need complicated tests for that.
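
As a rough illustration of this idea, the sketch below shows the shape such a class could take, written in Python for brevity. Apart from continue_train(), every name here (iteration(), init_model(), complete, cancel_requested, ToyPerceptron) is an assumption made for this post and not Shogun's actual IterativeMachine API.

```python
from abc import ABC, abstractmethod


class IterativeMachine(ABC):
    """Hypothetical sketch of an iterative machine; not Shogun's actual class."""

    def __init__(self, max_iterations):
        self.max_iterations = max_iterations
        self.current_iteration = 0
        self.complete = False
        # Would be set from outside the loop, e.g. by a CTRL+C signal handler.
        self.cancel_requested = False

    @abstractmethod
    def init_model(self):
        """Reset the model state before a fresh training run."""

    @abstractmethod
    def iteration(self):
        """Run exactly one training step; set self.complete on convergence."""

    def train(self):
        self.current_iteration = 0
        self.complete = False
        self.init_model()
        return self.continue_train()

    def continue_train(self):
        # The loop lives in the base class; subclasses only supply one step.
        while not self.complete and self.current_iteration < self.max_iterations:
            if self.cancel_requested:
                self.cancel_requested = False
                break
            self.iteration()
            self.current_iteration += 1
        return self.complete


class ToyPerceptron(IterativeMachine):
    """Minimal perceptron on plain lists, only to exercise the base class."""

    def __init__(self, features, labels, learn_rate=1.0, max_iterations=100):
        super().__init__(max_iterations)
        self.features = features
        self.labels = labels  # expected to be +1 / -1
        self.learn_rate = learn_rate
        self.weights = []
        self.bias = 0.0

    def init_model(self):
        self.weights = [0.0] * len(self.features[0])
        self.bias = 0.0

    def iteration(self):
        # One pass over the data; converged when no sample is misclassified.
        errors = 0
        for x, y in zip(self.features, self.labels):
            score = sum(w * xi for w, xi in zip(self.weights, x)) + self.bias
            predicted = 1 if score >= 0 else -1
            if predicted != y:
                errors += 1
                self.weights = [w + self.learn_rate * y * xi
                                for w, xi in zip(self.weights, x)]
                self.bias += self.learn_rate * y
        self.complete = errors == 0
```

Because the loop keeps no state of its own, everything needed to resume lives in the model's members. If iteration() fails to update them, training never converges.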

But the biggest win of this approach is how easily it carries over to the interfaces. Earlier we were looking into using DirectorClasses to allow interfaces to override a method, but that feels like overkill. Now all a Python user needs to do is stop the training process with CTRL+C, perform whatever is needed, and then call continue_train(). This is much simpler to implement and more flexible.
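
The user-side workflow could then look roughly like the sketch below. InterruptibleMachine is a hypothetical pure-Python stand-in and not a Shogun class, and snapshot() is just one example of what a user might do on pause. For simplicity it catches KeyboardInterrupt directly. A real implementation would more likely set a flag from a signal handler and check it between iterations, so that an iteration is never cut short.

```python
import json


class InterruptibleMachine:
    """Hypothetical stand-in for an iterative model; not Shogun's API."""

    def __init__(self, max_iterations):
        self.max_iterations = max_iterations
        self.current_iteration = 0
        self.total = 0.0  # toy "model state", updated once per iteration

    def iteration(self):
        self.total += 1.0 / (self.current_iteration + 1)

    def train(self):
        self.current_iteration = 0
        self.total = 0.0
        return self.continue_train()

    def continue_train(self):
        try:
            while self.current_iteration < self.max_iterations:
                self.iteration()
                self.current_iteration += 1
        except KeyboardInterrupt:
            # CTRL+C only cancels the loop; completed iterations are kept.
            return False
        return True


def snapshot(machine):
    """What happens on pause is the user's choice; here, a JSON dump."""
    return json.dumps({"iteration": machine.current_iteration,
                       "total": machine.total})


def run_interactively():
    """Call this from a terminal and press CTRL+C while it is training."""
    machine = InterruptibleMachine(max_iterations=50_000_000)
    if not machine.train():
        print(snapshot(machine))   # inspect or save the intermediate model
        machine.continue_train()   # then pick up where training stopped
    return machine
```

Nothing about the pause behaviour is baked into the machine: the file name, the format, and whether to save at all are decided in the user's own script.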

Contributions

helper to serialize machine to ascii

Iterative Machine

Contributions

Shubham Shukla