Keras is a high-level API well suited to research and prototyping. The developer defines the neural network but essentially outsources its implementation, and can therefore work with little to no insight into the underlying TensorFlow graph.
This arrangement works until the model needs to be used in a Keras-independent context, most notably once training is complete and the model is exported to C++. Exporting TensorFlow graphs is already well-trodden ground, so it’s valuable to be able to treat a Keras model as if it had been created in pure TensorFlow.
Running Predictions
Before saving the graph, it’s helpful to run it in TensorFlow to gain insight into the graph architecture. That insight will be needed later when running the model in C++. First, load the model and collect the input and output tensors.
Create nodes to reset the states of the network. Each layer has two states: the hidden state (h) and the cell state (c). Each state is held in a Variable node of shape (1, N), where N is the number of cells in the layer. The reset nodes assign a zero tensor to each variable.
Each state variable is also associated with an exit node. The state variable holds the value used while computing the current outputs, whereas the exit node holds the adjusted tensor that should serve as the state in the next iteration. To carry the states forward, create nodes that assign each exit node’s value to its state variable.
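The two kinds of assign nodes described above (reset and update) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the original code: the helper names build_reset_ops and build_update_ops are hypothetical, tf1 is assumed to be the TensorFlow 1.x-style module (for example tensorflow.compat.v1) passed in by the caller, and state_vars and exit_tensors are assumed to have been collected from the graph already, in matching order.

```python
def build_reset_ops(tf1, state_vars):
    """Return one assign op per state variable, overwriting it with zeros.

    tf1 is assumed to be a TensorFlow 1.x-style module such as
    tensorflow.compat.v1; state_vars is a list of the h and c Variables.
    """
    # zeros_like keeps the (1, N) shape and dtype of each state variable.
    return [tf1.assign(var, tf1.zeros_like(var)) for var in state_vars]


def build_update_ops(tf1, state_vars, exit_tensors):
    """Return one assign op per state variable, copying in its exit tensor.

    exit_tensors is assumed to be ordered like state_vars, so that each
    variable is paired with the exit node that holds its next value.
    """
    if len(state_vars) != len(exit_tensors):
        raise ValueError("each state variable needs exactly one exit tensor")
    return [tf1.assign(var, exit_t)
            for var, exit_t in zip(state_vars, exit_tensors)]
```

With import tensorflow.compat.v1 as tf1, the calls would look like reset_ops = build_reset_ops(tf1, state_vars) and update_ops = build_update_ops(tf1, state_vars, exit_tensors). How the variables and exit tensors are located depends on the graph, so no node names are assumed here.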
Now you’re ready to run inference! Before the first iteration, reset the states by running your reset_ops; this is the TensorFlow equivalent of model.reset_states() in Keras. Then iterate through the input blocks, feeding each one using the name of the input tensor. When TensorFlow runs the graph, the output computation occurs before the state update. That is, update_ops runs after the current inference and prepares the graph for the next iteration.
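Because the ordering is easy to get wrong, here is a framework-free toy that mimics it. It is only an illustration: ToyStatefulGraph and run_sequence are hypothetical names, a single number stands in for a (1, N) state variable, and the running-sum "network" is invented for the example. The toy runs the output step and the update step as separate calls purely to make the order explicit.

```python
class ToyStatefulGraph:
    """Stand-in for a stateful graph with one state variable."""

    def __init__(self):
        self.state = 0.0  # plays the role of a state Variable
        self.exit = 0.0   # plays the role of the matching exit node

    def reset_op(self):
        # Analogous to assigning a zero tensor to the state variable.
        self.state = 0.0

    def output(self, x):
        # The output is computed from the state as it stands right now.
        y = self.state + x
        # The exit value is what the state should become next iteration.
        self.exit = y
        return y

    def update_op(self):
        # Analogous to assigning the exit node value to the state variable.
        self.state = self.exit


def run_sequence(graph, blocks, reset=True):
    """Feed blocks one at a time: output first, then the state update."""
    if reset:
        graph.reset_op()
    outputs = []
    for x in blocks:
        outputs.append(graph.output(x))  # inference with the current state
        graph.update_op()                # prepare the state for the next block
    return outputs


graph = ToyStatefulGraph()
print(run_sequence(graph, [1.0, 2.0, 3.0]))  # [1.0, 3.0, 6.0]
```

Calling run_sequence again on the same object gives the same outputs because the state is reset first. Passing reset=False instead continues from the state left behind by the previous sequence.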