The Functional API
Setup
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
Introduction
The Keras functional API is a way to create models that is more flexible than the
tf.keras.Sequential API. The functional API can handle models with non-linear topology, models
with shared layers, and models with multiple inputs or outputs.
The main idea is that a deep learning model is usually a directed acyclic graph (DAG) of layers. So the
functional API is a way to build graphs of layers.
This is a basic graph with three layers. To build this model using the functional API, start by creating
an input node:
inputs = keras.Input(shape=(784,))
The shape of the data is set as a 784-dimensional vector. The batch size is always omitted since
only the shape of each sample is specified.
If, for example, you have an image input with a shape of (32, 32, 3), you would use:
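A minimal sketch of that call (the variable name img_inputs is an assumption; only keras.Input and the shape come from the text above):

```python
from tensorflow import keras

# Input node for 32x32 RGB images; the batch dimension is left out of `shape`.
img_inputs = keras.Input(shape=(32, 32, 3))
```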
The inputs that is returned contains information about the shape and dtype of the input data that
you feed to your model. Here's the shape:
inputs.shape
https://keras.io/guides/functional_api/ 1/18
8/1/2020 The Functional API
TensorShape([None, 784])
inputs.dtype
tf.float32
You create a new node in the graph of layers by calling a layer on this inputs object:
The "layer call" action is like drawing an arrow from "inputs" to this layer you created. You're
"passing" the inputs to the dense layer, and out you get x.
x = layers.Dense(64, activation="relu")(x)
outputs = layers.Dense(10)(x)
At this point, you can create a Model by specifying its inputs and outputs in the graph of layers:
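Putting the steps so far together, the model could be assembled as in the sketch below. The name "mnist_model" is taken from the summary output; the intermediate variable name dense is an assumption.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Input node: 784-dimensional vectors (batch size omitted).
inputs = keras.Input(shape=(784,))

# Calling a layer on `inputs` adds a node to the graph of layers.
dense = layers.Dense(64, activation="relu")
x = dense(inputs)

# Two more layers, chained the same way.
x = layers.Dense(64, activation="relu")(x)
outputs = layers.Dense(10)(x)

# The model is defined by its input and output nodes.
model = keras.Model(inputs=inputs, outputs=outputs, name="mnist_model")
```

The 55,050 parameters reported in the summary follow from the three Dense layers: 784*64+64, 64*64+64, and 64*10+10.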
model.summary()
Model: "mnist_model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 784)] 0
_________________________________________________________________
dense (Dense) (None, 64) 50240
_________________________________________________________________
dense_1 (Dense) (None, 64) 4160
_________________________________________________________________
dense_2 (Dense) (None, 10) 650
=================================================================
Total params: 55,050
Trainable params: 55,050
Non-trainable params: 0
_________________________________________________________________
keras.utils.plot_model(model, "my_first_model.png")
And, optionally, display the input and output shapes of each layer in the plotted graph:
This figure and the code are almost identical. In the code version, the connection arrows are
replaced by the call operation.
A "graph of layers" is an intuitive mental image for a deep learning model, and the functional API is
a way to create models that closely mirror this.
Here, load the MNIST image data, reshape it into vectors, fit the model on the data (while
monitoring performance on a validation split), then evaluate the model on the test data:
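(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train.reshape(60000, 784).astype("float32") / 255
x_test = x_test.reshape(10000, 784).astype("float32") / 255
model.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=keras.optimizers.RMSprop(),
    metrics=["accuracy"],
)
history = model.fit(x_train, y_train, batch_size=64, epochs=2, validation_split=0.2)
test_scores = model.evaluate(x_test, y_test, verbose=2)
print("Test loss:", test_scores[0])
print("Test accuracy:", test_scores[1])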
Epoch 1/2
750/750 [==============================] - 1s 1ms/step - loss: 0.3528 - accuracy: 0.9005 -
val_loss: 0.1883 - val_accuracy: 0.9457
Epoch 2/2
750/750 [==============================] - 1s 1ms/step - loss: 0.1684 - accuracy: 0.9505 -
val_loss: 0.1385 - val_accuracy: 0.9597
313/313 - 0s - loss: 0.1361 - accuracy: 0.9605
Test loss: 0.13611680269241333
Test accuracy: 0.9605000019073486
This saved file includes the:
- model architecture
- model weight values (that were learned during training)
- model training config, if any (as passed to compile)
- optimizer and its state, if any (to restart training where you left off)
model.save("path_to_my_model")
del model
# Recreate the exact same model purely from the file:
model = keras.models.load_model("path_to_my_model")
In the example below, you use the same stack of layers to instantiate two models: an encoder model
that turns image inputs into 16-dimensional vectors, and an end-to-end autoencoder model for
training.
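A sketch of an encoder consistent with the "encoder" summary printed further down (layer types, output shapes, and parameter counts). The variable names encoder_input and encoder, and the relu activations, are assumptions.

```python
from tensorflow import keras
from tensorflow.keras import layers

# 28x28 grayscale images, named "img" as in the summary.
encoder_input = keras.Input(shape=(28, 28, 1), name="img")
x = layers.Conv2D(16, 3, activation="relu")(encoder_input)  # -> (26, 26, 16)
x = layers.Conv2D(32, 3, activation="relu")(x)  # -> (24, 24, 32)
x = layers.MaxPooling2D(3)(x)  # -> (8, 8, 32)
x = layers.Conv2D(32, 3, activation="relu")(x)  # -> (6, 6, 32)
x = layers.Conv2D(16, 3, activation="relu")(x)  # -> (4, 4, 16)
encoder_output = layers.GlobalMaxPooling2D()(x)  # -> (16,)

encoder = keras.Model(encoder_input, encoder_output, name="encoder")
encoder.summary()
```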
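x = layers.Reshape((4, 4, 1))(encoder_output)
x = layers.Conv2DTranspose(16, 3, activation="relu")(x)
x = layers.Conv2DTranspose(32, 3, activation="relu")(x)
x = layers.UpSampling2D(3)(x)
x = layers.Conv2DTranspose(16, 3, activation="relu")(x)
decoder_output = layers.Conv2DTranspose(1, 3, activation="relu")(x)
# `encoder_input` is the "img" input tensor from which `encoder_output` was computed
autoencoder = keras.Model(encoder_input, decoder_output, name="autoencoder")
autoencoder.summary()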
Model: "encoder"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
img (InputLayer) [(None, 28, 28, 1)] 0
_________________________________________________________________
conv2d (Conv2D) (None, 26, 26, 16) 160
_________________________________________________________________
conv2d_1 (Conv2D) (None, 24, 24, 32) 4640
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 8, 8, 32) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 6, 6, 32) 9248
_________________________________________________________________
conv2d_3 (Conv2D) (None, 4, 4, 16) 4624
_________________________________________________________________
global_max_pooling2d (Global (None, 16) 0
=================================================================
Total params: 18,672
Trainable params: 18,672
Non-trainable params: 0
_________________________________________________________________
Model: "autoencoder"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
img (InputLayer) [(None, 28, 28, 1)] 0
_________________________________________________________________
conv2d (Conv2D) (None, 26, 26, 16) 160
_________________________________________________________________
conv2d_1 (Conv2D) (None, 24, 24, 32) 4640
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 8, 8, 32) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 6, 6, 32) 9248
_________________________________________________________________
conv2d_3 (Conv2D) (None, 4, 4, 16) 4624
_________________________________________________________________
global_max_pooling2d (Global (None, 16) 0
_________________________________________________________________
reshape (Reshape) (None, 4, 4, 1) 0
_________________________________________________________________
conv2d_transpose (Conv2DTran (None, 6, 6, 16) 160
_________________________________________________________________
conv2d_transpose_1 (Conv2DTr (None, 8, 8, 32) 4640
_________________________________________________________________
up_sampling2d (UpSampling2D) (None, 24, 24, 32) 0
_________________________________________________________________
conv2d_transpose_2 (Conv2DTr (None, 26, 26, 16) 4624
_________________________________________________________________
conv2d_transpose_3 (Conv2DTr (None, 28, 28, 1) 145
=================================================================
Total params: 28,241
Trainable params: 28,241
Non-trainable params: 0
_________________________________________________________________
Here, the decoding architecture is strictly symmetrical to the encoding architecture, so the output
shape is the same as the input shape (28, 28, 1).
The reverse of a Conv2D layer is a Conv2DTranspose layer, and the reverse of a MaxPooling2D layer is an
UpSampling2D layer.
To see this in action, here's a different take on the autoencoder example that creates an encoder
model, a decoder model, and chains them in two calls to obtain the autoencoder model:
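A sketch of that chaining, reconstructed from the three summaries below (layer types, shapes, and the input names original_img, encoded_img, and img). The Python variable names and the relu activations are assumptions.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Encoder: image -> 16-dimensional vector.
encoder_input = keras.Input(shape=(28, 28, 1), name="original_img")
x = layers.Conv2D(16, 3, activation="relu")(encoder_input)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.MaxPooling2D(3)(x)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.Conv2D(16, 3, activation="relu")(x)
encoder_output = layers.GlobalMaxPooling2D()(x)
encoder = keras.Model(encoder_input, encoder_output, name="encoder")

# Decoder: 16-dimensional vector -> image.
decoder_input = keras.Input(shape=(16,), name="encoded_img")
x = layers.Reshape((4, 4, 1))(decoder_input)
x = layers.Conv2DTranspose(16, 3, activation="relu")(x)
x = layers.Conv2DTranspose(32, 3, activation="relu")(x)
x = layers.UpSampling2D(3)(x)
x = layers.Conv2DTranspose(16, 3, activation="relu")(x)
decoder_output = layers.Conv2DTranspose(1, 3, activation="relu")(x)
decoder = keras.Model(decoder_input, decoder_output, name="decoder")

# Calling a model on a tensor reuses its architecture and its weights.
autoencoder_input = keras.Input(shape=(28, 28, 1), name="img")
encoded_img = encoder(autoencoder_input)
decoded_img = decoder(encoded_img)
autoencoder = keras.Model(autoencoder_input, decoded_img, name="autoencoder")
```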
Model: "encoder"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
original_img (InputLayer) [(None, 28, 28, 1)] 0
_________________________________________________________________
conv2d_4 (Conv2D) (None, 26, 26, 16) 160
_________________________________________________________________
conv2d_5 (Conv2D) (None, 24, 24, 32) 4640
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 8, 8, 32) 0
_________________________________________________________________
conv2d_6 (Conv2D) (None, 6, 6, 32) 9248
_________________________________________________________________
conv2d_7 (Conv2D) (None, 4, 4, 16) 4624
_________________________________________________________________
global_max_pooling2d_1 (Glob (None, 16) 0
=================================================================
Total params: 18,672
Trainable params: 18,672
Non-trainable params: 0
_________________________________________________________________
Model: "decoder"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
encoded_img (InputLayer) [(None, 16)] 0
_________________________________________________________________
reshape_1 (Reshape) (None, 4, 4, 1) 0
_________________________________________________________________
conv2d_transpose_4 (Conv2DTr (None, 6, 6, 16) 160
_________________________________________________________________
conv2d_transpose_5 (Conv2DTr (None, 8, 8, 32) 4640
_________________________________________________________________
up_sampling2d_1 (UpSampling2 (None, 24, 24, 32) 0
_________________________________________________________________
conv2d_transpose_6 (Conv2DTr (None, 26, 26, 16) 4624
_________________________________________________________________
conv2d_transpose_7 (Conv2DTr (None, 28, 28, 1) 145
=================================================================
Total params: 9,569
Trainable params: 9,569
Non-trainable params: 0
_________________________________________________________________
Model: "autoencoder"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
img (InputLayer) [(None, 28, 28, 1)] 0
_________________________________________________________________
encoder (Functional) (None, 16) 18672
_________________________________________________________________
decoder (Functional) (None, 28, 28, 1) 9569
=================================================================
Total params: 28,241
Trainable params: 28,241
Non-trainable params: 0
_________________________________________________________________
As you can see, the model can be nested: a model can contain sub-models (since a model is just
like a layer). A common use case for model nesting is ensembling. For example, here's how to
ensemble a set of models into a single model that averages their predictions:
def get_model():
    inputs = keras.Input(shape=(128,))
    outputs = layers.Dense(1)(inputs)
    return keras.Model(inputs, outputs)
model1 = get_model()
model2 = get_model()
model3 = get_model()
inputs = keras.Input(shape=(128,))
y1 = model1(inputs)
y2 = model2(inputs)
y3 = model3(inputs)
outputs = layers.average([y1, y2, y3])
ensemble_model = keras.Model(inputs=inputs, outputs=outputs)
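For example, if you're building a system for ranking customer issue tickets by priority and routing
them to the correct department, then the model will have three inputs:
- the title of the ticket (text input)
- the text body of the ticket (text input)
- any tags attached to the ticket (categorical input)
and two outputs: a priority score and the department that should handle the ticket.
You can build this model in a few lines with the functional API: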
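num_tags = 12  # Number of unique issue tags (illustrative value)
num_words = 10000  # Vocabulary size for the text inputs (illustrative value)
num_departments = 4  # Number of departments to predict (illustrative value)
title_input = keras.Input(
    shape=(None,), name="title"
)  # Variable-length sequence of ints
body_input = keras.Input(shape=(None,), name="body")  # Variable-length sequence of ints
tags_input = keras.Input(
    shape=(num_tags,), name="tags"
)  # Binary vectors of size `num_tags`
# Embed each word in the title and in the body into a 64-dimensional vector
title_features = layers.Embedding(num_words, 64)(title_input)
body_features = layers.Embedding(num_words, 64)(body_input)
# Reduce sequence of embedded words in the title into a single 128-dimensional vector
title_features = layers.LSTM(128)(title_features)
# Reduce sequence of embedded words in the body into a single 32-dimensional vector
body_features = layers.LSTM(32)(body_features)
# Merge all available features into a single large vector via concatenation
x = layers.concatenate([title_features, body_features, tags_input])
# Output heads, named so that losses and targets can refer to them by name
priority_pred = layers.Dense(1, name="priority")(x)
department_pred = layers.Dense(num_departments, name="department")(x)
model = keras.Model(
    inputs=[title_input, body_input, tags_input],
    outputs=[priority_pred, department_pred],
)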
When compiling this model, you can assign different losses to each output. You can even assign
different weights to each loss -- to modulate their contribution to the total training loss.
model.compile(
    optimizer=keras.optimizers.RMSprop(1e-3),
    loss=[
        keras.losses.BinaryCrossentropy(from_logits=True),
        keras.losses.CategoricalCrossentropy(from_logits=True),
    ],
    loss_weights=[1.0, 0.2],
)
Since the output layers have different names, you could also specify the losses and loss weights by output name:
model.compile(
    optimizer=keras.optimizers.RMSprop(1e-3),
    loss={
        "priority": keras.losses.BinaryCrossentropy(from_logits=True),
        "department": keras.losses.CategoricalCrossentropy(from_logits=True),
    },
    loss_weights={"priority": 1.0, "department": 0.2},
)
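Train the model by passing NumPy arrays of inputs and targets, here as dictionaries keyed by input and output name:
# Dummy data (random values); `num_words`, `num_tags`, `num_departments` are the sizes used to build the model
title_data = np.random.randint(num_words, size=(1280, 10))
body_data = np.random.randint(num_words, size=(1280, 100))
tags_data = np.random.randint(2, size=(1280, num_tags)).astype("float32")
priority_targets = np.random.random(size=(1280, 1))
dept_targets = np.random.randint(2, size=(1280, num_departments))
model.fit(
    {"title": title_data, "body": body_data, "tags": tags_data},
    {"priority": priority_targets, "department": dept_targets},
    epochs=2,
    batch_size=32,
)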
Epoch 1/2
40/40 [==============================] - 1s 27ms/step - loss: 1.3097 - priority_loss:
0.6958 - department_loss: 3.0697
Epoch 2/2
40/40 [==============================] - 1s 27ms/step - loss: 1.2982 - priority_loss:
0.6946 - department_loss: 3.0178
<tensorflow.python.keras.callbacks.History at 0x154d75c50>
When calling fit with a Dataset object, it should yield either a tuple of lists like ([title_data,
body_data, tags_data], [priority_targets, dept_targets]) or a tuple of dictionaries like ({'title':
title_data, 'body': body_data, 'tags': tags_data}, {'priority': priority_targets, 'department':
dept_targets}).
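As a sketch of the dictionary form, a tf.data.Dataset can be built from NumPy arrays with from_tensor_slices. The array sizes below (1280 samples, sequence lengths 10 and 100, 12 tags, 4 departments, a vocabulary of 10000) and the random data are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf

num_samples, num_words, num_tags, num_departments = 1280, 10000, 12, 4

# Random stand-in data (assumed sizes, for illustration only).
title_data = np.random.randint(num_words, size=(num_samples, 10))
body_data = np.random.randint(num_words, size=(num_samples, 100))
tags_data = np.random.randint(2, size=(num_samples, num_tags)).astype("float32")
priority_targets = np.random.random(size=(num_samples, 1))
dept_targets = np.random.randint(2, size=(num_samples, num_departments))

# Each element is a tuple (inputs_dict, targets_dict); the keys must match
# the names of the model's input and output layers.
dataset = tf.data.Dataset.from_tensor_slices(
    (
        {"title": title_data, "body": body_data, "tags": tags_data},
        {"priority": priority_targets, "department": dept_targets},
    )
).batch(32)

# Inspect one batch to confirm the structure.
batch_inputs, batch_targets = next(iter(dataset))
print(sorted(batch_inputs.keys()), sorted(batch_targets.keys()))
```

Such a dataset could then be passed directly, as in model.fit(dataset, epochs=2), with no separate targets argument.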
For a more detailed explanation, refer to the training and evaluation guide.
A common use case for this is residual connections. Let's build a toy ResNet model for CIFAR10 to
demonstrate this:
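A sketch of the residual blocks that lead up to block_3_output, the tensor consumed by the code that follows. The filter counts, kernel sizes, pooling size, and the names block_1_output and block_2_output are assumptions; the skip connections are made with layers.add, which sums tensors of identical shape.

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(32, 32, 3), name="img")  # CIFAR10-sized images
x = layers.Conv2D(32, 3, activation="relu")(inputs)
x = layers.Conv2D(64, 3, activation="relu")(x)
block_1_output = layers.MaxPooling2D(3)(x)

# Residual block: padding="same" keeps the shape so the skip addition is valid.
x = layers.Conv2D(64, 3, activation="relu", padding="same")(block_1_output)
x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)
block_2_output = layers.add([x, block_1_output])

# Second residual block, skipping from block_2_output.
x = layers.Conv2D(64, 3, activation="relu", padding="same")(block_2_output)
x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)
block_3_output = layers.add([x, block_2_output])
```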
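x = layers.Conv2D(64, 3, activation="relu")(block_3_output)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(256, activation="relu")(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(10)(x)
# `inputs` is the image input tensor at the start of the graph
model = keras.Model(inputs, outputs, name="toy_resnet")
model.summary()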
Model: "toy_resnet"
____________________________________________________________________________________________
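# Load CIFAR10, scale pixel values to [0, 1], and one-hot encode the labels
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)
model.compile(
    optimizer=keras.optimizers.RMSprop(1e-3),
    loss=keras.losses.CategoricalCrossentropy(from_logits=True),
    metrics=["acc"],
)
# We restrict the data to the first 1000 samples so as to limit execution time
# on Colab. Try to train on the entire dataset until convergence!
model.fit(x_train[:1000], y_train[:1000], batch_size=64, epochs=1, validation_split=0.2)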
<tensorflow.python.keras.callbacks.History at 0x1559a2050>
Shared layers
Another good use for the functional API is models that use shared layers. Shared layers are
layer instances that are reused multiple times in the same model -- they learn features that
correspond to multiple paths in the graph-of-layers.
Shared layers are often used to encode inputs from similar spaces (say, two different pieces of text
that feature similar vocabulary). They enable sharing of information across these different inputs,
and they make it possible to train such a model on less data. If a given word is seen in one of the
inputs, that will benefit the processing of all inputs that pass through the shared layer.
To share a layer in the functional API, call the same layer instance multiple times. For instance,
here's an Embedding layer shared across two different text inputs:
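A minimal sketch of such sharing. The vocabulary size (1000), embedding width (128), and variable names are assumptions chosen for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers

# One Embedding instance: 1000 possible words, each mapped to a 128-dim vector.
shared_embedding = layers.Embedding(1000, 128)

# Two variable-length sequences of integer word indices.
text_input_a = keras.Input(shape=(None,), dtype="int32")
text_input_b = keras.Input(shape=(None,), dtype="int32")

# Calling the same instance twice reuses the same weights for both inputs.
encoded_input_a = shared_embedding(text_input_a)
encoded_input_b = shared_embedding(text_input_b)

# Wrapping both paths in a model shows that only one weight matrix exists.
model = keras.Model([text_input_a, text_input_b], [encoded_input_a, encoded_input_b])
```

Because both paths go through one layer instance, the model holds a single 1000 x 128 embedding matrix rather than two.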
This also means that you can access the activations of intermediate layers ("nodes" in the graph)
and reuse them elsewhere -- which is very useful for something like feature extraction.
Let's look at an example. This is a VGG19 model with weights pretrained on ImageNet:
vgg19 = tf.keras.applications.VGG19()
And these are the intermediate activations of the model, obtained by querying the graph data
structure:
Use these features to create a new feature-extraction model that returns the values of the
intermediate layer activations:
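A sketch of both steps on a small stand-in model rather than VGG19, so that it runs without downloading weights. The stand-in architecture and the names base_model, features_list, and feat_extraction_model are assumptions; the same pattern would apply to vgg19.layers and vgg19.input.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Small stand-in for a pretrained network (assumption, not VGG19).
inputs = keras.Input(shape=(32, 32, 3))
x = layers.Conv2D(8, 3, activation="relu")(inputs)
x = layers.MaxPooling2D(2)(x)
x = layers.Conv2D(16, 3, activation="relu")(x)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(10)(x)
base_model = keras.Model(inputs, outputs)

# Query the graph: one symbolic output tensor per layer.
features_list = [layer.output for layer in base_model.layers]

# New model mapping the same input to every intermediate activation.
feat_extraction_model = keras.Model(inputs=base_model.input, outputs=features_list)

# Run a random image through it; the result is a list with one entry per layer.
img = np.random.random((1, 32, 32, 3)).astype("float32")
extracted_features = feat_extraction_model(img)
```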
This comes in handy for tasks like neural style transfer, among other things.
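But if you don't find what you need, it's easy to extend the API by creating your own layers. All
layers subclass the Layer class and implement:
- a call method, which specifies the computation done by the layer
- a build method, which creates the weights of the layer (this is just a style convention, since you can create weights in __init__ as well)
To learn more about creating layers from scratch, read the custom layers and models guide.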
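class CustomDense(layers.Layer):
    def __init__(self, units=32):
        super(CustomDense, self).__init__()
        self.units = units
    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units), initializer="random_normal", trainable=True)
        self.b = self.add_weight(shape=(self.units,), initializer="random_normal", trainable=True)
    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b
inputs = keras.Input((4,))
outputs = CustomDense(10)(inputs)
model = keras.Model(inputs, outputs)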
For serialization support in your custom layer, define a get_config method that returns the
constructor arguments of the layer instance:
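class CustomDense(layers.Layer):
    def __init__(self, units=32):
        super(CustomDense, self).__init__()
        self.units = units
    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units), initializer="random_normal", trainable=True)
        self.b = self.add_weight(shape=(self.units,), initializer="random_normal", trainable=True)
    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b
    def get_config(self):
        return {"units": self.units}
inputs = keras.Input((4,))
outputs = CustomDense(10)(inputs)
model = keras.Model(inputs, outputs)
# Rebuild the model from its config; the custom class must be supplied by name
config = model.get_config()
new_model = keras.Model.from_config(config, custom_objects={"CustomDense": CustomDense})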
Optionally, implement the classmethod from_config(cls, config) which is used when recreating a
layer instance given its config dictionary. The default implementation of from_config is:
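As a sketch, that default amounts to passing the config dictionary back to the constructor as keyword arguments. The layer below is a stripped-down CustomDense (no weights) used only to illustrate the round trip; writing from_config out explicitly like this is optional.

```python
from tensorflow.keras import layers


class CustomDense(layers.Layer):
    def __init__(self, units=32):
        super().__init__()
        self.units = units

    def get_config(self):
        # Constructor arguments needed to rebuild this layer.
        return {"units": self.units}

    @classmethod
    def from_config(cls, config):
        # Unpack the config dictionary into the constructor.
        return cls(**config)


# Round trip: layer -> config dictionary -> new layer instance.
layer = CustomDense(10)
clone = CustomDense.from_config(layer.get_config())
print(clone.units)
```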
However, model subclassing provides greater flexibility when building models that are not easily
expressible as directed acyclic graphs of layers. For example, you could not implement a Tree-RNN
with the functional API and would have to subclass Model directly.
For an in-depth look at the differences between the functional API and model subclassing, read What
are Symbolic and Imperative APIs in TensorFlow 2.0?.
Less verbose
There is no super(MyClass, self).__init__(...), no def call(self, ...):, etc.
Compare:
inputs = keras.Input(shape=(32,))
x = layers.Dense(64, activation='relu')(inputs)
outputs = layers.Dense(10)(x)
mlp = keras.Model(inputs, outputs)
class MLP(keras.Model):
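For comparison, a subclassed equivalent of the functional mlp above could be sketched as follows. The attribute names dense_1 and dense_2 are assumptions; the layer sizes mirror the functional version.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


class MLP(keras.Model):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.dense_1 = layers.Dense(64, activation="relu")
        self.dense_2 = layers.Dense(10)

    def call(self, inputs):
        x = self.dense_1(inputs)
        return self.dense_2(x)


# Instantiate the model, then call it once so that its weights are created.
mlp = MLP()
outputs = mlp(tf.zeros((1, 32)))
```

Unlike the functional version, the subclassed model has no weights until it has been called on data at least once.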
This guarantees that any model you can build with the functional API will run. All debugging -- other
than convergence-related debugging -- happens statically during the model construction and not at
execution time. This is similar to type checking in a compiler.
To serialize a subclassed model, it is necessary for the implementer to specify a get_config() and
from_config() method at the model level.
You can always use a functional model or Sequential model as part of a subclassed model or layer:
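units = 32
timesteps = 10
input_dim = 5
# A small Functional model to nest inside the layer (assumed definition)
inputs = keras.Input((None, units))
x = layers.GlobalAveragePooling1D()(inputs)
outputs = layers.Dense(1)(x)
model = keras.Model(inputs, outputs)
class CustomRNN(layers.Layer):
    def __init__(self):
        super(CustomRNN, self).__init__()
        self.units = units
        self.projection_1 = layers.Dense(units=units, activation="tanh")
        self.projection_2 = layers.Dense(units=units, activation="tanh")
        # Our previously-defined Functional model
        self.classifier = model
    def call(self, inputs):
        outputs = []
        state = tf.zeros(shape=(inputs.shape[0], self.units))
        for t in range(inputs.shape[1]):
            # Project the current timestep and combine it with the previous state
            state = self.projection_1(inputs[:, t, :]) + self.projection_2(state)
            outputs.append(state)
        return self.classifier(tf.stack(outputs, axis=1))
rnn_model = CustomRNN()
_ = rnn_model(tf.zeros((1, timesteps, input_dim)))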
You can use any subclassed layer or model in the functional API as long as it implements a call
method that follows one of the following patterns:
- call(self, inputs, **kwargs) -- where inputs is a tensor or a nested structure of tensors (e.g. a list of tensors), and where **kwargs are non-tensor arguments (non-inputs).
- call(self, inputs, training=None, **kwargs) -- where training is a boolean indicating whether the layer should behave in training mode or inference mode.
- call(self, inputs, mask=None, **kwargs) -- where mask is a boolean mask tensor (useful for RNNs, for instance).
- call(self, inputs, training=None, mask=None, **kwargs) -- of course, you can have both masking and training-specific behavior at the same time.
Additionally, if you implement the get_config method on your custom Layer or model, the
functional models you create will still be serializable and cloneable.
Here's a quick example of a custom RNN, written from scratch, being used in a functional model:
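units = 32
timesteps = 10
input_dim = 5
batch_size = 16
class CustomRNN(layers.Layer):
    def __init__(self):
        super(CustomRNN, self).__init__()
        self.units = units
        self.projection_1 = layers.Dense(units=units, activation="tanh")
        self.projection_2 = layers.Dense(units=units, activation="tanh")
        self.classifier = layers.Dense(1)
    def call(self, inputs):
        outputs = []
        state = tf.zeros(shape=(inputs.shape[0], self.units))
        for t in range(inputs.shape[1]):
            # Project the current timestep and combine it with the previous state
            state = self.projection_1(inputs[:, t, :]) + self.projection_2(state)
            outputs.append(state)
        return self.classifier(tf.stack(outputs, axis=1))
# Note that you specify a static batch size for the inputs with the `batch_shape`
# arg, because the inner computation of `CustomRNN` requires a static batch size
# (when you create the `state` zeros tensor).
inputs = keras.Input(batch_shape=(batch_size, timesteps, input_dim))
x = layers.Conv1D(32, 3)(inputs)
outputs = CustomRNN()(x)
model = keras.Model(inputs, outputs)
rnn_model = CustomRNN()
_ = rnn_model(tf.zeros((1, 10, 5)))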