tf.keras.layers.GRU in TensorFlow
TensorFlow provides an easy-to-use implementation of GRU through tf.keras.layers.GRU, making it ideal for sequence-based tasks such as speech recognition, machine translation, and time-series forecasting.
Gated Recurrent Unit (GRU) is a variant of the LSTM that simplifies the architecture by using only two gates:
- Update Gate – Determines how much past information should be carried forward.
- Reset Gate – Decides how much of the past information should be forgotten.
Unlike LSTMs, GRUs do not maintain a cell state separate from the hidden state, which makes them computationally more efficient while still retaining the ability to handle long-term dependencies.
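That efficiency difference is easy to check empirically. The sketch below (the layer width of 64 and the input shape are arbitrary illustration values) compares the trainable parameter counts of a GRU and an LSTM of the same size:
Python
import tensorflow as tf

# A GRU keeps 3 weight blocks (update gate, reset gate, candidate state),
# while an LSTM keeps 4 (input, forget and output gates plus the candidate),
# so a GRU of the same width has fewer trainable parameters.
dummy = tf.zeros((1, 10, 5))  # (batch, time_steps, features)
gru = tf.keras.layers.GRU(64)
lstm = tf.keras.layers.LSTM(64)
gru(dummy)   # call each layer once so its weights are built
lstm(dummy)
print("GRU parameters: ", gru.count_params())
print("LSTM parameters:", lstm.count_params())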
Syntax of tf.keras.layers.GRU
tf.keras.layers.GRU(
    units,
    activation='tanh',
    recurrent_activation='sigmoid',
    return_sequences=False,
    return_state=False,
    dropout=0.0,
    recurrent_dropout=0.0,
    stateful=False,
    unroll=False
)
Parameters of tf.keras.layers.GRU
- units – Number of neurons in the GRU layer.
- activation – Activation function for the output (default: 'tanh').
- recurrent_activation – Activation function for recurrent connections (default: 'sigmoid').
- return_sequences – If True, returns the output for all time steps instead of just the last one.
- return_state – If True, returns the final hidden state along with the output.
- dropout – Dropout rate applied to input connections.
- recurrent_dropout – Dropout rate applied to recurrent connections.
- stateful – If True, maintains state across batches for stateful GRUs.
- unroll – If True, unrolls the recurrence into a feed-forward graph; this can speed up short sequences at the cost of extra memory.
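To see how several of these arguments interact, here is a minimal sketch (the layer width, dropout rates, and input shape are arbitrary illustration values):
Python
import tensorflow as tf

gru = tf.keras.layers.GRU(
    units=32,
    return_sequences=True,   # emit one output vector per time step
    dropout=0.2,             # fraction of input connections dropped during training
    recurrent_dropout=0.2    # fraction of recurrent connections dropped during training
)
out = gru(tf.random.normal((4, 10, 5)))  # (batch=4, time_steps=10, features=5)
print(out.shape)  # (4, 10, 32) because return_sequences=True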
How to Use tf.keras.layers.GRU in TensorFlow?
1. Import Required Libraries
Python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GRU, Dense
import numpy as np
2. Create Dummy Sequential Data
Python
# Generating dummy sequential data:
# 100 samples, each a sequence of 10 time steps with 5 features per step
X = np.random.random((100, 10, 5))
# one binary label per sample
y = np.random.randint(2, size=(100, 1))
3. Build a GRU Model
Python
model = Sequential([
    GRU(64, activation='tanh', return_sequences=True, input_shape=(10, 5)),  # first GRU layer; return_sequences=True feeds the full sequence to the next layer
    GRU(32, activation='tanh'),  # second GRU layer returns only the last time step
    Dense(1, activation='sigmoid')  # output layer for binary classification
])
# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()
4. Train the Model
Python
model.fit(X, y, epochs=10, batch_size=16)
Output:
Epoch 1/10
7/7 ━━━━━━━━━━━━━━━━━━━━ 6s 15ms/step - accuracy: 0.5487 - loss: 0.6960
.
.
.
Epoch 9/10
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step - accuracy: 0.6135 - loss: 0.6825
Epoch 10/10
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step - accuracy: 0.6224 - loss: 0.6848
<keras.src.callbacks.history.History at 0x7968ee5983d0>
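Once trained, the model can be used for inference in the usual way. The quick check below runs predict on freshly generated random sequences (illustrative only, since the training labels were random):
Python
X_new = np.random.random((3, 10, 5))  # 3 new sequences with the same shape as the training data
preds = model.predict(X_new)
print(preds)  # sigmoid probabilities, one per sequence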
Understanding return_sequences and return_state
- return_sequences=True → Returns output for each time step instead of just the last output.
- return_state=True → Returns the final hidden state along with the output.
Example: Extracting Hidden States
Python
gru_layer = GRU(50, return_sequences=True, return_state=True)
output, hidden_state = gru_layer(tf.random.normal([5, 10, 8])) # (batch_size=5, time_steps=10, features=8)
print(output.shape, hidden_state.shape)
Output:
(5, 10, 50) (5, 50)
This means:
- The output has 50 features for each of the 10 time steps in each of the 5 sequences in the batch.
- The hidden state holds 50 features per sequence and corresponds to the output at the last time step.
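For comparison, with the default return_sequences=False the layer returns only the last time step, and for a GRU that output is identical to the returned hidden state. A small sketch using the same assumed shapes as above:
Python
gru_last = GRU(50, return_state=True)  # return_sequences defaults to False
last_output, state = gru_last(tf.random.normal([5, 10, 8]))
print(last_output.shape, state.shape)  # (5, 50) (5, 50)
print(tf.reduce_all(last_output == state).numpy())  # True: the output is the final hidden state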
TensorFlow’s tf.keras.layers.GRU is a powerful alternative to LSTMs, offering faster training and fewer parameters while still effectively handling long-term dependencies. GRUs are widely used in NLP, finance, and speech processing tasks.