tf.keras.layers.GRU in TensorFlow
TensorFlow provides an easy-to-use implementation of GRU through tf.keras.layers.GRU, making it ideal for sequence-based tasks such as speech recognition, machine translation, and time-series forecasting.
Gated Recurrent Unit (GRU) is a variant of the LSTM that simplifies the architecture by using only two gates:
- Update Gate – Determines how much past information should be carried forward.
- Reset Gate – Decides how much of the past information should be forgotten.
Unlike LSTMs, GRUs do not maintain a cell state separate from the hidden state, which makes them computationally more efficient while still retaining the ability to handle long-term dependencies.
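That efficiency difference is easy to check empirically. The sketch below (the layer width of 64 and the input shape are arbitrary illustration values) compares the trainable parameter counts of a GRU and an LSTM of the same size:
Python
import tensorflow as tf

# A GRU keeps 3 weight blocks (update gate, reset gate, candidate state),
# while an LSTM keeps 4 (input, forget and output gates plus the candidate),
# so a GRU of the same width has fewer trainable parameters.
dummy = tf.zeros((1, 10, 5))  # (batch, time_steps, features)
gru = tf.keras.layers.GRU(64)
lstm = tf.keras.layers.LSTM(64)
gru(dummy)   # call each layer once so its weights are built
lstm(dummy)
print("GRU parameters: ", gru.count_params())
print("LSTM parameters:", lstm.count_params())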
Syntax of tf.keras.layers.GRU
tf.keras.layers.GRU(
    units,
    activation='tanh',
    recurrent_activation='sigmoid',
    return_sequences=False,
    return_state=False,
    dropout=0.0,
    recurrent_dropout=0.0,
    stateful=False,
    unroll=False
)
Parameters of tf.keras.layers.GRU
- units – Number of neurons in the GRU layer.
- activation – Activation function for the output (default: 'tanh').
- recurrent_activation – Activation function for recurrent connections (default: 'sigmoid').
- return_sequences – If True, returns the output for all time steps instead of just the last one.
- return_state – If True, returns the final hidden state along with the output.
- dropout – Dropout rate applied to input connections.
- recurrent_dropout – Dropout rate applied to recurrent connections.
- stateful – If True, maintains state across batches for stateful GRUs.
- unroll – If True, unrolls the recurrence into a feed-forward graph; this can speed up short sequences at the cost of extra memory.
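To see how several of these arguments interact, here is a minimal sketch (the layer width, dropout rates, and input shape are arbitrary illustration values):
Python
import tensorflow as tf

gru = tf.keras.layers.GRU(
    units=32,
    return_sequences=True,   # emit one output vector per time step
    dropout=0.2,             # fraction of input connections dropped during training
    recurrent_dropout=0.2    # fraction of recurrent connections dropped during training
)
out = gru(tf.random.normal((4, 10, 5)))  # (batch=4, time_steps=10, features=5)
print(out.shape)  # (4, 10, 32) because return_sequences=True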
How to Use tf.keras.layers.GRU in TensorFlow?
1. Import Required Libraries
Python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GRU, Dense
import numpy as np
2. Create Dummy Sequential Data
Python
# Generating dummy sequential data:
# 100 samples, each a sequence of 10 time steps with 5 features per step
X = np.random.random((100, 10, 5))
# one binary label per sample
y = np.random.randint(2, size=(100, 1))
3. Build a GRU Model
Python
model = Sequential([
    GRU(64, activation='tanh', return_sequences=True, input_shape=(10, 5)),  # first GRU layer; return_sequences=True feeds the full sequence to the next layer
    GRU(32, activation='tanh'),  # second GRU layer returns only the last time step
    Dense(1, activation='sigmoid')  # output layer for binary classification
])
# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()
4. Train the Model
Python
model.fit(X, y, epochs=10, batch_size=16)
Output:
Epoch 1/10
7/7 ━━━━━━━━━━━━━━━━━━━━ 6s 15ms/step - accuracy: 0.5487 - loss: 0.6960
.
.
.
Epoch 9/10
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step - accuracy: 0.6135 - loss: 0.6825
Epoch 10/10
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step - accuracy: 0.6224 - loss: 0.6848
<keras.src.callbacks.history.History at 0x7968ee5983d0>
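Once trained, the model can be used for inference in the usual way. The quick check below runs predict on freshly generated random sequences (illustrative only, since the training labels were random):
Python
X_new = np.random.random((3, 10, 5))  # 3 new sequences with the same shape as the training data
preds = model.predict(X_new)
print(preds)  # sigmoid probabilities, one per sequence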
Understanding return_sequences and return_state
- return_sequences=True → Returns output for each time step instead of just the last output.
- return_state=True → Returns the final hidden state along with the output.
Example: Extracting Hidden States
Python
gru_layer = GRU(50, return_sequences=True, return_state=True)
output, hidden_state = gru_layer(tf.random.normal([5, 10, 8])) # (batch_size=5, time_steps=10, features=8)
print(output.shape, hidden_state.shape)
Output:
(5, 10, 50) (5, 50)
This means:
- The output has 50 features for each of the 10 time steps in each of the 5 sequences in the batch.
- The hidden state holds 50 features per sequence and corresponds to the output at the last time step.
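For comparison, with the default return_sequences=False the layer returns only the last time step, and for a GRU that output is identical to the returned hidden state. A small sketch using the same assumed shapes as above:
Python
gru_last = GRU(50, return_state=True)  # return_sequences defaults to False
last_output, state = gru_last(tf.random.normal([5, 10, 8]))
print(last_output.shape, state.shape)  # (5, 50) (5, 50)
print(tf.reduce_all(last_output == state).numpy())  # True: the output is the final hidden state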
TensorFlow’s tf.keras.layers.GRU is a powerful alternative to LSTMs, offering faster training and fewer parameters while still effectively handling long-term dependencies. GRUs are widely used in NLP, finance, and speech processing tasks.