tf.keras.layers.LSTM in TensorFlow
Last Updated: 09 Feb, 2025
The tf.keras.layers.LSTM layer is a built-in TensorFlow layer designed to handle sequential data efficiently. It is widely used for applications like:
- Text Generation
- Machine Translation
- Stock Price Prediction
- Speech Recognition
- Time-Series Forecasting
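Long Short-Term Memory (LSTM) networks address a key limitation of standard Recurrent Neural Networks (RNNs): gradients vanish over many time steps, which makes long-range dependencies hard to learn. LSTMs use three gates (forget, input, and output) that control what is discarded from, written to, and read from a persistent cell state, which helps retain important information over long sequences.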
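To make the role of the gates concrete, here is a minimal NumPy sketch of a single LSTM time step. It is an illustration, not the Keras implementation: the helper names (`sigmoid`, `lstm_step`), the weight shapes, and the gate ordering (input, forget, candidate, output) are assumptions chosen for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step (illustrative sketch).

    x_t: (batch, features); h_prev, c_prev: (batch, units)
    W: (features, 4 * units); U: (units, 4 * units); b: (4 * units,)
    """
    # Compute all four pre-activations at once, then split them per gate.
    z = x_t @ W + h_prev @ U + b
    i, f, g, o = np.split(z, 4, axis=1)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # input, forget, output gates
    g = np.tanh(g)                                # candidate cell values
    c_t = f * c_prev + i * g                      # forget old memory, write new
    h_t = o * np.tanh(c_t)                        # expose part of the memory
    return h_t, c_t

rng = np.random.default_rng(0)
batch, features, units = 2, 5, 3
W = rng.normal(scale=0.1, size=(features, 4 * units))
U = rng.normal(scale=0.1, size=(units, 4 * units))
b = np.zeros(4 * units)

# Run the cell over 10 time steps, carrying h and c forward.
h = np.zeros((batch, units))
c = np.zeros((batch, units))
for t in range(10):
    x_t = rng.normal(size=(batch, features))
    h, c = lstm_step(x_t, h, c, W, U, b)

print(h.shape, c.shape)
```

The cell state `c` is updated only by element-wise scaling and addition, which is what lets information persist across many steps.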
Syntax of tf.keras.layers.LSTM
tf.keras.layers.LSTM(
    units,
    activation='tanh',
    recurrent_activation='sigmoid',
    kernel_initializer='glorot_uniform',
    return_sequences=False,
    return_state=False,
    go_backwards=False,
    dropout=0.0,
    recurrent_dropout=0.0,
    stateful=False,
    unroll=False
)
Parameters of tf.keras.layers.LSTM:
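- units – Dimensionality of the output space, i.e. the size of the hidden state and cell state.
- activation – Activation function applied to the candidate cell values and the output (default: 'tanh').
- recurrent_activation – Activation function used for the gates (default: 'sigmoid').
- return_sequences – If True, returns the output for every time step instead of only the last one.
- return_state – If True, returns the final hidden state and cell state along with the output.
- go_backwards – If True, processes the input sequence in reverse order.
- stateful – If True, the final state for each sample in a batch is used as the initial state for the sample at the same index in the next batch.
- dropout – Fraction of units to drop for the input connections (between 0 and 1).
- recurrent_dropout – Fraction of units to drop for the recurrent connections (between 0 and 1).
- kernel_initializer – Initializer for the input weight matrix (default: 'glorot_uniform').
- unroll – If True, unrolls the recurrent loop, which can speed up short sequences at the cost of more memory.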
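The following sketch shows how a few of these parameters affect a call to the layer. The input tensor and layer sizes are arbitrary values chosen for illustration. `tf.keras.layers.Bidirectional` is a separate wrapper layer that is not part of the LSTM signature above. It appears here only to contrast with `go_backwards`.

```python
import tensorflow as tf
from tensorflow.keras.layers import LSTM, Bidirectional

x = tf.random.normal([4, 10, 8])  # (batch, time_steps, features)

# Dropout is only applied when the layer is called with training=True.
regularized = LSTM(16, dropout=0.2, recurrent_dropout=0.2)
out_train = regularized(x, training=True)

# go_backwards=True feeds the time steps to the layer in reverse order.
backward = LSTM(16, go_backwards=True)
out_back = backward(x)

# Bidirectional runs a forward and a backward copy of the layer and,
# by default, concatenates their outputs (16 + 16 = 32 features).
both = Bidirectional(LSTM(16))
out_both = both(x)

print(out_train.shape, out_back.shape, out_both.shape)
```

Note that a non-zero `recurrent_dropout` typically prevents the layer from using the fused GPU kernel, so it may train more slowly.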
How to Use tf.keras.layers.LSTM in TensorFlow?
Let's learn to use LSTMs in TensorFlow, covering key parameters like return_sequences and return_state. You'll also understand how LSTMs process sequences and retain long-term dependencies through hidden and cell states.
1. Import Required Libraries
Python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
import numpy as np
2. Create Dummy Sequential Data
Python
# Generate random input data: 100 samples, 10 time steps, 5 features per step
X = np.random.random((100, 10, 5))
# Random binary labels, one per sample
y = np.random.randint(2, size=(100, 1))
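The random array above already has the 3D shape an LSTM expects: (samples, time_steps, features). Real data often starts as a flat series, so here is one possible way to cut it into that shape with sliding windows. The helper `make_windows` and the sine-wave series are hypothetical, introduced only for this sketch.

```python
import numpy as np

def make_windows(series, time_steps):
    """Split a 1D series into overlapping windows and next-step targets."""
    series = np.asarray(series, dtype="float32")
    n = len(series) - time_steps
    # Each row holds `time_steps` consecutive values.
    X = np.stack([series[i:i + time_steps] for i in range(n)])
    # The target for each window is the value right after it.
    y = series[time_steps:]
    # Add a trailing feature axis: (samples, time_steps, 1).
    return X[..., np.newaxis], y

series = np.sin(np.linspace(0, 20, 200))
X_seq, y_seq = make_windows(series, time_steps=10)
print(X_seq.shape, y_seq.shape)  # (190, 10, 1) (190,)
```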
3. Build an LSTM Model
Python
model = Sequential([
    tf.keras.Input(shape=(10, 5)),  # 10 time steps, 5 features per step
    LSTM(50, activation='tanh', return_sequences=True),  # First LSTM layer
    LSTM(30, activation='tanh'),  # Second LSTM layer
    Dense(1, activation='sigmoid')  # Output layer for binary classification
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()
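Output: the summary lists the first LSTM layer with output shape (None, 10, 50), the second LSTM layer with output shape (None, 30), and the Dense layer with output shape (None, 1), for 20,951 trainable parameters in total (11,200 + 9,720 + 31).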
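The parameter counts that model.summary() reports for this architecture can be checked by hand. An LSTM layer has four weight sets (three gates plus the candidate cell values), each with an input kernel, a recurrent kernel, and a bias. The helper functions below are hypothetical names used only for this calculation.

```python
def lstm_params(input_dim, units):
    # 4 weight sets, each: input kernel + recurrent kernel + bias
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    # weight matrix + bias
    return input_dim * units + units

first = lstm_params(5, 50)    # first LSTM: 5 input features, 50 units
second = lstm_params(50, 30)  # second LSTM: 50 inputs from the first layer
output = dense_params(30, 1)  # Dense output layer

print(first, second, output, first + second + output)
# 11200 9720 31 20951
```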
4. Train the Model
Python
model.fit(X, y, epochs=10, batch_size=16)
Output:
Epoch 1/10
7/7 ━━━━━━━━━━━━━━━━━━━━ 4s 14ms/step - accuracy: 0.5260 - loss: 0.6946
.
.
.
Epoch 10/10
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 14ms/step - accuracy: 0.5830 - loss: 0.6830
<keras.src.callbacks.history.History at 0x7968ee53b250>
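Because the labels here are random, accuracy is expected to stay close to 50%. The History object returned by model.fit stores the per-epoch metrics, and the trained model can be used for prediction. The sketch below rebuilds the same model so that it runs on its own. The 0.5 decision threshold and the reduced epoch count are assumptions made for this example.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

X = np.random.random((100, 10, 5))
y = np.random.randint(2, size=(100, 1))

model = Sequential([
    tf.keras.Input(shape=(10, 5)),
    LSTM(50, return_sequences=True),
    LSTM(30),
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Keep the returned History object to inspect the metrics afterwards.
history = model.fit(X, y, epochs=2, batch_size=16, verbose=0)
print(history.history['loss'])  # one loss value per epoch

# The sigmoid output is a probability; threshold it to get class labels.
probs = model.predict(X[:5], verbose=0)
labels = (probs > 0.5).astype(int)
print(probs.shape, labels.ravel())
```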
Understanding return_sequences and return_state
- return_sequences=True → Returns the output for each time step instead of just the final one.
- return_state=True → Returns the hidden state and cell state along with the output.
Example:
Python
lstm_layer = LSTM(50, return_sequences=True, return_state=True)
output, hidden_state, cell_state = lstm_layer(tf.random.normal([5, 10, 8])) # (batch_size=5, time_steps=10, features=8)
print(output.shape, hidden_state.shape, cell_state.shape)
Output:
(5, 10, 50) (5, 50) (5, 50)
This means:
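- The output has shape (batch_size, time_steps, units): 50 values for each of the 10 time steps of each of the 5 samples.
- The hidden state and cell state each have shape (batch_size, units): 50 values per sample, taken from the final time step.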
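One way to see how these values relate: with return_sequences=True, the last time step of the output should match the returned hidden state, since both are the final hidden vector. The sketch below checks this numerically. The tolerance passed to `np.allclose` is an arbitrary choice.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import LSTM

x = tf.random.normal([5, 10, 8])  # (batch_size, time_steps, features)

layer = LSTM(50, return_sequences=True, return_state=True)
output, hidden_state, cell_state = layer(x)

# Slice the final time step out of the full output sequence.
last_step = output[:, -1, :]

# Compare it with the returned hidden state.
same = np.allclose(last_step.numpy(), hidden_state.numpy(), atol=1e-5)
print(last_step.shape, same)
```

The cell state, by contrast, is internal memory and is never part of the output sequence.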
TensorFlow's tf.keras.layers.LSTM handles sequential data with options for returning full sequences and final states, reverse-order processing via go_backwards (or bidirectional processing when wrapped in tf.keras.layers.Bidirectional), and dropout regularization. For tasks in NLP, finance, or speech recognition, LSTMs remain a common choice for capturing long-term dependencies.