
Development of A Secure Steganography System Using Advanced Encryption Standard With Polyalphabetic Substitution

This document provides an introduction to steganography, a method of covert communication that hides secret messages within various media, and discusses its significance in data security alongside cryptography. It outlines the challenges and techniques in steganography, including the integration of advanced encryption standards and polyalphabetic ciphers to enhance security. The study aims to develop a secure steganography system that protects sensitive information, particularly in cloud environments, while addressing current vulnerabilities in digital communication.

Uploaded by

ajogbejeo

CHAPTER ONE

1.0 INTRODUCTION

1.1 Background of the Study

Steganography, derived from the Greek words "steganos" (covered) and "graphia" (writing), is an

ancient practice and a modern field for covert communication. It's both a science and an art

focused on hiding secret messages within various cover media like text, images (color, binary,

grayscale), audio, and video. Essentially, it's the process of embedding one form of data

(message, audio, image, or video) within another to protect sensitive information from malicious

attacks (Dhawan & Gupta, 2021).

Steganography aims to conceal information by making the "stego image" (the image containing

the hidden message) appear nearly identical to the original "cover image," thus evading

detection. The core challenges in image steganography research revolve around two main areas:

creating effective steganography (embedding the secret message) and developing corresponding

extraction techniques for the receiver to retrieve the hidden information (Song et al., 2024).

Steganography does not scramble the message. Instead, anyone searching for the important file tends to overlook the file that carries it, because he or she does not know that the ordinary-looking, less important file is a steganographic file (Odii et al., 2021).

Cryptography alters the data, steganography hides the data, and watermarking protects the ownership of data (Modern Education Society’s College of Engineering, Pune, India. et al., 2020).

In the current scenario, secret messages can be sent by hiding them in an image or a text so that nobody other than the sender and receiver can read or see the message. This hiding and unhiding of data is known as steganography (Wahab et al., 2021).

Steganography and cryptography both help to provide data safety. Steganography hides the existence of a message by inserting data into some other digital medium, while cryptography transforms data into a cypher object that is unreadable to an ordinary user (Mohammed Abdulmaged et al., 2024). Message hiding techniques are primarily categorized into Cryptography,

Steganography, and Digital Watermarking. Data transmission security and durability can be

enhanced by integrating steganography (the concealment of messages within other media) and

cryptography (the encryption of messages using keys), as both are effective methods for ensuring

data security (Raphael George A. Mendoza et al., 2024a).

Steganalysis plays a vital role in cybersecurity in today’s digital era where exchange of malicious

information can be done easily across web pages. Steganalysis is the process for detection of

steganography within an object and can be categorized as active and passive steganalysis.

Passive steganalysis tries to classify a given object as a clean or modified object. Active

steganalysis aims to extract more details about hidden contents such as length of embedded

message, region of inserted message, key used for embedding, required by cybersecurity experts

for comprehensive analysis (Bedi et al., 2024).

Steganography plays a pivotal role in the SafePass paradigm. One of the generated shares is

covertly embedded within images, including reCAPTCHA images or images chosen by users,

thus providing an additional layer of obscurity and resilience against malicious actors (Chauhan

et al., 2024). Image steganography plays a pivotal role in secure data communication and

confidentiality protection, particularly in cloud-based environments (Ahmad et al., 2024).


A steganographic cover contains a secret payload, such as an image, that only the intended recipient can extract. Any digital file, being made of bits, can serve as a steganographic medium (Darwis et al., 2021a). In image steganography, the cover image is manipulated in such a way that the hidden data is not visible, so it does not arouse suspicion the way an encrypted message does (Subramanian et al., 2021). The

synergy of encryption and steganography not only safeguards the integrity of image data but also

adds an additional layer of concealment, making it arduous for adversaries to detect and

decypher sensitive information (Senior Professor, Computer Science and Engineering

department in Galgotias University, Gautam Budhh Nagar, U.P et al., 2020).

In essence, steganography hides what is being communicated, cryptography scrambles the

content of the communication, and digital watermarking proves ownership or authenticity.

Every gadget is becoming smarter; smartphones, automobiles, industrial sensors, home

appliances, medical equipment, and much more are now online around-the-clock, allowing our

gadgets to communicate with one another and improve our quality of life. This is what we hear

about every day—the Internet of EVERYTHING! It may, however, also endanger businesses

and their customers. In order to prevent unwanted access to private information and maintain the secrecy of sensitive data in cyberspace, hidden data is essential. It reduces the possibility of cyberattacks and data breaches while enabling safe lines of communication. The Vigenère cypher has vulnerabilities, particularly to brute-force attacks when short keys are used (Purwanti et al., 2024).

The necessity for both steganography and cryptography emerges from the core requirement to

safeguard information in adversarial environments. Steganography addresses the challenge of

covert communication, aiming to transmit secret messages without betraying their existence,

especially when overt encryption might draw unwanted attention. Cryptography, conversely,
tackles the inherent insecurity of communication channels and data storage, striving to guarantee

the confidentiality, integrity, authenticity, and non-repudiation of information, even when subject

to interception or manipulation by adversaries. Ultimately, both disciplines are vital in preserving

privacy and trust amidst ubiquitous information threats.

1.2 Statement of the Problem

Ensuring the security of communication networks and information is important, and balancing simplicity and complexity remains a real issue; that balance is the key to improving the effectiveness of a scheme in many application scenarios. Current research on the security of digital images and on barriers that protect sensitive data still has room for improvement (Tong et al., 2023). The

classical Vigenère cypher, while historically significant, is open to frequency analysis and other

forms of cryptographic attacks due to its reliance on a repetitive key and alphabetic cyphertext.

To address this weakness, Hassan proposed a modification that uses an improved Vigenère square

table which introduces alphanumeric cyphertext, expanding the character set from 26 alphabetic

characters to 31 characters, including letters and digits (Hassan, 2024).

The Vigenère cypher is one of the earliest known polyalphabetic cyphers. It was considered secure for a very long time, until Kasiski (1863) and later Friedman showed that it could be broken by locating repeating segments of the cyphertext and using them to determine the length of the key (Mohammad & Olaniyan, 2016). With billions of active internet users at any given moment, user data privacy is vulnerable to potential attackers, so secure digital transmission has always been a concern. With so much research under way in cryptographic systems, there is a constant search for new methods that can improve the security of data sent over the internet (Anitha Malaghan et al., 2022).
Security on the internet is a shared responsibility. Individuals and corporations may secure their

digital presence and sensitive information by identifying the threats and applying appropriate

mitigation strategies. This calls for enhanced security mechanisms for shared documents on the

internet.

1.3 How this Project will Address the Challenge

This project aims to combat data breaches by integrating AES and polyalphabetic cyphers (new

method) into a steganographic approach. This will enhance the security of communication

networks and information, which is crucial in today's digital landscape. The challenge lies in

finding the right balance between simplicity and complexity, which is key to making any

security scheme effective across various applications. While current research focuses on securing

digital images and protecting sensitive data, there's still room for improvement in these areas.

1.4 Aim and Objectives of the Study

The aim of this project is to develop a secure steganography system using the Advanced Encryption Standard (AES) with polyalphabetic substitution.

To achieve this aim, the objectives are to:

i. Design and implement a cypher with AES encryption using Galois/Counter Mode.

ii. Integrate the polyalphabetic cypher that supports the full Unicode character set.

iii. Develop a steganographic module to embed the encrypted message into an image.

iv. Evaluate the system's performance based on metrics such as security, computational

efficiency, and payload capacity.

v. Develop a standalone application interface for end-users.


1.5 Scope of the study

This study focuses on developing a standalone desktop application capable of:

i. Encrypting text files using AES encryption (in Galois/Counter Mode).

ii. Applying a modified Vigenère cypher (that supports the full Unicode range) to further

secure the cypher text.

iii. Embedding the final encrypted data into an image using steganographic techniques.

The system will support secure message encoding, image processing, and user interaction

through a graphical interface.

1.6 Significance of the Study

This study contributes significantly to the field of data security by implementing a multi-layered

approach that combines steganography, the Vigenère cypher, and AES encryption in

Galois/Counter Mode. The proposed system enhances the confidentiality, integrity, and

authenticity of sensitive information, especially in cloud-based environments. By embedding

encrypted data into innocuous cover images, the approach offers an additional layer of security

against interception and unauthorized access, making it difficult for attackers to detect or decrypt

the concealed messages.

CHAPTER TWO
2.0 LITERATURE REVIEW

This Chapter provides an analysis of the significant works that have formed the foundation of the

methodology used in this study. It also gives a detailed examination of related concepts in

Steganography and Encryption.

2.1 Steganography

Steganography involves embedding confidential information within ordinary multimedia

in a way that makes it indistinguishable to an outside observer. This technique, a

relatively modern field of study, is now commonly used to covertly conceal data from

human perception (Al Rubaie & Al-Azawi, 2024). It conceals sensitive information

within an ordinary file, called a "cover file," to prevent its detection. This allows

messages to be sent securely without the risk of being intercepted or leaked (Modern

Education Society’s College of Engineering, Pune, India. et al., 2020). It is used to hide

sensitive information—like a message, audio, image, or video by embedding it within

another, seemingly innocuous, file of the same type. Its main purpose is to safeguard

secret data from malicious attacks (Dhawan & Gupta, 2021).

A steganography cover is just a regular file, like an image or audio clip, that's used to

hide a secret message. Think of it as the disguise for hidden information. The secret

content, such as a hidden image, is tucked inside this cover, making it invisible to anyone

who isn't the intended recipient. Essentially, any digital file, down to its basic bits, can

serve as a medium for steganography (Darwis et al., 2021a).


In today's digital age, where malicious information can be easily exchanged online,

steganalysis is crucial for cybersecurity. It's the process of detecting steganography

within an object, and can be broadly divided into two types:

i. Passive steganalysis aims to simply classify an object as either original or having hidden

content (Bedi et al., 2024).

ii. Active steganalysis goes further, attempting to extract more detailed information about

the hidden content, such as the length of the embedded message, the specific area where

it's inserted, and even the key used for embedding. This deeper level of analysis is vital

for cybersecurity experts (Bedi et al., 2024).

[Diagram: Steganography branches into five media: Image, Video, Audio, Text, and Network]

Figure 2.1: Different Mediums to achieve steganography (Senior Professor, Computer Science

and Engineering department in Galgotias University, Gautam Budhh Nagar, U.P et al., 2020).

Steganography hides encrypted data directly within an image's pixels or metadata, making it

invisible. Combining encryption with steganography not only protects the image's integrity but

also creates an extra layer of stealth, making it much harder for anyone to detect and

understand the sensitive information (Senior Professor, Computer Science and Engineering

department in Galgotias University, Gautam Budhh Nagar, U.P et al., 2020).

2.1.1 Techniques used in steganography


i. Spatial Domain Steganography: This method hides secret data directly in the intensity values of pixels. Its reported benefits include high embedding capacity, a simple system design, and hidden messages that are difficult to perceive (Selvamani & Yusoff, 2024).

ii. Least Significant Bit (LSB) substitution: The least significant bit is the lowest-value bit of a digital data point (such as a pixel or audio sample); LSB substitution alters that bit to embed secret information (Sead et al., 2023).

iii. Transform Domain Steganography: Information is hidden through the application of

various algorithms and modifications, making this an intricate embedding process. This

technique, which can be somewhat challenging, is categorized by its core transform

methods: Discrete Cosine Transform (Used for breaking down signals into their basic frequency

parts), Discrete Wavelet Transform (Employs discrete sampling of wavelets), Discrete Fourier

Transform (Used to determine the frequency components of individual pixel values) (Senior

Professor, Computer Science and Engineering department in Galgotias University, Gautam

Budhh Nagar, U.P et al., 2020).

iv. Frequency Domain Steganography: Instead of making direct changes to pixels or samples

(spatial domain steganography), frequency domain steganography embeds secret

information into the frequency components of a cover medium, typically images, audio,

or video (Setyaningsih et al., 2020).

v. Deep Learning-based Steganography: This approach uses deep learning models, particularly neural networks, to embed (hide) secret information within digital media such as images and to extract it again (Song et al., 2024). More recently, such methods have shown promise in adaptively learning optimal embedding strategies, improving imperceptibility and resistance to steganalysis (Song et al., 2024).

vi. Adaptive Steganography: also called content-aware or statistical steganography, is a more

advanced way to hide information than simpler methods like basic Least Significant Bit

(LSB) embedding. Instead of making uniform changes, this technique adapts the

embedding process to the specific characteristics of the cover file. It strategically places

hidden data in "noisy," "textured," or "complex" areas where slight alterations are

unlikely to be noticed or to create suspicious patterns. Conversely, it avoids "smooth" or

"homogeneous" regions where even tiny changes would be easily spotted (Amitava

Podder et al., 2022).

LSB (Least Significant Bit) substitution is favored for its simplicity of implementation and

perceptual imperceptibility. It requires minimal computational effort and mathematical

complexity, as it directly modifies the least significant bits of pixel color values. These minute

changes are typically undetectable by the human eye, ensuring the stego-image remains visually

indistinguishable from the original cover image.
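The LSB substitution described above can be sketched in a few lines. This is a minimal illustration, not the system developed in this study: it assumes the cover is already available as a flat sequence of 8-bit channel values, and the function names `embed_lsb` and `extract_lsb` and the 4-byte length header are hypothetical choices made for the example.

```python
def embed_lsb(pixels: bytes, payload: bytes) -> bytes:
    """Hide payload in the least significant bit of each cover byte."""
    # Assumption of this sketch: a 4-byte big-endian length header tells
    # the receiver how many payload bytes to read back.
    data = len(payload).to_bytes(4, "big") + payload
    bits = [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover is too small for this payload")
    stego = bytearray(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set it
    return bytes(stego)


def extract_lsb(stego: bytes) -> bytes:
    """Recover the payload written by embed_lsb."""
    def read_bytes(start_bit: int, count: int) -> bytes:
        out = bytearray()
        for b in range(count):
            value = 0
            for k in range(8):
                value = (value << 1) | (stego[start_bit + b * 8 + k] & 1)
            out.append(value)
        return bytes(out)

    length = int.from_bytes(read_bytes(0, 4), "big")
    return read_bytes(32, length)


cover = bytes(range(256)) * 2  # stand-in for 512 channel values of an image
stego = embed_lsb(cover, b"secret")
assert extract_lsb(stego) == b"secret"
# No channel value moves by more than 1, which is why the change is hard to see.
assert max(abs(a - b) for a, b in zip(cover, stego)) <= 1
```

Each payload bit costs one cover byte, so the capacity of this sketch is one eighth of the cover size minus the header. Sequential embedding like this is also the easiest pattern for steganalysis to detect, which is one reason the payload is encrypted first in layered designs.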

In summation, spatial domain steganography techniques are characterized by their algorithmic

simplicity and capacity for high perceptual imperceptibility for minimal data modifications, yet

they exhibit inherent limitations in robust security. Transform domain methodologies, in

contrast, offer enhanced resilience. Deep learning-based approaches constitute the vanguard of

the field, providing superior undetectability and greater adaptability, albeit at the expense of

substantially elevated computational resource requirements and systemic complexity.

2.2 Cryptography
Cryptography, as a discipline, has evolved through two distinct phases: classical and modern.

Classical cryptography (Caesar Cypher and Vigenère Cypher) refers to earlier techniques that

relied on manual or basic mechanical operations before widespread computer adoption. Modern

cryptography (Advanced Encryption Standard, Rivest–Shamir–Adleman, Elliptic Curve

Cryptography, Secure Hash Algorithm 256), however, emerged alongside computing, leveraging

complex mathematical principles and advanced computational capabilities. Regardless of their

disparate approaches or eras, both historical and contemporary cryptography are fundamentally

dedicated to the principles and techniques required for secure communication in hostile

environments (Purwanti et al., 2024).

Using mathematical concepts and a key, cryptography transforms information into unreadable

codes, resulting in a "cypher object" that obscures the original data from unauthorized users.

Confidentiality, integrity, and authentication are the primary goals of cryptography (Mohammed

Abdulmaged et al., 2024) and (Al Rubaie & Al-Azawi, 2024).

Cryptography encompasses three primary techniques: Symmetric-key cryptography uses a

single, shared secret key for both encrypting and decrypting data. Asymmetric-key cryptography

employs a pair of linked keys—a public key for encryption and a private key for decryption.

Lastly, hash functions are one-way algorithms that generate a fixed-length "fingerprint" of data,

crucial for ensuring data integrity without using keys for encryption/decryption, and their process

is irreversible (e.g., SHA-2) (Modern Education Society’s College of Engineering, Pune, India.

et al., 2020).

Cryptography is built around two key actions: encryption, where outgoing information is

scrambled into an unreadable format using an algorithm, and decryption, which is the reverse
process of turning that scrambled data back into its original, understandable form (Nair et al., 2019).

Cryptographic methods include the Vigenère cypher and Caesar cypher, both of which operate

by replacing each original character with another character or letter (Hammad et al., 2022).

2.2.1 Classical Cryptography

2.2.2 Vigenère cypher

For a time, people considered the Vigenère cypher highly secure and believed it to be unbreakable, which is why it gained the French nickname "le chiffre indéchiffrable", meaning "the indecipherable cypher". It was not truly impervious to cryptanalysis, but it served as a practical encryption method for a long period. The Vigenère cypher substitutes letters using a different alphabet for different parts of the message. It uses a 26 by 26 matrix of shifted alphabets in which each row is a Caesar cypher, so the technique builds on the Caesar cypher with the full range of shifts from 0 to 25. It was named after Blaise de Vigenère, who lived during the 16th century in France under the rule of Henry III. To encrypt with Vigenère, each letter in the plaintext is shifted according to the letter of the password at the corresponding position, using the Vigenère square or tableau (Raphael George A. Mendoza et al., 2024b).

The Vigenère cypher is a method of encrypting alphabetic text by using a simple form of

polyalphabetic substitution. It employs a keyword to shift the letters of the plaintext by different

amounts.

Table 2.1: The Vigenère cypher encryption and decryption


Here's how it works

a) Using the Vigenère square table, displayed in Table 2.1 above.

i. Encryption process: For each plaintext letter, find its column in the Vigenère square and the row of the corresponding key letter. The letter at the intersection is the cyphertext letter.

ii. Decryption process: To decrypt, the process is reversed using the same keyword. For

each cyphertext letter, the receiver finds its position in the row corresponding to the key

letter and then identifies the plaintext letter from the column heading.

b) Using the algebraic method: this method uses Table 2.1 for characters and their corresponding number values. The letters A–Z are taken to be the numbers 0–25 (A=0, B=1, etc.), and addition is performed modulo 26.

Encryption:

$$E_n(x) = (x + n) \bmod 26 \tag{2.1}$$

Decryption:

$$D_n(x) = (x - n) \bmod 26 \tag{2.2}$$

Where:

$x$ is the numerical value of a letter, typically mapped as $A=0, B=1, \ldots, Z=25$

$n$ is the shift value of the key, an integer between 0 and 25

Encryption:

Plaintext: "HELLO"

Key: Choose a keyword (e.g., "HEY").

Repeat the Keyword: Repeat the keyword so that it matches the length of the plaintext.

Plaintext: "HELLO"

Keyword: "HEYHE"

Table 2.2: Repeated key to match the plaintext.

H E L L O

H E Y H E

Encryption:

$$E_n(x) = (x + n) \bmod 26 \tag{2.3}$$

H (7) + H (7) = O (14)

E (4) + E (4) = I (8)

L (11) + Y (24) = J (9)

L (11) + H (7) = S (18)

O (14) + E (4) = S (18)


Encrypted Text: "OIJSS"

Decryption:

$$D_n(x) = (x - n) \bmod 26 \tag{2.4}$$

O (14) - H (7) = H (7)

I (8) - E (4) = E (4)

J (9) - Y (24) = L (11)

S (18) - H (7) = L (11)

S (18) - E (4) = O (14)

Decrypted Text: "HELLO"

The above algorithm is for the classical Vigenère cypher.
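The worked example can be checked with a short script. This is a minimal sketch of the classical algebraic method only, assuming that both the text and the key contain nothing but the uppercase letters A to Z; the function name `vigenere` is illustrative.

```python
def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Classical Vigenère cypher over the 26 uppercase letters A-Z."""
    sign = -1 if decrypt else 1
    out = []
    for i, ch in enumerate(text):
        x = ord(ch) - ord("A")                   # plaintext or cyphertext letter value
        n = ord(key[i % len(key)]) - ord("A")    # repeated key letter value
        out.append(chr((x + sign * n) % 26 + ord("A")))
    return "".join(out)


cyphertext = vigenere("HELLO", "HEY")
assert cyphertext == "OIJSS"
assert vigenere(cyphertext, "HEY", decrypt=True) == "HELLO"
```

Indexing the key with `i % len(key)` has the same effect as writing out the repeated keyword "HEYHE" in Table 2.2. Python's `%` operator returns a non-negative result for negative operands, which handles the J (9) - Y (24) step without a special case.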

2.2.3 Caesar cypher

The Caesar cypher is named after Julius Caesar, who, according to Suetonius, used it with a shift

of three (A becoming D when encrypting, and D becoming A when decrypting) to protect

messages of military significance. While Caesar's was the first recorded use of this scheme, other

substitution cyphers are known to have been used earlier. The Caesar cypher, a foundational

element of classical cryptography, is a type of substitution cypher. This algorithm stands as one

of the earliest and simplest encryption methodologies. Historically attributed to Julius Caesar, it

operates by systematically replacing each character within a 26-character alphabet with another,

based on a fixed shift (Hammad et al., 2022).

For instance, a left shift of three positions would result in 'D' being replaced by 'A', 'E' by 'B', and

so forth. This encryption method is named after Julius Caesar, who notably employed it for secure private correspondence.


Table 2.3: A left shift of three positions (Julius Caesar).

Plain line:  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

Cypher line: X Y Z A B C D E F G H I J K L M N O P Q R S T U V W

From Table 2.3: when encrypting, a person looks up each letter of the message in the "plain" line and writes down the corresponding letter in the "cypher" line.

Plaintext: THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG

Cyphertext: QEB NRFZH YOLTK CLU GRJMP LSBO QEB IXWV ALD

The encryption step performed by a Caesar cypher is often incorporated as part of more complex

schemes, such as the Vigenère cypher, and still has modern application in the ROT13 system.
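The table lookup above amounts to rotating the alphabet. A minimal sketch follows, assuming uppercase input and using a negative shift to represent a left shift; the function name `caesar` is illustrative.

```python
import string


def caesar(text: str, shift: int) -> str:
    """Shift uppercase letters by `shift` positions; other characters pass through."""
    alphabet = string.ascii_uppercase
    k = shift % 26
    rotated = alphabet[k:] + alphabet[:k]   # shift=-3 gives "XYZABC...W", the cypher line
    return text.translate(str.maketrans(alphabet, rotated))


plaintext = "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG"
cyphertext = caesar(plaintext, -3)
assert cyphertext == "QEB NRFZH YOLTK CLU GRJMP LSBO QEB IXWV ALD"
assert caesar(cyphertext, 3) == plaintext
```

Because only 25 non-trivial shifts exist, an attacker can simply try them all, which is why the Caesar cypher survives mainly as a building block inside other schemes.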

2.2.4 ROT13

This encryption method functions as a substitution cypher in which each letter is shifted forward by thirteen positions, wrapping around from Z to A. In software, the displacement is usually computed from the letters' sequential character codes as defined within the ASCII table (Raphael George A. Mendoza et al., 2024b).

ROT13 is a specific instance of the Caesar cypher, a cryptographic method that originated in

ancient Rome and was employed by Julius Caesar during the 1st century BC. This particular

cypher represents an early milestone in the timeline of cryptography.

[Diagram: the letters A to M are passed through ROT13, followed by the example word HELLO passed through ROT13]

Figure 2.2: ROT13
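Since 13 is half of 26, applying ROT13 twice returns the original text, so the same routine serves for both encryption and decryption. The sketch below is one way to write it by hand; the function name `rot13` is illustrative, and the final check compares it against the `rot_13` codec in Python's standard library.

```python
import codecs


def rot13(text: str) -> str:
    """Rotate ASCII letters by 13 positions; leave every other character unchanged."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + 13) % 26 + ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr((ord(ch) - ord("a") + 13) % 26 + ord("a")))
        else:
            out.append(ch)
    return "".join(out)


assert rot13("HELLO") == "URYYB"
assert rot13(rot13("HELLO")) == "HELLO"          # ROT13 is its own inverse
assert rot13("Hello, World!") == codecs.encode("Hello, World!", "rot_13")
```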

2.2.5 ASCII
The American Standard Code for Information Interchange (ASCII) is a character encoding

standard that defines 128 characters, including the English alphabet, numbers, punctuation, and

control characters, each represented by a 7-bit binary number. Although a single byte can

represent 256 values, ASCII only uses the first 128. To utilize the remaining 128 values (128-

255), various operating systems and applications developed "code pages," which are character

sets that often include special symbols or letters based on language preferences. However,

modern applications increasingly favor Unicode-based encodings. Unicode is a more

comprehensive standard designed to support a vast array of characters from diverse global

languages within a single, unified system, thereby resolving compatibility issues.
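The code-page problem described above can be reproduced directly. This is a small sketch using two codecs that ship with Python, `latin-1` and `cp437`, chosen here only as examples of code pages; the variable names are illustrative.

```python
# ASCII: every character fits in 7 bits.
assert ord("A") == 65
assert format(ord("A"), "07b") == "1000001"
assert all(ord(c) < 128 for c in "Hello, World!")

# The same byte above 127 means different things under different code pages.
raw = bytes([0xE9])
as_latin1 = raw.decode("latin-1")
as_cp437 = raw.decode("cp437")
assert as_latin1 == "é"
assert as_latin1 != as_cp437

# A Unicode-based encoding removes the ambiguity: one character, one agreed byte sequence.
assert "é".encode("utf-8") == b"\xc3\xa9"
```

The second group of assertions shows why text exchanged between systems with different code pages could be silently corrupted, which is the compatibility issue that Unicode-based encodings resolve.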

2.2.6 Unicode

The Unicode standard, maintained by the non-profit Unicode Consortium, is a universal text

encoding system designed to enable digital representation of virtually all the world's writing

systems. Initiated in 1987, it has grown to encompass more than 150,000 characters and symbols,

making it the dominant text encoding on computers today. Each Unicode character is assigned a

unique "code point," and since many of these cannot fit into a single byte, common encodings
like UTF-8 and UTF-16 are used to handle this complexity. UTF-8 is notably compatible with

ASCII, as its first 128 characters directly correspond to ASCII, ensuring backward compatibility.

The standard currently defines 154,998 characters and 168 scripts (as of Version 16.0), unifying

common elements like numerals and punctuation across different writing systems. It also

encodes 3,790 emoji, contributing significantly to their global popularization outside Japan.

Unicode is capable of encoding over 1.1 million characters, spanning a vast range of code points

from U+0000 to U+10FFFF, organized across 17 "planes," each containing 65,536 code points.

These planes include the Basic Multilingual Plane (BMP) and various supplementary planes for

less common characters, ideographs, special purposes, and private use areas.
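The figures above can be verified with simple arithmetic, and the relationship between code points, planes, and UTF-8 byte lengths can be inspected from Python. The helper name `plane_of` and the sample characters are illustrative choices for this sketch.

```python
PLANES = 17
CODE_POINTS_PER_PLANE = 65_536
MAX_CODE_POINT = 0x10FFFF

# 17 planes of 65,536 code points give the "over 1.1 million" figure.
total = PLANES * CODE_POINTS_PER_PLANE
assert total == MAX_CODE_POINT + 1 == 1_114_112


def plane_of(ch: str) -> int:
    """Return the plane number (0 = Basic Multilingual Plane) of a single character."""
    return ord(ch) >> 16


for ch in ["A", "é", "€", "😀"]:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X}  plane {plane_of(ch)}  {len(encoded)} UTF-8 byte(s)")

assert "A".encode("utf-8") == b"A"       # ASCII characters keep their single-byte form
assert plane_of("😀") == 1               # emoji live in a supplementary plane
```

Any cypher that claims to cover the full Unicode range has to work on code points rather than bytes, since one character may occupy between one and four bytes in UTF-8.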

2.3 Modern Cryptography


2.3.1 Encryption

Encryption is the process of converting readable plaintext into unreadable cyphertext, effectively

concealing sensitive information from unauthorized access. Organizations routinely employ

encryption in data security to safeguard confidential data against breaches and illicit viewing.

Conversely, decryption is the process of transforming that encrypted message back into its

original, understandable form (Mukrozi et al., 2024).

2.3.1.1 Types of encryptions


i. Symmetric Encryption: Uses the same key for encryption and decryption. In this technique, a single communication key is shared between sender and receiver for the entire duration of communication. This key can be a combination of letters, numbers, etc. Such techniques are fast and reliable against third-party users, but the key must somehow be exchanged between the users of the system. Examples of such encryption techniques are the Data Encryption Standard (DES), Triple DES (3DES), and the Advanced Encryption Standard (AES). AES is the latest and most advanced of these; it encrypts 128-bit blocks of data with a communication key of 128, 192, or 256 bits. AES performs encryption and decryption in rounds consisting of byte substitution, row shifting, column mixing, and round-key addition (Modern Education Society’s College of Engineering, Pune, India. et al., 2020).

ii. Asymmetric Encryption: The difference between symmetric and asymmetric encryption techniques is the use of keys. Asymmetric encryption techniques make use of a pair of two different keys instead of one. A public key is used for encryption and a private key is used for decryption. This technique provides additional security because, even if the public key is known to the attacker, the information cannot be decoded, as the private key is known only to the receiver. Examples of such techniques are Rivest–Shamir–Adleman (RSA), the Digital Signature Algorithm (DSA), and Elliptic Curve Cryptography (ECC). ECC is regarded as the strongest of these for a given key size, since its security rests on the hardness of the elliptic-curve discrete logarithm problem (Modern Education Society’s College of Engineering, Pune, India. et al., 2020).

iii. Hashing: Hash functions are non-reversible algorithms that transform input data into a

fixed-length output, known as a hash value. Crucially, these algorithms operate without a

key, and the design ensures that the original plaintext cannot be recovered from its hash.

A common application of hash functions by operating systems is for securely storing passwords (Modern Education Society’s College of Engineering, Pune, India. et al., 2020). While hashing is a crucial cryptographic tool for security, its inability to be reversed means it cannot be used to conceal information that must later be recovered. For confidentiality, you must use encryption algorithms that allow for decryption.
iv. Hybrid Encryption: Hybrid encryption integrates various encryption types to leverage

their individual strengths (Zhang, 2021).

Each type of encryption has its own use cases, advantages, and trade-offs, making it important to

select the appropriate method based on specific security requirements and performance

considerations.
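The hashing properties listed above (fixed-length output, no key, no way back to the input) can be observed with Python's standard `hashlib` module. This is a minimal sketch; the wrapper name `sha256_hex` is illustrative, and real password storage would normally add a salt and a deliberately slow hash, which this example does not attempt.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 10_000)

# The output length is fixed (256 bits = 64 hex characters) whatever the input size.
assert len(short_digest) == len(long_digest) == 64
# The function is deterministic and needs no key.
assert sha256_hex(b"a") == short_digest
# A one-character change in the input produces a different digest.
assert sha256_hex(b"password") != sha256_hex(b"Password")
```

There is no corresponding "unhash" call, which is why hashing supports integrity checks but cannot replace encryption when the original data must be recovered.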

2.3.2 What is an encryption algorithm

Essentially, an encryption algorithm takes an original, readable message (known as plaintext)

and, using a specific key, transforms it into an unreadable or incomprehensible format (known as

cyphertext). The purpose of this transformation is to secure the sensitive information, making it

unintelligible to anyone who does not possess the correct decryption key and algorithm.

Encryption has accordingly been described as the process of coding a message so that it is

incomprehensible or uninterpretable: the data is scrambled in such a way that it loses

its original meaning and cannot be interpreted without the proper means to reverse the

process (Zahraa Ch. Oleiwi et al., 2020).
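
The plaintext, key, and cyphertext relationship can be sketched with a deliberately simple toy cypher. The example below assumes a repeating-key XOR, which is not secure and is not one of the algorithms discussed in this chapter; it only shows that the same key reverses the transformation while a wrong key does not. The function name and key values are hypothetical.

```python
from itertools import cycle

def xor_transform(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the repeating key (toy cypher, not secure)."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

plaintext = b"meet me at noon"
key = b"secret"                                 # hypothetical key, for illustration only

cyphertext = xor_transform(plaintext, key)      # unreadable without the key
recovered = xor_transform(cyphertext, key)      # XOR with the same key reverses it
wrong = xor_transform(cyphertext, b"guess!")    # a wrong key does not recover the message

print(cyphertext.hex())
print(recovered)
print(wrong == plaintext)
```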

2.3.3 Common encryption algorithms

Among the many encryption algorithms available today, we will highlight some of the most

frequently encountered.

i. The Advanced Encryption Standard (AES): AES is a commonly used block encryption

algorithm built on the Rijndael algorithm. Because it uses symmetric keys,

the same key is utilized for both encryption and decryption. Rijndael supports a variety of

block sizes and key lengths, whereas AES operates on a fixed 128-bit data block (G.mohammed, 2023).
Although extremely efficient in the 128-bit key form, AES also supports 192- and 256-bit keys

for very demanding encryption purposes.

ii. Triple DES: 3DES (Triple Data Encryption Algorithm) is essentially the DES (Data

Encryption Standard) algorithm applied three times to each data block. This triple

application was developed to counter the vulnerability of the original DES to brute-force

attacks, which became feasible with advancements in computing power. By effectively

increasing the key length, 3DES significantly enhances security over DES. While 3DES

is more secure, it's also slower and more resource-intensive than DES. Therefore, it's best

suited for encrypting small amounts of highly important or valuable data (Zhang, 2021).

iii. RSA: RSA is an asymmetric public-key encryption algorithm and a long-standing standard for

protecting information transmitted via the internet. RSA encryption is considered robust and reliable

because breaking it forces would-be attackers to expend an impractical amount of time and

computation. It is commonly used to protect

data where confidentiality is critical. The security of the

algorithm is based on the difficulty of factoring large composite numbers and of

computing e-th roots modulo a composite number for a specified odd integer e.

The RSA public key consists of an integer pair (the modulus and the public exponent). RSA's

variable key size can be increased to improve security (Setyaningsih et al., 2020).

iv. Blowfish: Blowfish is a symmetric block cypher designed for strong encryption. At its

core, it is a 16-round Feistel network that relies heavily on key-dependent S-boxes

for its security. The strength of Blowfish comes from its sub-key generation and its

fundamental design principles of confusion and diffusion. The algorithm utilizes a P-array of
eighteen 32-bit sub-keys and four S-boxes (Substitution Boxes), each holding 256 entries of

32 bits. As a Feistel cypher, Blowfish transforms data through a series of

permutations, diffusion, and confusion operations (Jasim, 2020).

v. Twofish: Twofish is a strong and flexible symmetric key block cypher that was a finalist

for the Advanced Encryption Standard (AES). It processes data in 128-bit blocks and

supports key sizes of 128, 192, and 256 bits. Built on a 16-round Feistel network,

Twofish's security stems from its key-dependent S-boxes, a complex key schedule, and

robust diffusion mechanisms like the Pseudo-Hadamard Transform (PHT) (Jasim, 2020).

vi. Rivest-Shamir-Adleman (RSA): RSA is an asymmetric key encryption algorithm named

after its three inventors: Ron Rivest, Adi Shamir, and Leonard Adleman, who published it

in 1977. It's fundamental to information security, enabling secure communication, digital

signatures, and key exchange. RSA's strength lies in the computational difficulty of

factoring the product of two very large prime numbers. This allows for effective

encryption and decryption, even over channels that may otherwise be insecure. As a

public-key cryptosystem based on number theory, RSA uses a unique pair of keys—a

public key for encryption and a private key for decryption. A message encrypted with the

recipient's public key can only be decrypted by their corresponding private key

(Department of Information Technology, Faculty of, Computer Sciences, Lahore

Garrison University, Lahore, Pakistan et al., 2024).
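
The public/private key relationship described for RSA can be made concrete with a small worked example. This is a sketch of textbook RSA only: the primes, exponent, and message below are assumed values chosen to keep the arithmetic readable, and real deployments use very large primes together with a padding scheme.

```python
# Textbook RSA with tiny primes: illustrative only, never use such sizes in practice.
p, q = 61, 53                 # two (very small) primes, assumed for the example
n = p * q                     # public modulus: 3233
phi = (p - 1) * (q - 1)       # Euler's totient of n: 3120
e = 17                        # public exponent, coprime with phi
d = pow(e, -1, phi)           # private exponent: modular inverse of e (Python 3.8+)

message = 65                  # a message encoded as an integer smaller than n
cyphertext = pow(message, e, n)      # encrypt with the public key (n, e)
recovered = pow(cyphertext, d, n)    # decrypt with the private key (n, d)

print(n, phi, d)
print(cyphertext, recovered)
```

Recovering `d` from the public pair `(n, e)` requires knowing `phi`, which in turn requires the factors of `n`. With primes this small the factorisation is trivial; with primes of realistic size it is the computationally hard step on which the scheme depends.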

2.3.4 AES (Advanced Encryption Standard) modes

AES modes are the different ways in which the AES algorithm can be used to encrypt and decrypt

data. Here are some common AES modes:


i. Electronic Codebook (ECB) Mode: ECB (Electronic Codebook) mode is a basic block

cypher mode that encrypts each 128-bit plaintext block independently with a fixed key.

While simple to implement and allowing parallel processing, it's highly insecure because

identical plaintext blocks produce identical cyphertext blocks. This reveals patterns in the

original data (e.g., in images), compromising confidentiality. Thus, ECB is rarely used

for general encryption due to pattern leakage and is only suitable for very small, random

data like single cryptographic keys, where such patterns don't exist.

ii. Cypher Block Chaining (CBC) Mode: CBC (Cypher Block Chaining) mode is a common

block cypher mode where each plaintext block's encryption depends on the previous

cyphertext block, starting with an Initialization Vector (IV). This chaining makes it more

secure than ECB by hiding patterns. While encryption is sequential, decryption can be

parallelized after the first block. It requires a unique (but not secret) IV for each

encryption. A drawback is error propagation, where one corrupted cyphertext block

affects subsequent decryption. Also, padding is needed for non-aligned data, which can

lead to vulnerabilities like padding oracle attacks if not handled correctly. Despite these

issues, CBC is widely used for data-at-rest encryption in many older systems, such as

file, disk, and database encryption.

iii. Cypher Feedback (CFB) Mode: Cypher Feedback (CFB) mode functions similarly to a

stream cypher. It encrypts the previous cyphertext block (or an Initialization Vector for

the first block), then XORs this result with the current plaintext segment to generate the

current cyphertext. CFB can process data in units smaller than the block size (e.g.,

individual bytes) and is self-synchronizing, meaning transmission errors eventually


resolve themselves. However, it cannot be parallelized for encryption, and errors in one

block will propagate. CFB is less commonly used than CTR or GCM modes.

iv. Output Feedback (OFB) Mode: Output Feedback (OFB) mode shares operational

similarities with Counter (CTR) mode, functioning by generating a keystream. This

keystream is produced through the iterative encryption of an Initialization Vector (IV) or

the output of the preceding OFB round, which is subsequently XORed with the plaintext.

A distinguishing feature of OFB is the complete independence of its keystream

generation from the plaintext data. As a direct consequence of its stream cypher

paradigm, padding is not required. Moreover, a notable advantage is its resilience to error

propagation, ensuring that corruption in one cyphertext block does not affect the

decryption of others.

v. Counter (CTR) Mode: Counter (CTR) mode transforms a block cypher into a stream

cypher by encrypting a unique, incrementing counter value (concatenated with a nonce or

Initialization Vector) for each block. This process generates a "keystream" that is then

XORed with the plaintext to produce the cyphertext. CTR mode offers significant

advantages, being highly parallelizable for both encryption and decryption, thus enabling

very high performance. As a stream cypher, it obviates the need for padding and

facilitates random access to encrypted data blocks. Furthermore, errors in one cyphertext

block are localized, affecting only that specific block. Operationally, a unique nonce/IV

is critically required for each encryption under the same key; reusing this value

constitutes a catastrophic security failure, as it directly compromises the confidentiality of

multiple plaintexts. Consequently, CTR mode is an excellent choice for high-


performance applications, streaming data, real-time encryption, and scenarios demanding

random access to encrypted data, such as in database or certain file encryption schemes.

vi. Galois/Counter Mode (GCM): Galois/Counter Mode (GCM) is a widely used symmetric

block cypher mode, often with AES, that provides Authenticated Encryption with

Associated Data (AEAD). This means it simultaneously ensures confidentiality

(encryption) and authenticity/integrity (detects tampering) of data, including optional

unencrypted associated data. GCM achieves this by combining Counter (CTR) mode for

efficient, parallelizable encryption with GHASH (Galois Field multiplication) for

generating a robust authentication tag. Its key benefits are high performance and strong

security, provided a unique Initialization Vector (IV) is used for each encryption. GCM is

widely used in modern secure protocols such as TLS.

vii. XEX-based Tweaked-Codebook Mode with Cyphertext Stealing (XTS): XTS is a

specialized symmetric block cypher mode of operation primarily designed for disk and

storage device encryption. It aims to provide strong confidentiality for data at rest,

addressing the unique challenges of encrypting large volumes of data organized into

sectors or blocks. It uses two keys and incorporates a "tweak" value (derived from the

sector number) to ensure that identical plaintext blocks at different locations on the disk

produce different cyphertexts, without the need for chaining between sectors. This allows

random access to sectors and resists certain attacks specific to disk encryption (e.g.,

cut-and-paste attacks on blocks). However, XTS does not provide authentication, so it can

be vulnerable if attackers are able to manipulate the stored data. It is widely used in

full-disk encryption software like BitLocker, VeraCrypt, and LUKS.

2.4 Benefits of Data Encryption

Encryption can provide various data protection benefits both on premises and in the cloud. Some

of the most significant benefits include:

i. Data security: Encryption is among the most critical and widespread data security tools.

By encoding plain text as cyphertext, encryption helps organizations protect data against

a range of cyberattacks, including ransomware and other malware. Notably, the use of

info stealer malware that exfiltrates sensitive data is up 266 percent from 2022, according

to the 2024 IBM X-Force Threat Intelligence Index. Encryption helps combat this threat

by making data unusable to hackers, defeating the purpose of stealing it. Recent advances

in AI-powered encryption systems have also revolutionized data security practices. These

solutions use AI to dynamically adjust encryption parameters based on contextual factors

such as network traffic, device type, and user behavior. This adaptive approach allows

organizations to optimize encryption algorithms in real-time and tailor their data

protection strategies to evolving security threats.

ii. Cloud security: While cloud service providers (CSPs) are responsible for the security of

the cloud, customers are responsible for security in the cloud, including the security of

any data. Enterprise-wide data encryption can help organizations protect their sensitive

data on-premises and in the cloud.

2.5 Review of Related Works


Panigrahi & Padhy (2025) hid data within images using the LSB (Least Significant Bit) technique

so that it can be detected only by the specified user. They developed a user-friendly GUI so

that the system can be used with ease. The secret text is converted to cyphertext to make it

more secure. The sender selects the cover image, which is used to generate the secured

stego image; this is visually identical to the cover image. The recipient downloads the stego

image and retrieves the secret text concealed within it using the same application.

They used grayscale and RGB cover images and estimated three performance metrics: SNR,

MSE, and PSNR. PSNR gave good results, reaching 71.4733, and the RGB image with the

hidden text achieved up to 77.6697.
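
The LSB technique referred to here can be sketched in a few lines. The example is a minimal illustration under stated assumptions, not the implementation of Panigrahi & Padhy: the cover is a flat list of 8-bit sample values standing in for image pixels, the message is embedded unencrypted, and the receiver is assumed to know the message length. The function names are hypothetical.

```python
def embed_lsb(pixels: list[int], message: bytes) -> list[int]:
    """Hide message bits in the least significant bit of successive pixel values."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover is too small for the message")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit   # clear the LSB, then set it to the message bit
    return stego

def extract_lsb(pixels: list[int], n_bytes: int) -> bytes:
    """Read n_bytes back from the LSBs (length assumed known to the receiver)."""
    out = bytearray()
    for i in range(n_bytes):
        byte = 0
        for value in pixels[i * 8:(i + 1) * 8]:
            byte = (byte << 1) | (value & 1)
        out.append(byte)
    return bytes(out)

cover = [(37 * i) % 256 for i in range(64)]   # synthetic samples, stand-in for an image
secret = b"hi!"

stego = embed_lsb(cover, secret)
print(extract_lsb(stego, len(secret)))
print(max(abs(a - b) for a, b in zip(cover, stego)))   # at most 1 per sample
```

Each sample changes by at most one intensity level, which is why the stego image is visually indistinguishable from the cover. It also explains why the payload is normally encrypted first: anyone who knows the scheme can read the LSBs directly.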

Hassan (2024) addressed the weakness of the Vigenère Cypher by proposing a modification that

uses an improved Vigenère square table which introduces alphanumeric cyphertext, expanding

the character set from 26 alphabetic characters to 31 characters, including letters and digits from

1 to 5. This increases the complexity and enhances the security of the cypher, making it more

resistant to frequency analysis and brute-force attacks. The modification of the Vigenère square

table plays a crucial role in this enhanced security, providing an effective means to protect

sensitive information. However, even with this modification, the Vigenère cypher remains

potentially vulnerable to frequency analysis. This project will extend the modulus of 31 proposed by

Hassan to the full range of Unicode.
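
The effect of enlarging the character set can be sketched with a Vigenère cypher whose modulus is simply the length of its alphabet. The 31-symbol alphabet below (26 letters followed by the digits 1 to 5) is an assumption made for illustration; the actual layout of Hassan's modified square table is not reproduced here, and the function name is hypothetical.

```python
import string

# Assumed 31-symbol alphabet: 26 uppercase letters plus the digits 1 to 5.
ALPHABET = string.ascii_uppercase + "12345"

def vigenere(text: str, key: str, alphabet: str = ALPHABET, decrypt: bool = False) -> str:
    """Shift each symbol by the matching key symbol, modulo the alphabet size."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    n = len(alphabet)
    sign = -1 if decrypt else 1
    out = []
    for pos, ch in enumerate(text):
        shift = index[key[pos % len(key)]]
        out.append(alphabet[(index[ch] + sign * shift) % n])
    return "".join(out)

cyphertext = vigenere("ATTACK2", "KEY")
recovered = vigenere(cyphertext, "KEY", decrypt=True)
print(cyphertext, recovered)
```

Passing a longer `alphabet` string raises the modulus without changing the algorithm, which is the sense in which the scheme can be extended toward a much larger character repertoire.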

Bhatt et al. (2024) used Python to implement the encryption and decryption of text into a stego-

image through the Django framework. They successfully implemented Django for web-based

image steganography. However, the built-in image processing of Django is limited and uses a
basic substitution method. The user needs to run the application, which has two tabs: encrypt and

decrypt. This project will add a further security layer to the proposed encryption and decryption

by encrypting the text with the Advanced Encryption Standard in Galois/Counter Mode.

Rehana Saheb et al. (2024) used a methodology involving several project modules, including an

Image Preprocessing Module, Encryption Module, Steganography Module, Decryption Module,

Image Post-processing Module, and Evaluation and Analysis Module. Leveraging AES

encryption and steganographic methods like LSB embedding, spread spectrum techniques, or

adaptive steganography, their system aims to ensure the confidentiality, integrity, and

authenticity of image data. However, the work lacks results for the evaluation metrics. This project will be

evaluated for securing confidential messages.

Demircan & Ozekes (2024) proposed algorithms built using a pipeline of two separate parts: a

segmentation model algorithm written in Python and embedding and extracting algorithms

written in JavaScript as separate web applications. They used three major testing categories:

Distortion methods (Mean Squared Error (MSE), Peak Signal-to-Noise Ratio (PSNR), and

Structural Similarity Index (SSIM)), Statistical steganalysis methods (RS Analysis, Sample Pairs

Analysis, Primary Sets Analysis (Chi-square Attack and Fusion Technique)), and a visual attack

method. The visual attack results of their proposed method achieved substantial improvement

over the LSB method without modifications and showed overall good resistance to this attack.

Their proposed method also performed better in the statistical analysis.

Ahmad et al. (2024) used a diverse dataset of high-resolution images containing various types of

sensitive information, such as medical records, financial statements, and personal identification

documents. The dataset consisted of 500 images, with each image containing a different type of
hidden sensitive data. The dataset underwent a two-step process. First, their extraction method,

using a deep CNN, enabled the selection of appropriate regions for data embedding, ensuring

minimal visual distortions. The proposed hybrid CNN was implemented using Python and

popular deep-learning libraries such as TensorFlow and Keras for CNN operations. Next, the

selected regions were subjected to the DCT-based steganography technique, where secret data

was seamlessly embedded into the image, rendering it visually indistinguishable from the

original. (DCT embedding and extraction were performed using mathematical libraries

supporting DCT transformation.) The CNN-DCT steganography approach significantly advances

image steganography over cloud storage. By continuously refining and extending the proposed

method, Ahmad et al. aim to address real-world challenges and pave the way for secure data communication

and sharing in cloud environments. However, the method is vulnerable to timing, power analysis,

or cache attacks.

Li et al. (2024) proposed a method that allows both communicating parties to independently

create multiple coverless image datasets (CIDs) using random seeds. A mapping rule was

designed for creating a one-to-one correspondence between hash sequences and images in CIDs.

Finally, the secret information was carried by the images whose hash sequences were equal to

the secret segments. At the receiver side, the robust Speeded-Up Robust Features (SURF) of

images were utilized to retrieve the secret information. The results demonstrated that the

proposed method was capable of resisting most malicious image attacks and exhibited

stronger robustness compared to other advanced techniques. However, the method has a limited

capacity due to the coverless images used in SURF.


Salim et al. (2024) proposed leveraging Lorenz’s chaotic system and the Bloom filter, aiming to

enhance information security by preventing data repetition and loss in the same pixel (Bloom

filter). This method effectively preserves image quality and displays a high similarity between

the original and hidden images. However, Salim et al. focused on data repetition and loss in the same pixel

but did not address the security of the message, and the method also experienced slightly longer embedding

times.

Zhao et al. (2024) used a linked list encryption scheme for image steganography without

embedding. This approach reconstructs the image into a multi-linked list structure consisting of

numerous nodes, where each pixel is transformed into a single node with data and pointer

domains. By employing a special addressing algorithm, the optimal linked list corresponding to

the secret information can be identified. The receiver can restore the secret message from the

received image using only the list header position information. The scheme is based on the

concept of coverless steganography: it requires no modifications to the carrier

image and can therefore completely resist steganalysis, offering very high security and

concealment. However, the method requires

more memory for pointers.

Huo et al. (2024) delineated the concealed secret information’s location and presented an image

permutation algorithm grounded in chaotic mapping. Subsequently, they provided a detailed

analysis of the image steganography network, Chaotic mapping-enhanced image Steganography

network (CHASE), treating its encoding and decoding processes as the network’s forward and

backward processes, respectively. They also introduced an adversarial neural network for

adversarial training to enhance image quality. Finally, they expounded on the network’s training
strategy and elucidated the rationale behind constructing the loss function. The capacity and

image quality outperformed other state-of-the-art methods in various numerical comparisons.

However, hiding information in images using several different logistic-map-based approaches introduces

potential complexity issues.

Al Rubaie & Al-Azawi (2024) used cryptography and steganography based on three different

chaotic maps to ensure information security. In the cryptography part, two chaotic maps were

used for encryption, and one chaotic map was used in the steganography section. The secret

information was concealed in the least significant bits (LSBs) of the double-precision image’s

pixels. The double-precision image is a high-quality image that can be represented in 64 bits

per pixel for grayscale images. The experiments demonstrated a high embedding capacity of

60.938% for the lossless technique and 400% for the lossy method, along with a

PSNR of up to 69.964 dB. The method was reported to be robust to aggressive attacks such as noise, cropping, and

compression, which the authors attribute to both approaches embedding the secret data in the spatial domain.

Selvamani & Yusoff (2024) used spatial domain techniques in digital image steganography to

upload a pre-prepared cover image in PNG, JPG, or BMP format. Subsequently, the stego-image

was generated, and the durability, capacity, and imperceptibility were evaluated by computing

and closely examining the PSNR and MSE measurements. The cover material consisted of four

standard pictures, specifically Lena, Tiffany, Baboon, and Peppers, each with pixel dimensions

of 1024 x 1024. These images were utilized as cover images in both grayscale and color

throughout the testing phase. The dataset was sourced from the USC SIPI Volume 3:

Miscellaneous website (https://sipi.usc.edu/database/database.php?volume=misc). These

experiments concentrated on imperceptibility, embedding capacity, robustness, and security.


MATLAB was employed to execute the LSB and PVD algorithms, while Optimum Pixel

Adjustment Procedure (OPAP) was implemented in Google Colaboratory. OPAP in the spatial

domain stands out as a promising foundation for an advanced and effective image

steganographic algorithm, performing better than Least Significant Bit (LSB) and Pixel Value

Differencing (PVD). However, raw, vector, and other formats were not included in

the system when determining the most effective spatial domain technique.
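
Since MSE and PSNR recur throughout the works reviewed here as measures of imperceptibility, a short sketch of how they are computed may help. It assumes 8-bit samples (peak value 255) held in flat lists; the sample values are made up for illustration and are not taken from any of the cited experiments.

```python
import math

def mse(original: list[int], modified: list[int]) -> float:
    """Mean squared error between two equally sized sample sequences."""
    if len(original) != len(modified) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(original, modified)) / len(original)

def psnr(original: list[int], modified: list[int], max_value: int = 255) -> float:
    """Peak signal-to-noise ratio in dB; infinite when the inputs are identical."""
    error = mse(original, modified)
    if error == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / error)

cover = [100, 150, 200, 250]      # made-up cover samples
stego = [101, 150, 200, 251]      # two samples changed by one level

print(mse(cover, stego))          # 0.5
print(round(psnr(cover, stego), 2))
```

A lower MSE and a higher PSNR both indicate that the stego image is closer to the cover, so PSNR figures from different studies can only be compared meaningfully when payload size and image dimensions are also taken into account.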

Mohammed Abdulmaged et al. (2024) proposed a new steganography technique that integrates a genetic

algorithm with data hiding. The genetic algorithm was used to select the pixels in the

image where secret data was to be hidden, so that detection of the secret data becomes more

complex. The RSA algorithm was used to generate public and private keys. An evaluation against

several other techniques showed that the suggested approach improves on existing genetic algorithm

and steganography techniques. However, the project has compatibility issues with various

file formats and has high computational overhead.

Alenizi et al. (2024) reviewed image steganography based on multiple hashing algorithms. They

proposed a multi-layered security system that combines cryptography and steganography to

make data more impenetrable to outsiders. They encrypted the text that would eventually be

hidden in the cover photo, making it nearly impossible for anyone to read without access to the

private key. Next, they used a hash function to determine where the message should go, and

finally, they used a technique that embeds eight bits of secret data into the LSB of each pixel’s

value in the color channels of RGB. They used the RSA algorithm to safeguard the secret

message, making it difficult to decrypt the data without the proper key. However, the method

was limited in embedding capacity due to hashing overhead.


Avantika Bisht et al. (2024) reviewed various techniques on image steganography and discussed

why image steganography is a better approach. Numerous studies had been conducted in the

realm of image steganography, and their work delved into some of the research pertinent to the

field. The work demonstrated adaptability, allowing extension to diverse image formats.

Furthermore, they elaborated on potential extensions involving alternative transform methods,

which were also feasible. By addressing these aspects, the study contributed to the ongoing

evolution of steganography, emphasizing the need for advanced hiding techniques to overcome

challenges in steganalysis and maintain the effectiveness of covert information transmission.

However, the review was limited to image steganography.

Mukrozi et al. (2024) implemented a combination of a stream cypher algorithm and Rotate 13 (ROT13)

for data security. The encryption process begins with the stream cypher method, then the

encryption results are encrypted again using ROT13 so that the output becomes safer

because it cannot be opened directly through other applications. The scheme is relatively fast in the

encryption-decryption process and is not limited by the length of the plaintext. However,

ROT13 is a simple substitution cypher (easy to detect and break), which provides minimal security

against attack.
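
The weakness of ROT13 noted above is easy to demonstrate. The sketch below uses the `rot_13` text codec from Python's standard library; the wrapper function name is hypothetical. It shows only the ROT13 layer, not the stream cypher stage of the scheme by Mukrozi et al.

```python
import codecs

def rot13(text: str) -> str:
    """Rotate each ASCII letter by 13 places; other characters are unchanged."""
    return codecs.encode(text, "rot_13")

encoded = rot13("Hello")
print(encoded)            # Uryyb
print(rot13(encoded))     # Hello: applying ROT13 twice restores the input
```

Because the transformation has no key and is its own inverse, anyone who suspects ROT13 can undo it in one step, so whatever security the combined scheme has must come from the stream cypher layer.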

Fernando et al. (2023) used image metadata to tackle robustness, developing a method called

steganography on image metadata (SIM). The SIM method was combined with the Fernet cypher

algorithm, which was used to create a stego-key and perform encryption before embedding the

message into the image. The results showed that the SIM method can withstand symmetric and

asymmetric cropping, rotation of the stego-image by up to 180 degrees, and

compression of the image by up to 90%. However, this method was only suitable for
embedding relatively small messages; also, when the image was manipulated using specific tools

like Paint 3D or Windows Photo Editor, the message could not be retrieved.

G.mohammed (2023) described three categories of audio steganography: statistical, random

generation, and linguistics. Each class's tactics were examined, with a focus on how a special

method was offered for concealing sensitive information. In addition, the study surveyed existing

work on strategies and algorithms relevant to audio steganography, covering

work published from 2016 to 2023, although it was not exhaustive. The study gathered the

present approaches, difficulties, and future directions of the topic with the goal of assisting other

researchers. It was also able to evaluate LSB audio steganography systems

based on Rijndael’s algorithm, which offer a high level of security for concealing hidden

messages within an audio file. However, the method is limited in capacity (block size) and

embedding (128 bits). Stego-image detectability increases with embedding capacity, and the method may

not withstand image processing attacks and may not support all image formats.

Tong et al. (2023) used a chaotic coyote optimization algorithm for image encryption and

steganography. The proposed method was based on the Chaotic Coyote Optimization Algorithm

(CCOA), combined with the basics of a chaotic system to enhance the security and

efficiency of encrypting and hiding images. Image attack tests conducted in the

simulation experiment demonstrated that both methods were able to defend against attacks.

However, the work lacked a comparative analysis with other

methods to confirm the superiority of the model.


Hammad et al. (2022) used a combination of the Vigenère cypher and the Caesar cypher

methods. The output of the two methods was then converted to the names of chemical elements in the

periodic table. Each chemical element is represented by its atomic number, which is then converted into an

ASCII code to get its character value. Once it is understood that the message consists of symbols of

chemical elements in the periodic table, it can be easily cracked and extracted. Because the

chemical element set is limited in value, introducing the full range of Unicode increases the

possible values, thereby giving more security to the text.

Suresh & Shatheesh Sam (2022) proposed optimized interesting region identification for video

steganography using fractional grey wolf optimization along with a multi-objective cost function.

The key frames were extracted using the Structural Similarity Index measurement (SSIM). Then,

the key frames were subjected to the formation of regions with the help of grid lines.

Simultaneously, the secret information was pre-processed with encryption to strengthen the

security of the proposed method. Finally, the Lifting Wavelet Transform (LWT) was applied to

obtain the wavelet coefficient frame in which the secret information was hidden. Peak Signal to

Noise Ratio (PSNR), Embedding Capacity, and Normalized Correlation (NC) were used as the

system metrics. The proposed technique attained a maximum PSNR of 75.141 dB and a

maximum embedding capacity of 70.8%, which the authors report makes it highly robust,

imperceptible, and secure from attacks. However, the proposed technique has the drawback of

requiring a significant amount of processing power and sufficient time to finish the optimization

procedure.

Raphael George A. Mendoza et al. (2024a) proposed image steganography using deep learning-

based edge detection (Multimedia Tools and Applications). The proposed technique consists of
two stages: first, the image was preprocessed to extract its edges utilizing DCNs, and second, the

secret message was embedded into the edges of the image using a binary code. The proposed

method achieves a higher payload. However, the proposed technique is ineffective if the input

image contains low contrast edges.

Amitava Podder et al. (2022) reviewed steganography techniques by discussing different

techniques for implementing steganography to multimedia files (text, still images, audio, and

video). Other types of steganography were not reviewed.

Sead et al. (2023) proposed a robust method for embedding an image inside a cover image based

on least significant bit steganography, which includes two phases. The first phase consists of two

stages: in the first stage, they encrypt a secret image using an encryption algorithm to increase

the security of information. In the second stage, they complete the first stage by embedding the

most significant bits (MSB) of the encrypted image into the least significant bits (LSB) of the

cover image to form the stego image. The second phase was the reverse of the first phase,

where the encrypted image was retrieved from the stego image and then decrypted to reconstruct

the secret image. Extraction algorithms obtained better and faster results for the restoration of the

secret image and preserved the good visual quality of stego images. However, the work was only

focused on images and needs extension to include audio or video, which varies according to the

content and format. It has a limited number of bits available for embedding in the cover stego

histogram.
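
The embedding step described by Sead et al., placing the most significant bits of the secret image into the least significant bits of the cover, can be sketched as follows. This is an illustration under assumptions rather than their implementation: the number of bits `k` is assumed to be 4, the encryption stage is omitted, and images are represented as flat lists of 8-bit values. The function names are hypothetical.

```python
def embed_msb(cover: list[int], secret: list[int], k: int = 4) -> list[int]:
    """Replace the k low bits of each cover sample with the k high bits of the secret."""
    if len(cover) != len(secret):
        raise ValueError("cover and secret must have the same number of samples")
    keep = 0xFF & ~((1 << k) - 1)          # mask that preserves the high bits of the cover
    return [(c & keep) | (s >> (8 - k)) for c, s in zip(cover, secret)]

def extract_msb(stego: list[int], k: int = 4) -> list[int]:
    """Recover an approximation of the secret: its k high bits, low bits set to zero."""
    low_mask = (1 << k) - 1
    return [(p & low_mask) << (8 - k) for p in stego]

cover = [200, 17, 96]       # made-up cover samples
secret = [255, 128, 7]      # made-up secret samples

stego = embed_msb(cover, secret)
approx = extract_msb(stego)
print(stego)                # [207, 24, 96]
print(approx)               # [240, 128, 0]
```

The recovered values differ from the original secret by less than 2^(8-k), which illustrates the trade-off: a larger `k` preserves more of the secret image but disturbs the cover more.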

Anitha Malaghan et al. (2022) combined steganography and image cryptography systems for

securing data transfer. They converted the plaintext to cyphertext using a symmetric

cryptography algorithm, then used steganography to hide the cyphertext inside the image, and
finally, they applied image cryptography to the stego image to obtain a cypher image together with the encryption keys

used to produce it. If even a small mistake occurs in the key values during decryption of the image,

the plaintext cannot be obtained, even when steganographic extraction is performed on the decrypted image.

However, the Rubik’s cube-based steganography used is vulnerable to attacks: insecure key (weak

random number) generation compromises security, embedding capacity is limited by the

cube size, and the method is limited to small image sizes.

Subramanian et al. (2021) reviewed recent advances in image steganography by grouping image

steganography into three categories: traditional image steganography methods, CNN-based

image steganography methods, and GAN-based image steganography methods. They compiled

the current trends, challenges, and some future directions in this field. However, their review was

limited to image steganography.

Nath et al. (2021) used the LSB substitution algorithm for data hiding and retrieval. They

developed a program in Java to embed the data in the LSBs of the R, G, and B components of

every pixel of the cover file. The bytes of the secret message file were encrypted before

embedding into the cover image file so that the task becomes difficult for the intruder to decrypt

the actual hidden message. The method is susceptible to brute-force attacks due to the lack of a

robust cryptography process.

Darwis et al., 2021b enhanced the Center Sequential Technique (CST) method by allowing RGB

for both cover and secret images (still using an image as the secret image) and refined the

method for the embedding process, calling the method Center Embedded Pixel Positioning

(CEPP). The process steps of CEPP are embedding and extraction methods. The project was

successful in setting the secret image in the center of the cover image. However, the message
cannot be extracted if the stego-image is cropped at a maximum of 25%; the extraction process

will only fail if the lower side is cropped by more than 70%.

Odii et al., 2021 proposed a hybridization of cryptography and steganography for information

security. The system was implemented using Microsoft Visual Studio 2012 as the Integrated

Development Environment (IDE) and was programmed with C# (csharp), a .NET language.

With this model, the security of transmitted data was assured since the information transmitted

was invisible and hence remains hidden to the physical eyes even to a suspecting hacker. A 3-tier

information security system was considered using cryptography, steganography, and barcode

technology, whereby the message would be encrypted, then turned into a barcode, and finally

embedded into the cover medium, which could be either an image or video. However, the cover

image does not support PNG format.

Senior Professor, Computer Science and Engineering department in Galgotias University,

Gautam Budhh Nagar, U.P et al., 2020 used LSB to hide text in images of any kind, beginning

data writing from the 8th or LSB layer (the last layer). Their method could take any kind of

image file without converting it into bitmap and also used maximum memory space to hide files

in pictures. However, there was no review on text (linguistic) steganography, and the work

lacked evaluation metrics, dealing only with steganography using LSB techniques, which lacks

security, authentication, and confidentiality of the message.

Duan et al., 2020 proposed a new high-capacity image steganography method combined with

image elliptic curve cryptography and a deep neural network. The Discrete Cosine Transform

(DCT) was used to transform the secret image, and then the transformed image was encrypted by

Elliptic Curve Cryptography (ECC) to improve the anti-detection property of the obtained image.
To improve steganographic capacity, the SegNet Deep Neural Network with a set of Hiding and

Extraction networks enables steganography and extraction of full-size images. The secret image

appeared noisy and was introduced into the Host image through the SegNet deep network model.

The visual quality of the Host image was not seriously affected, and the anti-detection property

was also improved. However, combining ECC and DNNs increases computational complexity,

and training DNNs with ECC-encrypted data can be time-consuming.

2.6 Appraisal of Literature

The existing literature provides a fundamental understanding and great insights into the

challenges and opportunities in cryptography and steganography algorithms. Further research is

needed to address the attacks of intruders on hidden messages to enhance the robustness of

stego-images. The integration of AES and the Vigenère cypher within steganography represents

a valuable approach to secure data transmission. While the application of these techniques

significantly enhances security, ongoing research into optimization, vulnerabilities, and advanced

methodologies is essential for maintaining the integrity of confidential communications.


CHAPTER THREE
3.0 METHODOLOGY

3.1 Introduction

The system built upon the previous method of steganography to enhance the security of secret

messages over the cloud. Secure transmission of information is crucial due to the Internet of

Things. All electronic devices are connected to the web, which calls for the integrity and

authenticity of messages over the cloud. This concealment with the aid of multiple encryption

helps protect sensitive data from interception and detection by adversaries, making it ideal for

covert communication. Steganography, combined with encryption techniques such as AES in

Galois/Counter Mode (GCM) and a modernised, Unicode-compatible Vigenère cypher,

ensures that even if the secret data is discovered, it remains protected. Leveraging the previous
method of combination with steganography, this new method of combination provides an extra

layer of security and privacy, making it a valuable tool for secure communication in sensitive or

hostile environments.

3.2 Methodology

The methodology depicts a step-by-step process of the system from data collection to

cryptographic processes and finally into steganography. It begins with the plaintext, which is

encrypted using symmetric AES encryption. Simultaneously, a

key exchange process generates a key variable. The output of the AES encryption is further

processed using the polyalphabetic substitution of the Vigenère cypher, resulting in encrypted data.

This encrypted data is then combined with an image using an insertion process and an index

structure, resulting in a stego file. The stego-file is a cover image with hidden data. To extract the

hidden data, the stego file is decrypted. The decryption process reverses the steps: First, the

image is extracted, followed by Vigenère cypher decryption and AES decryption respectively, to

retrieve the plain text. The entire process is depicted as a flow diagram in which boxes represent

steps and arrows indicate the flow of data.

This program employs a three-phase method, with each phase accessible via a Python GUI for both

encryption and decryption. Initially, the plaintext is encrypted using the Advanced

Encryption Standard (AES-256) in Galois/Counter Mode (GCM), thereby ensuring

confidentiality and authenticity. To slow down brute-force attacks on the password, the

encryption key is derived using Password-Based Key Derivation Function 2

(PBKDF2-HMAC). Subsequently, the generated cyphertext is represented using the full Unicode range,
from U+0000 to U+10FFFF, which comprises 1,114,112 distinct code points. The final step

involves embedding this Unicode-encoded cyphertext into a chosen image through the Least

Significant Bit (LSB) steganography method.

The algorithm of the system architecture is as follows.

For encryption:

Step 1: Get the plaintext to be encrypted.

Step 2: Apply symmetric encryption to the secret information.

Step 3: Generate a random salt of 16 bytes and an initialization vector (IV) of 12 bytes

for the encryption.

Step 4: Encrypt the plaintext by combining it with the keystream generated by AES in GCM mode to

produce the cyphertext.

Step 5: GCM combines the encrypted data and any additional authenticated data to produce

the authentication tag through Galois field multiplication.

Step 6: Generate the final encrypted message, which includes the IV, the cyphertext and the

authentication tag.

Step 7: Pass the encrypted data through a second cypher, a modern polyalphabetic

modification of the Vigenère cypher.

Step 8: Use the same key as for AES, or another generated key.

Step 9: Generate another layer of cyphertext to be embedded.

Step 10: Pass the cyphertext to the insertion process.

Step 11: Get the cover image for the cyphertext.

Step 12: Select the pixels to use for the embedding and the index structure.

Step 13: Combine the cyphertext and the index structure of the pixels for the insertion process.

Step 14: Get the stego image from the insertion process.

Step 15: End.

For decryption:

Step 1: Get the stego image that holds the message.

Step 2: Extract the cyphertext message from the image.

Step 3: The recipient applies the polyalphabetic modification key.

Step 4: Decrypt the cyphertext using the polyalphabetic modification of the Vigenère cypher to

extract the AES cyphertext.

Step 5: The recipient uses the same secret key and IV used for AES encryption.

Step 6: Verify the authentication tag to ensure data integrity.

Step 7: Decrypt the cyphertext using AES in GCM.

Step 8: If verification passes, the plaintext is recovered; otherwise, the message is rejected.

Step 9: End.

3.3 AES Encryption using GCM

3.3.1 Overview of GCM


In GCM, the message is encrypted using AES in CTR mode, while the authentication tag is

generated through Galois field operations. GCM uses a function called GHASH for

authentication, which is based on polynomial multiplication in the Galois field $GF(2^{128})$.

$\text{Authentication tag} = \mathrm{GHASH}(H, X)$ (3.1)

Where:

$H$: hash subkey derived from the cypher key, computed as

$H = E_K(0^{128})$, where $E_K$ is the block cypher encryption function

$X$: input data (cyphertext)

$\mathrm{GHASH}(H, X)$: the authentication tag calculation, where

$\mathrm{GHASH}(H, X) = X_1 \cdot H^{n} \oplus X_2 \cdot H^{n-1} \oplus \cdots \oplus X_n \cdot H$

$\oplus$: bitwise XOR

$X_i$: each block of input data

$H^{n-1}$: $H$ raised to the power $(n-1)$ in $GF(2^{128})$
Components of AES-GCM:

i. Key (K): A symmetric key used for encryption.

ii. Nonce (N): A unique number for each encryption process; must never be reused with the

same key.

iii. Plaintext (P): The original message to be encrypted.

iv. Associated Data (A): Additional data that needs to be authenticated but not encrypted

(used for integrity checks).

v. Cyphertext (C): The encrypted output.

vi. Authentication Tag (T): A tag used to verify the authenticity of both the cyphertext and

associated data.
3.3.2 Encryption

$C_i = P_i \oplus \mathrm{AES}_K(\mathrm{Counter}(i))$ (3.2)

Where:
$C_i$: the cyphertext at position $i$
$P_i$: the plaintext at position $i$
$\oplus$: the bitwise exclusive-OR operation
$\mathrm{Counter}(i)$: a counter value associated with position $i$
$\mathrm{AES}_K$: AES encryption of the counter value with key $K$

Steps taken for the encryption

Step 1: Generate initial counter block ($J_0$):

To establish the initial counter block, $J_0$, the Initialization Vector (IV) or Nonce is

concatenated with a fixed 32-bit field set to the value 1. This process is

straightforward when the IV measures exactly 96 bits. Should the IV deviate from this

96-bit length, a specialized GHASH computation is executed on the IV itself to

derive $J_0$.

$J_0 = \mathrm{GHASH}_H(N)$ (3.3)

When the nonce $N$ is not 12 bytes (96 bits), AES-GCM uses the GHASH

function to derive the initial counter block $J_0$ from the nonce.

GHASH is a universal hash function based on Galois field $GF(2^{128})$ arithmetic.

Step 2: Encrypt counter blocks (CTR Mode for Confidentiality):

i. $J_0$ is incremented to produce a sequence of unique counter blocks: $J_1, J_2, J_3, \ldots$

ii. Each counter block ($J_i$) is then encrypted using the AES block cypher under the secret

key ($K$). This generates a series of unique keystream blocks: $E(K, J_1)$, $E(K, J_2)$,

$E(K, J_3)$, ...

iii. The keystream blocks are XORed ($\oplus$) with the corresponding plaintext blocks ($P_i$) to

produce the cyphertext blocks $C_i$.

$C_i = P_i \oplus E(K, J_i)$ (3.4)

Step 3: Authenticate data (GHASH for Integrity & Authenticity):

i. A special authentication key ($H$) is derived by encrypting a zero block with AES under

the secret key: $H = E(K, 0^{128})$.

ii. The Associated Data ($A$) and the generated Cyphertext ($C$) are then processed by the

GHASH function. GHASH is a universal hash function that performs multiplication in

the Galois field $GF(2^{128})$.

iii. The GHASH function processes $A$ and $C$ (along with their lengths) using the

authentication key $H$ to produce a raw hash value.

Step 4: Generate authentication tag (T):

i. The raw hash value from the GHASH computation is XORed with the encrypted initial

counter block $E(K, J_0)$: $T = \mathrm{GHASH}(A, C) \oplus E(K, J_0)$.

ii. The resulting $T$ is the Authentication Tag.
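
Steps 3 and 4 can be illustrated with a minimal pure-Python sketch of GHASH and the final tag computation. It assumes the bit ordering and reduction constant of the GCM specification (NIST SP 800-38D). The function names `gf_mul`, `ghash` and `gcm_tag` are illustrative only and are not taken from the program described in this work. The hash subkey H and the encrypted counter block E(K, J0) are passed in as ready-made values, so no AES implementation is required.

```python
R = 0xE1 << 120  # reduction constant for GF(2^128) in GCM's bit ordering


def gf_mul(x: int, y: int) -> int:
    """Multiply two 128-bit field elements (leftmost bit = lowest-degree term)."""
    z, v = 0, y
    for i in range(128):
        if (x >> (127 - i)) & 1:      # bit i of x, counted from the left
            z ^= v
        v = (v >> 1) ^ R if v & 1 else v >> 1
    return z


def _blocks(data: bytes):
    """Yield 16-byte blocks as integers, zero-padding the final block."""
    for i in range(0, len(data), 16):
        yield int.from_bytes(data[i:i + 16].ljust(16, b"\x00"), "big")


def ghash(h: bytes, aad: bytes, ciphertext: bytes) -> bytes:
    """Hash A, C and their bit lengths under the subkey H."""
    h_int = int.from_bytes(h, "big")
    lengths = (len(aad) * 8).to_bytes(8, "big") + (len(ciphertext) * 8).to_bytes(8, "big")
    y = 0
    for block in (*_blocks(aad), *_blocks(ciphertext), int.from_bytes(lengths, "big")):
        y = gf_mul(y ^ block, h_int)  # Y_i = (Y_{i-1} XOR X_i) * H
    return y.to_bytes(16, "big")


def gcm_tag(h: bytes, ek_j0: bytes, aad: bytes, ciphertext: bytes) -> bytes:
    """T = GHASH(A, C) XOR E(K, J0)."""
    raw_hash = ghash(h, aad, ciphertext)
    return bytes(a ^ b for a, b in zip(raw_hash, ek_j0))
```

A sketch like this is far too slow for production use, but it can be checked against the published GCM test vectors, which makes it useful for understanding how the tag depends on every cyphertext block.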

3.3.3 Decryption
$P_i = C_i \oplus \mathrm{AES}_K(\mathrm{Counter}(i))$ (3.5)
Where:
i. $P_i$: The i-th block of plaintext that you want to recover.

ii. $C_i$: The i-th block of cyphertext that you received after encryption.

iii. $\mathrm{AES}_K$: The AES encryption function utilizing the key $K$.

iv. $\mathrm{Counter}(i)$: The counter value for the i-th block, which must be the same as the one used

during encryption.

v. $\oplus$: Represents the bitwise XOR operation.

Understanding the components

1. Cyphertext $C_i$:

This is the output of the encryption process for the corresponding plaintext block $P_i$.

2. AES encryption function $\mathrm{AES}_K$:

i. This function takes a key $K$ and a value (in this case, the counter) and produces a

128-bit output block, regardless of the key size.

ii. In CTR mode the same encryption function is used for both directions: XORing the

cyphertext with the same keystream restores the original plaintext if the same key is used.

3. Counter $\mathrm{Counter}(i)$:

i. This is a unique value associated with each block of plaintext and is crucial in CTR

mode. It is typically incremented for each block to produce distinct keystreams.

ii. Counter(0), Counter(1), and so forth help produce a unique input to the AES function

for each block.

Decryption process:
1. Retrieve the cyphertext: Obtain the cyphertext $C_i$ from the encrypted data.

2. Generate the same counter: Calculate the counter for the i-th block, $\mathrm{Counter}(i)$, which

should be identical to what was used during the encryption process.

3. Decrypt the block:

i. Compute the encryption of the counter using the AES function:

$\mathrm{AES}_K(\mathrm{Counter}(i))$

ii. Then compute the original plaintext as follows:

$P_i = C_i \oplus \mathrm{AES}_K(\mathrm{Counter}(i))$

iii. By XORing the cyphertext $C_i$ with the result of the AES operation, you retrieve the

original plaintext $P_i$.
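
The symmetry between equations (3.2) and (3.5) can be seen in a short sketch of the counter-mode structure. Because Python's standard library has no AES, a keyed stand-in block function (HMAC-SHA-256 truncated to 16 bytes) takes the place of AES_K here. This substitution is an assumption made purely so the sketch runs; it is not AES-GCM and must not be used for real encryption. The names `initial_counter`, `inc32`, `block_fn` and `ctr_xor` are hypothetical.

```python
import hashlib
import hmac


def initial_counter(iv: bytes) -> bytes:
    """J0 for a 96-bit IV: the IV followed by a 32-bit counter set to 1."""
    if len(iv) != 12:
        raise ValueError("this sketch only handles 96-bit IVs")
    return iv + (1).to_bytes(4, "big")


def inc32(block: bytes) -> bytes:
    """Increment the rightmost 32 bits of a counter block, modulo 2^32."""
    counter = (int.from_bytes(block[12:], "big") + 1) % (1 << 32)
    return block[:12] + counter.to_bytes(4, "big")


def block_fn(key: bytes, counter_block: bytes) -> bytes:
    """Stand-in for AES_K(counter). NOT AES; used only for illustration."""
    return hmac.new(key, counter_block, hashlib.sha256).digest()[:16]


def ctr_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    out = bytearray()
    counter = inc32(initial_counter(iv))          # J1 is the first keystream counter
    for i in range(0, len(data), 16):
        keystream = block_fn(key, counter)
        out.extend(b ^ k for b, k in zip(data[i:i + 16], keystream))
        counter = inc32(counter)
    return bytes(out)
```

Calling `ctr_xor` twice with the same key and IV returns the original data, which is why decryption only needs to regenerate the same counters rather than invert the block function.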

Outputs of AES-GCM encryption:

The AES-GCM encryption process yields three main outputs: The cyphertext (C), which is

the encrypted data; an Authentication Tag (T), a fixed-size value ensuring the integrity and

authenticity of both the cyphertext and any associated data; and the Initialization Vector

(IV/Nonce), which must be transmitted along with the cyphertext and tag for successful

decryption and verification.

3.4 Unicode into Vigenère cypher

3.4.1 Convert characters to code points:


$P[i] = \mathrm{ord}(P[i])$ (3.6)

This assigns the integer Unicode code point of the character at position $i$ in string $P$ to the variable $P[i]$.

$P$: represents a string of plaintext (letters, numbers, symbols).

$i$: denotes the index of a specific character in the string.

$\mathrm{ord}(P[i])$: The `ord()` function converts a character into its corresponding numerical value based

on its Unicode code point. For example, `ord('A')` returns 65 because that is the Unicode

code point for 'A'.

3.4.2 Encryption
$C[i] = (P[i] + K[j]) \bmod M$ (3.7)
Where:
$P[i]$: Unicode code point of the plaintext character.

$K[j]$: Unicode code point of the key character, where $j$ cycles through the key as $i$ advances.

$M$: Total number of Unicode code points (1,114,112; the maximum code point is

1,114,111, that is, U+10FFFF).

$C[i]$: The encrypted character's code point.

3.4.3 Decryption

$P[i] = \mathrm{chr}((C[i] - K[j]) \bmod M)$ (3.8)

Where:

$C[i]$: The encrypted character's code point.

$P[i]$: The plaintext character recovered from its Unicode code point.

$K[j]$: Unicode code point of the key character.

$M$: Total number of Unicode code points (1,114,112; the maximum code point is

1,114,111, that is, U+10FFFF).

chr: Converts a numerical value back to a character.
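
Equations (3.6) to (3.8) can be checked on single characters with a few lines of Python. This is a minimal sketch that assumes M = 0x110000 (1,114,112), the modulus used by the implementation described in Chapter Four; the function names `encrypt_char` and `decrypt_char` are illustrative.

```python
M = 0x110000  # number of Unicode code points, U+0000 to U+10FFFF


def encrypt_char(p: str, k: str) -> str:
    """Equation (3.7): shift the plaintext code point by the key code point."""
    return chr((ord(p) + ord(k)) % M)


def decrypt_char(c: str, k: str) -> str:
    """Equation (3.8): subtract the key code point to undo the shift."""
    return chr((ord(c) - ord(k)) % M)


print(ord(encrypt_char("A", "B")))                 # 65 + 66 = 131
print(encrypt_char("\U0010FFFF", "B"))             # (1114111 + 66) mod 1114112 = 65, i.e. 'A'
print(decrypt_char(encrypt_char("A", "B"), "B"))   # 'A'
```

The second line shows the wrap-around at the top of the Unicode range, and the first shows that the result (code point 131) can be a non-printable control character.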

3.5 Steganography

Steganography is the technique of hiding secret data within an ordinary, non-secret, file or

message in order to avoid detection. Here we mainly use image steganography. Image

steganography is the process of hiding data within an image so that there is no

perceptible change in the original image. The conventional image steganography

algorithm is the LSB embedding algorithm.


We implemented the most basic steganographic approach: The Least Significant Bit (LSB)

method. This technique leverages the structure of a color pixel, which is composed of red, green,

and blue components, each represented by one byte. The core principle is to embed secret

information into the least significant bit (the first bit) of each RGB (8-bits) component within

every pixel. In the most extreme scenario, this modification changes the decimal value by only

one, rendering the alteration visually imperceptible to the human eye. If the available capacity in

the first bits is insufficient, the embedding process extends to the second bit of each pixel

component, and continues similarly. To prevent detection, it is imperative to limit the embedded

data to less than 250 words per image.

Maximum Capacity in image steganography, when embedding data in the Least Significant Bits

(LSBs) is:

Max Capacity (bits) = Total Pixels × Number of Color Channels × Number of LSBs Used per

Channel

Table 3.2: Summary of the maximum capacity of LSB steganography with three colour channels

Image Size       Dimension      Total Pixels   1 LSB per Channel (bits)   2 LSBs per Channel (bits)
Small            200 x 200      40,000         120,000                    240,000
Medium           800 x 600      480,000        1,440,000                  2,880,000
Full HD Image    1920 x 1080    2,073,600      6,220,800                  12,441,600
4K UHD           3840 x 2160    8,294,400      24,883,200                 49,766,400
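
The capacity formula can be expressed as a small helper that reproduces the figures in Table 3.2 and checks whether a message fits before embedding. The function names `max_capacity_bits` and `fits` are illustrative and are not part of the program described here.

```python
def max_capacity_bits(width: int, height: int, channels: int = 3,
                      lsbs_per_channel: int = 1) -> int:
    """Max capacity (bits) = total pixels x colour channels x LSBs used per channel."""
    return width * height * channels * lsbs_per_channel


def fits(message: bytes, width: int, height: int, channels: int = 3,
         lsbs_per_channel: int = 1) -> bool:
    """Return True if the message (8 bits per byte) fits in the cover image."""
    capacity = max_capacity_bits(width, height, channels, lsbs_per_channel)
    return len(message) * 8 <= capacity


print(max_capacity_bits(200, 200))                        # 120000
print(max_capacity_bits(1920, 1080, lsbs_per_channel=2))  # 12441600
print(fits(b"x" * 15_000, 200, 200))                      # True: exactly 120,000 bits
```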

3.5.1 How it works

Employing an OpenCV-based function, this program embeds text data into images using a

progressive LSB (Least Significant Bit) approach. It begins by utilizing the least significant bit

of each pixel's color channel. If more space is required, the function then uses the next higher-

order bits. This means the more data you hide, the more the image is altered. Conceptually, a

digital image is an array of pixels—finite digital values that serve as the smallest elements. Each

pixel stores a value representing a specific color's brightness at its location, effectively forming a

fixed-dimension matrix (or two-dimensional array) of these pixel values.

3.5.2 Sampling theory and pixel intensity

The process of converting analog images into discrete digital pixels, forming the "canvas" for

steganography, relies on two core steps: sampling and quantization. Sampling breaks the

continuous image into individual pixel measurements, determining spatial resolution.

Quantization then converts these analog measurements into discrete numerical pixel intensity

values (e.g., 0-255), representing colors or brightness. This digital pixel data is what

steganography manipulates to hide information.


i. Sampling theory in digital imaging describes the process of converting a continuous

analog image (where color and intensity vary smoothly) into discrete units called pixels.

This is done by taking regular, defined measurements of the image's light information

across its surface, making it understandable for computers.

ii. Quantization is the process following sampling, where the continuous analog electrical

signal representing a pixel's light intensity or color is converted into a finite set of

discrete numerical values. These discrete values are what computers can store and

process.

iii. The "Canvas" for hiding data: The combined processes of sampling and quantization

transform continuous analog images into a discrete matrix of numerical pixel values. This

matrix, where each pixel's color or intensity is represented by bits, forms the digital

"canvas" for steganography. Techniques like LSB embedding exploit human visual

limitations by subtly altering the least significant bits of these pixel intensity values,

effectively hiding secret data without creating noticeable visual changes, thanks to the

ample "space" provided by the image's high pixel count and multi-bit depth.

3.5.3 LSB modification of pixel value

As already mentioned, pixels are the smallest individual element of an image. So, each pixel is a

sample of an original image. It means, more samples provide more accurate representations of

the original. The intensity of each pixel is variable. In color imaging systems, a color is typically

represented by three or four component intensities such as red, green, and blue, or cyan,

magenta, yellow, and black.


Here, we work with the RGB color model. As you can imagine, the RGB color model has 3

channels, red, green and blue.

So, each pixel from the image is composed of 3 values (red, green, blue) which are 8-bit values

(the range is 0–255).

For each pixel we have three values, which can be represented in binary code.

When working with binary codes, we have more significant bits and less significant bits, as

shown in Table 3.4.

Table 3.4: The place values of the bits in an 8-bit channel value

Bit weight       2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0
Decimal value    128   64    32    16    8     4     2     1

Range: 0 to 255

The leftmost bit is the most significant bit. If we change the leftmost bit it will have a large

impact on the final value. For example, if we change the leftmost bit from 1 to 0 (11111111 to

01111111) it will change the decimal value from 255 to 127.

On the other hand, the rightmost bit is the least significant bit. If we change the rightmost bit it

will have less impact on the final value. For example, if we change the rightmost bit from 1 to 0

(11111111 to 11111110) it will change the decimal value from 255 to 254. Note that the

rightmost bit will change only 1 in a range of 256 (it represents less than 1%).
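
The two examples above can be reproduced with a small bit-manipulation helper. This is a minimal sketch; the function name `set_bit` is illustrative.

```python
def set_bit(value: int, position: int, bit: int) -> int:
    """Return value with the bit at `position` (0 = least significant) forced to `bit`."""
    return (value & ~(1 << position)) | (bit << position)


print(set_bit(0b11111111, 7, 0))  # 127: clearing the most significant bit
print(set_bit(0b11111111, 0, 0))  # 254: clearing the least significant bit
```

Changing the bit at position 0 never moves a channel value by more than 1, which is the property LSB embedding relies on.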

In short: each pixel has three values (RGB), each RGB value is 8 bits wide, and the rightmost

bits are the least significant. So, if we change the rightmost bits it

will have a small visual impact on the final image. This is the key idea steganography uses to hide one

image inside another: the least significant bits of the cover image are replaced with the most

significant bits of the other image.

3.5.4 Encode the data

Every byte of data is first converted into 8-bit binary code using ASCII values for each byte.

Pixels are then processed from left to right in groups of three (totaling nine color component

values). The first eight of these nine values are used to embed the binary data, with each value

adjusted to be odd if the corresponding binary bit is '1' and even if it's '0'.

The maximum amount of hidden information is limited by the cover image's properties, such

as its dimensions and bit depth. A related issue is image overflow, which happens

when the message to be hidden exceeds the image's capacity. This necessitates

strategies such as truncating the message or using multiple cover images to accommodate the

data. Error-Correcting Codes (ECC) are applied externally to the secret message

before embedding to handle errors. ECC encodes the message with redundant bits, allowing the system to detect

and often correct flipped LSBs upon extraction.

3.5.5 Decode the data

For decoding, the system reads pixels in groups, processing three at a time, until the entire hidden

message has been extracted. Each group of three pixels contains 8 bits of embedded binary data, which is

retrieved using the identical logic employed during the encoding phase. The termination of the message is

indicated by a specific binary bit (or sequence) where '1' signifies completion and '0' indicates

continuation.
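
The encoding and decoding rules of Sections 3.5.4 and 3.5.5 can be sketched on a flat list of colour-channel values instead of an image file. This is a minimal illustration under stated assumptions: each group of nine values (three RGB pixels) carries one byte in the parity of its first eight values, and the ninth value is odd to mark the end of the message. How a value is adjusted to reach the required parity (decrement, or increment at zero) is an assumption, as are the names `encode` and `decode`; the sketch also requires a non-empty message.

```python
def _force_parity(value: int, bit: int) -> int:
    """Make value odd when bit is 1 and even when bit is 0, changing it by at most 1."""
    if value % 2 == bit:
        return value
    return value - 1 if value > 0 else value + 1


def encode(channels: list, message: bytes) -> list:
    """Embed message into a copy of the channel values, nine values per byte."""
    if not message:
        raise ValueError("message must not be empty")
    if len(message) * 9 > len(channels):
        raise ValueError("message exceeds the capacity of the cover")
    out = list(channels)
    for n, byte in enumerate(message):
        base = n * 9
        for j in range(8):
            bit = (byte >> (7 - j)) & 1               # most significant bit first
            out[base + j] = _force_parity(out[base + j], bit)
        is_last = 1 if n == len(message) - 1 else 0   # odd ninth value = stop
        out[base + 8] = _force_parity(out[base + 8], is_last)
    return out


def decode(channels: list) -> bytes:
    """Read one byte per group of nine values until the stop marker is found."""
    message = bytearray()
    for base in range(0, len(channels) - 8, 9):
        byte = 0
        for j in range(8):
            byte = (byte << 1) | (channels[base + j] & 1)
        message.append(byte)
        if channels[base + 8] & 1:
            break
    return bytes(message)
```

For example, `decode(encode(cover, b"Hi!"))` returns `b"Hi!"` for any cover list of at least 27 values, and no value in the cover changes by more than 1.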
CHAPTER FOUR
4.0 RESULT AND IMPLEMENTATION

4.1 Programming Language

The programming language used in the development is Python, chosen for its extensive libraries

and ease of use. Python is a popular programming language known for its simplicity, readability,

versatility, strong community support, and extensive libraries, making it an excellent choice for

both beginners and experienced developers compared to other programming languages.

4.2 Advanced Encryption Standard:

AES (Advanced Encryption Standard) is a symmetric encryption algorithm widely used to secure

sensitive data. It was established as a standard by the National Institute of Standards and
Technology (NIST) in 2001 and has since become one of the most popular encryption methods

globally. The modules used are

4.2.1 AES encryption process

Code Explanation:

Salt Generation:

A random 16-byte salt is generated to be used in key derivation.

This ensures that the same password produces different encryption keys.

Key Derivation:

Uses the password and salt with PBKDF2 (HMAC-SHA-256) to generate a secure encryption key.

IV (Nonce) Generation:

A 12-byte IV (Nonce) is generated, as required by AES-GCM.

AES-GCM Encryption:

The cypher is initialized using the derived key and IV.

Base64 Encoding:

The final output consists of:

Salt (16 bytes)

IV (12 bytes)

Cyphertext (variable length depending on plaintext size)

Authentication Tag (16 bytes)

The entire byte sequence is Base64-encoded to return a safe, readable string.

After the necessary modules were imported from the cryptography library, which provides robust

cryptographic recipes and primitives, a key was derived within a function from the password and

a salt using a key derivation function (KDF). A salt is a random value added to the input of a

hash function so that the output is unique even for identical inputs, which is crucial for

preventing rainbow table attacks. A random 16-byte salt is generated

and passed, together with the provided password, to the derived_key function, which

returns the cryptographic key. A 12-byte

initialization vector (IV) is generated, which is necessary for the Galois/Counter Mode (GCM) of

operation. A cipher object is created using the AES algorithm in GCM mode, utilizing the

derived key and the IV. The text is encrypted using the encryptor object: the

update method processes the text, while finalize ensures that all data is encrypted. The function

returns a Base64-encoded string that concatenates the salt, IV, ciphertext, and the authentication

tag, ensuring that all necessary components for decryption are included. AES encryption is a

robust, efficient, and widely adopted method for securing data, making it a cornerstone of

modern cryptographic practices.
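
The encryption flow described above can be sketched as follows. This is a hedged reconstruction from the prose, not the original listing: it assumes the third-party `cryptography` package is installed, uses the parameters stated in this chapter (16-byte salt, 12-byte IV, PBKDF2 with SHA-256, 100,000 iterations, 32-byte key), and the function name `encrypt_text` is hypothetical.

```python
import base64
import os


def encrypt_text(plaintext: str, password: str) -> str:
    """Return Base64(salt || IV || ciphertext || tag) for the given text."""
    # The cryptography package is a third-party dependency; importing it here
    # keeps this module importable on systems where it is not installed.
    from cryptography.hazmat.backends import default_backend
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
    from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

    salt = os.urandom(16)   # random salt so the same password yields different keys
    iv = os.urandom(12)     # 12-byte IV (nonce), never reused with the same key

    kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32, salt=salt,
                     iterations=100_000, backend=default_backend())
    key = kdf.derive(password.encode("utf-8"))

    encryptor = Cipher(algorithms.AES(key), modes.GCM(iv),
                       backend=default_backend()).encryptor()
    ciphertext = encryptor.update(plaintext.encode("utf-8")) + encryptor.finalize()

    # The 16-byte authentication tag is available only after finalize().
    return base64.b64encode(salt + iv + ciphertext + encryptor.tag).decode("ascii")
```

Because GCM does not pad, the decoded output is always 16 + 12 + 16 = 44 bytes longer than the UTF-8 encoded plaintext.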

Code Explanation:

i. PBKDF2HMAC: Uses PBKDF2 with HMAC (Hash-based Message Authentication Code) to

derive a cryptographic key from a password.

ii. SHA256 Hash Algorithm: Ensures secure key derivation.

iii. Length = 32: Specifies that the derived key should be 32 bytes (256 bits) long.

iv. Salt: A unique random value used to defeat precomputed (rainbow table) attacks.

v. Iterations = 100000: The number of iterations to slow down brute-force attacks.

vi. Backend: Specifies the cryptographic backend.
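
The effect of these parameters can be observed with the standard library alone. The sketch below uses `hashlib.pbkdf2_hmac`, which is swapped in here for the library's PBKDF2HMAC class and computes the same PBKDF2-HMAC-SHA256 function; the helper name `derive_key` and the sample password are illustrative.

```python
import hashlib
import os


def derive_key(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """Derive a 32-byte (256-bit) key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt,
                               iterations, dklen=32)


salt_a, salt_b = os.urandom(16), os.urandom(16)
key_a = derive_key("correct horse", salt_a)
key_b = derive_key("correct horse", salt_b)

print(len(key_a))                                    # 32 bytes, suitable for AES-256
print(key_a == derive_key("correct horse", salt_a))  # True: same password and salt
print(key_a == key_b)                                # different salts give different keys
```

The salt therefore has to be stored alongside the cyphertext, since the recipient needs it to re-derive the same key from the password.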

4.2.2 AES decryption processes.


Code Explanation:

Base64 Decoding:

The input cyphertext is assumed to be a Base64-encoded string.

It's decoded back into raw encrypted bytes.

Extract Salt, IV (Nonce), Tag, and Encrypted Data:

Salt (First 16 bytes): Salt is used for key derivation

IV (Next 12 bytes): This is required for AES-GCM decryption

Tag (Last 16 bytes): It is used for authentication

Cyphertext (Middle portion): The actual encrypted message

Key Derivation

Uses PBKDF2HMAC to generate a secure key.

AES-GCM Decryption

Initializes an AES cypher in Galois/Counter Mode (GCM).

The authentication tag ensures message integrity.

The modules imported from the cryptography library provide robust cryptographic recipes and

primitives. The function is a placeholder for the key derivation function (KDF) that should be

implemented to securely generate the key from the password and salt. This is crucial for maintaining

security. The base64.b64decode function decodes the Base64-encoded string into its original

binary format. The first 16 bytes are extracted as the salt, the next 12 bytes as the IV, the last 16

bytes as the authentication tag and the remaining bytes as the actual cyphertext. The cypher

object is created using the AES algorithm in GCM mode, initialized with the derived key, IV and

tag. The method processes the cyphertext and ensures that all data is decrypted properly. The

result is then decoded from bytes to a string.
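
The byte layout and decryption flow described above can be sketched as follows. As with any reconstruction from prose, this is an assumption-laden illustration rather than the original listing: it assumes the third-party `cryptography` package, the same PBKDF2 parameters as on the encryption side, and the hypothetical function names `split_token` and `decrypt_text`.

```python
import base64

SALT_LEN, IV_LEN, TAG_LEN = 16, 12, 16


def split_token(token: str):
    """Split Base64(salt || IV || ciphertext || tag) into its four parts."""
    raw = base64.b64decode(token)
    if len(raw) < SALT_LEN + IV_LEN + TAG_LEN:
        raise ValueError("token too short to contain salt, IV and tag")
    salt = raw[:SALT_LEN]
    iv = raw[SALT_LEN:SALT_LEN + IV_LEN]
    ciphertext = raw[SALT_LEN + IV_LEN:-TAG_LEN]
    tag = raw[-TAG_LEN:]
    return salt, iv, ciphertext, tag


def decrypt_text(token: str, password: str) -> str:
    """Verify the tag and return the plaintext; raises InvalidTag on failure."""
    # Third-party dependency, imported lazily so split_token works without it.
    from cryptography.hazmat.backends import default_backend
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
    from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

    salt, iv, ciphertext, tag = split_token(token)
    kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32, salt=salt,
                     iterations=100_000, backend=default_backend())
    key = kdf.derive(password.encode("utf-8"))

    decryptor = Cipher(algorithms.AES(key), modes.GCM(iv, tag),
                       backend=default_backend()).decryptor()
    # finalize() checks the authentication tag before the result is trusted.
    plaintext = decryptor.update(ciphertext) + decryptor.finalize()
    return plaintext.decode("utf-8")
```

A wrong password or a modified token makes `finalize()` raise `cryptography.exceptions.InvalidTag`, which corresponds to the "message is rejected" branch of the decryption algorithm.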


4.3 Vigenère Cypher

The Vigenère cipher is a classic encryption technique that enhances the security of plaintext by

utilizing a keyword for character shifting. The provided Python implementation effectively

demonstrates how to encrypt text using this method, showcasing the conversion between

characters and their Unicode representations. With Unicode support, the cipher can be

implemented using modern programming languages and libraries that provide built-in support

for Unicode. Unicode support enables the cipher to handle non-ASCII characters, surrogate

pairs, and other special characters, making it more flexible and adaptable to different use cases.

4.3.1 Encryption
Input Parameters:

i. plaintext: The message that you want to encrypt.

ii. key: The keyword used for encryption. This key will repeat if it is shorter than the

plaintext.

Initializing variables

i. An empty list to store the encrypted characters.

ii. The length of the key.

iii. A list of integers representing the Unicode values of each character in the key.

iv. plaintext: A list of integers representing the Unicode values of each character in the

plaintext.

Loop through each character of the plaintext


i. The loop iterates over each character in the plaintext list.

ii. For each character:

1. The Unicode value of the current character from plaintext is added to the

corresponding character's value from the key where the key repeats if it is shorter

than the plaintext.

2. The modulo operation ensures that the resulting value stays within the full

Unicode range.

3. The result is then converted back to a character using chr(), and the character is

appended to the encrypted text.

Return the result

Finally, the encrypted characters are joined together to form a single string and returned

as the encrypted text.

Modulo operation:

The value 0x110000 is used as a boundary for Unicode characters, which covers the entire Unicode

space, including supplementary characters. This is a valid approach, but the Vigenère

cypher traditionally works within a smaller character set, like the ASCII or extended

Latin characters, rather than the entire Unicode range.

Character wrapping:

If dealing with typical text (ASCII or similar), values can be wrapped around within a more

restricted range depending on the use case. Using the full Unicode range can produce

characters that are not normally visible or printable, which might not be suitable for

certain applications.

Handling non-printable characters:

Since the encryption could result in non-printable or special Unicode characters,

consider handling those cases explicitly, depending on the intended use of the

cyphertext.

Key length vs. Plaintext length:

The key repeats itself if it is shorter than the plaintext. However, it might be useful to

validate that the key is not empty, as an empty key could lead to an exception when trying to

index into the key.

The function is defined to accept plaintext and key as inputs. An empty list is created to store

the resulting encrypted characters. The ord() function is used to convert each character of the key and

plaintext into its corresponding Unicode integer value. A loop iterates over each character of the

plaintext. The encryption is performed by adding the Unicode value of the plaintext character to the

corresponding key character (using modulo to cycle through the key). The result is wrapped

around the full Unicode range using % 0x110000. The resulting integer value is converted back

to a character using chr() and appended to the encrypted_text list. Finally, the list of encrypted

characters is joined into a single string and returned.

The combination of Unicode characters in the key can lead to a significantly larger key space,

thus enhancing the overall cryptographic strength due to a reduced likelihood of key collision.
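
The encryption function described in this section can be sketched as follows. This is a reconstruction from the prose above rather than the original listing, so the function name `vigenere_encrypt` and the variable names are assumptions; the empty-key check implements the validation suggested under "Key length vs. Plaintext length".

```python
UNICODE_RANGE = 0x110000  # number of code points, U+0000 to U+10FFFF


def vigenere_encrypt(plaintext: str, key: str) -> str:
    """Shift each plaintext code point by the matching key code point."""
    if not key:
        raise ValueError("key must not be empty")
    key_codes = [ord(ch) for ch in key]   # Unicode values of the key characters
    key_length = len(key_codes)
    encrypted_text = []
    for i, ch in enumerate(plaintext):
        # i % key_length repeats the key when it is shorter than the plaintext.
        shifted = (ord(ch) + key_codes[i % key_length]) % UNICODE_RANGE
        encrypted_text.append(chr(shifted))
    return "".join(encrypted_text)


print([ord(c) for c in vigenere_encrypt("AB", "B")])  # [131, 132]
```

The output has the same number of characters as the input, but the characters are often control or otherwise non-printable code points, as noted above.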

4.3.2 Decryption
The Vigenère_decryption function provides a clear and efficient method for decrypting text

encoded with the Vigenère cypher. By leveraging the properties of Unicode and modular

arithmetic, the function successfully retrieves the original plaintext from the cyphertext

Input parameters:

i. cyphertext: The encrypted message to be decrypted.

ii. key: The same keyword used during encryption.

Initializing variables

i. An empty list to store the decrypted characters.

ii. The length of the key.

iii. A list of integers representing the Unicode values of each character in the key.

iv. A list of integers representing the Unicode values of each character in the cyphertext.

Loop through each character of the cyphertext

The loop iterates over each character in the list.

For each character:

i. The key value corresponding to the current position is subtracted from the cyphertext
character's value. The modulo operation ensures that the result stays within the full
Unicode range (just like in the encryption process).

ii. The result is then converted back to a character using chr(value), and this character is

appended to decrypted_text.
Return the result

The decrypted characters are joined together to form a single string and returned as the

decrypted text.

Observations

Modulo operation:

Just like the encryption, the use of 0x110000 in the modulo operation ensures the result
stays within the full Unicode range. This is fine for a very broad character set (including
supplementary Unicode characters), but as mentioned previously, it may be preferable to
restrict this to a smaller range when working with a more limited character set such as ASCII.

Key length validation:

Just like in the encryption function, it would be a good idea to validate that the key is not
empty to avoid potential errors during the decryption process.

Handling non-printable characters:

Depending on the context in which the decrypted text is being used, it might be necessary

to handle non-printable characters or control characters that result from decryption.

General consistency:

If the cypher is used in a more general application, consider adding steps such as
padding or character encoding normalization to ensure that the encrypted text can be
safely transmitted or stored.


The function initializes an empty list, decrypted_text, to store the decrypted characters. It also
calculates the length of the key and converts both the key and the cyphertext into lists of their
corresponding Unicode integer values. The loop iterates over each character in the cyphertext.
For each character, it computes the decrypted value by subtracting the value of the
corresponding key character (using modulo to cycle through the key) from the value of the
cyphertext character. The result is wrapped around the full Unicode range to ensure valid
character values. Each computed integer value is converted back to a character using the chr()
function and appended to the decrypted_text list. Finally, the function joins the list of
decrypted characters into a single string and returns it.
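The decryption steps can be sketched in the same way. This is an illustrative sketch under the same assumptions as the description above: the name vigenere_decrypt, the constant UNICODE_RANGE and the empty-key check are hypothetical, while decrypted_text and the subtraction modulo 0x110000 follow the text.

```python
UNICODE_RANGE = 0x110000  # number of Unicode code points (U+0000 to U+10FFFF)


def vigenere_decrypt(cyphertext: str, key: str) -> str:
    """Undo the shift by subtracting the matching key code point."""
    if not key:
        # Assumed validation, mirroring the key length observation.
        raise ValueError("key must not be empty")

    decrypted_text = []
    key_length = len(key)
    key_values = [ord(c) for c in key]            # Unicode values of the key
    cypher_values = [ord(c) for c in cyphertext]  # Unicode values of the cyphertext

    for i, value in enumerate(cypher_values):
        # Python's % always returns a non-negative result here, so a
        # negative difference wraps back into the Unicode range.
        original = (value - key_values[i % key_length]) % UNICODE_RANGE
        decrypted_text.append(chr(original))
    return "".join(decrypted_text)
```

For example, vigenere_decrypt(chr(131), "B") subtracts 66 from 131 and returns "A", reversing the addition performed during encryption.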

4.4 Steganography
Image steganography is the process of hiding data within an image so that there is no
perceptible change to the original image. The conventional image steganography algorithm used
here is least significant bit (LSB) embedding, which is the most basic method. A colour pixel is
composed of red, green and blue components, each encoded on one byte. The idea is to store
information in the least significant bit of every pixel's RGB components. In the worst case, the
decimal value of a component changes by one, which is not visible to the human eye. If there is
not enough space to store all the data in the least significant bit of every component, the
second bit is used, and so on. The more data is stored in an image, the easier it is to detect,
so less data should be stored in a single image to avoid detection.

This structure prepares the object to modify pixel bits to hide or read data without significantly

altering the visible image.


The class initializes with an image, sets up masks for manipulating individual bits in pixel values

for steganography and tracks position within the image for sequential embedding/extraction.

This method encodes each bit of data into the least significant bit of a pixel's color channel, one

at a time. It carefully manipulates pixel values using bitwise operations to hide data without

significantly altering the image's appearance. The cursor management via next_slot() ensures

embedding proceeds sequentially through the image.

This function ensures sequential traversal of all possible embedding positions within the image,

moving through channels, pixels, rows, and bit positions systematically. It also manages the

transition between different bit significance levels, enabling multi-level steganography.

These methods work together to extract hidden data from an image:

The cursor advances automatically after each call, ensuring sequential reading of embedded data.

These methods convert numerical data into binary strings of a specified size. They are used to
encode the length and content of a message into an image, suitable for steganography.
This process ensures data is stored in a predictable binary format, making it retrievable later.

The method first reads the length of the hidden message. Then, it reads that many characters

(bytes), each stored as 8 bits, converts each binary byte back into its character form and builds
and returns the complete hidden message string. This process is the reverse of the encoding

process, allowing the retrieval of data embedded in the image.
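The embedding and extraction of a text message described above can be sketched as follows. This is a simplified illustration, not the class used in this project: it works on a flat bytearray of colour channel values instead of an image object, uses only the least significant bit, and the names to_bits, embed_text, extract_text and the 16-bit length prefix are assumptions made for the example.

```python
def to_bits(value: int, width: int) -> str:
    """Return value as a binary string padded to a fixed width."""
    return format(value, f"0{width}b")


def embed_text(channels: bytearray, message: str) -> bytearray:
    """Hide a length prefix and the message bytes in the channel LSBs."""
    data = message.encode("utf-8")
    if len(data) >= 1 << 16:
        raise ValueError("message too long for a 16-bit length prefix")
    bits = to_bits(len(data), 16) + "".join(to_bits(b, 8) for b in data)
    if len(bits) > len(channels):
        raise ValueError("cover is too small for this message")

    stego = bytearray(channels)  # copy, so the cover stays unchanged
    for i, bit in enumerate(bits):
        # Clear the least significant bit, then set it to the data bit.
        stego[i] = (stego[i] & 0xFE) | int(bit)
    return stego


def extract_text(channels: bytearray) -> str:
    """Read the length prefix, then that many 8-bit bytes."""
    def read(start: int, width: int) -> int:
        bits = "".join(str(channels[i] & 1) for i in range(start, start + width))
        return int(bits, 2)

    length = read(0, 16)
    data = bytes(read(16 + 8 * k, 8) for k in range(length))
    return data.decode("utf-8")
```

Because only the lowest bit of each value is modified, every channel value in the stego data differs from the cover by at most one.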

This method hides an entire image within another by converting pixel data into binary
form and embedding it into the least significant bits of the carrier image. It checks whether
the carrier image can hold the data, encodes the dimensions of the image to hide, and iterates
over every pixel and channel of the image to hide. It then embeds the pixel data into the
carrier image's binary data and returns the final steganographed image.

Decryption:

This process reverses the encoding process, extracting embedded pixel data from the
steganographed image to recover the original image. It reads the dimensions of the hidden
image, creates an empty image array, reads pixel values channel by channel, reconstructs the
original image pixel by pixel, and outputs the hidden image.

This approach is useful for general data hiding, not just images (e.g., text, files).

encode_binary embeds the length and data into the image, while decode_binary reads the length
and then retrieves the data accordingly. Together, they enable storing and retrieving arbitrary
binary data within an image.


4.5 Result of The Trained Model

Table 4.1: Result of the trained model


Proposed method

Image      PSNR (dB)   Embedding Capacity (bits)   MSE
Lena       46.04       589,824                     3.74
Baboon     46.04       589,824                     3.74
Peppers    46.01       589,824                     3.74
Boat       46.01       589,824                     3.76
Airplane   45.88       589,824                     3.76
House      45.83       589,824                     3.74
Average    45.96       589,824                     3.75

4.6 Evaluation Metric


Mean Square Error

\[ \mathrm{MSE} = \frac{1}{m \times n} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \left( k(i,j) - \hat{k}(i,j) \right)^{2} \tag{4.1} \]

Where:

$m$ = number of rows in the image

$n$ = number of columns in the image

$k(i,j)$ = pixel value at position $(i,j)$ in the original image

$\hat{k}(i,j)$ = pixel value at position $(i,j)$ in the distorted image


Peak Signal-to-Noise Ratio using MSE

\[ \mathrm{PSNR} = 10 \log_{10} \left( \frac{(MAX_I)^{2}}{\mathrm{MSE}} \right) \tag{4.2} \]
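As an illustration of how the two metrics are computed, the sketch below evaluates MSE and PSNR for two small grayscale images stored as nested lists. It is a hedged example and not the evaluation script used for the tables: the function names mse and psnr are hypothetical, and the maximum pixel value is assumed to be 255 (8-bit images).

```python
import math


def mse(original, distorted):
    """Mean of the squared pixel differences over an m x n image."""
    m = len(original)     # number of rows
    n = len(original[0])  # number of columns
    total = sum(
        (original[i][j] - distorted[i][j]) ** 2
        for i in range(m)
        for j in range(n)
    )
    return total / (m * n)


def psnr(original, distorted, max_value=255):
    """PSNR in dB; max_value is the assumed peak pixel value."""
    error = mse(original, distorted)
    if error == 0:
        return math.inf  # identical images: no noise at all
    return 10 * math.log10(max_value ** 2 / error)


cover = [[10, 20], [30, 40]]
stego = [[11, 20], [30, 38]]
print(mse(cover, stego))   # squared errors 1 + 0 + 0 + 4, divided by 4 pixels
print(psnr(cover, stego))
```

A lower MSE gives a higher PSNR, so the two columns of a results table move in opposite directions for the same image.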
4.7 Result of The Model Compared with Previous Model
The table below shows the comparison between the proposed model and Ko et al., 2022. It
shows that, although the message passes through multiple security layers, the proposed model
achieves a higher PSNR. Across all images, the embedding capacity remains constant at 589,824
bits, which suggests the proposed method maintains a fixed embedding size, likely optimized for
the cover images' capacity.

PSNR improvement in proposed method:

Comparing the PSNR of Ko et al., 2022 (e.g., 44.67 dB for Lena) to the proposed method PSNR
(46.04 dB for Lena), there is an increase of about 1.37 dB. This suggests that, at the same
embedding capacity, the proposed method introduces less distortion into the cover image.

Table 4.2: Result of the model compared with Ko et al., 2022.

           Ko et al., 2022                         Proposed method
Images     PSNR (dB)   Embedding Capacity (bits)   PSNR (dB)   Embedding Capacity (bits)   MSE
Lena       44.67       589,824                     46.04       589,824                     3.74
Baboon     44.67       589,824                     46.04       589,824                     3.74
Peppers    44.66       589,824                     46.01       589,824                     3.74
Boat       44.68       589,824                     46.01       589,824                     3.76
Airplane   44.66       589,824                     45.88       589,824                     3.76
House      44.67       589,824                     45.83       589,824                     3.74
Average    44.67       589,824                     45.96       589,824                     3.75

4.8 Interface

4.8.2 Vigenère cypher interface

i. Encryption:

For encryption example,

The plaintext (cypher text from AES encryption) is

ii. Decryption:

4.8.3 Steganography interface

Every byte of data is converted to its 8-bit binary code using ASCII and Unicode values. Pixels
are then read from left to right in groups of three, which gives nine colour values per group.
The first eight values are used to store the binary data. A value is made odd if the bit is 1
and even if the bit is 0.


How data is embedded:

Saving the word “HI” into the below image

Decode The Data

To decode, three pixels are read at a time until the last value of a group is odd, which means
the message is over. Every group of three pixels contains one byte of binary data, which can be
extracted by the same logic used for encoding: if the value is odd, the binary bit is 1;
otherwise it is 0.
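The odd/even scheme described in this section can be sketched as follows. This is a simplified illustration rather than the interface code itself: it operates on a flat list of colour values (nine per group of three RGB pixels) instead of an image file, it assumes ASCII input, and the names set_parity, encode_parity and decode_parity are hypothetical.

```python
def set_parity(value: int, bit: int) -> int:
    """Make value odd for bit 1 and even for bit 0, staying in 0..255."""
    if value % 2 == bit:
        return value
    return value - 1 if value > 0 else value + 1


def encode_parity(values, message):
    """Store each byte in 8 values; the 9th value marks the last byte."""
    data = message.encode("ascii")
    if not data:
        raise ValueError("message must not be empty")
    if len(data) * 9 > len(values):
        raise ValueError("not enough pixel values for this message")

    out = list(values)
    for k, byte in enumerate(data):
        base = 9 * k  # three RGB pixels give nine values
        for b in range(8):
            bit = (byte >> (7 - b)) & 1  # most significant bit first
            out[base + b] = set_parity(out[base + b], bit)
        # Odd 9th value means the message is over, even means keep reading.
        is_last = 1 if k == len(data) - 1 else 0
        out[base + 8] = set_parity(out[base + 8], is_last)
    return out


def decode_parity(values):
    """Read groups of nine values until the 9th value is odd."""
    data = []
    pos = 0
    while pos + 9 <= len(values):
        byte = 0
        for b in range(8):
            byte = (byte << 1) | (values[pos + b] % 2)
        data.append(byte)
        if values[pos + 8] % 2 == 1:
            break
        pos += 9
    return bytes(data).decode("ascii")
```

Saving the word "HI" therefore needs two groups of three pixels (18 colour values), and each value changes by at most one.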

4.9 Security Consideration

i. Layered security: Using both encryption algorithms (AES-GCM and the Vigenère
Cypher with full Unicode range) together with steganography creates multiple layers of
protection, which makes it harder for attackers to access the hidden message.

ii. Key Management: Use robust key generation practices for both AES and Vigenère

cypher.

iii. Steganography Robustness: Ensure the steganography method is resistant to detection

and extraction by unauthorized parties.

iv. Error Handling: Error handling for the decryption and extraction processes ensures that
the system is reliable, secure, and user-friendly.

v. Performance: Optimize the system for large messages with high-resolution cover media.

CHAPTER FIVE
5.0 SUMMARY, CONCLUSION AND RECOMMENDATION

5.1 Summary

This project met its goal of creating a stego-image file for hidden information.

Chapter one provides an overview of steganography, cryptography, and watermarks.
Steganography involves different ways of hiding secret messages within other non-secret
digital media, such as text, images (colour, binary, and grayscale), audio, video, and
networks, to protect important information from attacks. Cryptography uses techniques for
secure communication by converting plain text into ciphertext. Different methods and layers for
securing information have been deployed. The implementation of this project involves a
multi-layered security approach.

Chapter two reviews the different methods proposed by different scholars to implement security
on a secret message. It helps in understanding the ways and techniques used in steganography
and cryptography.

Chapter three outlines the method used to make secret messages more secure and undetectable by
any unauthorized person, providing confidentiality and integrity of the message. In the
multilayer method for securing the message, the first layer, AES, encrypts the plain text
message using symmetric encryption, ensuring confidentiality with a unique key generator; the
second layer is polyalphabetic substitution with Unicode compatibility, which adds a further
layer of security; the third layer, steganography, embeds the encrypted data into a cover
medium via LSB substitution.


Retrieval: The process is reversed. The bits are extracted, the ciphertext is reconstructed,
the substitution is reversed to recover the AES cypher text, and AES decryption retrieves the
plain text.

5.2 Conclusion

In conclusion, the method used differs significantly from the previous method used by Ko et
al., 2022. At a large embedding capacity of 589,824 bits, the stego images preserve high image
quality (PSNR around 45-46 dB). The low MSE shows little distortion in the embedded images,
which are not visibly different from the original images. This method appears effective across
various images, balancing capacity and imperceptibility. In addition to preserving high image
quality in the stego image, the layered encryption is intended to guard the embedded message
against potential weaknesses in specific AES key schedules.

5.3 Recommendation

The security assurance of this system relies on steganography and AES-GCM, while the Vigenère
cipher adds an additional layer of security to the encrypted data. A longer key should be used
for both AES and the polyalphabetic substitution to provide strong protection against
brute-force attacks. This approach provides a good balance between performance, security, and
resistance to attacks from intruders.

Applying AES-GCM before the Vigenère Cypher (full Unicode range) is considered to be the best
approach because:

i. AES-GCM provides strong encryption: AES-GCM is a proven, more secure encryption standard,
so for the best encryption performance it should be the primary layer of encryption.

ii. Vigenère adds obfuscation: Applying the Vigenère Cypher to the AES-GCM encrypted data adds
an additional layer of obfuscation, making it harder for attackers to analyze the encrypted
data.

iii. Unicode support: Applying the Vigenère Cypher with Unicode support to the AES-GCM
cyphertext increases the complexity of the resulting cyphertext, making it more difficult to
analyze.
