C++ Implementation of Neural Cryptography for Public Key
Exchange and Secure Message Encryption with Rijndael Cipher
Sagun Man Singh Shrestha
Department of Electronics and Computer Engineering,
Tribhuvan University – Kathmandu Engineering College, Nepal
Google: sagunms | github.com/sagunms/NeuroCrypto
Abstract: This work is a software implementation of the concept of neural cryptography, in which two tree parity machines communicate over a public channel to agree on a common key. This key is then used to encrypt a sensitive message with the Rijndael cipher before transmission over an insecure channel. Neural cryptography is a potential new source of public key cryptography schemes which are not based on number-theoretic functions and which have small time and memory complexities. This paper gives a brief introduction to artificial neural networks and to cryptography and its types, which helps explain why the two communicating terminals converge to a common key in neural cryptography, and also covers the Rijndael (AES) cipher. The paper is intended to show that such a neural key exchange protocol together with AES encryption can be practically implemented in a high-level programming language, viz. C++, and could be further extended into higher-level applications. Both CLI and GUI implementations of the software, created using Visual C++ (.NET framework), are presented.
Index Terms: Neural Networks, Tree Parity Machine, Mutual learning, Cryptography, Public Key,
Symmetric Key, Rijndael, Advanced Encryption Standard, Visual C++, .NET Framework
I. INTRODUCTION

To understand the underlying concepts of neural cryptography and Rijndael, one should first have a proper perspective on artificial neural networks and on cryptography and its types in general.

1.1 Artificial Neural Networks

Artificial Intelligence is the study of the computations that make it possible for computers to perceive, reason and act. An artificial neural network (ANN) is an information processing paradigm inspired by the structure and functional aspects of biological neural networks, such as the brain, to process information. The key element of this paradigm is a set of highly interconnected processing elements called artificial neurons, working in unison to solve specific problems, just as our own brains do - learning by example.

Taking inspiration from the working of biological neurons, the mathematical model of an artificial neuron (Fig. 1) can be formulated. Each input is multiplied by its corresponding random weight (constantly adjusted during learning), the products are summed up, and if the result exceeds the threshold value, the neuron fires an output which passes through an activation function. It is these weight values (synaptic strengths) and the threshold value that are continuously adjusted until the optimum solution is found.

Figure 1: Model of a single artificial neuron (input signals X1..XN are scaled by synaptic weights W1..WN, combined at a summing junction into the net input Vk, and passed through an activation function φ with threshold θk to produce the output Ok)

An artificial neural network (Fig. 2) is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase. It is configured for a specific application, such as pattern recognition or data classification, through a learning process. Just as learning in biological systems involves adjustments to the synaptic connections that exist between the neurons, the same is true of ANNs.

Figure 2: Model of an artificial neural network (inputs feeding an input layer, hidden layer and output layer of interconnected single neurons)

1.2 Cryptography

Cryptography, as derived from Greek kryptos, "secret", and gráphein, "writing", is the practice and study of hiding information. It intersects the disciplines of mathematics, computer science, and engineering. Applications include ATM cards, computer passwords, and e-commerce. Encryption is the process of converting ordinary information (plaintext) into unintelligible gibberish (ciphertext), while decryption is the reverse. A cipher is a pair of algorithms that perform the encryption and the reversing decryption. The detailed operation of a cipher is controlled both by the algorithm and, in each instance, by a secret parameter called the key. Keys are important, as ciphers without variable keys can be trivially broken with only the knowledge of the cipher used and are therefore useless. The modern field of cryptography can be divided into two broad areas - symmetric-key and public-key cryptography.

1.2.1 Symmetric-key cryptography

This was the only kind of encryption publicly known until June 1976. Its modern study relates mainly to block ciphers (e.g. DES, Rijndael) and stream ciphers (e.g. RC4) and to their applications. In this method, both the sender and the receiver use the same key for encryption and decryption of a message, though a message or group of messages may have a different key than others. A disadvantage is the key management necessary to use such ciphers securely: each distinct pair of communicants must share a different key. The number of keys required increases as the square of the number of network members, which demands complex key management schemes to keep them all secret.

1.2.2 Public-key (Asymmetric) cryptography

In a groundbreaking 1976 paper, Whitfield Diffie and Martin Hellman proposed the notion of public-key cryptography, in which two different but mathematically related keys are used - a public key and a private key. This was the first practical method for establishing a shared secret over an unprotected communications channel. The system is so constructed that calculating one key (the private key) from the other (the public key) is computationally infeasible. Both keys are generated secretly, as an interrelated pair, starting from an unpredictable large random number. The public key may be freely distributed and is typically used for encryption, while its paired private key must remain secret and is used for decryption. Unlike symmetric-key algorithms, this scheme does not require a secure initial exchange of one or more secret keys between sender and receiver. The critical advantage is that the communicating parties (traditionally Alice and Bob) never need to send copies of their private keys to each other: even if, say, Bob's private key became known to a third party, only secret messages sent to Bob would be compromised, while messages exchanged with other parties would still be secure.

2. NEURAL CRYPTOGRAPHY

Neural cryptography is a new branch of cryptography which incorporates neural networks into cryptography. It is dedicated to analyzing the application of stochastic algorithms, namely neural network algorithms, for use in either cryptanalysis or encryption. The first known work on this topic can be traced back to 1995, in an IT Master's thesis by Sebastien Dourlens. Because the field is so recent, there have not been many practical applications as of yet. However, like any nascent field, rapid research findings in neural cryptography are bringing new and exciting ideas. It could be used especially where keys are continually generated and the system (both the parties and the insecure medium) is in a continuously evolving mode. ANNs are well known for their ability to selectively explore the solution space of a given problem. The ideas of mutual learning, self-learning, and the stochastic behavior of ANNs can also be used for the realization of symmetric key exchange protocols, mutual synchronization, and generation of pseudo-random numbers. With these features, two neural networks can be used to encrypt and decrypt or exchange messages in a network, which can serve as a basis for public-key encryption.

2.1 Neural key exchange protocol

The most used protocol for key exchange between two terminals in practice is the Diffie-Hellman protocol. Neural key exchange, based on the synchronization of two tree parity machines (TPMs), each associated with one of the parties wishing to share a secret, is intended as a secure replacement for this method. Chaos theory studies the behavior of dynamical systems that are highly sensitive to initial conditions (aka the butterfly effect). Even small differences in initial conditions (like rounding errors in numerical computation) yield widely diverging outcomes for chaotic systems, rendering long-term prediction impossible. The synchronization of the two TPMs at A and B in neural cryptography therefore bears a striking similarity to the synchronization of two chaotic oscillators in chaos communications.

2.2 Tree Parity Machines

The tree parity machine (Fig. 3) is a special type of multi-layer feed-forward neural network. It consists of one output neuron, K hidden neurons and K * N input neurons. Inputs to the network are binary:

    x_ij ∈ {-1, +1}

Weights between input and hidden neurons take the values:

    w_ij ∈ {-L, ..., 0, ..., +L}

The output of each hidden neuron is calculated as the signum of the sum of all products of its input neurons and weights:

    σ_i = sgn( Σ_{j=1..N} w_ij · x_ij )

Signum is a simple function which returns -1, 0 or 1:

    sgn(x) = -1 if x < 0;  0 if x = 0;  +1 if x > 0

If the scalar product is 0, the output of the hidden neuron is mapped to -1 to ensure a binary output value. The output of the neural network is then computed as the product of the values produced by the hidden neurons:

    τ = Π_{i=1..K} σ_i

Figure 3: Structure of Tree Parity Machine (binary inputs X11..XKN feed K hidden neurons through weights W11..WKN; the hidden outputs are multiplied to give the network output)

Summarizing, the output of the TPM is binary and can be written as:

    τ = Π_{i=1..K} sgn( Σ_{j=1..N} w_ij · x_ij )
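The hidden-unit and output formulas above can be sketched in a few lines of standard C++ (a minimal illustration with std::vector, not the NeuroCrypto source itself, which appears in Section 2.5):

```cpp
#include <cassert>
#include <vector>

// sgn with the TPM convention: a zero scalar product maps to -1,
// so every hidden output is strictly binary.
int sgn(int x) { return x > 0 ? 1 : -1; }

// tau = product over the K hidden units of sgn(sum_j w[i][j] * x[i][j]).
// sigma receives the K hidden outputs as a side effect.
int tpmOutput(const std::vector<std::vector<int>>& w,
              const std::vector<std::vector<int>>& x,
              std::vector<int>& sigma) {
    int tau = 1;
    for (std::size_t i = 0; i < w.size(); ++i) {
        int sum = 0;
        for (std::size_t j = 0; j < w[i].size(); ++j)
            sum += w[i][j] * x[i][j];       // scalar product w_i . x_i
        sigma[i] = sgn(sum);                // hidden neuron output
        tau *= sigma[i];                    // parity of hidden outputs
    }
    return tau;
}
```

For example, with K = 2 hidden units of N = 3 inputs each, toy weights {{1, -2, 3}, {-1, 0, 2}} and inputs {{1, 1, -1}, {-1, 1, 1}} give hidden outputs σ = (-1, +1) and therefore τ = -1.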
2.3 Feasibility and Security

The dynamics of the two networks and their weight vectors exhibit a novel phenomenon: the networks synchronize to a state with identical time-dependent weights. This concept of fast synchronization by mutual learning can be applied to a secret key exchange protocol over a public channel, and the generated key can be used for encryption and decryption of a given message. The algorithm does not operate on large numbers or on methods from number theory, and therefore leads to fast synchronization of the public key. The security of neural cryptography is still being debated, but since the method is based on a stochastic process, there is a small chance that an attacker synchronizes to the key as well. It has been found that the model parameter L determines the security of the system.

2.4 Synchronization Algorithm

Each terminal A and B uses its own tree parity machine. Synchronization of the tree parity machines is achieved in the following steps:

1. Initialize random weight values.
2. Execute these steps until full synchronization is achieved:
   1. Generate a random input vector X.
   2. Compute the values of the hidden neurons.
   3. Compute the value of the output neuron.
   4. Compare the output values of both tree parity machines:
      1. If the outputs differ: go to 2.1.
      2. If the outputs are the same: one of the suitable learning rules is applied to the weights.

The flowchart of NeuroCrypto shown in Fig. 4 is the high-level programming language representation of neural cryptography. After full synchronization is achieved (the weights w_ij of both tree parity machines are the same), A and B can finally use their weights as keys. This method is known as bidirectional learning.

For synchronization, the weights of the TPMs have to be constantly checked for equality and then updated. The weights are updated only if the output values of the two TPMs are equal. One of the following learning rules can be used for the synchronization:

Hebbian learning rule:

    w_ij⁺ = g( w_ij + x_ij · τ · Θ(σ_i, τ) · Θ(τ_A, τ_B) )

Anti-Hebbian learning rule:

    w_ij⁺ = g( w_ij - x_ij · τ · Θ(σ_i, τ) · Θ(τ_A, τ_B) )

Random walk:

    w_ij⁺ = g( w_ij + x_ij · Θ(σ_i, τ) · Θ(τ_A, τ_B) )

Here Θ plays the role of a step function over a pair of values: Θ(a, b) = 0 if a ≠ b, else Θ(a, b) = 1. A weight therefore moves only when the corresponding hidden output agrees with the network output and both networks' outputs agree. The function g(...) keeps the weights in the range {-L, ..., +L}. x is the input vector and w is the weight vector. After the machines are synchronized their weights are equal, and we can use them for constructing a shared key. There are various ways in which this algorithm could conceivably be attacked; however, the chances of such an attack succeeding are very small.

2.5 C++ Implementation (NeuroCrypto CLI)

2.5.1 InputVector class structure

class TPMInputVector {
public:
    DynamicArray <int> X;
    void CreateRandomVector (int K, int N);
    void xLength (int K, int N);
};

This class dynamically allocates and updates the randomized input vector X, which is constantly accessed by the NeuroCrypto class for updating the weights during synchronization.

The input vector X is used throughout the synchronization process to drive the weight updates of the neural network. The two functions of the TPMInputVector class are CreateRandomVector and xLength. CreateRandomVector assigns random signed bits (-1 or 1) to all the K*N input neurons, while xLength is used to allocate the K*N long dynamic array X that stores the input vector.
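The two member functions are only declared above; one possible definition, sketched here with std::vector standing in for the project's DynamicArray type, assigns each of the K*N inputs a random sign:

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Sketch of TPMInputVector with std::vector in place of the project's
// DynamicArray; the definitions below are illustrative, not the original.
struct TPMInputVector {
    std::vector<int> X;

    // Allocate the K*N-long input vector.
    void xLength(int K, int N) { X.assign(K * N, 0); }

    // Fill X with random signed bits, -1 or +1, one per input neuron.
    void CreateRandomVector(int K, int N) {
        xLength(K, N);
        for (int i = 0; i < K * N; ++i)
            X[i] = (std::rand() % 2) ? 1 : -1;
    }
};
```

In the protocol both parties must see the same X each iteration, so in a real exchange this vector would be drawn from a shared source (or sent over the public channel) rather than generated independently.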
2.5.2 TreeParityMachine class structure

class TreeParityMachine {
public:
    DynamicArray <int> W, H;
    int K, N, L;
    int TPMOutput;
    void Initialize ();
    void ComputeTPMResult (const DynamicArray <int> &X);
    void UpdateWeight (const DynamicArray <int> &X);
    void RandomWeight ();
};

This is the TPM class, based on the mathematics and algorithm covered in Sections 2.2 and 2.4. It is composed of weight and hidden intermediate DynamicArray objects, the K, N and L parameters that characterize a TPM, and a TPM output variable. Functions for initialization, weight updating and result computation are defined.

2.5.3 Tree Parity Machine class definitions

void TreeParityMachine::ComputeTPMResult (const DynamicArray <int> &X) {
    int i, j, sum; TPMOutput = 1;
    for (i = 0; i < K; i++) {
        sum = 0;
        for (j = 0; j < N; j++)
            sum += (W.Z[i*N+j] * X.Z[i*N+j]);
        H.Z[i] = Signum (sum);          // hidden neuron output
        TPMOutput *= Signum (sum);      // network output = product
    }
}
void TreeParityMachine::Initialize () {
    W.length (K * N); H.length (K);
}
void TreeParityMachine::RandomWeight () {
    int i;
    for (i = 0; i < K*N; i++)
        W.Z[i] = L - (rand () % (2*L + 1));   // random value in {-L..+L}
}
void TreeParityMachine::UpdateWeight (const DynamicArray <int> &X) {
    int i, j, newW;
    for (i = 0; i < K; i++) {
        for (j = 0; j < N; j++) {
            newW = W.Z[i * N + j];
            newW += X.Z[i * N + j] * TPMOutput
                  * IsEqual (TPMOutput, H.Z[i]);  // Hebbian rule; the
                  // Theta(tauA, tauB) factor is enforced by the caller,
                  // which updates only when both outputs are equal
            if (newW > L) newW = L;               // g(): clamp to {-L..+L}
            if (newW < -L) newW = -L;
            W.Z[i * N + j] = newW;
        }
    }
}

Figure 4: Overall program flowchart of the high-level language implementation of neural cryptography. Terminal A (Alice) and Terminal B (Bob) each get the common TPM parameters (K, L, N) and initialize random weights; as a common process over the public channel, a random input vector is generated and shared each iteration; each terminal computes its output vector, and if the two outputs are equal the weights in both ANNs are updated, the loop repeating until full synchronization (bounded in the flowchart by i ≥ K·N·ln(2L+1)).

2.5.4 NeuroCrypto class structure

class NeuroCrypto {
public:
    unsigned int itrMax;
    TreeParityMachine A, B;
    TPMInputVector objInput;
    char publickey[100];
    //Default Constructor
    NeuroCrypto (int k, int n, int l);
};

This is the superset class of the NeuroCrypto program module. It consists of two TreeParityMachine objects A and B for Alice's and Bob's TPMs, a TPMInputVector object objInput, and a character array to store the final public key. The default constructor takes the common TPM parameters K, N and L entered by the user at runtime.
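The update performed inside UpdateWeight above is the Hebbian rule gated by Θ and clamped by g. Pulled out as a free function over plain arrays (a sketch, again with std::vector in place of DynamicArray), the step for a single hidden unit reads:

```cpp
#include <cassert>
#include <vector>

// Theta(a, b): 1 if the two values agree, 0 otherwise.
int IsEqual(int a, int b) { return a == b ? 1 : 0; }

// g(): clamp a weight back into the range {-L, ..., +L}.
int Clamp(int w, int L) { return w > L ? L : (w < -L ? -L : w); }

// One Hebbian step for a single hidden unit:
//   w_j <- g( w_j + x_j * tau * Theta(sigma, tau) )
// The Theta(tauA, tauB) factor is assumed handled by the caller.
void HebbianStep(std::vector<int>& w, const std::vector<int>& x,
                 int sigma, int tau, int L) {
    int gate = IsEqual(sigma, tau);   // update only if this unit agreed
    for (std::size_t j = 0; j < w.size(); ++j)
        w[j] = Clamp(w[j] + x[j] * tau * gate, L);
}
```

For example, with L = 2, w = {2, -1}, x = {1, -1} and σ = τ = 1, the first weight would move to 3 but is clamped back to 2, while the second moves to -2; if σ ≠ τ the gate is 0 and the weights stay put.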
2.5.5 Global Function Declarations

int IsEqual (int A, int B);
int RandomBit ();
int Signum (double r);

Global functions for checking equality, generating a random bit (either 1 or -1) and the signum are declared.

2.5.6 Variable Declarations and Initialization

int i, ii, ss, itrMax, j, K = 0, sum,
    key_size, key_length, initK, initN, initL;
TreeParityMachine A, B;
TPMInputVector objInput;
DynamicArray <char> publickey;
const char Dictionary[38] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ_0123456789";
srand (time (NULL));                    // seed random generator
cout << "Parameter settings (K, N, L)";
cin >> initK >> initN >> initL;
A.K = initK, A.N = initN, A.L = initL;  // Init A
A.Initialize (); A.RandomWeight ();
B.K = initK, B.N = initN, B.L = initL;  // Init B
B.Initialize (); B.RandomWeight ();
itrMax = (A.L*A.L*A.L*A.L) * A.N * A.K; // L^4 * N * K
cout << "Maximum Iterations: " << itrMax;
objInput.xLength (B.K, B.N);
cout << "Synchronizing TPM Networks...";

A and B (the TPMs of Alice and Bob) are instances of the TreeParityMachine class. The publickey object of the DynamicArray class stores the final public key after successful synchronization. The Dictionary array stores 38 symbols (26 letters, 10 digits, and an underscore) as a template for key generation. The system time is used as a seed for the random number generator. The program takes the parameters K, N and L from the user and initializes both TPMs with these common parameters.

2.5.7 Main Iteration

for (i = 1; i != itrMax; i++) {
    objInput.CreateRandomVector (B.K, B.N);
    A.ComputeTPMResult (objInput.X);
    B.ComputeTPMResult (objInput.X);
    if (A.TPMOutput == B.TPMOutput) {
        A.UpdateWeight (objInput.X);
        B.UpdateWeight (objInput.X);
        sum = 0;
        for (ss = 0; ss < A.K*A.N; ss++)   // distance between weights
            sum += abs (A.W.Z[ss] - B.W.Z[ss]);
        if (sum == 0) break;               // fully synchronized
    }
}
if (sum == 0) cout << "Status: SUCCESS!";
else cout << "Status: FAILED!";

For each iteration, a random input vector is produced (using the CreateRandomVector function of the TPMInputVector class), the output value of Alice's TPM is computed (using the ComputeTPMResult function), and this output value is made available to Bob's TPM B. Bob's TPM follows the same iterative procedure. During synchronization, both parties A and B continually check for the equality of their TPMOutput values.

2.5.8 Key Generation and Sync Outputs

cout << "Iterations:" << i << "DataExchanged:"
     << (i*(A.K*A.N+4)/1024) << " KiB";
key_size = 37 / (A.L * 2 + 1);
key_length = A.K * A.N / key_size;
cout << "Key length: " << key_length;
publickey.length (key_length + 1);
for (i = 0; i < key_length; i++)
    publickey.Z[i] = 0;
for (i = 1; i < key_length + 1; i++) {
    K = 1;
    for (j = (i-1)*key_size; j < i*key_size; j++)
        K = K + A.W.Z[j] + A.L;            // shift each weight into 0..2L
    publickey.Z[i-1] = Dictionary[K];
}
publickey.Z[key_length] = '\0';            // null terminator
cout << "Public Key: " << publickey.Z;

The number of iterations that were actually required to reach matching weights is recorded in i. For each iteration, the data transferred is K*N+4 bytes, so the whole synchronization exchanges i*(K*N+4)/1024 kilobytes. The length of the public key is calculated as the total number of neurons of the TPM divided by the key size, where the key size in turn depends on the number of symbols in the dictionary and on the depth of the weights, which lie within the range {-L, ..., +L}; this explains the formula L*2+1 (to include the origin). To generate a public key of key_length characters, the ith character of publickey is assigned the Kth symbol of the dictionary, where the index K is accumulated from the key_size weights in that character's slice of the weight vector. This nested loop produces a seemingly random public key which actually depends on the weights, the dictionary size, and the TPM parameters (through key_size).
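Putting Sections 2.5.3, 2.5.7 and 2.5.8 together, the whole exchange can be re-sketched in portable standard C++ (std::vector and std::mt19937 replacing the project's DynamicArray and rand(); this mirrors the structure of NeuroCrypto, not its exact source):

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <string>
#include <vector>

struct TPM {
    int K, N, L;
    std::vector<int> W, H;   // weights (K*N) and hidden outputs (K)
    int tau = 1;

    TPM(int k, int n, int l, std::mt19937& rng)
        : K(k), N(n), L(l), W(k * n), H(k) {
        std::uniform_int_distribution<int> d(-L, L);
        for (int& w : W) w = d(rng);              // random initial weights
    }
    static int sgn(int x) { return x > 0 ? 1 : -1; }

    void Compute(const std::vector<int>& X) {
        tau = 1;
        for (int i = 0; i < K; ++i) {
            int sum = 0;
            for (int j = 0; j < N; ++j) sum += W[i*N + j] * X[i*N + j];
            H[i] = sgn(sum);
            tau *= H[i];
        }
    }
    void Update(const std::vector<int>& X) {      // Hebbian, clamped to +/-L
        for (int i = 0; i < K; ++i)
            if (H[i] == tau)
                for (int j = 0; j < N; ++j) {
                    int w = W[i*N + j] + X[i*N + j] * tau;
                    W[i*N + j] = std::max(-L, std::min(L, w));
                }
    }
};

// Synchronize two TPMs over a simulated public channel; returns the number
// of shared inputs used, or -1 if the cap is hit first.
int Synchronize(TPM& A, TPM& B, std::mt19937& rng, int itrMax) {
    std::uniform_int_distribution<int> bit(0, 1);
    std::vector<int> X(A.K * A.N);
    for (int i = 1; i < itrMax; ++i) {
        for (int& x : X) x = bit(rng) ? 1 : -1;   // shared random input
        A.Compute(X); B.Compute(X);
        if (A.tau == B.tau) { A.Update(X); B.Update(X); }
        if (A.W == B.W) return i;                 // weights identical
    }
    return -1;
}

// Map the shared weights onto printable key symbols, as in Section 2.5.8.
std::string DeriveKey(const TPM& A) {
    const std::string Dict = "ABCDEFGHIJKLMNOPQRSTUVWXYZ_0123456789";
    int key_size = 37 / (2 * A.L + 1);
    int key_length = A.K * A.N / key_size;
    std::string key;
    for (int i = 0; i < key_length; ++i) {
        int k = 1;
        for (int j = i * key_size; j < (i + 1) * key_size; ++j)
            k += A.W[j] + A.L;                    // each term in 0..2L
        key += Dict[k];
    }
    return key;
}
```

With small parameters such as K = 3, N = 12, L = 3, the two machines typically synchronize within a few hundred to a few thousand shared inputs, after which DeriveKey yields the same string on both sides.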
3. RIJNDAEL (AES)

Advanced Encryption Standard (AES) is a symmetric-key encryption standard adopted by the U.S. government. The standard comprises three block ciphers, AES-128, AES-192 and AES-256, adopted from a larger collection originally published as Rijndael (named after the two Belgian cryptographers Vincent Rijmen and Joan Daemen). The AES ciphers have been analyzed extensively and are now used worldwide, as was the case with its predecessor, DES.

AES has a fixed block size of 128 bits and a key size of 128, 192 or 256 bits, whereas Rijndael can be specified with block and key sizes in any multiple of 32 bits, with a minimum of 128 bits. The block size has a maximum of 256 bits, but the key size has no theoretical maximum. AES operates on a 4×4 array of bytes, termed the state (versions of Rijndael with a larger block size have additional columns in the state).

The cipher is specified as a number of repetitions of transformation rounds that convert the input plaintext into the final output ciphertext. Each round consists of several processing steps, including one that depends on the encryption key - here, the public key obtained from the synchronization of the two TPMs during the neural cryptography stage. A set of reverse rounds is applied to transform the ciphertext back into the original plaintext using the same encryption key.

3.1 High Level Algorithm

1. KeyExpansion: round keys are derived from the cipher key using Rijndael's key schedule.
2. Initial Round
   1. AddRoundKey: each byte of the state is combined with the round key using bitwise XOR.
3. Rounds
   1. SubBytes: a non-linear substitution step where each byte is replaced with another according to a lookup table.
   2. ShiftRows: a transposition step where each row of the state is shifted cyclically a certain number of steps.
   3. MixColumns: a mixing operation which operates on the columns of the state, combining the four bytes in each column.
   4. AddRoundKey
4. Final Round (no MixColumns)
   1. SubBytes
   2. ShiftRows
   3. AddRoundKey

3.2 Security and Feasibility

Until May 2009, the only successful published attacks against the full AES were side-channel attacks on some specific implementations. The National Security Agency (NSA) reviewed Rijndael and stated that it was secure enough for U.S. Government non-classified data, and that the design and strength of all key lengths of AES (128, 192 and 256) are sufficient to protect classified information up to the 'Secret' level; 'Top Secret' requires use of either the 192 or 256 bit key lengths. Thus, using Rijndael to encrypt sensitive messages under the synchronized public key, as done in this program, can provide a very strong level of security. The demonstration software NeuroCrypto uses the CryptoStream class of the System::Security::Cryptography namespace of the Visual Studio .NET library to implement the Rijndael cipher with a 128-bit key in ECB mode. Since this class can be configured to use any variant of Rijndael (including AES), the more flexible term Rijndael is used.

4. CONCLUSION

The difficulty of securely establishing a secret key between two communicating parties, when a secure channel does not already exist between them, is a considerable obstacle to using only a symmetric cipher such as Rijndael. Neural cryptography has several attractive features, such as fast synchronization of a common key and its striking similarity to the synchronization of two chaotic oscillators, which leaves an eavesdropper only a very small chance of synchronizing its TPM with the two TPMs of the communicating parties. Taking these facts into consideration, a hybrid cascaded system consisting of both neural cryptography and the Rijndael cipher was created using Visual C++ .NET, in both CLI and GUI versions, as shown in the screenshots (Section 7). Exploiting the advantages of both schemes, this could be a promising architecture for a stronger and more secure cryptography system for various applications of data security over insecure channels.
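Of the steps listed above, AddRoundKey (step 2.1) is the only one that mixes in the key, and because XOR is its own inverse the very same operation undoes it during decryption. A minimal sketch of this one step over the 16-byte AES state (illustrative only - the actual program delegates all rounds to the .NET CryptoStream class):

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// The 4x4 AES state, flattened to 16 bytes.
using State = std::array<std::uint8_t, 16>;

// AddRoundKey: XOR every state byte with the matching round-key byte.
void AddRoundKey(State& s, const State& roundKey) {
    for (std::size_t i = 0; i < s.size(); ++i)
        s[i] ^= roundKey[i];
}
```

Applying AddRoundKey twice with the same round key restores the original state, which is exactly how the reverse rounds cancel this step.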
5. FUTURE SCOPE

This project, NeuroCrypto, is a proof-of-concept demonstration showing that a strong cryptographic system using neural cryptography cascaded with the Rijndael cipher can be implemented in a high-level language. It still has room for a lot of improvement. It is released as open source under the Apache License v2 at github.com, so anyone can easily integrate it into higher-level software to share sensitive data over the internet or any insecure channel, provided the channel permits rapid synchronization of both TPMs. Good performance (high speed and low RAM requirements) was an explicit goal of the AES selection process, so AES performs well on a wide variety of hardware, from 8-bit microcontrollers to high-performance computers. This software could therefore also be optimized for embedded platforms in applications such as Ethernet-based or wireless systems. The project can thus be expanded to cover multiple cryptography domains.
6. REFERENCES
[1] N. Prabakaran, P. Saravanan, and P.
Vivekanandan, 2008. “A New Technique on
Neural Cryptography with Securing of Electronic
Medical Records in Telemedicine System”,
International Journal of Soft Computing 3 (5):
390-396, Medwell Journals, 2008.
[2] Andreas Ruttor, Ph. D. Thesis, “Neural
Synchronization and Cryptography”, Bayerische
Julius-Maximilians-Universität Würzburg, 2006
[3] CyberTrone, 2009. http://codeproject.com/Articles/39067/Neural-Cryptography
[4] E. Klein, R. Mislovaty, I. Kanter, W. Kinzel;
“Synchronization of neural networks by mutual
learning and its application to cryptography”.
[5] A. Klimov, A. Mityagin, A. Shamir; “Analysis
    of Neural Cryptography”, Advances in
    Cryptology, ASIACRYPT 2002.
[6] P. Revankar, W. Z. Gandhare and D. Rathod,
    2010. “Private Inputs to Tree Parity Machine”,
    International Journal of Computer Theory and
    Engineering, Vol. 2, No. 4, Aug 2010, pp. 665-669.
[7] Neural Cryptography, Cryptography, Neural
Networks, Chaos Theory, Advanced Encryption
Standard. http://en.wikipedia.org
[8] S. Trenholme, 2005. “AES Galois field”,
“Rijndael’s key schedule”, “Mix column stage”,
“AES encryption”. http://www.samiam.org

7. SCREENSHOTS (NeuroCrypto CLI/GUI)

[Screenshots of the NeuroCrypto CLI and GUI implementations appear here in the original document.]