International Journal of Scientific Research and Management (IJSRM)

||Volume||12||Issue||09||Pages||1422-1427||2024||
Website: [Link] ISSN (e): 2321-3418
DOI: 10.18535/ijsrm/v12i09.ec01

Text Compression Using the Shannon-Fano, Huffman, and Half-Byte Algorithms
Eko Priyono¹, Hindayati Mustafidah²*
¹,² Informatics Engineering, Universitas Muhammadiyah Purwokerto, Indonesia

Abstract
Background and Objectives: File sizes increase as technology advances. Large files require more
storage and longer transfer times. Data compression converts an input (original) data stream into
an output (compressed) data stream that is smaller in size. Existing compression techniques include
the Huffman, Shannon-Fano, and Half-Byte algorithms. Like other algorithms in computer science, each
of the three has advantages and disadvantages. Testing is therefore needed to determine which
algorithm is most effective for data compression, especially of text data.
Methods: The Huffman, Shannon-Fano, and Half-Byte algorithms were applied to text data to test their
effectiveness in compression. The sample consists of text files containing abstracts of research
articles published in scientific publications, randomly selected from 100 journals. The abstracts
are in Indonesian.
Results: In testing, the Huffman algorithm outperforms the Shannon-Fano and Half-Byte algorithms in
terms of compression ratio. The Half-Byte algorithm has the lowest compression ratio of the three.
Half-Byte compression relies on at least seven consecutive characters sharing the same first four
bits, whereas the Huffman and Shannon-Fano algorithms rely on how often each character appears. The
Huffman method can be considered for compressing Indonesian-language text data, given its average
compression ratio of 46.05%, compared with 40.36% for Shannon-Fano and 5.04% for Half-Byte.

Keywords: compression ratio, text data, effectiveness in compressing

1. Introduction
File sizes increase as technology advances. Larger files need more storage and take longer to
transmit, and not everyone has ample storage capacity or a high-speed internet connection for file
transfers. One way to address this issue is to reduce file size through data compression.
Data compression is a technique that converts an input data stream, the original data, into another
data stream, the output or compressed data, which is smaller in size (Salomon, 2007). Existing
compression techniques include the Huffman algorithm, Lempel Ziv Storer Szymanski (LZSS),
Shannon-Fano, Half-Byte, Lempel Ziv Welch (LZW), and others.
Several studies have implemented text compression techniques, including (Mizwar et al., 2017), which
implemented the J-Bit Encoding Algorithm, and (Irliansyah et al., 2017), which implemented the
Deflate method and the Goldbach Codes Algorithm. Two years later, (Darnita et al., 2019) applied the
Sequitur algorithm to compress text data. In addition, (Saragih and Utomo, 2020) used the Prefix
Code Algorithm to compress text data, (Rizky et al., 2020) used the Elias Delta Codes Algorithm, and
(Simanjuntak, 2020) compared the Elias Delta Code and Levenstein algorithms for compressing text
files.
The Huffman, Shannon-Fano, and Half-Byte algorithms investigated in this study each have advantages
and disadvantages, as previous studies have shown (Siahaan, 2016) and (Puspabhuana, 2016). Other
research that used the Huffman method includes (Supiyandi and Frida, 2018), (Widatama and Saputro,
2019), (Mahmoudi and Zare, 2020), and (Pujianto et al., 2020). Tests are therefore required to
determine which technique is best for text data compression. The efficacy of a compression technique
can be assessed by comparing file sizes before and after compression. It can also be measured by the
algorithm's processing time (Sayood, 2017).

Eko Priyono, IJSRM Volume 12 Issue 09 September 2024 EC-2024-1422

2. Method
The research was conducted using a qualitative and experimental approach. The experiment involves
compressing text data with compression programs written in the Python programming language. The
experimental data are then analyzed to determine the best-performing algorithm, following the stages
outlined in Figure 1.
The research data are Indonesian-language abstracts taken from journal articles selected at random
from 100 sources. This text data is then saved in *.txt format.
The three text data compression techniques, Huffman, Shannon-Fano, and Half-Byte, are implemented
next. Each algorithm is implemented as a Python program, written with the help of the Visual Studio
Code editor. After all of the algorithms have been implemented, the files are compressed one at a
time. The size of each file is recorded before it is compressed.
[Flowchart: Start, Data collection, Algorithm implementation, Testing, Algorithm comparison, End]

Figure 1: Research stages

Compressing each text file yields a compressed output whose size is measured in bits. The
compression ratio of each compressed file is then calculated, and the ratios are compared to see
which algorithm is the most effective.
In this study, the three methods were compared based on their average compression ratio. The higher
the compression ratio, the more efficient the algorithm is at compressing data. The algorithm with
the highest average compression ratio is considered the most efficient, or optimal.
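The compression ratio calculation is shown in (1).
$$\text{Compression ratio} = \left(1 - \frac{\text{compressed size}}{\text{original size}}\right) \times 100\% \qquad (1)$$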
The ratio is the result of compression and is used to assess the performance of a compression method
(Supiyandi and Frida, 2018).
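As a small illustration, the compression ratio can be computed as in the sketch below. It assumes
the ratio is the percentage reduction in size, which matches the figures reported in Table 2; the
function name compression_ratio is chosen for this example and is not taken from the authors'
program.

```python
def compression_ratio(original_bits: int, compressed_bits: int) -> float:
    """Return the percentage by which the compressed size is smaller than the original."""
    if original_bits <= 0:
        raise ValueError("original size must be positive")
    return (1 - compressed_bits / original_bits) * 100


# Sizes in bits for the first file listed in Table 2.
print(f"{compression_ratio(6792, 3705):.2f}%")  # 45.45%
print(f"{compression_ratio(6792, 4090):.2f}%")  # 39.78%
print(f"{compression_ratio(6792, 6444):.2f}%")  # 5.12%
```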

3. Results and Discussion

a. Research Data
The data in this study are abstract texts from scientific articles. The sample consists of 100
Indonesian-language text files. A single language, Indonesian, was chosen to keep the data
parameters uniform. Abstracts were chosen because they are written as sentences with a range of
characters but follow a consistent structure. The diversity of characters affects the compression
process, while the uniform structure of abstracts provides a consistent basis for comparison. In
addition, text in the form of sentences is closer to how text is used in everyday life than
sequences of repeated letters such as "AAAAABAAACCD".
The abstracts were retrieved from journal articles in PDF form and had to be converted into TXT
format. The size of each text file is recorded before compression so that it can later be compared
with the compression results. Figure 2 shows an example of abstract text transferred into a text
file.

b. Compression Process
The compression process is carried out with Python source code that implements the three compression
algorithms: Huffman, Shannon-Fano, and Half-Byte. The first step of the Huffman algorithm is to
determine the frequency, or probability, with which each character appears. For example, the
character frequencies for the word "barang" (Indonesian for "goods") are a:2, b:1, r:1, n:1, and
g:1. From these frequencies, a binary tree known as a Huffman tree is built. The source code then
uses the Huffman tree to encode the data in bit form, with the aim of reducing file size.
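The Huffman steps described above can be sketched in Python as follows. This is a minimal
illustration, not the authors' program: the function names huffman_codes and encode, the use of the
heapq module, and the way ties between equal frequencies are broken are all assumptions made for
this example.

```python
import heapq
from collections import Counter


def huffman_codes(text: str) -> dict:
    """Build a prefix code by repeatedly merging the two least frequent nodes."""
    freq = Counter(text)
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code built so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Prepend 0 to codes in the left subtree and 1 to codes in the right subtree.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(text: str, codes: dict) -> str:
    """Concatenate the code of every character into one bit string."""
    return "".join(codes[ch] for ch in text)


codes = huffman_codes("barang")
print(len(encode("barang", codes)))  # 14
```

For the word "barang", the encoded bit string is 14 bits long, against 48 bits at 8 bits per
character. The individual codes depend on how ties are broken, but the total length does not.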
The Shannon-Fano algorithm works similarly to Huffman. It builds a binary tree to obtain a binary
code for each character and then encodes the text with those codes. The two algorithms differ in how
the tree is built: Shannon-Fano builds it from the top down, by repeatedly splitting the sorted list
of symbols, whereas Huffman builds it from the bottom up, by repeatedly merging the two least
frequent nodes.

Figure 2: Example of a text file in txt format

The Shannon-Fano method is implemented in stages, beginning with counting character occurrences and
sorting the characters by frequency. The sorted sequence is then divided into two parts to form a
binary tree, and the division is repeated on each part until every character has its own code. Once
all of the characters have a binary representation, the text is encoded using the new codes.
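The stages above can be sketched as follows. This is an illustrative sketch only: the name
shannon_fano_codes is hypothetical, and the rule for choosing the split point (making the total
frequencies of the two parts as close as possible) is the usual textbook choice, assumed here
because the text does not state which rule the authors' program uses.

```python
from collections import Counter


def shannon_fano_codes(text: str) -> dict:
    """Assign codes by recursively splitting the frequency-sorted symbol list in two."""
    freq = Counter(text)
    # Sort by descending frequency; break ties alphabetically so the result is deterministic.
    symbols = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
    codes = {sym: "" for sym, _ in symbols}
    if len(symbols) == 1:
        codes[symbols[0][0]] = "0"
        return codes

    def split(group):
        if len(group) < 2:
            return
        total = sum(w for _, w in group)
        running, best_i, best_diff = 0, 1, None
        # Find the cut where the two parts have the most similar total frequency.
        for i in range(1, len(group)):
            running += group[i - 1][1]
            diff = abs(total - 2 * running)
            if best_diff is None or diff < best_diff:
                best_i, best_diff = i, diff
        left, right = group[:best_i], group[best_i:]
        for sym, _ in left:
            codes[sym] += "0"
        for sym, _ in right:
            codes[sym] += "1"
        split(left)
        split(right)

    split(symbols)
    return codes


print(shannon_fano_codes("barang"))
# {'a': '00', 'b': '01', 'g': '10', 'n': '110', 'r': '111'}
```

For "barang" this also gives 14 bits in total, the same as Huffman. On longer texts the
Shannon-Fano code can be longer than the Huffman code, which is consistent with the ratios reported
in this study.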
The Half-Byte algorithm exploits consecutive characters that share the same left four bits. For
example, in the word "yaaaaaaay", every letter a has the same left four bits, 0110, as seen in
Table 1. When seven or more consecutive characters with the same left four bits are encountered,
the algorithm writes a marker byte, then the first character of the run, followed by the right four
bits of the subsequent characters packed in pairs, and finally a closing marker byte.

Table 1: Binary form of the word "yaaaaaaay" before and after compression

Before compression          After compression
Character  Binary           Character  Binary
y          0111 1001        y          0111 1001
a          0110 0001        marker     1111 1110
a          0110 0001        aa         0001 0001
a          0110 0001        aa         0001 0001
a          0110 0001        aa         0001 0001
a          0110 0001        marker     1111 1110
a          0110 0001        y          0111 1001
y          0111 1001

The first step in implementing the Half-Byte technique is to convert the input text to binary. Each
byte is then split into its left (leading) four bits and its right (trailing) four bits. Next, the
left four bits of consecutive characters are compared. Where they match over a long enough run, the
repeated left bits are removed and a marker is placed at the start and end of the run. The result is
then written out as a new compressed file.
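One possible reading of this procedure is sketched below. It follows the prose description (marker,
first character in full, right four bits packed in pairs, closing marker), so its byte layout is not
guaranteed to match Table 1 row for row. The name half_byte_compress, the zero padding for an odd
number of remaining characters, and the absence of any escaping for literal marker bytes in the
input are simplifying assumptions made for this example.

```python
MARKER = 0xFE   # marker byte 1111 1110, as shown in Table 1
MIN_RUN = 7     # minimum run length stated in the text


def half_byte_compress(data: bytes) -> bytes:
    """Pack runs of bytes that share the same left four bits (illustrative sketch)."""
    out = bytearray()
    i = 0
    while i < len(data):
        # Extend the run while the left (high) four bits stay the same.
        j = i + 1
        while j < len(data) and (data[j] >> 4) == (data[i] >> 4):
            j += 1
        run = data[i:j]
        if len(run) >= MIN_RUN:
            out.append(MARKER)
            out.append(run[0])  # first character of the run is kept in full
            rest = [b & 0x0F for b in run[1:]]  # right four bits of the others
            if len(rest) % 2:
                rest.append(0)  # assumption: pad an odd count with a zero half-byte
            for k in range(0, len(rest), 2):
                out.append((rest[k] << 4) | rest[k + 1])
            out.append(MARKER)
        else:
            out.extend(run)  # short runs are copied unchanged
        i = j
    return bytes(out)


packed = half_byte_compress(b"yaaaaaaay")
print(len(b"yaaaaaaay"), len(packed))  # 9 8
```

Under these assumptions, the nine-byte input shrinks by only one byte, which illustrates why the
gain is small unless long runs are common.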

c. Testing
Tests were conducted on 100 sample text files, recording the compressed size and the compression
ratio for each. The three algorithms were tested by running a Python program. Running the program
produces the following output: the original uncompressed text, the symbols or letters in the text,
the probability of each symbol, the size before and after compression, and the bit representation
after compression.
Figure 3 shows an example of a 6792-bit txt file used to test the Huffman method. The compressed
size was 3705 bits, a compression ratio of 45.45%. The Shannon-Fano (Figure 4) and Half-Byte
(Figure 5) algorithms were tested in the same way. On the same text file, the Shannon-Fano algorithm
produced a compressed size of 4090 bits, a compression ratio of 39.78%, and the Half-Byte algorithm
produced 6444 bits, a compression ratio of 5.12%. The compression ratio is used to determine each
algorithm's effectiveness: the higher the compression ratio, the more effective the algorithm.

Figure 3: Huffman algorithm’s data compression

Figure 4: Shannon-Fano algorithm’s data compression

Figure 5: Half-Byte algorithm’s data compression

Table 2 displays the compression results of the three methods, Huffman, Shannon-Fano, and Half-Byte,
together with the average compression ratio shown in Figure 6.

Table 2: Test results for the Huffman, Shannon-Fano, and Half-Byte algorithms

            File size (bits)                               Compression ratio
File name   Original  Huffman  Shannon-Fano  Half-Byte     Huffman  Shannon-Fano  Half-Byte
[Link]      6792      3705     4090          6444          45.45%   39.78%        5.12%
[Link]      6640      3465     3724          6338          47.82%   43.92%        4.55%
[Link]      6432      3493     3797          6056          45.69%   40.97%        5.85%
[Link]      7392      4015     4599          7035          45.68%   37.78%        4.83%
[Link]      7552      4108     4498          7214          45.60%   40.44%        4.48%
[Link]      10920     5822     6254          10521         46.68%   42.73%        3.65%
[Link]      8680      4527     4860          8330          47.85%   44.01%        4.03%
[Link]      18632     10342    11842         17679         44.49%   36.44%        5.11%
[Link]      10576     5736     6238          10050         45.76%   41.02%        4.97%
[Link]      10704     5706     6210          10170         46.69%   41.98%        4.99%
...         ...       ...      ...           ...           ...      ...           ...
[Link]      7256      3778     4012          6892          47.93%   44.71%        5.02%

Figure 6: Graph of the average compression ratio of the Huffman, Shannon-Fano, and Half-Byte algorithms
Figure 6 shows that the Huffman method outperforms the other two algorithms. The Half-Byte algorithm
has a low compression ratio compared to the Huffman and Shannon-Fano algorithms because it works on
a different principle. Half-Byte compression relies on at least seven consecutive characters sharing
the same first four bits, whereas the Huffman and Shannon-Fano algorithms rely on how often each
character appears. The sentences in the sample contain a wide range of characters, so long runs of
consecutive characters with the same leading bits are rare. As a result, only a few characters can
be compressed, and there is only a small difference in bit size before and after compression.
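The effect can be illustrated with a short sketch that measures what fraction of a text lies in
runs long enough for Half-Byte to act on. The function name eligible_fraction and the two sample
strings are hypothetical and are not drawn from the study's data; the threshold of seven comes from
the description of the algorithm.

```python
def eligible_fraction(data: bytes, min_run: int = 7) -> float:
    """Fraction of bytes that sit in runs of at least min_run bytes sharing the same left four bits."""
    if not data:
        return 0.0
    eligible = 0
    i = 0
    while i < len(data):
        j = i + 1
        while j < len(data) and (data[j] >> 4) == (data[i] >> 4):
            j += 1
        if j - i >= min_run:
            eligible += j - i
        i = j
    return eligible / len(data)


print(eligible_fraction(b"yaaaaaaay"))  # 7 of 9 bytes are in a qualifying run
print(eligible_fraction(b"data teks"))  # 0.0
```

In ordinary sentences, spaces and the letters p to z have different left four bits from the letters
a to o, so runs are interrupted frequently and few of them reach the required length.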
Supiyandi and Frida (2018) likewise found the Huffman algorithm more effective for text data, with a
compression ratio of 71.43% compared to 22.33% for the Half-Byte technique. According to (Pujianto
et al., 2020), the Huffman approach performed better in compression than the Run Length Encoding
method. The Huffman method also outperforms the Shannon-Fano technique in terms of compression gain
for images of the same size and number of colors (Widatama and Saputro, 2019).
The Huffman algorithm is thus reported to offer a higher compression ratio than several other
compression algorithms for text and image data. However, this contrasts with the findings of
(Fatmawaty and Mufty, 2020), whose research focused on audio file compression (*.wav). In that
study, the Huffman approach had a compression ratio of 28.954%, whereas Run Length Encoding had a
ratio of 46.77%.

4. Conclusion
The results of the investigation show that the Huffman algorithm outperforms the Shannon-Fano and
Half-Byte methods. The average compression ratio is 46.05% for Huffman, 40.36% for Shannon-Fano, and
5.04% for Half-Byte. Thus, the Huffman method can be considered for compressing Indonesian-language
text data. The compression techniques require further testing on data other than text, such as video
data, in order to obtain a more comprehensive picture of the optimal compression algorithm.
Conflicts of Interest: “This research was carried out collaboratively between authors and there was no
conflict of interest.”

References
3. Darnita, Y., Khairunnisyah, K. and Mubarak, H. (2019), “Text Data Compression Using the
Sequitur Algorithm”, SISTEMASI, Vol. 8 No. 1, pp. 104–113.
4. Fatmawaty, F. and Mufty, M. (2020), “Comparative Analysis of Wav File Compression Using
the Huffman Method and Run Length Encoding”, Jurnal Teknologi Informasi Dan Terapan,
Vol. 7 No. 1, pp. 61–65.

5. Irliansyah, M.R., Nasution, S.D. and Ulfa, K. (2017), “Application of the Deflate Method and
Goldbach Codes Algorithm in Text File Compression”, KOMIK (Konferensi Nasional
Teknologi Informasi Dan Komputer), Vol. 1 No. 1.
6. Mahmoudi, R. and Zare, M. (2020), “Comparison of Compression Algorithms in text data for
Data Mining”, Int. J. Adv. Eng. Manag. Sci., Vol. 6, pp. 231–235.
7. Mizwar, T., Ginting, G.L., Mesran, M., Fau, A., Aripin, S. and Siregar, D. (2017),
“Implementation of the J-Bit Encoding Algorithm in Text File Compression”, KOMIK
(Konferensi Nasional Teknologi Informasi Dan Komputer), Vol. 1 No. 1.
8. Pujianto, M., Prasetyo, B.H. and Prabowo, D. (2020), “Comparison of the Huffman Method and
Run Length Encoding in Document Compression”, InfoTekJar J. Nas. Inform. Dan Teknol. Jar,
Vol. 5 No. 1, pp. 216–223.
9. Puspabhuana, A. (2016), “Three Steps Comparison of Text Compression Techniques”,
President University.
10. Rizky, N.F., Nasution, S.D. and Fadlina, F. (2020), “Application of the Elias Delta Codes
Algorithm in Text File Compression”, Building of Informatics, Technology and Science (BITS),
Vol. 2 No. 2, pp. 109–114.
11. Salomon, D. (2007), A Concise Introduction to Data Compression, Springer Science &
Business Media.
12. Saragih, S.R. and Utomo, D.P. (2020), “Application of the Prefix Code Algorithm in Text Data
Compression”, KOMIK (Konferensi Nasional Teknologi Informasi Dan Komputer), Vol. 4 No.
1.
13. Sayood, K. (2017), Introduction to Data Compression, Morgan Kaufmann.
14. Siahaan, A.P.U. (2016), “Implementation of Huffman Text Compression Technique”, Jurnal
Informatika Ahmad Dahlan, Universitas Ahmad Dahlan, Vol. 10 No. 2, p. 101651.
15. Simanjuntak, L.V. (2020), “Comparison of the Elias Delta Code and Levenstein Algorithms for
Text File Compression”, Journal of Computer System and Informatics (JoSYC), Vol. 1 No. 3,
pp. 184–190.
16. Supiyandi, S. and Frida, O. (2018), “Comparative Analysis of Text Data Compression Using
Huffman and Half-Byte Methods”, ALGORITMA: JURNAL ILMU KOMPUTER DAN
INFORMATIKA, Vol. 2 No. 1.
17. Widatama, K. and Saputro, W.T. (2019), “Comparison of the Performance of the Huffman
Algorithm and the Shannon-Fano Algorithm in Compressing Image Files”, INTEK: Jurnal
Informatika Dan Teknologi Informasi, Vol. 2 No. 2, pp. 70–77.
