Proposal for a Bangla (or Bengali) Script

Root Zone Label Generation Ruleset (LGR)

LGR Version: 4.0
Current Date: 2020-05-20
Document version: 5
Authors: Neo-Brahmi Generation Panel [NBGP]

1. General Information
This document lays down the Label Generation Ruleset (LGR) for the Bangla (or
‘Bengali’)1 script under the general rubric of the Neo-Brāhmī Writing System. The three
main components of the Bangla Script LGR, i.e. (i) Code point repertoire, (ii) Variants
and (iii) Whole Label Evaluation Rules, are described in detail here, after a brief
historical background of the Script in Section 3.

All these components will be incorporated in a machine-readable format in an XML file
named "proposal-bengali-lgr-20mar20-en.xml". Labels for testing can be found in the
accompanying text document “bangla-test-labels-20mar20-en.txt”.

2. Script for Which the LGR Is Proposed

ISO 15924 Code: Beng
ISO 15924 Key N°: 325
ISO 15924 English Name: Bengali (Bangla)
Latin transliteration of native script names [in IPA]: bɑːŋlɑː, ôxômiya
Native names of the script: বাংলা, অসমীয়া
Maximal Starting Repertoire (MSR) version : MSR-4

3. Background on Script & Principal Languages Using It

3.0. Introduction
‘Bangla’ (or Bengali) is historically and genealogically regarded as an eastern
Indo-Aryan language with around 178.2 million speakers in Bangladesh (98% speakers), and
83.4 million speakers in the Indian states of West Bengal (68.37 million), Tripura (2.15
million), South Assam (7.3 million), Odisha (0.49 million) and Delhi (0.21 million) as
well as in the Andaman and Nicobar Islands (close to a hundred thousand), accounting
for 8.3% of India. It is also a major language in Jharkhand (2.6 million) and a language
with a sizable population in Bihar (0.44 million). Apart from these, there is a large
Bangla-speaking diaspora spread all over the world. It is the seventh largest
spoken and written language in the world. Bangla is the national and official language of
Bangladesh, and one of the 22 Official languages in India (listed in the 8th Schedule of
the Indian Constitution). It is also one of the official languages of Sierra Leone. The
script, also called Bangla [102], is an eastern variety of the ‘Brāhmī’ Writing
System and is written from left to right. Historically it derives from the Brāhmī alphabet as
used in the Ashokan inscriptions (269-232 BC).

1 The term ‘Bangla’ is used in the descriptive text and the term ‘Bengali’ is used in the normative part of this
proposal.

Bangla and its cognate languages, as mentioned above, together form a linguistic group
known as the Eastern New Indo-Aryan (NIA). Inscriptions and manuscripts in the Eastern
Apabhraṅśa or ‘Avahaṭṭha’ are grossly inadequate, except for small
inscriptions and the manuscripts of the Tantric Buddhist text titled
‘Caryyācaryyaviniścaya’ or the Caryā-Pada [114] dating back to the 9th-11th century. As
a result, there is not much epigraphic evidence for the development of its writing
system. However, the evidence that is available on the genesis of the Bangla writing
system is discussed in Section 3.1 [109].

Historically, the Bangla language is divided into three periods, as evident from various
sources:

(i) The Old Bangla Period (roughly A.D. 950/1000 to A.D. 1200/1350), of which
three specimens are found: (a) 47 Caryā songs, the Dohākōṣa of Saraha and
the Dohākōṣa of Kānha (mostly in Apabhraṅśa), and the Ḍākārṇava (in a
variety of Prākṛt), (b) Old Bangla specimens of over 300 words in a
commentary [141].
(ii) The Middle Bangla Period (1200-1800 A.D.), again divided into three
stages: (a) Transitional Middle Bangla (1200-1300 A.D., for which no genuine
specimens are found) [147], (b) Early Middle Bangla (1300-1500 A.D.), and
(c) Late Middle Bangla (1500-1800 A.D.).
(iii) Finally, after 1800 A.D., the Modern or New Bangla, marked by the
introduction of written prose [109] in the books of Fort William College
(established in 1800). The colloquial variety of Bangla based on the speech
variety of Calcutta (called ‘Kolkata’ now) made its first appearance through
the Hutōm Pẽcāra Nakśā (1862) by Kaliprasanna Singha. The influence of
English on the vocabulary, idioms, and expressions as well as on the writing
styles of Bangla is significant by this time. The fonts and types for Bangla
developed during this time also spread to all parts of the Bangla speech
community [101, 120]. The same fonts, with some extensions, were also used
for the neighbouring languages deploying this writing system.

Bangla prose had developed two literary styles during the 19th-20th Century: the
Sādhubhāṣā (সাধু ভাষা - "Elegant Language or Style") and the Calitabhāṣā (চলিতভাষা -
"Current Language, or Modern Style"). It is the latter style that is prevalent today in
written prose.

The Language Movement in Bangladesh (the then East Pakistan) began in 1948, as civil
society dissented to the elimination of the Bangla script from currency and stamps,
on which it had been in use since the British Raj. The movement reached its pinnacle in 1952,
when on 21 February the police fired on demonstrating students and civilians,
causing numerous injuries and deaths2. Later, following the Language Movement, on
27 April 1952, the All Party National Language Committee decided to demand the
establishment of an organization for the promotion of the Bengali language. Bangla
Academy, Dhaka, right from its inception in 1955, has been engaged in promoting and
fostering Bangla as the lingua franca of the country, before and after independence from
Pakistan in 1971. Through the various commissions and committees constituted by the
Government of Bangladesh (Bāṅlādeśa Jātīya Śikṣā Kamiśana in 1972, Jātīya Śikṣā
Upadeṣṭā Pariṣad in 1979, Bāṅlā Bhāṣā Bāstabāyana Sela in 1982, Bāṅlā Bhāṣā Kamiṭi in
1983, etc.3) after independence in 1971, Bangla was made the primary medium of
instruction/communication in all Governmental and educational activities. Through a
great struggle and bloodshed, the Bengalis established Bangla as an official language of
the state.4

2 The UN declared Ekuśe February (21st February) as the International Mother Language Day at the UNESCO
General Conference in Paris on 17 November 1999 “in recognition of the sanctity and preservation of all
vernacular languages in the world.”22
3 Bāṅlā Bhāṣā Kamiṭi. 1983. Bāṅlā Bhāṣā Kamiṭi Riporṭ (Report of the Bangla Bhasha Committee). Dhaka: Śikṣā,
Dharma, Krīṛā O Saṅskṛti Mantraṇālaya, People's Republic of Bangladesh.
4 Chakraborty, Rajib. 2018. The Fishermen’s Community: A Language-Culture Interplay (A Study of Post-1971
Select Bangla Novels). Unpublished Ph.D. Dissertation, Visva-Bharati.

3.1. Written Bangla

The ‘Bangla alphabet’ (বাংলা লিপি - Bānglā lipi, ISO 15924) is derived from the Brāhmī
writing system, which is related to the Nāgarī (also known as Devanāgarī5) script [108]
as well as to the Tirhutā writing system [106]. Considered to be the fifth most widely used
writing system in the world, this combined Bangla-Asamiyā-Maṇipuri Script (showing
some variations for Asamiyā and Meitei or Biṣṇupriyā Manipuri) (130) was used in the
eastern Indian Sanskrit manuscripts too. For Chākmā in India and Bangladesh and for
Kokborok in Tripurā, it was and still is one of the scripts used. A close variant, called
Tirhutā (123; now available also in UNICODE 10.0 as U+11480 to U+114DF; See 110) or
Mithilākṣara, was used for Maithili from the 14th Century until the early 20th century
[106]. In this context, one finds a mention of ‘Sylheti Nāgarī lipi’ or ‘Siloṭi’ (added to the
Unicode Standard in March 2005 with the release of version 4.1), the details of which
could be of interest only to historians and historical linguists (See 137 and 144). But
Sylheti Bangla is now generally written in the modern-day Bangla script for all
practical purposes. Originally, during the reign of the Pāla dynasty (750-1154 AD) in
eastern India, and even earlier, perhaps during the Malla period (694 AD onwards),
the present-day Bangla writing system took a shape comparable to the modern-day one
[111, 119]. A pictorial description of Brāhmī to Modern Bangla Script could be
presented here in a tabular form:

Modern            ক   জ   ম   র   স   অ
Transliteration   k   j   m   r   s   a

Table 1: Pictorial depiction of Evolution of Brāhmī to Bangla

5 William Dwight Whitney in his Sanskrit Grammar unequivocally said, “This name (Devanāgarī) is of
doubtful origin and value” (Whitney, William Dwight. 1994 reprint. Sanskrit Grammar. New Delhi: Motilal
Banarasidass Publishers, p. 1)

The inscriptional evidence in Brāhmī is found in the Archaic Brāhmī from the 3rd
century B.C. to the 1st century B.C., in Middle Brāhmī soon after (1st-3rd Century
A.D.) and then in the Late Brāhmī (4th-6th Century A.D.). This evidence can be seen
in both Bangladesh and West Bengal [108] in (1) the Mahāsthānagaṛa inscriptions (Bogra
district, Bangladesh; the ancient name being Puṇḍranagara or Pauṇḍravardhanapura),
(2) Brāhmī (and Kharoṣṭhī) inscriptions from the lower ‘Gangetic Bengal’
and (3) Copper plate inscriptions of the Imperial Guptas from the Northern part of West
Bengal and North-West Bangladesh, in the areas under Dharmāditya, Gopachandra
and Samācāradeva (about whom one only knows from five Copper-plates found in
Kotālipāṛā in the Faridpur district in Bangladesh, one in Mallasārul in the Burdwan
district (West Bengal), and one in Jayrāmapura (Balleśvara district, now in Odisha)).

These epigraphs from the eastern part of Undivided India (dating back to the 4th-6th
Centuries A.D.) showed some characteristic features of letters (especially in ম ‘ma’, ল ‘la’,
শ ‘śa’, স ‘sa’ and হ ‘ha’), which led to the development of the eastern variety of the Gupta script.
Epigraphic records from Bangladesh demonstrate remarkable developments in Eastern
Brāhmī. Relevant in this context are the Tippera copper plate inscription of the ‘Samataṭa’
rulers (139, pp 265) such as Lokanātha (dated to the latter half of the 7th Century A.D.), the
Kailan inscription of Śrīdharaṇa Rāta and the Astafpur copper plates. The letters
seem to hang down from wedge-shaped solid triangles, with right-hand verticals bending
down at the bottom, because of which the script was described by Prinsep and Fleet as
Kuṭila-lipi (literally, ‘Cursive writing style’), whereas the term Siddhamātrikā (as a mātrā or
bar is placed over each of the letters) was used by Al Biruni (973-1048) to designate the
script of Northern India. The next stage of development is illustrated by the 9th Century
copper plate inscriptions from Khalimpur of the reign of Dharmapāla, from Monghyr
and Nālandā in Bihar of the time of Devapāla, and from Jagjīvanpura (Malda) of the
reign of Mahendrapāla. The Siddhamātrikā (mentioned as ‘Siddham’ in Chinese sources)
is said to have been prevalent in this region as well, up to the end of the tenth century. Also
called the Gauri (i.e. Gandi) in Pūrvadeśā, or the Eastern country, it is regarded as the
script that shows Proto-Bangla characteristics in
rudimentary forms in the period between A.D. 875 and A.D. 1025.

In some epigraphs it is considered as belonging to the second quarter of the eleventh
century A.D. Flattening of head-marks becomes prominent in comparison to the
wedge-shaped serifs. An important landmark in the development of the Bangla script is the
Rāmagañja copper plate inscription of Mahāmāṇḍalika in the last quarter of the
eleventh century A.D. It is the earliest document from this entire region which bears the
letter m with a tick rising upwards. The full vowel i develops a tick at the right end of
the upper horizontal bar above and a curved hook below. Initial e approaches the
modern Bangla character. A mature form of Proto-Bangla, the immediate precursor of
the Bangla script, is illustrated in the inscriptions of the Varmaṇa, Sena and Deva rulers of
the twelfth and thirteenth centuries [104].

The evolution of the Bangla script (Cf. 136) is aligned with the story of the advancement of
printing technology. The first “Movable type” scripts were created and used for
printing Nathaniel Brassey Halhed's (1751-1830) 1778 book titled 'A Grammar of the
Bengal Language'. In 1785, Governor-General Warren Hastings (1732-1818) requested
another civilian, Charles Wilkins (1749-1836), to cut punches for Bangla printing
characters. The current printed form of the Bangla script appeared soon after. It is generally
agreed that Wilkins developed the Bangla print script [111]. He passed on this knowledge
to Pañcānana Karmakāra (?-1804), a renowned artist in Bengal. Later it was Karmakar
and his family that became famous in Bangla printing technology. Shepherd was
another assistant of Wilkins in the design of this script, which became more angular, with
sharper turns and edges [133]. A few archaic letters were modernized during the 19th
century. The script was standardized by Pandit Ishwar Chandra Vidyasagar when the Bangla
type fonts were to be used to publish on a large scale under the Calcutta School Book
Society [116 for several references].

Much later, in 1935, the Linotype technique, invented by Ottmar Mergenthaler
(1854-1899) in 1886, was introduced into Bangla printing through the efforts of Suresh
Chandra Majumdar (1888-1954), Rajsekhar Basu (1880-1960), Jatindra Kumar Sen
(1882-1966) and his disciple, Sushil Kumar Bhattacharya, and began to be used by
the Ānandabāzara Patrikā group, later followed by others. Within a few years the more
advanced monotype technology came to be used in Bangla printing. However, in Bangla
printing culture, monotype had very limited acceptance, and linotype held the stage until
digital technology eventually replaced all earlier techniques.

All these could be presented in a table:

PERIOD | DESCRIPTION | NAMES
3rd Century B.C. | Use of Brāhmī and Kharoṣṭhī scripts begins in the subcontinent. Brāhmī was widely used during the reign of the Mauryan King, Aśoka. In one theory, Brāhmī is based on the North Semitic alphabet but suitably modified to fit the needs of local languages. It is currently believed to have been an independent development. | Brāhmī
1st-3rd Century AD | The Kuṣāṇa script, named after the Kuṣāṇa royal dynasty. | Kuṣāṇa script
4th-5th Century AD | The next stage of its evolution was into the Gupta script, named after the Gupta royal dynasty. | Gupta script
7th Century AD | Epigraphic records from Bangladesh demonstrate remarkable developments in Eastern Brāhmī, giving rise to the Kuṭila-lipi. | Kuṭila-lipi
8th Century AD | Some copper plate inscriptions are found in Khalimpur, Bangladesh, of the reign of Dharmapāla, from Monghyr and Nālandā in Bihar, of the time of Devapāla, and from Jagjīvanapura in West Bengal, of the reign of Mahendrapāla. | Siddhamātrikā
9th Century AD until 1025 AD | Proto-Bangla characteristics in rudimentary forms develop. An important landmark in the development of the Bangla script is the Rāmagañja copper plate inscription of Mahāmāṇḍalika found in the last quarter of the eleventh century A.D. | Proto-Bangla Script & Language
12th-13th Century AD | A mature form of Proto-Bangla, the immediate precursor of Bangla script, is found in the inscriptions of the Varmaṇa, Sena and Deva rulers of the twelfth and thirteenth centuries. | Matured Proto-Bangla
14th-15th Century AD | The characteristics of typical Bangla script began to develop, as could be seen in the copper plate inscription of Vijayamāṇikya-I of Tripura dated 1478 AD, which also illustrates forms of Bangla letters in the fifteenth century A.D. | Modern Bangla Script era begins (See Ross 1999)
16th-17th Century AD | The China Monuments, published from Amsterdam in 1667, and The code of Gentoo law, published from London in 1776, both show a chart of the Bangla alphabet. They show 16 Vowel letters, including the Long ‘ৡ’ ‘l̥̄i’, Anusvāra and Visarga, and 34 Consonants. | Printed Charts of Bangla
18th-19th Century AD | Charles Wilkins develops printing in Bangla in 1778 and Vidyasagar reforms it. | Bangla Type Fonts

Table 2: Development of the Bangla Writing System

The overall development of Bangla Script from the Kuṭila-lipi period to Modern Bangla
could be seen here in Table 3 ([102 and 146] and also see the web-page in 147).

Table 3: Bangla Script in Different Centuries

3.2. Languages Considered

Below is the tabular representation of the languages using Bangla script that are placed
on EGIDS Scale 1-6. (See 117 for details.) Some languages under EGIDS 5 and 6 have
also developed their own scripts for printing and publishing. Some had used Bangla
script earlier (such as Bodo), or used it in West Bengal at some point of time (Santali)
but have later shifted to another writing system. Bodo is now written in Nāgarī or
Devanāgarī and for Santali one uses both Nāgarī/Devanāgarī and Ol-chiki (145). For the
purposes of the Bangla LGR, only languages belonging to the EGIDS scale 1 to 4 have
been considered. Consider the following table:

EGIDS Scale 1   EGIDS Scale 2   EGIDS Scale 3   EGIDS Scale 4   EGIDS Scale 5   EGIDS 6

Bangla Santali, Bodo, Lepcha
(Bengali) Riang, Khumi, Pnar, Koda/
Mru(ng), Asho Kora, Chak

Asamiyā Koch or Mālto or
(Assamese) Rājabaṅśī Mālpāhāṛiyā

Maṇipuri or Biṣṇupriyā Chākmā, Toto,
Meitei Maṇipuri, Hājong, Rohingyā,
Kok-Borok Muṇḍāri & Tippera,
(Tripura & Kurux (of Megam,
Bangladesh) Bangladesh) Tanchangya

Usoi Limbu, Sadri or Bhumij or
Oraon Muṇḍāri,
Bawm, Chin

Table 4: Main languages in India and Bangladesh that use Bangla Script on the EGIDS Scale

3.3. Notable Features of Bangla Script [150]

The Bangla Writing System has certain features that determine how it is to be written and how
typesetting in Bangla can be done. This section is followed by a section that explains the
Code points (and fixed Code point sequences) which show certain distinctive characteristics of
Bangla and which make up the Repertoire. The next sections will also cover the ‘akshar’-formation
rules (ABNF) showing character classes, Whole Label Evaluation (WLE) and Context Rules, as well
as In-Script and Cross-Script Variants. Here, we present some basic features of the Script and
Pronunciation:

● The Bangla script is an alpha-syllabic writing system in which every
consonant is assumed to contain an accompanying ‘inherent’ vowel
(theoretically before or after each consonant). It varies between /ɔ/ and /o/
depending on the position of the consonant in the word. At times, these
‘assumed’ or ‘inherent’ vowels are not pronounced at all [142].

● Vowels can be written as independent letters, or by using a variety of diacritical
marks which are written above, below, before, after, or both before and after
the consonant they follow in pronunciation [105].
● All Bangla consonants, when pronounced in isolation, are uttered with an inherent
vowel /ɔ/; hence ক ‘k’, খ ‘kh’ or গ ‘g’ are usually pronounced as [kɔ], [khɔ], or
[gɔ], etc. Phonologically, the Bangla vowel /ɔ/ corresponds to the Hindi schwa /ə/.
● When consonants occur together in clusters, special conjunct letters are formed.
In printed Bangla, many of these consonantal clusters or conjoined consonants
are in use. The letters for the consonants other than the final one in the group are
generally reduced. But there are a few special conjunct characters which are
compounds of the consonant characters, e.g. ক্(k)+ষ(ṣ)=ক্ষ(kṣ),
ঞ্(ñ)+জ(j)=ঞ্জ(ñj), জ্(j)+ঞ(ñ)=জ্ঞ(jñ), হ্(h)+ম(m)=হ্ম(hm). There are other issues
also: র as the second member of a cluster is reduced to a secondary symbol, e.g.
প্(p)+র(r)=প্র(pr), ষ্(ṣ)+ট্(ṭ)+র(r)=ষ্ট্র(ṣṭr) (as in উষ্ট্র uṣṭra “camel”); য (y), when used
as a primary symbol, represents /jɔ/ in Bangla. But its secondary symbol
(allograph) jɔ-phalā has two phonetic values. When added to the initial
consonant in a word, it is a vowel /æ/ (as in শ্যামল (śyāmala) “green”, র‍্যাপার
(ryāpāra) “wrapper”, etc.). But after a non-initial consonant, it just doubles it in
pronunciation (as in কার্য, ধার্য, etc.). The র্(r)+য(y) combination has two
renderings: র‍্য(ry) and র্য(ry). In case of দ্(d)+ধ(dh), গ্(g)+ধ(dh), ন্(n)+ধ(dh) the
shape of the second member is changed, e.g. দ্ধ(ddh), গ্ধ(gdh), and ন্ধ(ndh)
respectively. The solitary example of র্(r)+ঋ(ṛ)=র্ঋ (as in নৈর্ঋত nairṛt
"Southwest"), used mostly in cases of Classical borrowings, shows the use of the
secondary symbol of a consonant followed by the primary symbol of a vowel.
The inherent vowel only applies to the final consonant of the cluster.
● In consonant clusters, many consonants take a completely different form. Some
typical examples are ক্ত (kt), ক্র (kr), ক্ষ (kṣ), গ্ধ (gdh), জ্ঞ (jñ), ঞ্চ (ñc), ঞ্জ (ñj), ট্ট (ṭṭ), ন্ত
(nt), ন্ধ (ndh), ব্ধ (bdh), ভ্র (bhr), ম্ব (mb), স্ত (st) etc. র has two allographs, apart from
its full shape: one is ‘repha’, as found in র্ক (rk), র্প (rp); and another is ra-phalā,
as in প্র (pr), ক্র (kr). ষ্ণ (ṣ+ṇ) is another one, where the cerebral nasal consonant
sign takes an unusual shape. [151]
● The Bangla script has at least fifty-two primary symbols and quite a few
allographs (positional variants of them), corresponding to forty-four (7 oral and
7 nasal vowels and 30 consonants) phonemes (150) or functional speech sounds,
with some obvious redundancies, although in one of the first phonemic analysis,
the number was thought to be thirty-five phonemes [140].

● As mentioned above, in Bangla, several graphemic symbols have secondary
shapes, technically called ‘allographs’, with a complementary distribution in each
case. These graphs or markings are generally added at the following positions of
the primary symbol [113]:

1) Below (e.g. কু (ku), ন্ত (nta), কূ (kū), হ্র (hra), etc.)
2) Above (e.g. চঁ (cã), র্ক (rka), etc.)
3) Right side (e.g. কা (kā), কং (kaṅ), etc.)
4) Left side (e.g. কে (ke))
5) Left side and above simultaneously (e.g. কৈ (kai), কি (ki) etc.)
6) Right side and above simultaneously (e.g. কী (kī))
7) Right side and left side simultaneously (e.g. কো (ko))
8) Right side, left side and above simultaneously (e.g. কৌ (kau)).
● As for the complementary distribution of vowel letters (word- or syllable-initial) and
Vowel Mātrās, which is relevant for ABNF, let us consider the following.
Besides some simple Vowel Modifiers called ‘Kārs’ in Bangla (also referred to as
Mātrā in the other LGR documents of Neo-Brāhmī) there are some combinatory
modifiers of Bangla Vowels with certain consonants. For example, whereas

আ U+0986 BENGALI LETTER AA is substituted by
◌া U+09BE BENGALI VOWEL SIGN AA,
ই U+0987 BENGALI LETTER I is substituted by
pre-posed ◌ি U+09BF BENGALI VOWEL SIGN I,
ঈ U+0988 BENGALI LETTER II is substituted by
◌ী U+09C0 BENGALI VOWEL SIGN II or
উ U+0989 BENGALI LETTER U is substituted by
◌ু U+09C1 BENGALI VOWEL SIGN U by marking below the primary
grapheme, there are some special vowel modifiers of উ as in the following
combined letters:
গু gu (ligated form), rather than writing as গ (g) + ◌ু (u)
রু ru (ligated form), rather than writing as র (r) + ◌ু (u)
শু śu (ligated form), rather than writing as শ (ś) + ◌ু (u)
হু hu (ligated form), rather than writing as হ (h) + ◌ু (u)
ন্তু ntu (ligated form), rather than writing as ন্ (n) + ত (t) + ◌ু (u)

Similarly, there could be vowel modifiers of ঊ or ‘(Long) ū’ as well; e.g.
ভ্ (bh) + র (r) (ভ্রূ bhrū “eyebrow”), শ্ (ś) + র (r) (শ্রূ śrū), ঋ (ṛ) after হ (h) (হৃ hṛ), etc.

● There have been many notable contributions in simplifying and modifying Bangla
spellings and combinatory techniques, especially by scholars such as Pabitra
Sarkar (1992) [134]. In this there has been an attempt to reduce the number of
allographs of both vowels and consonants in clusters, and it has been widely
accepted in the printing of school texts in both Bangladesh and West Bengal [151,
152]. As of now, two systems, the old (traditional), and the new, go on side by
side, operative in different domains.
However, in preparation of this LGR document, the aim has been to consider the widely
used and usable sequences and combinations and their variations across the sister
scripts belonging to the basket of Brāhmī writing systems.
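
The complementary distribution of vowel letters and vowel signs described in the list above can be illustrated with a minimal Python sketch. The code point pairs are the ones given in this document; the names `attach_vowel` and `is_consonant`, and the simplified consonant range, are assumptions made only for illustration and are not part of the LGR.

```python
# Independent vowel letter -> dependent vowel sign (kar), pairs as listed in this document.
LETTER_TO_SIGN = {
    "\u0986": "\u09BE",  # AA
    "\u0987": "\u09BF",  # I
    "\u0988": "\u09C0",  # II
    "\u0989": "\u09C1",  # U
    "\u098A": "\u09C2",  # UU
    "\u098B": "\u09C3",  # VOCALIC R
}
LETTER_A = "\u0985"  # the inherent vowel; it has no vowel sign of its own


def is_consonant(ch: str) -> bool:
    """Simplified test (an assumption): KA..HA plus RRA, RHA and YYA."""
    cp = ord(ch)
    return 0x0995 <= cp <= 0x09B9 or cp in (0x09DC, 0x09DD, 0x09DF)


def attach_vowel(prefix: str, vowel_letter: str) -> str:
    """Append a vowel to prefix, choosing the letter or the sign by context."""
    if prefix and is_consonant(prefix[-1]):
        if vowel_letter == LETTER_A:
            return prefix  # the inherent vowel is not written
        return prefix + LETTER_TO_SIGN[vowel_letter]
    return prefix + vowel_letter  # word-initial or after a vowel


print(attach_vowel("\u0995", "\u0986"))  # KA followed by AA: the sign form is used
print(attach_vowel("", "\u0986"))        # word-initial AA: the letter form is used
```

Only the vowels listed in the mapping are handled; any other vowel letter passed after a consonant raises a `KeyError`, which keeps the sketch from guessing at pairs it does not list.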

Bangla Academy, Dhaka published Standard Bangla Spelling Rules in 1992 following the
recommendations of a committee constituted through a workshop jointly organized by
the Jātīya Śikṣākrama and Pāṭhyapustaka Board in 1988. A thoroughly revised edition of
the Rules was published in September 2012.6

After the establishment of the Bāṅlā Ākādemi of West Bengal in 1986, its first President,
Annadasankar Ray (1904-2002), in his inaugural address, gave a direction for the
standardization of the Bangla alphabet, script and spelling system, and clearly argued that
they would not blindly follow the Sanskritic model of conventional grammar. A broad
list of proposals was sent to experts on Bangla, and a broad agreement was reached for
‘homogenization of Bangla spelling’ by 1988. Based on opinions received from different
quarters, a unanimous list of ‘rules’ was agreed upon. This was published as a ‘Spelling
Dictionary’ titled Ākādemi Bānāna Abhidhāna (1997), which was more
comprehensive than ‘The University of Calcutta proposals’ made in 1936. Along with
the ‘rationalization’ of spellings, another step was taken to make the writing system
easier to read, by making the symbols used, both single and combined ones, more
‘transparent’. These reforms were originally suggested by Sarkar (1987, first published
in 1978) [134] [153], where he used the terms Swaccha (‘Transparent’) and Aswaccha
(‘Opaque’ or non-transparent), even adding Ardha Swaccha (‘half transparent’) in
between the two. Some examples are:

Transparent: ন্ন (nn), প্ত (pt), স্ত (st), where both members of the cluster can be
recognized.
Opaque: where neither of the two can be (easily) recognized: ক্ষ kṣ (ক্ k + ষ ṣ), জ্ঞ jñ
(জ্ j + ঞ ñ), ঙ্গ ṅg (ঙ্ ṅ + গ g), হ্ম hm (হ্ h + ম m).
Semi-transparent: প্র (pr), র্প (rp), where one symbol is recognizable and the other is
not. In case of three-term clusters, at least one symbol will not be transparent, e.g. স্ত্র str
(স্ s + ত্ t + র r), ষ্ট্র ṣṭr (ষ্ ṣ + ট্ ṭ + র r), etc.

6 Bangla Academy. 2012. Bāṅlā Ekaḍemī Pramita Bāṅlā Bānānera Niyama (Bangla Academy Standard
Bangla Spelling Rules). Dhaka: Bangla Academy.

There were, in fact, two types of proposals. One concerned the shape of the letters:
those of consonant + vowel (CV) combinations and of conjuncts, that is, consonant +
consonant combinations. There were further complex shapes, i.e. those of consonant +
consonant + (consonant +) vowel (CC(C)V) signs, as in প্রু (pru) or স্ক্রু (skru). Some
decisions in this area were necessary because a few of the CC(C) symbols represented
complexities that made learning them difficult for children. The other dealt with the
spellings of words only, without any reference to the shapes of letters in which they
were written. The basic objective here was ‘one word, one spelling’, to the greatest
extent that was possible. [151]
Below we place a statement of the most salient changes that affect the consonant +
vowel combinations. [153]
a. The variants of the short u (হ্রস্ব উ-কার hrasva u-kāra) vowel sign have been
brought down to one, i.e., ◌ু. So the ligated gu is now written গু. Similarly
ru > রু, śu > শু, hu > হু, and therefore, for cluster + short u sign: ntu > ন্তু
(ন+◌্+ত+উ), stu > স্তু (স+◌্+ত+উ).
b. The variants of long u (দীর্ঘ ঊ-কার dīrgha ū-kāra) have also been reduced:
rū > রূ; bhrū > ভ্রূ (ভ bh+◌্+র r+ঊ ū); drū > দ্রূ (দ d+◌্+র r+ঊ ū); śrū >
শ্রূ (শ ś+◌্+র r+ঊ ū).
c. The variants of ঋ-কার (ṛ-kāra, "secondary symbol of ṛ") have been brought
down to one: hṛ > হৃ.
Regarding consonant + consonant + (consonant)… + (vowel) clusters, the Paschimbanga
Bangla Akademi proposed transparent or semi-transparent shapes for clusters to the
extent admissible in the Bangla writing system. Some examples will clarify the proposal;
each cluster is listed with its component consonants. [153]
ব্ধ bdh (ব্ b + ধ dh), দ্ধ ddh (দ্ d + ধ dh), ন্থ nth (ন্ n + থ th), ঞ্চ ñc (ঞ্ ñ + চ c),
ঞ্ছ ñch (ঞ্ ñ + ছ ch), ঞ্জ ñj (ঞ্ ñ + জ j), ক্ত kt (ক্ k + ত t), ক্র kr (ক্ k + র r),
গ্ধ gdh (গ্ g + ধ dh), ঙ্ক ṅk (ঙ্ ṅ + ক k), ঙ্গ ṅg (ঙ্ ṅ + গ g), ষ্ণ ṣṇ (ষ্ ṣ + ণ ṇ),
ন্ধ্র ndhr (ন্ n + ধ্ dh + র r), ণ্ড্র ṇḍr (ণ্ ṇ + ড্ ḍ + র r), ক্ত্র ktr (ক্ k + ত্ t + র r)

3.3.1 The Consonants
As per traditional classification Bangla Consonants are categorized according to their
phonetic properties, especially in terms of place and manner of articulation [107]. There
are Five ‘Varga’ (pronounced as ‘Barga’ in Bangla) or Groups (sets or classes)
distinguished by Place of Articulation, and one Non-‘varga’ group [105]. Each Varga,
which corresponds to Stops at a certain place of articulation, contains a series of five
consonants classified as per their phonetic qualities (i.e. manner of articulation),
beginning from Unvoiced and Unaspirated to Voiced and Aspirated (in the fourth
column), finally ending with a Homorganic or Corresponding nasal [107]. Consider the
following table:

‘Varga’ or Sets | Unvoiced -Asp | Unvoiced +Asp | Voiced -Asp | Voiced +Asp | Nasal
Velar | ক ‘K’ U+0995 | খ ‘KH’ U+0996 | গ ‘G’ U+0997 | ঘ ‘GH’ U+0998 | ঙ ‘NG’ U+0999
Palatal | চ ‘C’ U+099A | ছ ‘CH’ U+099B | জ ‘J’ U+099C | ঝ ‘JH’ U+099D | ঞ ‘NY’ U+099E
Retroflex | ট ‘TT’ U+099F | ঠ ‘TTH’ U+09A0 | ড ‘DD’ U+09A1 | ঢ ‘DDH’ U+09A2 | ণ ‘NN’ U+09A3
Dental | ত ‘T’ U+09A4 | থ ‘TH’ U+09A5 | দ ‘D’ U+09A6 | ধ ‘DH’ U+09A7 | ন ‘N’ U+09A8
Bilabial | প ‘P’ U+09AA | ফ ‘PH’ U+09AB | ব ‘B’ U+09AC | ভ ‘BH’ U+09AD | ম ‘M’ U+09AE

Table 5: Varga classification of Bangla consonants

(Falling into a Pattern of Five Sets of Unvoiced Unaspirated, Unvoiced Aspirated, Voiced Unaspirated,
Voiced Aspirated and Nasals, called five ‘Varga’)

Non-Varga | য ‘Y’ U+09AF | য় ‘YY’ U+09DF | র ‘R’ U+09B0 | ড় ‘RR’ U+09DC | ঢ় ‘RH’ U+09DD
Non-Varga | ল ‘L’ U+09B2 | শ ‘SH’ U+09B6 | ষ ‘SS’ U+09B7 | স ‘S’ U+09B8 | হ ‘H’ U+09B9

Table 6: Non-Varga consonants (Not falling into any of the five categories)
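
Because each Varga in Table 5 occupies five consecutive code points, the classification can be read off the code point directly. The following Python sketch is only an illustration of that regularity; the names `VARGA_START`, `COLUMNS` and `classify` are hypothetical and do not come from the LGR.

```python
# First code point of each Varga row in Table 5; each row holds five consonants.
VARGA_START = {
    "Velar": 0x0995,
    "Palatal": 0x099A,
    "Retroflex": 0x099F,
    "Dental": 0x09A4,
    "Bilabial": 0x09AA,  # U+09A9 is skipped, so this row does not start at U+09A9
}
COLUMNS = (
    "unvoiced unaspirated",
    "unvoiced aspirated",
    "voiced unaspirated",
    "voiced aspirated",
    "nasal",
)


def classify(ch: str):
    """Return (varga, manner) for a Varga consonant, or None for anything else."""
    cp = ord(ch)
    for varga, start in VARGA_START.items():
        if start <= cp < start + 5:
            return varga, COLUMNS[cp - start]
    return None  # Non-Varga consonants (Table 6) and all other characters


print(classify("\u0995"))  # KA
print(classify("\u09AE"))  # MA
print(classify("\u09B9"))  # HA is a Non-Varga consonant
```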

3.3.2 The Implicit Vowel Killer: Hasanta (called ‘Halant’ or ‘Halanta’ in other
Brāhmī-based scripts)
As stated earlier, all consonants are pronounced in isolation with an implicit vowel
(central back /-ɔ/ in Bangla as the neutral vowel) assumed to be associated with them
[121]. The terms ‘Hasanta’ (= ‘Halant’ or ‘Halanta’ in other Brāhmī-based scripts) and
‘Virāma’7 (= ‘Dā̃ri’ in Bangla), the latter preferred in UNICODE (cf. Unicode 3.0 and above),
are used in this report to denote the character that marks
the absence of this inherent vowel. It may be noted that the term virāma has been
adopted in UNICODE in a sense that is different from its traditional definition in
grammar, and hence it requires some explanation here. Considering the importance of
the document, this note is made a part of this LGR document, so that anybody referring
to it is able to know the proper grammatical explanation of the term. Because a
special sign is needed whenever this implicit vowel is stripped off, the symbol is known
as the Hasanta (= Halant) "◌্" (U+09CD). By placing the Hasanta under the first
consonant of a combination or cluster, one can, in common parlance, “kill” its vowel
and create conjuncts. In this manner, conjunct characters are generally written by
joining two to four consonants. In rare cases, this process can join up to
five consonants. However, the maximum number of consonants joining to
form one akṣara8 can only be bounded empirically: the figure is an observation based on the
CIIL-Emille Corpora of Bangla words [132 & 133] as seen in print these days. Given the
mixture of scripts and languages happening on the web, the possibility that one may
want a generic Top Level Domain [gTLD] with more consonants than the observed
maximum cannot be ruled out. This can be the case when a foreign language word
that admits a large number of consonants is transliterated into Bangla. Hence, in the
Bangla LGR work, this limit will not be enforced.

3.3.3 Vowels
Separate symbols exist for all ‘Swara’ or Vowels in Bangla, which are pronounced
independently either at the beginning of the word or after another vowel or consonant
sound. To indicate a Vowel sound other than the implicit one, a Vowel sign, called ‘kār’
in Bangla or Mātrā in Nā garı̄9 is attached to the consonant. Since the consonant has this
built in neutral vowel at the end, there are equivalent kāras (Mātrās) for all vowels
except the অ (pronounced /-ɔ/). The correlation is shown as follows:

7 Virāma, as used here, is also a misnomer according to the Indian grammatical traditions. Nowhere is the
mere absence of a vowel marked as virāma. Hasanta just marks the absence of a vowel, nothing else.
(Abhyankar, Kashinath Vasudev & J. M. Shukla. 1961. A Dictionary of Sanskrit Grammar. Baroda: Oriental
Institute.)
8 This term needs to be disambiguated. Akṣara also means ‘syllable’ in Indian grammatical traditions.
9 The term ‘Mātrā’ in Bangla, however, stands for an altogether different concept, viz. the top bar placed
over a letter, typically available in Hindi and Bangla but missing in Gujarati.

Vowel                           Corresponding vowel sign (kāra / Mātrā)

অ  ‘A’            U+0985        (none)
আ  ‘AA’           U+0986        ◌া  U+09BE
ই  ‘I’            U+0987        ◌ি  U+09BF
ঈ  ‘II’           U+0988        ◌ী  U+09C0
উ  ‘U’            U+0989        ◌ু  U+09C1
ঊ  ‘UU’           U+098A        ◌ূ  U+09C2
ঋ  Vocalic ‘R’    U+098B        ◌ৃ  U+09C3
ৠ  Vocalic ‘RR’   U+09E0        ◌ৄ  U+09C4
ঌ  Vocalic ‘L’    U+098C        ◌ৢ  U+09E2
ৡ  Vocalic ‘LL’   U+09E1        ◌ৣ  U+09E3
এ  ‘E’            U+098F        ◌ে  U+09C7
ঐ  ‘AI’           U+0990        ◌ৈ  U+09C8
ও  ‘O’            U+0993        ◌ো  U+09CB
ঔ  ‘AU’           U+0994        ◌ৌ  U+09CC
-                               ◌ৗ  U+09D7

Could appear on top of অ ‘A’ U+0985 or any other vowel      ◌ঁ  U+0981 Candrabindu
Could appear after অ ‘A’ U+0985 or any other vowel          ◌ং  U+0982 Anusvāra
Could appear after অ ‘A’ U+0985 or any other vowel          ◌ঃ  U+0983 Visarga
After any consonant                                         ◌্  U+09CD (Hasanta)
-                                                           ঽ  U+09BD Avagraha

Table 7: Bangla Vowels with corresponding kārs
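
The correlation in Table 7 can be expressed as a simple lookup. The following Python sketch is illustrative only: the names VOWEL_TO_SIGN and sign_name_matches are assumptions introduced here and are not part of the LGR. It cross-checks each pair against the character names in the Unicode Character Database.

```python
import unicodedata

# Independent vowel -> dependent vowel sign (kāra), as listed in Table 7.
# অ (U+0985) has no sign because it is the implicit vowel.
VOWEL_TO_SIGN = {
    "\u0986": "\u09BE",  # AA
    "\u0987": "\u09BF",  # I
    "\u0988": "\u09C0",  # II
    "\u0989": "\u09C1",  # U
    "\u098A": "\u09C2",  # UU
    "\u098B": "\u09C3",  # VOCALIC R
    "\u09E0": "\u09C4",  # VOCALIC RR
    "\u098C": "\u09E2",  # VOCALIC L
    "\u09E1": "\u09E3",  # VOCALIC LL
    "\u098F": "\u09C7",  # E
    "\u0990": "\u09C8",  # AI
    "\u0993": "\u09CB",  # O
    "\u0994": "\u09CC",  # AU
}


def sign_name_matches(vowel, sign):
    """True if the letter and the vowel sign carry the same Unicode name suffix."""
    letter = unicodedata.name(vowel).replace("BENGALI LETTER ", "")
    kara = unicodedata.name(sign).replace("BENGALI VOWEL SIGN ", "")
    return letter == kara


for vowel, sign in VOWEL_TO_SIGN.items():
    print(f"U+{ord(vowel):04X} -> U+{ord(sign):04X}", sign_name_matches(vowel, sign))
```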

3.3.4 The Anusvāra /onuʃʃār/ (◌ং - U+0982)
The Anusvāra or /onuʃʃār/ in Bangla at times, but not always, represents a homorganic
nasal. It replaces a conjunct group of ‘Nasal Consonant + Hasanta + Consonant’ where
the second consonant belongs to the Velar varga or set, as in লংকা. But it often also
appears in such combinations where a non-velar is the last member of the combination,
as in ল্যাংটা “naked” or ল্যাংচা “a kind of sweet/to limp”. Before a non-varga
consonant, the Anusvāra represents a nasal sound that may have an alternative
conjoined writing symbol representing the corresponding nasal consonant of the
particular set. Although Modern Hindi, Marathi and Konkani prefer the anusvāra to the
corresponding half-nasal, in Bangla it is clearly demarcated where one must use
the Anusvāra and where it has to be a conjunct cluster with a nasal as the first or the
second component.

3.3.5 Nasalization: Candrabindu (◌ঁ - U+0981)

Candrabindu denotes nasalization of the preceding vowel as in চাঁদ /cā̃d/ ‘moon’
(U+099A U+09BE U+0981 U+09A6). This sign, a dot inside a half-moon mark, is
used as a nasalization marker in many Brāhmī-based scripts. [143]

3.3.6 Nukta (◌় - U+09BC)

The nukta sign does not exist in Bangla orthography. It is predominantly used in many
Brāhmī-derived scripts, such as Devanāgarī (for Hindi, Bodo, Maithili, Santali, Kashmiri
and Sindhi). The term and the concept of nukta are borrowed in Bangla.

The IDNA Protocol (RFC 5891) states that IDNs must be in Unicode Normalization Form
C (NFC). RFC 7940 applies this requirement to LGRs. The definition of NFC in the
Unicode Standard contains a number of composition exclusions. As a result, the Bangla
letters য় YYA, ড় RRA and ঢ় RHA have to be represented in this LGR by using the
sequences (YA + Nukta: U+09AF + U+09BC), (DDA + Nukta: U+09A1 + U+09BC), and
(DDHA + Nukta: U+09A2 + U+09BC) instead of the single code points YYA (U+09DF), RRA
(U+09DC), and RHA (U+09DD), although the use of ‘Nukta’ is otherwise completely
unnatural in Bangla.
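
The effect of the composition exclusions can be observed directly with a Unicode normalizer. The sketch below uses Python's standard unicodedata module; the helper name to_nfc_codepoints and the PRECOMPOSED table are illustrative assumptions, not part of the LGR.

```python
import unicodedata

# Single code points and the sequences that NFC produces for them.
PRECOMPOSED = {
    "\u09DC": "\u09A1\u09BC",  # RRA -> DDA  + NUKTA
    "\u09DD": "\u09A2\u09BC",  # RHA -> DDHA + NUKTA
    "\u09DF": "\u09AF\u09BC",  # YYA -> YA   + NUKTA
}


def to_nfc_codepoints(text):
    """Return the NFC form of `text` as a list of U+XXXX strings."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", text)]


for single, sequence in PRECOMPOSED.items():
    # NFC decomposes the atomic letter and, because of the composition
    # exclusion, never recombines it.
    assert unicodedata.normalize("NFC", single) == sequence
    print(f"U+{ord(single):04X}", "->", " ".join(to_nfc_codepoints(single)))
```

A label typed with the atomic code points therefore ends up containing the nukta sequences once it is normalized, which is why the repertoire lists the sequences rather than U+09DC, U+09DD and U+09DF.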

It is noted that in the current Unicode Standard chart, these characters are listed as
additional consonants. As per the LGR Procedure, however, these decisions depend on
the IDNA Protocol through a set of procedures developed by the IETF. Even though the
Unicode Standard also provides these three characters as atomic characters (09DC for
ড় [ṛ], 09DD for ঢ় [ṛh], and 09DF for য় [y], each available as a single keystroke),
the IDNA protocol requires that we treat them as sequences of a base consonant and
Nukta rather than as the single code points allocated in the Unicode Bengali Block.

It may be noted that there could be sporadic attempts or cases of writing Muslim names,
Urdu poetic words and Perso-Arabic loan words with nukta under ক (k), খ (kh), গ (g), জ
(j) and ফ (ph), only for the sake of correct pronunciation and for maintaining the
sanctity of the loan word. Such attempts amount to using the Bangla writing system like
the IPA script. The nukta is, however, not in use in printed Bangla writing.

3.3.7 Visarga /biʃɔrgo/ (◌ঃ - U+0983) and Avagraha (ঽ - U+09BD)

The Visarga /biʃɔrgo/ (U+0983) is frequently used in Bangla loanwords borrowed from
Sanskrit and represents a sound very close to /h/. One example is দুঃখ
/duhkho/ “sorrow”, “unhappiness” (U+09A6 U+09C1 U+0983 U+0996).

The Avagraha "ঽ" (U+09BD) is mainly used in Sanskrit, Pāli, Prākṛt or Maithili texts
written in Bangla. It is gradually being replaced by an upper comma (e.g. নরোঽপরাণি
re-written as নরো’পরাণি). It is rarely used now even in other languages using Bangla
script. The Avagraha is not part of the LGR repertoire: it has been decided not to
retain Avagraha (ঽ) (U+09BD) because it is blocked in TLDs as per the Maximal
Starting Repertoire (MSR).

Please see Appendix II in section 11 for a complete list of Bangla consonants and their
allographs.

3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)

This note is pertinent to the use of Zero Width Joiner (ZWJ) and Zero Width Non Joiner
(ZWNJ) as used in Bangla. It needs to be noted that Nepali, Konkani and Hindi use these
two signs in a different manner.

ZWJ (U+200D) and ZWNJ (U+200C) are code points that have been provided by the
Unicode standard to instruct the rendering of a string where the script has the option
between joining and non-joining characters. Without the use of these control codes, the
string may be rendered in an alternate form from what is intended.

Use of ZWJ

• Insofar as Bangla is concerned, ZWJ is used for the proper rendering of characters
such as khaṇḍa-ta /ৎ/ as in সত্যজিৎ (satyajit) “Satyajit” and সৎ (sat) “honest”. This
is typed as follows:
ta + Hasanta + ZWJ (U+200D)

• However, ZWJ is more important where the same combination of consonantal
characters is represented differently depending upon the context. E.g. র+◌্+য
has two representations in Bangla: as র্য and as র‍্য. To get the form র্য one has to
type র+◌্+য, but for র‍্য the sequence would be র+ZWJ+◌্+য [154]. In other words,
ZWJ is used in the rendering of words demanding ya-phalā after ra, which is otherwise
not possible to type (render) because the same order of ra+hasanta+antastha ja is
used in the medial and/or final position to type repha on the consonant antastha ja,
as in কার্য (kaarjo). In order to get a ya-phalā after the consonant ra it is
therefore obligatory to use ZWJ after ra, as in র‍্যাপার (wrapper), র‍্যাশ (rash),
র‍্যালি (rally) etc. The typing sequence is given below:
ra (র) + ZWJ + hasanta (◌্) + antastha ja (য) = র‍্য

Use of ZWNJ

• ZWNJ is used in Bangla to represent the explicit Hasanta or Halant. It is used to
avoid conjunct formation in cases where there is an explicit hasanta
before the succeeding consonant.

Consonant + hasanta + ZWNJ + consonant = explicit hasanta

Example: প্রাক্‌কথন (prākkathana /prakkɔtʰon/)

The use of ZWJ/ZWNJ has been ruled out from the root zone by the [Procedure]. When used
in Bangla to create alternate renderings, the insertion of these two signs can affect
searching as well as NLP.

The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after
Hasanta) where default conjunct formation is to be explicitly restricted and the Hasanta
joining the two consonants participating in the conjunct formation needs to be explicitly
shown.
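
The typing sequences described in this section can be written out as code point strings. The following Python sketch is only an illustration; the names codepoints and has_join_control are hypothetical helpers, and the final check merely mirrors the statement that ZWJ/ZWNJ are ruled out from the root zone.

```python
HASANTA = "\u09CD"
ZWJ = "\u200D"
ZWNJ = "\u200C"
RA, YA, KA = "\u09B0", "\u09AF", "\u0995"


def codepoints(text):
    """Render a string as space-separated U+XXXX values."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


repha_on_ya = RA + HASANTA + YA              # repha placed over ya
ya_phala_on_ra = RA + ZWJ + HASANTA + YA     # ya-phalā after ra
explicit_hasanta = KA + HASANTA + ZWNJ + KA  # visible hasanta, no conjunct


def has_join_control(label):
    """True if the label contains ZWJ or ZWNJ (not permitted in the root zone)."""
    return ZWJ in label or ZWNJ in label


for name, text in [("repha", repha_on_ya),
                   ("ya-phala", ya_phala_on_ra),
                   ("explicit hasanta", explicit_hasanta)]:
    print(name, codepoints(text), has_join_control(text))
```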

3.3.9 Use of Ya-phalaa

Ya-Phalaa sequences are two instances in Bangla where Hasanta is preceded by a full
vowel (U+0985 অ - BENGALI LETTER A and U+098F এ - BENGALI LETTER E).
• অ্যা 0985 09CD 09AF 09BE
BENGALI LETTER A + BENGALI SIGN VIRAMA +
BENGALI LETTER YA + BENGALI VOWEL SIGN AA
• এ্যা 098F 09CD 09AF 09BE
BENGALI LETTER E + BENGALI SIGN VIRAMA +
BENGALI LETTER YA + BENGALI VOWEL SIGN AA
To render Ya-phalā after অ or এ, it is necessary to type U+09CD Hasanta
plus U+09AF ya immediately after the said vowel. This is a purely ligatural entity, and
the addition of Ya-phalā and ākāra is used to elicit the /æ/ sound as in English 'acid'
অ্যাসিড, 'association' অ্যাসোসিয়েশন, ‘bat’ ব্যাট, ‘fat’ ফ্যাট, ‘mat’ ম্যাট, ‘cap’ ক্যাপ etc.

The Brāhmī script, by nature, does not have Hasanta after a vowel. Hasanta is generally
described as a ‘vowel killer’, although it actually indicates the absence of a vowel
after the marked consonant. Only consonants can have the Hasanta marked. But as we see
here, Bangla ends up with a deviant orthographic feature in which Hasanta
comes immediately after a vowel in the ligatures অ্যা and এ্যা (cf. Unicode 10.0 p. 473 [100]).

3.3.10 Formation of Ra-phalaa and Ref Sequences

This case refers to the formation of repha and ra-phalā as follows:

Ra-Hasanta = (C2 H)
where C2 is either
09B0 (র - BENGALI LETTER RA) or
09F0 (ৰ - ASSAMESE LETTER RA/ Unicode name:
BENGALI LETTER RA WITH MIDDLE DIAGONAL)
H is 09CD (◌্ - BENGALI SIGN VIRAMA)

Owing to co-occurrence with HASANTA, RA either loses its own implicit vowel (REPHA),
or suppresses the implicit vowel of the preceding consonant (RA-PHALĀ). For instance,
repha = ra + Hasanta + C (e.g. র্ক i.e. ra + Hasanta + ka, as in অর্ক arka “the sun”);
ra-phalā = C + Hasanta + ra (e.g. ক্র i.e. ka + Hasanta + ra, as in চক্র chakra “cycle”).
The point is that in both cases the slot for ra could be filled by the Bangla ra র
(U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the common Hasanta
(U+09CD), whereas the shapes of repha and ra-phalā remain the same in both cases. The
LGR makes a note of this point of concern with respect to the two RAs in disguise, as
it would be completely impossible to distinguish between them with the naked eye in a
label so generated, which may consequently lead to concerns related to spoofing and
other kinds of cyber irregularities. The motive for classing these two CPs as
(blocking) variants is that fully rendered labels may mask the distinction between
Bangla ra র (U+09B0) and the Assamese ra ৰ (U+09F0). That provides the justification
for Variant Set 4, though only in the context of a following Hasanta. The difference
between the RAs is only distinguishable if one looks into their Unicode values.
Therefore, labels such as অর্ক arka, শীর্ষ śīrṣa ‘top/apex’, অভ্র abhra ‘cloud/the sky’,
শ্রম śrama ‘physical labour’ could be extremely dangerous, as the web user may never
verify the digital content (the labels) against its Unicode values/code points. This
point is made explicitly, with reference to Table 9 (of sequences, p. 36) and Table 16
(of WLE Symbols, p. 47) that are to follow. Moreover, it is noteworthy that the REPHA
can also occur with KHANDA TA. The conditions in this context of KHANDA TA are liable
to be such that the C should be either RA U+09B0 (র) (used in Bangla) or RA U+09F0 (ৰ)
(used in Assamese).
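
The concern about the two RAs can be illustrated with a short Python sketch. It is not the LGR's variant mechanism: the function names are hypothetical, and folding both the repha and the ra-phalā positions is an assumption made here for illustration (the text above restricts Variant Set 4 to the context of a following Hasanta).

```python
HASANTA = "\u09CD"
BANGLA_RA = "\u09B0"
ASSAMESE_RA = "\u09F0"


def ra_hasanta_positions(label):
    """Indices where either RA is adjacent to Hasanta (repha or ra-phalā)."""
    hits = []
    for i, ch in enumerate(label):
        if ch not in (BANGLA_RA, ASSAMESE_RA):
            continue
        followed = i + 1 < len(label) and label[i + 1] == HASANTA  # repha
        preceded = i > 0 and label[i - 1] == HASANTA               # ra-phalā
        if followed or preceded:
            hits.append(i)
    return hits


def fold_ra(label):
    """Map both RAs to U+09B0 wherever their rendered shapes coincide."""
    chars = list(label)
    for i in ra_hasanta_positions(label):
        chars[i] = BANGLA_RA
    return "".join(chars)


arka_bangla = "\u0985\u09B0\u09CD\u0995"    # অর্ক written with Bangla ra
arka_assamese = "\u0985\u09F0\u09CD\u0995"  # same rendering with Assamese ra

print(arka_bangla == arka_assamese)                    # different code points
print(fold_ra(arka_bangla) == fold_ra(arka_assamese))  # same after folding
```

Two labels that compare equal after such folding would look identical once rendered, which is the situation the blocking variant is meant to prevent.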

4. Overall Development Process and Methodology

The Neo-Brāhmī Generation Panel (NBGP) has been formed by members having
experience in Linguistics (especially in NLP / Computational linguistics), Literature,
Language History and Epigraphy. Under the Neo-Brāhmī Generation Panel, Bangla and
eight other scripts belonging to separate Unicode blocks are being taken up to assign a
separate LGR for each. However, an attempt is made to ensure that the fundamental
philosophy behind building those LGRs is consistent across all Brāhmī-derived scripts.
The present LGR will cater to multiple languages belonging to EGIDS scale 1 to 4 (see
Table 4) that use Bangla script.

The following guiding principles are used in making decisions about Bangla LGR Code-
points:

4.1 Guiding Principles

The NBGP adopts following broad principles for selection of code-points in the code-
point repertoire across the board for all the Neo-Brāhmī scripts within its ambit.

4.1.1 Inclusion Principles

4.1.1.1 Modern Usage

Every character proposed should be in the everyday usage of a particular linguistic
community. The characters, which have been encoded in the Unicode for transcription
purposes only or for archival purposes, will not be considered for inclusion in the code-
point repertoire.

4.1.1.2 Unambiguous Use

Every character proposed should have unambiguous understanding among linguists
about its usage in the language.

4.2 Exclusion Principles

The main exclusion principle is that of External Limits on Scope. These consist of
protocols or standards, which are prerequisites to the Label Generation Rule-sets. All
further principles are in fact subsumed under these limitations but have been spelt out
separately for the sake of clarity.

4.2.1 External Limits of Scope
The code point repertoire for root zone being a very special case, at the top of protocol
hierarchies, the canvas of available characters for selection as a part of the Root Zone
code point repertoire is already constrained by various protocol layers beneath it. The
following three main protocols/standards act as successive filters:

i. The Unicode Chart

Out of all the characters that are needed by the script in question, if a particular
character is not encoded in Unicode, it cannot be incorporated in the code point
repertoire. Such cases are quite rare, and especially so in the Bangla-Asamiyā-Maṇipuri
Writing System, given the elaborate and exhaustive character inclusion efforts made by
the Unicode Consortium.

ii. IDNA Protocol

Unicode being the character-encoding standard for providing the maximum possible
representation of a given script/language, it has encoded as far as possible all the
possible characters needed by the script. However, the Domain name being a
specialized case, it is governed by an additional protocol known as IDNA
(Internationalized Domain Names in Applications). The IDNA protocol excludes some
characters out of Unicode repertoire from being part of the domain names.

iii. Maximal Starting Repertoire (MSR)

The Root-zone LGR being the repertoire of characters which are going to be used for
creation of the Root-zone TLDs, which in turn constitute an even more specialized case
of domain names, the ROOT LGR procedure introduces additional exclusions on the
IDNA’s allowed set of characters.

Example: Bangla Sign Avagraha "ঽ" (U+09BD), even if allowed by the IDNA protocol, is not
permitted in the Root Zone Repertoire as per the MSR.

To sum up, the restrictions start off with admitting only such characters as are part of
the code-block of the given script/language. The IDNA Protocol further narrows this
down and finally an additional filter in the form of Maximal Starting Repertoire restricts
the character set associated with the given language even more.

4.2.1.1 No Punctuation Marks

The TLDs being identifiers, punctuation markers present in Brāhmī-based scripts will
not be included.

4.2.1.2 No Symbols and Abbreviations
Abbreviations, weights and measures and other such iconic characters like BANGLA
ISSHAR "৺" (U+09FA), BANGLA CURRENCY DENOMINATOR SIXTEEN "৹" (U+09F9) etc.
will also not be included.

4.2.1.3 No Rare and Obsolete Characters

There are characters which have been added to Unicode to accommodate rare forms,
such as the Sanskritic VOCALIC RR "ৠ" (U+09E0), VOCALIC L "ঌ" (U+098C) and
VOCALIC LL "ৡ" (U+09E1), as well as the allographic -kāra forms of the latter two
symbols, VOWEL SIGN VOCALIC L "◌ৢ" (U+09E2) and VOWEL SIGN VOCALIC LL "◌ৣ" (U+09E3).
All such characters are excluded, which complies with the Conservatism principle as laid
down in the Root Zone LGR procedure. However, in Bangla, the -kāra corresponding to
VOCALIC RR "ৠ" (U+09E0), which is VOWEL SIGN VOCALIC RR "◌ৄ" (U+09C4), is still in
active use in certain limited borrowed or Sanskritic words and is, therefore, retained.

4.2.1.4 No Stress Markers of Classical Sanskrit and Vedic

Stress markers for classical Sanskrit will not be included. This is also in consonance
with the Letter principle as laid down in the Root Zone LGR procedure.

4.2.1.5 ABNF
The Augmented Backus-Naur Formalism (ABNF) is described in Section 5.4.1 and
Appendix (Section 10.1).

5. Repertoire
The Bangla Writing System is represented in UNICODE using the Bengali (Bangla) script
name as enumerated in ISO 15924, corresponding to languages such as Asamiyā
(Assamese), Bangla (Bengali) and Maṇipuri. The BENGALI block used for Bangla-
Asamiyā-Maṇipuri in the UNICODE has 93 entries. This section details the code-point
repertoire that the Neo-Brāhmī Generation Panel [NBGP] proposes to be included in the
Bangla LGR.

It may be mentioned here that the Government of Assam submitted a proposal to the
Bureau of Indian Standards (BIS) on 26th February 2016 for dis-unification of the Bangla
and Asamiyā Scripts. The BIS, in its 8th Meeting of the Indian Language Technologies and
Products Sectional Committee, LITD 20, held on 23rd Aug 2017, decided to refer the
proposal for recognition of the Assamese script in ISO/IEC 10646 to ISO. Until the UNICODE
Consortium takes any further action, it will be assumed that the Code Point Repertoire
under Table 8 will be valid for all the three languages as above.

For each of the code points, language references have been given in the last column,
titled "References and Comment", of Table 8 (the “Code Point Repertoire”). For entire
coverage of Bangla code points, references for Bangla, Asamiyā (Assamese), Maṇipuri
(Meitei), and Bishnupriya are given. Kokborok, written in Bangla script, is not known to
have introduced many new complications, except for one particular character. Though only a
few representative languages under EGIDS Scale 1-4 have been chosen for referencing,
they together cover all the code points required for all the languages that NBGP has
considered as given under Bangla Unicode Points (as given in UNICODE 6.3).

However, before the details are presented, it is ideal to look at the Bangla Code Point
Chart from Maximal Starting Repertoire [MSR] Version 3. It may be noted that the shapes of
the reference glyphs given below in the code charts are based on one of the many fonts
designed, and are not prescriptive, because there could be some variations in actual
fonts – both UNICODE-compatible and True-Type ones. Consider the following Code
point table:

Colour convention:

- Yellow background: all characters that are included in the [MSR]
- Pinkish background: PVALID in IDNA2008 but excluded from the [MSR]
- White background: not PVALID in IDNA2008, or ineligible for the root zone (digits, hyphen)

Figure 1: Bangla Code Page from [MSR] for Bangla-Asamiyā-Maṇipuri

Given the Bangla Unicode Block as in Figure 1, among the code points that are included in
the MSR, the following symbols will need a separate treatment:
ৎ U+09CE Bangla Letter Khaṇḍa-Ta
ৰ U+09F0 Asamiyā-Bangla Letter Ra With Middle Diagonal
ৱ U+09F1 Asamiyā-Bangla Letter Ra With Lower Diagonal

5.1 Code Point Repertoire Inclusion
No.  Unicode Code Point  Glyph  Character Name  Category  Language(s) with EGIDS Value  References and Comment

1.   U+0981  ◌ঁ  BENGALI SIGN CANDRABINDU  Candrabindu  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
2.   U+0982  ◌ং  BENGALI SIGN ANUSVARA  Onushshar (Anusvāra)  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
3.   U+0983  ◌ঃ  BENGALI SIGN VISARGA  Biśarga (Visarga)  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
4.   U+0985  অ  BENGALI LETTER A  Vowel  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
5.   U+0986  আ  BENGALI LETTER AA  Vowel  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
6.   U+0987  ই  BENGALI LETTER I  Vowel  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
7.   U+0988  ঈ  BENGALI LETTER II  Vowel  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
8.   U+0989  উ  BENGALI LETTER U  Vowel  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
9.   U+098A  ঊ  BENGALI LETTER UU  Vowel  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
10.  U+098B  ঋ  BENGALI LETTER VOCALIC R  Vowel  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
11.  U+098F  এ  BENGALI LETTER E  Vowel  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
12.  U+0990  ঐ  BENGALI LETTER AI  Vowel  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
13.  U+0993  ও  BENGALI LETTER O  Vowel  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
14.  U+0994  ঔ  BENGALI LETTER AU  Vowel  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
15.  U+0995  ক  BENGALI LETTER KA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
16.  U+0996  খ  BENGALI LETTER KHA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
17.  U+0997  গ  BENGALI LETTER GA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
18.  U+0998  ঘ  BENGALI LETTER GHA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
19.  U+0999  ঙ  BENGALI LETTER NGA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
20.  U+099A  চ  BENGALI LETTER CA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
21.  U+099B  ছ  BENGALI LETTER CHA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
22.  U+099C  জ  BENGALI LETTER JA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
23.  U+099D  ঝ  BENGALI LETTER JHA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
24.  U+099E  ঞ  BENGALI LETTER NYA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
25.  U+099F  ট  BENGALI LETTER TTA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
26.  U+09A0  ঠ  BENGALI LETTER TTHA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
27.  U+09A1  ড  BENGALI LETTER DDA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
28.  U+09A1 U+09BC (U+09DC)  ড়  Normalized form of BENGALI LETTER RRA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
     Comment: 09DC is the preferred code point, however it is not available for LGR as per the standards governing this LGR development
29.  U+09A2  ঢ  BENGALI LETTER DDHA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
30.  U+09A2 U+09BC (U+09DD)  ঢ়  Normalized form of BENGALI LETTER RHA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
     Comment: 09DD is the preferred code point, however it is not available for LGR as per the standards governing this LGR development
31.  U+09A3  ণ  BENGALI LETTER NNA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
32.  U+09A4  ত  BENGALI LETTER TA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
33.  U+09A5  থ  BENGALI LETTER THA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
34.  U+09A6  দ  BENGALI LETTER DA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
35.  U+09A7  ধ  BENGALI LETTER DHA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
36.  U+09A8  ন  BENGALI LETTER NA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
37.  U+09AA  প  BENGALI LETTER PA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
38.  U+09AB  ফ  BENGALI LETTER PHA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
39.  U+09AC  ব  BENGALI LETTER BA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
40.  U+09AD  ভ  BENGALI LETTER BHA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
41.  U+09AE  ম  BENGALI LETTER MA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
42.  U+09AF  য  BENGALI LETTER YA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
43.  U+09AF U+09BC (U+09DF)  য়  Normalized form of BENGALI LETTER YYA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
     Comment: 09DF is the preferred code point, however it is not available for LGR as per the standards governing this LGR development
44.  U+09B0  র  BENGALI LETTER RA  Consonant  1 Bangla, 2 Maṇipuri  [112], [122]
45.  U+09B2  ল  BENGALI LETTER LA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
46.  U+09B6  শ  BENGALI LETTER SHA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
47.  U+09B7  ষ  BENGALI LETTER SSA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
48.  U+09B8  স  BENGALI LETTER SA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
49.  U+09B9  হ  BENGALI LETTER HA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
50.  U+09BE  ◌া  BENGALI VOWEL SIGN AA  Kāra (Mātrā)  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
51.  U+09BF  ◌ি  BENGALI VOWEL SIGN I  Kāra (Mātrā)  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
52.  U+09C0  ◌ী  BENGALI VOWEL SIGN II  Kāra (Mātrā)  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
53.  U+09C1  ◌ু  BENGALI VOWEL SIGN U  Kāra (Mātrā)  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
54.  U+09C2  ◌ূ  BENGALI VOWEL SIGN UU  Kāra (Mātrā)  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
55.  U+09C3  ◌ৃ  BENGALI VOWEL SIGN VOCALIC R  Kāra (Mātrā)  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
56.  U+09C4  ◌ৄ  BENGALI VOWEL SIGN VOCALIC RR  Kāra (Mātrā)  1 Bangla, 2 Assamese  [112], [125]
57.  U+09C7  ◌ে  BENGALI VOWEL SIGN E  Kāra (Mātrā)  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
58.  U+09C8  ◌ৈ  BENGALI VOWEL SIGN AI  Kāra (Mātrā)  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
59.  U+09CB  ◌ো  BENGALI VOWEL SIGN O  Kāra (Mātrā)  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
60.  U+09CC  ◌ৌ  BENGALI VOWEL SIGN AU  Kāra (Mātrā)  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
61.  U+09CD  ◌্  BENGALI SIGN VIRAMA  Hasanta (=Halant) / Virāma (=Dā̃ri)  1 Bangla, 2 Assamese, 2 Maṇipuri  [112], [122], [125]
62.  U+09CE  ৎ  BENGALI LETTER KHANDA TA  Consonant  1 Bangla, 2 Maṇipuri, 2 Assamese  [112], [122], [125]
63.  U+09F0  ৰ  BENGALI LETTER RA WITH MIDDLE DIAGONAL  Consonant  2 Assamese  [125]
64.  U+09F1  ৱ  BENGALI LETTER RA WITH LOWER DIAGONAL  Consonant  2 Maṇipuri, 2 Assamese  [122], [125]

Table 8: Bangla Code-Point Repertoire

Apart from the above individual code-points, the Neo-Brāhmī Generation Panel also
proposes some specific sequences which enable conditional inclusion of the "Bangla
LETTER A and E" followed by Bangla SIGN VIRAMA and Bangla LETTER YA again
followed by Bangla VOWEL SIGN AA in the repertoire for enabling inclusion of /æ/
sound as in English ‘bat’, ‘cat’ etc.

Sr. No.  Unicode Code Points  Sequence  Character Names  Example languages using the code-point (not exhaustive list)  Reference

S1.  0985 09CD 09AF 09BE  অ্যা  BENGALI LETTER A + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA  Bangla, Assamese  [112], [122]
S2.  098F 09CD 09AF 09BE  এ্যা  BENGALI LETTER E + BENGALI SIGN VIRAMA + BENGALI LETTER YA + BENGALI VOWEL SIGN AA  Bangla  [112]

Table 9: Sequences

5.2 Code Point Repertoire Exclusion

There are some characters of the Bangla script that find place in the Unicode but have
not been included in the repertoire in the LGR proposal. The reason for excluding ঌ
(U+098C) and ◌ৗ (U+09D7) is that they are rare and obsolete characters.

Sr. No.  Code Point  Glyph  Character Name               Note

1.       U+098C      ঌ      BENGALI LETTER VOCALIC L     Limited or declining use
2.       U+09D7      ◌ৗ     BENGALI AU LENGTH MARK       Limited or declining use

Table 10: Excluded Code Points

5.3 Code point not used alone
BENGALI SIGN NUKTA U+09BC (see 3.3.6) is excluded from the repertoire as a stand-alone
code point since it will never be used alone. It is used only within the sequences that
give the normalized form of the three special characters ড়, ঢ়, য়.

Unicode Code Point  Glyph  Character Name      Reason for exclusion

U+09BC              ◌়     BENGALI SIGN NUKTA  Never used alone. Only used together with
                                               U+09A1 ড, U+09A2 ঢ, U+09AF য to form
                                               ড়, ঢ়, য় respectively

Table 10b: Excluded Code Points
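
The restriction on Nukta amounts to a context check on the preceding code point. The following Python sketch shows one way to express it; the function name nukta_in_context is hypothetical, and the check is an illustration rather than the rule as encoded in the LGR XML.

```python
NUKTA = "\u09BC"
# The only bases after which Nukta may appear: DDA, DDHA and YA.
NUKTA_BASES = {"\u09A1", "\u09A2", "\u09AF"}


def nukta_in_context(label):
    """True if every Nukta in the label directly follows an allowed base."""
    for i, ch in enumerate(label):
        if ch == NUKTA and (i == 0 or label[i - 1] not in NUKTA_BASES):
            return False
    return True


print(nukta_in_context("\u09A1\u09BC"))  # DDA + Nukta (RRA)
print(nukta_in_context("\u0995\u09BC"))  # KA + Nukta, not in the repertoire
```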

5.4 The Basis of Present IDN

The present LGR has also benefited from the earlier work on IDN for Bangla (different
versions) done for .भारत or .ভারত drafted between 20.11.2009 and 18.07.2013.

5.4.1 The ABNF Variables

The Augmented Backus-Naur Formalism (ABNF) begins with the following variables:
C → Consonant
V → Vowel
M → kāra (Mātrā)
B → Anusvāra (/onuʃʃār/)
D → Candrabindu
X → Visarga (/biʃɔrgo/)
H → Hasanta / Virāma
Z → Khaṇḍa Ta

The Augmented Backus-Naur Formalism (ABNF) will use the following operators:

Sr. Number  Operator  Function

1           “|”       Alternative
2           “[ ]”     Optional
3           “*”       Variable Repetition
4           “( )”     Sequence Group

Table 11: The ABNF Formalism

In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla
are given to facilitate understanding.

5.4.2 The Vowel Sequence

In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Bangla
are given. To facilitate understanding of other Brahmi script users, equivalents in
Devanāgarī are provided, wherever necessary.

A vowel sequence is made up of a single vowel. It may optionally be followed by an
Anusvāra /onuʃʃār/ (B), Candrabindu (D) or a Visarga /biʃɔrgo/ (X).
The number of D, B or X that can follow a V in Bangla is not necessarily restricted to
one. Going by the rules illustrated in this document, formations such as VDD,
VBB and VXX are invalid orthographic units. However, a candrabindu followed by an
anusvāra, or a candrabindu followed by a visarga, is a valid and possible sequence,
as in হ্যাঁংচা ‘hænchā’ and হ্যাঁঃ ‘hæn’ respectively.

In other words, the possibility of a Visarga or Anusvāra (/onuʃʃār/) following a
Candrabindu exists in Bangla. A vowel can optionally be followed by a combination of
Hasanta / Virāma [H] and Consonant [C] to form a Ya-phalā. “Ya-phalā is a presentation
form of U+09AF Bangla letter য or ‘ya’. Represented by the sequence <U+09CD ◌্ BENGALI
SIGN VIRAMA (Bangla Hasanta or Virāma), U+09AF য BENGALI LETTER YA>, Ya-phalā has a
special form: ◌্য. Again, when combined with U+09BE ◌া BENGALI VOWEL SIGN AA (i.e.
‘aa’ (ā)), it is used for transcribing [æ] as in the “a” in the English word “bat”,
written in Bangla as ব্যাট.”

A Vowel-sequence admits the following combinations:

5.4.2.1 A Single Vowel

Examples: V অ अ

5.4.2.2 A Vowel with Conditions

A Vowel can optionally be followed by Anusvāra [B] or Candrabindu [D] or Visarga [X]
or Candrabindu+ Anusvāra [DB] or Candrabindu+ Visarga [DX] or combination of
Hasanta (or Virama) [H] followed by Consonant [C] followed by kāra (Mātrā) [M].

Examples:

VB অং अं
VD অঁ अँ
VX অঃ अः
VDB অঁং अँं
VDX অঁঃ अँः
VHCM অ্যা / এ্যা

5.4.2.3 VHCM Sequence

A VHCM sequence can optionally be followed by Anusvāra [B] or Candrabindu [D] or
Visarga [X] or Candrabindu+ Anusvāra [DB] or Candrabindu+ Visarga [DX].

Examples:
VHCMB অ্যাং / এ্যাং
VHCMD অ্যাঁ / এ্যাঁ
VHCMX অ্যাঃ / এ্যাঃ
VHCMDB অ্যাঁং / এ্যাঁং
VHCMDX অ্যাঁঃ / এ্যাঁঃ
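
The vowel-sequence combinations listed in 5.4.2 can be sketched as a pattern over the ABNF symbols themselves. This is only an illustration and not part of the LGR: it works on strings of symbols such as "VDB" rather than on Bangla text, and the names SIGNS, VOWEL_SEQUENCE and is_vowel_sequence are hypothetical.

```python
import re

# Optional trailing signs: Anusvara (B), Candrabindu (D), Visarga (X),
# or Candrabindu followed by Anusvara or Visarga (DB, DX).
SIGNS = r"(?:DB|DX|B|D|X)?"

# A vowel, optionally followed by H C M (the Ya-phala + aa-kara case of
# 5.4.2.2), optionally followed by one of the sign combinations above.
VOWEL_SEQUENCE = re.compile(r"V(?:HCM)?" + SIGNS)


def is_vowel_sequence(symbols: str) -> bool:
    """Return True if the symbol string is one of the listed vowel sequences."""
    return VOWEL_SEQUENCE.fullmatch(symbols) is not None


if __name__ == "__main__":
    for s in ("V", "VB", "VDB", "VHCM", "VHCMDX", "VDD", "VBD"):
        print(s, is_vowel_sequence(s))
```

Under this sketch, repeated signs such as VDD or VBB are rejected because the sign group can match only once.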

5.4.3 The Consonant Sequence

5.4.3.1 A Single Consonant (C)

Example: C ক क

5.4.3.2 A Consonant with Conditions

A Consonant is optionally followed by a dependent vowel sign / kāra (Mātrā) [M] or
Anusvāra [B] or Candrabindu [D] or Visarga [X] or Hasanta (also known as
Virāma) [H] or Candrabindu + Anusvāra [DB] or Candrabindu + Visarga [DX].

Example:
CM কি / কৃ कि / कृ
CB কং कं
CD কঁ कँ
CX কঃ कः
CH ক্ क् (Pure consonant)
CDB কঁং कँं
CDX কঁঃ कँः

5.4.3.3 CM Sequence
A CM sequence can be optionally followed by B, D, X, DB or DX.

Example:
CMB কীং / কৃং कीं / कृं
CMD কাঁ काँ
CMX বীঃ वीः
CMDB কাঁং काँं
CMDX কাঁঃ काँः

5.4.3.4 Sequence of Consonants

A sequence of up to four consonants joined by Hasanta (also known as Virāma):

*3(CH)C

Example:
CHC ন্ত → ন + ◌্ + ত न + ◌् + त
CHCHC ন্ত্র → ন + ◌্ + ত + ◌্ + র न + ◌् + त + ◌् + र
CHCHCHC ন্ত্র্য → ন + ◌্ + ত + ◌্ + র + ◌্ + য न + ◌् + त + ◌् + र + ◌् + य

5.4.3.5 Subsets:

In considering its subsets, we take the combination CHC as a representative example;
the same applies equally to CHCHC and CHCHCHC.

[A]. The combination may be followed by M, B, D, X, DB or DX.

Example:
CHCM ক্কী → ক ◌্ ক ◌ী क्की → क ◌् क ◌ी
CHCB ক্কং → ক ◌্ ক ◌ং क्कं → क ◌् क ◌ं
CHCD ক্কঁ → ক ◌্ ক ◌ঁ क्कँ → क ◌् क ◌ँ
CHCX ক্কঃ → ক ◌্ ক ◌ঃ क्कः → क ◌् क ◌ः
CHCDB ক্কঁং → ক ◌্ ক ◌ঁ ◌ং क्कँं → क ◌् क ◌ँ ◌ं
CHCDX ক্কঁঃ → ক ◌্ ক ◌ঁ ◌ঃ क्कँः → क ◌् क ◌ँ ◌ः

[B]. *3(CH)CM may further be followed by a B, D, X, DB or DX

Example:
CHCMB ক্কীং → ক ◌্ ক ◌ী ◌ং क्कीं → क ◌् क ◌ी ◌ं
      ক্কৃং → ক ◌্ ক ◌ৃ ◌ং क्कृं → क ◌् क ◌ृ ◌ं
CHCMD ক্কাঁ → ক ◌্ ক ◌া ◌ঁ क्काँ → क ◌् क ◌ा ◌ँ
CHCMX ক্কীঃ → ক ◌্ ক ◌ী ◌ঃ क्कीः → क ◌् क ◌ी ◌ः
CHCMDB ক্কাঁং → ক ◌্ ক ◌া ◌ঁ ◌ং क्काँं → क ◌् क ◌ा ◌ँ ◌ं
CHCMDX ক্কাঁঃ → ক ◌্ ক ◌া ◌ঁ ◌ঃ क्काँः → क ◌् क ◌ा ◌ँ ◌ः
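
The consonant-sequence combinations of 5.4.3 can be summarised in the same symbol notation. The following is a minimal sketch under stated assumptions: it matches strings of ABNF symbols (not Bangla text), it allows a trailing H only after a single consonant because that is the only case the text lists (CH, the pure consonant), and the names CONSONANT_SEQUENCE and is_consonant_sequence are hypothetical.

```python
import re

# Optional trailing signs, as in the vowel sequence: B, D, X, DB or DX.
SIGNS = r"(?:DB|DX|B|D|X)?"

# Either a pure consonant (C H), or up to three "C H" pairs followed by a
# final consonant (*3(CH)C), an optional kara (M) and optional signs.
CONSONANT_SEQUENCE = re.compile(r"CH|(?:CH){0,3}CM?" + SIGNS)


def is_consonant_sequence(symbols: str) -> bool:
    """Return True if the symbol string is one of the listed consonant sequences."""
    return CONSONANT_SEQUENCE.fullmatch(symbols) is not None


if __name__ == "__main__":
    for s in ("C", "CH", "CM", "CHC", "CHCHCHC", "CHCMDB", "CHCHCHCHC", "CMM"):
        print(s, is_consonant_sequence(s))
```

A cluster of five consonants (CHCHCHCHC) fails because the repetition of CH is capped at three, which mirrors the *3(CH)C notation.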

5.4.4 The Khaṇḍa-Ta sequence

5.4.4.1 A single ‘Khaṇḍa’-Ta (Z)

Example: Z ৎ = x

5.4.4.2 A Khaṇḍa Ta Combination10

A Khaṇḍa Ta can be preceded by a consonant and Hasanta (also known as Virāma):

[CH]Z

Example:
র + ◌্ + ৎ = র্ৎ as in ভর্ৎসনা (bhartsanā) "scolding"
Note: In this context of KHANDA TA, the C must be either RA U+09B0 (র) (used in Bangla)
or RA U+09F0 (ৰ) (used in Assamese).
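
The [CH]Z combination and its note can be checked directly on code points, since the text names them all. A small sketch, with the hypothetical names KHANDA_TA_SEQUENCE and is_khanda_ta_sequence:

```python
import re

# Optional "RA + Hasanta" prefix, where RA is U+09B0 (Bangla) or
# U+09F0 (Assamese), followed by Khanda Ta (U+09CE).
KHANDA_TA_SEQUENCE = re.compile("(?:[\u09B0\u09F0]\u09CD)?\u09CE")


def is_khanda_ta_sequence(text: str) -> bool:
    """Return True for a bare Khanda Ta or for RA + Hasanta + Khanda Ta."""
    return KHANDA_TA_SEQUENCE.fullmatch(text) is not None


if __name__ == "__main__":
    print(is_khanda_ta_sequence("\u09CE"))              # bare Khanda Ta
    print(is_khanda_ta_sequence("\u09B0\u09CD\u09CE"))  # ra + Hasanta + Khanda Ta
    print(is_khanda_ta_sequence("\u0995\u09CD\u09CE"))  # ka + Hasanta + Khanda Ta
```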

5.4.5 Special Cases S and P

Two special cases involving sequences (referred to as S and P in Table 16 under Section
7) are described briefly here. Consider S first. There are two instances in Bangla where
Hasanta (U+09CD) is preceded by a full vowel (U+0985 অ BENGALI LETTER A and U+098F এ
BENGALI LETTER E). To render Ya-phalā after অ or এ, it is necessary to type U+09CD plus
U+09AF ya after the said vowel. This is a purely ligatural entity: Ya-phalā together
with ā-kāra is used to represent the /æ/ sound as in English ‘bat’, ‘fat’ etc. Brāhmī-
derived scripts by nature do not have a Hasanta after a vowel. Hasanta is generally
described as a ‘vowel killer’, although it actually indicates the absence of a vowel
after the marked consonant. Only consonants can be marked with Hasanta. Bangla
nevertheless ends up with a deviant orthographic feature in which Hasanta comes
immediately after a vowel, in the ligatures অ্যা and এ্যা (cf. Unicode 10.0, p. 473 [100]).

10 Refer to Rule P in Section 7, Table 16.

The other case, referred to as P in Table 16, concerns the formation of repha and
ra-phalā in the script. Owing to its co-occurrence with HASANTA, RA either loses its own
implicit vowel (REPHA) or suppresses the implicit vowel of the preceding consonant
(RA-PHALĀ). For instance, repha = ra + Hasanta + C (e.g. র্ক, i.e. ra + Hasanta + ka, as
in অর্ক arka “the sun”); ra-phalā = C + Hasanta + ra (e.g. ক্র, i.e. ka + Hasanta + ra, as
in চক্র cakra “cycle”). The point is that in both cases the slot for ra could be filled by
the Bangla ra র (U+09B0) or the Assamese ra ৰ (U+09F0), followed or preceded by the
common Hasanta (U+09CD), whereas the shapes of repha and ra-phalā remain the same.

6. Variants

This section discusses the variants in the Bangla script. The NBGP categorizes
confusable variants into two groups.

Group 1: Confusing due to pure visual similarity.

Group 2: Confusing due to deviation from normally perceived character formations by
larger linguistic community.

For Group 1, only identical code points are defined as variants. Cases that are
confusable but not identical are not proposed, as another panel (the String Similarity
Assessment Panel) is entrusted to deal with them. Cases which belong to Group 2,
however, are proposed as variants. These are not cases of mere visual similarity, as
they involve some deviation from the widely accepted norms of Bangla Akshar
formation. They can cause confusion even to a careful observer and are hence
proposed as variants.

Variants arise in a script when two or more forms that look the same are stored as
different code points or code point sequences. In Bangla the e-kāra, ā-kāra and o-kāra
have different code points. One can type the o-kāra with a consonant in one go, or
obtain the same result by typing e-kāra and ā-kāra as two separate keys. A reader cannot
differentiate between the two forms of ko (কো), one typed with a single key and the other
typed with two different keys. However, this will not be considered a case of variants,
because a kāra followed by a kāra is not allowed.

6.1 In Script Variants

However, we propose two cases of true in-script variants in Bangla script.

CASE I:
As far as true variants in Bangla are concerned, we may draw attention to cases
wherein থ (tha, U+09A5) appears after Hasanta in a conjunct with স (sa, U+09B8) or
ন (na, U+09A8):

1. স + Hasanta + থ (U+09B8 + U+09CD + U+09A5) versus
   স + Hasanta + হ (U+09B8 + U+09CD + U+09B9)
2. ন + Hasanta + থ (U+09A8 + U+09CD + U+09A5) versus
   ন + Hasanta + হ (U+09A8 + U+09CD + U+09B9)

The above combinations, if written in traditional orthography, could be a little
confusing, since the থ (tha) in the conjunct appears like a হ (ha). The conjunct could be
in the initial, medial or final position (as shown below in example 1). It could also be
typed wrongly, in the belief that it was a হ (ha) U+09B9, which increases the risks in
label writing and identification.

Examples:
1. স্থ and স্হ (as in স্থান sthāna, স্থূল sthūla, স্বাস্থ্য svāsthya, অস্থায়ী asthāyī)
2. ন্থ and ন্হ (as in গ্রন্থ grantha)

Fonts which represent the traditional Bangla writing system tend to create this
problem. Therefore, these may be taken as cases of variants in Bangla.
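
One way to see what CASE I implies for two candidate labels is to reduce both to a common comparison key. The sketch below is an assumption about how such a check could be written and is not how the LGR XML expresses variants; the names case1_key and are_case1_variants are hypothetical.

```python
import re

HASANTA = "\u09CD"
SA, NA = "\u09B8", "\u09A8"
THA, HA = "\u09A5", "\u09B9"

# sa or na, then Hasanta, then ha: the sequence that can be mistaken for
# the same conjunct written with tha.
_CONJUNCT_WITH_HA = re.compile("([" + SA + NA + "]" + HASANTA + ")" + HA)


def case1_key(label: str) -> str:
    """Map sa/na + Hasanta + ha to sa/na + Hasanta + tha; leave the rest as is."""
    return _CONJUNCT_WITH_HA.sub(lambda m: m.group(1) + THA, label)


def are_case1_variants(a: str, b: str) -> bool:
    """Return True if two different labels collapse to the same key."""
    return a != b and case1_key(a) == case1_key(b)


if __name__ == "__main__":
    sthana_tha = "\u09B8\u09CD\u09A5\u09BE\u09A8"  # sa + Hasanta + tha + aa + na
    sthana_ha = "\u09B8\u09CD\u09B9\u09BE\u09A8"   # sa + Hasanta + ha + aa + na
    print(are_case1_variants(sthana_tha, sthana_ha))
```

A হ that does not follow স or ন plus Hasanta is left untouched, so ordinary uses of ha do not collide with tha.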

CASE II:
Another interesting example of a variant is encountered in the ra + Hasanta and
Hasanta + ra combinations when writing labels in the Bangla script (for languages such
as Bangla, Assamese and Maṇipuri). The variant cases arise in typing ‘repha’ (involving
ra + Hasanta) and ‘ra-phalā’ (involving Hasanta + ra).

‘Repha’ could be formed by two sequences (mainly because both Assamese and Bangla
are encoded in the same Unicode block, and ‘B_RA’ as well as ‘A_RA’ refer to the same
phonetic element). Here, the final ligatures look the same, and will be as follows:
(1) B_RA + H + C
(2) A_RA + H + C

Where
B_RA = U+09B0 BENGALI LETTER RA (র) or
A_RA = U+09F0 BENGALI LETTER RA WITH MIDDLE DIAGONAL (ৰ)
H = U+09CD BENGALI SIGN VIRAMA (◌্)
C = any consonant (theoretically)
Example:

Sequence 1 (using Bangla RA)          Ligature 1   Sequence 2 (using Assamese RA)        Ligature 2
U+09B0 (র) U+09CD (◌্) U+0995 (ক)     র্ক          U+09F0 (ৰ) U+09CD (◌্) U+0995 (ক)     ৰ্ক
U+09B0 (র) U+09CD (◌্) U+09A0 (ঠ)     র্ঠ          U+09F0 (ৰ) U+09CD (◌্) U+09A0 (ঠ)     ৰ্ঠ

Table 12: Example of Repha

Note: As Bangla and Assamese ক and ঠ look exactly the same, the resultant
combinations with 'Repha' look identical. Addition of 'Repha' does not make any
difference.

‘Ra-phalā’ could be formed by two sequences on similar grounds, and the final ligatures
would look the same:
(1) C1 + H + B_RA
(2) C1 + H + A_RA
Where
C1 = any consonant except Khaṇḍa-ta
Example:

Sequence 1 (using Bangla RA)          Ligature 1   Sequence 2 (using Assamese RA)        Ligature 2
U+0995 (ক) U+09CD (◌্) U+09B0 (র)     ক্র          U+0995 (ক) U+09CD (◌্) U+09F0 (ৰ)     ক্ৰ
U+09A8 (ন) U+09CD (◌্) U+09B0 (র)     ন্র          U+09A8 (ন) U+09CD (◌্) U+09F0 (ৰ)     ন্ৰ

Table 13: Example of Ra-phalā

As the Assamese and Bangla Repha and Ra-phalā conjunct forms look the same, they
could confuse end-users. Hence, the repha and ra-phalā cases need to be defined as
variants.

The NBGP decided to define র and ৰ as variant code points, since a single variant set
containing র and ৰ covers all cases. This will create blocked variant labels: e.g. if
someone registers “ররর”, the variant label “ৰৰৰ” will be generated and blocked, and
vice versa. However, blocking applies only at the label level; if someone else needs to
register other labels, e.g. ৰৰ or ৰৰৰৰ, that is still possible.

After the public comment, the NBGP reviewed the disposition for the র and ৰ variants.
These code points are used equally. Therefore, for the sake of usability, the NBGP
decided that র and ৰ are “allocatable” variants. In addition, the code points 09B0 and
09F0 should not be used in the same label; therefore, the no-mix rule should be
implemented.
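
The র / ৰ variant set and the no-mix rule can be illustrated with two small helpers. This is a sketch of the behaviour described above, not of the LGR XML itself, and the names mixes_ra and ra_variant are hypothetical.

```python
B_RA = "\u09B0"  # BENGALI LETTER RA
A_RA = "\u09F0"  # BENGALI LETTER RA WITH MIDDLE DIAGONAL (Assamese ra)

# Translation table that swaps the two RA code points.
_SWAP_RA = str.maketrans({B_RA: A_RA, A_RA: B_RA})


def mixes_ra(label: str) -> bool:
    """Return True if the label uses both RA code points (no-mix violation)."""
    return B_RA in label and A_RA in label


def ra_variant(label: str) -> str:
    """Return the variant label obtained by swapping every RA in the label."""
    if mixes_ra(label):
        raise ValueError("label mixes U+09B0 and U+09F0")
    return label.translate(_SWAP_RA)


if __name__ == "__main__":
    print(ra_variant(B_RA * 3) == A_RA * 3)
    print(mixes_ra(B_RA + A_RA))
```

Because mixed labels are rejected, each valid label containing RA has exactly one variant under this set: the same label with every RA swapped.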

6.2 Cross Script Variants
A brief cross-script study for Bangla has been done with respect to sister scripts such
as Devanāgarī, Gurmukhī and Odia11 (formerly Oriya), keeping in mind the visual and
technical confusion they may cause as labels on the web domain. Moreover, there is no
in-script variant in Bangla as far as the orthography is concerned. The following
characters are being proposed by the NBGP as variants. There are certain other
characters which are somewhat similar, but they have not been included here. They
have been provided in the Appendix (10.3) for reference.

1. Bangla and Nāgarī / Devanāgarī Script

Bangla          Devanāgarī
ম U+09AE        म U+092E
◌ি U+09BF       ◌ि U+093F

Table 14 - Bangla and Devanāgarī cross-script variant code points

2. Bangla and Gurmukhī Script

Bangla          Gurmukhī
ম U+09AE        ਸ U+0A38
◌ি U+09BF       ◌ਿ U+0A3F

Table 15 - Bangla and Gurmukhī cross-script variant code points
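
Tables 14 and 15 can be read as a small lookup from Bangla code points to their cross-script variants. The sketch below assumes, for illustration only, that a whole label has a cross-script variant when every one of its code points appears in the table for the target script; the names CROSS_SCRIPT and cross_script_variant, and the use of ISO 15924 codes as keys, are hypothetical.

```python
# Cross-script variant code points from Tables 14 and 15, keyed by the
# ISO 15924 code of the target script.
CROSS_SCRIPT = {
    "Deva": {"\u09AE": "\u092E", "\u09BF": "\u093F"},
    "Guru": {"\u09AE": "\u0A38", "\u09BF": "\u0A3F"},
}


def cross_script_variant(label: str, script: str):
    """Return the variant label in the target script, or None if any
    code point of the label has no variant in that script."""
    table = CROSS_SCRIPT[script]
    if not label or any(ch not in table for ch in label):
        return None
    return "".join(table[ch] for ch in label)


if __name__ == "__main__":
    mi = "\u09AE\u09BF"  # ma + vowel sign i
    print(cross_script_variant(mi, "Deva") == "\u092E\u093F")
    print(cross_script_variant("\u0995\u09BF", "Deva"))
```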

7. Whole Label Evaluation Rules (WLE)

This section provides the WLEs that are required by all the languages mentioned in
section 3.2 when written in Bangla12 Script. The rules have been drafted in such a
way that they can be easily translated into the LGR specifications.

11 Unicode uses Oriya for the script, although Odia is now the official term used.
12 As used by the Unicode, denoting and including both Assamese and Maṇipuri.
Below are the symbols used in the WLE rules, for each of the "Indic Syllabic
Category" as mentioned in the table provided in Code point repertoire (Section 5.1).

C → Consonant
M → Kāra (Mātrā)
V → Vowel
B → Anusvāra
D → Candrabindu
X → Visarga
H → Hasanta
Z → Khaṇḍa Ta
S → S1, S2 (from Table 9): (a/e) Ya-phalā (V1 H C1 M1)
    where
    V1 is either 0985 (অ - BENGALI LETTER A) or 098F (এ - BENGALI LETTER E)
    H is 09CD (◌্ - BENGALI SIGN VIRAMA)
    C1 is 09AF (য - BENGALI LETTER YA)
    M1 is 09BE (◌া - BENGALI VOWEL SIGN AA)
    S1 and S2 are valid even though they are not allowed by the other context rules.
P → Ra-Hasanta (C2 H)
    where
    C2 is either 09B0 (র - BENGALI LETTER RA) or 09F0 (ৰ - ASSAMESE LETTER RA;
    Unicode name: BENGALI LETTER RA WITH MIDDLE DIAGONAL)
    H is 09CD (◌্ - BENGALI SIGN VIRAMA)

Table 16 - Symbols used in WLE rules

It is also worth mentioning here that in Bangla, the consonant letters (or
graphemes) are physically joined to form “clusters” that could theoretically conjoin
two to four consonants and combine to create new shapes. Dash and Chaudhuri
(1998) state that there are “nearly 380 unique consonant...clusters”, of which
bi-consonantal combinations number 290, three-letter combinations account for another
80, and the rarer ones with four letters number 10 more [136, Pg 4]. More details of
such combinations can be found in Pabitra Sarkar (1993) [135].

7.1 Final Set of WLE Rules

Given the prevalent patterns in Bangla and the various restrictions, the specific WLE
rules that need to be implemented are listed below.

1. C is a set of C and CN where CN is the set of normalized forms of {ড়, ঢ়, য়}.

2. H: must be preceded by C
Example: #
3. M: must be preceded by C
Example: কা
4. D: must be preceded by either of V, C, or M
Example: আঁ, খঁ, খাঁ, হ্যাঁ
5. X: must be preceded by either of V, C, M or D
Example: উঃ, খঃ, বঃ, ◌াঃ, দুঁঃ
6. B: must be preceded by either of V, C, M or D
Example: আং, ইং, কং
7. Z: must be preceded by V, C, M, D, B, X or P
Example: ইৎ, কৎ, ◌াৎ, ◌াঁৎ, প6ৎ, rৎ (S is not listed because S ends with M; Z
may therefore also follow S).
8. V: CANNOT be preceded by H
Details in 7.1.1 Case of V preceded by H
9. S: CANNOT be preceded by H
10. 09B0(র)and 09F0(ৰ)CANNOT be mixed
Details in 6.1 CASE II

Now let us elaborate each rule with examples from the script, keeping in mind
the Bangla, Assamese and Maṇipuri communities. Some combinations of
characters may seem unrealistic or rare in usage, and may not be attested, but
there is no harm in allowing such ligatures, because any user can easily create
them.
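
The ten rules above can be sketched as a small validator. This is an illustration under loudly stated assumptions: the code point ranges in CLASSES are rough approximations taken from the Bengali Unicode block and are not the repertoire of Section 5, S is treated like M for whatever follows it (because S ends with M), and all names (CLASSES, ALLOWED_BEFORE, tokenize, is_valid_label) are hypothetical.

```python
import re

B_RA, A_RA = "\u09B0", "\u09F0"


def _chars(*ranges):
    """Build a set of characters from inclusive (low, high) code point ranges."""
    return {chr(cp) for lo, hi in ranges for cp in range(lo, hi + 1)}


# ASSUMPTION: approximate classes, not the LGR repertoire of Section 5.
CLASSES = {
    "V": _chars((0x0985, 0x098B), (0x098F, 0x0990), (0x0993, 0x0994)),
    "C": _chars((0x0995, 0x09A8), (0x09AA, 0x09B0), (0x09B2, 0x09B2),
                (0x09B6, 0x09B9), (0x09F0, 0x09F1)),
    "M": _chars((0x09BE, 0x09C3), (0x09C7, 0x09C8), (0x09CB, 0x09CC)),
    "B": {"\u0982"}, "D": {"\u0981"}, "X": {"\u0983"},
    "H": {"\u09CD"}, "Z": {"\u09CE"},
}

# S = (a/e) + Hasanta + ya + aa-kara; CN = dda/ddha/ya + nukta (rule 1).
TOKEN = re.compile(
    "(?P<S>[\u0985\u098F]\u09CD\u09AF\u09BE)"
    "|(?P<CN>[\u09A1\u09A2\u09AF]\u09BC)"
    "|(?P<one>.)", re.DOTALL)

# Rules 2-7: symbols that may immediately precede each symbol.
ALLOWED_BEFORE = {"H": "C", "M": "C", "D": "VCM",
                  "X": "VCMD", "B": "VCMD", "Z": "VCMDBX"}


def tokenize(label):
    """Return a list of (symbol, text) pairs, or None on an unknown code point."""
    tokens = []
    for m in TOKEN.finditer(label):
        if m.lastgroup == "S":
            tokens.append(("S", m.group()))
        elif m.lastgroup == "CN":
            tokens.append(("C", m.group()))
        else:
            ch = m.group()
            sym = next((s for s, cs in CLASSES.items() if ch in cs), None)
            if sym is None:
                return None
            tokens.append((sym, ch))
    return tokens


def is_valid_label(label):
    """Apply rules 2-10 to a label; S counts as M for what follows it."""
    if B_RA in label and A_RA in label:           # rule 10: no mixing of RA
        return False
    tokens = tokenize(label)
    if not tokens:
        return False
    prev, prev_text, before_prev_text = None, "", ""
    for sym, text in tokens:
        if sym in ALLOWED_BEFORE:
            ok = prev is not None and prev in ALLOWED_BEFORE[sym]
            # Rule 7: Z may also follow P, i.e. RA + Hasanta.
            if sym == "Z" and prev == "H" and before_prev_text in (B_RA, A_RA):
                ok = True
            if not ok:
                return False
        elif sym in ("V", "S") and prev == "H":   # rules 8 and 9
            return False
        before_prev_text, prev_text = prev_text, text
        prev = "M" if sym == "S" else sym
    return True
```

For instance, is_valid_label("\u0995\u09BE") accepts ka + aa-kara, while is_valid_label("\u09BE\u0995") rejects a label that begins with a kāra.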

7.1.1 Case of V Preceded by H:

There could be cases involving multi-word domains where V may need to be
allowed to follow an H,

e.g. ব্যাঙ্কঅফ্ইন্ডিয়া /bæŋk ʌv ɪndiə/ (U+09AC U+09CD U+09AF U+09BE U+0999
U+09CD U+0995 U+0985 U+09AB U+09CD U+0987 U+09A8 U+09CD U+09A1
U+09BF U+09DF U+09BE) (meaning: Bank of India)

This is a case where two different words are joined together, the first of which
ends with an H (অফ্) and the second of which begins with a V (ইন্ডিয়া). Some
sections of the linguistic community require the explicit presence of H for full
representation of the sound intended. However, by and large, the form of the
first word without an H (U+09CD) is considered enough for full representation of
the sound intended for the first word.

This is a unique situation necessitated by the lack of hyphen, space or the Zero
Width Non-joiner character in the permissible set of characters in the Root zone
repertoire. Otherwise, V is never required to be allowed to follow an H.
Permitting this may create a perceptive similarity between two labels (with and
without H) for the majority of the linguistic communities; hence this is explicitly
prohibited by the NBGP.

In future, depending on the prevailing requirements from the community, a
future NBGP may consider revisiting this rule.

7.2 Additional Examples from Bangla ABNF:

Below are a few examples which help one understand some of the rules ABNF
puts in place. These are just given for reference purposes and are not meant to
be comprehensive.

1. H, M, B, D or X cannot occur in the beginning of a Bangla word. Example:

◌্ক ◌्क
◌াক ◌ाक
◌ংক ◌ंक
◌ঁক ◌ँक
◌ঃক ◌ःक
As can be seen, such a combination will automatically result in a “golu” or dotted
circle marking it as an invalid formation. This is an intrinsic property of the
Indian language syllable and is quasi-automatically applied wherever supported
by the OS.

2. H is not permitted after V, B, D, X, M, S

Example:
অ্ अ्
অং◌্ कं◌्
কঁ◌্ कँ◌्
কঃ◌্ कः◌्
কি◌্ कि◌्

3. The number of B, D or X permitted after a Consonant, a Vowel or a kāra (Mātrā) is
restricted to one; thus the following combinations are invalid.
Example:
কংং कंं
কঁঁ कँँ
কঃঃ कःः
কাঁঁ काँँ
কীঃঃ कीःः
অংং अंं
অঁঁ अँँ
অঃঃ अःः

4. The number of M permitted after a Consonant is restricted to one.

Example:
কীী कीी

5. M is not permitted after V.

Example:
ইা / ঈৗ ईा / ईौ

6. The combinations of Anusvāra + Visarga as well as Visarga + Anusvāra are not
permissible.
Example:
কংঃ कंः
কঃং कःं
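
Example 1 of this list (H, M, B, D or X at the beginning of a word) corresponds to a label that starts with a combining mark, which is what triggers the dotted circle in rendering. A minimal sketch using Python's standard unicodedata module is given below; the function name starts_with_combining_mark is hypothetical, and checking the Unicode general category is an illustrative shortcut rather than the LGR's own mechanism.

```python
import unicodedata

# General categories of combining marks: nonspacing (Mn) and spacing (Mc).
MARK_CATEGORIES = ("Mn", "Mc")


def starts_with_combining_mark(label: str) -> bool:
    """Return True if the first code point of the label is a combining mark."""
    return bool(label) and unicodedata.category(label[0]) in MARK_CATEGORIES


if __name__ == "__main__":
    for label in ("\u09CD\u0995", "\u09BE\u0995", "\u0982\u0995", "\u0995\u09BE"):
        print([f"U+{ord(c):04X}" for c in label], starts_with_combining_mark(label))
```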

8. Contributors

8.1 Experts from India

Professor Udaya Narayana Singh, Chair-Professor of Linguistics & Dean, Faculty of Arts,
Amity University Haryana, Gurgaon; Pachgaon, Manesar PIN 122431 (Haryana), India.

Professor Pabitra Sarkar, formerly Vice-Chancellor, Rabindra Bharati University,
Kolkata.

Dr Atiur Rahman Khan, Principal Technical Officer, GIST Group, C-DAC, Pune, PIN
411008 (Maharashtra), India.

Mr Rajib Chakraborty, Linguist, Society for Natural Language Technology Research
(SNLTR), Module 114 & 130, SDF Building, Salt Lake, Sector-V, Kolkata-700091 (West
Bengal), India.

Mr Akshat Joshi, Project Engineer, GIST Group, C-DAC, Pune, PIN 411008 (Maharashtra),
India.

Ms Moumita Chowdhury, Senior Technical Officer, GIST Group, C-DAC, Pune, PIN
411008 (Maharashtra), India.

Mr Chandrakanta Murasingh, Agartala, Tripura.

Some other NBGP members.

8.2 Contributors from Bangladesh

Janab Mustafa Jabbar, Honorable Minister, Ministry of Posts, Telecommunications &
Information Technology, Govt of Bangladesh

Prof Shamsuzzaman Khan, Former Director-General, Bangla Academy, Dhaka

Prof Rafiqul Islam, National Professor of Humanities, Dhaka.

Prof Swarochis Sarkar, Director, Institute of Bangladesh Studies, Rajshahi University,
Rajshahi, Bangladesh

Prof Jinnat Imtiaz Ali, Director-General, International Mother Language Institute, Dhaka

Mr Mohammad Mamun Or Rashid, Department of Bangla, Jahangirnagar University &
Member, Bangladesh Computer Council

Prof Maniruzzaman, formerly Professor, Chittagong University, Chattagram, Bangladesh

Mr Shyam Sunder Sikder, Secretary, Post & Telecommunications Division,
Govt of Bangladesh

Mr Md. Mustafa Kamal, Former Director General, Bangladesh Telecommunications
Regulatory Authority, Government of Bangladesh, Dhaka

Brigadier General Md Mahfuzul Karim Majumder, Director-General, Engineering &
Operations Division, Bangladesh Telecommunications Regulatory Authority,
Government of Bangladesh, Dhaka

Md. Ziarul Islam, Programmer, Posts & Telecommunications Division, Government of
Bangladesh, Dhaka

Prof Syed Shahriyar Rahman, Department of Linguistics, University of Dhaka

Dr Mizanur Rahman, Director (In-Charge), Translation, Text Book and International
Relations Division, Bangla Academy, Dhaka

Dr Aparesh Bandyopadhyay, Director, Bangla Academy, Dhaka

Mr Md Mobarak Hossain, Director, Bangla Academy, Dhaka

Dr Jalal Ahmed, Director, Bangla Academy, Dhaka

Mr Jahangir Hossain, Internet Society Bangladesh (ICANN ALS)

Janab Sarwar Mostafa Choudhury, Bangladesh Computer Council, Dhaka

Janab Md Rashid Wasif, Bangladesh Computer Council, Dhaka

Janab Istiaque Arif, Senior Assistant Director, Bangladesh Telecommunications
Regulatory Authority, Dhaka

Ms. Afifa Abbas, Information Security and Governance Lead Engineer at Banglalink, and
ICANN Fellow.

Mr Mohammad Abdul Haque, Secretary General, Bangladesh Internet Governance
Forum

Mr Imran Hossen, CEO, EyeSoft and key member of Bangladesh Association of Software
& Information Services (BASIS).

Ms Shahida Khatun, Director, Folklore, Museum & Archive Division, Bangla Academy,
Dhaka

Mr Syed Ashik Rehman, CEO, Bengal Media Corporation, Dhaka

Mr Haseeb Rahman, CEO, Professionals’ Systems, Dhaka

9. References
[100] Unicode Consortium. 2017. Unicode Standard 10.0. Mountain View CA.

[101] Bandyopadhyay, Chittaranjan. 1981. Dui Shataker Bangla Mudran o Prakashan.
Kolkata: Ananda Publishers.

[102] Banerji, R.D. 1919. The Origin of the Bengali Script. Kolkata. New Delhi; Asian
Educational Services; 2003 reprint.

[103] Chatterji, S.K. 1926. The Origin and Development of the Bengali Language.
Calcutta: Calcutta University Press.

[104] -----. 1939. Bhasha-prakash Bangala Vyakaran (A Grammar of the Bengali
Language), Calcutta: University of Calcutta.

[105] Hai, Muhammad Abdul. 1964. Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics
and Bengali Phonology), Dhaka: Bangla Academy.

[106] Jha, Subhadra. 1958. The Formation of Maithili. London: Luzac & Co.

[107] Kostic, Djordje; Das, Rhea S. 1972. A Short Outline of Bengali Phonetics, Calcutta:
Statistical Publishing Company.

[108] Majumdar, R.C. 1971. History of Ancient Bengal, Calcutta: G. Bhardwaj.

[109] Mazumdar, Bijaychandra. 1920/2000. The History of the Bengali Language (Repr.
Calcutta, 1920. ed.). New Delhi: Asian Educational Services.

[110] Pandey, Anshuman. 2001. Proposal to Encode the Tirhuta Script in ISO/IEC
10646.

[111] Pal, Palash Baran. 2001. Dhwanimala Barnamala. Kolkata: Papyrus.

[112] -----. 2007. ‘Bangla Harapher Panch Parba’. In Swapan Chakraborty, ed.
Mudraner Sanskriti O Bangla Boi. Kolkata: Ababhas.

[113] Ross, Fiona. 1999. The Printed Bengali Character and its Evolution. London:
Curzon.

[114] Shastri, Mahamahopadhyay Hara Prasad. 1916. Hājār Bacharēr Purāṇa Bāṅgālā
Bhāṣāy Bauddha Gān ō Dōhā. Calcutta: Bangiya Sahitya Parishat.

[115] Singh, Udaya Narayana (Jointly Maniruzzaman). 1983. Diglossia in Bangladesh
and language planning. Calcutta: Gyan Bharati. 214 pp.

[116] -----. 1987. A Bibliography of Bengali Linguistics. Mysore: CIIL. xii+316 pp.

[117] -----. 2017. (with Rajib Chakraborty, Bidisha Bhattacharjee & Arimardan Kumar
Tripathy) Languages and Cultures on the Margin: Guidelines for Fieldwork on Endangered
Languages. Mimeo. Centre for Endangered Languages, Visva-Bharati.

[118] -----. 1980. Scriptal choice and spelling reform: An essay in language and
planning. Journal of the M.S. University of Baroda, Social Science Number, 29.2 : 173-
186. A modified version reprinted E. Annamalai, Bjorn Jernudd and Joan Rubin, eds.
Language Planning: Proceedings of an Institute. Mysore: CIIL. 405-417.

[119] Sripantha. 1996. Jakhan Chapakhana Elo. Kolkata: Paschim-Banga Bangla
Academy.

[120] Sur, Atul. 1986. Bangla Mudraner Dusho Bachar. Kolkata: Jijnasa.

[121] Script Behaviour for Bengali, Version 1.1, TDIL and C-DAC Pune.

[122] Bora, Mahendra. 1981. The Evolution of Assamese Script. Jorhat: Assam Sahitya
Sabha.

[123] Proposal to Encode the Tirhuta Script in ISO/IEC 10646,
http://www.unicode.org/L2/L2011/11175r-tirhuta.pdf accessed on 25.11.2017

[124] Ethnologue, Assamese in the Language Cloud,
https://www.ethnologue.com/cloud/asm accessed on 25.11.2017

[125] Bengali alphabet for Manipuri, found in Ethnologue, Manipuri (Meeteilon/
Meithei), https://www.omniglot.com/writing/manipuri.htm accessed on 20.10.2019

[126] Wikipedia, Bengali alphabet, https://en.wikipedia.org/wiki/Bengali_alphabet
accessed on 25.11.2017

[129] Omniglot, Sylheti, http://www.omniglot.com/writing/syloti.htm accessed on
10.5.2018

[130] Wikipedia, Bishnupriya Manipuri language,
https://en.wikipedia.org/wiki/Bishnupriya_Manipuri_language accessed on 25.11.2017

[131] The EMILLE/CIIL Corpus,
http://metashare.elda.org/repository/browse/the-emilleciil-corpus/abdd35c8de6f11e2b1e400259011f6ea6bce74d38dbb42d881da76c64a6adb20/
accessed on 10.5.2018

[132] The EMILLE/CIIL Corpus,
http://catalog.elra.info/product_info.php?products_id=696 accessed on 10.5.2018

[133] Bangla Language & Script, https://www.isical.ac.in/~rc_bangla/bangla.html
accessed on 10.5.2018

[134] Sarkar, Pabitra. 1992. Bangla Banan Sanskar: Samasya o Sambhabana. Kolkata:
Chirayata Prakashan.

[135] Sarkar, Pabitra. 1993. Bangla Bhashar Yuktabyanjan. Bhasha 1.1: 23-45.

[136] Dash, Niladri Shekhar and B.B. Chaudhuri. 1998. Bangla Script: A Structural
Study. Linguistics Today 1.2: 1-28. Also available at
https://www.academia.edu/9967428/Bangla_Script_A_Structural_Study

[137] Dani, Ahmed Hasan. (1957) ‘Srīhaṭṭa-Nāgarī Lipir Utpatti o Bikāś.’ Bangla
Academy Patrika (Dhaka), Vol 1.2. (Bhadra-Agrahayan, 1364 Bangabda Number).pg 1.

[138] Wikipedia, Sylheti Nagari,
https://en.wikipedia.org/wiki/Sylheti_Nagari accessed on 19.5.2018

[139] Furui, Ryosuke. (2015). ‘Variegated Adaptations: State Formation in Bengal from
the Fifth to Seventh Century’, in Bhairabi Prasad Sahu & Hermann Kulke, eds.
Interrogating Political Systems: Integrative Processes and States in Pre-Modern India.
Chapter 9. Pp 255-73. New Delhi: Manohar.

[140] Ferguson, Charles A. and Munier Chowdhury. (1960) ‘Phones of Bengali’,
Language, Vol. 36, No. 1, pp. 22-59.

[141] Shahidullah, Muhammad. (2007) Buddhist Mystic Songs. Dhaka: Mowla Brothers.

[142] Ray, Punya Sloka. (1966) Bengali Language Handbook. Washington.

[143] Hai, Muhammad Abdul. (1960) A phonetic and phonological study of nasals and
nasalization in Bengali. Dhaka: University of Dhaka.

[144] Unicode Consortium, Proposal Summary Form to Accompany Submissions for
Additions to the Repertoire of ISO/IEC 10646 / UNICODE,
https://www.unicode.org/L2/L2002/02387r-syloti-form.pdf accessed on May 21,
2018

[145] Wikipedia, Ol Chiki (Unicode block),
https://en.wikipedia.org/wiki/Ol_Chiki_(Unicode_block) accessed on May 21, 2018

[146] Bangla Script, http://www.bangladesh2000.com/bd/bangla_script.html
accessed on May 21, 2018

[147] Bhattacharya, Ashutosh ed. (1942) Gopichandrer Gan, Calcutta: Calcutta
University.

[149] Das, Sisir Kumar. (1975) Sahibs and Munshis: An Account of the College of Fort
William. Calcutta.

[150] Islam, Rafiqul, Pabitra Sarkar, Mahbubul Haq & Rajib Chakraborty (eds.). (2014)
Bangla Academy Promito Bangla Byabaharik Byakaran (A Functional Grammar of
Standard Bangla). Dhaka: Bangla Academy.

[151] Sarkar, Pabitra. [2013] ‘Bangla Spelling Reform: the Long and Short of It’. Bangla
Journal 19: 215-232.

[152] Bangla Academy. (2012) Bangla Academy Promito Bangla Bananer Niyam
(Standard Bangla Spelling as adopted by Bangla Academy). Dhaka: Bangla Academy.

[153] Sarkar, Pabitra & Rajib Chakraborty. 2018. “What has happened So Far In terms
of Script Reforms”. Paper presented at the Face to Face meeting jointly held by the
Bangla Academy, Dhaka & ICANN at Bangla Academy, Dhaka on 10.07.2018.

[154] The Unicode Consortium. 2018. The Unicode® Standard Version 11.0 – Core
Specification. Chapter 12, P. 473.

10. Appendix- I

10.1 Augmented Backus-Naur Formalism (ABNF)

The Augmented Backus-Naur Formalism (ABNF) is generic in nature and when
applied to a specific language/script, certain restriction rules apply. In other words,
in a given language some of the Formalism structures do not necessarily apply. To
take care of such cases restriction rules are set in place. These restrictions will help
to fine-tune the ABNF.

In case of Bangla13 in particular the following rules apply:

1. Khaṇḍa ta (ৎ) is NOT allowed at the beginning of an IDN label. The same
applies to ঞ and the velar nasal ঙ in the Bangla scheme of five-fold ‘varga’ (as
defined under Table 5). Bangla does not normally allow ya (য়) at the
beginning of a word either, although a few native examples can be cited, such
as the word য়্যাব্বড়ো (yæbbɔRo) from the poem ‘Lichuchor’ written by
Kazi Nazrul Islam. There are also instances of it being used in names,
mostly of foreign origin, such as Yaqub, which may be written with ya (য়) at
the beginning as in য়াকুব. In very recent times, while transliterating some
Chinese and Japanese names in Bangla, one does come across the possibility
of Khaṇḍa ta (ৎ) followed by sa (স) at the beginning of a word, for example
ৎসেরিং (Tsering).

2. CH can come with Khaṇḍa Ta only in the case where C is ra (র) (09B0):
র্ৎ as in ভর্ৎসনা

3. Only the following combinations with VHCM will be allowed.

→ অ্যা (together pronounced as æ) as in অ্যাসিড (acid)
→ এ্যা (together also pronounced as æ) as in এ্যাসিড, এ্যাসোসিয়েশান
(acid, association)

10.2 ‘Sylheti Nāgarī lipi’ or ‘Siloṭi’

This version of the Bangla script resembles the ‘Kaithī’ script (ISO 15924) used by
accountants (perhaps of the Kāyastha community) in Eastern Uttar Pradesh and
Bihar, which was widely in use during the 1880s. Sylheti Nāgarī or Siloṭi (129) went by
several other names, such as ‘Jālālābāda Nāgarī’, ‘Fula (flower) Nāgarī’, ‘Muslim
Nāgarī’ or ‘Muhāmmad Nāgarī’. It is said that Shāh Jālāla brought the script with
him to Sylhet in the 13th-14th century (138), although some have suggested that it was
an invention of the Afghan rulers of Sylhet (137). Some ascribe the credit to
Buddhist Bhikkhus from Nepal. Purely for historical reasons, the details of the script
with 32 symbols are reproduced here (138):

Table 17 – The Script Table of Sylheti Nāgarī or Siloṭi

13 This section specifically takes up issues of restrictions pertaining to Bangla (Bengali) language. Assamese
and Maṇipuri have not been covered in this section.

10.3 Confusable code points

The following code points were analysed and found to be either (a)
distinguishable or (b) confusable but not enough to be defined as variant code
points.

10.3.1 Bangla and Nāgarī or Devanāgarī

Bangla        Devanāgarī    NBGP Decision
◌ঃ U+0983     ◌ः U+0903     Confusable
ও U+0993      उ U+0909      Confusable
ঘ U+0998      घ U+0918      Confusable
◌ঁ U+0981     ◌ॅ U+0945     Confusable

Table 18: Bangla and Devanāgarī confusable code points

10.3.2 Bangla and Gurmukhi

Bangla        Gurmukhi      NBGP Decision
ঘ U+0998      ਬ U+0A2C      Confusable
◌ঁ U+0981     ◌ੱ U+0A71     Confusable

Table 19: Bangla and Gurmukhi confusable code points

Bangla                   Gurmukhi     NBGP Decision
ও U+0993                 ਤ U+0A24     Distinguishable
শ U+09B6                 ਅ U+0A05     Distinguishable
ম U+09AE                 ਮ U+0A2E     Distinguishable
বা U+09AC and U+09BE     ਗ U+0A17     Distinguishable

Table 20 – Bangla and Gurmukhī distinguishable code points

10.3.3 Bangla and Oriya (Odia)

Bangla        Oriya (Odia)   NBGP Decision
ও U+0993      ଓ U+0B13       Confusable

Table 21 – Bangla and Oriya confusable code points

Bangla        Oriya (Odia)   NBGP Decision
ঘ U+0998      ସ U+0B38       Distinguishable

Table 22 – Bangla and Oriya distinguishable code points

11. Appendix -II
Bengali consonants and their allographs

Consonants   Phonetic Value   Allographs: Clusters   Allographs: Transparent Form (Bangla Akademi font)

প /p/ z ({+ত), | ({ + ন), } ({ + প), প8

({ + য), r ({ + র), ~ ({ + ল), •
({ + স)

€/€ (•+প), ‚ (ƒ+প)

ফ /pʰ/ „ (u+ র), … (u + ল)

†/† (•+ফ)

ব /b/ ‡ (ˆ + জ), ‰ (ˆ + দ), Š (ˆ + ধ), / (0+ধ)

w (ˆ+ব), ব8 (ˆ+য), ‹ (ˆ+র), Œ
(ˆ+ল), •ভ (ˆ+ভ)

Ž (•+ব), • (•+ব) 2 (3+ব)

ভ /bʱ/ ভ8 (‘+য), ’ (‘+র), “ (‘+ল)

ত /t/ ” (y+ত), ”8 (y+y+য), •

(y+y+ব), – (y+থ), — (y+ন), ত8
(y+য), ˜ (y+ম), ˜8 (y+™+য), š
(y+ব), › (y+র)

z ({+ত), œ (p+ত), • (p+y+ব),

ž (Ÿ+ত), q8 (Ÿ+y+ +য), ¡
(•+y+র)
& (5+ত)
There is a marked form of
ত+◌্=ৎ, ৎ6 ( +y/ৎ)

থ /tʰ/ থ8 (¢+য), £ (¢+র)

¤ (•+থ), – (y+থ), ¥ (Ÿ+থ) " (7+থ), 9 (:+থ)

(§+ধ), দ8 (§+য), « (§+ব), ¬
(§+ভ), - (§+র)
‰ (ˆ+দ), ® (Ÿ+দ), ¯ (Ÿ+§+র), -6
( +§+র)

ধ /dʱ/ ° (±+ন), ² (±+ম), ধ8 (±+য), ³

(±+র)

( (?+ধ), > (<+ধ),

´ (µ+ধ), ª (§+ধ), Š (ˆ+ধ), ¶
/ (0+ধ), @ (7+ধ)
(Ÿ+ধ)

ট /ʈ/ · (¸+ট), ট8 (¸+য), ¹ (¸+ব), º

(¸+র)

» (p+ট), ¼ (½+ট)

ঠ /ʈʰ/ ঠ8 (¾+য)

¿ (À+ঠ), Á (½+ঠ)

ড /ɖ/ Â (Ã+ড), ড8 (Ã+য), Ä (Ã+র)

ঢ /ɖʱ/ ঢ8 (Å+য)
Æ (À+ঢ)

চ /t͡ʃ/
Clusters: চ্চ (চ্+চ), চ্ছ (চ্+ছ), চ্ছ্র (চ্+ছ্+র), চ্ঞ (চ্+ঞ), চ্য (চ্+য); ঞ্চ (ঞ্+চ), শ্চ (শ্+চ)
Transparent form: ঞ্চ (ঞ্+চ)


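ছ /t͡ʃʰ/
Clusters: ছ্র (ছ্+র); চ্ছ (চ্+ছ), ঞ্ছ (ঞ্+ছ), শ্ছ (শ্+ছ)
Transparent form: ঞ্ছ (ঞ্+ছ)

জ /dʒ/
Clusters: জ্জ (জ্+জ), জ্জ্ব (জ্+জ্+ব), জ্ঝ (জ্+ঝ), জ্ঞ (জ্+ঞ), জ্য (জ্+য), জ্র (জ্+র); ঞ্জ (ঞ্+জ)
Transparent form: ঞ্জ (ঞ্+জ)

ঝ /dʒʱ/
Clusters: (not privileged enough to have clusters as a first member); জ্ঝ (জ্+ঝ), ঞ্ঝ (ঞ্+ঝ)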

ক /k/
Clusters: ক্ক (ক্+ক), ক্ট (ক্+ট), ক্ত (ক্+ত), ক্ত্র (ক্+ত্+র), ক্ত্ব (ক্+ত্+ব), ক্ন (ক্+ন), ক্ব (ক্+ব), ক্ম (ক্+ম), ক্য (ক্+য), ক্র (ক্+র), ক্ষ (ক্+ষ), ক্ষ্ণ (ক্+ষ্+ণ), ক্ষ্ম (ক্+ষ্+ম), ক্ষ্ব (ক্+ষ্+ব), ক্ষ্য (ক্+ষ্+য), ক্স (ক্+স); ঙ্ক (ঙ্+ক), স্ক্র (স্+ক্+র)
Transparent form: ক্ত (ক্+ত), ক্ত্র (ক্+ত্+র), ক্ত্ব (ক্+ত্+ব), ক্র (ক্+র), ঙ্ক (ঙ্+ক), স্ক্র (স্+ক্+র)

খ /kʰ/
Clusters: (not privileged enough to have clusters as a first member); ঙ্খ (ঙ্+খ)


গ /g/
Clusters: গ্গ (গ্+গ), গ্দ (গ্+দ), গ্ধ (গ্+ধ), গ্ন (গ্+ন), গ্ব (গ্+ব), গ্ম (গ্+ম), গ্য (গ্+য), গ্র (গ্+র), গ্ল (গ্+ল); ঙ্গ (ঙ্+গ), র্ঙ্গ (র্+ঙ্+গ)
Transparent form: গ্ধ (গ্+ধ), ঙ্গ (ঙ্+গ), র্ঙ্গ (র্+ঙ্+গ)

ঘ /gʱ/
Clusters: ঘ্ন (ঘ্+ন), ঘ্য (ঘ্+য), ঘ্র (ঘ্+র); ঙ্ঘ (ঙ্+ঘ)

ঞ (This letter does not have any particular phonetic value, but is mostly pronounced as /n/.)
Clusters: ঞ্চ (ঞ্+চ), ঞ্ছ (ঞ্+ছ), ঞ্জ (ঞ্+জ), ঞ্ঝ (ঞ্+ঝ); জ্ঞ (জ্+ঞ)
Transparent form: ঞ্চ (ঞ্+চ), ঞ্ছ (ঞ্+ছ), ঞ্জ (ঞ্+জ), ঞ্ঝ (ঞ্+ঝ)

ণ /n/
Clusters: ণ্ট (ণ্+ট), ণ্ঠ (ণ্+ঠ), ণ্ড (ণ্+ড), ণ্ড্র (ণ্+ড্+র), ণ্ঢ (ণ্+ঢ), ণ্ণ (ণ্+ণ), ণ্য (ণ্+য), ণ্ব (ণ্+ব); ক্ষ্ণ (ক্+ষ্+ণ), ষ্ণ (ষ্+ণ), হ্ণ (হ্+ণ)
Transparent form: ণ্ড (ণ্+ড), ণ্ড্র (ণ্+ড্+র), ষ্ণ (ষ্+ণ)

ঙ/◌ং /ŋ/
Clusters: ঙ্ক (ঙ্+ক), ঙ্ক্র (ঙ্+ক্+র), ঙ্খ (ঙ্+খ), ঙ্গ (ঙ্+গ), ঙ্ঘ (ঙ্+ঘ), ঙ্ক্ষ (ঙ্+ক্+ষ) (In some contexts ঙ্ is replaced by ◌ং: কং, অং)
Transparent form: ঙ্ক (ঙ্+ক), ঙ্গ (ঙ্+গ), ঙ্ঘ (ঙ্+ঘ)


ম /m/
Clusters: ম্ল (ম্+ল), ম্প (ম্+প), ম্প্র (ম্+প্+র), ম্ভ (ম্+ভ), ম্ভ্র (ম্+ভ্+র), ম্ম (ম্+ম), ম্র (ম্+র); ত্ম (ত্+ম), ধ্ম (ধ্+ম), হ্ম (হ্+ম), ক্ষ্ম (ক্+ষ্+ম)
Transparent form: হ্ম (হ্+ম)

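ন /n/
Clusters: ন্ট (ন্+ট), ন্ট্র (ন্+ট্+র), ণ্ঠ (ণ্+ঠ), ন্ড (ন্+ড), ন্ড্র (ন্+ড্+র), ন্ত (ন্+ত), ন্ত্র (ন্+ত্+র), ন্ত্র্য (ন্+ত্+র্+য), ন্থ (ন্+থ), ন্দ (ন্+দ), ন্দ্র (ন্+দ্+র), ন্ধ (ন্+ধ), ন্ধ্র (ন্+ধ্+র), ন্দ্ব (ন্+দ্+ব), ন্ন (ন্+ন), ন্ম (ন্+ম), ন্য (ন্+য), ন্স (ন্+স); হ্ন (হ্+ন)
Transparent form: ন্থ (ন্+থ), ন্ধ (ন্+ধ), ন্ধ্র (ন্+ধ্+র)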

শ /ʃ/
Clusters: শ্চ (শ্+চ), শ্ছ (শ্+ছ), শ্ন (শ্+ন), শ্ম (শ্+ম), শ্র (শ্+র), শ্ল (শ্+ল), শ্য (শ্+য)

ষ /ʃ/
Clusters: ষ্ক (ষ্+ক), ষ্ট (ষ্+ট), ষ্ঠ (ষ্+ঠ), ষ্ণ (ষ্+ণ), ষ্প (ষ্+প), ষ্প্র (ষ্+প্+র), ষ্ফ (ষ্+ফ), ষ্ট্র (ষ্+ট্+র), ষ্য (ষ্+য); ক্ষ (ক্+ষ), ক্ষ্ণ (ক্+ষ্+ণ), ক্ষ্ম (ক্+ষ্+ম)
Transparent form: ষ্ণ (ষ্+ণ)


স /s/ & /ʃ/
Clusters: স্ক (স্+ক), স্ট (স্+ট), স্প (স্+প), স্ফ (স্+ফ), স্ত (স্+ত), স্থ (স্+থ), স্খ (স্+খ), স্য (স্+য), স্র (স্+র), স্ল (স্+ল); ক্স (ক্+স)
Transparent form: স্থ (স্+থ)

হ /h/
Clusters: হ্ণ (হ্+ণ), হ্ন (হ্+ন), হ্ম (হ্+ম), হ্য (হ্+য), হ্র (হ্+র), হ্ল (হ্+ল)
Transparent form: হ্ম (হ্+ম)

ড় /ɽ/
Clusters: ড়্গ (ড়্+গ)

ঢ় /ɽʱ/
Clusters: (not privileged enough to have clusters)

য /dʒ/
The secondary symbol (allograph) jɔ-phalā has two phonetic values. When added to the initial consonant in a word, it is a vowel /æ/ (as in শ্যামল, র‍্যাপার, etc.). But after a non-initial consonant, it just doubles it in pronunciation (as in কার্য, ধার্য, etc.). The র+য combination has two physical manifestations: র‍্য (with ya-phalā) and র্য (with reph).
Clusters: ক্য (ক্+য), স্য (স্+য), র‍্য (র্+য)
[র‍্য on its own is never used in Bangla orthography. র‍্যা is, but then its last two symbols, ya-phalā and ā-kāra, constitute a vowel sign, representing the vowel অ্যা.]


র /r/
Two manifestations:
i. রেফ /repʰ/ as the first member of a cluster, e.g., র্প, র্ৎ, র্দ্র, র্য, র্ধ্ব (র্+ধ্+ব) (earlier র্দ্ধ্ব = র্+দ্+ধ্+ব, a four-term cluster), etc. (placed over the following consonant)
ii. র-ফলা /rɔ-pʰɔla/ as the second/third member of a cluster, e.g., ম্র, স্ত্র, etc. (placed under the consonant it follows)
ল /l/
Clusters: ল্গ (ল্+গ), ল্প (ল্+প), ল্ব (ল্+ব), ল্ম (ল্+ম), ল্ট (ল্+ট), ল্ড (ল্+ড), ল্ক (ল্+ক), ল্দ (ল্+দ), ল্য (ল্+য); গ্ল (গ্+ল), ভ্ল (ভ্+ল), ম্ল (ম্+ল)

◌ঃ /h/ word finally; word medially it doubles the pronunciation of the following consonant. Examples: অঃ, কঃ

◌ঁ / ̃/ অঁ, বঁ
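
The two manifestations of র listed in the table above correspond to where র sits relative to the hasanta (U+09CD) in the encoded cluster. The following Python sketch illustrates this under a stated assumption: a cluster is written as consonants joined by U+09CD, with no ZWJ or ZWNJ. The function name ra_roles is hypothetical, and sequences that rely on a joiner (such as the ya-phalā form of র+য) are deliberately not handled.

```python
HASANTA = "\u09CD"  # BENGALI SIGN VIRAMA (hasanta)
RA = "\u09B0"       # BENGALI LETTER RA


def ra_roles(cluster):
    """Classify every RA in a hasanta-joined consonant cluster.

    Returns a list with one entry per RA found:
    'reph' when RA is the first member, 'ra-phala' when it is a later
    member, and 'independent' when the input is a bare RA.
    """
    members = cluster.split(HASANTA)
    roles = []
    for index, member in enumerate(members):
        if member != RA:
            continue
        if len(members) == 1:
            roles.append("independent")
        elif index == 0:
            roles.append("reph")
        else:
            roles.append("ra-phala")
    return roles


if __name__ == "__main__":
    # RA + hasanta + PA: RA leads the cluster, so it surfaces as reph.
    print(ra_roles("\u09B0\u09CD\u09AA"))
    # KA + hasanta + RA: RA follows, so it surfaces as ra-phala.
    print(ra_roles("\u0995\u09CD\u09B0"))
```

The classification depends only on position within the cluster, which mirrors the table's description of reph as the first member and ra-phalā as the second or third member.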
