This work is licensed under a Creative Commons
Attribution-ShareAlike 4.0 License.
Status of CJK language support in
LibreOffice 2024
Shinji Enoki
shinji.enoki@libreoffice.org
in COSCUP 2024
2024-08-04
Shinji Enoki (榎真治)
● From Settsu City, Japan
● Part of Osaka Prefecture. I moved here a month ago.
● Member of LibreOffice Japanese Team (2011-)
● Substitute member of the Membership Committee of
The Document Foundation (2020-)
● Activities: organizing events, building communities,
and sometimes Q&A, QA, and translation
This is my 6th COSCUP
● I joined COSCUP 2018, 2019, 2021 (online), 2022 (online), and 2023 (online)
● I'm happy to be able to join in person after a long time.
LibreOffice Asia Conference 2024
● This year it was held in Taiwan, together with COSCUP
2024
● 8/2 Government Day @ Minsheng Building
● 8/3 Community Day @ NTUST (COSCUP LibreOffice track)
● https://conf.libreoffice.asia/
Agenda
● About CJK functions of LibreOffice
● Recent CJK issues
● Recent efforts in the LibreOffice community
Have you heard the term "CJK" before?
What are CJK issues / bugs
● CJK is an abbreviation for “Chinese-Japanese-Korean”
● Chinese, Japanese, and Korean are different languages, but
they have some common features
● For example, they use kanji-based characters, vertical
writing, etc.
About CJK functions of LibreOffice
Overview of LibreOffice CJK functions
● Multibyte character
● Text Layout
● Vertical writing
● Phonetic guides (ruby)
● Page formats, including line composition
● Input methods
● Fonts
● etc...
Multibyte character
● Use Unicode to handle multibyte characters
● UTF-8, UTF-16, UTF-32 (Character Encoding Form)
https://ja.wikipedia.org/wiki/Unicode
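As a rough illustration of the three encoding forms, the following Python sketch counts how many bytes each form needs for a few sample characters. The sample characters and the helper name `encoded_sizes` are chosen here for illustration only and are not from the slides.

```python
# Compare how the three Unicode encoding forms store the same character.
# Samples: ASCII letter, kanji, hangul syllable, and a kanji above U+FFFF.
SAMPLES = ["A", "日", "한", "𠮷"]


def encoded_sizes(ch: str) -> dict[str, int]:
    """Return the number of bytes each encoding form needs for one character."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        # The "-le" variants are used so that no byte order mark is prepended.
        "utf-16": len(ch.encode("utf-16-le")),
        "utf-32": len(ch.encode("utf-32-le")),
    }


for ch in SAMPLES:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

Most CJK characters take 3 bytes in UTF-8 and 2 bytes in UTF-16, while UTF-32 always uses 4 bytes per character.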
Unicode Surrogate Pairs
● Represent characters outside the 16-bit range in UTF-16, including many rare kanji and character variants
● Two 16-bit code units (16 bit x 2)
● There was a bug in LibreOffice before, but now it's fine
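The "16 bit x 2" arithmetic can be sketched in Python as follows. The function name `to_surrogate_pair` and the sample character are assumptions for illustration; this is the standard UTF-16 calculation, not LibreOffice code.

```python
def to_surrogate_pair(code_point: int) -> tuple[int, int]:
    """Split a code point above U+FFFF into a UTF-16 high/low surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # upper 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # lower 10 bits
    return high, low


# U+20BB7 is a kanji that does not fit into a single 16-bit unit.
high, low = to_surrogate_pair(ord("𠮷"))
print(f"U+20BB7 -> {high:04X} {low:04X}")
```

Software that treats each 16-bit unit as one character can split such a pair when moving the cursor or deleting, which is the typical kind of surrogate pair bug.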
● Other character codes were also used in the past. For
Japanese there were three:
● ISO-2022-JP (JIS code)
● EUC-JP (Extended Unix Code)
● Shift_JIS (SJIS)
● Sometimes a bug is not CJK-specific but a multibyte issue
that also affects other languages
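To see how the three legacy Japanese encodings differ, here is a small sketch using the codecs from Python's standard library (`iso2022_jp`, `euc_jp`, `shift_jis`). The sample text and the helper name `encode_all` are chosen for illustration only.

```python
# Encode the same Japanese text with the three legacy encodings.
TEXT = "日本語"
LEGACY_CODECS = ["iso2022_jp", "euc_jp", "shift_jis"]


def encode_all(text: str) -> dict[str, bytes]:
    """Return the byte sequence produced by each legacy codec."""
    return {name: text.encode(name) for name in LEGACY_CODECS}


encoded = encode_all(TEXT)
for name, raw in encoded.items():
    print(f"{name:11} {raw.hex(' ')}")

# Decoding bytes with the wrong codec produces garbled text (mojibake).
wrong = encoded["shift_jis"].decode("euc_jp", errors="replace")
print("Shift_JIS bytes read as EUC-JP:", wrong)
```

The same three characters produce three different byte sequences, so a file must be read with the encoding it was written in.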
Vertical writing
● Writer: vertical writing can be set in the [Page Style] settings
Vertical writing (In various places)
(Figure: vertical text "縦書" in a Shape)
Justification (Line Adjustment)
● Japanese justification: Calculations based on multiple
conditions are required
(Figure: example with the option off)
Differences in rules among CJK
● There are many CJK features that have different
requirements and behavior depending on the language
● The position and shape of commas are different between
Chinese and Japanese.
Reference: W3C Requirements documents
● Requirements for Japanese Text Layout
● https://www.w3.org/TR/jlreq/
● “This document describes requirements for general Japanese layout realized with
technologies like CSS, SVG and XSL-FO.”
● Requirements for Chinese Text Layout 中文排版需求 (updated 01 July 2023)
● https://www.w3.org/TR/clreq/
● “This document was developed by people working in different areas, using both
Simplified and Traditional Chinese.”
● Requirements for Hangul Text Layout and Typography
● https://www.w3.org/TR/klreq/
Reference book
● CJKV Information Processing, 2nd Edition (Ken Lunde,
O'Reilly Media, Inc., December 2008)
● https://learning.oreilly.com/library/view/cjkv-information-processing/9780596156114/
Other slides
● COSCUP 2019
● https://www.slideshare.net/eno_eno/state-of-cjk-issues-of-libreoffice-2019
● COSCUP 2021
● https://www.slideshare.net/eno_eno/state-of-cjk-issues-of-libreoffice-2021-in-coscup
● COSCUP 2022
● https://www.slideshare.net/eno_eno/improve-features-about-our-language-cjk-issues-of-libreoffice-in-2022
● LibreOffice Conference 2021
● https://www.slideshare.net/eno_eno/state-of-cjk-issues-of-libreoffice-2021-edition
Recent CJK issues
Meta issue for CJK
● Bug 83066: [META] CJK (Chinese, Japanese, Korean, and Vietnamese)
language issues
● Meta issue for each CJK language
● Bug 113193: [META] Traditional Chinese (zh_TW, zh_HK)
● Bug 113194: [META] Simplified Chinese (zh_CN)
● Bug 113195: [META] Japanese CJK issues
● Bug 113196: [META] Korean
● Bug 119352: [META] Language issues
● In practice, Bug 83066 is the one mainly used
● Its "Depends on" field links to a list of the related bugs
for easy tracking
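The "Depends on" list of a meta bug can also be read programmatically. The sketch below assumes that the LibreOffice Bugzilla is served at https://bugs.documentfoundation.org (this address is not given in the slides) and that it exposes the standard Bugzilla REST endpoint `/rest/bug/<id>`. The helper names are hypothetical, and the sketch only builds the request URL and parses a response; it does not perform the network call.

```python
import json
from urllib.parse import urlencode

# Assumption: base URL of the LibreOffice Bugzilla (not stated in the slides).
BUGZILLA = "https://bugs.documentfoundation.org"


def depends_on_url(bug_id: int) -> str:
    """Build a Bugzilla REST URL that asks only for the fields needed here."""
    query = urlencode({"include_fields": "id,summary,depends_on"})
    return f"{BUGZILLA}/rest/bug/{bug_id}?{query}"


def extract_depends_on(payload: str) -> list[int]:
    """Return the sorted "depends_on" bug IDs from a REST response body."""
    bugs = json.loads(payload)["bugs"]
    return sorted(bugs[0]["depends_on"]) if bugs else []


print(depends_on_url(83066))
```

The printed URL could be fetched with `urllib.request.urlopen` and the response body passed to `extract_depends_on` to list the bugs tracked by the CJK meta issue.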
Japanese, vertical CTL text: some pasted text
displayed incorrectly: Bug 155772
● In Writer, characters occasionally overlap
in vertical writing.
● Fixed in 25.2.0 by Jonathan Clark
PDF export overlaps CJK characters when document has
both vertical and horizontal text: Bug 157390
● Problems exporting to PDF
● Fixed in 24.8.0.2 by Jonathan Clark
● (demo)
Full-width list numbering incorrectly uses Latin digits for
lines beginning with Latin characters : Bug 161804
● Occurs when a full-width (multibyte) numbering format is selected
for a Writer list
● Not fixed
● (demo)
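For reference, the difference between Latin digits and full-width digits is only a difference in Unicode code points. The following sketch is an illustration with hypothetical helper names, not LibreOffice's numbering code. It converts a list number to full-width digits and checks which kind of digits a label uses.

```python
import unicodedata

# Map ASCII digits 0-9 to FULLWIDTH DIGIT ZERO..NINE (U+FF10..U+FF19).
TO_FULLWIDTH = {ord("0") + i: 0xFF10 + i for i in range(10)}


def fullwidth_number(n: int) -> str:
    """Format a list number with full-width digits, e.g. 12 -> "１２"."""
    return str(n).translate(TO_FULLWIDTH)


def is_fullwidth_digits(s: str) -> bool:
    """True if every character is a full-width digit (and s is not empty)."""
    return bool(s) and all(0xFF10 <= ord(c) <= 0xFF19 for c in s)


for n in (1, 2, 10):
    label = fullwidth_number(n)
    print(label, is_fullwidth_digits(label), is_fullwidth_digits(str(n)))

# NFKC normalization folds full-width digits back to Latin digits.
print(unicodedata.normalize("NFKC", fullwidth_number(10)))
```

A check like `is_fullwidth_digits` is one way to confirm in a test document whether a numbering label fell back to Latin digits.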
Does not display some characters when inputting
Japanese: Bug 152293
● During Japanese input, the part exceeding the width of the cell is
not displayed
● Windows only
● Still not fixed
● I don't know whether this affects only Japanese
Bug 161145 - Japanese characters (kanji, hiragana,
katakana...) have extra spacing since version LO 7
● Not fixed
● This is a regression
● (demo: comparison of 6.4 and 24.2)
Recent efforts in the LibreOffice community
Challenges: CJK developers
● This year, a full-time developer joined the TDF staff.
● Jonathan Clark
● He is in charge of multilingual issues such as RTL/CTL/CJK
● He has had meetups with the RTL and JA communities, and he is
checking and fixing various Bugzilla issues.
Communication at events
● Franklin Weng from Taiwan gave a keynote at LibreOffice Kaigi
(Japan's annual event) one month ago
Improve each other's language support
through communication
● We often share the same CJK and other problems
● Problems may be easier to solve if there is more
collaboration
● Channels: Bugzilla, Telegram / Matrix chat, mailing lists, conferences
Ask LibreOffice
● https://ask.libreoffice.org/c/chinese-traditional/11
Conclusion
● LibreOffice has various CJK functions; they mostly work, but
some are sometimes broken
● This talk introduced some CJK bugs
● TDF now has one full-time developer working on multilingual issues
● CJK volunteer contributors could better resolve CJK issues
with more collaboration
Thank you
● Shinji Enoki
● Email: shinji.enoki@libreoffice.org
● I will be at the Japanese Community booth in the afternoon.
