Puneet Varma (Editor)

Line breaking rules in East Asian languages

Line breaking rules in East Asian languages specify how to wrap text in languages such as Chinese, Japanese, and Korean. Certain characters in those languages should not come at the end of a line, certain characters should not come at the start of a line, and some characters should never be split across two lines. For example, periods and closing parentheses are not allowed to start a line. Many word processing and DTP software products have built-in features to control line breaking in those languages.

In Japanese in particular, the categories of line breaking rules and the methods for processing them are defined by the Japanese Industrial Standard JIS X 4051, and the practice is called Kinsoku Shori (禁則処理).

Line breaking rules in Chinese text

Line breaking rules for the Chinese language are described in the reference for Office Open XML, an Ecma standard. Certain characters are not allowed to start or end a line, as listed below.

Simplified Chinese

  • Characters that are not allowed at the start of a line :!%),.:;?]}¢°·’""†‡›℃∶、。〃〆〕〗〞﹚﹜!"%'),.:;?!]}~
  • Characters that are not allowed at the end of a line :$(£¥·‘"〈《「『【〔〖〝﹙﹛$(.[{£¥

Traditional Chinese

  • Characters that are not allowed at the start of a line :!),.:;?]}¢·–— ’"•" 、。〆〞〕〉》」︰︱︲︳﹐﹑﹒﹓﹔﹕﹖﹘﹚﹜!),.:;?︶︸︺︼︾﹀﹂﹗]|}、
  • Characters that are not allowed at the end of a line :([{£¥‘"‵〈《「『〔〝︴﹙﹛({︵︷︹︻︽︿﹁﹃﹏
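
As a rough illustration of how such lists can be applied, the sketch below checks whether a line may be broken between two adjacent characters. The names `NO_LINE_START`, `NO_LINE_END`, `can_break_between`, and `break_positions` are hypothetical, and the two sets are abbreviated subsets of the Simplified Chinese lists above, not the full lists.

```python
# Abbreviated, illustrative subsets of the Simplified Chinese lists above.
# A real implementation would load the complete lists.
NO_LINE_START = set("!%),.:;?]}、。")
NO_LINE_END = set("$(£¥〈《「『【〔")


def can_break_between(prev_char: str, next_char: str) -> bool:
    """Return True if a line may end after prev_char and start with next_char."""
    if prev_char in NO_LINE_END:
        return False  # prev_char must not be the last character of a line
    if next_char in NO_LINE_START:
        return False  # next_char must not be the first character of a line
    return True


def break_positions(text: str) -> list[int]:
    """Indexes i where a break between text[i - 1] and text[i] is permitted."""
    return [
        i for i in range(1, len(text))
        if can_break_between(text[i - 1], text[i])
    ]
```

For the string `你好。(再见)`, this sketch permits a break only before `好`, before `(`, and before `见`; it refuses to start a line with `。` or `)` or to end one with `(`.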

Line breaking rules in Japanese text (Kinsoku Shori)

    Line breaking rules for the Japanese language are defined by the Japanese Industrial Standard JIS X 4051, which describes word wrap rules and processing rules for Japanese-language documents. These rules are called Kinsoku Shori (禁則処理, literally "processing of prohibition rules").

    Categories

    Regarding prohibited characters, there are conventions called "house rules" that are specific to individual publishers, and the rules of many publishers contradict those of others. For that reason, many conventions are not supported by Western DTP software tools, and that is the main cause of the growing demand for computerized phototypesetting systems.

    Characters not permitted at the start of a line
  • Closing brackets: )]}〕〉》」』】〙〗〟’"⦆»
  • Japanese characters that are not allowed at the start of a line: ヽヾーァィゥェォッャュョヮヵヶぁぃぅぇぉっゃゅょゎゕゖㇰㇱㇲㇳㇴㇵㇶㇷㇸㇹㇺㇻㇼㇽㇾㇿ々〻
    (Note: The above rule only applies to small (chiisai) kana. Full size kana can start a line.)
  • Hyphens: ‐゠–〜
  • Delimiters: ?!‼⁇⁈⁉
  • Mid-sentence punctuation: ・、:;,
  • Sentence-ending punctuation: 。.

    Note: Kinsoku Shori is not applied to Japanese characters when a line does not contain enough characters.

    Characters not permitted at the end of a line
  • Opening brackets: ([{〔〈《「『【〘〖〝‘"⦅«

    Do not split
  • Characters that can't be separated: —…‥〳〴〵
  • Numbers
  • Grouped characters: for example 一昨日 with the ruby reading おととい (kanji sequences whose ruby characters have no clear mapping to the individual underlying kanji, known as jukujikun)
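
One way to picture these categories together is to group a string into units that must stay on the same line. The following is a minimal sketch under stated assumptions: `NO_START`, `NO_END`, `INSEPARABLE`, `glued`, and `unbreakable_units` are hypothetical names, the character sets are abbreviated subsets of the lists above, and ruby-annotated groups (jukujikun) are not modelled because plain strings carry no ruby information.

```python
# Abbreviated, illustrative subsets of the categories listed above.
NO_START = set(")]}〕〉》」』】、。・ーっゃゅょッャュョ々")
NO_END = set("([{〔〈《「『【")
INSEPARABLE = set("…‥")


def glued(prev: str, nxt: str) -> bool:
    """Return True if no line break may fall between prev and nxt."""
    if prev in NO_END or nxt in NO_START:
        return True
    if prev.isdigit() and nxt.isdigit():
        return True  # keep the digits of a number together
    if prev in INSEPARABLE and nxt in INSEPARABLE:
        return True  # e.g. a doubled ellipsis
    return False


def unbreakable_units(text: str) -> list[str]:
    """Split text into chunks; breaks are allowed only between chunks."""
    units: list[str] = []
    for ch in text:
        if units and glued(units[-1][-1], ch):
            units[-1] += ch  # attach to the previous chunk
        else:
            units.append(ch)  # start a new chunk
    return units
```

With the input `「はい」と2024年……`, the opening bracket stays attached to `は`, the closing bracket to `い`, the digits `2024` form one unit, and the doubled ellipsis forms another.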

    Processing rules

    Burasage (Hanging punctuation)
    Move the punctuation character to the end of the previous line.
    Oidashi (Wrap to next)
    Send characters not permitted at the end of a line to the next line, and increase kerning to pad out the first line. Oidashi can also be used to move a character from the first line to the next one so that a character that must not start a line does not come first on the next line.
    Oikomi (Squeeze)
    Reduce kerning on the first line so that a character not permitted at the start of a line is pulled back and does not become the first character of the second line. If the software cannot adjust kerning, white space is sometimes added to the end of a line.
    Do not split
    Use Oidashi and Oikomi to process these characters. If characters that cannot be split up straddle the end of a line, either move them as a block to the next line using Oidashi, or keep them all together on the previous line using Oikomi.
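
The processing rules above can be sketched as a simple fixed-width wrapper. This is an illustration only, not the procedure from JIS X 4051: `wrap`, `break_ok`, and `HANGABLE` are hypothetical names, the character sets are abbreviated, the width is counted in characters, and the kerning adjustments that Oidashi and Oikomi rely on are not modelled. Burasage is approximated by letting one punctuation character exceed the width; Oidashi is approximated by moving the break point backwards until the break is permitted.

```python
# Abbreviated, illustrative character sets (assumptions, not the full lists).
NO_START = set(")]}〕〉》」』】、。・っゃゅょッャュョ々")
NO_END = set("([{〔〈《「『【")
HANGABLE = set("、。")  # punctuation allowed to hang past the line end


def break_ok(text: str, pos: int) -> bool:
    """Return True if a line may end just before text[pos]."""
    if pos >= len(text):
        return True
    return text[pos] not in NO_START and text[pos - 1] not in NO_END


def wrap(text: str, width: int, hang: bool = True) -> list[str]:
    """Wrap text to `width` characters per line with simplified kinsoku handling."""
    lines = []
    start = 0
    while len(text) - start > width:
        end = start + width  # tentative break before text[end]
        if hang and text[end] in HANGABLE and break_ok(text, end + 1):
            end += 1  # burasage: keep the punctuation on this line
        else:
            # oidashi: push characters to the next line until the break is legal
            while end > start + 1 and not break_ok(text, end):
                end -= 1
        lines.append(text[start:end])
        start = end
    if start < len(text):
        lines.append(text[start:])
    return lines
```

Calling `wrap("今日は晴れです。明日は「雨」です。", 7)` lets the first `。` hang at the end of the first line, while passing `hang=False` instead pushes `す` down so that `。` does not start the second line.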

    Line breaking rules in Korean text

    Line breaking rules for the Korean language are described in the reference for Office Open XML, an Ecma standard. Certain characters are not allowed to start or end a line, as listed below.

  • Characters that are not allowed at the start of a line :!%),.:;?]}¢°’"†‡℃〆〈《「『〕!%),.:;?]}
  • Characters that are not allowed at the end of a line :$([{£¥‘"々〇〉》」〔$([{⦆¥₩ #
  • KS X ISO/IEC 26300:2007, the OpenDocument standard in Korea, describes hyphenation at the start or at the end of a line in OpenDocument.
  • KS X 6001, the standard for the file specification of Korean word processor intermediate documents, describes rules for line breaking at the end of a page.