
Line Breaking Logic and Optimization for Thai Text in Unity TextMeshPro

  • Katarzyna
  • July 3, 2025
  • 3 min read

Updated: October 21, 2025


The current method inserts zero-width spaces (ZWSP) after Thai word segmentation, allowing TextMeshPro to wrap text automatically at word boundaries, while prioritizing explicit line breaks (\n). This approach resolves most issues where continuous Thai text without spaces fails to wrap properly. However, it still encounters problems such as incorrect breaks caused by combining characters, tone marks, special punctuation, and rich text tags, as well as cross-platform rendering inconsistencies.

The recommended optimization is to preprocess text using the ITextPreprocessor interface before rendering:

  • Use a Thai word segmenter to automatically insert ZWSPs between words.

  • Avoid inserting spaces within rich text tags.

  • Provide a script (ThaiLineBreaker.cs) that can be attached directly to TextMeshProUGUI, ensuring consistent line-breaking behavior across all platforms.


1. Current Thai Line-Breaking Approach and Its Issues in Unity TextMeshPro


The Thai line-breaking solution currently used in game development typically follows this process: First, the text is segmented using a Thai word segmentation library (which splits words and inserts zero-width spaces), then passed to TextMeshPro (TMPro) to perform automatic line breaks based on word boundaries. Next, explicit line breaks (\n) are prioritized, meaning the \n rule takes precedence over word-based wrapping.
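As a rough illustration of this process, here is a minimal Python sketch (Python is used only for brevity; the same logic could be ported to C#). It assumes a `segment` callable that returns a list of words; `mark_break_opportunities` and the lookup table standing in for a real segmenter are hypothetical names, not part of any library. Each explicit line is segmented separately, so existing \n characters are never moved or removed.

```python
from typing import Callable, List

ZWSP = "\u200b"  # zero-width space


def mark_break_opportunities(text: str, segment: Callable[[str], List[str]]) -> str:
    """Insert ZWSPs between words while keeping every explicit newline in place."""
    marked_lines = []
    for line in text.split("\n"):
        # Empty lines are kept as-is so consecutive newlines survive.
        words = segment(line) if line else []
        marked_lines.append(ZWSP.join(words))
    return "\n".join(marked_lines)


# Stand-in for a real Thai segmenter (assumption): a fixed lookup table.
PRESEGMENTED = {
    "ประเทศไทย": ["ประเทศ", "ไทย"],
    "สวยมาก": ["สวย", "มาก"],
}

result = mark_break_opportunities("ประเทศไทย\nสวยมาก", PRESEGMENTED.__getitem__)
print(result.encode("unicode_escape").decode("ascii"))
```

The renderer then sees the \n as a hard break and the ZWSPs as optional break points within each line.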

This approach has resolved most issues where continuous Thai text without spaces could not wrap properly. However, several problems remain:

  • In Thai, apart from word boundaries, combining characters, tone marks, and special punctuation (such as “ๆ” or “ฯ”) can cause improper line breaks.

  • Some rich text tags (e.g., <color>, <b>) may produce rendering or segmentation errors after inserting invisible spaces.

  • Cross-platform inconsistencies occur, especially in Unity + TMPro, where font rendering and line width calculations differ slightly across Windows, Android, and iOS.
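One way to reduce the first problem is a clean-up pass that runs after segmentation and removes any ZWSP that would strand a combining mark or a repetition/abbreviation sign at the start of a line. The sketch below is an assumption about how such a pass could look, written in Python for brevity; `drop_unsafe_breaks` and `NO_BREAK_BEFORE` are hypothetical names, and the rule "never break before ๆ or ฯ" is a simplification.

```python
import unicodedata

ZWSP = "\u200b"
# Characters that should not begin a line (assumed rule): ๆ (U+0E46) and ฯ (U+0E2F).
NO_BREAK_BEFORE = {"\u0e46", "\u0e2f"}


def drop_unsafe_breaks(text: str) -> str:
    """Remove ZWSPs that sit directly before a combining mark, ๆ, or ฯ."""
    out = []
    pending_break = False
    for ch in text:
        if ch == ZWSP:
            pending_break = True  # decide once the next real character is seen
            continue
        is_combining = unicodedata.category(ch) == "Mn"  # Thai upper/lower vowels, tone marks
        if pending_break and not is_combining and ch not in NO_BREAK_BEFORE:
            out.append(ZWSP)
        pending_break = False
        out.append(ch)
    # A trailing ZWSP with nothing after it is dropped.
    return "".join(out)
```

For example, a ZWSP wrongly placed between ท and its vowel and tone mark in ที่ is removed, while the break between สวย and มาก is kept.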


2. Recommended Optimization Plan for Thai Line Breaking in Unity TextMeshPro

Optimization Proposal:


  • Unity TextMeshPro’s built-in line-breaking mechanism provides limited support for Thai. You can enhance it by using the ITextPreprocessor interface to perform word segmentation and insert zero-width spaces (ZWSP) before rendering, ensuring that line-break positions are properly marked in advance.

  • Avoid inserting spaces or ZWSPs inside rich text tags (use regex to exclude tag ranges).

Example:

Original Thai text: ประเทศไทยสวยมาก (Meaning: “Thailand is very beautiful.”)

After segmentation: ประเทศ | ไทย | สวย | มาก

Insert ZWSP (\u200B) between words: ประเทศ\u200Bไทย\u200Bสวย\u200Bมาก

Rendering behavior:

  • If the text container is wide enough → the text displays on a single line with no visible spaces.

  • If the container is too narrow → the text wraps automatically at ZWSP positions.
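The example can be reproduced with a short Python sketch. The `wrap_at_zwsp` function is a hypothetical stand-in for the renderer: it counts characters instead of measuring glyph widths, which is only an approximation of what TextMeshPro does, but it shows how the ZWSPs act as optional break points.

```python
from typing import List

ZWSP = "\u200b"

words = ["ประเทศ", "ไทย", "สวย", "มาก"]
marked = ZWSP.join(words)  # invisible break opportunities between words


def wrap_at_zwsp(text: str, max_chars: int) -> List[str]:
    """Greedy wrap that only breaks where a ZWSP was inserted."""
    lines: List[str] = []
    current = ""
    for word in text.split(ZWSP):
        if current and len(current) + len(word) > max_chars:
            lines.append(current)  # the next word does not fit: break here
            current = word
        else:
            current += word  # fits: append with no visible separator
    if current:
        lines.append(current)
    return lines


print(wrap_at_zwsp(marked, 20))  # wide container: one line
print(wrap_at_zwsp(marked, 9))   # narrow container: breaks at a word boundary
```

With a limit of 20 the text stays on one line; with a limit of 9 it splits into ประเทศไทย and สวยมาก, never in the middle of a word.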

Unity ITextPreprocessor Script:

  • Automatically adds ZWSPs to Thai text before TextMeshPro renders.

  • Skips inserting spaces within rich text tags (e.g., <color>, <b>).

  • Can be directly attached to TextMeshProUGUI or TMP_Text components.
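The tag-skipping behavior can be sketched as follows. This is not the ThaiLineBreaker.cs source, only an illustration in Python of one possible approach: strip the tags, segment the tag-free text, then put each tag back at its original offset so that no ZWSP ever lands inside a tag. `TAG_RE`, `toy_segment`, and `insert_zwsp_outside_tags` are hypothetical names, the tag pattern is deliberately simplistic, and the toy dictionary segmenter stands in for a real one.

```python
import re
from typing import Callable, List

ZWSP = "\u200b"
TAG_RE = re.compile(r"<[^<>]+>")  # simplistic: anything between < and >
TOY_WORDS = ("ประเทศ", "ไทย", "สวย", "มาก")


def toy_segment(text: str) -> List[str]:
    """Longest-match lookup in a tiny word list; unknown characters stand alone."""
    words, i = [], 0
    while i < len(text):
        match = max((w for w in TOY_WORDS if text.startswith(w, i)), key=len, default=text[i])
        words.append(match)
        i += len(match)
    return words


def insert_zwsp_outside_tags(text: str, segment: Callable[[str], List[str]]) -> str:
    tags, plain, plain_len, pos = [], [], 0, 0
    for m in TAG_RE.finditer(text):
        run = text[pos:m.start()]
        plain.append(run)
        plain_len += len(run)
        tags.append((plain_len, m.group()))  # offset in the tag-free text
        pos = m.end()
    plain.append(text[pos:])

    out: List[str] = []
    offset = 0
    tag_i = 0

    def flush_tags() -> None:
        nonlocal tag_i
        while tag_i < len(tags) and tags[tag_i][0] == offset:
            out.append(tags[tag_i][1])
            tag_i += 1

    for w_i, word in enumerate(segment("".join(plain))):
        flush_tags()  # tags at a word boundary go before the ZWSP
        if w_i > 0:
            out.append(ZWSP)
        for ch in word:
            flush_tags()  # tags in the middle of a word are restored untouched
            out.append(ch)
            offset += 1
    out.extend(tag for _, tag in tags[tag_i:])  # trailing tags
    return "".join(out)


tagged = insert_zwsp_outside_tags("<b>ประเทศไทย</b>สวยมาก", toy_segment)
print(tagged.encode("unicode_escape").decode("ascii"))
```

Segmenting the tag-free text as a whole, rather than each run between tags separately, keeps a break opportunity at the boundary after </b> and avoids mis-segmenting a word that a tag happens to split.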

Usage:

  1. Save the script as ThaiLineBreaker.cs.

  2. Attach it to a TextMeshProUGUI or TMP_Text object.

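  3. In the ThaiSegment method, replace the logic with your own Thai word segmenter that returns text with ZWSPs inserted. PyThaiNLP is one option for producing the segmentation, although as a Python library it would typically run offline or on a server rather than inside the Unity player.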

  4. When running, TextMeshPro will call PreprocessText() before rendering to automatically support Thai line breaking.

(If you need the source code, please contact us.)


3. Additional Optimization: Phrase Protection in Thai Line Breaking


If some words are still being split incorrectly after using the ITextPreprocessor for automatic ZWSP insertion, you can maintain a phrase list to protect certain phrases from being divided.

For example, phrases in this list—such as “การอัปเกรด”—should be treated as indivisible units. During the word segmentation phase, the system should first check this phrase list; if a match is found, the phrase is processed as a single block, and no ZWSP is inserted within it.

In this case, “การอัปเกรด” will remain intact and will not be split. However, if the standalone word “การ” appears outside of that specific phrase, it can still follow the standard line-breaking rules.
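A minimal sketch of this phrase-protection step, in Python for brevity, might look like the following. It assumes a `segment` callable that returns a list of words; `segment_with_protection`, `toy_segment`, and the tiny word list are hypothetical and only serve the demonstration.

```python
from typing import Callable, Iterable, List

ZWSP = "\u200b"
TOY_WORDS = ("การ", "อัปเกรด", "เล่น")


def toy_segment(text: str) -> List[str]:
    """Stand-in segmenter: longest match in a tiny word list."""
    words, i = [], 0
    while i < len(text):
        match = max((w for w in TOY_WORDS if text.startswith(w, i)), key=len, default=text[i])
        words.append(match)
        i += len(match)
    return words


def segment_with_protection(
    text: str, segment: Callable[[str], List[str]], phrases: Iterable[str]
) -> List[str]:
    """Check the phrase list first; only unprotected runs go to the segmenter."""
    ordered = sorted(phrases, key=len, reverse=True)  # prefer the longest phrase
    blocks: List[str] = []
    run_start = i = 0
    while i < len(text):
        hit = next((p for p in ordered if text.startswith(p, i)), None)
        if hit:
            if run_start < i:
                blocks.extend(segment(text[run_start:i]))
            blocks.append(hit)  # protected phrase stays one block
            i += len(hit)
            run_start = i
        else:
            i += 1
    if run_start < len(text):
        blocks.extend(segment(text[run_start:]))
    return blocks


blocks = segment_with_protection("การอัปเกรดการเล่น", toy_segment, ["การอัปเกรด"])
print(" | ".join(blocks))
```

Here the first การ is kept inside the protected phrase, while the second การ, which appears outside it, is still segmented normally and receives a ZWSP on both sides when the blocks are joined.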







 
 
 
