Chinese Romanisation Style
Table of contents
- Core Romanisation Style
- The Basis
- Capitalisation
- Word Spacing, Fusions, and Hyphenation
- Actual Title Showcasing the Style
- Further Clarification
- Loanword Restoration Rule
- The Particle De (的 / 地 / 得)
- Copula (Is/Am/Are)
- Hyphenation and Special Characters
- Personal Names
- Honorifics and Prefixes
- Numbers and Counters
- The Character Zhī (之)
- Detailed Tables
- Common Particles & Functional Words
- Suffixes (Treated as Loanwords/Restoration)
- LLM Ruleset and Prompts
- YAML Ruleset
- Old Guidelines (Kept temporarily for internal reference. DO NOT USE)
- Overview
- Title field priorities
- Pinyin display rules
- 之 as subtitle divider
- Loanword Rule
- Names & capitalization
- Numbers, dates, counters
- Quotation & punctuation
- Examples
- Notes for editors
Note
This romanisation guide is not entirely finished and might change at any moment, be it partially or entirely.
Last update: 2026-01-21
Core Romanisation Style
This style guide defines how to consistently romanise and style Mandarin Chinese titles (manhua) for metadata purposes. The goal is to provide a standard that is easy to read and type while preserving the tonal nature of the language through Hanyu Pinyin.
The Basis
We follow standard Hanyu Pinyin with the following specific enhancements and deviations:
- Tonal Diacritics: We strictly use tone marks (diacritics) on all syllables to ensure accurate identification (e.g., Mā vs Má vs Mǎ vs Mà). This is a major deviation.
- Loanword Restoration: We restore clear phonetic loanwords to their English/foreign spelling (e.g., 巴士 / Bāshì → "Bus", 粉丝 / Fěnsī → "Fans") instead of using strict Pinyin. This is a major deviation.
- The "Zhi" Separation: The character 之 (Zhī) is treated as a delimiter (colon) when separating a main title from a subtitle.
- Agglutination: We fuse compound words (polysyllabic words) according to standard orthography (e.g., Dàxué and not Dà Xué), but keep grammatical particles separate.
Capitalisation
As we are romanising titles, we use the Chicago Headline Style:
- Capitalise all nouns, verbs, adjectives, adverbs, pronouns, and auxiliary verbs.
- Keep all particles (e.g., de, le, zhe, ma), prepositions (e.g., zài, cóng), and copulas (e.g., shì) in lowercase.
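The capitalisation rule above can be sketched in a few lines of Python. This is only an illustration, not an official tool: the function name `headline_case` and the `LOWERCASE_WORDS` set are hypothetical, the set is assembled from the particles, prepositions, and copulas named in this guide, and the input is assumed to be already segmented into words.

```python
# Hypothetical word list, assembled from the function words named in this guide.
LOWERCASE_WORDS = {
    "de", "le", "zhe", "guò", "men", "ba", "ma", "ne",
    "bèi", "gěi", "zài", "cóng", "hé", "yǔ", "shì", "wéi",
}


def headline_case(words):
    """Capitalise content words; keep particles, prepositions, and copulas lowercase.

    Assumes `words` is a list of non-empty, already segmented Pinyin words.
    """
    styled = []
    for word in words:
        if word.lower() in LOWERCASE_WORDS:
            styled.append(word.lower())
        else:
            # Upper-case only the first letter so the rest of the word is untouched.
            styled.append(word[0].upper() + word[1:])
    return " ".join(styled)


print(headline_case(["hóngsè", "de", "pǎochē"]))  # Hóngsè de Pǎochē
```

Deciding which words are content words still depends on correct segmentation, which this sketch does not attempt.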
Word Spacing, Fusions, and Hyphenation
Chinese characters (Hanzi) each represent a syllable, whereas Pinyin is written in words (Cí). We standardise word boundaries as follows:
- Compounds: Fuse standard dictionary compounds. If a word is a single concept composed of multiple characters, it is one word (e.g., Péngyǒu for Friend, not Péng Yǒu).
- Names: Surname First (Capitalised) + Given Name (Capitalised and Fused). Unlike Korean, Chinese given names are fused without a hyphen (e.g., Máo Zédōng).
- Erhua (R-coloring): The suffix 儿 (-r) is fused to the preceding syllable (e.g., Huār not Huā Er).
- Separation: Separate words that do not form a tight compound. Subject-Verb-Object structures are always separated.
- Apostrophes: Use an apostrophe strictly to disambiguate syllable boundaries when a syllable starting with a, o, or e follows another syllable (e.g., Xī'ān vs Xiān).
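As a rough illustration of the fusion and apostrophe rules above, the following Python sketch joins syllables into one word and inserts an apostrophe before any non-initial syllable that begins with a, o, or e. The helper names `starts_with_a_o_e` and `fuse_syllables` are hypothetical, and the input is assumed to be a list of syllables that an editor has already decided belong to one word.

```python
import unicodedata


def starts_with_a_o_e(syllable):
    """Return True if the syllable begins with a, o, or e, ignoring tone marks."""
    # NFD splits a letter such as "ā" into "a" plus a combining macron.
    base = unicodedata.normalize("NFD", syllable)[0].lower()
    return base in {"a", "o", "e"}


def fuse_syllables(syllables):
    """Fuse syllables into one word, adding an apostrophe before a/o/e onsets."""
    word = syllables[0]
    for syllable in syllables[1:]:
        if starts_with_a_o_e(syllable):
            word += "'"
        word += syllable
    return word


print(fuse_syllables(["Xī", "ān"]))   # Xī'ān
print(fuse_syllables(["Dà", "xué"]))  # Dàxué
```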
Tone Sandhi
We romanise using the original dictionary tone, not the sandhi (changed) tone.
- Nǐ hǎo (你好) not Ní hǎo (even though it is pronounced with a rising tone on the first syllable).
- Yī (一) and Bù (不) retain their dictionary tones (yī, bù) regardless of the tone of the following word, to ensure search consistency.
Actual Title Showcasing the Style
重生之黑客系统的逆袭 ~在异世界当骇客,我的粉丝全是僵尸~
Chóngshēng: Hacker Xìtǒng de Nìxí ~Zài Yìshìjiè Dāng Hacker, Wǒ de Fans Quán shì Jiāngshī~
- Loanwords: 黑客 (Hēikè) / 骇客 (Hàikè) → Hacker. 粉丝 (Fěnsī) → Fans.
- Native Nouns: Xìtǒng (System), Yìshìjiè (Otherworld), Jiāngshī (Zombie) are kept in Pinyin because they are standard native/Sino compounds, not direct phonetic loans.
- The "Zhi" Rule: Chóngshēng (Rebirth) is followed by 之 (Zhī) acting as a separator, so it becomes a colon.
- Particles: de (possessive) is separate and lowercase.
- Prepositions/Copula: zài (at/in) and shì (is) are separate and lowercase. Zài is capitalised in this title only because it is the first word of the subtitle.
- Compounds: Nìxí (Counterattack) is fused.
Further Clarification
Loanword Restoration Rule
Chinese often absorbs foreign words phonetically. To improve readability, we restore these to their original foreign counterparts if they are direct phonetic transliterations.
- 巴士 (Bāshì) → Bus
- 沙发 (Shāfā) → Sofa
- 巧克力 (Qiǎokèlì) → Chocolate
- 朋克 (Péngkè) → Punk
- S级 (S-jí) → S-jí (or S-Rank if "Rank" is implied contextually).
Semantic Translations vs Phonetic Loans
We do not translate words that are semantic translations (translating the meaning rather than the sound).
- 电脑 (Diànnǎo - Electric Brain) → Diànnǎo (Keep Pinyin, do not write Computer).
- 手机 (Shǒujī - Hand Machine) → Shǒujī (Keep Pinyin, do not write Mobile).
- Only restore words where the Chinese characters were chosen specifically to mimic the English/foreign sound (Phonetic Loans).
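One way to picture the distinction is a lookup table that contains only phonetic loans, so semantic translations fall through to their Pinyin unchanged. The sketch below is an assumption about how such a table might look, not an existing tool: `PHONETIC_LOANS` and `romanise_word` are hypothetical names, and the entries are limited to examples given in this guide.

```python
# Hypothetical lookup table; entries are taken from the examples in this guide.
PHONETIC_LOANS = {
    "巴士": "Bus",
    "沙发": "Sofa",
    "巧克力": "Chocolate",
    "朋克": "Punk",
    "粉丝": "Fans",
    "黑客": "Hacker",
    "骇客": "Hacker",
}


def romanise_word(hanzi, pinyin):
    """Return the restored spelling for a phonetic loan, otherwise the Pinyin."""
    return PHONETIC_LOANS.get(hanzi, pinyin)


print(romanise_word("巴士", "Bāshì"))    # Bus
print(romanise_word("电脑", "Diànnǎo"))  # Diànnǎo (semantic translation, kept)
```

Keeping semantic translations out of the table is what enforces the rule: a word is restored only if an editor has explicitly listed it as a phonetic loan.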
The Particle "De" (的 / 地 / 得)
The neutral tone particle de is the most common grammatical marker. It is always separated and lowercase.
- 红色跑车 (Hóngsè Pǎochē) → Hóngsè Pǎochē (Red Sports Car - no particle)
- 红色的跑车 (Hóngsè de Pǎochē) → Hóngsè de Pǎochē (Red-colored Sports Car)
- 慢慢地走 (Mànmàn de Zǒu) → Mànmàn de Zǒu (Walk slowly - adverbial marker)
- 跑得快 (Pǎo de Kuài) → Pǎo de Kuài (Run fast - complement marker)
Tones on Particles
Particles are usually pronounced with a neutral tone. For metadata consistency, and to reduce keyboard friction, write the grammatical particles listed in the table below in their neutral (unmarked) form even when a dictionary tone variant exists, unless distinct emphasis is required.
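A minimal Python sketch of removing the tone mark from a particle, assuming the token has already been identified as a grammatical particle (the function must not be applied to content words). The name `strip_tone_marks` is hypothetical. Only the four tone diacritics are removed, so the diaeresis on ü is preserved.

```python
import unicodedata

# Combining macron, acute, caron, and grave: the four Pinyin tone marks.
TONE_MARKS = {"\u0304", "\u0301", "\u030C", "\u0300"}


def strip_tone_marks(word):
    """Remove Pinyin tone marks from a word while keeping the diaeresis on ü."""
    decomposed = unicodedata.normalize("NFD", word)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


print(strip_tone_marks("dé"))  # de
```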
Copula (Is/Am/Are)
The verb shì (是) functions as the copula. Although it is syntactically a verb, we treat it as a functional connector for title styling, in line with our Japanese/Korean guidelines.
- Lowercase shì when it acts as a plain equating verb in a title.
- Fuse and capitalise a copula character when it is part of a lexicalised compound (e.g., 为 in Wèishénme, Why).
Hyphenation and Special Characters
Personal Names
Surname + Given Name
- 李小龙 → Lǐ Xiǎolóng
- 司马光 → Sīmǎ Guāng (Compound Surname Sima is fused).
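The name pattern above can be sketched as follows, assuming the editor has already split the name into surname syllables and given-name syllables. The helper names `fuse_and_capitalise` and `format_name` are hypothetical, and the sketch does not handle apostrophes between syllables.

```python
def fuse_and_capitalise(syllables):
    """Fuse syllables into one word and capitalise its first letter."""
    word = "".join(syllable.lower() for syllable in syllables)
    return word[0].upper() + word[1:]


def format_name(surname, given):
    """Format a name as Surname + Given Name, each fused and capitalised."""
    return f"{fuse_and_capitalise(surname)} {fuse_and_capitalise(given)}"


print(format_name(["lǐ"], ["xiǎo", "lóng"]))   # Lǐ Xiǎolóng
print(format_name(["sī", "mǎ"], ["guāng"]))    # Sīmǎ Guāng
```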
Honorifics and Prefixes
Chinese uses prefixes/suffixes for familiarity. Hyphenate these to clarify they are not part of the name proper.
- 老王 (Lǎo Wáng) → Lǎo-Wáng (Old Wang)
- 小猫 (Xiǎo Māo) → Xiǎo-Māo (Little Cat)
- 这里的黎明静悄悄 → Zhèlǐ de Límíng Jìngqiāoqiāo (No honorifics here, just standard fusion).
Numbers and Counters
Use Arabic numerals. Hyphenate the counter.
- 第101次 → Dì 101-cì (The 101st Time)
- 3个人 → 3-gè Rén (3 People)
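The two examples above follow one pattern: the numeral is joined to its counter with a hyphen, and an ordinal is preceded by a separate Dì. A small Python sketch of that pattern, using a hypothetical `format_count` helper and assuming the counter has already been romanised:

```python
def format_count(number, counter, ordinal=False):
    """Join an Arabic numeral and a romanised counter with a hyphen.

    With ordinal=True, the ordinal prefix Dì is written as a separate word.
    """
    phrase = f"{number}-{counter}"
    return f"Dì {phrase}" if ordinal else phrase


print(format_count(101, "cì", ordinal=True))  # Dì 101-cì
print(format_count(3, "gè") + " Rén")         # 3-gè Rén
```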
The Character "Zhī" (之)
As a Separator: If it appears after the initial noun phrase to introduce the subtitle/description, replace it with a Colon (:).
- System Zhī Wáng → System: Wáng
As a Possessive: If it appears inside a phrase meaning "of", write as zhī (lowercase).
- Sānfēn zhī Yī (One third) → Sānfēn zhī Yī
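The two treatments of 之 can be summarised in a short Python sketch. Whether a given 之 is a separator or a possessive is an editorial judgement, so the hypothetical `join_with_zhi` helper below takes that decision as an explicit argument rather than trying to detect it.

```python
def join_with_zhi(before, after, is_separator):
    """Join the text on either side of 之 according to its role.

    is_separator=True  -> 之 divides title and subtitle and becomes a colon.
    is_separator=False -> 之 is a possessive and is written as lowercase zhī.
    """
    if is_separator:
        return f"{before}: {after}"
    return f"{before} zhī {after}"


print(join_with_zhi("System", "Wáng", is_separator=True))   # System: Wáng
print(join_with_zhi("Sānfēn", "Yī", is_separator=False))    # Sānfēn zhī Yī
```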
Detailed Tables
Common Particles & Functional Words
These words should generally be lowercase and separated.
| Pinyin | Chinese | Function | Notes |
|---|---|---|---|
| de | 的 / 地 / 得 | Possessive / Adverbial / Complement | Always neutral tone in this context. |
| le | 了 | Aspect particle (Completion) | |
| zhe | 着 | Aspect particle (Continuous) | |
| guò | 过 | Aspect particle (Experience) | Write with tone guò or neutral. |
| men | 们 | Plural marker | Usually fused to pronouns (Wǒmen), separate for nouns (Rén men). |
| ba | 吧 | Modal particle (Suggestion) | |
| ma | 吗 | Question particle | |
| ne | 呢 | Query particle | |
| bèi | 被 | Passive marker | |
| gěi | 给 | Preposition (To/For) | |
| zài | 在 | Preposition (At/In) | |
| hé / yǔ | 和 / 与 | Conjunction (And) | |
| shì | 是 | Copula (Is/Am/Are) | Lowercase in titles. |
| wéi | 为 | Copula (Act as/Become) | Lowercase in titles. |
Suffixes (Treated as Loanwords/Restoration)
| Chinese | Pinyin Reading | Restoration | Condition |
|---|---|---|---|
| S级 | S-jí | S-Rank | If context implies ranking. |
| APP | A-P-P | App | |
| V | V | V | |
LLM Ruleset and Prompts
YAML Ruleset
romanization_style:
  name: "MangaBaka Chinese Romanization Style"
  version: "1.0"
  base_system: "Hanyu Pinyin (ISO 7098)"
  tones:
    usage: "obligatory"
    type: "diacritics"
    rule: "Use original dictionary tones (e.g., Nǐ hǎo), do not apply tone sandhi rules."
    neutral_tone: "Do not mark neutral tones (e.g., de, le, zi)."
  vowels: "Use ü for u-umlaut (e.g., nǚ)."
  capitalization: >
    Capitalize all Nouns, Verbs, Adjectives, Adverbs, Pronouns, and Auxiliaries.
    Lowercase grammatical particles, prepositions, and copulas (shì, wéi).
  word_segmentation:
    compounds: "Fuse all lexicalized dictionary compounds (e.g., Dàxué, Shénme)."
    phrases: "Separate words that are syntactically distinct (Subject Verb Object)."
    names: "Surname (Cap) + GivenName (Cap+Fused). Example: Lǐ Xiǎolóng."
    erhua: "Fuse the 'r' suffix to the stem (e.g., Huār)."
  special_characters:
    zhi_rule: >
      If '之' functions as a subtitle delimiter, romanize as a colon (:).
      If '之' functions as a possessive/grammatical particle, romanize as 'zhī' (lowercase).
    apostrophe: "Use strictly before a, o, e syllables that follow another syllable (e.g., Xī'ān)."
  loanwords: >
    Restore English spelling ONLY if the Chinese word is a direct phonetic transliteration.
    Examples: Bāshì -> Bus, Fěnsī -> Fans, Jiákè -> Jacket.
    Do not restore semantic translations (e.g., Shǒujī stays Shǒujī, not Mobile).
  particles: >
    Always separate and lowercase: de, le, zhe, guò, ba, ma, ne, bèi, gěi, zài, hé, yǔ.
    Copulas (shì, wéi) are lowercase.
  hyphens: >
    Use hyphens for:
    - Honorific prefixes/suffixes (Lǎo-Wáng, Xiǎo-Lǐ).
    - Arabic numerals + Counters (10-gè, Dì 1-cì).
  example_validation:
    input: "重生之在大巴上捡到了神级系统"
    output: "Chóngshēng: Zài Bus shàng Jiǎndào le Shénjí Xìtǒng"
    explanation: "Zhī becomes colon. Dàbā restored to Bus. Zài/shàng are prepositions/locatives (lowercase/separate); Zài is capitalised here only as the first word after the colon. The particle le is separate and lowercase. Shénjí (God-level) is fused."
Old Guidelines (Kept temporarily for internal reference. DO NOT USE)
Overview
This public guide explains how Mandarin Chinese titles are romanised and styled for display on MangaBaka. The goal is readable, frontend-friendly Pinyin with clear editorial rules that keep titles consistent across the site.
Canonical system: Hanyu Pinyin with tone marks for display, with editorial fallbacks for clarity.
Title field priorities
- Native Title: the original Chinese characters as shown on the cover or source (Simplified or Traditional).
- Romanized Title: the Pinyin shown to users (with tone marks where applicable).
Pinyin display rules
- Use diacritics (tone marks) for display: e.g., 鹤报刀法,我一刀一个 → Hébào Dāofǎ, Wǒ Yì Dāo Yí Gè.
- Use ü for the vowel ü in display (e.g., nǚ).
- Use an apostrophe to disambiguate syllable boundaries when necessary (Xi'an).
- Capitalize content words in title-style capitalization; keep grammatical particles (de, di, le, zhe, ba, ne, ma, ye) lowercased when they are functional.
之 as subtitle divider
When 之 is used as a subtitle divider, it is romanised as a colon (':'), but when it is used as a possessive, it has to be romanised as zhī.
Loanword Rule
- When a Chinese title clearly uses a loanword and there is a common real-world/English term, prefer the real-world term in the romanisation or as the visible English phrase.
Example: if the title references a specific English concept such as "Rhythm Game," present that term visibly while also keeping pinyin for the native characters.
- Maintain the original Chinese title in the native field.
Names & capitalization
- Follow Chinese name order (surname first) for metadata unless an established English name exists. In that case, include the common English form as an alternative display title.
- Capitalize main content words in display pinyin; keep small grammatical particles lowercased.
Numbers, dates, counters
- Use Arabic numerals for clarity and sorting: 第四部 → Di 4-bù (display may include tone marks where useful).
- Hyphens for counters are acceptable to align with cross-language conventions: Di 3-bù: ....
Quotation & punctuation
- Decorative Chinese quotation marks in the native title are retained in the native field. For pinyin display, use standard ASCII quotes.
- Use apostrophes in pinyin display where they clarify syllable boundaries (e.g., Xī'ān).
Examples
- Native: 鹤报刀法,我一刀一个
Romanization: Hébào Dāofǎ, Wǒ Yì Dāo Yí Gè
- Native: 西安事变
Romanization: Xī'ān Shìbiàn
- Native: 女
Romanization: nǚ
Notes for editors
- Do not adopt publisher romanisation as the default; publishers often use inconsistent romanisations.
