MangaBaka Chinese Romanisation Style


> [!NOTE]
> This romanisation guide is not entirely finished and might change at any moment, be it partially or entirely.
> Last update: 2026-01-21

## Core Romanisation Style

This style guide defines how to consistently romanise and style Mandarin Chinese titles (manhua) for metadata purposes. The goal is to provide a standard that is easy to read and type while preserving the tonal nature of the language through **Hanyu Pinyin**.

### The Basis

We follow standard [Hanyu Pinyin](https://en.wikipedia.org/wiki/Pinyin) with the following specific enhancements and deviations:

- **Tonal Diacritics** (major deviation): We strictly use **tone marks** (diacritics) for all syllables to ensure accurate identification (e.g., *Mā* vs *Má* vs *Mǎ* vs *Mà*).
- **Loanword Restoration** (major deviation): Translate clear **phonetic loanwords** into their English/foreign spelling (e.g., 巴士 / *Bāshì* → "Bus", 粉丝 / *Fěnsī* → "Fans") instead of strict pinyin.
- **The "Zhi" Separation**: The character **之 (Zhī)** is treated as a delimiter (colon) when separating a main title from a subtitle.
- **Agglutination**: We fuse compound words (polysyllabic words) according to standard orthography (e.g., *Dàxué* and not *Dà Xué*), but keep grammatical particles separate.

### Capitalisation

As we are romanising **titles**, we use the **Chicago Headline Style**:

- Capitalise all **nouns**, **verbs**, **adjectives**, **adverbs**, **pronouns**, and **auxiliary verbs**.
- Keep all **particles** (e.g., *de*, *le*, *zhe*, *ma*), **prepositions** (e.g., *zài*, *cóng*), and **copulas** (e.g., *shì*) in lowercase.

### Word Spacing, Fusions, and Hyphenation

Chinese characters (*Hanzi*) represent syllables, but Pinyin represents words (*Ci*). We standardise boundaries as follows:

- **Compounds**: Fuse standard dictionary compounds. If a word is a single concept composed of multiple characters, it is one word (e.g., *Péngyǒu* for Friend, not *Péng Yǒu*).
- **Names**: Surname first (capitalised) + given name (capitalised and fused). Unlike Korean, Chinese given names are **fused** without a hyphen (e.g., *Máo Zédōng*).
- **Erhua (R-coloring)**: The suffix **儿 (-r)** is fused to the preceding syllable (e.g., *Huār* not *Huā Er*).
- **Separation**: Separate words that do not form a tight compound. Subject-Verb-Object structures are always separated.
- **Apostrophes**: Use an apostrophe strictly to disambiguate syllable boundaries when a syllable starting with *a*, *o*, or *e* follows another syllable (e.g., *Xī'ān* vs *Xiān*).

!!! warning Tone Sandhi
We romanise using the **original dictionary tone**, not the sandhi (changed) tone.

- **Nǐ hǎo (你好)** not *Ní hǎo* (even though it is pronounced with a rising tone on the first syllable).
- **Yī (一)** and **Bù (不)** retain their dictionary tones (*yī*, *bù*) regardless of the tone of the following word, to ensure search consistency.
!!!

## Actual Title Showcasing the Style

重生之黑客系统的逆袭 ~在异世界当骇客,我的粉丝全是僵尸~

Chóngshēng: Hacker Xìtǒng de Nìxí ~Zài Yìshìjiè Dāng Hacker, Wǒ de Fans Quán shì Jiāngshī~

- **Loanwords**: 黑客 (*Hēikè*) / 骇客 (*Hàikè*) → Hacker. 粉丝 (*Fěnsī*) → Fans.
- **Native Nouns**: *Xìtǒng* (System), *Yìshìjiè* (Otherworld), *Jiāngshī* (Zombie) are kept in Pinyin as they are standard native/Sino compounds or not direct phonetic loans.
- **The "Zhi" Rule**: *Chóngshēng* (Rebirth) is followed by *Zhī* acting as a separator, so it becomes a colon.
- **Particles**: *de* (possessive) is separate and lowercase.
- **Prepositions/Copula**: *zài* (at/in) and *shì* (is) are separate and lowercase; *Zài* is capitalised here only because it opens the subtitle.
- **Compounds**: *Nìxí* (Counterattack) is fused.

## Further Clarification

### Loanword Restoration Rule

Chinese often absorbs foreign words phonetically. To improve readability, we restore these to their original foreign counterparts if they are **direct phonetic transliterations**.
- 巴士 (*Bāshì*) → **Bus**
- 沙发 (*Shāfā*) → **Sofa**
- 巧克力 (*Qiǎokèlì*) → **Chocolate**
- 朋克 (*Péngkè*) → **Punk**
- S级 (*S-jí*) → **S-jí** (or **S-Rank** if "Rank" is implied contextually).

!!! note Semantic Translations vs Phonetic Loans
We do **not** translate words that are semantic translations (translating the meaning rather than the sound).

- 电脑 (*Diànnǎo* - Electric Brain) → **Diànnǎo** (Keep Pinyin, do not write *Computer*).
- 手机 (*Shǒujī* - Hand Machine) → **Shǒujī** (Keep Pinyin, do not write *Mobile*).
- **Only** restore words where the Chinese characters were chosen specifically to mimic the English/foreign sound (Phonetic Loans).
!!!

### The Particle "De" (的 / 地 / 得)

The neutral-tone particle *de* is the most common grammatical marker. It is **always separated** and **lowercase**.

- 红色跑车 (*Hóngsè Pǎochē*) → **Hóngsè Pǎochē** (Red Sports Car - no particle)
- 红色的跑车 (*Hóngsè de Pǎochē*) → **Hóngsè de Pǎochē** (Red-colored Sports Car)
- 慢慢地走 (*Mànmàn de Zǒu*) → **Mànmàn de Zǒu** (Walk slowly - adverbial marker)
- 跑得快 (*Pǎo de Kuài*) → **Pǎo de Kuài** (Run fast - complement marker)

!!! tip Tones on Particles
Some particles have a toned dictionary reading alongside their usual neutral pronunciation. For metadata consistency and to reduce keyboard friction, write the grammatical particles listed in the table below in the **neutral** tone (no mark), unless distinct emphasis is required.
!!!

### Copula (Is/Am/Are)

The verb **shì** (是) functions as the copula. Although it is syntactically a verb, we treat it as a functional connector for title styling, in line with our Japanese/Korean guidelines.

- **Lowercase** *shì* when it acts as a simple, distinct equating verb in a title.
- **Fuse and Capitalise** if it is part of a lexicalised compound (e.g., *Zǒngshì* - Always).

### Hyphenation and Special Characters

### Personal Names

Surname + Given Name

- 李小龙 → **Lǐ Xiǎolóng**
- 司马光 → **Sīmǎ Guāng** (Compound Surname *Sīmǎ* is fused).
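The name rule above, combined with the apostrophe rule from the word-spacing section, can be sketched in Python. This is only an illustration under stated assumptions, not part of the guide's tooling: the helper names `fuse_syllables` and `format_name` are hypothetical, and the input is assumed to be already split into toned pinyin syllables, with surname and given-name syllables passed separately.

```python
import unicodedata


def _base_letter(ch: str) -> str:
    """Return the letter without its tone mark, lowercased ('ā' -> 'a')."""
    return unicodedata.normalize("NFD", ch)[0].lower()


def fuse_syllables(syllables: list[str]) -> str:
    """Fuse toned pinyin syllables into one capitalised word.

    An apostrophe is inserted before any non-initial syllable that
    starts with a, o, or e, so the syllable boundary stays unambiguous.
    Assumes every syllable is a non-empty string (hypothetical helper).
    """
    if not syllables:
        return ""
    parts = [syllables[0].lower()]
    for syllable in syllables[1:]:
        syllable = syllable.lower()
        if _base_letter(syllable[0]) in "aoe":
            parts.append("'")
        parts.append(syllable)
    word = "".join(parts)
    # Capitalise only the first letter; tone marks are preserved.
    return word[0].upper() + word[1:]


def format_name(surname: list[str], given: list[str]) -> str:
    """Surname first, then the fused given name, separated by a space."""
    return f"{fuse_syllables(surname)} {fuse_syllables(given)}"


print(format_name(["Lǐ"], ["Xiǎo", "lóng"]))
print(format_name(["Sī", "mǎ"], ["Guāng"]))
print(fuse_syllables(["Xī", "ān"]))
```

Passing the surname and given name as separate syllable lists keeps the sketch free of any dictionary lookup: deciding where a compound surname such as 司马 ends is left to the editor.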
### Honorifics and Prefixes

Chinese uses prefixes/suffixes for familiarity. Hyphenate these to clarify they are not part of the name proper.

- 老王 (*Lǎo Wáng*) → **Lǎo-Wáng** (Old Wang)
- 小猫 (*Xiǎo Māo*) → **Xiǎo-Māo** (Little Mao)
- 这里的黎明静悄悄 → *Zhèlǐ de Límíng Jìngqiāoqiāo* (No honorifics here, just standard fusion).

### Numbers and Counters

Use Arabic numerals. Hyphenate the counter.

- 第101次 → **Dì 101-cì** (The 101st Time)
- 3个人 → **3-gè Rén** (3 People)

### The Character "Zhī" (之)

**As a Separator**: If it appears after the initial noun phrase to introduce the subtitle/description, replace it with a **Colon (:)**.

- *System Zhī Wáng* → **System: Wáng**

**As a Possessive**: If it appears inside a phrase meaning "of", write as **zhī** (lowercase).

- *Sānfēn zhī Yī* (One third) → **Sānfēn zhī Yī**

## Detailed Tables

### Common Particles & Functional Words

These words should generally be **lowercase** and **separated**.

| Pinyin | Chinese | Function | Notes |
| ------ | ------- | -------- | ----- |
| **de** | 的 / 地 / 得 | Possessive / Adverbial / Complement | Always neutral tone in this context. |
| **le** | 了 | Aspect particle (Completion) | |
| **zhe** | 着 | Aspect particle (Continuous) | |
| **guò** | 过 | Aspect particle (Experience) | Write with tone *guò* or neutral. |
| **men** | 们 | Plural marker | Usually fused to pronouns (*Wǒmen*), separate for nouns (*Rén men*). |
| **ba** | 吧 | Modal particle (Suggestion) | |
| **ma** | 吗 | Question particle | |
| **ne** | 呢 | Query particle | |
| **bèi** | 被 | Passive marker | |
| **gěi** | 给 | Preposition (To/For) | |
| **zài** | 在 | Preposition (At/In) | |
| **hé / yǔ** | 和 / 与 | Conjunction (And) | |
| **shì** | 是 | Copula (Is/Am/Are) | Lowercase in titles. |
| **wéi** | 为 | Copula (Act as/Become) | Lowercase in titles. |

### Suffixes (Treated as Loanwords/Restoration)

| Chinese | Pinyin Reading | Restoration | Condition |
| ------- | -------------- | ----------- | --------- |
| S级 | *S-jí* | S-Rank | If context implies ranking. |
| APP | *A-P-P* | App | |
| V | *V* | V | As in 大V (Big Influencer). |

## LLM Ruleset and Prompts

### YAML Ruleset

```yaml
romanization_style:
  name: "MangaBaka Chinese Romanization Style"
  version: "1.0"
  base_system: "Hanyu Pinyin (ISO 7098)"
  tones:
    usage: "obligatory"
    type: "diacritics"
    rule: "Use original dictionary tones (e.g., Nǐ hǎo), do not apply tone sandhi rules."
    neutral_tone: "Do not mark neutral tones (e.g., de, le, zi)."
  vowels: "Use ü for u-umlaut (e.g., nǚ)."
  capitalization: >
    Capitalize all Nouns, Verbs, Adjectives, Adverbs, Pronouns, and Auxiliaries.
    Lowercase grammatical particles, prepositions, and copulas (shì, wéi).
  word_segmentation:
    compounds: "Fuse all lexicalized dictionary compounds (e.g., Dàxué, Shénme)."
    phrases: "Separate words that are syntactically distinct (Subject Verb Object)."
    names: "Surname (Cap) + GivenName (Cap+Fused). Example: Lǐ Xiǎolóng."
    erhua: "Fuse the 'r' suffix to the stem (e.g., Huār)."
  special_characters:
    zhi_rule: >
      If '之' functions as a subtitle delimiter, romanize as a colon (:).
      If '之' functions as a possessive/grammatical particle, romanize as 'zhī' (lowercase).
    apostrophe: "Use strictly before a, o, e syllables that follow another syllable (e.g., Xī'ān)."
  loanwords: >
    Restore English spelling ONLY if the Chinese word is a direct phonetic transliteration.
    Examples: Bāshì -> Bus, Fěnsī -> Fans, Jiákè -> Jacket.
    Do not restore semantic translations (e.g., Shǒujī stays Shǒujī, not Mobile).
  particles: >
    Always separate and lowercase: de, le, zhe, guo, ba, ma, ne, bei, gei, zai, he, yu.
    Copulas (shi, wei) are lowercase.
  hyphens: >
    Use hyphens for:
    - Honorific prefixes/suffixes (Lǎo-Wáng, Xiǎo-Lǐ).
    - Arabic numerals + Counters (10-gè, Dì 1-cì).
  example_validation:
    input: "重生之在大巴上捡到了神级系统"
    output: "Chóngshēng: Zài Bus shàng Jiǎndào le Shénjí Xìtǒng"
    explanation: "Zhī becomes colon. Dàbā restored to Bus. Zài/shàng are prepositions/locatives (lowercase/separate). Shénjí (God-level) is fused."
```

___

## Old Guidelines (Kept temporarily for internal reference. DO NOT USE)

## Overview

This public guide explains how Mandarin Chinese titles are romanised and styled for display on MangaBaka. The goal is readable, frontend-friendly Pinyin with clear editorial rules that keep titles consistent across the site.

**Canonical system:** **Hanyu Pinyin with tone marks** for display, with editorial fallbacks for clarity.

## Title field priorities

* **Native Title:** the original Chinese characters as shown on the cover or source (Simplified or Traditional).
* **Romanized Title:** the Pinyin shown to users (with tone marks where applicable).

## Pinyin display rules

* Use **diacritics (tone marks)** for display: e.g., 鹤报刀法,我一刀一个 → **Hébào Dāofǎ, Wǒ Yì Dāo Yí Gè**.
* Use **ü** for the vowel ü in display (e.g., nǚ).
* Use an apostrophe to disambiguate syllable boundaries when necessary (Xi'an).
* Capitalize content words in title-style capitalization; keep grammatical particles (de, di, le, zhe, ba, ne, ma, ye) lowercased when they are functional.

## 之 as subtitle divider

When 之 is used as a subtitle divider, it is romanised as **a colon (':')**, but when it is used as a possessive, it has to be romanised as **zhī**.

## Loanword Rule

* When a Chinese title clearly uses a loanword and there is a common real-world/English term, prefer the real-world term in the **romanisation** or as the visible English phrase.
  > Example: if the title references a specific English concept such as "Rhythm Game," present that term visibly while also keeping pinyin for the native characters.
* Maintain the original Chinese title in the native field.
## Names & capitalization

* Follow Chinese name order (surname first) for metadata unless an established English name exists. In that case, include the common English form as an alternative display title.
* Capitalize main content words in display pinyin; keep small grammatical particles lowercased.

## Numbers, dates, counters

* Use Arabic numerals for clarity and sorting: 第四部 → **Di 4-bù** (display may include tone marks where useful).
* Hyphens for counters are acceptable to align with cross-language conventions: **Di 3-bù: ...**.

## Quotation & punctuation

* Decorative Chinese quotation marks in the native title are retained in the native field. For pinyin display, use standard ASCII quotes.
* Use apostrophes in pinyin display where they clarify syllable boundaries (e.g., **Xī'ān**).

## Examples

* Native: 鹤报刀法,我一刀一个
  > Romanization: **Hébào Dāofǎ, Wǒ Yì Dāo Yí Gè**
* Native: 西安事变
  > Romanization: **Xī'ān Shìbiàn**
* Native: 女
  > Romanization: **nǚ**

## Notes for editors

* Do not adopt publisher romanisation as the default; publishers often use inconsistent romanisations.