# MangaBaka Chinese Romanisation Style
> [!NOTE]
> This romanisation guide is not entirely finished and might change at any moment, be it partially or entirely.
> Last update: 2026-01-21
## Core Romanisation Style
This style guide defines how to consistently romanise and style Mandarin Chinese titles (manhua) for metadata purposes. The goal is to provide a standard that is easy to read and type while preserving the tonal nature of the language through **Hanyu Pinyin**.
### The Basis
We follow standard [Hanyu Pinyin](https://en.wikipedia.org/wiki/Pinyin) with the following specific enhancements and deviations:
- **Tonal Diacritics**: We strictly use **tone marks** (diacritics) for all syllables to ensure accurate identification (e.g., *Mā* vs *Má* vs *Mǎ* vs *Mà*) ― **major deviation**.
- **Loanword Restoration**: Translate clear **phonetic loanwords** into their English/foreign spelling (e.g., 巴士 / *Bāshì* → "Bus", 粉丝 / *Fěnsī* → "Fans") instead of strict pinyin ― **major deviation**.
- **The "Zhi" Separation**: The character **之 (Zhī)** is treated as a delimiter (colon) when separating a main title from a subtitle.
- **Agglutination**: We fuse compound words (polysyllabic words) according to standard orthography (e.g., *Dàxué* and not *Dà Xué*), but keep grammatical particles separate.
### Capitalisation
As we are romanising **titles**, we use the **Chicago Headline Style**:
- Capitalise all **nouns**, **verbs**, **adjectives**, **adverbs**, **pronouns**, and **auxiliary verbs**.
- Keep all **particles** (e.g., *de*, *le*, *zhe*, *ma*), **prepositions** (e.g., *zài*, *cóng*), and **copulas** (e.g., *shì*) in lowercase.
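The two capitalisation rules above can be sketched as a small helper. This is a minimal illustration, not an official tool: the function name `style_title` and the `LOWERCASE` set are hypothetical, the input is assumed to be already segmented into words, and capitalising the first word regardless of its class is an assumption based on normal headline-style practice.

```python
# Lowercase words taken from the examples in the list above; extend as needed.
LOWERCASE = {"de", "le", "zhe", "ma", "zài", "cóng", "shì"}

def style_title(words):
    """Apply headline-style capitalisation to pre-segmented Pinyin words."""
    styled = []
    for i, word in enumerate(words):
        # Assumption: the first word is capitalised whatever its word class.
        if i > 0 and word.lower() in LOWERCASE:
            styled.append(word.lower())
        else:
            # Upper-case only the first letter so spellings like "S-Rank" survive.
            styled.append(word[0].upper() + word[1:])
    return " ".join(styled)

print(style_title(["wǒ", "de", "fans", "quán", "shì", "jiāngshī"]))
# Wǒ de Fans Quán shì Jiāngshī
```

A plain set lookup cannot tell a particle from a content word with the same spelling, so an editor still needs to review the result.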
### Word Spacing, Fusions, and Hyphenation
Chinese characters (*Hanzi*) represent syllables, but Pinyin represents words (*Ci*). We standardise boundaries as follows:
- **Compounds**: Fuse standard dictionary compounds. If a word is a single concept composed of multiple characters, it is one word (e.g., *Péngyǒu* for Friend, not *Péng Yǒu*).
- **Names**: Surname First (Capitalised) + Given Name (Capitalised and Fused). Unlike Korean, Chinese given names are **fused** without a hyphen (e.g., *Máo Zédōng*).
- **Erhua (R-coloring)**: The suffix **儿 (-r)** is fused to the preceding syllable (e.g., *Huār* not *Huā Er*).
- **Separation**: Separate words that do not form a tight compound. Subject-Verb-Object structures are always separated.
- **Apostrophes**: Use an apostrophe strictly to disambiguate syllable boundaries when a syllable starting with *a*, *o*, or *e* follows another syllable (e.g., *Xī'ān* vs *Xiān*).
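The apostrophe rule can be illustrated with a short sketch. The helper name `join_syllables` is hypothetical, and the input is assumed to be a list of syllables written with precomposed (NFC) tone-marked vowels.

```python
# Vowels a, o, e with and without tone marks (assumes NFC-normalised input).
AOE_INITIALS = "aāáǎàoōóǒòeēéěè"

def join_syllables(syllables):
    """Fuse syllables into one word, adding an apostrophe before a/o/e syllables."""
    word = syllables[0]
    for syllable in syllables[1:]:
        # Only a non-initial syllable starting with a, o, or e needs the apostrophe.
        if syllable[0].lower() in AOE_INITIALS:
            word += "'"
        word += syllable
    return word

print(join_syllables(["Xī", "ān"]))   # Xī'ān
print(join_syllables(["Xiān"]))       # Xiān
print(join_syllables(["Dà", "xué"]))  # Dàxué
```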
!!! warning Tone Sandhi
We romanise using the **original dictionary tone**, not the sandhi (changed) tone.
- **Nǐ hǎo (你好)** not *Ní hǎo* (even though it is pronounced with a rising tone on the first syllable).
- **Yī (一)** and **Bù (不)** retain their dictionary tones (*yī*, *bù*) regardless of the tone of the following word, to ensure search consistency.
!!!
## Actual Title Showcasing the Style
重生之黑客系统的逆袭 ~在异世界当骇客,我的粉丝全是僵尸~
Chóngshēng: Hacker Xìtǒng de Nìxí ~Zài Yìshìjiè Dāng Hacker, Wǒ de Fans Quán shì Jiāngshī~
- **Loanwords**: 黑客 (*Hēikè*) / 骇客 (*Hàikè*) → Hacker. 粉丝 (*Fěnsī*) → Fans.
- **Native Nouns**: *Xìtǒng* (System), *Yìshìjiè* (Otherworld), *Jiāngshī* (Zombie) are kept in Pinyin as they are standard native/Sino compounds or not direct phonetic loans.
- **The "Zhi" Rule**: *Chóngshēng* (Rebirth) is followed by 之 (*Zhī*) acting as a separator, so it becomes a colon.
- **Particles**: *de* (possessive) is separate and lowercase.
- **Prepositions/Copula**: *zài* (at/in) and *shì* (is) are separate and lowercase; *Zài* is capitalised in this title only because it is the first word of the subtitle.
- **Compounds**: *Nìxí* (Counterattack) is fused.
## Further Clarification
### Loanword Restoration Rule
Chinese often absorbs foreign words phonetically. To improve readability, we restore these to their original foreign counterparts if they are **direct phonetic transliterations**.
- 巴士 (*Bāshì*) → **Bus**
- 沙发 (*Shāfā*) → **Sofa**
- 巧克力 (*Qiǎokèlì*) → **Chocolate**
- 朋克 (*Péngkè*) → **Punk**
- S级 (*S-jí*) → **S-jí** (or **S-Rank** if "Rank" is implied contextually).
!!! note Semantic Translations vs Phonetic Loans
We do **not** restore words that are semantic translations (words coined from the meaning rather than the sound).
- 电脑 (*Diànnǎo* - Electric Brain) → **Diànnǎo** (Keep Pinyin, do not write *Computer*).
- 手机 (*Shǒujī* - Hand Machine) → **Shǒujī** (Keep Pinyin, do not write *Mobile*).
- **Only** restore words where the Chinese characters were chosen specifically to mimic the English/foreign sound (Phonetic Loans).
!!!
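The distinction between phonetic loans and semantic translations amounts to a lookup with a Pinyin fallback. The sketch below is only an illustration: the table `PHONETIC_LOANS` holds just the examples listed above, and the function name `render_word` is hypothetical.

```python
# Phonetic loans from the list above: hanzi -> restored spelling.
PHONETIC_LOANS = {
    "巴士": "Bus",
    "沙发": "Sofa",
    "巧克力": "Chocolate",
    "朋克": "Punk",
}

def render_word(hanzi, pinyin):
    """Return the restored spelling for a known phonetic loan, else the Pinyin."""
    return PHONETIC_LOANS.get(hanzi, pinyin)

print(render_word("巴士", "Bāshì"))    # Bus
print(render_word("电脑", "Diànnǎo"))  # Diànnǎo (semantic translation, kept)
```

Keeping the table explicit means a word is restored only after an editor has confirmed that it is a phonetic loan.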
### The Particle "De" (的 / 地 / 得)
The neutral tone particle *de* is the most common grammatical marker. It is **always separated** and **lowercase**.
- 红色跑车 (*Hóngsè Pǎochē*) → **Hóngsè Pǎochē** (Red Sports Car - no particle)
- 红色的跑车 (*Hóngsè de Pǎochē*) → **Hóngsè de Pǎochē** (Red-colored Sports Car)
- 慢慢地走 (*Mànmàn de Zǒu*) → **Mànmàn de Zǒu** (Walk slowly - adverbial marker)
- 跑得快 (*Pǎo de Kuài*) → **Pǎo de Kuài** (Run fast - complement marker)
!!! tip Tones on Particles
The grammatical particles listed in the table below are usually pronounced in the neutral tone, even when they also have a dictionary tone variant. For metadata consistency and to reduce keyboard friction, prefer the **neutral** form (no tone mark) for these particles, unless distinct emphasis is required.
!!!
### Copula (Is/Am/Are)
The verb **shì** (是) functions as the copula. Although it is syntactically a verb, we treat it as a functional connector for title styling, in line with our Japanese/Korean guidelines.
- **Lowercase** *shì* when it stands alone as a simple equating verb in a title.
- **Fuse and Capitalise** if it is part of a lexicalised compound (e.g., *Dànshì* - But).
### Hyphenation and Special Characters
### Personal Names
Surname + Given Name
- 李小龙 → **Lǐ Xiǎolóng**
- 司马光 → **Sīmǎ Guāng** (Compound Surname *Sima* is fused).
### Honorifics and Prefixes
Chinese uses prefixes/suffixes for familiarity. Hyphenate these to clarify they are not part of the name proper.
- 老王 (*Lǎo Wáng*) → **Lǎo-Wáng** (Old Wang)
- 小猫 (*Xiǎo Māo*) → **Xiǎo-Māo** (Little Cat, as a nickname)
- 这里的黎明静悄悄 → *Zhèlǐ de Límíng Jìngqiāoqiāo* (No honorifics here, just standard fusion).
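The name and prefix conventions above can be sketched as follows. The helper names `fuse`, `format_name`, and `with_prefix` are hypothetical, and the sketch leaves out the apostrophe rule for brevity.

```python
def fuse(syllables):
    """Fuse syllables into one capitalised word (no space, no hyphen)."""
    word = "".join(s.lower() for s in syllables)
    return word[0].upper() + word[1:]

def format_name(surname, given):
    """Surname first, then the fused given name."""
    return f"{fuse(surname)} {fuse(given)}"

def with_prefix(prefix, name):
    """Hyphenate a familiarity prefix such as Lǎo or Xiǎo to a name."""
    return f"{fuse([prefix])}-{name}"

print(format_name(["lǐ"], ["xiǎo", "lóng"]))  # Lǐ Xiǎolóng
print(format_name(["sī", "mǎ"], ["guāng"]))   # Sīmǎ Guāng
print(with_prefix("lǎo", "Wáng"))             # Lǎo-Wáng
```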
### Numbers and Counters
Use Arabic numerals. Hyphenate the counter.
- 第101次 → **Dì 101-cì** (The 101st Time)
- 3个人 → **3-gè Rén** (3 People)
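A minimal sketch of the counter rule, assuming the number has already been converted to Arabic numerals. The function name `with_counter` and its `ordinal` flag are hypothetical.

```python
def with_counter(number, counter, ordinal=False):
    """Hyphenate an Arabic numeral with its counter; prefix Dì for ordinals."""
    phrase = f"{number}-{counter}"
    return f"Dì {phrase}" if ordinal else phrase

print(with_counter(101, "cì", ordinal=True))  # Dì 101-cì
print(with_counter(3, "gè") + " Rén")         # 3-gè Rén
```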
### The Character "Zhī" (之)
**As a Separator**: If it appears after the initial noun phrase to introduce the subtitle/description, replace it with a **Colon (:)**.
- *System Zhī Wáng* → **System: Wáng**
**As a Possessive**: If it appears inside a phrase meaning "of", write as **zhī** (lowercase).
- *Sānfēn zhī Yī* (One third) → **Sānfēn zhī Yī**
## Detailed Tables
### Common Particles & Functional Words
These words should generally be **lowercase** and **separated**.
| Pinyin | Chinese | Function | Notes |
| ------ | ------- | -------- | ----- |
| **de** | 的 / 地 / 得 | Possessive / Adverbial / Complement | Always neutral tone in this context. |
| **le** | 了 | Aspect particle (Completion) | |
| **zhe** | 着 | Aspect particle (Continuous) | |
| **guò** | 过 | Aspect particle (Experience) | Write with tone *guò* or neutral. |
| **men** | 们 | Plural marker | Usually fused to pronouns (*Wǒmen*), separate for nouns (*Rén men*). |
| **ba** | 吧 | Modal particle (Suggestion) | |
| **ma** | 吗 | Question particle | |
| **ne** | 呢 | Query particle | |
| **bèi** | 被 | Passive marker | |
| **gěi** | 给 | Preposition (To/For) | |
| **zài** | 在 | Preposition (At/In) | |
| **hé / yǔ** | 和 / 与 | Conjunction (And) | |
| **shì** | 是 | Copula (Is/Am/Are) | Lowercase in titles. |
| **wéi** | 为 | Copula (Act as/Become) | Lowercase in titles. |
### Suffixes (Treated as Loanwords/Restoration)
| Chinese | Pinyin Reading | Restoration | Condition |
| ------- | -------------- | ----------- | --------- |
| S级 | *S-jí* | S-Rank | If context implies ranking. |
| APP | *A-P-P* | App | |
| V | *V* | V | As in 大V (Big Influencer). |
## LLM Ruleset and Prompts
### YAML Ruleset
```yaml
romanization_style:
  name: "MangaBaka Chinese Romanization Style"
  version: "1.0"
  base_system: "Hanyu Pinyin (ISO 7098)"
  tones:
    usage: "obligatory"
    type: "diacritics"
    rule: "Use original dictionary tones (e.g., Nǐ hǎo), do not apply tone sandhi rules."
    neutral_tone: "Do not mark neutral tones (e.g., de, le, zi)."
    vowels: "Use ü for u-umlaut (e.g., nǚ)."
  capitalization: >
    Capitalize all Nouns, Verbs, Adjectives, Adverbs, Pronouns, and Auxiliaries.
    Lowercase grammatical particles, prepositions, and copulas (shì, wéi).
  word_segmentation:
    compounds: "Fuse all lexicalized dictionary compounds (e.g., Dàxué, Shénme)."
    phrases: "Separate words that are syntactically distinct (Subject Verb Object)."
    names: "Surname (Cap) + GivenName (Cap+Fused). Example: Lǐ Xiǎolóng."
    erhua: "Fuse the 'r' suffix to the stem (e.g., Huār)."
  special_characters:
    zhi_rule: >
      If '之' functions as a subtitle delimiter, romanize as a colon (:).
      If '之' functions as a possessive/grammatical particle, romanize as 'zhī' (lowercase).
    apostrophe: "Use strictly before a, o, e syllables that follow another syllable (e.g., Xī'ān)."
  loanwords: >
    Restore English spelling ONLY if the Chinese word is a direct phonetic transliteration.
    Examples: Bāshì -> Bus, Fěnsī -> Fans, Jiákè -> Jacket.
    Do not restore semantic translations (e.g., Shǒujī stays Shǒujī, not Mobile).
  particles: >
    Always separate and lowercase: de, le, zhe, guò, ba, ma, ne, bèi, gěi, zài, hé, yǔ.
    Copulas (shì, wéi) are lowercase.
  hyphens: |
    Use hyphens for:
    - Honorific prefixes/suffixes (Lǎo-Wáng, Xiǎo-Lǐ).
    - Arabic numerals + Counters (10-gè, Dì 1-cì).
  example_validation:
    input: "重生之在大巴上捡到了神级系统"
    output: "Chóngshēng: Zài Bus shàng Jiǎndào le Shénjí Xìtǒng"
    explanation: "Zhī becomes colon. Dàbā restored to Bus. Zài/shàng are prepositions/locatives (lowercase/separate). Shénjí (God-level) is fused."
```
___
## Old Guidelines (Kept temporarily for internal reference. DO NOT USE)
## Overview
This public guide explains how Mandarin Chinese titles are romanised and styled for display on MangaBaka. The goal is readable, frontend-friendly Pinyin with clear editorial rules that keep titles consistent across the site.
**Canonical system:** **Hanyu Pinyin with tone marks** for display, with editorial fallbacks for clarity.
## Title field priorities
* **Native Title:** the original Chinese characters as shown on the cover or source (Simplified or Traditional).
* **Romanized Title:** the Pinyin shown to users (with tone marks where applicable).
## Pinyin display rules
* Use **diacritics (tone marks)** for display: e.g., 鹤报刀法,我一刀一个 → **Hébào Dāofǎ, Wǒ Yì Dāo Yí Gè**.
* Use **ü** for the vowel ü in display (e.g., nǚ).
* Use an apostrophe to disambiguate syllable boundaries when necessary (Xi'an).
* Capitalize content words in title-style capitalization; keep grammatical particles (de, di, le, zhe, ba, ne, ma, ye) lowercased when they are functional.
## 之 as subtitle divider
When 之 is used as a subtitle divider, it is romanised as **a colon (':')**, but when it is used as a possessive, it has to be romanised as **zhī**.
## Loanword Rule
* When a Chinese title clearly uses a loanword and there is a common real-world/English term, prefer the real-world term in the **romanisation** or as the visible English phrase.
> Example: if the title references a specific English concept such as "Rhythm Game," present that term visibly while also keeping pinyin for the native characters.
* Maintain the original Chinese title in the native field.
## Names & capitalization
* Follow Chinese name order (surname first) for metadata unless an established English name exists. In that case, include the common English form as an alternative display title.
* Capitalize main content words in display pinyin; keep small grammatical particles lowercased.
## Numbers, dates, counters
* Use Arabic numerals for clarity and sorting: 第四部 → **Di 4-bù** (display may include tone marks where useful).
* Hyphens for counters are acceptable to align with cross-language conventions: **Di 3-bù: ...**.
## Quotation & punctuation
* Decorative Chinese quotation marks in the native title are retained in the native field. For pinyin display, use standard ASCII quotes.
* Use apostrophes in pinyin display where they clarify syllable boundaries (e.g., **Xī'ān**).
## Examples
* Native: 鹤报刀法,我一刀一个
> Romanization: **Hébào Dāofǎ, Wǒ Yì Dāo Yí Gè**
* Native: 西安事变
> Romanization: **Xī'ān Shìbiàn**
* Native: 女
> Romanization: **nǚ**
## Notes for editors
* Do not adopt publisher romanisation as the default; publishers often use inconsistent romanisations.