MangaBaka Japanese Romanization Style
Table of contents
- Core Romanization Style
- Base Romanization
- Modified Hepburn Romanization
- Adjustments made to Modified Hepburn
- Regarding loanwords, non-native names, and native names
- Capitalization
- Word Spacing, Fusions, and Hyphenation
- Expanded Explanation and Examples
- Auxiliaries
- Thematic Portmanteaus and Coined Terms (Zougo)
- Hyphenation
- Honorifics, Titles, and Household/Clan names
- Multi-word Expressions
- Arabic numerals and Their Counters and Modifiers
- Romanization of Native Kanji Numerals and Their Counters
- Romanization of Quotations
- Romanization of Other Special Characters
- Particles, Copula, and Conjunctives
- Particle LUT
- Copula LUT
- Fused or Split Romanization cases
- ので
- のに
- なので and なのに
- だって
- では
- な + ん + Copula Sequence
- じゃ + Adjective Sequences
- Noun + Suru Romanization
- Suffixes and Bound Elements
- Bound Elements
- Suffix-like Content Designators
- Prefixes
- Dai- (大) Prefix in Romanization
- Always fused (Fully Lexicalized / Dictionary Compounds)
- Usually separate (Productive Descriptive Modifier)
- Title Examples (for test cases)
- Title 1 MB:359750
- Title 2 MB:8670
- Title 3 MB:30392
- Title 4 MB:46802
- Title 5 MB:28009
- Title 6 MB:9333
- LLM Ruleset and prompts
- YAML ruleset
- YAML Ruleset (2) adjusted by Gemini -- TESTING
Core Romanization Style
This style guide defines how to consistently romanize and style Japanese-language titles for metadata purposes.
The goal is to provide a standard that is easy to read, type, and apply across various databases, while staying true to linguistic norms.
Base Romanization
Modified Hepburn Romanization
At the base we use Modified Hepburn as the romanization, meaning:
- は, へ, and を as particles are always romanized as wa, e, and o.
- ん is always romanized as n, or as n' (with a straight apostrophe) when the ん is followed by a vowel or y.
- ず and づ are both romanized as zu, as they are phonetically identical.
Examples
- 私は学校へ友達を迎えに行きます is romanized as Watashi wa Gakkou e Tomodachi o Mukae ni Ikimasu.
- 婚約 is romanized as Kon'yaku, and similarly 転移 is romanized as Ten'i.
- 女 is romanized as Onna; there is no need to put an apostrophe in the double n (looking at you, Google Translate).
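The apostrophe rule for ん can be sketched as a small helper. This is only an illustration: the function name `join_morae` and the `"N"` marker for ん are assumptions made for this example, and the input is assumed to be already split into romanized morae.

```python
VOWELS_AND_Y = set("aiueoy")

def join_morae(morae):
    """Join romanized morae into one capitalized word.

    "N" is a hypothetical marker for the moraic nasal ん.
    """
    out = []
    for i, mora in enumerate(morae):
        if mora == "N":
            nxt = morae[i + 1] if i + 1 < len(morae) else ""
            # Apostrophe only when the next mora starts with a vowel or y.
            out.append("n'" if nxt[:1] in VOWELS_AND_Y else "n")
        else:
            out.append(mora)
    return "".join(out).capitalize()

print(join_morae(["ko", "N", "ya", "ku"]))  # 婚約
print(join_morae(["te", "N", "i"]))         # 転移
print(join_morae(["o", "N", "na"]))         # 女
```

The last call shows why 女 needs no apostrophe: the mora after ん starts with n, not with a vowel or y.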
Adjustments made to Modified Hepburn
We have made the following adjustments to Modified Hepburn for ease of use, clarity, and community recognition:
- Represent long vowels using full-length spellings (e.g., ou, oo, uu, aa, ee, ii) instead of using macrons (e.g., ō, ū, ā, ē, ī).
- Represent the sokuon (っ) before a “ch” syllable as cch instead of the “strict” Modified Hepburn tch; the tch romanization shall still be used in the alternative title.
About the sokuon (っ)
We intentionally deviate from strict Modified Hepburn romanization when representing the sokuon (っ) before “ch” syllables in the primary romanization. This choice aligns with common usage and recognition within the manga and anime community and preserves visual phonetic consistency, making it more readable and immediately recognizable (e.g., あっちこっち → Acchikocchi and not Atchikotchi, ぼっち・ざ・ろっく! → Bocchi the Rock and not Botchi the Rock).
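Because the strict tch spelling is still wanted for the alternative title, it can be derived mechanically from the primary form. The sketch below assumes the primary title is already romanized; the function name `alternative_sokuon` is hypothetical. Restored loanwords that happen to contain "cch" would need to be excluded by hand.

```python
def alternative_sokuon(primary: str) -> str:
    """Derive the strict Modified Hepburn spelling (tch) from the primary spelling (cch)."""
    # Plain substring replacement; assumes every "cch" comes from っ + ch-syllable.
    return primary.replace("cch", "tch")

print(alternative_sokuon("Acchikocchi"))
print(alternative_sokuon("Bocchi the Rock"))
```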
Regarding loanwords, non-native names, and native names
- Render Non-assimilated or identifiable loanwords, and non-Japanese names in their established source equivalents whenever identifiable (e.g., スキル → Skill, ダンジョン → Dungeon, クリス → Chris).
- Retain clipped slang, or fully lexicalized loanwords in everyday Japanese, where the Japanese phonology or sociocultural meaning is distinct (e.g., ギャル → Gyaru, バイト → Baito, ラブホ → Rabuho).
- Always romanize native Japanese personal and place names using Japanese romanization rules, even if an English exonym is widely known (e.g., 東京 → Toukyou, 大阪 → Oosaka).
About ギャル
ギャル is a perfect example of why certain loanwords should be retained in their Japanese form rather than “restored” to English. Although the term ultimately derives from gal, ギャル has long since diverged in meaning and now refers to a distinctly Japanese sociocultural category with its own aesthetics, subtypes, and historical context. Rendering ギャル as Gal erases this distinction and falsely implies a generic English meaning, whereas Gyaru accurately reflects the term as it functions in Japanese today.
In cases like this, the original foreign source is no longer a recoverable identity because the meaning has become distinct; preserving the kana-derived form avoids semantic loss and cultural flattening.
Capitalization
As we are romanizing titles of Japanese literature, we use the following capitalization style:
- Capitalize all nouns, verbs, adjectives, and adverbs.
- Keep all particles, copulas (and their inflections), and conjunctives in lowercase, even in sentence-final position.
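As a rough illustration of the capitalization rule, the sketch below styles a list of already-segmented tokens. The role labels and the function name `style_title` are assumptions for this example; in practice the segmentation and tagging are the hard part and are not shown here.

```python
LOWERCASE_ROLES = {"particle", "copula", "conjunctive"}

def style_title(tokens):
    """tokens: list of (romaji, role) pairs; the roles are supplied by the caller."""
    words = []
    for text, role in tokens:
        if role in LOWERCASE_ROLES:
            words.append(text.lower())
        else:
            # Nouns, verbs, adjectives, and adverbs get an initial capital.
            words.append(text[:1].upper() + text[1:])
    return " ".join(words)

tokens = [
    ("watashi", "noun"), ("wa", "particle"),
    ("gakkou", "noun"), ("e", "particle"),
    ("tomodachi", "noun"), ("o", "particle"),
    ("mukae", "noun"), ("ni", "particle"),
    ("ikimasu", "verb"),
]
print(style_title(tokens))
```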
Word Spacing, Fusions, and Hyphenation
Keep all words—including particles, copulas (and their inflections), and conjunctives—separate, except for the following specific cases:
- Conjunctives & Grammar Bridges are kept fused and lowercase when functioning as a single unit rather than a sequence of distinct parts (e.g., なのに → nanoni, でも → demo, ですが → desuga).
- Morpheme ん (from の) is fused with the following element to prevent a loose letter in the title romanization (e.g., んです → ndesu, んだけど → ndakedo).
- Auxiliaries are always fused to the main verb or adjective, whether via the -te form or the stem (e.g., 見てください → Mitekudasai, 可愛すぎる → Kawaisugiru, してない → Shitenai).
- Fossilized adverbs are kept fused as a single unit (e.g., いつの間にか → Itsunomanika, なんとなく → Nantonaku).
- Lexicalized Compounds & Generic Groups are fused when they form a single dictionary unit, a profession, or a general category. (e.g., 女の子 → Onnanoko, 文芸部 → Bungeibu, 駄菓子屋 → Dagashiya).
- Thematic Portmanteaus and Coined Terms are treated as single, fused lexical units, even if they are not found in standard dictionaries. (e.g., 異世改活 → Iseikaikatsu, 死に戻り → Shinimodori).
- Reduplications (Echo) are fused into a single word when a word is repeated for emphasis or as onomatopoeia (ムリムリ → Murimuri, ワクワク → Wakuwaku, フワフワ → Fuwafuwa).
- Leading honorifics o- and go- are fused without a hyphen and capitalized (お姫 → Ohime, ご主人 → Goshujin).
- Hyphenation is used to protect the identity of a Proper Noun or to mark functional/relational boundaries that are not part of a compound noun’s core identity.
Kudasai (ください) in modern Japanese grammar is a request auxiliary when it is preceded by the -te form of a verb, and is thus always fused like auxiliary verbs (見てください → Mitekudasai).
Two consecutive -te form verbs are always separated; any auxiliaries present are fused to the immediately preceding (final) -te form.
- 放っておく→ Houtteoku: Is kept fused, as it is a set dictionary expression / compound verb.
- 放っておいてくれ → Houtte Oitekure: The added auxiliary verb turns oku into its -te form, separating it from houtte; the auxiliary kureru is then fused.
- 放っておいてくれません → Houtte Oitekuremasen: An extra contracted auxiliary -masen is present, and is directly fused to the preceding auxiliary verb.
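The three 放って examples can be reproduced with a small joining routine. This is a sketch under the assumption that the caller has already tagged each segment as a -te form verb (`"te"`) or an auxiliary (`"aux"`); the function name and tags are hypothetical.

```python
def join_te_chain(parts):
    """parts: list of (romaji, kind), kind being "te" (-te form verb) or "aux"."""
    words = []
    for text, kind in parts:
        if kind == "te" or not words:
            words.append(text)   # every -te form verb starts a new word
        else:
            words[-1] += text    # auxiliaries fuse to the preceding word
    return " ".join(w[:1].upper() + w[1:] for w in words)

print(join_te_chain([("houtte", "te"), ("oku", "aux")]))
print(join_te_chain([("houtte", "te"), ("oite", "te"), ("kure", "aux")]))
print(join_te_chain([("houtte", "te"), ("oite", "te"), ("kure", "aux"), ("masen", "aux")]))
```

Note that oku is tagged as an auxiliary in its dictionary form but as a -te form once it becomes oite, which is what splits it from houtte.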
Suru and Suru Inflected Forms Following Nouns
- When suru or its inflected forms (e.g., shita, sareta) attach to a noun, they are romanized separately and with the suru-form capitalized (e.g., 転生した → Tensei Shita).
- Auxiliary elements are directly fused to the suru-form without a hyphen (e.g., 毛嫌いしていた → Kegirai Shiteita, 結婚してください → Kekkon Shitekudasai).
- Fixed or lexicalized suru-verbs such as 恋する (Koisuru), 愛する (Aisuru), 画する (Kakusuru) are romanized fused, since suru is part of the verb stem.
Note that -su verbs (stem) are unrelated to suru; their -shite or -shita forms come from regular conjugation (e.g., 隠して → Kakushite, 話した → Hanashita).
Semantic Binomial and Determinative Compounds
Semantic binomial compounds — fixed expressions formed from two or more morphemes in an “X and Y” relationship — may be split and romanized separately for clarity and consistency.
Similarly, determinative compound nouns — where the first element modifies or specifies the second — are also romanized separately even when written as a single word in Japanese.
- 王侯貴族 → Oukou Kizoku (“kings and nobles”)
- 士農工商 → Shinou Koushou (“warriors, farmers, artisans, and merchants”)
- 国外追放 → Kokugai Tsuihou (“banishment from the country”)
- 国内旅行 → Kokunai Ryokou (“domestic travel”)
Strongly lexicalized idioms, such as four-character idiomatic compounds (yojijukugo), should remain fused, since they function as a single lexical unit:
- 悠々自適 → Yuuyuujiteki
- 一石二鳥 → Issekinichou
Expanded Explanation and Examples
Auxiliaries
Japanese auxiliary elements are always fused with the preceding verb or adjective to modify aspect, voice, or nuance, and can be roughly divided into three groups.
- Auxiliary verbs (e.g. -kuru, -iku, -shimau, -ageru, -morau, -kudasai, etc.) typically follow the -te form of a main verb and express sequence, aspect, completion, or request.
- Contracted auxiliaries (e.g. -iru → -teiru/-teru, -ita → -teita/-teta, -iyou → -you, -mashita, etc.) represent shortened or conjugated forms of auxiliary verbs.
- Adjectival auxiliaries (e.g. -sugiru, -yasui, -nikui, -rashii, etc.) express degree, desire, or ease/difficulty.
Some Examples of the three groups
- 頑張ってくれ → Ganbattekure
- 見てください → Mitekudasai
- 行ってしまった → Itteshimatta
- 楽しんでいたら → Tanoshindeitara
- 食べてる → Tabeteru
- していた → Shiteita
- 可愛すぎる → Kawaisugiru
- 集まりすぎました → Atsumarisugimashita
- 読みやすい → Yomiyasui
Thematic Portmanteaus and Coined Terms (Zougo)
Many Japanese titles feature unique, coined compounds (造語, zougo) created by blending or clipping multiple existing words into a single keyword. These function as a specific "brand name" or thematic identity for the series.
For romanization, these are to be treated as single, fused lexical units, even if they are not found in standard dictionaries. This preserves the "wordplay" and distinguishes the term from standard descriptive modifiers. If the component parts are clipped or intended to be read as a single concept, they should always be fused.
Examples
Hyphenation
The word following the hyphen should always be in lower-case and without extra spaces surrounding the hyphen.
Honorifics, Titles, and Household/Clan names
- 直美さん is romanized as Naomi-san (name + honorific)
- お兄ちゃん is romanized as Onii-chan (kinship + “honorific”)
- 田中先生 is romanized as Tanaka-sensei (name + title)
- 魔王様 is romanized as Maou-sama (demon king + honorific)
- 天使様 is romanized as Tenshi-sama (rank + honorific)
- 聖騎士さま is romanized as Seikishi-sama (rank + honorific)
- 京兼家 is romanized as Kyougane-ke (household/clan name)
- 篝家 as Kagari-ke (household/clan name)
Titles that are a standalone noun title, such as 陛下, are always romanized separately and without a hyphen; 国王陛下 → Kokuou Heika.
Be careful with lexicalized kinship terms, as they are fused without a hyphen.
- お姉さん / おねえさん is romanized as Oneesan (elder sister)
- お兄さん / おにいさん is romanized as Oniisan (elder brother)
- 叔父さん / おじさん is romanized as Ojisan (uncle or middle-aged man)
- 叔母さん / おばさん is romanized as Obasan (aunt or middle-aged woman)
- 奥さん is romanized as Okusan (wife)
- お嫁さん is romanized as Oyomesan (bride + polite prefix)
- おっさん is romanized as Ossan (old man)
- 皆さん is romanized as Minasan (everyone)
Do not hyphenate -sama when the entire word is an established noun in dictionaries and/or functions as a fixed vocative rather than a compositional “noun + title” expression.
- 神様 is romanized as Kamisama (god)
- 姫様 is romanized as Himesama (princess)
- 旦那様 is romanized as Dannasama (husband or master of x)
- お嬢様 is romanized as Ojousama (young lady)
- ご主人様 is romanized as Goshujinsama (master)
- おひとり様 is romanized as Ohitorisama (a person alone)
Avoid double hyphenation by fusing the honorific when applicable.
- れい姉ちゃん is romanized as Rei-neechan
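The honorific rules amount to "hyphenate unless the whole word is lexicalized". A minimal sketch, assuming an exception set that only samples the lists above (a real list would be longer) and a hypothetical function name `attach_honorific`:

```python
# Sample of lexicalized forms taken from the lists above; not exhaustive.
LEXICALIZED = {
    "Oneesan", "Oniisan", "Ojisan", "Obasan", "Okusan", "Oyomesan", "Minasan",
    "Kamisama", "Himesama", "Dannasama", "Ojousama", "Goshujinsama", "Ohitorisama",
}

def attach_honorific(base: str, honorific: str) -> str:
    """Fuse when the whole word is lexicalized, otherwise hyphenate in lowercase."""
    fused = base + honorific.lower()
    if fused in LEXICALIZED:
        return fused
    return f"{base}-{honorific.lower()}"

print(attach_honorific("Naomi", "san"))
print(attach_honorific("Maou", "sama"))
print(attach_honorific("Kami", "sama"))
```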
Multi-word Expressions
These words function as a single semantic or grammatical unit, particularly modifier–noun phrases, semi-lexicalized constructions, or commonly recognized set expressions. Hyphenation is especially common when the first element is a katakana loanword or a Roman-letter term or abbreviation, where the hyphen helps clarify word boundaries and preserves the perception of a single semantic unit.
- Sランク is romanized as S-Rank (Considered as a fully loaned word, so it is capitalized as such).
- バイト先 is romanized as Baito-saki (先 is fused with a hyphen due to the katakana loanword; location).
- ポーション師 is romanized as Potion-shi (師 is fused with a hyphen due to the katakana loanword; master).
- S級 is romanized as S-kyuu (級 is fused with a hyphen due to the Roman letter term; class).
- 僕たち is romanized as Boku-tachi (たち is always fused with a hyphen; pluralization).
- 環状戦 is romanized as Kanjou-sen (戦 is always fused with a hyphen; battle/war/fight).
- 契約婚 is romanized as Keiyaku-kon (“contract marriage”, hyphenated due to the clipped 結婚).
Arabic numerals and Their Counters and Modifiers
Arabic numerals followed by counters are always fused with a hyphen to the counter.
The counter uses its regular numeral-attached reading, i.e. the standard form listed in dictionaries; this is most often the on'yomi, but not always.
- 2人 is romanized as 2-nin (and not as 2-ri using the -ri from “futari” special reading).
- 1000体 is romanized as 1000-tai.
- 10年 is romanized as 10-nen.
- 8月31日 is romanized as 8-gatsu 31-nichi.
- 8号 is romanized as 8-gou.
Avoid double hyphenation by fusing the modifier to the counter or by fusing the counter to the preceding number unit.
- 10年間 is romanized as 10-nenkan as number + counter (年) + fused modifier (間).
- 222日目 is romanized as 222-nichime as number + counter (日) + fused modifier (目).
- 31番目 is romanized as 31-banme as number + counter (番) + fused modifier (目).
- 6歳上 is romanized as 6-saijou as number + counter (歳) + fused lexicalized (上).
- 10歳下 is romanized as 10-saishita as number + counter (歳) + fused lexicalized (下).
- 8万枚 is romanized as 8-manmai as number + number unit (万) + fused counter (枚).
- 3億円 is romanized as 3-okuen as number + number unit (億) + fused counter (円).
- 10万年 is romanized as 10-mannen as number + number unit (万) + fused counter (年).
When 後 or 前 follow an Arabic-numeral counter construction, they are treated as unit-final temporal suffixes and read as -go and -mae respectively; they are fused to the counter to avoid double hyphenation.
- 100日後 is romanized as 100-nichigo
- 100年後 is romanized as 100-nengo
- 100日前 is romanized as 100-nichimae
- 100年前 is romanized as 100-nenmae
Note: While -zen is technically a Sino-Japanese reading (On-yomi) that matches -go, in modern Japanese, -zen as a suffix is almost exclusively formal, historical, or academic.
When the counter is a loanword, write the number and counter as they would be written in English; the hyphen can then be used for the modifier.
- 5キロ is romanized as 5 Kilo (Capitalized because it is used in a title).
- 5キロ減 is romanized as 5 Kilo-gen (number + counter + modifier).
- 2メートル is romanized as 2 Meter (Capitalized because it is used in a title).
- 2メートル越え is romanized as 2 Meter-koe (number + counter + modifier).
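The "single hyphen" pattern for Arabic numerals can be summarized in one formatting helper. This is a sketch only: the function name `arabic_counter` and its parameters are assumptions, and the readings of the counter, number unit, and modifier must be supplied by the caller.

```python
def arabic_counter(number, counter, unit="", modifier="", loanword=False):
    """Format an Arabic numeral with its counter.

    unit: number unit such as "man" or "oku"; modifier: e.g. "kan", "me", "go", "mae".
    """
    if loanword:
        # Loanword counters are written as in English; only the modifier is hyphenated.
        head = f"{number} {counter.capitalize()}"
        return f"{head}-{modifier}" if modifier else head
    # Exactly one hyphen: number, then unit + counter + modifier fused together.
    return f"{number}-{unit}{counter}{modifier}"

print(arabic_counter(2, "nin"))
print(arabic_counter(10, "nen", modifier="kan"))
print(arabic_counter(8, "mai", unit="man"))
print(arabic_counter(100, "nichi", modifier="go"))
print(arabic_counter(2, "meter", modifier="koe", loanword=True))
```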
Romanization of Native Kanji Numerals and Their Counters
Native kanji numerals followed by counters are always fused to the counter without a hyphen; any modifier present is also fused directly to the counter.
This ensures that special or suppletive numeral–counter readings, which reflect natural lexicalization, are correctly preserved in romanization.
- 二人 → Futari (and not Ninin)
- 一年 → Ichinen
- 四年生 → Yonensei
- 八男 → Hachinan
後 and 前 are treated as unit-final temporal suffixes in numeral constructions, and they are bound with a hyphen to the counter like normal bound suffixes.
- 二日前 → Futsuka-mae
- 数日前 → Suujitsu-mae
- 三年後 → Sannen-go
Romanization of Quotations
- Japanese quotation marks 「文章」 and 『文章』 are to be romanized to double straight quotation marks "text".
- Any straight or smart quotation marks “文章” are to be romanized to double straight quotation marks "text".
- If the lenticular brackets 【文章】 are clearly used as quotation, then they should also be romanized to double straight quotation marks "text".
In practice when lenticular brackets 【文章】 hold a "skill name" then it is usually romanized as [text].
Romanization of Other Special Characters
- Full-width square brackets [文章] or lenticular brackets 【文章】, when unclear if they are for quotation, are to be romanized to normal square brackets [text].
- Double angle brackets 《文章》 or the much less/greater-than signs ≪文章≫ are to be romanized to double straight quotation marks "text".
- Any other (Unicode) symbol present (×, ♥, ♡, ★, ☆, ○, ♂, ♀, etc.) shall be copied as-is, but there shall nearly always (overruled by author/cover styling) be a space around these symbols!
Some Examples
- 《魔力無限》 is romanized as "Maryoku Mugen".
- L♥DK is romanized as L♥DK, kept as-is without any spaces.
- まどか★マギカ is romanized as "Madoka★Magica", author/cover specific styling.
- 黒騎士♂、戦闘メイド♀に is romanized as Kurokishi ♂, Sentou Maid ♀ ni.
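The bracket and quotation mapping lends itself to a character translation table. The sketch below covers only the bracket rules, not symbol spacing (which depends on author/cover styling); the function name and the `lenticular_is_quote` flag are assumptions for this example.

```python
QUOTE_LIKE = "「」『』“”《》≪≫"

# Quote-like marks become straight double quotes; brackets become square brackets.
BASE_TABLE = {ord(ch): '"' for ch in QUOTE_LIKE}
BASE_TABLE.update({ord("【"): "[", ord("】"): "]", ord("［"): "[", ord("］"): "]"})

def normalize_brackets(text: str, lenticular_is_quote: bool = False) -> str:
    """Map Japanese quotation marks and brackets to their romanized equivalents."""
    table = dict(BASE_TABLE)
    if lenticular_is_quote:
        # Only when 【…】 is clearly used as a quotation.
        table[ord("【")] = '"'
        table[ord("】")] = '"'
    return text.translate(table)

print(normalize_brackets("《Maryoku Mugen》"))
print(normalize_brackets("【Skill】"))
print(normalize_brackets("【Skill】", lenticular_is_quote=True))
```

Other symbols such as ♥ or ★ pass through untouched, which matches the "copied as-is" rule.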
Particles, Copula, and Conjunctives
Particle LUT
| Expression | Romanization | Function / Meaning |
|---|---|---|
| は | wa | topic particle |
| が | ga | subject particle |
| を | o | object particle |
| の | no | Genitive or nominalizer particle |
| な | na | adjectival(-noun) linking particle |
| に | ni | particle “to / at / in” |
| と | to | particle “and / with / that” |
| で | de | particle “at / in / by means of” |
| へ | e | particle “toward” |
| も | mo | particle “also / even” |
| や | ya | particle “and / among other things” |
| だけ | dake | particle “only” |
| まで | made | particle “to / until” |
| から | kara | particle “from / because” |
| とか | toka | particle “and so on” / “for example” |
| より | yori | particle “x rather than y / x over y” |
| にて | nite | particle (formal / literary) “at / in / by means of / on the occasion of” |
| しか | shika | particle “nothing but / except / no more than” always followed by a negation |
| ばかり | bakari | particle “only / merely / nothing but / no more than” |
| ばっか | bakka | particle “only / merely / nothing but / no more than”; colloquial, clipped form of ばかり |
| なんか | nanka | particle “something like... / things like...”; not to be confused with 何か (Nanka) |
| なら | nara | particle (conditional) “if / as for” |
| など | nado | particle-like “etc” / “and the like” |
| として | toshite | particle-like “as / in the role of / for / from the viewpoint of” |
| けど | kedo | conjunctive particle “but / although” |
| でも | demo | conjunctive particle “but / however” |
| し | shi | conjunctive particle “and / not only that / but...” |
| だに | da ni | copula + particle, always separate, even if archaic or when some dictionaries list it as a “particle” |
| なの | na no | copula + nominalizer structure, always separate, even when used sentence-finally or labeled an “expression” |
| のか | no ka | nominalizer + question particle structure, always separate, even if some dictionaries list it as a “particle” |
| ので | no de | nominalizer + particle (“because / since”), causal relation; See ので (Fused or Split) |
| | node | conjunctive (“because / so”); See ので (Fused or Split) |
| のに | no ni | nominalizer + particle (“despite the fact that / although”), clause-linking; See のに (Fused or Split) |
| | noni | conjunctive (“although / even though”); See のに (Fused or Split) |
| なので | na node | copula + conjunctive node, split after Formal Nouns (e.g., You, Wake, Hazu); See なので (Fused or Split) |
| | nanode | lexicalized conjunctive “because / since”, after plain nouns and na-adjectives; See なので (Fused or Split) |
| なのに | na noni | copula + conjunctive noni, split after Formal Nouns (e.g., You, Wake, Hazu); See なのに (Fused or Split) |
| | nanoni | conjunctive “even though / despite that”, after plain nouns and na-adjectives; See なのに (Fused or Split) |
| では | de wa | copula/particle construction marking contrast or condition, often before negation; See では (Fused or Split) |
| | dewa | discourse marker (“well then / so / in that case”) at clause-start only; See では (Fused or Split) |
| それでは | soredewa | discourse marker / conjunctive expression “well then / so / in that case” |
| だけど | dakedo | lexicalized conjunctive “but / however / although” |
| だから | dakara | lexicalized conjunctive “therefore / that’s why / because …” |
| なんて | nante | lexicalized conjunctive “such a thing as / like” |
| か | ka | sentence-ending particle to indicate a question or after alternatives in a summary, always separate |
| ね | ne | sentence-ending particle “right / isn’t it”, always separate |
| わ | wa | sentence-ending particles for emphasis or tone (feminine), always separate |
| ぞ / ぜ / さ | zo / ze / sa | sentence-ending particles for emphasis or tone (masculine), always separate |
| もん | mon | sentence-ending / explanatory particle-like contraction (from もの) indicates reason or excuse, or dissatisfaction (feminine) |
Copula LUT
| Expression | Romanization | Function / Meaning |
|---|---|---|
| だ | da | plain copula |
| だが | daga | copula + が, lexicalized adversative conjunctive (“however / but”) |
| だった | datta | plain past copula |
| だったが | dattaga | copula + が, lexicalized adversative conjunctive (past) |
| です | desu | polite copula |
| ですが | desuga | lexicalized conjunctive (“but / however / nevertheless”) |
| ですから | desukara | polite conjunctive (“therefore / so”), lexicalized discourse connector, clause-initial, or sentence-final |
| でした | deshita | polite past copula |
| だろう | darou | plain conjectural (“probably / I suppose”) |
| でしょう | deshou | polite conjectural (“probably / I suppose”) |
| でしょ | desho | colloquial contraction of deshou (“right? / isn’t it”) |
| である | de aru | formal written copula (non-polite) |
| であった | de atta | formal written past copula |
| だと | da to | copula + quotative particle, always separate; marks said / thought / considered content |
| だって | da tte | copula + quotative particle, transparent structure; See だって (Fused or Split) |
| | datte | lexicalized particle (“even / after all / because”); See だって (Fused or Split) |
Fused or Split Romanization cases
ので
In practice, especially for Japanese titles, ので is nearly always the conjunctive node!
Romanize ので as single conjunctive node when:
- It functions as a single conjunctive particle meaning “because / so / since”.
- Commonly follows verbs, adjectives, or copulas to connect cause and result.
- Very often written as ので、 — with a comma — when starting a new clause.
- Example: 困ったので帰ります → Komatta node Kaerimasu (“Because I got into trouble, I’ll go home”).
Romanize ので as two particles no de — in extremely rare cases for titles — when:
- の no is the genitive or nominalizing particle, and で de marks the location or instrument (with / by / at).
- The meaning is literal “of X + with / by / at” rather than “because / so / since”.
- If you can replace の no with な na or remove の no entirely and the phrase still makes sense, it is two particles.
- Example: 本の写真で説明する → Hon no Shashin de Setsumei Suru (“Explain with pictures of the book”).
のに
In practice, especially for Japanese titles, のに is nearly always the conjunctive noni!
Romanize のに as single conjunctive noni when:
- It functions as a single conjunctive particle meaning “despite / even though”.
- Commonly follows i-adjectives, verbs, or noun+particle structures.
- Very often written as のに、 — with a comma — when starting a new clause.
- Example: みんな疲れているのに頑張っている → Minna Tsukareteiru noni Ganbatteiru (“Even though everyone’s tired, they’re pushing on”).
Romanize のに as two particles no ni — in extremely rare cases for titles — when:
- の no is the genitive or nominalizing particle, and に ni marks location or direction (at / in / on).
- The meaning is literal “of X” + “at / in / on” rather than “despite / even though”.
- Example: 君のに夢を置いた → Kimi no ni Yume o Oita (“I placed a dream in yours”).
なので and なのに
The sequences なので and なのに are handled differently depending on what precedes them.
Romanize なので or なのに as single conjunctive nanode or nanoni when they follow a “Concrete Noun” or a “Na-Adjective”:
- Functioning as a bridge meaning “because / since” or “even though / despite that” respectively.
- Concrete Nouns: Sensei nanode, Manga nanoni, Himitsu nanode.
- Na-Adjectives: Suki nanoni, Kirei nanode, Shizuka nanode.
Romanize なので or なのに as a copula + conjunctive structure na node or na noni when they follow a “Formal Noun”:
- This prevents these “weighty” Formal Nouns from being swallowed by the grammar.
- See the list below for these Formal Nouns (abstract concepts), which are generally written in kana only.
| Formal Noun | Meaning | Example Structure | Example Romanization |
|---|---|---|---|
| よう (You) | Seeming / Appearance | ようなので | You na node |
| わけ (Wake) | Reason / Way | わけなのに | Wake na noni |
| ため (Tame) | Sake / Purpose | ためなので | Tame na node |
| こと (Koto) | Thing / Fact | ことなのに | Koto na noni |
| はず (Hazu) | Expectation / Should | はずなので | Hazu na node |
| もの (Mono) | Object / Reason | ものなのに | Mono na noni |
| つもり (Tsumori) | Intention | つもりなので | Tsumori na node |
| とき (Toki) | Time / When | ときなのに | Toki na noni |
| ところ (Tokoro) | Place / Moment | ところなので | Tokoro na node |
Quick Test
If the word before na is an abstract concept such as “Time”, “Reason”, “Seeming”, “Intention”, “Expectation”, etc. the na is separated; na noni.
On the other hand if the word is a concrete “thing” then it is the full conjunctive; nanoni.
When na is separated for these formal nouns, the remaining element is always the conjunctive node or noni. There is never a "na no de" or "na no ni" structure in this context.
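The quick test can be written down as a lookup against the Formal Noun table. This is a minimal sketch: the function name `attach_nanode` is hypothetical, and the set contains only the nouns from the table above.

```python
FORMAL_NOUNS = {"You", "Wake", "Tame", "Koto", "Hazu", "Mono", "Tsumori", "Toki", "Tokoro"}

def attach_nanode(preceding: str, ending: str) -> str:
    """ending must be "nanode" or "nanoni"; preceding is the romanized word before it."""
    if ending not in ("nanode", "nanoni"):
        raise ValueError("ending must be 'nanode' or 'nanoni'")
    if preceding in FORMAL_NOUNS:
        # Split off na; the remainder is always the conjunctive node / noni.
        return f"{preceding} na {ending[2:]}"
    return f"{preceding} {ending}"

print(attach_nanode("You", "nanode"))
print(attach_nanode("Sensei", "nanode"))
print(attach_nanode("Toki", "nanoni"))
```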
だって
Romanize だって as datte (conjunctive particle) when:
- It functions as a single conjunctive particle meaning “even / after all / because”.
- It usually appears before nouns or at the beginning of a sentence.
- It can often be replaced with も mo without changing the meaning.
- Example: 女の子だって遊びたい → Onnanoko datte Asobitai (“Even girls want to play.”)
- Test: Onnanoko mo Asobitai
Romanize だって as da tte (copula + particle) when:
- It is literally da (copula “is/was”) followed by tte (quotative/colloquial topic particle).
- It usually follows a noun or adjective to state “X is” + quoted/topic element.
- You can replace だ da with です desu or だった datta and the meaning stays the same.
- Example: 「犯人は君だ」って言った → Hannin wa Kimi da tte Itta (“I said ‘The culprit is you’”)
- Test: Kimi desu tte Itta
では
If では follows a noun or phrase and directly governs a predicate (especially before negation, contrast, or evaluation), it is always romanized as two separate elements, de wa.
Any standalone adjective or adjectival noun following では is always separated and capitalized as an independent word:
- ではない → de wa Nai
- ではなく → de wa Naku
- ではいられない → de wa Irarenai
- では済まない → de wa Sumanai
If では appears sentence-initially or functions as a discourse marker meaning “well then / in that case”, treat it as lexicalized conjunctive dewa.
な + ん + Copula Sequence
The な + ん + copula sequence is treated as a single grammatical bridge:
- The morpheme ん (from の) fuses directly with the following copula or conjunctive as per the core style (e.g., んです → ndesu).
- The initial な is then fused to create a single lowercase unit which reflects the function of a fixed explanatory phrase (“it is that…” / “the reason is…”).
Ensure that nan comes from the particles な + の and not from 何 (“what”), as it would otherwise be separated and capitalized (e.g., 何ですか → Nan desu ka for “What is it?”).
Examples
- なんです → nandesu
- なんだ → nanda
Title Analysis of 地雷なんですか?地原さん
- Romanized as: Jirai nandesu ka? Chihara-san
- Interpretation: “Is it [the case] that you are a landmine (girl)?, Chihara-san.”
If it were capitalized “Jirai [wa] Nan desu ka? Chihara-san”, the reading would shift to “What is a landmine? Chihara-san”, which misses the explanatory nuance of the title.
じゃ + Adjective Sequences
When the copula ja is followed by a standalone adjective or adjectival noun, they are kept as separate and capitalized as independent words. This preserves the semantic weight of the negative or prohibitive word.
- Ja is a contraction of the cluster “de wa”, so it is treated as a copula; as per the core style, copulas are always written separately and in lowercase.
- The negative adjective or adjectival noun (e.g. nai, dame) is not fused to ja and is capitalized as an independent word.
Note that じゃん is always romanized separately and in lowercase as jan, as it is a colloquial contraction of ja Nai (and not of ja + no).
Examples
- 好みじゃない → Konomi ja Nai
- 聖女じゃなかった → Seijo ja Nakatta
- 学生じゃなくなる → Gakusei ja Naku Naru
- 浮気じゃダメ → Uwaki ja Dame
- これじゃ無理 → Kore ja Muri
- 可愛いじゃん → Kawaii jan
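The examples above follow one pattern, sketched here under the assumption that the caller passes the romanized head word and the predicate that follows じゃ. The function name `ja_sequence` is hypothetical, and `predicate=None` is simply this example's way of modeling じゃん.

```python
def ja_sequence(head: str, predicate=None) -> str:
    """Romanize head + じゃ + predicate; predicate=None models the contraction じゃん."""
    if predicate is None:
        return f"{head} jan"
    # ja stays lowercase and separate; each following word is capitalized.
    words = [w[:1].upper() + w[1:] for w in predicate.split()]
    return " ".join([head, "ja", *words])

print(ja_sequence("Konomi", "nai"))
print(ja_sequence("Gakusei", "naku naru"))
print(ja_sequence("Kawaii"))
```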
Noun + Suru Romanization
When suru or its inflected forms (Shita, Shite, Sareta, Saseru, etc.) attach productively to a noun, write the suru-form separately and capitalize it.
These are simply nouns that gain verbal force through the addition of suru; dictionaries may list them as so-called “suru”-verbs, but morphologically they are still noun + suru.
Examples
- 転職する → Tenshoku Suru
- 辞退して → Jitai Shite
- 転生した → Tensei Shita
- 追放された → Tsuihou Sareta
- 愛します → Aishimasu (exception: 愛する is a lexicalized suru-verb, so it stays fused)
By contrast, fixed or lexicalized suru-verbs such as 恋する (Koisuru), 愛する (Aisuru), or 画する (Kakusuru) are romanized fused, since suru is no longer detachable but part of the verb stem.
Dictionaries may list them as normal suru-verbs under their suru base form; for romanization they should be treated as single verbs.
Examples
- 愛して → Aishite
- 恋した → Koishita
- 画された → Kakusareta
Auxiliary verbs or contracted auxiliaries are directly fused to the suru-form without a hyphen.
Examples
- 毛嫌いしていた → Kegirai Shiteita
- 溺愛されてました → Dekiai Saretemashita
- 全国配信してしまう → Zenkoku Haishin Shiteshimau
- 追放してきた → Tsuihou Shitekita
- 愛してる → Aishiteru
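The split between productive and lexicalized suru-verbs can be sketched as follows. The stem set holds only the three lexicalized verbs named above and is keyed on romaji purely for illustration; a real implementation would more likely key on the kanji, since romanized stems such as Kaku are ambiguous. The function name `noun_suru` is hypothetical.

```python
# Stems of the lexicalized suru-verbs named above (恋する, 愛する, 画する).
LEXICALIZED_SURU_STEMS = {"Koi", "Ai", "Kaku"}

def noun_suru(noun: str, suru_form: str, auxiliaries=()) -> str:
    """Combine a noun with a suru-form and any auxiliaries fused to that form."""
    verb = suru_form + "".join(auxiliaries)
    if noun in LEXICALIZED_SURU_STEMS:
        return noun + verb.lower()               # fused: suru is part of the verb
    return f"{noun} {verb[:1].upper() + verb[1:]}"  # separate, capitalized suru-form

print(noun_suru("Tensei", "shita"))
print(noun_suru("Kegirai", "shite", ["ita"]))
print(noun_suru("Ai", "shite", ["ru"]))
```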
Suffixes and Bound Elements
Bound Elements
| Kanji | Reading | Handling Logic | Native Example | Loanword/Letter Example | Note |
|---|---|---|---|---|---|
| たち | -tachi | Always hyphenated | 私たち → Watashi-tachi | エルフたち → Elf-tachi | Pluralization |
| 好き | -zuki / -suki | Always hyphenated | オシャレ好き → Oshare-zuki | TS好き → TS-suki | x-lover |
| 殺し | -goroshi / -koroshi | Always hyphenated | 神殺し → Kami-goroshi | モンスター殺し → Monster-koroshi | x-slayer |
| ぐらし | -gurashi | Always hyphenated | 辺境ぐらし → Henkyou-gurashi | スローライフぐらし → Slow Life-gurashi | x-life |
Real Title Examples
- オシャレ好き → Oshare-zuki (“Fashion Lover”).
- TS好き → TS-suki (“TS Lover”).
- 神殺し → Kami-goroshi (“God Slayer”).
- 辺境ぐらし → Henkyou-gurashi (“Frontier Life”).
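The table above can be read as a lookup from bound element to a (native, loanword/letter) pair of readings. A minimal sketch, with the function name `attach_bound` and the `native` flag as assumptions; deciding whether the base counts as native is left to the caller.

```python
# element -> (reading after a native base, reading after a loanword/letter base)
BOUND = {
    "たち": ("tachi", "tachi"),
    "好き": ("zuki", "suki"),
    "殺し": ("goroshi", "koroshi"),
    "ぐらし": ("gurashi", "gurashi"),
}

def attach_bound(base: str, element: str, native: bool = True) -> str:
    """Hyphenate a bound element to its base, choosing the voiced or plain reading."""
    native_form, loan_form = BOUND[element]
    return f"{base}-{native_form if native else loan_form}"

print(attach_bound("Kami", "殺し"))
print(attach_bound("TS", "好き", native=False))
print(attach_bound("Elf", "たち", native=False))
```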
Suffix-like Content Designators
The grammatical behavior of these suffix-like content designators (編, 譚, 章, 録, etc.) shifts depending on what precedes them:
- When directly attached to a single noun or compound, they act as suffixes and are hyphenated to the preceding word (冬編 → Fuyu-hen, 妖怪譚 → Youkai-tan).
- When following a phrase or clause-like structure, they function as independent nouns and are separated and capitalized (俺が最強になった編 → Ore ga Saikyou ni Natta Hen, 斬吸血鬼譚 → Zan Kyuuketsuki Tan).
Hyphens mark tight morphological fusion, while separation marks syntactic or semantic independence.
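A rough sketch of the two cases follows. It uses the number of preceding romanized words as a stand-in for "single noun or compound" versus "phrase or clause-like structure"; that proxy is an assumption of this example, not a rule stated in the guide, and the function name is hypothetical.

```python
def attach_designator(preceding_words, designator: str) -> str:
    """preceding_words: romanized words before the designator (e.g. ["Fuyu"])."""
    if len(preceding_words) == 1:
        # Single noun or compound: the designator acts as a hyphenated suffix.
        return f"{preceding_words[0]}-{designator.lower()}"
    # Phrase or clause: the designator is an independent, capitalized noun.
    return " ".join([*preceding_words, designator.capitalize()])

print(attach_designator(["Fuyu"], "hen"))
print(attach_designator(["Ore", "ga", "Saikyou", "ni", "Natta"], "hen"))
print(attach_designator(["Zan", "Kyuuketsuki"], "tan"))
```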
Prefixes
Dai- (大) Prefix in Romanization
The Dai (大) prefix behaves differently depending on whether the combination is fully lexicalized or still feels like a “descriptive” modifier + noun.
- Fuse if the term looks like a single dictionary entry or a well-known compound.
- Separate if dai- is clearly a modifier in front of a standalone noun, especially in titles for emphasis.
Always fused (Fully Lexicalized / Dictionary Compounds)
| Kanji | Romanization | Meaning | Notes |
|---|---|---|---|
| 大好き | Daisuki | to love | Verb-like, fused |
| 大嫌い | Daikirai | to hate | Fused |
| 大人気 | Daininki | very popular | Fused |
| 大学 | Daigaku | university | Fused |
| 大天使 | Daitenshi | archangel | Fixed compound, standard usage |
| 大冒険 | Daibouken | great adventure | Common collocation, borderline lexicalized |
| 大事故 | Daijiko | major accident | Semi-fixed, idiomatic |
| 大成功 | Daiseikou | great success | Semi-fixed, idiomatic |
| 大作戦 | Daisakusen | great plan | Semi-fixed, idiomatic |
| 大問題 | Daimondai | serious problem | Semi-fixed, idiomatic |
Usually separate (Productive Descriptive Modifier)
| Kanji | Romanization | Meaning | Notes |
|---|---|---|---|
| 大冒険者 | Dai Boukensha | great adventurer | Descriptive title |
| 大賢者 | Dai Kenja | great sage | Fantasy / epithet |
| 大妖怪 | Dai Youkai | great youkai | Descriptive |
| 大聖堂 | Dai Seidou | great cathedral | Semi-fixed, but the descriptive modifier is still visible |
| 大戦士 | Dai Senshi | great warrior | Descriptive |
| 大魔王 | Dai Maou | great demon lord | LN-style epithet |
| 大聖女 | Dai Seijo | great saintess / holy maiden | Fantasy epithet |
| 大魔導士 | Dai Madoushi | great magician / archmage | Title-style descriptor |
| 大聖者 | Dai Seija | great saint / holy one | Fantasy epithet |
Some entries (e.g., 大聖女, 大魔導士, 大聖者) are morphologically lexicalized compounds, but in light novel and manga titles they are commonly romanized with a space (Dai Seijo, Dai Madoushi, Dai Seija) to emphasize the descriptive “great / arch” modifier.
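One way to apply the two tables is a lookup that fuses only known lexicalized terms and separates everything else. The sketch below is illustrative: `FUSED_DAI_TERMS` simply copies the "always fused" table, `romanize_dai` is a hypothetical helper name, and the code assumes 大 is read dai in the term being handled.

```python
# Terms from the "always fused" table; anything else is treated as a
# productive descriptive modifier and written with a space.
FUSED_DAI_TERMS = {
    "大好き", "大嫌い", "大人気", "大学", "大天使",
    "大冒険", "大事故", "大成功", "大作戦", "大問題",
}


def romanize_dai(term: str, noun_romaji: str) -> str:
    """Combine Dai with the romanized remainder of a 大-prefixed term.

    `term` is the original Japanese term, `noun_romaji` the capitalized
    romanization of everything after 大.
    """
    if not term.startswith("大"):
        raise ValueError("term must start with 大")
    if term in FUSED_DAI_TERMS:
        return "Dai" + noun_romaji.lower()
    return "Dai " + noun_romaji


print(romanize_dai("大冒険", "Bouken"))      # lexicalized compound
print(romanize_dai("大冒険者", "Boukensha"))  # descriptive modifier + noun
```

Keeping the fused terms in an explicit set means borderline cases are settled by editing the list rather than by heuristics.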
Title Examples (for test cases)
Title 1 MB:359750
異世界グルメで成り上がり無双 ~山に追放されたので、のんびりキャンプを楽しんでいたらいつの間にか強くなっていて、王侯貴族や実力者たちが俺を放っておいてくれません。一方、俺を追放した貴族たちは破滅が始まる~
Isekai Gourmet de Nariagari Musou: Yama ni Tsuihou Sareta node, Nonbiri Camp o Tanoshindeitara Itsunomanika Tsuyoku Natteite, Oukou Kizoku ya Jitsuryokusha-tachi ga Ore o Houtte Oitekuremasen. Ippou, Ore o Tsuihou Shita Kizoku-tachi wa Hametsu ga Hajimaru
Elaboration
This title showcases long te-form chains with fused auxiliaries, where each verb and its auxiliaries are treated as a single unit rather than spaced word-by-word. Loanwords like Gourmet and Camp are rendered in their origin forms, while productive Japanese compounds such as Nariagari remain fused. Itsunomanika is treated as a fossilized adverb and written as one word. Semantic binomials (王侯貴族 → Oukou Kizoku) are split for transparency, plural groups take the hyphenated -tachi, and the benefactive auxiliary is fused to the final te-form (Houtte Oitekuremasen). Conjunctions such as node are lexicalized and kept as single units. It also contains the rare ya particle.
Title 2 MB:8670
信じていた仲間達にダンジョン奥地で殺されかけたがギフト『無限ガチャ』でレベル9999の仲間達を手に入れて元パーティーメンバーと世界に復讐&『ざまぁ!』します!
Shinjiteita Nakama-tachi ni Dungeon Okuchi de Korosarekaketa ga Gift “Mugen Gacha” de Level 9999 no Nakama-tachi o Te ni Irete Moto Party Member to Sekai ni Fukushuu & “Zamaa!” Shimasu!
Elaboration
Here the focus is on auxiliary fusion and modern stylistic elements. Verb forms like korosarekaketa are fused according to auxiliary rules, while te ni irete (手に入れて) is written as noun + particle + verb, with only the verb carrying the te-form. Loanwords (Dungeon, Gift, Level, Party Member) retain their non-assimilated English forms. Group nouns take -tachi, quotation marks are preserved for in-text labels and slang, and symbols such as & are kept as-is when functioning stylistically rather than grammatically. Colloquial expressions like zamaa are retained verbatim rather than normalized.
Title 3 MB:30392
俺が好きなのは妹だけど妹じゃない
Ore ga Suki na no wa Imouto dakedo Imouto ja Nai
Elaboration
This title is a clean example of structural transparency. The sequence na + no is treated as copula plus nominalizer and always kept separate, even though it appears fused in kana. The contrast is carried by dakedo, which is treated as a single lexicalized conjunctive unit. The negative copula ja + Nai is kept split to reflect its morphology rather than collapsed into a pseudo-particle.
Title 4 MB:46802
異世界転生して魔女になったのでスローライフを送りたいのに魔王が逃がしてくれません
Isekai Tensei Shite Majo ni Natta node Slow Life o Okuritai noni Maou ga Nigashitekuremasen
Elaboration
This title demonstrates how conjunctive particles and auxiliaries dominate spacing decisions. Both node and noni are treated as fused conjunctive forms. The benefactive kureru in nigashitekuremasen functions as an auxiliary and therefore fuses to the preceding te-form; the -shite in nigashite comes from the verb nigasu, not a noun + suru construction, so no separation occurs there, in contrast to Tensei Shite earlier in the title. Loanwords like Slow Life are rendered in origin form, while native verbs follow standard inflectional fusion.
Title 5 MB:28009
農民関連のスキルばっか上げてたら何故か強くなった。
Noumin Kanren no Skill bakka Agetetara Nazeka Tsuyoku Natta.
Elaboration
This title highlights colloquial particles and fossilized adverbs. Bakka, a clipped, casual form of bakari, is retained as written rather than being expanded or normalized. Nazeka is treated as a fossilized adverb and written as a single unit. Verb forms like agetetara and natta follow normal auxiliary fusion rules, while Skill is kept in its origin form as a non-assimilated loanword.
Title 6 MB:9333
真の仲間じゃないと勇者のパーティーを追い出されたので、辺境でスローライフすることにしました
Shin no Nakama ja Nai to Yuusha no Party o Oidasareta node, Henkyou de Slow Life Suru Koto ni Shimashita
Elaboration
This title combines copular negation, productive suru constructions, and passives. The ja + Nai structure is kept split, while Slow Life Suru follows the productive noun + suru rule rather than being fused or translated. Oidasareta reflects regular verb inflection of oidasu into the passive, and node again functions as a fused conjunctive particle. Loanwords (Party, Slow Life) remain in origin form, while native place terms like Henkyou are romanized normally.
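Because these titles are meant as test cases, their expected romanizations can also be checked mechanically against the capitalization rule. The sketch below flags any lowercase-initial word that is not a known particle, copula form, or conjunctive. The `LOWERCASE_WORDS` set is an assumption collected from these six titles only and is not the guide's full particle or copula LUT; `unexpected_lowercase` is a hypothetical name.

```python
import string

# Lowercase words that occur in the six expected romanizations above.
# Assumption: this is a test fixture, not the complete particle/copula list.
LOWERCASE_WORDS = {
    "de", "ni", "o", "ya", "ga", "wa", "no", "to", "na",
    "node", "noni", "dakedo", "ja", "bakka",
}


def unexpected_lowercase(romanized: str) -> list[str]:
    """Return lowercase-initial words that are not in LOWERCASE_WORDS."""
    flagged = []
    for token in romanized.split():
        # Drop surrounding punctuation, including curly quotes.
        word = token.strip(string.punctuation + "“”")
        if word and word[0].islower() and word not in LOWERCASE_WORDS:
            flagged.append(word)
    return flagged


title_3 = "Ore ga Suki na no wa Imouto dakedo Imouto ja Nai"
print(unexpected_lowercase(title_3))
```

A check like this only catches capitalization slips. It says nothing about spacing or fusion decisions, which still need the expected strings themselves as the reference.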
LLM Ruleset and prompts
YAML ruleset
romanization_style:
  name: "MangaBaka Romanization Style"
  version: "1.2"
  base_system: "Modified Hepburn (customized)"
  macrons: false
  long_vowels: "Always written fully (ou, oo, uu, aa, ee, ii)"
  sokuon_rule: "Represent っ before ch as cch (e.g., Maccha instead of Matcha)"
  n_rule: "ん is romanized as n. Use n' when ん is followed by a vowel or y to avoid ambiguity (e.g., Ten'i, Kon'yaku)."
  capitalization: >
    Capitalize all nouns, verbs, adjectives, adverbs, and auxiliary verbs when written separately.
    Keep all particles, copulas, and copula inflections or contractions lowercase. Copula forms
    (da or desu and all inflections) are never capitalized, even when sentence-final.
  suru_taxonomy: >
    Distinguish suru-forms as follows:
    (1) Productive noun + Suru constructions are written separately, with Suru
    and its inflections capitalized (e.g., Benkyou Suru, Tensei Shita).
    (2) Verbs ending in -su are true verbs, not noun + Suru; their inflected forms
    fuse normally with auxiliaries (e.g., Mezashite, Otoshita).
    (3) Fixed or lexicalized suru-verbs listed as single verbs in dictionaries are
    always fused, since Suru is part of the verb stem (e.g., Koisuru, Aisuru, Kakusuru).
  auxiliaries: >
    All auxiliary verbs and auxiliary-like forms are fused to the main verb
    or adjective, whether attached via the -te form, the stem, or as
    contracted forms (e.g., Natteshimatta, Kawaisugiru, Shiteita).
  te_form_chains: >
    When two or more verbs appear consecutively in the -te form, each lexical verb
    is written separately. Any auxiliaries that follow are fused to the immediately
    preceding final -te form. Examples are:
      - 放っておく → Houtteoku
      - 放っておいてくれ → Houtte Oitekure
      - 放っておいてくれません → Houtte Oitekuremasen
  particles: >
    Particles are always written separately and in lowercase
    (e.g., no, ga, o, de, mo, ni, wa).
  conjunctive_particles: >
    Conjunctive particles are fused when functioning as a single lexical unit
    (e.g., nanoni, demo, dakedo, desuga). Node and noni are fused unless they are
    clearly separate particles in a grammatical construction.
  ja_auxiliary: >
    Forms like じゃない and じゃだめ are romanized as ja + Auxiliary
    (e.g., ja Nai, ja Dame). Ja is treated as a particle (from de wa) and
    remains separate and lowercase; auxiliaries are capitalized when
    written separately. Contractions like じゃん are romanized as jan.
  compounds_semantic_binomial: >
    Semantic binomial compounds (X and Y relationships) may be split for
    clarity and consistency (e.g., Oukou Kizoku, Shinou Shoukou).
  compounds_determinative: >
    Determinative compounds (modifier + head noun) are written as separate
    words unless strongly lexicalized (e.g., Kokugai Tsuihou, Kokunai Ryokou).
  lexicalized_idioms: >
    Strongly lexicalized idioms and fixed expressions, including yojijukugo,
    remain fused (e.g., Yuuyuujiteki, Issekinichou, Jakunikukyoushoku).
  lexicalized_adverbs: >
    Lexicalized adverbs and set expressions are written as single words
    (e.g., Itsunomanika, Nazeka, Nanto, Nidoto, Doushitemo).
  native_personal_and_place_names: >
    Always romanize native Japanese personal and place names using Japanese romanization
    rules, even if an English exonym is widely known (e.g., Toukyou, Oosaka).
  loanwords: >
    Render non-assimilated or uncommon loanwords, and non-native personal or place names,
    in their established native-language equivalents whenever identifiable (e.g., Skill, Dungeon).
    Retain the romanized Japanese form of loanwords that are clipped slang, phonologically adapted,
    or fully lexicalized in everyday Japanese, where the foreign source is no longer treated as a
    recoverable identity (e.g., Baito, Rabuho, Gyaru).
  hyphens: >
    Use hyphens with honorifics, personal titles, household or clan names,
    Arabic numbers with counters, and multi-word expressions functioning as
    one unit. Honorifics are hyphenated except for lexicalized kinship terms
    (e.g., oneesan, ojisan) or established dictionary nouns (e.g., kamisama).
    Arabic numbers + counter + modifier use a single hyphen with the modifier
    fused to the counter (e.g., 31-banme) including 後 or 前 which are treated
    as Sino-Japanese unit-final temporal suffixes and read as -go and -zen
    respectively. Counters use their numeral-attached readings. Fully native
    numbers + counter + modifier are fully fused to preserve special readings
    (e.g., nidome, futari).
  titles_and_compound_title_nouns: >
    Distinguish between true titles/honorifics and compound title nouns.
    True titles and honorifics that attach to a person or name are written
    with a hyphen (e.g., Ou-sama).
    Compound title nouns that function as a single lexical noun are fused
    without a hyphen (e.g., Ningyouhime, kamisama, Maoujou).
  bound_suffixes: >
    Bound suffixes that modify the preceding element, or that cannot stand alone as a noun,
    verb, or adjective without loss of meaning, are attached with a hyphen for clarity
    (e.g., Oshare-zuki, Kami-goroshi, Inaka-gurashi).
  copula_vs_conjunctive_dewa: >
    Distinguish grammatical de wa (copula + particle) from lexicalized
    conjunctive dewa. Grammatical de wa is written separately (de wa),
    while conjunctive dewa meaning “well then / in that case” is written
    fused (dewa), usually at the start of a clause.
  example_validation:
    input: "異世界グルメで成り上がり無双 ~山に追放されたので、のんびりキャンプを楽しんでいたらいつの間にか強くなっていて、王侯貴族や実力者たちが俺を放っておいてくれません。一方、俺を追放した貴族たちは破滅が始まる~"
    output: "Isekai Gourmet de Nariagari Musou: Yama ni Tsuihou Sareta node, Nonbiri Camp o Tanoshindeitara Itsunomanika Tsuyoku Natteite, Oukou Kizoku ya Jitsuryokusha-tachi ga Ore o Houtte Oitekuremasen. Ippou, Ore o Tsuihou Shita Kizoku-tachi wa Hametsu ga Hajimaru"
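The n_rule and sokuon_rule entries in the ruleset above can be illustrated with a short Python sketch. It works on input that is already split into Hepburn morae, which is an assumption made to keep the example small. The markers "N" for ん and "Q" for っ are hypothetical conventions of this sketch, not part of the ruleset.

```python
VOWELS_AND_Y = set("aiueoy")


def join_morae(morae: list[str]) -> str:
    """Join pre-segmented romaji morae, applying the n' and sokuon rules.

    "N" stands for ん and "Q" for っ (marker names are assumptions).
    """
    out = []
    for i, mora in enumerate(morae):
        nxt = morae[i + 1] if i + 1 < len(morae) else ""
        if mora == "N":
            # n' before a vowel or y, plain n otherwise.
            out.append("n'" if nxt[:1] in VOWELS_AND_Y else "n")
        elif mora == "Q":
            # Double the following consonant, so っ + cha gives ccha, not tcha.
            out.append(nxt[:1])
        else:
            out.append(mora)
    return "".join(out).capitalize()


print(join_morae(["te", "N", "i"]))
print(join_morae(["ma", "Q", "cha"]))
```

Doubling the first letter of the next mora is what produces cch before ch. A strict Modified Hepburn implementation would special-case that sequence as tch instead.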
YAML Ruleset (2) adjusted by Gemini -- TESTING
romanization_style:
  name: "MangaBaka Romanization Style"
  version: "1.3a"
  language_output: "Always reply and summarize in English, even if the input is mixed language. The input contains a Japanese title to be romanized by non-Japanese users."
  base_system: "Modified Hepburn (customized)"
  macrons: false
  long_vowels: "Always written fully (ou, oo, uu, aa, ee, ii)"
  sokuon_rule: "Represent っ before ch as cch (e.g., Maccha instead of Matcha)"
  n_rule: "ん is romanized as n. Use n' when ん is followed by a vowel or y to avoid ambiguity (e.g., Ten'i, Kon'yaku)."
  capitalization: >
    Capitalize all nouns, verbs, adjectives, adverbs, and auxiliary verbs when written separately.
    Keep all particles, copulas, and copula inflections lowercase. Copula forms (da, desu, and all
    inflections) are never capitalized, even when sentence-final.
  particles_and_conjunctions:
    particles: "Always separate and lowercase (no, ga, o, de, mo, ni, wa)."
    conjunctive_particles: >
      Fused when a single lexical unit (nanoni, demo, dakedo, desuga).
      Node, noni, datte are fused unless they are clearly separate particles or copula
      in a grammatical construction (e.g., 'no ni Taishi').
    copula_dewa: "Grammatical 'de wa' is separate; conjunctive 'dewa' (well then) is fused."
  auxiliaries: >
    All auxiliary verbs are fused to the main verb or adjective via stem or te-form
    (e.g., Natteshimatta, Kawaisugiru, Shiteita), except in the 'ja' construction.
  ja_auxiliary: >
    'ja' (from de wa) is treated as a particle (lowercase, separate). Any auxiliary following it
    is capitalized because it is written separately (e.g., ja Nai, ja Nakatta, ja Dame).
    Contractions like じゃん are romanized as jan.
  explanatory_n_sequences: >
    The explanatory 'n' (derived from 'no') is always fused to the following copula or contraction
    to avoid loose characters and semantic ambiguity with 'nan' (what).
      (1) n + copula fuses (e.g., nda, ndesu, ndatte).
      (2) na + n + copula fuses into a single unit (e.g., nandesu, nandesuga, nandakara).
  te_form_chains: >
    Each lexical verb in a chain is written separately. Auxiliaries are fused to the
    immediately preceding verb (e.g., Houtte Oitekuremasen).
  suru_taxonomy: >
    (1) Productive noun + Suru constructions are separate, with Suru capitalized (e.g., Benkyou Suru).
    (2) Lexicalized suru-verbs (found in dictionaries as single entries) are fused (e.g., Koisuru, Aisuru).
    (3) Verbs ending in -su are true verbs and fuse normally (e.g., Mezashite).
  compounds_and_lexicalization:
    dictionary_rule: "Compounds with a dedicated dictionary entry are fused (e.g., Isekai, Akuyaku, Tensei)."
    determinative: "Modifier + head nouns without a single entry are separate (e.g., Kokunai Ryokou)."
    idioms: "Yojijukugo and fixed idioms remain fused (e.g., Yuuyuujiteki, Issekinichou)."
    adverbs: >
      Fossilized adverbs are fused (e.g., Itsunomanika, Nazeka, Doushitemo).
      Phrases retaining active lexical verbs remain separate (e.g., Dou Mitemo, Sou Ieba).
  loanwords_and_names:
    identifiable_loanwords: "Non-assimilated words use their origin equivalents (e.g., Skill, Dungeon, Level, Party)."
    assimilated_loanwords: "Clipped slang or phonologically adapted words are romanized (e.g., Baito, Rabuho, Gyaru, Anime)."
    native_names: "Always romanize Japanese personal/place names; no English exonyms (e.g., Toukyou, Oosaka)."
    foreign_names: "Use established native-language spelling (e.g., Chris, Hathaway)."
  brackets_and_quotes:
    quotations: "「 」, 『 』, 《 》, ≪ ≫, and “ ” are all romanized as double straight quotes \" \"."
    lenticular_brackets: "【 】 are \" \" if used as quotes, or [ ] if used for Skill Names or if usage is unclear."
    square_brackets: "[ ] are always romanized as [ ]."
  hyphens: >
    Use with honorifics (Name-sama), personal titles, and Arabic numbers + counters (31-banme).
    Bound suffixes that cannot stand alone are attached with a hyphen (e.g., Oshare-zuki, Kami-goroshi).
  example_validation:
    input: "【新スキル】を発動した『勇者』が「魔王城」で「お洒落好き」な仲間と「勉強する」のに対し、俺は「放っておいてくれません」と言った。"
    output: "[New Skill] o Hatsudou Shita \"Yuusha\" ga \"Maoujou\" de \"Oshare-zuki\" na Nakama to \"Benkyou Suru\" no ni Taishi, Ore wa \"Houtte Oitekuremasen\" to Itta."
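The brackets_and_quotes rules map onto a simple character translation. The Python sketch below is illustrative only: `normalize_brackets` and its `lenticular_as_quotes` flag are hypothetical names, and deciding whether 【 】 acts as a quotation is left to the caller, with [ ] as the default for skill names and unclear usage.

```python
# Quote-like characters that the ruleset maps to straight double quotes.
QUOTE_CHARS = "「」『』《》≪≫“”"


def normalize_brackets(text: str, lenticular_as_quotes: bool = False) -> str:
    """Replace Japanese quotation marks and lenticular brackets."""
    table = {ord(ch): '"' for ch in QUOTE_CHARS}
    if lenticular_as_quotes:
        table[ord("【")] = '"'
        table[ord("】")] = '"'
    else:
        # Default for skill names or unclear usage.
        table[ord("【")] = "["
        table[ord("】")] = "]"
    return text.translate(table)


print(normalize_brackets("【New Skill】 o Hatsudou Shita 『Yuusha』"))
```

Square brackets are absent from the table, so they pass through unchanged, which matches the square_brackets rule.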
