cutlet

Cutlet is a tool to convert Japanese to romaji. Check out the interactive demo!

You do not need to write issues in English.

Features:

  • support for Modified Hepburn, Kunreisiki, Nihonsiki systems
  • custom overrides for individual mappings
  • custom overrides for specific words
  • built in exceptions list (Tokyo, Osaka, etc.)
  • uses foreign spelling when available in UniDic
  • proper nouns are capitalized
  • slug mode for url generation

Things not supported:

  • traditional Hepburn n-to-m: Shimbashi
  • macrons or circumflexes: Tōkyō, Tôkyô
  • passport Hepburn: Satoh (but you can use an exception)
  • hyphenating words
  • Traditional Hepburn in general is not supported

Internally, cutlet uses fugashi, so you can use the same dictionary you use for normal tokenization.

Installation

Cutlet can be installed through pip as usual.

pip install cutlet

Note that if you don't have a MeCab dictionary installed, you'll also have to install one. If you're just getting started, unidic-lite is a good choice.

pip install unidic-lite

Usage

A command-line script is included for quick testing. Run cutlet and each line of stdin will be treated as a sentence. You can specify the system to use (hepburn, kunrei, nippon, or nihon) as the first argument.

$ cutlet
ローマ字変換プログラム作ってみた。
Roma ji henkan program tsukutte mita.

In code:

import cutlet
katsu = cutlet.Cutlet()
katsu.romaji("カツカレーは美味しい")
# => 'Cutlet curry wa oishii'

# you can print a slug suitable for urls
katsu.slug("カツカレーは美味しい")
# => 'cutlet-curry-wa-oishii'

# You can disable using foreign spelling too
katsu.use_foreign_spelling = False
katsu.romaji("カツカレーは美味しい")
# => 'Katsu karee wa oishii'

# kunreisiki, nihonsiki work too
katu = cutlet.Cutlet('kunrei')
katu.romaji("富士山")
# => 'Huzi yama'

# comparison
nkatu = cutlet.Cutlet('nihon')

sent = "彼女は王への手紙を読み上げた。"
katsu.romaji(sent)
# => 'Kanojo wa ou e no tegami wo yomiageta.'
katu.romaji(sent)
# => 'Kanozyo wa ou e no tegami o yomiageta.'
nkatu.romaji(sent)
# => 'Kanozyo ha ou he no tegami wo yomiageta.'

Alternatives

  • kakasi: Historically important, but not updated since 2014.
  • pykakasi: self contained, it does segmentation on its own and uses its own dictionary.
  • kuroshiro: JavaScript based.
  • kana: Go based.
"""

.. include:: ../README.md
"""

from .cutlet import *

__all__ = ("Cutlet",)
class Cutlet:
    # TODO add mecab args
    def __init__(
            self,
            system = 'hepburn',
            use_foreign_spelling = True,
            ensure_ascii = True,
):
        """Create a Cutlet object, which holds configuration as well as
        tokenizer state.

        `system` is `hepburn` by default, and may also be `kunrei` or
        `nihon`. `nippon` is permitted as a synonym for `nihon`.

        If `use_foreign_spelling` is true, output will use the foreign spelling
        provided in a UniDic lemma when available. For example, "カツ" will
        become "cutlet" instead of "katsu".

        If `ensure_ascii` is true, any non-ASCII characters that can't be
        romanized will be replaced with `?`. If false, they will be passed
        through.

        Typical usage:

        ```python
        katsu = Cutlet()
        roma = katsu.romaji("カツカレーを食べた")
        # "Cutlet curry wo tabeta"
        ```
        """
        # allow 'nippon' for 'nihon'
        if system == 'nippon': system = 'nihon'
        self.system = system
        try:
            # make a copy so we can modify it
            self.table = dict(SYSTEMS[system])
        # TODO fix this
        except KeyError:
            print("unknown system: {}".format(system))
            raise

        self.tagger = fugashi.Tagger()
        self.exceptions = load_exceptions()

        # these are too minor to be worth exposing as arguments
        self.use_tch = (self.system in ('hepburn',))
        self.use_wa  = (self.system in ('hepburn', 'kunrei'))
        self.use_he  = (self.system in ('nihon',))
        self.use_wo  = (self.system in ('hepburn', 'nihon'))

        # store the constructor arguments rather than hard-coded defaults
        self.use_foreign_spelling = use_foreign_spelling
        self.ensure_ascii = ensure_ascii

    def add_exception(self, key, val):
        """Add an exception to the internal list.

        An exception overrides a whole token, for example to replace "Toukyou"
        with "Tokyo". Note that it must match the tokenizer output and be a
        single token to work. To replace longer phrases, you'll need to use a
        different strategy, like string replacement.
        """
        self.exceptions[key] = val

    def update_mapping(self, key, val):
        """Update mapping table for a single kana.

        This can be used to mix common systems, or to modify particular
        details. For example, you can use `update_mapping("ぢ", "di")` to
        differentiate ぢ and じ in Hepburn.

        Example usage:

        ```
        cut = Cutlet()
        cut.romaji("お茶漬け") # Ochazuke
        cut.update_mapping("づ", "du")
        cut.romaji("お茶漬け") # Ochaduke
        ```
        """
        self.table[key] = val

    def slug(self, text):
        """Generate a URL-friendly slug.

        After converting the input to romaji using `Cutlet.romaji` and making
        the result lower-case, any runs of non alpha-numeric characters are
        replaced with a single hyphen. Any leading or trailing hyphens are
        stripped.
        """
        roma = self.romaji(text).lower()
        slug = re.sub(r'[^a-z0-9]+', '-', roma).strip('-')
        return slug

    def romaji(self, text, capitalize=True, title=False):
        """Build a complete string from input text.

        If `capitalize` is true, then the first letter of the text will be
        capitalized. This is typically the desired behavior if the input is a
        complete sentence.

        If `title` is true, then words will be capitalized as in a book title.
        This means most words will be capitalized, but some parts of speech
        (particles, endings) will not.
        """
        if not text:
            return ''

        # perform unicode normalization
        text = unicodedata.normalize('NFKC', text)
        # convert all full-width alphanum to half-width, since it can go out as-is
        text = mojimoji.zen_to_han(text, kana=False)
        # replace half-width katakana with full-width
        text = mojimoji.han_to_zen(text, digit=False, ascii=False)

        words = self.tagger(text)

        # TODO make a list and join to avoid weirdness with string building
        out = ''

        for wi, word in enumerate(words):
            pw = words[wi - 1] if wi > 0 else None
            nw = words[wi + 1] if wi < len(words) - 1 else None

            # handle possessive apostrophe as a special case
            if (word.surface == "'" and
                    (nw and nw.char_type == 5 and not nw.white_space) and
                    not word.white_space):
                # remove preceding space
                out = out[:-1]
                out += word.surface
                continue

            # resolve split verbs / adjectives
            roma = self.romaji_word(word)
            if roma and out and out[-1] == 'っ':
                out = out[:-1] + roma[0]
            if word.feature.pos2 == '固有名詞':
                roma = roma.title()
            if (title and
                word.feature.pos1 not in ('助詞', '助動詞', '接尾辞') and
                not (pw and pw.feature.pos1 == '接頭辞')):
                roma = roma.title()
            # handle punctuation with atypical spacing
            if word.surface in '「『':
                out += ' ' + roma
                continue
            if roma in '([':
                out += ' ' + roma
                continue
            if roma == '/':
                out += '/'
                continue
            out += roma

            # no space sometimes
            # お酒 -> osake
            if word.feature.pos1 == '接頭辞': continue
            # 今日、 -> kyou, ; 図書館 -> toshokan
            if nw and nw.feature.pos1 in ('補助記号', '接尾辞'): continue
            # special case for half-width commas
            if nw and nw.surface == ',': continue
            # 思えば -> omoeba
            # (one-element tuple; without the comma this was a substring test)
            if nw and nw.feature.pos2 in ('接続助詞',): continue
            # 333 -> 333 ; this should probably be handled in mecab
            if (word.surface.isdigit() and
                    nw and nw.surface.isdigit()):
                continue
            # そうでした -> sou deshita
            if (nw and word.feature.pos1 in ('動詞', '助動詞','形容詞')
                   and nw.feature.pos1 == '助動詞'
                   and nw.surface != 'です'):
                continue
            out += ' '
        # remove any leftover っ
        out = out.replace('っ', '').strip()
        # capitalize the first letter
        if capitalize and len(out) > 0:
            tmp = out[0].capitalize()
            if len(out) > 1:
                tmp += out[1:]
            out = tmp
        return out

    def romaji_word(self, word):
        """Return the romaji for a single word (node)."""

        if word.surface in self.exceptions:
            return self.exceptions[word.surface]

        if word.surface.isdigit():
            return word.surface

        if is_ascii(word.surface):
            return word.surface

        # deal with unks first
        if word.is_unk:
            # at this point it is presumably an unk
            # Check character type using the values defined in char.def.
            # This is constant across unidic versions so far but not guaranteed.
            if word.char_type == 6 or word.char_type == 7: # hiragana/katakana
                kana = jaconv.kata2hira(word.surface)
                return self.map_kana(kana)

            # At this point this is an unknown word and not kana. Could be
            # unknown kanji, could be hangul, cyrillic, something else.
            # By default ensure ascii by replacing with ?, but allow pass-through.
            if self.ensure_ascii:
                out = '?' * len(word.surface)
                return out
            else:
                return word.surface

        if word.feature.pos1 == '補助記号':
            # If it's punctuation we don't recognize, just discard it
            return self.table.get(word.surface, '')
        elif (self.use_wa and
                word.feature.pos1 == '助詞' and word.feature.pron == 'ワ'):
            return 'wa'
        elif (not self.use_he and
                word.feature.pos1 == '助詞' and word.feature.pron == 'エ'):
            return 'e'
        elif (not self.use_wo and
                word.feature.pos1 == '助詞' and word.feature.pron == 'オ'):
            return 'o'
        elif (self.use_foreign_spelling and
                has_foreign_lemma(word)):
            # this is a foreign word with known spelling
            return word.feature.lemma.split('-')[-1]
        elif word.feature.kana:
            # for known words
            kana = jaconv.kata2hira(word.feature.kana)
            return self.map_kana(kana)
        else:
            # unclear when we would actually get here
            return word.surface

    def map_kana(self, kana):
        """Given a list of kana, convert them to romaji.

        The exact romaji resulting from a kana sequence depends on the
        preceding or following kana, so this handles that conversion.
        """
        out = ''
        for ki, char in enumerate(kana):
            nk = kana[ki + 1] if ki < len(kana) - 1 else None
            pk = kana[ki - 1] if ki > 0 else None
            out += self.get_single_mapping(pk, char, nk)
        return out

    def get_single_mapping(self, pk, kk, nk):
        """Given a single kana and its neighbors, return the mapped romaji."""
        # handle odoriji
        # NOTE: This is very rarely useful at present because odoriji are not
        # left in readings for dictionary words, and we can't follow kana
        # across word boundaries.
        if kk in ODORI:
            if kk in 'ゝヽ':
                if pk: return pk
                else: return '' # invalid but be nice
            if kk in 'ゞヾ': # repeat with voicing
                if not pk: return ''
                vv = add_dakuten(pk)
                if vv: return self.table[vv]
                else: return ''
            # remaining are 々 for kanji and 〃 for symbols, but we can't
            # infer their span reliably (or handle rendaku)
            return ''

        # handle digraphs
        if pk and (pk + kk) in self.table:
            return self.table[pk + kk]
        if nk and (kk + nk) in self.table:
            return ''

        if nk and nk in SUTEGANA:
            if kk == 'っ': return '' # never valid, just ignore
            return self.table[kk][:-1] + self.table[nk]
        if kk in SUTEGANA:
            return ''

        if kk == 'ー': # 長音符
            if pk and pk in self.table: return self.table[pk][-1]
            else: return '-'

        if kk == 'っ':
            if nk:
                if self.use_tch and nk == 'ち': return 't'
                elif nk in 'あいうえおっ': return '-'
                else: return self.table[nk][0] # first character
            else:
                # seems like it should never happen, but 乗っ|た is two tokens
                # so leave this as is and pick it up at the word level
                return 'っ'

        if kk == 'ん':
            if nk and nk in 'あいうえおやゆよ': return "n'"
            else: return 'n'

        return self.table[kk]
Cutlet(system='hepburn', use_foreign_spelling=True, ensure_ascii=True)

Create a Cutlet object, which holds configuration as well as tokenizer state.

system is hepburn by default, and may also be kunrei or nihon. nippon is permitted as a synonym for nihon.

If use_foreign_spelling is true, output will use the foreign spelling provided in a UniDic lemma when available. For example, "カツ" will become "cutlet" instead of "katsu".

If ensure_ascii is true, any non-ASCII characters that can't be romanized will be replaced with ?. If false, they will be passed through.

Typical usage:

katsu = Cutlet()
roma = katsu.romaji("カツカレーを食べた")
# "Cutlet curry wo tabeta"
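
The way `system` selects the particle settings can be sketched without a tokenizer. This is a minimal illustration that mirrors the flag assignments in the constructor source; `resolve_system` and `KNOWN_SYSTEMS` are hypothetical names introduced here and are not part of cutlet.

```python
# Hypothetical stand-in for the keys of the romanization system tables
KNOWN_SYSTEMS = ('hepburn', 'kunrei', 'nihon')

def resolve_system(system='hepburn'):
    """Return the normalized system name and the flags it implies."""
    # 'nippon' is accepted as a synonym for 'nihon'
    if system == 'nippon':
        system = 'nihon'
    if system not in KNOWN_SYSTEMS:
        # Assumption: signal unknown systems with KeyError, as the
        # constructor re-raises the KeyError from its table lookup
        raise KeyError(system)
    return {
        'system': system,
        'use_tch': system == 'hepburn',             # っち -> "tch"
        'use_wa': system in ('hepburn', 'kunrei'),  # particle は -> "wa"
        'use_he': system == 'nihon',                # particle へ stays "he"
        'use_wo': system in ('hepburn', 'nihon'),   # particle を stays "wo"
    }

for name in ('hepburn', 'kunrei', 'nippon'):
    print(resolve_system(name))
```

Read this way, the three systems differ mainly in how the particles は, へ and を are written, which matches the comparison output shown in the usage section.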
def add_exception(self, key, val):

Add an exception to the internal list.

An exception overrides a whole token, for example to replace "Toukyou" with "Tokyo". Note that it must match the tokenizer output and be a single token to work. To replace longer phrases, you'll need to use a different strategy, like string replacement.
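
The single-token restriction can be illustrated with a small stand-alone sketch. The token list below is an assumed tokenization, and `romanize_tokens` is a hypothetical helper, not a cutlet function; it only shows why a key has to equal one token's surface form.

```python
def romanize_tokens(tokens, exceptions):
    """Join per-token romaji, letting exceptions override whole tokens.

    `tokens` is a list of (surface, default_romaji) pairs standing in for
    tokenizer output. Both the pairs and this function are illustrative.
    """
    return ' '.join(exceptions.get(surface, default)
                    for surface, default in tokens)

# Assumed tokenization of 東京タワー into two tokens
tokens = [('東京', 'toukyou'), ('タワー', 'tower')]

# A key that equals one token's surface is applied
print(romanize_tokens(tokens, {'東京': 'Tokyo'}))             # Tokyo tower
# A key spanning two tokens never matches any single surface
print(romanize_tokens(tokens, {'東京タワー': 'Tokyo Tower'}))  # toukyou tower
```

For phrases that span several tokens, replacing text in the finished romaji string is the kind of alternative strategy the docstring points to.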

def update_mapping(self, key, val):

Update mapping table for a single kana.

This can be used to mix common systems, or to modify particular details. For example, you can use update_mapping("ぢ", "di") to differentiate ぢ and じ in Hepburn.

Example usage:

cut = Cutlet()
cut.romaji("お茶漬け") # Ochazuke
cut.update_mapping("づ", "du")
cut.romaji("お茶漬け") # Ochaduke
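
Because the constructor copies the system table before storing it, an override made through `update_mapping` affects only that one object. The sketch below shows the same pattern in isolation; `SHARED_TABLE` and `MiniMapper` are hypothetical stand-ins, and the four-entry table is far smaller than a real one.

```python
# Illustrative stand-in for a shared system table
SHARED_TABLE = {'じ': 'ji', 'ぢ': 'ji', 'ず': 'zu', 'づ': 'zu'}

class MiniMapper:
    def __init__(self):
        # Copy the shared table so overrides stay local to this instance
        self.table = dict(SHARED_TABLE)

    def update_mapping(self, key, val):
        self.table[key] = val

    def convert(self, kana):
        # One romaji string per kana; no context handling in this sketch
        return ''.join(self.table[k] for k in kana)

custom = MiniMapper()
custom.update_mapping('づ', 'du')
default = MiniMapper()

print(custom.convert('づ'))   # du
print(default.convert('づ'))  # zu
print(SHARED_TABLE['づ'])     # zu
```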
def slug(self, text):

Generate a URL-friendly slug.

After converting the input to romaji using Cutlet.romaji and making the result lower-case, any runs of non alpha-numeric characters are replaced with a single hyphen. Any leading or trailing hyphens are stripped.
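
The post-processing step can be tried on its own, without a tokenizer, by applying the same regular expression to a string that is already romaji. `slugify` is a hypothetical name for this stand-alone sketch.

```python
import re

def slugify(roma):
    """Lower-case, collapse non-alphanumeric runs to '-', trim hyphens."""
    roma = roma.lower()
    return re.sub(r'[^a-z0-9]+', '-', roma).strip('-')

print(slugify('Cutlet curry wa oishii'))   # cutlet-curry-wa-oishii
print(slugify("  Kan'i (test) 2024!  "))   # kan-i-test-2024
print(slugify('Tōkyō'))                    # t-ky
```

The last line shows a consequence of the ASCII-only character class: letters with macrons are treated as separators, so the slug step assumes plain ASCII romaji as input.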

def romaji(self, text, capitalize=True, title=False):

Build a complete string from input text.

If capitalize is true, then the first letter of the text will be capitalized. This is typically the desired behavior if the input is a complete sentence.

If title is true, then words will be capitalized as in a book title. This means most words will be capitalized, but some parts of speech (particles, endings) will not.
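
Two small steps of this method can be shown with the standard library alone: the NFKC normalization applied before tokenization and the first-letter capitalization applied at the end. This is a sketch under stated assumptions: `normalize_input` and `capitalize_first` are hypothetical names, and the additional mojimoji width conversions are omitted.

```python
import unicodedata

def normalize_input(text):
    """Apply NFKC, which folds width variants into their standard forms."""
    return unicodedata.normalize('NFKC', text)

def capitalize_first(out):
    """Capitalize only the first character and keep the rest unchanged."""
    return out[:1].capitalize() + out[1:]

print(normalize_input('ＡＢＣ１２３'))   # ABC123
print(normalize_input('ｶﾂｶﾚｰ'))        # カツカレー

sentence = 'kanojo wa Tokyo e itta.'
print(capitalize_first(sentence))      # Kanojo wa Tokyo e itta.
print(sentence.capitalize())           # Kanojo wa tokyo e itta.
```

The last two lines suggest why the method touches only the first character: `str.capitalize` lower-cases everything else, which would undo the capitalization of proper nouns.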

def romaji_word(self, word):

Return the romaji for a single word (node).
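
The early exits at the top of this method can be sketched without a tokenizer. In this simplified version, `romaji_word_sketch` is a hypothetical function that takes a plain surface string and an `is_unk` flag instead of a token object, uses `str.isascii` as a stand-in for the `is_ascii` helper, and leaves out both the kana handling for unknown words and all dictionary-based branches.

```python
def romaji_word_sketch(surface, exceptions, is_unk, ensure_ascii=True):
    """Mirror the order of the early checks; return None when none applies."""
    # 1. Exceptions override everything else
    if surface in exceptions:
        return exceptions[surface]
    # 2. Digits and ASCII text pass through unchanged
    if surface.isdigit() or surface.isascii():
        return surface
    # 3. Unknown non-kana words: one '?' per character, or pass through
    if is_unk:
        return '?' * len(surface) if ensure_ascii else surface
    # Dictionary-based handling is omitted in this sketch
    return None

print(romaji_word_sketch('東京', {'東京': 'Tokyo'}, is_unk=False))      # Tokyo
print(romaji_word_sketch('한국어', {}, is_unk=True))                    # ???
print(romaji_word_sketch('한국어', {}, is_unk=True, ensure_ascii=False))  # 한국어
```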

def map_kana(self, kana):

Given a list of kana, convert them to romaji.

The exact romaji resulting from a kana sequence depends on the preceding or following kana, so this handles that conversion.
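
A reduced version of this neighbor-aware conversion is sketched below. It assumes a tiny Hepburn-style table (`MINI_TABLE`, hypothetical) and covers only digraphs, the long vowel mark, っ and ん; odoriji and the small-kana fallback are left out.

```python
# Tiny illustrative table; a real system table is far larger
MINI_TABLE = {
    'あ': 'a', 'い': 'i', 'か': 'ka', 'き': 'ki', 'こ': 'ko',
    'た': 'ta', 'ち': 'chi', 'て': 'te', 'ま': 'ma', 'れ': 're',
    'ちゃ': 'cha',
}

def single_mapping(pk, kk, nk):
    """Map one kana given its previous (pk) and next (nk) neighbors."""
    # Digraphs: emit nothing for the first kana, the pair for the second
    if pk and (pk + kk) in MINI_TABLE:
        return MINI_TABLE[pk + kk]
    if nk and (kk + nk) in MINI_TABLE:
        return ''
    if kk == 'ー':  # long vowel mark repeats the previous vowel
        return MINI_TABLE[pk][-1] if pk in MINI_TABLE else '-'
    if kk == 'っ':  # doubles the following consonant
        if nk is None:
            return 'っ'  # left for a later, word-level step
        return 't' if nk == 'ち' else MINI_TABLE[nk][0]
    if kk == 'ん':  # apostrophe before vowels and y-kana
        return "n'" if nk and nk in 'あいうえおやゆよ' else 'n'
    return MINI_TABLE[kk]

def map_kana_sketch(kana):
    parts = []
    for i, kk in enumerate(kana):
        pk = kana[i - 1] if i > 0 else None
        nk = kana[i + 1] if i < len(kana) - 1 else None
        parts.append(single_mapping(pk, kk, nk))
    return ''.join(parts)

print(map_kana_sketch('かれー'))    # karee
print(map_kana_sketch('まっちゃ'))  # matcha
print(map_kana_sketch('きって'))    # kitte
print(map_kana_sketch('かんい'))    # kan'i
```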

def get_single_mapping(self, pk, kk, nk):

Given a single kana and its neighbors, return the mapped romaji.
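
The voiced iteration marks (ゞ, ヾ) rely on an `add_dakuten` helper whose source is not shown here. One possible way to implement such a helper, offered only as an assumption, is to append the combining voiced sound mark and let Unicode composition decide whether a voiced form exists. `add_dakuten_sketch` and `voiced_repeat` are hypothetical names.

```python
import unicodedata

def add_dakuten_sketch(kana):
    """Return the voiced form of a single kana, or None if there is none."""
    # U+3099 is the combining voiced sound mark; NFC composes it with the
    # base kana only when a precomposed voiced character exists
    combined = unicodedata.normalize('NFC', kana + '\u3099')
    return combined if len(combined) == 1 else None

def voiced_repeat(pk, table):
    """Romaji for a voiced iteration mark following the kana `pk`."""
    if not pk:
        return ''
    vv = add_dakuten_sketch(pk)
    # Unlike direct indexing, .get() tolerates kana missing from the table
    return table.get(vv, '') if vv else ''

print(add_dakuten_sketch('か'))            # が
print(add_dakuten_sketch('ん'))            # None
print(voiced_repeat('す', {'ず': 'zu'}))   # zu
```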