KanaConverter

Easy back-and-forth conversion of kana, hankaku, zenkaku, and other characters used in Japanese text

Perform multiple conversions on Kana and Roma-ji text with just a single static function call

Sample with Single Conversion

Convert hankaku katakana in a string to zenkaku

String input_str = "ｶﾅ･ﾂｰﾙｽﾞ v1.0";
int conv_op_flags = KanaConverter.OP_HAN_KATA_TO_ZEN_KATA;
String output_str = KanaConverter.convertKana(input_str, conv_op_flags);
System.out.println(output_str);
// "カナ・ツールズ v1.0"

Sample with Multiple Conversions

Convert hankaku katakana in a string to zenkaku and, in the same call, convert zenkaku ASCII characters to hankaku

String input_str = "ｶﾅ･ﾂｰﾙｽﾞ ｖ１．０";
int conv_op_flags = KanaConverter.OP_HAN_KATA_TO_ZEN_KATA | KanaConverter.OP_ZEN_ASCII_TO_HAN_ASCII;
String output_str = KanaConverter.convertKana(input_str, conv_op_flags);
System.out.println(output_str);
// "カナ・ツールズ v1.0"
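The | in the sample above works on the assumption that each operation is a distinct bit in an integer. The following is a minimal, self-contained sketch of that pattern. The constant names and values here are hypothetical and are not KanaConverter's actual values.

```java
// Sketch only: illustrates bit-flag composition with hypothetical values.
class FlagSketch {
    // Hypothetical flags; KanaConverter defines its own constants and values.
    static final int OP_FIRST = 1 << 0;
    static final int OP_SECOND = 1 << 1;
    static final int OP_THIRD = 1 << 2;

    // True when the given operation bit is present in the combined flags.
    static boolean isRequested(int flags, int op) {
        return (flags & op) != 0;
    }

    public static void main(String[] args) {
        // Combine two operations into one integer with bitwise OR.
        int flags = OP_FIRST | OP_THIRD;
        System.out.println(isRequested(flags, OP_FIRST));  // true
        System.out.println(isRequested(flags, OP_SECOND)); // false
        System.out.println(isRequested(flags, OP_THIRD));  // true
    }
}
```

Because every flag occupies its own bit, any number of operations can travel in a single int argument.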

List of all conversion types that KanaConverter can perform on an input string.

| Conversion Operation | Description | Example | Target Characters |
| --- | --- | --- | --- |
| OP_HAN_ASCII_TO_ZEN_ASCII | Standard-width ASCII to double-width | Ja (12) → Ｊａ　（１２） | All hankaku ASCII alpha-numeric characters, spaces, and symbols |
| OP_HAN_LETTER_TO_ZEN_LETTER | Standard-width alphabetic letters to double-width | Ja → Ｊａ | All hankaku upper and lower case letters |
| OP_HAN_NUMBER_TO_ZEN_NUMBER | Standard-width numbers to double-width | 12 → １２ | All hankaku numerals |
| OP_HAN_SPACE_TO_ZEN_SPACE | Standard-width spaces to double-width | " " → "　" | The hankaku space character |
| OP_HAN_KATA_TO_ZEN_HIRA | Half-width katakana to full-width hiragana | ｼﾞｬ → じゃ | All hankaku katakana characters plus a few additional symbols |
| OP_HAN_KATA_TO_ZEN_KATA | Half-width katakana to full-width | ｼﾞｬ → ジャ | All hankaku katakana characters plus a few additional symbols |
| OP_KEEP_DIACRITIC_MARKS_APART | Keep hankaku katakana diacritic marks separate | ｼﾞｬ → シ゛ャ | The diacritic marks (dakuten and handakuten) remain their own characters when converting to zenkaku. Combine this operation with either OP_HAN_KATA_TO_ZEN_HIRA or OP_HAN_KATA_TO_ZEN_KATA |
| OP_ZEN_ASCII_TO_HAN_ASCII | Double-width ASCII characters to standard-width | Ｊａ　（１２） → Ja (12) | All zenkaku ASCII alpha-numeric characters, spaces, and symbols |
| OP_ZEN_LETTER_TO_HAN_LETTER | Double-width alphabetic letters to standard-width | Ｊａ → Ja | All zenkaku upper and lower case letters |
| OP_ZEN_NUMBER_TO_HAN_NUMBER | Double-width numbers to standard-width | １２ → 12 | All zenkaku numerals |
| OP_ZEN_SPACE_TO_HAN_SPACE | Double-width spaces to standard-width | "　" → " " | The zenkaku space character |
| OP_ZEN_HIRA_TO_HAN_KATA | Full-width hiragana to half-width katakana | じゃ → ｼﾞｬ | All zenkaku hiragana characters plus a few additional symbols |
| OP_ZEN_HIRA_TO_ZEN_KATA | Full-width hiragana to full-width katakana | じゃ → ジャ | All zenkaku hiragana characters |
| OP_ZEN_KATA_TO_HAN_KATA | Full-width katakana to half-width | ジャ → ｼﾞｬ | All zenkaku katakana characters plus a few additional symbols |
| OP_ZEN_KATA_TO_ZEN_HIRA | Full-width katakana to full-width hiragana | ジャ → じゃ | All zenkaku katakana characters, with a few exceptions |
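
As background for the ASCII operations listed above: in Unicode, the zenkaku forms U+FF01 to U+FF5E mirror ASCII U+0021 to U+007E at a fixed offset of 0xFEE0, and the zenkaku space is U+3000. The sketch below relies only on that layout. It is not KanaConverter's implementation, and the class and method names are made up for illustration.

```java
// Sketch only: shows the Unicode layout behind ASCII width conversion.
class AsciiWidthSketch {
    // Distance between an ASCII character and its zenkaku counterpart.
    private static final int OFFSET = 0xFEE0;
    private static final char ZEN_SPACE = '\u3000';

    // Zenkaku ASCII (U+FF01..U+FF5E) and the zenkaku space to hankaku.
    static String zenToHanAscii(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (char c : input.toCharArray()) {
            if (c >= '\uFF01' && c <= '\uFF5E') {
                out.append((char) (c - OFFSET));
            } else if (c == ZEN_SPACE) {
                out.append(' ');
            } else {
                out.append(c); // everything else passes through unchanged
            }
        }
        return out.toString();
    }

    // Hankaku ASCII (U+0021..U+007E) and the ordinary space to zenkaku.
    static String hanToZenAscii(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (char c : input.toCharArray()) {
            if (c >= '!' && c <= '~') {
                out.append((char) (c + OFFSET));
            } else if (c == ' ') {
                out.append(ZEN_SPACE);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(zenToHanAscii("Ｊａ　（１２）")); // prints "Ja (12)"
        System.out.println(hanToZenAscii("Ja (12)"));
    }
}
```

Kana width conversion does not reduce to a single offset like this, which is one reason the complete target ranges are best checked against the unit tests.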

Further Details on Target Characters

For a complete listing of the target characters of each conversion method, see the SingleOpTest unit test in the repository, which tests the entire character range of every supported method.

A few examples of KanaConverter in action

Handling Addresses

Addresses and phone numbers often come with a mix of hankaku and zenkaku katakana as well as ASCII characters.

Easily standardize address inputs into zenkaku katakana and hankaku ASCII.

Before

東京都北区赤羽６−３０−１ 赤羽ﾋﾙｽﾞ

After

東京都北区赤羽6-30-1 赤羽ヒルズ
// Example input: hankaku katakana mixed with zenkaku digits
String input_address = "東京都北区赤羽６−３０−１ 赤羽ﾋﾙｽﾞ";

// Set the necessary conversion flags in a flag-based integer
int conversion_flags = 0;
conversion_flags |= KanaConverter.OP_HAN_KATA_TO_ZEN_KATA;
conversion_flags |= KanaConverter.OP_ZEN_ASCII_TO_HAN_ASCII;

// Convert the string
String standardized_address = KanaConverter.convertKana(input_address, conversion_flags);
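
For comparison, the JDK's java.text.Normalizer with form NFKC performs a similar width normalization without any library. This is a swapped-in standard-library technique, not what KanaConverter does internally, and it is all-or-nothing: it cannot be limited to selected operations the way the flags above can. The class and method names are illustrative.

```java
import java.text.Normalizer;

// Sketch only: JDK-based width normalization, not KanaConverter's implementation.
class NfkcSketch {
    // NFKC folds hankaku katakana to zenkaku (composing voiced marks)
    // and folds zenkaku ASCII, digits, and spaces to their hankaku forms.
    static String normalizeWidth(String input) {
        return Normalizer.normalize(input, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        System.out.println(normalizeWidth("６　ﾋﾙｽﾞ")); // prints "6 ヒルズ"
    }
}
```

NFKC also rewrites many characters unrelated to Japanese width, so it suits cases where blanket normalization is acceptable; per-operation flags suit cases where only some character classes should change.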

Mixed Kana Input

Easily convert words that are part-hiragana and part-katakana entirely to one or the other.

Before

デカい

After

でかい
String mixed_word = "デカい";
int conversion_flags = KanaConverter.OP_ZEN_KATA_TO_ZEN_HIRA;
String all_hiragana_word = KanaConverter.convertKana(mixed_word, conversion_flags);
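
To show why this conversion is a simple mapping: in Unicode, zenkaku katakana U+30A1 to U+30F6 sit exactly 0x60 above hiragana U+3041 to U+3096. The sketch below uses only that offset. It is an illustration under that assumption, not KanaConverter's code, and the range the library actually converts may differ. The class and method names are hypothetical.

```java
// Sketch only: katakana-to-hiragana by code point offset.
class KanaOffsetSketch {
    // Katakana U+30A1..U+30F6 map to hiragana U+3041..U+3096.
    private static final int KATA_TO_HIRA_OFFSET = 0x60;

    static String kataToHira(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (char c : input.toCharArray()) {
            if (c >= '\u30A1' && c <= '\u30F6') {
                out.append((char) (c - KATA_TO_HIRA_OFFSET));
            } else {
                // Hiragana, kanji, and katakana outside the range
                // (for example U+30F7..U+30FA) pass through unchanged.
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(kataToHira("デカい")); // prints "でかい"
    }
}
```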

Example with Excluded Characters

Exclude certain characters from conversion if necessary, such as HTML tag enclosure symbols

Before

＜ｓｃｒｉｐｔ＞

After

＜script＞
String string_with_html = "＜ｓｃｒｉｐｔ＞";
String excluded_chars = "＜＞";
String safe_conversion = KanaConverter.convertKana(string_with_html, KanaConverter.OP_ZEN_ASCII_TO_HAN_ASCII, excluded_chars);
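
The exclusion idea can be sketched without the library as a per-character skip check before conversion. This is a hypothetical stand-in, assuming the excluded characters are given in the form in which they appear in the input. It is not how KanaConverter necessarily implements the third argument, and the class and method names are invented for the example.

```java
// Sketch only: zenkaku-to-hankaku ASCII conversion with an exclusion list.
class ExclusionSketch {
    private static final int OFFSET = 0xFEE0; // zenkaku ASCII minus ASCII

    static String zenToHanAscii(String input, String excludedChars) {
        StringBuilder out = new StringBuilder(input.length());
        for (char c : input.toCharArray()) {
            boolean excluded = excludedChars.indexOf(c) >= 0;
            if (!excluded && c >= '\uFF01' && c <= '\uFF5E') {
                out.append((char) (c - OFFSET)); // convert to hankaku
            } else {
                out.append(c); // excluded or out of range: keep as-is
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The zenkaku angle brackets survive; the letters become hankaku.
        System.out.println(zenToHanAscii("＜ｓｃｒｉｐｔ＞", "＜＞"));
    }
}
```

Keeping the brackets in zenkaku form means the converted text cannot be interpreted as an HTML tag.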