Kakasi kanji to roomaji converter encoding difficulties


Posted on 16 Aug 2022.


Answers (2)

manpreet Best Answer 2 years ago

I am trying to use the Kakasi kanji/hiragana/katakana to roomaji converter as an aid to learning kanji pronunciation within specific sentences. I am using the following command and parameters:

kakasi -Ja -Ha -Ka -Ea -s

For example, converting today's date gives:

$ echo "7月31日" | kakasi -Ja -Ha -Ka -Ea -s
7 shin ?? 1 ka �

There is clearly a configuration error, which I think comes from the input encoding (UTF-8) not being understood correctly by the tool.

Could anybody with experience on this matter please either advise on how to tell kakasi to accept Unicode input, or suggest an alternative open-source conversion tool that works better? (Please, no Windows software.)

manpreet 2 years ago

Thanks to comments by @Earthliŋ and @blutorange (recognition where recognition is due), the combination of iconv with kakasi has finally worked. An initial conversion from Unicode (UTF-8) to Shift-JIS is required, and can be performed using:

$ echo "7月31日" | iconv -f utf8 -t shift-jis | kakasi -Ja -Ha -Ka -Ea -s

7 gatsu 31 nichi
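As a quick sanity check (my own sketch, not from the comments), counting bytes shows that the same kanji really is stored differently in the two encodings, which is presumably why kakasi misreads raw UTF-8 input. It assumes the script itself is saved as UTF-8 and that the local iconv knows the canonical names UTF-8 and SHIFT_JIS:

```shell
# Count the bytes of a single kanji before and after conversion.
# Assumption: this file/terminal is UTF-8 and iconv supports SHIFT_JIS.
utf8_bytes=$(printf '%s' '月' | wc -c | tr -d '[:space:]')
sjis_bytes=$(printf '%s' '月' | iconv -f UTF-8 -t SHIFT_JIS | wc -c | tr -d '[:space:]')

# U+6708 takes three bytes in UTF-8 but only two in Shift-JIS.
echo "UTF-8: $utf8_bytes bytes, Shift-JIS: $sjis_bytes bytes"
```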

Conversion back in the other direction is not needed when the output is roomaji, since the basic Latin letters and digits are ASCII characters whose byte values are identical in Shift-JIS and UTF-8. If necessary, conversion from Shift-JIS back to Unicode can be performed with:

$ echo "7月31日" | iconv -f utf8 -t shift-jis | kakasi -Ja -Ha -Ka -Ea -s | iconv -f shift-jis -t utf8

7 gatsu 31 nichi

For instance, to convert into Hiragana:

$ echo "7月31日" | iconv -f utf8 -t shift-jis | kakasi -JH -KH -Ea -s | iconv -f shift-jis -t utf8

7 がつ 31 にち
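The earlier point that roomaji output needs no conversion back can be checked without kakasi at all. This is only a sketch, using the sample output string from above and assuming an iconv that knows UTF-8 and SHIFT_JIS:

```shell
# Romaji output consists only of ASCII letters, digits and spaces,
# so passing it through iconv should leave every byte unchanged.
romaji='7 gatsu 31 nichi'
converted=$(printf '%s' "$romaji" | iconv -f UTF-8 -t SHIFT_JIS)

if [ "$romaji" = "$converted" ]; then
    echo "identical bytes"
else
    echo "bytes differ"
fi
```

Hiragana output, on the other hand, is not ASCII, which is why the second iconv step is needed there.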

Update

As pointed out by @oals in the comments, newer versions of kakasi have the little-documented parameters -iutf8 and -outf8 to specify Unicode encoding for input and output respectively. The above conversion to Hiragana can then be performed more efficiently using:

$ echo "7月31日" | kakasi -JH -KH -Ea -s -iutf8 -outf8

7 がつ 31 にち
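For repeated use, the one-liner could be wrapped in a small shell function. This is just a sketch: the name to_hiragana is made up for illustration, the flags are exactly those of the command above, and it assumes a kakasi build recent enough to accept -iutf8 and -outf8:

```shell
# Hypothetical helper: convert UTF-8 Japanese text to hiragana via kakasi.
# Assumption: the installed kakasi accepts -iutf8 and -outf8.
to_hiragana() {
    if ! command -v kakasi >/dev/null 2>&1; then
        echo "kakasi not found in PATH" >&2
        return 1
    fi
    if [ "$#" -gt 0 ]; then
        # Text given as arguments: feed it to kakasi on standard input.
        printf '%s\n' "$*" | kakasi -JH -KH -Ea -s -iutf8 -outf8
    else
        # No arguments: act as a filter on standard input.
        kakasi -JH -KH -Ea -s -iutf8 -outf8
    fi
}
```

It can then be called either as to_hiragana "7月31日" or at the end of a pipe. On an older kakasi without the UTF-8 options, the iconv pipeline shown earlier would still be needed.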

Thanks for your help.

