Notes on Using KAKASI

  • 2009-01-29 (Thu) 17:30
  • OSX

I was looking into how to force Japanese file names into romaji (Latin-alphabet) file names, and it turns out that KAKASI looks like it can do the job.

What follows are my notes from trying KAKASI out.

First, install kakasi with MacPorts.

$ which kakasi
/opt/local/bin/kakasi

nkf is needed as well.

$ which nkf
/opt/local/bin/nkf

To start with, try converting some strings in the terminal.

The trick seems to be to convert the string's encoding to EUC-JP with nkf first, and only then convert the string with kakasi.

$ echo 猫の手も借りたい | nkf -e | kakasi -Ja | nkf -w
nekoのteもkariたい
 
$ echo 猫の手も借りたい | nkf -e | kakasi -JH | nkf -w
ねこのてもかりたい
 
$ echo 猫の手も借りたい | nkf -e | kakasi -JK | kakasi -HK | nkf -w
ネコノテモカリタイ
 
$ echo 猫の手も借りたい | nkf -e | kakasi -Ja | kakasi -Ha | nkf -w
nekonotemokaritai
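
The last pipeline can be wrapped in a small shell function so it is easy to reuse. This is only a sketch: the name `to_romaji` is made up for illustration, and it passes all conversion flags to a single kakasi invocation instead of chaining two. It also adds `-Ka` and `-Ea` (both appear in kakasi's usage text) on the assumption that katakana and full-width symbols should be romanized too.

```shell
#!/bin/sh
# to_romaji (hypothetical name): read UTF-8 text on stdin, write romaji on stdout.
to_romaji() {
    # nkf -e  : convert the input to EUC-JP before handing it to kakasi
    # -Ja -Ha : kanji and hiragana to ASCII, as in the examples above
    # -Ka -Ea : katakana and symbols to ASCII (an addition, not in the memo)
    # nkf -w  : convert the result back to UTF-8
    nkf -e | kakasi -Ja -Ha -Ka -Ea | nkf -w
}

# Usage: echo 猫の手も借りたい | to_romaji
```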

I see. Building on this, converting file names should be doable as well.
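
As a rough idea of what that could look like, here is a minimal sketch that renames files using the same nkf/kakasi pipeline. The function names `romaji_name` and `romanize_files` and the `DRY_RUN` variable are hypothetical, not part of either tool. One caveat it does not handle: on OS X, file names are typically stored in a decomposed Unicode form, so names containing voiced kana may need normalizing before nkf sees them.

```shell
#!/bin/sh
# romaji_name (hypothetical): romanize one file name (not a whole path).
romaji_name() {
    printf '%s\n' "$1" | nkf -e | kakasi -Ja | kakasi -Ha | nkf -w
}

# romanize_files (hypothetical): rename each given file to its romanized name.
# With DRY_RUN=1 the planned renames are printed instead of performed.
romanize_files() {
    for path in "$@"; do
        dir=$(dirname -- "$path")
        old=$(basename -- "$path")
        new=$(romaji_name "$old")
        # Nothing to do if the conversion produced nothing or changed nothing.
        if [ -z "$new" ] || [ "$new" = "$old" ]; then
            continue
        fi
        # Never overwrite an existing file; two names can romanize identically.
        if [ -e "$dir/$new" ]; then
            printf 'skip: %s already exists\n' "$dir/$new" >&2
            continue
        fi
        if [ "${DRY_RUN:-0}" = 1 ]; then
            printf 'would rename: %s -> %s\n' "$path" "$dir/$new"
        else
            mv -- "$path" "$dir/$new"
        fi
    done
}
```

Calling `romanize_files *.txt` after setting `DRY_RUN=1` shows what would happen without touching anything, which seems the safer way to try it first.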

kakasi

$ kakasi -v
KAKASI - Kanji Kana Simple Inverter  Version 2.3.4
Copyright (C) 1992-1999 Hironobu Takahashi. All rights reserved.
 
Usage: kakasi -a[jE] -j[aE] -g[ajE] -k[ajKH] -E[aj] -K[ajkH] -H[ajkK] -J[ajkKH]
              -i{oldjis,newjis,dec,euc,sjis} -o{oldjis,newjis,dec,euc,sjis}
              -r{hepburn,kunrei} -p -s -f -c"chars"  [jisyo1, jisyo2,,,]
 
      Character Sets:
       a: ascii  j: jisroman  g: graphic  k: kana (j,k     defined in jisx0201)
       E: kigou  K: katakana  H: hiragana J: kanji(E,K,H,J defined in jisx0208)
 
      Options:
      -i: input coding system    -o: output coding system
      -r: romaji conversion system
      -p: list all readings (with -J option)
      -s: insert separate characters (with -J option)
      -f: furigana mode (with -J option)
      -c: skip chars within jukugo (with -J option: default TAB CR LF BLANK)
      -C: romaji Capitalize (with -Ja or -Jj option)
      -U: romaji Upcase     (with -Ja or -Jj option)
      -u: call fflush() after 1 character output
      -w: wakatigaki mode
 
Report bugs to <bug-kakasi@namazu.org>.
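
Some of the options in this usage text look handy for file names. As a rough sketch (the function name `romaji_slug` is hypothetical), `-s` asks kakasi to insert separators between words, and those can then be turned into underscores with `tr`. The usage text also lists `-r{hepburn,kunrei}` for choosing the romanization system, which is not used here.

```shell
#!/bin/sh
# romaji_slug (hypothetical name): UTF-8 text on stdin -> underscore-separated romaji.
romaji_slug() {
    # -s makes kakasi insert a separator between converted words;
    # tr -s turns spaces into underscores and squeezes repeated underscores into one.
    nkf -e | kakasi -Ja -Ha -Ka -Ea -s | nkf -w | tr -s ' ' '_'
}

# Usage: echo 猫の手も借りたい | romaji_slug
```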

nkf

$ nkf --help
USAGE:  nkf(nkf32,wnkf,nkf2) -[flags] [in file] .. [out file for -O flag]
Flags:
b,u      Output is buffered (DEFAULT),Output is unbuffered
j,s,e,w  Output code is JIS 7 bit (DEFAULT), Shift JIS, EUC-JP, UTF-8N
         After 'w' you can add more options. -w[ 8 [0], 16 [[BL] [0]] ]
J,S,E,W  Input assumption is JIS 7 bit , Shift JIS, EUC-JP, UTF-8
         After 'W' you can add more options. -W[ 8, 16 [BL] ] 
t        no conversion
i[@B]    Specify the Esc Seq for JIS X 0208-1978/83 (DEFAULT B)
o[BJH]   Specify the Esc Seq for ASCII/Roman        (DEFAULT B)
r        {de/en}crypt ROT13/47
h        1 katakana->hiragana, 2 hiragana->katakana, 3 both
v        Show this usage. V: show version
m[BQN0]  MIME decode [B:base64,Q:quoted,N:non-strict,0:no decode]
M[BQ]    MIME encode [B:base64 Q:quoted]
l        ISO8859-1 (Latin-1) support
f/F      Folding: -f60 or -f or -f60-10 (fold margin 10) F preserve nl
Z[0-3]   Convert X0208 alphabet to ASCII
         1: Kankaku to 1 space  2: to 2 spaces  3: Convert to HTML Entity
X,x      Assume X0201 kana in MS-Kanji, -x preserves X0201
B[0-2]   Broken input  0: missing ESC,1: any X on ESC-[($]-X,2: ASCII on NL
O        Output to File (DEFAULT 'nkf.out')
I        Convert non ISO-2022-JP charactor to GETA
d,c      Convert line breaks  -d: LF  -c: CRLF
-L[uwm]  line mode u:LF w:CRLF m:CR (DEFAULT noconversion)
 
Long name options
 --ic=<input codeset>  --oc=<output codeset>
                   Specify the input or output codeset
 --fj  --unix --mac  --windows
 --jis  --euc  --sjis  --utf8  --utf16  --mime  --base64
                   Convert for the system or code
 --hiragana  --katakana  --katakana-hiragana
                   To Hiragana/Katakana Conversion
 --prefix=         Insert escape before troublesome characters of Shift_JIS
 --cap-input, --url-input  Convert hex after ':' or '%'
 --numchar-input   Convert Unicode Character Reference
 --fb-{skip, html, xml, perl, java, subchar}
                   Specify how nkf handles unassigned characters
 --in-place[=SUFFIX]  --overwrite[=SUFFIX]
                   Overwrite original listed files by filtered result
                   --overwrite preserves timestamp of original files
 -g  --guess       Guess the input code
 --help  --version Show this help/the version
                   For more information, see also man nkf
 
Network Kanji Filter Version 2.0.8 (2007-07-20) 
Copyright (C) 1987, FUJITSU LTD. (I.Ichikawa),2000 S. Kono, COW
Copyright (C) 2002-2007 Kono, Furukawa, Naruse, mastodon
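
In the examples earlier, nkf is left to guess the input encoding. The help text above lists `-W` and `-E` for stating the input encoding explicitly, so a slightly more defensive variant of the two conversion steps might look like this sketch (the function names are hypothetical):

```shell
#!/bin/sh
# Hypothetical helpers that state the input encoding instead of letting nkf guess.
utf8_to_euc() { nkf -W -e; }   # -W: input is UTF-8,  -e: output EUC-JP
euc_to_utf8() { nkf -E -w; }   # -E: input is EUC-JP, -w: output UTF-8

# Usage: echo 猫の手も借りたい | utf8_to_euc | kakasi -JH | euc_to_utf8
```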
