VBAの非可逆文字変換についていろいろ

気分転換にちょっと調べてみた
調査は Windows XP SP2 + Excel 2003 SP3 環境にて。

シフトJISUnicodeシフトJIS文字コードが非可逆に変換されるもの

元のシフトJISコード:sj
変換後のシフトJISコード:Asc(ChrW$(AscW(Chr$(sj))))

区点変換前ShiftJIS変換後ShiftJIS
1380879081E0
1381879181DF
1382879281E7
1385879581E3
1386879681DB
1387879781DA
1390879A81E6
1391879B81BF
1392879C81BE
8901ED40FA5C
8902ED41FA5D
8903ED42FA5E
8904ED43FA5F
8905ED44FA60
8906ED45FA61
8907ED46FA62
8908ED47FA63
8909ED48FA64
8910ED49FA65
8911ED4AFA66
8912ED4BFA67
8913ED4CFA68
8914ED4DFA69
8915ED4EFA6A
8916ED4FFA6B
8917ED50FA6C
8918ED51FA6D
8919ED52FA6E
8920ED53FA6F
8921ED54FA70
8922ED55FA71
8923ED56FA72
8924ED57FA73
8925ED58FA74
8926ED59FA75
8927俿ED5A俿FA76
8928ED5BFA77
8929ED5CFA78
8930ED5DFA79
8931ED5EFA7A
8932ED5FFA7B
8933ED60FA7C
8934ED61FA7D
8935ED62FA7E
8936ED63FA80
8937ED64FA81
8938ED65FA82
8939ED66FA83
8940ED67FA84
8941ED68FA85
8942ED69FA86
8943ED6AFA87
8944ED6BFA88
8945ED6CFA89
8946ED6DFA8A
8947ED6EFA8B
8948ED6FFA8C
8949ED70FA8D
8950ED71FA8E
8951ED72FA8F
8952ED73FA90
8953ED74FA91
8954ED75FA92
8955ED76FA93
8956ED77FA94
8957ED78FA95
8958ED79FA96
8959ED7AFA97
8960ED7BFA98
8961ED7CFA99
8962ED7DFA9A
8963ED7EFA9B
8964ED80FA9C
8965ED81FA9D
8966ED82FA9E
8967ED83FA9F
8968ED84FAA0
8969ED85FAA1
8970ED86FAA2
8971ED87FAA3
8972ED88FAA4
8973ED89FAA5
8974ED8AFAA6
8975ED8BFAA7
8976ED8CFAA8
8977ED8DFAA9
8978ED8EFAAA
8979ED8FFAAB
8980ED90FAAC
8981ED91FAAD
8982ED92FAAE
8983ED93FAAF
8984ED94FAB0
8985ED95FAB1
8986ED96FAB2
8987ED97FAB3
8988ED98FAB4
8989ED99FAB5
8990ED9AFAB6
8991ED9BFAB7
8992ED9CFAB8
8993ED9DFAB9
8994ED9EFABA
9001ED9FFABB
9002EDA0FABC
9003EDA1FABD
9004EDA2FABE
9005EDA3FABF
9006EDA4FAC0
9007EDA5FAC1
9008EDA6FAC2
9009EDA7FAC3
9010EDA8FAC4
9011EDA9FAC5
9012EDAAFAC6
9013EDABFAC7
9014EDACFAC8
9015EDADFAC9
9016EDAEFACA
9017EDAFFACB
9018EDB0FACC
9019EDB1FACD
9020EDB2FACE
9021EDB3FACF
9022EDB4FAD0
9023EDB5FAD1
9024EDB6FAD2
9025EDB7FAD3
9026EDB8FAD4
9027EDB9FAD5
9028EDBAFAD6
9029EDBBFAD7
9030EDBCFAD8
9031EDBDFAD9
9032EDBEFADA
9033EDBFFADB
9034EDC0FADC
9035EDC1FADD
9036EDC2FADE
9037EDC3FADF
9038EDC4FAE0
9039EDC5FAE1
9040EDC6FAE2
9041EDC7FAE3
9042EDC8FAE4
9043EDC9FAE5
9044EDCAFAE6
9045EDCBFAE7
9046EDCCFAE8
9047EDCDFAE9
9048EDCEFAEA
9049EDCFFAEB
9050EDD0FAEC
9051EDD1FAED
9052EDD2FAEE
9053EDD3FAEF
9054EDD4FAF0
9055EDD5FAF1
9056EDD6FAF2
9057EDD7FAF3
9058EDD8FAF4
9059氿EDD9氿FAF5
9060EDDAFAF6
9061EDDBFAF7
9062EDDCFAF8
9063EDDDFAF9
9064EDDEFAFA
9065EDDFFAFB
9066EDE0FAFC
9067EDE1FB40
9068EDE2FB41
9069EDE3FB42
9070EDE4FB43
9071EDE5FB44
9072EDE6FB45
9073EDE7FB46
9074EDE8FB47
9075EDE9FB48
9076EDEAFB49
9077溿EDEB溿FB4A
9078EDECFB4B
9079EDEDFB4C
9080EDEEFB4D
9081EDEFFB4E
9082EDF0FB4F
9083EDF1FB50
9084EDF2FB51
9085EDF3FB52
9086EDF4FB53
9087EDF5FB54
9088EDF6FB55
9089EDF7FB56
9090EDF8FB57
9091EDF9FB58
9092EDFAFB59
9093EDFBFB5A
9094EDFCFB5B
9101EE40FB5C
9102EE41FB5D
9103EE42FB5E
9104EE43FB5F
9105EE44FB60
9106EE45FB61
9107EE46FB62
9108EE47FB63
9109EE48FB64
9110EE49FB65
9111EE4AFB66
9112EE4BFB67
9113EE4CFB68
9114EE4DFB69
9115EE4EFB6A
9116EE4FFB6B
9117EE50FB6C
9118EE51FB6D
9119EE52FB6E
9120EE53FB6F
9121EE54FB70
9122EE55FB71
9123EE56FB72
9124EE57FB73
9125EE58FB74
9126EE59FB75
9127EE5AFB76
9128EE5BFB77
9129EE5CFB78
9130EE5DFB79
9131EE5EFB7A
9132EE5FFB7B
9133EE60FB7C
9134EE61FB7D
9135EE62FB7E
9136EE63FB80
9137EE64FB81
9138EE65FB82
9139EE66FB83
9140EE67FB84
9141EE68FB85
9142EE69FB86
9143EE6AFB87
9144EE6BFB88
9145EE6CFB89
9146EE6DFB8A
9147EE6EFB8B
9148EE6FFB8C
9149EE70FB8D
9150EE71FB8E
9151EE72FB8F
9152EE73FB90
9153EE74FB91
9154EE75FB92
9155EE76FB93
9156EE77FB94
9157EE78FB95
9158EE79FB96
9159EE7AFB97
9160EE7BFB98
9161EE7CFB99
9162EE7DFB9A
9163EE7EFB9B
9164EE80FB9C
9165EE81FB9D
9166EE82FB9E
9167EE83FB9F
9168EE84FBA0
9169EE85FBA1
9170EE86FBA2
9171EE87FBA3
9172EE88FBA4
9173EE89FBA5
9174EE8AFBA6
9175EE8BFBA7
9176EE8CFBA8
9177EE8DFBA9
9178EE8EFBAA
9179EE8FFBAB
9180譿EE90譿FBAC
9181EE91FBAD
9182EE92FBAE
9183EE93FBAF
9184EE94FBB0
9185EE95FBB1
9186EE96FBB2
9187EE97FBB3
9188EE98FBB4
9189EE99FBB5
9190EE9AFBB6
9191EE9BFBB7
9192EE9CFBB8
9193EE9DFBB9
9194EE9EFBBA
9201EE9FFBBB
9202EEA0FBBC
9203EEA1FBBD
9204EEA2FBBE
9205EEA3FBBF
9206EEA4FBC0
9207EEA5FBC1
9208EEA6FBC2
9209EEA7FBC3
9210EEA8FBC4
9211EEA9FBC5
9212EEAAFBC6
9213EEABFBC7
9214EEACFBC8
9215EEADFBC9
9216EEAEFBCA
9217EEAFFBCB
9218EEB0FBCC
9219EEB1FBCD
9220EEB2FBCE
9221EEB3FBCF
9222EEB4FBD0
9223EEB5FBD1
9224EEB6FBD2
9225EEB7FBD3
9226EEB8FBD4
9227EEB9FBD5
9228EEBAFBD6
9229EEBBFBD7
9230EEBCFBD8
9231EEBDFBD9
9232EEBEFBDA
9233EEBFFBDB
9234EEC0FBDC
9235EEC1FBDD
9236EEC2FBDE
9237EEC3FBDF
9238EEC4FBE0
9239EEC5FBE1
9240EEC6FBE2
9241EEC7FBE3
9242EEC8FBE4
9243EEC9FBE5
9244EECAFBE6
9245EECBFBE7
9246EECCFBE8
9247EECDFBE9
9248EECEFBEA
9249EECFFBEB
9250EED0FBEC
9251EED1FBED
9252EED2FBEE
9253EED3FBEF
9254EED4FBF0
9255EED5FBF1
9256EED6FBF2
9257EED7FBF3
9258EED8FBF4
9259EED9FBF5
9260EEDAFBF6
9261EEDBFBF7
9262EEDCFBF8
9263EEDDFBF9
9264EEDEFBFA
9265EEDFFBFB
9266EEE0FBFC
9267EEE1FC40
9268EEE2FC41
9269EEE3FC42
9270EEE4FC43
9271EEE5FC44
9272EEE6FC45
9273EEE7FC46
9274EEE8FC47
9275EEE9FC48
9276EEEAFC49
9277EEEBFC4A
9278EEECFC4B
9281EEEFFA40
9282EEF0FA41
9283EEF1FA42
9284EEF2FA43
9285EEF3FA44
9286EEF4FA45
9287EEF5FA46
9288EEF6FA47
9289EEF7FA48
9290EEF8FA49
9291EEF981CA
9292EEFAFA55
9293EEFBFA56
9294EEFCFA57

機種依存文字は表示されない可能性があるため表中の文字はグリフ等価なUnicodeポイントを使用している


13区(0x8790〜0x879C)の記号は機種依存文字。同じ字形の本来の(JISで定義されている)シフトJISコードに再マップされる。


89区〜92区はいわゆる「NEC選定IBM拡張文字」。0xED40〜0xEEFCの文字は同じ字形の「IBM拡張文字」(0xFA5C〜0xFA57)にマップされ直されてしまう。しかも例外的に0xEEF9(¬)だけ0x81CAに飛び地している。


なお、VBAの文字列処理は内部ではUnicodeで行われるため、StrConvなどを使って文字列操作を行った場合も暗黙的に上記と同じ現象が発生する。

■全角 ⇒ 半角 ⇒ 全角 で文字コードが非可逆に変換されるもの

元のシフトJISコード:sj
変換後のシフトJISコード:Asc(StrConv(StrConv(Chr$(sj), vbNarrow), vbWide))

区点変換前ShiftJIS半角ShiftJIS変換後ShiftJIS
01388165'27FA56
01398166'27FA56
01408167"22FA57
01418168"22FA57
9292EEFA|7C8162
9293EEFB'27FA56
9294EEFC"22FA57


「’」はいつも困るんだよね。キーボードからSHIFT+7で半角時「'」全角時「’」入力できるくせに、Excelの検索では両者を分けて考えるから、「半角と全角を区別する」のチェックを外しても同一視してくれないから両方が混在してると同時に検索できない。


■テストコード

Option Explicit
Option Base 0

Sub test()

Dim ku As Long
Dim ten As Long
Dim sj As Long, csj As Long, un As Long
Dim sjc As String, uc As String
Dim sjnc As String, csjc As String
Dim r As Long

r = 1

For ku = 1 To 94
For ten = 1 To 94
sj = KutenToShiftJis(ku, ten)
sjc = Chr$(sj)
un = AscW(sjc)
uc = ChrW$(un)
csj = Asc(uc)
If csj < 0 Then csj = csj + 65536
If sj <> csj And csj <> 33093 Then
ActiveSheet.Cells(r, 1).Value = _
Right$("000" & (ku * 100 + ten), 4) & " : " & _
sjc & "(" & Hex$(sj) & ") / " & _
uc & "(" & Hex$(csj) & ")"
r = r + 1
End If
Next ten
Next ku

r = 1

For ku = 1 To 94
For ten = 1 To 94
sj = KutenToShiftJis(ku, ten)
sjc = Chr$(sj)
sjnc = StrConv(sjc, vbNarrow)
csjc = StrConv(sjnc, vbWide)
csj = Asc(csjc)
If csj < 0 Then csj = csj + 65536
If sj <> csj And csj <> 33093 And sjnc <> csjc Then
ActiveSheet.Cells(r, 2).Value = _
Right$("000" & (ku * 100 + ten), 4) & " : " & _
sjc & "(" & Hex$(sj) & ") / " & _
sjnc & "(" & Hex$(Asc(sjnc)) & ") / " & _
Chr$(csj) & "(" & Hex$(csj) & ")"
r = r + 1
End If
Next ten
Next ku

End Sub

Function KutenToShiftJis(ByVal ku As Long, ByVal ten As Long) As Long

Dim hi As Long, lo As Long

KutenToShiftJis = 0

If ku < 1 Or ku > 94 Then Exit Function
If ten < 1 Or ten > 94 Then Exit Function

hi = Int((ku - 1) / 2) + &H81
If (hi >= &HA0) Then
hi = hi + &HE0 - &HA0
End If

If ku Mod 2 = 0 Then
lo = ten - 1 + &H9F
Else
lo = ten - 1 + &H40
If (lo >= &H7F) Then lo = lo + 1
End If

KutenToShiftJis = hi * 256 + lo

End Function