Score:1

How do I get VLC to show the Japanese song titles of files extracted from a zip?

de flag

I initially had this problem: How to unzip a Japanese ZIP file, and avoid mojibake/garbled characters

Running "unzip -O shift-jis [filename.zip]" fixed that, and I got my nice Japanese characters in the file names, but it didn't seem to work for the file metadata.

I found this: Why does my VLC window show weird fonts?, but its solution seems to be for subtitles only. My issue doesn't seem to be a VLC thing either, since the audio file's Audio properties also show a garbled title. It appears as mojibake blocks on my screen, but when copy-pasted here, some of them turn into characters that take up no space: "Ôç©d¸UE - C[h"

Also, my Neptunia Re;Birth1 music matches everyone else's reports: tracks 1 and 18 show Japanese, and the rest seem to be mojibake.

I guess if I just wanted to figure out the names, I could do something like the answers to How to turn mojibake text to readable form?

ChanganAuto avatar
us flag
Dumb question: Do you have Japanese language support (and fonts) installed?
Malady avatar
de flag
@ChanganAuto - My Language Support: Installed Languages has Japanese as Installed?
Score:1
pl flag

First step: determine which encoding the metadata is written in.

Install ExifTool, a metadata reader:

sudo apt install libimage-exiftool-perl

Show the metadata of the file you want to play in VLC:

exiftool filename

Sample output:

ExifTool Version Number         : 12.49
File Name                       : 10 - グラスホッパー.flac
--cut--
File Type                       : FLAC
File Type Extension             : flac
MIME Type                       : audio/flac
--cut--
Track Number                    : 10
Discnumber                      : 1
Title                           : グラスホッパー
Artist                          : スピッツ
Album                           : ハチミツ
Genre                           : Unknown
Date                            : 1995-09-20
--cut--
Artistsort                      : Spitz
Discid                          : 9c0a320b
Musicbrainz Discid              : KcCfHpYnqpWm4siIth0whkxTBEU-
Tracktotal                      : 11
Duration                        : 0:03:31

If you can read the metadata normally in your terminal, then it is written in Unicode (run echo $LANG to check that your terminal locale is UTF-8). In that case, also check the VLC font settings.

[Screenshot: VLCFont.png]

Otherwise, it is written in another character encoding. For Japanese, that is probably Shift-JIS or EUC-JP.
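To narrow down which encoding a tag uses, one option is to try decoding its raw bytes with each candidate. Below is a minimal Python sketch of that idea. The function name `plausible_encodings` and the candidate list are assumptions made for illustration, and the demo bytes are produced from the sample title above rather than read from a real file.

```python
# Candidate encodings are an assumption: UTF-8 plus the two legacy
# Japanese encodings mentioned above.
CANDIDATES = ("utf-8", "shift_jis", "euc_jp")


def plausible_encodings(raw, candidates=CANDIDATES):
    """Return the candidate encodings that decode `raw` without an error."""
    ok = []
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue  # this encoding cannot represent these bytes
        ok.append(name)
    return ok


if __name__ == "__main__":
    title = "グラスホッパー"
    # Simulate tag bytes as they would be stored under each encoding.
    for enc in CANDIDATES:
        raw = title.encode(enc)
        print(enc, "bytes decode cleanly as:", plausible_encodings(raw))
```

A clean decode is only a hint: some byte sequences are valid in more than one encoding, so the decoded text still needs to be checked by eye.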

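Now save the exiftool output to a text file:

exiftool filename > textfile.txt

Convert it from Shift-JIS (or EUC-JP, 'eucjp') to UTF-8, writing the result to a new file (iconv prints to standard output and does not change textfile.txt itself):

iconv -f sjis -t utf8 textfile.txt > converted.txt

cat converted.txt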

If the converted text is readable, with no tofu (blank box) characters, then you can use it to edit the original metadata.

For example:

exiftool -Title="グラスホッパー" -Artist="スピッツ" -Album="ハチミツ" filename

Then play the song/video in VLC and see what has changed.
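If the text was already decoded with the wrong encoding before it reached you (for example, Shift-JIS bytes displayed as Latin-1), the usual repair is to re-encode it with the wrong encoding and decode it again with the right one. Here is a minimal Python sketch of that round trip. The name `undo_mojibake` is hypothetical, and Latin-1 and Shift-JIS are assumptions about the encodings involved, not something confirmed for any particular file.

```python
def undo_mojibake(garbled, wrong="latin-1", actual="shift_jis"):
    """Reverse a wrong decode.

    Re-encode the text with the encoding that was wrongly assumed,
    then decode the resulting bytes with the encoding actually used.
    """
    return garbled.encode(wrong).decode(actual)


if __name__ == "__main__":
    original = "グラスホッパー"
    # Simulate the damage: Shift-JIS bytes misread as Latin-1.
    garbled = original.encode("shift_jis").decode("latin-1")
    print(repr(garbled))           # mojibake, including invisible control characters
    print(undo_mojibake(garbled))  # the readable title again
```

This only works while every character of the garbled text is intact. Text that has been copy-pasted and lost its invisible control characters along the way can no longer be restored like this.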

Malady avatar
de flag
Almost! I converted from sjis to utf-8 and got "窶ーテ板催ァ窶堋ゥ窶播ツ青クUE - ニ陳iconv: illegal input sequence at position 105"
Sadaharu Wakisaka avatar
pl flag
Sorry, there were two mistakes: #1, there is no `iconv` package to install; #2, the syntax is `iconv -f from -t to`. Now edited.
Malady avatar
de flag
Thanks! But neither sjis nor eucjp can handle all of the info; both stop with "iconv: illegal input sequence at position 75". I've seen this list of formats, but I'm not sure which ones are Japanese or how to get iconv to use them: https://stackoverflow.com/a/8039467/4592583
Sadaharu Wakisaka avatar
pl flag
I don't know. These two character encodings were common in the 1990s and 2000s. Unicode arrived in the mid-2000s, and since then no one has intentionally used them. I have personally seen a lot of odd character encodings from outside Japan.