Umlaut/Special Characters Issue over SMB on Mac and Synology NAS

Synology NAS has native support for HFS file systems (Apple’s previous standard file system). So, it is possible to connect an HFS-formatted external USB drive to a Synology NAS and copy the files directly to the NAS. However, there is one big issue: Mac OS stores file and folder names in a different UTF-8 variant (utf-8-mac or UTF8-D, i.e. decomposed Unicode, NFD) than standard Unix systems (UTF8-C, i.e. precomposed Unicode, NFC). This is not a problem for plain ASCII characters, but it causes severe problems with accented characters like the umlauts ä, ö and ü, because the two forms of the same name differ byte for byte. When copying files from Mac OS Finder to an SMB share, Mac OS automatically takes care of the difference and stores the file names UTF8-C encoded in the remote directory.
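To make the difference concrete, here is a minimal sketch that prints the raw UTF-8 bytes of "ä" in both forms. The helper name hex_bytes is made up for this illustration, and the sketch assumes a POSIX shell with printf and od available.

```shell
# Print the bytes of a string as space-separated hex values.
# The unquoted echo collapses od's padding and line breaks.
hex_bytes() {
    echo $(printf '%s' "$1" | od -An -tx1)
}

# "ä" precomposed (UTF8-C / NFC): the single code point U+00E4
nfc=$(printf '\303\244')
# "ä" decomposed (UTF8-D / NFD): "a" followed by U+0308 (combining diaeresis)
nfd=$(printf 'a\314\210')

hex_bytes "$nfc"   # c3 a4
hex_bytes "$nfd"   # 61 cc 88
```

Both strings render as the same character, but a byte-wise comparison of the two file names fails.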

The problem

However, when copying directly from an HFS-formatted USB drive connected to the Synology NAS, the NAS does not account for this particularity and leaves the file names unchanged, i.e. UTF8-D encoded. The Synology Web File Station displays UTF8-D encoded special characters correctly. However, when accessing UTF8-D encoded files or folders via SMB from Mac OS Finder, the folders appear empty or the files appear corrupt.
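To check whether a share is affected, one option is to search for names that contain a combining mark. The sketch below only looks for U+0308 (the combining diaeresis used by decomposed ä, ö and ü), so it is a quick heuristic rather than a full NFD detector. The function name and the example path are made up for this illustration.

```shell
# List entries below a directory whose names contain U+0308 (bytes cc 88),
# i.e. names that most likely carry decomposed umlauts.
# LC_ALL=C makes find compare the names byte by byte.
find_decomposed_umlauts() {
    mark=$(printf '\314\210')
    LC_ALL=C find "$1" -name "*${mark}*"
}

# Usage (hypothetical path):
# find_decomposed_umlauts /volume1/share
```

Other accents use other combining marks (for example U+0301 for é), so an empty result does not prove that every name is already UTF8-C encoded.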

The solution: convmv

Fortunately, there is a tool that can repair the file names in bulk: convmv. It is a Perl script, so it needs no installation beyond a Perl interpreter on the NAS. It can simply be downloaded and executed on the Synology NAS.

#Download and extract
#Check manually whether a more recent version is available
wget https://www.j3e.de/linux/convmv/convmv-2.05.tar.gz
tar xzvf convmv-2.05.tar.gz
cd convmv-2.05

#Run the tool once to check that it starts
./convmv

#Dry run (the default) of converting from UTF8-D (Mac) to UTF8-C (Unix/NAS)
./convmv -r -f utf8 -t utf8 --nfc /path/to/dir

#Dry run of converting from UTF8-C (Unix/NAS) to UTF8-D (Mac)
./convmv -r -f utf8 -t utf8 --nfd /path/to/dir

#Actually execute the conversion (no dry run)
./convmv -r -f utf8 -t utf8 --nfc --notest /path/to/dir

After running the tool, all files appear as expected.

Alternative: rsync

If you use rsync to sync files between HFS and Unix file systems, rsync can convert the file names on the fly with its --iconv option. If you run rsync from Mac OS, you first need a recent rsync, because the version shipped with Mac OS is too old to support --iconv. The easiest way to get one is Homebrew.

#Standard Mac OS rsync version is too old
rsync --version

#Install Homebrew
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

#Install most recent rsync version
brew install rsync

#Quit and restart Terminal
#The rsync version is now fine
rsync --version

#Run rsync with --iconv=LOCAL,REMOTE (the charset of the machine running rsync comes first)
rsync --iconv=utf-8-mac,utf-8 SOURCE DEST
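
Before relying on --iconv, it can help to confirm that the rsync found first in your PATH was built with it. The sketch below assumes that rsync lists "iconv" among the capabilities printed by rsync --version (and "no iconv" when it was compiled without it), as rsync 3.x builds do. The function name has_iconv, the host and the paths are placeholders.

```shell
# Succeed if the given "rsync --version" output advertises iconv support.
has_iconv() {
    case "$1" in
        *"no iconv"*) return 1 ;;   # built without iconv
        *iconv*)      return 0 ;;   # capability listed
        *)            return 1 ;;   # too old, or not rsync at all
    esac
}

if has_iconv "$(rsync --version 2>/dev/null)"; then
    echo "this rsync supports --iconv"
else
    echo "this rsync lacks --iconv (or rsync is not installed)"
fi

# Then preview a push from the Mac to the NAS without copying anything
# (-n is a dry run; user, host and both paths are placeholders):
# rsync -a -n -v --iconv=utf-8-mac,utf-8 /Volumes/USB/photos/ user@nas:/volume1/photos/
```

Removing -n performs the actual transfer. Because the order is always LOCAL,REMOTE, the same --iconv value applies when pulling from the NAS to the Mac.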

4 thoughts on “Umlaut/Special Characters Issue over SMB on Mac and Synology NAS”

  1. Thanks a LOT for your post, I’ve finally resolved my issue of having duplicated folders when using french éôèà characters! The rsync solution (which I used anyway) works perfectly.

  2. Not working for me. No matter what I do, copying files from OSX to SMB-mounted NAS share (btrfs-based synology) results in the files being munged to non-UTF-8.

    convmv claims they need to be repaired. If I do the `--notest` it claims they are repaired, but are not. Persists across unmount/remount. Running `convmv` on the NAS-side indicates the files are not in need of conversion.

    `rsync` does correctly convert copying from NAS to OSX, but the other way, the filenames are always munged. They are Japanese filenames and the てんてん is encoded as two UTF-8 codes, rather than one.

    UTF-8: 03-【飛ばないHIIT】20分間で全身の体脂肪を燃やすトレーニング!.mp4
    NAS: 03-【飛は\u{3099}ないHIIT】20分間て\u{3099}全身の体脂肪を燃やすトレーニンク\u{3099}!.mp4

  3. Oh man, after two hours of fooling around with char encodings in smb settings I found this. Hurrah for small web.
    The --iconv setting on rsync fixed it! I didn’t know there even was a utf-8-mac, it all makes sense now.

    Lifesaver, thanks!

  4. *Ps, just to clarify for people having this issue in the future. I had a USB disk mounted on raspbian, exfat, utf8 (check with ‘mount’ command in shell). It displayed the files and chars correctly in a utf8 ssh -> bash shell after rsync, and all files were there. Just the mount on macOS, it either didn’t display any files with special chars in the name (smb in utf8), or displayed those files with wrong chars (smb in latin1).

    So set this in your smb.conf:
    unix charset = utf8
    And use rsync --iconv=utf-8-mac,utf-8 from mac -> raspi (I use SSH instead of mount to rsync files)

    As extra bonus I also set:
    mangled names = no
