Previously, I ran into an issue with pasting non-ASCII characters into Emacs on my Windows system. For example, pasting a Greek small letter rho (ρ) would result in a question mark (?) appearing in the Emacs editor. Oddly enough, copying Unicode characters from Emacs generally worked fine.
Originally, I assumed it was a bug in Emacs, but I couldn’t find any bug reports about it, so that seemed unlikely. Eventually, I came to realize that one of the settings in my Emacs profile (init.el) caused the problem:
(set-selection-coding-system 'utf-8) ; wrong
As it turns out, the correct value is 'utf-16-le, because the Windows API represents Unicode text as little-endian UTF-16:
(set-selection-coding-system 'utf-16-le) ; correct
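If the same init.el is shared across machines, the setting can be limited to Windows. The following is a minimal sketch, assuming the standard `system-type` variable is an acceptable way to detect the platform; the guard is my addition and not part of the original configuration.

```elisp
;; Apply the UTF-16 clipboard setting only on native Windows builds.
;; Other platforms keep whatever selection coding system they default to.
(when (eq system-type 'windows-nt)
  (set-selection-coding-system 'utf-16-le))
```

On other systems the form does nothing, so the clipboard behavior there stays as it was.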
The problem was introduced a while back as part of a hack to force Emacs to use UTF-8 as the default buffer encoding. At the time, I copied the code from this StackOverflow answer (fixed now) without too much thought, not realizing it would cause a bug that I would only discover several months later. It turns out that set-selection-coding-system and set-clipboard-coding-system actually do the same thing, but the connection between selections and clipboards is not exactly obvious until one becomes familiar with the Emacs jargon.
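One way to check the relationship between the two names in a running Emacs is to compare the function definitions they resolve to. This is only a sketch for evaluating in the scratch buffer or with M-:, and the outcome may depend on the Emacs version in use.

```elisp
;; Resolve both symbols through any aliases and compare the results.
;; A result of t would mean both names point at the same function.
(eq (indirect-function 'set-clipboard-coding-system)
    (indirect-function 'set-selection-coding-system))
```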
Fortunately, the choice of the selection-coding-system doesn’t affect the encoding of the buffer itself, so it doesn’t invalidate the hack I used to enforce UTF-8 as the default buffer encoding.
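To illustrate that the two settings are separate, here is a sketch of one common way to make UTF-8 the default for buffers, followed by a quick inspection of both values. The UTF-8 lines are an assumption for illustration only; they are not necessarily the hack referred to above.

```elisp
;; One common way to prefer UTF-8 for new buffers and files.
;; (Assumption: shown for illustration; the original hack may differ.)
(set-default-coding-systems 'utf-8)
(prefer-coding-system 'utf-8)

;; Print the default buffer encoding and the clipboard encoding side by side.
;; `bound-and-true-p' guards against builds where the variable is undefined.
(message "buffer default: %S, selection: %S"
         (default-value 'buffer-file-coding-system)
         (bound-and-true-p selection-coding-system))
```

The message goes to the echo area and the *Messages* buffer, which makes it easy to confirm that changing one value leaves the other alone.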