Text Subtleties
I just noticed that when wget tells you the filename of the file it
just saved, if your LANG=C then it surrounds it with apostrophes ('),
but if your LANG=en_US.UTF-8 then it surrounds it with Unicode LEFT
SINGLE QUOTATION MARK (‘) and RIGHT SINGLE QUOTATION MARK (’). I
appreciate little subtleties like that.
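The behaviour can be imitated in a few lines of Python. This is only a sketch of the idea, not how wget does it: the function name quote_filename is made up for illustration, and it looks at nothing but the codeset part of a LANG-style string (a real program would also honour LC_ALL and LC_MESSAGES).

```python
def quote_filename(name: str, lang: str) -> str:
    """Quote a filename for display, choosing quote marks from a LANG value.

    Hypothetical helper: it inspects only the codeset of a string such as
    "en_US.UTF-8" and ignores LC_ALL, LC_MESSAGES and message catalogs.
    """
    # "en_US.UTF-8@euro" -> "UTF-8" -> "utf8"; plain "C" yields "".
    codeset = lang.partition(".")[2].partition("@")[0].lower().replace("-", "")
    if codeset == "utf8":
        # LEFT and RIGHT SINGLE QUOTATION MARK
        return "\u2018" + name + "\u2019"
    # Fall back to plain ASCII apostrophes.
    return "'" + name + "'"


if __name__ == "__main__":
    print(quote_filename("index.html", "C"))
    print(quote_filename("index.html", "en_US.UTF-8"))
```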
I use Unicode characters in most of the writing I do. For LaTeX,
which I rarely use these days, I use XeTeX, which understands UTF-8
natively. ConTeXt, which I do use regularly, also understands UTF-8
natively. For groff I use the -k switch, which preprocesses the text
with preconv (which is part of groff), converting the UTF-8
characters into groff character escapes, since groff doesn't
understand UTF-8 natively. Of course, if it is reStructuredText that
I'm working with then pandoc can be configured to use any one of
LaTeX, ConTeXt, and groff for creating PDF output, and since
rst2latex.py just produces LaTeX that includes any
character you put in your source, you can just use xelatex
as part of your commands to turn it into PDF. And sometimes, when I'm
feeling whimsical, I use Heirloom Troff, from the Heirloom
Documentation Tools, which understands UTF-8 natively.
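To give a feel for what the groff -k preprocessing step mentioned above amounts to, here is a small Python sketch that rewrites non-ASCII characters as groff \[uXXXX] escapes. It is an approximation under my own assumptions, not preconv itself: the name to_groff_escapes is invented for the example, and the real preconv also handles things like encoding detection that are left out here.

```python
def to_groff_escapes(text: str) -> str:
    """Replace every non-ASCII character with a groff \\[uXXXX] escape.

    Hypothetical helper that mimics only the character-escaping part of
    what preconv does; input is assumed to be already-decoded text.
    """
    pieces = []
    for ch in text:
        codepoint = ord(ch)
        if codepoint < 128:
            # ASCII passes through untouched.
            pieces.append(ch)
        else:
            # Uppercase hex, padded to at least four digits.
            pieces.append("\\[u%04X]" % codepoint)
    return "".join(pieces)


if __name__ == "__main__":
    print(to_groff_escapes("It\u2019s a caf\u00e9"))
```

The output of a function like this is plain ASCII, which is why the result can be handed to a formatter that does not read UTF-8 itself.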
Last edited: 2020-08-03 16:02:52 EDT