Printable ASCII characters >=128 show up on Latin1 terminals as <xx> #508

@h3xx

Description

On non-UTF-8 terminals, every printable extended ASCII character >= 128 (0x80) is replaced with <xx>, where xx is the hex code, regardless of whether LESSCHARDEF flags it as printable.

They displayed properly until recently.

I already did some of the legwork tracking this down using git bisect.
I'm fairly certain commit a60cc1b (2024-04-23, "Fix bug in control_char().") introduced this regression.

Steps to Repro

export LANG=en_US LC_CTYPE=en_US
# less *does* detect the charset from the locale, so that's not the issue. LESSCHARSET is set here only to be absolutely sure.
export LESSCHARSET=latin1 LESSCHARDEF=8bcccbcc18b95.33b.
# Test this as well:
#export LESSCHARDEF=.
for chr in {239..255}; do
    printf '%b' "\\x$(printf '%x' "$chr")"
done | less
  • Expected: ASCII extended characters 239-255 (0xEF-0xFF) should show up as ïðñòóôõö÷øùúûüýþÿ because they're in the printable range set by LESSCHARDEF. They displayed properly in less v653 and all git versions up until the commit mentioned above.
  • Actual: <EF><F0><F1><F2><F3><F4><F5><F6><F7><F8><F9><FA><FB><FC><FD><FE><FF> shows.
  • Tested in version: 57d6622
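
To rule out the shell loop itself as the source of the problem, the generated bytes can be checked independently of less. The following is only a sketch: `gen_bytes` is a hypothetical helper name, and it uses octal `printf` escapes instead of the `\x` escapes in the repro above, on the assumption that octal escapes are more portable across POSIX shells. It should emit the same 17 bytes (0xEF through 0xFF).

```shell
# Hypothetical helper: emit the raw bytes 239..255 (0xEF..0xFF),
# the same range as the repro loop, using portable octal escapes.
gen_bytes() {
    chr=239
    while [ "$chr" -le 255 ]; do
        # Inner printf yields the octal value (e.g. 357);
        # outer printf turns \357 into the single byte 0xEF.
        printf "\\$(printf '%o' "$chr")"
        chr=$((chr + 1))
    done
}

# Dump the bytes as one contiguous lowercase hex string.
gen_bytes | od -An -v -tx1 | tr -d ' \t\n'
echo
# prints: eff0f1f2f3f4f5f6f7f8f9fafbfcfdfeff
```

If the hex dump matches, the input reaching less is correct and the <xx> rendering comes from less itself rather than from how the test bytes were produced.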
