this post was submitted on 02 Oct 2023
6 points (100.0% liked)

Programmer Humor

[–] CodexArcanum@lemmy.world 0 points 11 months ago* (last edited 11 months ago) (7 children)

I love how many people brought up the Turkish "I" as if everyone here is on the Unicode steering committee or just got a job at Turkish Facebook.

I, an English speaker, have personally solved the problem by not having a Turkish I in the name of my Downloads directory, or any other directory that I need to cd into on my computer. I'm going to imagine the Turks solve it by painstakingly typing the correct I, or limiting their use of uppercase I's in general.

In fact, researching the actual issue for more than 1 second seemingly shows that Unicode basically created this problem themselves, because the two I's are just separate letters in Turkic languages. https://en.m.wikipedia.org/wiki/Dotted_and_dotless_I_in_computing
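
For anyone who wants to see it bite, here's a rough Python sketch. It relies on Python's built-in `str.lower()`/`str.upper()` ignoring locale and using Unicode's default case mappings; `turkish_lower` and `turkish_upper` are hypothetical helpers I made up for illustration, not anything from a library.

```python
# Python's str methods apply Unicode's locale-independent default case mappings.
dotted_capital = "\u0130"  # İ LATIN CAPITAL LETTER I WITH DOT ABOVE
dotless_small = "\u0131"   # ı LATIN SMALL LETTER DOTLESS I

# Default rules pair I <-> i. Turkish wants I <-> ı and İ <-> i instead.
lowered = dotted_capital.lower()            # "i" plus U+0307 COMBINING DOT ABOVE
round_trip = dotless_small.upper().lower()  # ı -> I -> i, the dotless letter is lost


def turkish_lower(text):
    """Hypothetical helper: lowercase with Turkish rules for the two I's."""
    return text.replace("\u0130", "i").replace("I", "\u0131").lower()


def turkish_upper(text):
    """Hypothetical helper: uppercase with Turkish rules for the two I's."""
    return text.replace("i", "\u0130").replace("\u0131", "I").upper()


print(len(lowered))               # two code points, not one
print(turkish_lower("ISPARTA"))   # starts with dotless ı
print(turkish_upper("istanbul"))  # starts with dotted İ
```

So case conversion only comes out right for Turkish text if the code knows the text is Turkish, which is exactly why it keeps showing up in bug trackers.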

If you nerds think this is bad, try doing PowerShell for any amount of time. It is entirely case-insensitive.

[–] yum13241@lemm.ee 0 points 11 months ago (6 children)

Why the FUCK did they make characters that look the same have different code points in Unicode? They should've done what they did in CJK and made duplicates share the same code point.

Unicode needs a redo.

[–] Tranus@programming.dev 0 points 11 months ago (1 children)

Well letters don't really have a single canonical shape. There are many acceptable ways of rendering each. While two letters might usually look the same, it is very possible that some shape could be acceptable for one but not the other. So, it makes sense to distinguish between them in binary representation. That allows the interpreting software to determine if it cares about the difference or not.
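
A quick illustration of that point, as a sketch using only Python's standard `unicodedata` module (the choice of H as the example letter is mine): three capitals that usually render identically have lowercase forms that look nothing alike, which only works because they are stored as different code points.

```python
import unicodedata

# Latin H, Cyrillic Н (en), Greek Η (eta): near-identical glyphs in most fonts.
lookalikes = ["H", "\u041d", "\u0397"]

for ch in lookalikes:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> lowercase {ch.lower()!r}")

# Each capital lowercases within its own script.
lowercase_forms = [ch.lower() for ch in lookalikes]
```

If the three capitals shared one code point, `lower()` would have no way to know which of h, н, or η was meant.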

Also, the Unicode code tables do mention which characters look (nearly) identical, so it's definitely possible to make a program interpret something like a Greek question mark the same as a semicolon. I guess it's just that no one has bothered, since it's such a rare edge case.
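
For the Greek question mark specifically, standard normalization already does the folding, since U+037E has a canonical decomposition to the ordinary semicolon. A minimal sketch with Python's `unicodedata` (the `fold` helper name is just for illustration):

```python
import unicodedata

GREEK_QUESTION_MARK = "\u037e"  # renders like ";"
CYRILLIC_A = "\u0430"           # renders like Latin "a"


def fold(text):
    """Illustrative helper: apply canonical normalization (NFC)."""
    return unicodedata.normalize("NFC", text)


folded_mark = fold(GREEK_QUESTION_MARK)  # becomes the ASCII semicolon
folded_a = fold(CYRILLIC_A)              # stays Cyrillic, no decomposition exists

print(folded_mark == ";")
print(folded_a == "a")
```

Normalization only covers characters that Unicode defines as equivalent. Lookalikes across scripts, such as the Cyrillic "а", are left alone, so catching those takes a separate confusables mapping.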

[–] yum13241@lemm.ee 0 points 11 months ago (1 children)

Why are the Latin "a" and the Cyrillic "a" THE FUCKING SAME?

[–] mrpants@midwest.social 0 points 11 months ago (1 children)

In cases where something looks stupid but your knowledge of it is almost zero, it's entirely possible that it's not.

The people that maintain Unicode have put a lot of thought and effort into this. Might be helpful to research why rather than assuming you have a better way despite little knowledge of the subject.

[–] yum13241@lemm.ee 0 points 11 months ago (1 children)

When it's A FUCKING SECURITY issue, I know damn well what I'm talking about.

[–] mrpants@midwest.social 0 points 11 months ago (1 children)

Again, you do not, because the world consists of more than your interests and job description.

[–] yum13241@lemm.ee 0 points 11 months ago

I know damn well what I'm talking about when someone could get scammed on "apple.com" but with a Cyrillic A.
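
To be concrete, here's a rough sketch of the kind of check that catches it. This is a heuristic I'm assuming for illustration, not how any particular browser or registrar does it: it guesses each letter's script from the first word of its Unicode name and flags labels that mix scripts. `scripts_in` and `looks_spoofed` are made-up names.

```python
import unicodedata


def scripts_in(label):
    """Rough script guess: the first word of each letter's Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}


def looks_spoofed(label):
    """Flag a label whose letters come from more than one script."""
    return len(scripts_in(label)) > 1


real = "apple"
fake = "\u0430pple"  # first letter is CYRILLIC SMALL LETTER A

print(real == fake)                # the strings differ even if they render alike
print(sorted(scripts_in(fake)))    # both scripts show up
print(looks_spoofed(real), looks_spoofed(fake))
```

It won't catch a name written entirely in Cyrillic lookalikes, so it's a starting point at best.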
