txt.sour.is

hacker-news-front-page

feeds.twtxt.net

Wed, May 27 07:28 (7w ago)

Unicode 18.0.0 Beta
Article URL: https://www.unicode.org/versions/Unicode18.0.0/

Comments URL: https://news.ycombinator.com/item?id=48290881

Points: 9

# Comments: 1 ⌘ Read more

⤋ Read More

movq

www.uninformativ.de

Fri, Jan 16 17:46 (26w ago)

I recently got an email with this byte sequence:

\xf0\x9f\x8e\x81\xf0\x9f\x95\xaf\xef\xb8\x8f

That’s U+1F381, U+1F56F, U+FE0F. The last one is a “variation selector”:

https://unicodeplus.com/U+FE0F
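
For reference, a small Python sketch (stdlib only, not part of the original post) that decodes the byte sequence above and lists each code point. The variable names are illustrative.

```python
import unicodedata

# Byte sequence quoted in the post above.
raw = b"\xf0\x9f\x8e\x81\xf0\x9f\x95\xaf\xef\xb8\x8f"

text = raw.decode("utf-8")
code_points = [f"U+{ord(ch):04X}" for ch in text]

for ch, cp in zip(text, code_points):
    # name() falls back to a placeholder if the local Unicode
    # database does not know the character.
    print(cp, unicodedata.name(ch, "<unknown>"))
```

Eleven bytes decode to three code points. How many terminal cells those three should occupy is a separate question, and that is the part that goes wrong in the renderers mentioned here.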

My toolkit renders this incorrectly – and so do tmux and GNU screen.

Unicode ain’t easy. 🥴

⤋ Read More

@villares@ciberlandia.pt

ciberlandia.pt

Fri, Jan 16 10:49 (26w ago)

Could it be that Source Sans Pro changed recently? No… Somehow at some point ✳ was replaced with ⚹ in my markdown files… I have no idea how this happened.
#Unicode #Typography

Image

⤋ Read More

movq

www.uninformativ.de

Thu, Jan 15 19:18 (26w ago)

↳ In-reply-to » Some work on the menu system to brighten my mood a little bit. No mouse support yet.

@bender@twtxt.net I’m already using it for tracktivity (meant for tracking activities and events, like weather, food consumption, stuff like that), which is basically a somewhat-fancy CSV editor:

Image

I have a couple of other projects where I could use it, because they are plain curses at the moment. Like, one of them has an “edit box”, but you can’t enter Unicode, because it was too complicated. That would benefit from the framework.

Either way, it’s the most satisfying project in a long time and I’m learning a ton of stuff.

⤋ Read More

lyse

lyse.isobeef.org

Thu, Jan 15 19:15 (26w ago)

↳ In-reply-to » Here am I looking at the different tcell.Key constants and typing different key combinations in the terminal to see the generated tcell.EventKeys in the debug log. Until I pressed Ctrl+Alt+Backspace… :-D Yep, suddenly there went my X…

@movq@www.uninformativ.de Yeah, I know that terminals are super weird and messy. In both the KDE Konsole (identifying itself as TERM=xterm-256color) and xterm (TERM=xterm) it just works flawlessly. My urxvt (TERM=rxvt-unicode-256color) just doesn’t. I also tried messing with TERM in urxvt, but no luck so far.

⤋ Read More

movq

www.uninformativ.de

Sat, Jan 3 19:12 (28w ago)

More widget system progress:

https://movq.de/v/87e2bce376/vid-1767467193.mp4

I like the oldschool shadow effect. 😅 Not sure if I’ll keep it, but it’s neat.

The menu bar is still fake.

Had to spend quite a bit of time optimizing the rendering today. This can get really slow really quickly.

Unicode is Pain.

I might be able to start porting my first program (currently uses urwid) soon. 🤔

⤋ Read More

movq

www.uninformativ.de

Wed, Dec 31 08:29 (28w ago)

Why have these Unicode smilies never caught on, I wonder? 🤪

𜲦𜲩 𜲨𜲩 𜲦𜲧
𜲪𜲫 𜲮𜲯 𜲰𜲱

⤋ Read More

movq

www.uninformativ.de

Tue, Dec 30 14:16 (29w ago)

Well, you girls and guys are making cool things, and I have some progress to show as well. 😅

https://movq.de/v/c0408a80b1/movwin.mp4

Scrolling widgets appears to work now. This is (mostly) Unicode-aware: Note how emojis like “😅” are double-width “characters” and the widget system knows this. It doesn’t try to place a “😅” in a location where there’s only one cell available.

Same goes for that weird “ä” thingie, which is actually “a” followed by U+0308 (a combining diacritic). Python itself thinks of this as two “characters”, but they only occupy one cell on the screen. (Assuming your terminal supports this …)

This library does the heavy Unicode lifting: https://github.com/jquast/wcwidth (Take a look at its implementation to learn how horrible Unicode and human languages are.)
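
The two cases above can be roughly illustrated with Python's standard library alone. This is only an approximation under simple assumptions (zero width for combining marks and format characters, two cells for East Asian Wide/Fullwidth characters); it is not how wcwidth is implemented, and cell_width/text_width are hypothetical helper names.

```python
import unicodedata

def cell_width(ch: str) -> int:
    """Approximate number of terminal cells for one code point."""
    # Combining marks and format characters attach to the previous cell.
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    # East Asian Wide/Fullwidth characters (this covers many emoji) take two cells.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1

def text_width(text: str) -> int:
    """Sum of the approximate cell widths of all code points."""
    return sum(cell_width(ch) for ch in text)

decomposed = "a\u0308"  # "a" followed by U+0308 COMBINING DIAERESIS
print(len(decomposed), text_width(decomposed))  # code points vs. cells
print(len("😅"), text_width("😅"))
```

Here len() counts code points while text_width() estimates screen cells, which is the distinction a widget system has to track when deciding whether something fits.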

The program itself looks like this, it’s a proper widget hierarchy:

Image

(There is no input handling yet, hence some things are hardwired for the moment.)

⤋ Read More

lyse

lyse.isobeef.org

Mon, Dec 29 17:00 (29w ago)

↳ In-reply-to » @lyse I’m toying with the idea of making a widget/window system on top of Python’s ncurses. I’ve never really been happy with the existing ones (like urwid, textual, pytermgui, …). I mean, they’re not horrible, it’s mostly the performance that’s bugging me – I don’t want to wait an entire second for a terminal program to start up.

@movq@www.uninformativ.de I see. Yeah, all the Unicode stuff certainly doesn’t help here, that’s for sure.

Maybe “speedcurses” could be a name. Or just select any Palatinate curse. ;-)

⤋ Read More

movq

www.uninformativ.de

Mon, Dec 29 16:19 (29w ago)

↳ In-reply-to » Trying to come up with a name for a new project and every name is already taken. 🤣 The internet is full!

@lyse@lyse.isobeef.org I’m toying with the idea of making a widget/window system on top of Python’s ncurses. I’ve never really been happy with the existing ones (like urwid, textual, pytermgui, …). I mean, they’re not horrible, it’s mostly the performance that’s bugging me – I don’t want to wait an entire second for a terminal program to start up.

Not sure if I’ll actually see it through, though. Unicode makes this kind of thing extremely hard. 🫤

⤋ Read More

@villares@ciberlandia.pt

ciberlandia.pt

Mon, Oct 6 12:39 (41w ago)

Fun video about #Unicode #UTF8. I knew about the historical context and fundamental implementation ideas already, but I didn’t know about the Hangul combinations block trick mentioned in the end… clever stuff.

https://www.youtube.com/watch?v=vpSkBV5vydg

⤋ Read More

movq

www.uninformativ.de

Wed, Jul 23 14:05 2025 (1y ago)

↳ In-reply-to » @movq I fully agree with you on https://www.uninformativ.de/blog/postings/2025-07-22/0/POSTING-en.html!

@lyse@lyse.isobeef.org The underlines are a bit much, yes. It appears to be related to my font (Helvetica) … Maybe they do some Unicode trickery these days, I don’t know. 🫤

⤋ Read More

movq

www.uninformativ.de

Sun, Jun 15 08:48 2025 (1y ago)

fn sub(foo: &String) {
    println!("We got this string: [{}]", foo);
}

fn main() {
    // "Hello", 0x00, 0x00, "!"
    let buf: [u8; 8] = [0x48, 0x65, 0x6C, 0x6C, 0x6F, 0x00, 0x00, 0x21];

    // Create a string from the byte array above, interpret as UTF-8, ignore decoding errors.
    let lossy_unicode = String::from_utf8_lossy(&buf).to_string();

    sub(&lossy_unicode);
}

Create a string from a byte array, but the result isn’t a string, it’s a cow 🐮, so you need another to_string() to convert your “string” into a string.

I still have a lot to learn.

(into_owned() instead of to_string() also works and makes more sense to me, it’s just that the compiler suggested to_string() first, which led to this funny example.)

⤋ Read More

movq

www.uninformativ.de

Sun, May 18 18:23 2025 (1y ago)

↳ In-reply-to » And to finish the day: Om Live at Pioneer Works 🤘 – https://www.youtube.com/watch?v=IwnDKcoVHmY

(Why is there no bass emoji in Unicode? Pah.)

⤋ Read More

lobste_rs

feeds.twtxt.net

Fri, May 16 13:31 2025 (1y ago)

Detecting malicious Unicode
Comments ⌘ Read more

⤋ Read More

lyse

lyse.isobeef.org

Fri, Mar 21 22:30 2025 (1y ago)

Hmmm, when I Ctrl+Left to jump a word left, I get 1;5D in my tt2 message text. My TERM is set to rxvt-unicode-256color. In tt, it works just fine. When I change to TERM=xterm-256color, it also works in tt2. I have to read up on that. Maybe even try to capture these sequences and rewrite them.
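
The stray 1;5D looks like the tail of an xterm-style key sequence, ESC [ 1 ; 5 D, in which the second parameter encodes modifiers as 1 plus a bitmask (Shift=1, Alt=2, Ctrl=4). A minimal Python sketch of decoding such sequences follows. decode_key and its tables are hypothetical names, and this is not how tt2 or tcell handle input.

```python
import re
from typing import Optional

# Matches xterm-style modified cursor keys such as ESC [ 1 ; 5 D.
CSI_KEY = re.compile(r"\x1b\[1;(\d+)([ABCDHF])")
KEYS = {"A": "Up", "B": "Down", "C": "Right", "D": "Left", "H": "Home", "F": "End"}
MODIFIERS = ((1, "Shift"), (2, "Alt"), (4, "Ctrl"))

def decode_key(seq: str) -> Optional[str]:
    """Return a readable key name, or None if the sequence is not recognized."""
    m = CSI_KEY.fullmatch(seq)
    if m is None:
        return None
    mask = int(m.group(1)) - 1  # the parameter is 1 + modifier bitmask
    mods = [name for bit, name in MODIFIERS if mask & bit]
    return "+".join(mods + [KEYS[m.group(2)]])

print(decode_key("\x1b[1;5D"))
```

A client that wanted to rewrite sequences could map the decoded name to whatever its input layer expects. Which sequences a given terminal actually sends still has to be captured and checked.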

⤋ Read More

lyse

lyse.isobeef.org

Thu, Jan 30 17:30 2025 (1y ago)

↳ In-reply-to » Today at work we're rolling out a big update for the CMS of the central websites. Hopefully it all goes well. 😱

@arne@uplegger.eu Oh dear, TYPO3! O_o Let me run away screaming!

We once did a student project with this absolute catastrophe of a piece of software. It more than lived up to every prejudice. It started with error pages that were always delivered with HTTP 200 instead of 4xx or the like, and with generated HTML that was simply invalid. Then there was deletion implemented as a "deleted" flag in the database and passwords stored in plain text, all the way to thoroughly cumbersome usage concepts. Everything always took a brutal number of steps. The line-number juggling in TYPO-Script was more reminiscent of Basic. It also seemed to us that you couldn't seriously do anything useful with it.

To top it all off, someone had dug up a truly abysmal book that was supposed to serve as preparation. Fortunately I can remember neither the title nor the author, but I still remember how completely inconsistently it was written. At the beginning there were several pages on Unicode, and UTF-8 was praised, but then all the examples used ISO-8859-1. The example code shown was often bottom of the barrel. Rarely have I read such strange explanations: “If the security warnings bother you, please comment out the die() function in $LINE in the source code.” Or another classic: “Written out, the code would probably do the following…”. So the author wasn't quite sure whether his code snippet might actually do something entirely different.

Ever since this gigantic trauma (it really left a lasting impression on me regarding how not to do things), I have successfully steered clear of the TYPO3 universe.

I can only hope that it has gotten a little better in the meantime. But judging by your short report, something still seems to be badly wrong there. My condolences! :-(

⤋ Read More

lyse

lyse.isobeef.org

Thu, Nov 7 20:15 2024 (1y ago)

↳ In-reply-to » I've been thinking of a few improvements for the next generation of twtxt spec, let me know if these are useful or interesting :) https://text.eapl.mx/a-few-ideas-for-a-next-twtxt-version

Righto, @eapl.me@eapl.me, ta for the writeup. Here we go. :-)

Metadata on individual twts are too much for me. I do like the simplicity of the current spec. But I understand where you’re coming from.

Numbering twts in a feed is basically the attempt of generating message IDs. It’s an interesting idea, but I reckon it is not even needed. I’d simply use location based addressing (feed URL + ‘#’ + timestamp) instead of content addressing. If one really wanted to, one could hash the feed URL and timestamp, but the raw form would actually improve discoverability and would not even require a richer client. But the majority of twtxt users in the last poll wanted to stick with content addressing.
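
The location-based form described above can be sketched in a few lines of Python. The function names, the example URL, the choice of SHA-256 and the 12-character truncation are all assumptions for illustration; none of them come from a twtxt spec.

```python
import hashlib

def location_id(feed_url: str, timestamp: str) -> str:
    """Raw location-based address: feed URL + '#' + timestamp."""
    return f"{feed_url}#{timestamp}"

def hashed_location_id(feed_url: str, timestamp: str) -> str:
    """Optional hashed variant; hash function and length are arbitrary choices here."""
    raw = location_id(feed_url, timestamp)
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:12]

# Hypothetical feed URL and timestamp.
print(location_id("https://example.com/twtxt.txt", "2024-11-07T20:15:00Z"))
print(hashed_location_id("https://example.com/twtxt.txt", "2024-11-07T20:15:00Z"))
```

The raw form stays readable and points straight at the feed, while the hashed form hides where the twt lives, which is the discoverability trade-off mentioned above.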

yarnd actually sends If-Modified-Since request headers. Not only can I observe heaps of 304 responses for yarnds in my access log, but in Cache.FetchFeeds(…) we can actually see If-Modified-Since being deployed when the feed has been retrieved with a Last-Modified response header before: https://git.mills.io/yarnsocial/yarn/src/commit/98eee5124ae425deb825fb5f8788a0773ec5bdd0/internal/cache.go#L1278

Turns out etags with If-None-Match are only supported when yarnd serves avatars (https://git.mills.io/yarnsocial/yarn/src/commit/98eee5124ae425deb825fb5f8788a0773ec5bdd0/internal/handlers.go#L158) and media uploads (https://git.mills.io/yarnsocial/yarn/src/commit/98eee5124ae425deb825fb5f8788a0773ec5bdd0/internal/media_handlers.go#L71). However, it ignores possible etags when fetching feeds.
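
As a rough illustration of the two mechanisms discussed here, this Python sketch builds the conditional request headers a client could send when refetching a feed. The header names are standard HTTP; the function name and the example values are hypothetical, and this is not yarnd's Go code.

```python
from typing import Dict, Optional

def conditional_headers(last_modified: Optional[str] = None,
                        etag: Optional[str] = None) -> Dict[str, str]:
    """Build request headers from validators remembered from the previous response."""
    headers: Dict[str, str] = {}
    if last_modified:
        # Value taken from the earlier Last-Modified response header.
        headers["If-Modified-Since"] = last_modified
    if etag:
        # Value taken from the earlier ETag response header.
        headers["If-None-Match"] = etag
    return headers

print(conditional_headers("Thu, 07 Nov 2024 20:15:00 GMT", '"abc123"'))
```

A server that supports the validator can answer 304 Not Modified with no body, so the feed is only transferred when it has changed.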

I don’t understand how the discovery URLs should work to replace the User-Agent header in HTTP(S) requests. Would you mind elaborating?

Different protocols are basically just a client thing.

I reckon it’s best to just avoid mixing several languages in one feed in the first place. Personally, I find it okay to occasionally write messages in other languages, but if that happens on a more regular basis, I’d definitely create a different feed for other languages.

Isn’t the emoji thing “just” a client feature? So, feeds do not even have to state any emojis. As a user I’d configure my client to use a certain symbol for feed ABC. Currently, I can do a similar thing in tt where I assign colors to feeds. On the other hand, what if a user wants to control what symbol should be displayed, similar to the feed’s nick? Hmm. But still, my terminal font doesn’t even render most emojis. So, Unicode boxes everywhere. This makes me think it should actually be a client-only feature.

⤋ Read More

prologic-twtxt-atom-feed

feeds.twtxt.net

Fri, Nov 1 23:12 2024 (1y ago)

(#fmnhewq) @bender@bender Which feed has Unicode newlines in the desc? Hmm 🧐

⤋ Read More

falsifian

www.falsifian.org

Wed, Oct 30 17:44 2024 (1y ago)

↳ In-reply-to » @prologic I'm not a yarnd user, so it doesn't matter a whole lot to me, but FWIW I'm not especially keen on changing how I format my twts to work around yarnd's quirks.

@bender@twtxt.net @prologic@twtxt.net I’m not exactly asking yarnd to change. If you are okay with the way it displayed my twts, then by all means, leave it as is. I hope you won’t mind if I continue to write things like 1/4 to mean “first out of four”.

What has text/markdown got to do with this? I don’t think Markdown says anything about replacing 1/4 with ¼, or other similar transformations. It’s not needed, because ¼ is already a unicode character that can simply be directly inserted into the text file.

What’s wrong with my original suggestion of doing the transformation before the text hits the twtxt.txt file? @prologic@twtxt.net, I think it would achieve what you are trying to achieve with this content-type thing: if someone writes 1/4 on a yarnd instance or any other client that wants to do this, it would get transformed, and other clients simply wouldn’t do the transformation. Every client that supports displaying unicode characters, including Jenny, would then display ¼ as ¼.
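
A minimal Python sketch of that write-time transformation, assuming a client wanted to do it at all. The mapping, the regular expression and the function name are hypothetical and are not taken from yarnd or Jenny.

```python
import re

# Only a few common fractions, chosen for illustration.
FRACTIONS = {"1/4": "¼", "1/2": "½", "3/4": "¾"}

# Require that the match is not part of a longer number or path like 11/4 or 1/42.
PATTERN = re.compile(r"(?<![\d/])(1/4|1/2|3/4)(?![\d/])")

def prettify(text: str) -> str:
    """Replace plain-text fractions before the twt is written to the feed file."""
    return PATTERN.sub(lambda m: FRACTIONS[m.group(1)], text)

print(prettify("add 1/4 cup of sugar"))
print(prettify("part 1/4 of this thread"))
```

The second call shows the downside: a pattern match cannot tell “a quarter” from “first out of four”, so doing this only in the client that composes the twt leaves the choice with the author.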

Alternatively, if you prefer yarnd to pretty-print all twts nicely, even ones from simpler clients, that’s fine too and you don’t need to change anything. My 1/4 -> ¼ thing is nothing more than a minor irritation which probably isn’t worth overthinking.

⤋ Read More

niplav

niplav.site

Mon, May 8 20:43 2023 (3y ago)

Unicode doesn’t distinguish between a dollar sign with one and a dollar sign with two strokes, which makes me sad.

⤋ Read More

xkcd-com

feeds.twtxt.net

Fri, Nov 18 00:00 2022 (3y ago)

Account Problems
⌘ Read more

⤋ Read More

xkcd-com

feeds.twtxt.net

Wed, Apr 13 00:00 2022 (4y ago)

Weird Unicode Math Symbols
⌘ Read more

⤋ Read More

niplav

niplav.site

Tue, Feb 15 23:47 2022 (4y ago)

someday i will descend upon the unicode consortium and add sub/superscript version of the whole latin alphabet

⤋ Read More

xuu

txt.sour.is

Fri, Nov 19 20:50 2021 (4y ago)

@lyse@lyse.isobeef.org What the heck? no emoji? do you even Unicode!

⤋ Read More

meff

yarn.meff.me

Tue, Nov 9 06:21 2021 (4y ago)

↳ In-reply-to » Ugh why does Emojipedia sell my data. This is so silly.

@prologic@twtxt.net Yeah like normally I’m just a little annoyed and just say “whatever” and shrug it off, but come on I am searching for emojis here. Do you really need to harvest my user data for what is essentially a fuzzy search in the Unicode table?

⤋ Read More

xuu

txt.sour.is

Tue, Nov 2 17:59 2021 (4y ago)

↳ In-reply-to » How fair ye î̸͚n̸͔͋ ̴̰̃t̸̲͝ḧ̸͙́e̴̱͛ ̸̈́ͅd̷̜̕e̵̬̚p̷̨̽t̴͍͆h̶͙̓ṡ̶̩o̵̪̎f̴̧̉ ̵̳̄̄Z̸̩̗̉͊̎a̸͎̹͚̓̌͋l̸͎̰̤̚g̸̛̖̬͇̾ö̵̲͖?̸̫̦̉̇ͅ ̷̡͚̑̓͊

@prologic@twtxt.net lol. just testing some Unicode.

⤋ Read More

newest_python_peps

feeds.twtxt.net

Mon, Nov 1 00:00 2021 (4y ago)

PEP 672: Unicode-related Security Considerations for Python
This document explains possible ways to misuse Unicode to write Python
programs that appear to do something else than they actually do. ⌘ Read more
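
One kind of misuse in this area is a look-alike character inside an identifier. The Python sketch below simply lists every non-ASCII character in a piece of source text with its code point and name. It is a crude check under that one assumption, the function name is hypothetical, and it is not a tool described by the PEP.

```python
import unicodedata
from typing import List, Tuple

def non_ascii_chars(source: str) -> List[Tuple[int, str, str]]:
    """Return (index, code point, name) for every non-ASCII character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for i, ch in enumerate(source)
        if ord(ch) > 127
    ]

# "scope" with a Cyrillic letter in place of the Latin "c".
print(non_ascii_chars("s\u0441ope"))
```

Listing names makes the substitution visible even when the two glyphs render identically in the editor font.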

⤋ Read More

jcolag

john.colagioia.net

Wed, Sep 29 11:38 2021 (4y ago)

On the blog: Where Have All the Emoji Gone? https://john.colagioia.net/blog/2021/09/29/emoji.html #programming #techtips #unicode #blog

⤋ Read More

gugod

gugod.org

Wed, Sep 22 03:35 2021 (4y ago)

https://metacpan.org/release/WOLFSAGE/perl-5.35.4/changes#Unicode-14.0-is-supported As of Perl 5.35.4, the supported Unicode version has advanced to 14.0.0.

⤋ Read More

xuu

txt.sour.is

Mon, Jul 26 20:30 2021 (5y ago)

@prologic@twtxt.net should we enable all unicode glyphs for tags? https://txt.sour.is/conv/55yrura

⤋ Read More

anth

a.9srv.net

Sat, Jul 3 21:44 2021 (5y ago)

I wrote a ‘banner’-like program for Plan 9 (and p9p) that uses the Unicode box drawing characters: http://txtpunk.com/banner/index.html

⤋ Read More

niplav

niplav.site

Sat, Apr 17 10:13 2021 (5y ago)

huh, txtnish seems to have problems with linebreaks & unicode.

⤋ Read More

www-synkretie-net-twtxt

www.synkretie.net

Sun, Feb 28 02:55 2021 (5y ago)

riding an experimental font renderer through unicode’s outer planes, I found curvy, pointy, zigzaggy, cloudy shapes, glyphs that span lines and punch through pages, diacritics that reach back in time, the entire 2045 uplifted octopus emoji set,

⤋ Read More

luke_smiths_blog

feeds.twtxt.net

Thu, Jul 9 20:47 2020 (6y ago)

Accented and other unicode characters in groff/troff ⌘

⤋ Read More

newest_python_peps

feeds.twtxt.net

Thu, Jul 9 01:53 2020 (6y ago)

PEP 623: Remove wstr from Unicode ⌘ http://www.python.org/dev/peps/pep-0623

⤋ Read More

newest_python_peps

feeds.twtxt.net

Thu, Jul 9 01:53 2020 (6y ago)

PEP 624: Remove Py_UNICODE encoder APIs ⌘ http://www.python.org/dev/peps/pep-0624

⤋ Read More

www.lord-enki.net

Thu, Jan 24 16:00 2019 (7y ago)

Teletext graphics characters among those added to Unicode – Teletext Art http://teletextart.co.uk/teletext-graphics-characters-among-those-added-to-unicode

⤋ Read More

www.lord-enki.net

Sun, Oct 28 14:01 2018 (7y ago)

Because of the use of ‘rune’ to refer to Unicode code points in Go, a futhark transliteration program might have somewhat confusing source…

⤋ Read More

www-synkretie-net-twtxt

www.synkretie.net

Thu, Sep 6 22:05 2018 (7y ago)

unicode 2040 features (3):
- MIDI ligatures that play notes as you read them
- unicode block characters (shy, aggressive, deceitful, …)
- unicode block [alien script reconstructed from fossil graffiti]

⤋ Read More

www-synkretie-net-twtxt

www.synkretie.net

Thu, Sep 6 22:02 2018 (7y ago)

unicode 2040 features (2):
- support for 2D writing systems, lines can now turn by multiples of 45°
- in-band parsing directives for ligatures (finally with context-free grammar support)
- in-band layout directives
- camera glyph can now store up to ten seconds of video

⤋ Read More

www-synkretie-net-twtxt

www.synkretie.net

Thu, Sep 6 22:00 2018 (7y ago)

unicode 2040 features:
- amoji: emoji, but abstractified like any ideographic writing system in active usage
- hypermoji: animated, holographic, interactive, moody, you name it
- U+FFF6 ᴄʜᴀʀᴀᴄᴛᴇʀ ᴜɴᴀᴠᴀɪʟᴀʙʟᴇ ɪɴ ʏᴏᴜʀ ᴊᴜʀɪsᴅɪᴄᴛɪᴏɴ (usually a transparent square)

⤋ Read More

www.lord-enki.net

Mon, Jul 30 12:41 2018 (8y ago)

A Spectre is Haunting Unicode https://www.dampfkraft.com/ghost-characters.html

⤋ Read More

www.lord-enki.net

Fri, Nov 17 16:11 2017 (8y ago)

unum - Interconvert numbers, Unicode, and HTML/XHTML characters http://www.fourmilab.ch/webtools/unum/

⤋ Read More