@anth@a.9srv.net (I’m also a bit confused by the UTF-8 topic. I thought that the original twtxt spec has always mandated UTF-8 for the content. Why’s that an issue now? 😅 Granted, my client also got this wrong in the past, but that was fixed ~3 years ago.)
@movq@www.uninformativ.de If my memory serves me right, I think v2 doesn’t mention UTF-8 at all. Then I came along and noted that a bare Content-Type: text/plain might not be enough, as the HTTP spec defaults to Latin1 or whatever, not UTF-8. So there is a gap or room for incorrect interpretation. I could be wrong, but I understand @anth@a.9srv.net’s comment to mean that he doesn’t even want a Content-Type header in the first place.
I reckon it should be optional, but when deciding to send one, it should be Content-Type: text/plain; charset=utf-8. That also helps browsers pick up the right encoding right away without guessing wrong (which basically always happens with Firefox here). That aids people who read raw feeds in browsers for debugging or whatnot. (I sometimes do that to decide if there is enough interesting content to follow the feed at hand.)
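To illustrate the gap: a rough Python sketch of what a client sees in the two header variants. The helper name declared_charset is made up for this example, and it only shows how the header parses, not what any particular twtxt client does.

```python
from email.message import Message


def declared_charset(content_type):
    """Return the charset named in a Content-Type value, or None.

    Hypothetical helper for illustration; it leans on the stdlib
    email parser to pull out the charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


# A bare text/plain names no charset, so the reader has to guess.
bare = declared_charset("text/plain")

# With the parameter present there is nothing left to guess.
explicit = declared_charset("text/plain; charset=utf-8")
```

With the bare header the helper returns None, which is exactly the spot where a browser or client falls back to its own default.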
@lyse@lyse.isobeef.org Ahh, I see. So it’s not really a drama. 😅
(When the spec says “content is UTF-8”, then it kind of follows for me that I should set Content-Type: text/plain; charset=utf-8. Lots of feeds don’t do that, though, which is why jenny ignores the header altogether and always decodes as UTF-8.)
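A minimal sketch of the "ignore the header, always decode as UTF-8" approach, in Python. This is not jenny's actual code; decode_feed is a hypothetical name, and using errors="replace" for invalid bytes is an assumption of this sketch.

```python
def decode_feed(body: bytes) -> str:
    """Decode a fetched feed body as UTF-8, whatever the header said.

    Hypothetical helper; invalid byte sequences become U+FFFD
    instead of raising (an assumption, not taken from any client).
    """
    return body.decode("utf-8", errors="replace")


# Simulated response body of a feed that contains non-ASCII text.
raw = "Grüße 😅".encode("utf-8")

# Always-UTF-8 gives the text back intact.
good = decode_feed(raw)

# Trusting a Latin-1 default instead turns the same bytes into mojibake.
bad = raw.decode("latin-1")
```

The Latin-1 decode never fails, it just silently produces the wrong characters, which is why a wrong default is easy to miss.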
@movq@www.uninformativ.de Yeah, this is why I think @anth@a.9srv.net is right, and that any v2 spec we get around to actually publishing (with far better quality than the bullshit half-baked attempt I tried 🤣) should just mandate UTF-8, period. Just assume it to be true; there is no other content encoding we should ever support 😅