Fun video about #Unicode #UTF8. I knew about the historical context and fundamental implementation ideas already, but I didn't know about the Hangul combinations block trick mentioned at the end… clever stuff.
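A minimal sketch of what that trick might look like in code, assuming it refers to the arithmetic layout of the precomposed Hangul Syllables block (U+AC00 to U+D7A3), where each syllable's code point is computed from the indices of its leading consonant, vowel, and optional trailing consonant. The function name compose_hangul and the constant names are made up for illustration.

```rust
// Layout constants of the Hangul Syllables block as defined by Unicode.
const S_BASE: u32 = 0xAC00; // first precomposed syllable, '가'
const L_COUNT: u32 = 19; // leading consonants
const V_COUNT: u32 = 21; // vowels
const T_COUNT: u32 = 28; // trailing consonants, index 0 means "none"

/// Hypothetical helper: compose a syllable from jamo indices.
/// Returns None if any index is out of range.
fn compose_hangul(l: u32, v: u32, t: u32) -> Option<char> {
    if l >= L_COUNT || v >= V_COUNT || t >= T_COUNT {
        return None;
    }
    // Syllables are laid out row by row: leading consonant, then vowel, then trailing consonant.
    char::from_u32(S_BASE + (l * V_COUNT + v) * T_COUNT + t)
}
```

For instance, compose_hangul(18, 0, 4) combines ㅎ, ㅏ and ㄴ into 한 (U+D55C), so no lookup table is needed for the 11,172 syllables.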
fn sub(foo: &String) {
    println!("We got this string: [{}]", foo);
}

fn main() {
    // "Hello", 0x00, 0x00, "!"
    let buf: [u8; 8] = [0x48, 0x65, 0x6C, 0x6C, 0x6F, 0x00, 0x00, 0x21];
    // Create a string from the byte array above, interpret as UTF-8, ignore decoding errors.
    let lossy_unicode = String::from_utf8_lossy(&buf).to_string();
    sub(&lossy_unicode);
}
Create a string from a byte array, but the result isn't a string, it's a cow 🐮, so you need another to_string() to convert your "string" into a string.
- https://doc.rust-lang.org/std/string/struct.String.html#method.from_utf8_lossy
- https://doc.rust-lang.org/std/borrow/enum.Cow.html
I still have a lot to learn.
(into_owned() instead of to_string() also works and makes more sense to me, it's just that the compiler suggested to_string() first, which led to this funny example.)
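A small sketch of why the result is a Cow: String::from_utf8_lossy only has to build a new String when it actually replaces invalid bytes with U+FFFD, otherwise it can borrow the input as-is. The helper name decode_lossy and its return shape are made up for illustration.

```rust
use std::borrow::Cow;

/// Hypothetical helper: decode bytes as UTF-8, replacing invalid sequences
/// with U+FFFD. The bool reports whether any replacement was necessary.
fn decode_lossy(buf: &[u8]) -> (String, bool) {
    let cow: Cow<'_, str> = String::from_utf8_lossy(buf);
    // Borrowed: the input was already valid UTF-8 and is reused untouched.
    // Owned: invalid bytes were found, so a repaired String was built.
    let was_repaired = matches!(cow, Cow::Owned(_));
    // into_owned() hands back a String, copying only in the Borrowed case.
    (cow.into_owned(), was_repaired)
}
```

With the buffer from the example above, the NUL bytes are valid UTF-8, so the Cow is the borrowed variant and nothing is replaced. A stray 0xFF byte would give the owned variant instead.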
@lyse@lyse.isobeef.org Sorry, I don't think I ever had charset=utf8. I just noticed that a few days ago. OpenBSD's httpd might not support including a parameter with the MIME type, unfortunately. I'm going to look into it.
Övergång till UTF-8 (Swedish for "Transition to UTF-8"): https://hack.org/mc/blog/utf8.html
Fun fact: OpenBSD's vi does not support utf8. That's probably the first time I haven't just used the default system vi.
@kas@enotty.dk And to make it even worse, most clients interpret the data as win1252. But does any twtxt client autoconvert to utf8 in case another charset is sent? I think it's probably okay for every client to assume it's utf8.
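A rough sketch of what "assume utf8, convert otherwise" could look like in a client. This is not taken from any existing twtxt client: decode_feed is a made-up name, and the fallback here is plain ISO-8859-1 (each byte becomes the code point of the same value), not real win1252, because the Rust standard library ships no win1252 table.

```rust
/// Hypothetical helper: turn a fetched feed body into a String.
/// Valid UTF-8 is taken as-is; anything else is read as ISO-8859-1.
fn decode_feed(body: &[u8]) -> String {
    match std::str::from_utf8(body) {
        // The common case: the feed really is UTF-8.
        Ok(text) => text.to_owned(),
        // Assumption: a non-UTF-8 feed is Latin-1, so every byte maps
        // directly to the Unicode code point with the same number.
        Err(_) => body.iter().map(|&b| b as char).collect(),
    }
}
```

A feed sending 0xD6 for "Ö" would then still come out readable instead of turning into a replacement character.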
@tx@0x1A4.1337.cx Nice! But I don't get a UTF8 feed, for example https://twtxt.1337.cx/tazgezwitscher