falsifian

www.falsifian.org

James Cook. Time-space trader and software hipster.

Recent twts from falsifian
In-reply-to » @quark My money is on a SHA1SUM hash encoding to keep things much simpler:

@prologic@twtxt.net Wikipedia claims sha1 is vulnerable to a “chosen-prefix attack”, which I gather means I can write any two twts I like, and then cause them to have the exact same sha1 hash by appending something. I guess a twt ending in random junk might look suspcious, but perhaps the junk could be worked into an image URL like

Image

. If that’s not possible now maybe it will be later.

git only uses sha1 because they’re stuck with it: migrating is very hard. There was an effort to move git to sha256 but I don’t know its status. I think there is progress being made with Game Of Trees, a git clone that uses the same on-disk format.

I can’t imagine any benefit to using sha1, except that maybe some very old software might support sha1 but not sha256.

⤋ Read More

@prologic@twtxt.net earlier you suggested extending hashes to 11 characters, but here’s an argument that they should be even longer than that.

Imagine I found this twt one day at https://example.com/twtxt.txt :

2024-09-14T22:00Z Useful backup command: rsync -a “$HOME” /mnt/backup

Image

and I responded with “(#5dgoirqemeq) Thanks for the tip!”. Then I’ve endorsed the twt, but it could latter get changed to

2024-09-14T22:00Z Useful backup command: rm -rf /some_important_directory

Image

which also has an 11-character base32 hash of 5dgoirqemeq. (I’m using the existing hashing method with https://example.com/twtxt.txt as the feed url, but I’m taking 11 characters instead of 7 from the end of the base32 encoding.)

That’s what I meant by “spoofing” in an earlier twt.

I don’t know if preventing this sort of attack should be a goal, but if it is, the number of bits in the hash should be at least two times log2(number of attempts we want to defend against), where the “two times” is because of the birthday paradox.

Side note: current hashes always end with “a” or “q”, which is a bit wasteful. Maybe we should take the first N characters of the base32 encoding instead of the last N.

Code I used for the above example: https://fossil.falsifian.org/misc/file?name=src/twt_collision/find_collision.c
I only needed to compute 43394987 hashes to find it.

⤋ Read More