@lyse@lyse.isobeef.org and @movq@www.uninformativ.de and possibly @aelaraji@aelaraji.com and even @cuaxolotl – I’m very curious to hear thoughts, pros and cons, or other feelings about introducing the notion of a feed’s identity using cryptography. What if we were to keep things simple and use what’s commonly available, for example SSH ED25519 keys, with the ssh-keygen -Y sign and ssh-keygen -Y verify tools already available? Maybe in combination with @xuu’s idea of generating a random unique ID for your feed, say # id =, and signing it with your ED25519 key? 🔑
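A minimal sketch of how the signing idea above might be driven from Python. The namespace `twtxt-feed`, the file names, and the helper functions are all assumptions for illustration; only the `ssh-keygen -Y sign` and `ssh-keygen -Y verify` invocations themselves come from OpenSSH. The feed ID would be stored in a small file, signed once, and verified by readers against an allowed-signers file.

```python
import subprocess

# Hypothetical signature namespace; signer and verifier must agree on it.
NAMESPACE = "twtxt-feed"


def sign_command(key_path, id_file):
    """Build the command that signs id_file; ssh-keygen writes id_file + '.sig'."""
    return ["ssh-keygen", "-Y", "sign", "-f", str(key_path),
            "-n", NAMESPACE, str(id_file)]


def verify_command(allowed_signers, identity, signature_file):
    """Build the verify command; the signed data itself is read from stdin."""
    return ["ssh-keygen", "-Y", "verify", "-f", str(allowed_signers),
            "-I", identity, "-n", NAMESPACE, "-s", str(signature_file)]


def verify(allowed_signers, identity, signature_file, id_file):
    """Return True if ssh-keygen accepts the signature over id_file."""
    with open(id_file, "rb") as signed_data:
        result = subprocess.run(
            verify_command(allowed_signers, identity, signature_file),
            stdin=signed_data,
            capture_output=True,
        )
    return result.returncode == 0
```

Key rotation is not covered here; a rotation scheme would need some way to vouch for the new key with the old one.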
@falsifian@www.falsifian.org You mean the idea of being able to inline # url = changes in your feed?
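One way a client could interpret inlined `# url =` changes, sketched under the assumption that each such comment applies to every twt that follows it until the next one. The function name and the tuple layout are hypothetical; no client is known to implement exactly this.

```python
def urls_in_effect(feed_text):
    """Pair every twt with the feed URL that was in effect when it was written.

    Assumes a '# url = ...' comment applies to all twts below it
    until another '# url = ...' comment appears.
    """
    current_url = None
    twts = []
    # Split on "\n" only, so U+2028 inside multiline twts is left alone.
    for line in feed_text.split("\n"):
        if line.startswith("# url = "):
            current_url = line[len("# url = "):].strip()
        elif line and not line.startswith("#"):
            timestamp, _, text = line.partition("\t")
            twts.append((current_url, timestamp, text))
    return twts
```

With this reading, old twts keep the identity they were published under, so anything derived from the old URL stays stable after a move.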
Message to the void : présentation. https://si3t.ch/log/2024-09-17-message-to-the-void.txt
@mckinley@twtxt.net Yes, changing domains is a problem if you tie your identity to an https url. But I also worry about being stuck with a key I can’t rotate. Whatever gets used, it would be nice to be able to rotate identities. I like @lyse@lyse.isobeef.org’s idea for that.
This scheme also only supports threading off a specific Twt of someone’s feed. What if you’re not replying to anyone in particular?
Noting that this scheme cannot support disjoint threads that should be merged together once either party discovers each other 😅
@movq@www.uninformativ.de It does now as of several weeks ago or so 👌
@lyse@lyse.isobeef.org Yea I think your idea of inlining url changes in the feed works perfectly, as long as folks remember to do so I guess? 🤔
@bender@twtxt.net Seems to have used the hash correctly here 🤣
Holy shot! what an old thread 🤣
@cuaxolotl@sunshinegardens.org Gitea is plenty fast for me 👌 Even with a SQLite database 🤣
I dunno whether any of this is actually true 🤷♂️ The LLM 🤖 could have made (hallucinated) this shit up 🤣
WiscKey’s approach to handling key-value pairs in SSDs offers several advantages:
- It minimizes I/O amplification by separating keys and values, allowing for more efficient storage and retrieval.
- The design leverages the SSD’s performance characteristics, utilizing sequential writes and parallel random reads to enhance throughput and reduce latency.
- WiscKey maintains excellent insert and lookup performance while improving SSD lifespan by reducing the number of erase cycles required.
WiscKey minimizes CPU usage compared to LevelDB by eliminating the log file, which reduces the CPU cost associated with encoding and writing key-value pairs. While LevelDB has higher CPU usage due to its single writer protocol and background compaction using one thread, WiscKey’s architecture allows for garbage collection to be decoupled and performed later, minimizing its impact on foreground performance. Consequently, WiscKey generally exhibits lower CPU usage during workloads, except when utilizing multiple background threads for prefetching.
The four critical ideas in the design of WiscKey are:
- Separation of keys from values, with only keys stored in the LSM-tree and values in a separate log file.
- Utilization of the parallel random-read characteristic of SSD devices to handle unsorted values during range queries.
- Implementation of unique crash-consistency techniques to manage the value log efficiently.
- Optimization of performance by removing the LSM-tree log without sacrificing consistency, reducing system-call overhead from small writes.
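To make the first idea in the list above concrete, here is a toy sketch of key-value separation. It is not WiscKey's implementation: a plain dict stands in for the LSM-tree, an in-memory buffer stands in for the on-disk value log, and the class name is made up. It only shows that the index holds small (offset, length) pointers while values are appended sequentially.

```python
import io


class ToyKVStore:
    """Toy key-value separation: small index entries, append-only value log."""

    def __init__(self):
        self.index = {}                # key -> (offset, length); LSM-tree stand-in
        self.value_log = io.BytesIO()  # stand-in for the on-disk value log

    def put(self, key, value):
        # Values are only ever appended, which keeps writes sequential.
        self.value_log.seek(0, io.SEEK_END)
        offset = self.value_log.tell()
        self.value_log.write(value)
        self.index[key] = (offset, len(value))

    def get(self, key):
        # One index lookup, then one random read into the value log.
        offset, length = self.index[key]
        self.value_log.seek(offset)
        return self.value_log.read(length)
```

Overwriting a key leaves the old value behind in the log, which is why a design like this needs garbage collection at some later point.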
Summary of WiscKey: Separating Keys from Values in SSD-Conscious Storage
- Authors: Lanyue Lu, Thanumalayan Sankaranarayana Pillai, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
- Conference: 14th USENIX Conference on File and Storage Technologies (FAST ’16)
- Key Concept: WiscKey is a key-value store that separates keys from values to minimize I/O amplification, optimizing performance for SSDs.
- Performance: WiscKey outperforms LevelDB and RocksDB in various workloads, achieving up to 111× faster loading and improved random lookup speeds.
- Design Goals: Focus on low write/read amplification, SSD optimization, and support for modern features like range queries.
afaik nobody has done this, but i really need some numbers that can indicate the relative performance of various git servers (cgit, gitea, gitlab) on comparable hardware. cgit claims to be hyperfast, but what does that mean in practice?
Can anyone recommend a decent Android ROM that strips out as much of the spyware as possible? Is GrapheneOS a good option? I need to get a new phone anyway so I don’t mind buying within a supported device list as long as I can get one on the used market for $300-$400 or less.
If anyone could recommend some learning resources for this stuff I’d really appreciate it.
@movq@www.uninformativ.de the thing is, the twtxt is in Maildir. When I reply to it, it doesn’t use the existing hash.
@bender@twtxt.net Ah, haha, --fetch-context doesn’t go back into archived feeds … 🤦
So, today I created a space where you can send an email to the void: https://void.si3t.ch/ gemini://void.si3t.ch/ #smolnet
Thank you @movq@www.uninformativ.de Things are working again!! 🙏
Trying to fetch the original (highlighting yours) with jenny renders this:
@movq@www.uninformativ.de I’d guess the same goes for all twtxt.social feeds… I can’t see bender’s archived twts either, didn’t check for the others.
@prologic@twtxt.net Yeah, but I reckon we can kill both birds with one stone. If we change it to support edits, it should be fairly easy to also tweak it to support feed URL changes. Like outlined in my first reply: https://twtxt.net/twt/n4omfvq The URL part sounds way easier to me. :-)
This is how my original message shows up on jenny:
From: quark <quark>
Subject: (#o) @prologic this was your first twtxt. Cool! :-P
Date: Mon, 16 Sep 2024 12:42:27 -0400
Message-Id: <k7imvia@twtxt>
X-twtxt-feed-url: https://ferengi.one/twtxt.txt
(#o) @<prologic https://twtxt.net/user/prologic/twtxt.txt> this was your first twtxt. Cool! :-P
@sorenpeter@darch.dk There was or maybe still is a competing proposal for multiline twts that combines all twts with the same timestamp to one logical multiline twt. Not sure what happened to that, if it is used in the wild and whether anyone “here” follows a feed with that convention. “Our” solution for multiline twts is to use U+2028 Unicode LINE SEPARATOR as a newline: https://dev.twtxt.net/doc/multilineextension.html.
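A small sketch of the U+2028 convention mentioned above: a multiline twt is stored on a single feed line by swapping real newlines for U+2028 LINE SEPARATOR, and swapped back for display. The function names are made up for illustration.

```python
LINE_SEPARATOR = "\u2028"  # Unicode LINE SEPARATOR


def encode_multiline(text):
    """Turn user-entered text into a single feed line (no raw newlines)."""
    return text.replace("\r\n", "\n").replace("\n", LINE_SEPARATOR)


def decode_multiline(twt_text):
    """Turn the stored twt text back into real lines for display."""
    return twt_text.replace(LINE_SEPARATOR, "\n")
```

The feed itself is still split into twts at `\n` only, so the encoded twt stays on one line.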
Hmm… I replied to this message:
From: prologic <prologic>
Subject: Hello World! 😊
Date: Sat, 18 Jul 2020 08:39:52 -0400
Message-Id: <o6dsrga>
X-twtxt-feed-url: https://twtxt.net/user/prologic/twtxt.txt
Hello World! 😊
And see how the hash shows… Is it because that hash is no longer used?
@prologic@twtxt.net this was your first twtxt. Cool! :-P
The bug in jenny that @aelaraji@aelaraji.com found:
Jenny has to look for the metadata fields, it must find the # prev = ... line. To do so, I naively wrote something along these lines:
    for line in content.splitlines():
        if line.startswith('# prev = '):
            ...
Problem is, we use \u2028 a lot in twtxt feeds and Python interprets those as line separators as well. That’s not what we want here. Jenny must only split at a \n.
Now @prologic@twtxt.net had a quote/copy of some of his metadata fields in a twt. Like so:
# prev = foo bar
Perfectly legitimate, but now jenny found the # prev = twice (once in the actual header, once in a twt), didn’t know what to do, and thus did not fetch the archived feeds. 🤦
Should be fixed in this commit: https://www.uninformativ.de/git/jenny/commit/6e8ce5afdabd5eac22eae4275407b3bd2a167daf.html
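The difference described above can be reproduced in a few lines. This is a sketch of the behaviour, not the code from the linked commit; the function names and the sample feed are made up. `str.splitlines()` treats U+2028 as a line boundary, while `str.split("\n")` does not.

```python
def find_prev_naive(content):
    """Buggy variant: splitlines() also breaks lines at U+2028."""
    return [line for line in content.splitlines()
            if line.startswith("# prev = ")]


def find_prev_fixed(content):
    """Split at "\\n" only, so U+2028 inside a twt stays inside that twt."""
    return [line for line in content.split("\n")
            if line.startswith("# prev = ")]


# Hypothetical feed: one real header field plus a twt quoting a header field.
SAMPLE_FEED = (
    "# nick = example\n"
    "# prev = abc1234 twtxt.txt.1\n"
    "\n"
    "2024-09-16T12:00:00Z\tQuoting my header:\u2028# prev = foo bar\n"
)
```

On the sample feed the naive variant finds two `# prev = ` lines and the fixed one finds only the real header field.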
@movq@www.uninformativ.de What’s your definition of “complete thread”? ;-) There might be feeds participating in the conversation that you have no idea of.
But yes, this has a nice discoverability bonus. And even simpler than a hash, that’s right.
@movq@www.uninformativ.de use @xuu’s pod as default instead, as he keeps the cache as long as I used to keep mine when I ran Yarn. @prologic@twtxt.net’s pod expires them way too soon.
@movq@www.uninformativ.de Yeah, I think so.
This is a bug in jenny. 🤦
@prologic@twtxt.net Oh so that’s how it works? The front page only shows the latest twt of each feed? 🤔
No, something is fishy. It didn’t fetch @prologic@twtxt.net’s archived feeds and now only 969 of his twts are in my maildir. 🤔
@aelaraji@aelaraji.com Yep, I just tried. It’s not that easy to verify, though. 😅 It looks fine to me. The number of twts in the maildir has gone down from 61759 to 34787 – but that’s probably because I unfollowed lots of (presumably dead) feeds in the last few weeks. 🥴
@movq@www.uninformativ.de I wiped both ~/.cache/jenny and my maildir_target when I tried to reset things. Still got wrecked 😅
If it’s not too much to ask, could you back up or change your maildir_target and give it a try with an empty directory?
@aelaraji@aelaraji.com What was going on here? 🥴 Wiping the maildir and ~/.cache/jenny should reset everything, it doesn’t store any other state. 🤔
PS: I still can’t get your and bender’s archived twts (at least the ones I’ve noticed), nor can I --fetch-context on replies to them. Your oldest is the one from 2024-06-14 18:22 … I can see lyse’s tho! I doubt this is related to the edit issue, but maybe this helps with something.
@prologic@twtxt.net I can’t pinpoint the exact cause but here are a couple of symptoms I observed:
- It all started with a LOT of his old twts, starting back in 2020, showing up in a weird way: some were empty, others were duplicates, and a lot more got marked for deletion by neomutt with the D tag.
- After trying to restart things with a fresh Maildir, I couldn’t fetch a lot of twts, even mine, which was a reply to one of his. But then I was able to after temporarily deleting his link from my follow file.
Then @quark@ferengi.one and @bender@twtxt.net pointed out the inconsistent from: + feed url and the twt edit.
@movq@www.uninformativ.de we can shorten it by six characters, with (r:https://...). 😅
(replyto:http://darch.dk/twtxt.txt,2024-09-15T12:06:27Z)
I think I like this a lot. 🤔
The problem with using hashes always was that they’re “one-directional”: You can construct a hash from URL + timestamp + twt, but you cannot do the inverse. When I see #weadxga, I have no idea what that could possibly refer to.
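A rough sketch of why the hash is one-directional. The exact recipe here (BLAKE2b-256 over URL, timestamp and text joined by newlines, base32 without padding, lowercased, last 7 characters) is an assumption based on the Yarn twt hash extension as commonly described; check the spec before relying on it, especially for timestamp normalisation, which is not handled here.

```python
import base64
import hashlib


def twt_hash(feed_url, timestamp, text):
    """Derive a short twt hash; the recipe is an assumption, see the lead-in."""
    payload = f"{feed_url}\n{timestamp}\n{text}".encode("utf-8")
    digest = hashlib.blake2b(payload, digest_size=32).digest()
    encoded = base64.b32encode(digest).decode("ascii").rstrip("=").lower()
    # Only 7 characters survive, so URL, timestamp and text cannot be recovered.
    return encoded[-7:]
```

Because the twt text is part of the input, editing a twt changes its hash, which is the edit problem discussed in this thread.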
But of course something like (replyto:http://darch.dk/twtxt.txt,2024-09-15T12:06:27Z) has all the information you need. This could simplify twt/feed discovery quite a bit, couldn’t it? 🤔 That thing that I just implemented – jenny asking some Yarn pod for some twt hash – would not be necessary anymore. Clients could easily and automatically fetch complete threads instead of requiring the user to follow all relevant feeds.
Only using the timestamp to identify a twt also solves the edit problem.
It even is better for non-Yarn clients, because you now don’t have to read, understand, and implement a “twt hash specification” before you can reply to someone.
The only problem, really, is that (replyto:http://darch.dk/twtxt.txt,2024-09-15T12:06:27Z) is so long. Clients would have to try harder to hide this. 😅
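A sketch of how a client might pull the feed URL and timestamp out of such a reply prefix. The `(replyto:URL,TIMESTAMP)` syntax is only a proposal from this thread, so the regular expression and the function name are assumptions; in particular, this version would not cope with commas inside the URL.

```python
import re
from datetime import datetime

# Proposed prefix: (replyto:<feed url>,<RFC 3339 timestamp>)
REPLYTO = re.compile(r"^\(replyto:(?P<url>[^,\s]+),(?P<ts>[^)\s]+)\)\s*")


def parse_replyto(text):
    """Return (feed_url, timestamp, remaining_text), or None if no prefix."""
    match = REPLYTO.match(text)
    if match is None:
        return None
    # fromisoformat() on older Pythons rejects a trailing "Z", so normalise it.
    timestamp = datetime.fromisoformat(match["ts"].replace("Z", "+00:00"))
    return match["url"], timestamp, text[match.end():]
```

With the URL and timestamp in hand, a client could fetch that feed directly and look up the twt by timestamp, without asking a pod to resolve a hash.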
@quark@ferengi.one Meh I lost the plot ages ago 🤣