@xuu 🤣🤣🤣
So I whipped up a quick shell script to demonstrate what I mean by the increase in feed size on average as well as the expected increase in storage and retrieval requirements.
$ ./compare.sh
Original file size: 28145 bytes
Modified file size: 70672 bytes
Percentage increase in file size: 151.10%
...
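The compare.sh script itself isn't shown, so this is only a guess at the arithmetic behind the last line of output: a minimal Python sketch using the two byte counts printed above. The function name is made up for illustration.

```python
def percent_increase(original: int, modified: int) -> float:
    """Relative growth of `modified` over `original`, in percent."""
    return (modified - original) / original * 100


original_size = 28145  # bytes, from the output above
modified_size = 70672  # bytes, from the output above

growth = percent_increase(original_size, modified_size)
print(f"Percentage increase in file size: {growth:.2f}%")
```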
Thank goodness we relaxed that limit and I’ve stopped being so puritan about it, but my overall point is that we would be significantly increasing both the human size and the machine size of the identity of threads as well as Twts.
With the original specification’s recommended Twt length of 140 characters, that only leaves you with about 78 characters’ worth of anything remotely useful to say in response.
Let’s say the overhead is always three bytes: two parentheses and a space.
So for example, if we use @movq@www.uninformativ.de ’s feed as an example thread ID here (his feed with a particular timestamp), we’re already looking at a subject length of 59 bytes, +/- a couple of bytes to denote the subject in the Twt itself.
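The character budget from these three twts can be checked with a rough sketch. It only uses the numbers quoted in the thread (140, 59 and 3); the constant and function names are hypothetical.

```python
TWT_LIMIT = 140        # originally recommended Twt length, in characters
SUBJECT_LEN = 59       # feed URL + timestamp, the figure quoted in the thread
WRAPPER_OVERHEAD = 3   # two parentheses and one space


def remaining_chars(limit: int, subject_len: int, overhead: int) -> int:
    """Characters left for the actual reply once the subject is accounted for."""
    return limit - subject_len - overhead


print(remaining_chars(TWT_LIMIT, SUBJECT_LEN, WRAPPER_OVERHEAD))
```

With these inputs the remainder comes out at 78, matching the "about 78 characters" mentioned in the thread.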
One of the reasons we originally wanted to use content-based addressing and short hashes as our threading model was to keep individual Twts short, so that they were still readable if you viewed them manually by hand.
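For anyone who hasn't looked at how a short content-addressed ID can be derived, here is a minimal sketch. It assumes my reading of the Twt Hash extension (BLAKE2b-256 over feed URL, timestamp and content joined by newlines, base32-encoded, lowercased, unpadded, keeping the last few characters); treat that recipe and the example.com feed as assumptions, not as what yarnd actually does.

```python
import base64
import hashlib


def twt_hash(feed_url: str, timestamp: str, content: str, length: int = 7) -> str:
    """Short content-addressed ID for a twt (assumed recipe, see lead-in)."""
    payload = f"{feed_url}\n{timestamp}\n{content}".encode("utf-8")
    digest = hashlib.blake2b(payload, digest_size=32).digest()
    # Base32 of 32 bytes is 52 characters plus "=" padding, which we drop.
    encoded = base64.b32encode(digest).decode("ascii").lower().rstrip("=")
    return encoded[-length:]


# Hypothetical feed and twt, for illustration only.
short_id = twt_hash("https://example.com/twtxt.txt", "2024-09-22T07:51:16Z", "Cool!")
print(f"(#{short_id})")
```

The point is that the subject stays a handful of characters no matter how long the feed URL is.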
With the proposal to switch to location-based addressing, using a pointer to a feed and a timestamp in that feed, you’re looking at roughly 2025 characters, because the HTTP, HTML and even URI specifications do not specify a maximum length for URIs; AFAIK there are only recommendations.
@bender@twtxt.net I personally can’t see myself increasing the infrastructure and costs to run this pod to support this as we potentially switch over and as things continue to grow in scale. You would never get your infinite search and infinite timeline features that you’ve always wanted, for example, and I would have to drastically reduce what is visible or even searchable at any given point in time to much less than what it is today.
Another interesting side effect of changing from content-based addressing to location-based addressing is that switching from 7-byte keys to 2025-character keys for 3.5 million entries would expand the database size from 24.5 MB to about 7.09 GB—an increase of roughly 7.06 GB!
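The numbers above are plain multiplication, and a quick sketch makes them easy to re-run with other key lengths. It counts key bytes only (no index or storage-engine overhead) and uses decimal MB/GB; the names are made up for illustration.

```python
ENTRIES = 3_500_000   # number of keys, from the twt above
SHORT_KEY = 7         # bytes per content-addressed key
LONG_KEY = 2025       # characters per location-addressed key (worst case quoted above)


def key_storage_bytes(entries: int, key_len: int) -> int:
    """Raw bytes needed to store `entries` keys of `key_len` bytes each."""
    return entries * key_len


short_total = key_storage_bytes(ENTRIES, SHORT_KEY)
long_total = key_storage_bytes(ENTRIES, LONG_KEY)

print(f"short keys: {short_total / 1e6:.1f} MB")
print(f"long keys:  {long_total / 1e9:.2f} GB")
print(f"increase:   {(long_total - short_total) / 1e9:.2f} GB")
```

Note that 2025 characters is the upper-bound figure from the thread; typical feed URLs plus a timestamp would be far shorter, so this is a worst-case estimate.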
@falsifian@www.falsifian.org No worries! Feel free to contribute to the doc directly if you wish 👌
@falsifian@www.falsifian.org Hmmm not sure sorry 🤔
@xuu Good to know! 👌 So as long as we remain decentralized and non-commercial (I assume non-profit works too?) we’re good?
@lyse@lyse.isobeef.org Nice! 🙏
@doesnm@doesnm.p.psf.lt Hello! 👋
@lyse@lyse.isobeef.org Yes let’s make UTF-8 mandatory 👌
@lyse@lyse.isobeef.org Agreed
Let’s try this pill for Twtxt v2 (no account required)
@lyse@lyse.isobeef.org I’m a bit indifferent whether it’s at the beginning or end tbh.
This is still a draft! Feel free to edit it 👌
@movq@www.uninformativ.de That’s what I was afraid of 🤣
@movq@www.uninformativ.de Makes sense 👌 I think it’s fair to implement any spec changes incrementally for sure 👌
And yea since yarnd has a store it’s a bit easier to support edit / delete actions 😅
So in a location-based system, how exactly do I reply to one of these two Twts from @Yarns@search.twtxt.net ? 🤔
2024-09-07T12:55:56Z 🥳 NEW FEED: @<twtxt http://edsu.github.io/twtxt/twtxt.txt>
2024-09-07T12:55:56Z 🥳 NEW FEED: @<kdy https://twtxt.kdy.ch/twtxt.txt>
@lyse@lyse.isobeef.org Yup, this is why you started seeing if you could improve the “trust” of peers right? 😅
@movq@www.uninformativ.de Yeah I think what I’m proposing here is a more pragmatic approach to improvements that will last much longer than our first iteration (~4 years and going strong, but running into minor issues with edit/identity and some collisions). This scope of changes is much easier to implement for yarnd and I suspect jenny too, and as indicated in here, quite easy to have a reference implementation written in Bash with standard UNIX tools.
It’s even sorta/somewhat compatible with our existing feeds (kind of) 🤣 – Bit too stupid to figure out how to write enough correct Bash to make threads display inline nicely in an indented/tree-like fashion, but oh well 😅
Example:
$ ./twtxt-v2.sh reply 242561ce02d "Cool! 👌"
Posted twt with hash: b2c938f9838
...
$ ./twtxt-v2.sh timeline
...
prologic@twtxt.net [2024-09-22T07:26:37Z] <242561ce02d> Okay folks, I've spent all day on this today, and I _think_ its in "good enough"™ shape to share:
**Twtxt v2**:
- Specification: https://docs.mills.io/uJXuisaYTRWYDrl8A2jADg?both
- implementation: https://gist.mills.io/prologic/afdec15443da4d7aa898f383f171ec1b
![](https://twtxt.net/media/Wb9MtAiQyEkzNQB5dyVvUR.png)
prologic@localhost [2024-09-22T07:51:16Z] <b2c938f9838> Cool! 👌 (reply-to:242561ce02d)
Okay folks, I’ve spent all day on this today, and I think it’s in “good enough”™ shape to share:
Twtxt v2:
- Specification: https://docs.mills.io/uJXuisaYTRWYDrl8A2jADg?both
- implementation: https://gist.mills.io/prologic/afdec15443da4d7aa898f383f171ec1b
@aelaraji@aelaraji.com No that is absolutely correct. Without cryptographic identities and signatures there is no way to verify authenticity. That is correct. And I don’t think we need to necessarily. What I was just showing and proving was that I didn’t write that spoofed Twt in the first place, which was only provable at the time of @lyse@lyse.isobeef.org ’s short-lived attack 🤣 He essentially forked yarnd, hosted it temporarily (I think locally) and used it to poison the caches of a few production pods.
Thankfully the gossip protocol used by yarnd as part of its “peering” between pods isn’t fully trusted; twts are not archived, for example, into permanent storage. So the moment my pod re-fetched my own feed, the spoofed Twt was obliterated 😅
Eventual consistency 🤣
LOL 😂 Not only have I tried to write up a full Twtxt v2 specification, I’ve also written a Bash shell script that implements the new spec 😅
@movq@www.uninformativ.de Haha 😝 Nice one! And yes I’m also aware of some collisions too!
@aelaraji@aelaraji.com I like ntfy 👌 I’ve wanted to replace my use of the Pushover service with this for a while now 🤔
👋 Reminder folks of the upcoming Yarn.social monthly online meetup:
I hope to see @david@collantes.us @movq@www.uninformativ.de @lyse@lyse.isobeef.org @xuu @sorenpeter@darch.dk and hopefully others too @aelaraji@aelaraji.com @falsifian@www.falsifian.org and anyone else that sees this! 🙏 We’re hopefully going to primarily discuss the future of Twtxt and the last few weeks of discussions 🤣
- Event: Yarn.social Online Meetup
- When: 28th September 2024 at 12:00pm UTC (midday)
- Where: Mills Meet : Yarn.social
- Cadence: 4th Saturday of every Month
Agenda:
- Let’s talk about the upcoming changes to the Twtxt spec(s)
- See #xgghhnq
My Position on the last few weeks of Twtxt spec discussions:
- We increase the Hash length from 7 to 11.
- We formalise the Update Commands extension.
- We amend the Twt Hash and Metadata extension to state:
Feed authors that wish to change the location of their feed (once Twts have been published) must append a new # url = comment to their feed to indicate the new location and thus change the “Hashing URI” used for Twts from that point onward.
This has implications for the “order” of a feed, and we should do one of two things, either:
- Mandate that feeds are append-only.
- Or amend the Metadata spec with a new field that denotes the order of the feed so clients can make sense of “inline” comments in the feed. – This would also imply that the default order is (of course) append-only. Suggestion: # direction = [append|prepend]
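To make the "from that point onward" rule concrete, here is a rough sketch of how a client might track the Hashing URI while reading an append-only feed top to bottom. The function name, the example.com URLs and the sample twts are all hypothetical; it assumes the usual twtxt line format of timestamp, TAB, content, and it does not handle a prepend-ordered feed.

```python
def hashing_uris(feed_lines, initial_url):
    """Pair each twt's timestamp with the Hashing URI in effect at that point.

    Assumes an append-only feed: a "# url = ..." comment applies to every
    twt that follows it, until the next such comment.
    """
    current = initial_url
    result = []
    for raw in feed_lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            key, sep, value = line[1:].partition("=")
            if sep and key.strip() == "url":
                current = value.strip()
            continue
        timestamp, sep, _content = line.partition("\t")
        if sep:  # skip anything that is not "timestamp<TAB>content"
            result.append((timestamp, current))
    return result


# Hypothetical feed that moved location between two twts.
feed = [
    "# url = https://old.example.com/twtxt.txt",
    "2024-09-01T10:00:00Z\tFirst twt",
    "# url = https://new.example.com/twtxt.txt",
    "2024-09-20T10:00:00Z\tSecond twt, after the move",
]

for timestamp, uri in hashing_uris(feed, "https://old.example.com/twtxt.txt"):
    print(timestamp, uri)
```

A prepend-ordered feed would need the lines reversed first, which is exactly why the order has to be either mandated or declared.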
I finally decided to do a few experiments with yarnd to see how many things would break and how many assumptions there are around the idea of “Content Addressing”; here’s where I’m at so far:
Basically I’m at a point where spending time on this is going to provide very little value. There are assumptions made in the lextwt parser, assumptions made in yarnd, assumptions in the way storage is done and in the way threading works and things are looked up. There are far-reaching implications to changing the way Twts are identified here to be “location addressed”, and I’m quite worried about the amount of effort that would be required to change yarnd here.
@mckinley@twtxt.net Yes I have, however I’m not counting that because even using “Cloud” is not labor free.
@aelaraji@aelaraji.com We figured it out 🤣 @lyse@lyse.isobeef.org ’s little hack was good but only temporary 🤣
(replyto:…). It’s easier to implement and the whole edits-breaking-threads thing resolves itself in a “natural” way without the need to add stuff to the protocol.
@sorenpeter@darch.dk Kind of agree with dealing with this kind of social nonsense which we’ve all done in the past 🤣
@movq@www.uninformativ.de I think your scenario doesn’t account for clients and their storage. The scenario described only really affects clients that come along later. Even then they would also be able to re-fetch missing Twts from peers or even a search engine to fill in the gaps.
@movq@www.uninformativ.de That’s kind of a problem though, right?
yarnd has a couple of settings with some sensible/sane defaults:
I just realized the other big property you lose is:
What if someone completely changes the content of the root of the thread?
Does the Subject reference the feed and timestamp only or the intent too?
@bender@twtxt.net Yeah I’ll be honest here; I’m not going to be very happy if we go down this “location addressing” route;
- Twt Subjects lose their meaning.
- Twt Subjects cannot be verified without looking up the feed.
- Which may or may not exist anymore or may change.
- Two persons cannot reply to a Twt independently of each other anymore.
and probably some other properties we’d stand to lose that I’m forgetting about…
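The "cannot be verified" point can be illustrated with a small sketch comparing the two kinds of identifier when the root twt is edited. The hashing recipe is the same assumed one as in the Twt Hash extension (BLAKE2b-256, base32, last 7 characters), the location identifier is modelled as a plain (feed, timestamp) pair since the exact subject syntax isn't spelled out here, and the feed and twts are made up.

```python
import base64
import hashlib


def content_id(feed_url: str, timestamp: str, content: str) -> str:
    """Content-addressed ID: depends on what was actually said (assumed recipe)."""
    payload = f"{feed_url}\n{timestamp}\n{content}".encode("utf-8")
    digest = hashlib.blake2b(payload, digest_size=32).digest()
    return base64.b32encode(digest).decode("ascii").lower().rstrip("=")[-7:]


def location_id(feed_url: str, timestamp: str) -> tuple:
    """Location-addressed ID: only says where and when, not what."""
    return (feed_url, timestamp)


feed = "https://example.com/twtxt.txt"   # hypothetical feed
when = "2024-09-22T07:26:37Z"
original = "Pizza or pasta tonight?"
edited = "Something completely different"

# An edit changes the content ID, so a reply's subject no longer matches
# and the thread forks visibly.
print(content_id(feed, when, original) == content_id(feed, when, edited))

# The location ID is unchanged, so replies silently point at new content.
print(location_id(feed, when) == location_id(feed, when))
```

Under these assumptions the first comparison is false and the second is true, which is the trade-off being argued about in this thread.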
@movq@www.uninformativ.de One of the biggest reasons I don’t like the (replyto:…) proposal (location addressing vs. content addressing) is that you just introduce a similar problem down the track, albeit a rarer one: if a feed changes its location, your thread’s “identifiers” are no longer valid unless those feed authors maintain strict URL redirects, etc. This potentially has the long-term effect of being rather fragile, as opposed to what we have now, where an Edit just causes a natural fork in the thread, which is how “forking” works in the first place.
I realise this is a bit pret here, and it probably doesn’t matter a whole lot at our size. But I’m trying to think way ahead, to a point where Twtxt as a “thing” can continue to work and function decades from now, even with the extensions we’ve built. We’ve already proven for example that Twts and threads from ~4 years ago still work and are easily looked up haha 😝
I just read the primary spec I’m strongly in support of and it’s pretty rock solid for me 👌 💯
Do you recall what it was? I blame my maintenance window 🪟
@bender@twtxt.net Hmm what you replied to appears to be non-existent: https://twtxt.net/twt/pqst4ea
@movq@www.uninformativ.de I just saw these come through! 🙏 Thank you very much, I’ll definitely have a read tomorrow! 👌
@bender@twtxt.net Which reply was that? 🤔