@prologic@twtxt.net where is the parent on your reply?
@prologic@twtxt.net right, because it was deleted, and purged from cache, of course! Good try, mister, you are in trouble. Call the Yarn Police!
Ever wondered what it would cost to self-host vs. use the cloud? Well, I often doubt myself every time I look at hardware prices, and I know I have to do some hardware refresh soon™ for the Mills DC (something I don't have a regular plan or budget for), so here's a rough ballpark:
The Mills DC has cost me around ~$15k to build and maintain over the last ~10 years or so, roughly speaking. I've never actually taken a Bill of Materials or anything, but I could if anyone is interested in more specifics.
The equivalent resources, if run in the "Cloud", would cost around:
- ~$1,000 for virtual machines
- ~$1,000 for storage
So around ~$2,000/month to run.
Keep this in mind anytime anyone ever tries to con you into believing "Cloud is cheaper". It's not.
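To put the comparison above on a per-month footing, here is a rough back-of-the-envelope sketch. It uses only the figures quoted in the post (~$15k over ~10 years, ~$2,000/month in the cloud) and assumes, which the post does not state, that the $15k covers everything worth counting.

```python
# Figures quoted in the post; everything below is derived arithmetic.
SELF_HOSTED_TOTAL_USD = 15_000   # approximate build + maintenance cost
YEARS = 10                       # approximate lifetime so far
CLOUD_MONTHLY_USD = 2_000        # estimated equivalent cloud bill per month

months = YEARS * 12
self_hosted_monthly = SELF_HOSTED_TOTAL_USD / months   # amortised cost per month
cloud_total = CLOUD_MONTHLY_USD * months               # cloud bill over the same period
ratio = CLOUD_MONTHLY_USD / self_hosted_monthly        # how many times more expensive

print(f"self-hosted: ~${self_hosted_monthly:,.0f}/month")
print(f"cloud: ~${cloud_total:,.0f} over {YEARS} years")
print(f"cloud costs ~{ratio:.0f}x as much")
```

Power, bandwidth and time spent on maintenance are not itemised in the post, so the real ratio may well be smaller; the sketch only restates the quoted numbers.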
@aelaraji@aelaraji.com This is one of the reasons why yarnd has a couple of settings with some sensible/sane defaults:
I could already imagine a couple of extreme cases where, somewhere in this peaceful world, one's exercise of freedom of speech could get them in real trouble (if not danger) if found out. It wouldn't necessarily have to involve the law or legal authorities. So, if someone asks, maybe fearing for… let's just say "their well-being", would it hurt if a pod just purged their content if it's serving it publicly (maybe relaying the info to other pods) and called it a day? It doesn't have to be about some law/convention somewhere… I know! Too extreme, but I've seen news of people who'd gone to jail or got their lives ruined for as little as a silly joke. And it doesn't even have to be about any of this.
There are two settings:
$ ./yarnd --help 2>&1 | grep max-cache
--max-cache-fetchers int set maximum numnber of fetchers to use for feed cache updates (default 10)
-I, --max-cache-items int maximum cache items (per feed source) of cached twts in memory (default 150)
-C, --max-cache-ttl duration maximum cache ttl (time-to-live) of cached twts in memory (default 336h0m0s)
So yarnd pods are by default designed to only keep Twts publicly visible on the anonymous Frontpage, the Discover View, your Timeline, or the feed's Timeline for up to 2 weeks, with a maximum of 150 items, whichever limit is exceeded first. Any Twts beyond this are considered "old" and drop off the active cache.
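The "2 weeks or 150 items, whichever is exceeded first" rule can be sketched roughly as below. This is not yarnd's actual code: `prune_feed` and the tuple layout are made up for illustration, and only the two default values come from the help output above.

```python
from datetime import datetime, timedelta, timezone

MAX_CACHE_ITEMS = 150                 # default of --max-cache-items
MAX_CACHE_TTL = timedelta(hours=336)  # default of --max-cache-ttl (2 weeks)


def prune_feed(twts, now, max_items=MAX_CACHE_ITEMS, max_ttl=MAX_CACHE_TTL):
    """Split one feed's twts into (active, old).

    `twts` is a list of (created, text) tuples (hypothetical layout).
    A twt stays active only while it is within the TTL and the
    per-feed item limit has not been reached yet.
    """
    newest_first = sorted(twts, key=lambda t: t[0], reverse=True)
    active, old = [], []
    for twt in newest_first:
        if len(active) < max_items and now - twt[0] <= max_ttl:
            active.append(twt)
        else:
            old.append(twt)
    return active, old


# One twt per day for 30 days: only the last two weeks stay active.
now = datetime(2024, 9, 20, tzinfo=timezone.utc)
feed = [(now - timedelta(days=d), f"twt {d}") for d in range(30)]
active, old = prune_feed(feed, now)
print(len(active), len(old))  # 15 15
```

A very chatty feed would hit the item limit before the TTL; a quiet one would hit the TTL first.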
It's a feature that my old man @off_grid_living@twtxt.net was very strongly in support of, as was I back in the day of yarnd's design (nothing particularly to do with Twtxt per se), and one that I've stuck by to this day, even though there are some who have different views on this.
@aelaraji@aelaraji.com Thanks for this!
@movq@www.uninformativ.de @falsifian@www.falsifian.org @prologic@twtxt.net Maybe I don't know what I'm talking about, and you've probably already read this: Everything you need to know about the "Right to be forgotten", coming straight from the EU's GDPR website itself. It outlines the specific circumstances under which the right to be forgotten applies, as well as reasons that trump one's right to erasure, etc.
I'm no lawyer, but my uneducated guess would be that:
A) twts are already publicly available/public knowledge and such… just don't process children's personal data and MAYBE you're good? Since there's this:
… an organization's right to process someone's data might override their right to be forgotten. Here are the reasons cited in the GDPR that trump the right to erasure:
- The data is being used to exercise the right of freedom of expression and information.
- The data is being used to perform a task that is being carried out in the public interest or when exercising an organization's official authority.
- The data represents important information that serves the public interest, scientific research, historical research, or statistical purposes and where erasure of the data would be likely to impair or halt progress towards the achievement that was the goal of the processing.
B) What I love about the TWTXT sphere is its Human/Humane element! No deceptive algorithms, no Corpo B.S., etc. Just Humans. So maybe… if we thought about it in this way, it wouldn't hurt to be even nicer to others and offer strangers an even safer space.
I could already imagine a couple of extreme cases where, somewhere in this peaceful world, one's exercise of freedom of speech could get them in real trouble (if not danger) if found out. It wouldn't necessarily have to involve the law or legal authorities. So, if someone asks, maybe fearing for… let's just say "their well-being", would it hurt if a pod just purged their content if it's serving it publicly (maybe relaying the info to other pods) and called it a day? It doesn't have to be about some law/convention somewhere… I know! Too extreme, but I've seen news of people who'd gone to jail or got their lives ruined for as little as a silly joke. And it doesn't even have to be about any of this.
P.S: Maybe make X tool check out robots.txt? Or maybe make long-term archives Opt-in? Opt-out?
P.P.S: Already way too many MAYBEs in a single twt! So I'll just shut up.
Bahahahaha very clever @lyse@lyse.isobeef.org I look forward to reading your report! However…
$ yarnc debug https://twtxt.net/user/prologic/twtxt.txt | grep -E '^pqst4ea' | tee | wc -l
0
I very quickly proved that Twt was never from me.
@yarn_police@twtxt.net Cool cool
@yarn_police@twtxt.net What's going on?
Heads up, @prologic@twtxt.net! We're seeing an increased spate of burglaries in your neighbourhood. Please stay alert, while we keep you safe out there.
@movq@www.uninformativ.de Yes, that's true, they are only integrity checks. But beyond a malicious pod (ignore yarnd's gossiping protocol for now), how does what @lyse@lyse.isobeef.org presented work exactly?
@prologic@twtxt.net I only saw your previous twt right now. You said:
In order for this to be true, yarnd would have to be maliciously fabricating a Twt with the Hash D.
Yep, that's one way.
Now, I have no idea how any of the gossipping stuff in Yarn works, but maybe a malicious pod could also inject such a fabricated twt into your cache by gossipping it?
Either way, hashes are just integrity checks basically, not proof that a certain feed published a certain twt.
But this is no different to how jenny does things with storing every Twt in a Maildir, I suppose?
This has specifically come up before in the form of "informal complaints" against yarnd because of the way it permanently stores and archives Twts, so even if you decide you changed your mind, or deleted that line out of your feed, if my pod or @xuu or @abucci@anthony.buc.ci or @eldersnake@we.loveprivacy.club (or any other handful of pods still around?) saw the Twt, it'd be permanently archived.
Yeah, I'm curious to find out too, beyond just hearsay. But regardless of whether we should or shouldn't care about this, or should or shouldn't comply: we should, IMO. I'd hate to build something that horrendously violates someone's rights in another country.
@movq@www.uninformativ.de Care to explain how this exploit/attack works for me?
Well that was bloody awful. This PR broke my pod for some strange reason, and I can't figure out why or how. The process just kept getting terminated by something, somewhere (no panic). Weird. I've reverted this PR for now @xuu
@lyse@lyse.isobeef.org Yeah, makes sense. You don't even need hash collisions for that. (I guess only individually signed twts would prevent that. Yet another can of worms.)
@falsifian@www.falsifian.org I'm curious myself now and might look it up (or even ask some of our legal guys/gals).
I think none of this matters to people outside the EU anyway. These aren't your laws. Even if you were to start a company in the US, it would only be a marketing instrument for you: "Hey, look, we follow GDPR!" EU people might then be more inclined to become your customers. But that's it.
That said, I'm not sure anymore if there are any other treaties between the EU and the US which cover such things…
@prologic@twtxt.net I have no specifics, only hopes. (I have seen some articles explaining the GDPR doesn't apply to a "purely personal or household activity" but I don't really know what that means.)
I don't know if it's worth giving much thought to the issue unless either you expect to get big enough for the GDPR to matter a lot (I imagine making money is a prerequisite) or someone specifically brings it up. Unless you enjoy thinking through this sort of thing, of course.
Really though I only managed to save a few GB, but it's enough for now.
@bender@twtxt.net Haha! Faster? Maybe. But yeah, it's good to have backups! (that work)
I've also put up this PR, "Add compatible methods for Index to behave as the Archiver (transition)" #1177, that will act as a transition from the old naive archiver to the new bluge-based search/index. I will switch my pod over to this soon to test it before anyone else does.
For those curious, the archive on this pod had reached around ~22GB in size. I had to suck it down to my more powerful Mac Studio to clean it up and remove a bunch of junk, then copy all the data back. This is what my local network traffic looked like for the last few hours.
@prologic@twtxt.net woot, woot! Glad everything went well. I feel it faster already!
And we're back. Sorry about that.
Gotta unplug for a couple of minutes. I'm suspecting the extension cord to be the root of my monitor's dead rows of pixels and flickering problems.
@lyse@lyse.isobeef.org Hmmm, I'm not sure I get what you're getting at here. In order for this to be true, yarnd would have to be maliciously fabricating a Twt with the Hash D.
i.e: there must be two versions of the Twt in the feed.
@lyse@lyse.isobeef.org This is true. But the client MUST supply the original too! Or this doesn't work.
@prologic@twtxt.net Let me try:
Invent anything you want, say feed A writes message text B at timestamp C. You simply create the hash D for it and reply to precisely that D as subject in your own feed E with your message text F at timestamp G. This gets hashed to H.
Now then, some client J fetches your feed E. It sees your response from time G with text F, where in the subject you reference hash D. Since client J does not know about hash D, it simply asks some peers about it. If it happens to query your yarnd for it, you could happily serve it your invention: "You wanna know about hash D? Oh, that's easy, feed A wrote B at time C."
The client J then verifies it and, since everything lines up, it looks legitimate, so the client puts this record in its cache or displays it to the user or whatever. It does not even matter if the client J follows feed A or not. The message text B at C with hash D could have just been deleted or edited in the meantime.
Congrats, you successfully spread rumors. :-D
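The walkthrough above can be sketched in a few lines. Everything here is an assumption for illustration: the hash scheme is the Twt Hash as I understand it (BLAKE2b-256 over feed URL, timestamp and text, base32, lowercase, unpadded, last 7 characters), the feed URL, timestamp and text are invented, and `peer_lookup` / `client_accepts` are hypothetical stand-ins for a peer lookup, not any real yarnd or jenny API.

```python
import base64
import hashlib


def twt_hash(feed_url: str, timestamp: str, text: str) -> str:
    """Assumed scheme: BLAKE2b-256 over "url\\ntimestamp\\ntext",
    base32-encoded, lowercased, unpadded, last 7 characters."""
    payload = f"{feed_url}\n{timestamp}\n{text}".encode("utf-8")
    digest = hashlib.blake2b(payload, digest_size=32).digest()
    return base64.b32encode(digest).decode("ascii").lower().rstrip("=")[-7:]


# Feed A, timestamp C, text B: invented, feed A never published this.
fake = ("https://example.com/a/twtxt.txt", "2024-09-20T10:00:00Z", "I never said this")
d = twt_hash(*fake)  # hash D, computed by the attacker


def peer_lookup(wanted_hash):
    """Malicious peer: answers a lookup for hash D with the invention."""
    return fake if wanted_hash == d else None


def client_accepts(wanted_hash):
    """Client J's only check: does the served record hash to what it asked for?"""
    record = peer_lookup(wanted_hash)
    return record is not None and twt_hash(*record) == wanted_hash


print(client_accepts(d))  # True
```

The check passes because the hash only ties the three fields together; nothing in it proves that feed A ever contained the line.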
@prologic@twtxt.net This does not hold if the edit happened before I even got the original.
If OTOH your client doesn't store individual Twts in a cache/archive or some kind of database, then verification becomes quite hard and tedious. However, I think of this as an implementation detail. The spec should just call out that clients must validate/verify the edit request and that the matching hash actually exists in that feed, not how the client should implement that.
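A minimal sketch of that verification step, assuming the client keeps a per-feed record of the hashes it has seen. The `Archive` class and its method names are hypothetical; how the edit request itself is written in a feed is left out on purpose.

```python
from collections import defaultdict


class Archive:
    """Hypothetical client-side archive: remembers which hashes each feed published."""

    def __init__(self):
        self._seen = defaultdict(set)  # feed URL -> set of twt hashes

    def record(self, feed_url, twt_hash):
        """Remember that this feed published a twt with this hash."""
        self._seen[feed_url].add(twt_hash)

    def verify_edit(self, feed_url, original_hash):
        """Accept an edit only if this very feed published the original."""
        return original_hash in self._seen.get(feed_url, set())


archive = Archive()
archive.record("https://example.com/a/twtxt.txt", "abc1234")

# Feed A editing its own twt: fine.
print(archive.verify_edit("https://example.com/a/twtxt.txt", "abc1234"))
# Another feed claiming to edit feed A's twt: rejected.
print(archive.verify_edit("https://example.com/e/twtxt.txt", "abc1234"))
```

A client without such an archive has nothing to check the request against, which is the "hard and tedious" case mentioned above.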
@lyse@lyse.isobeef.org Yes you do. You keep both versions in your cache. They have different hashes. So you have Twt A, and a client indicates Twt B is an edit of A. Your client has already seen A and cached and archived it; now your client fetches B, which is indicated as editing A. You cache/archive B as well, but now indicate in your display that B replaces A (maybe display or link both), or just display B, or whatever. Essentially you now have both, plus an indicator of one being an edit of the other.
The right thing to do here of course is to keep A in the "thread" but display B. Why? So the thread/chain doesn't actually break or fork (forking is a natural consequence of editing, or is it the other way around?).
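One way to picture "keep A in the thread but display B" is a small lookup from an original hash to whatever replaced it. This is only a sketch with made-up names and placeholder hashes; it also assumes that, if edits get chained, the latest one wins, which is just one of the possible answers.

```python
def resolve(twt_hash, replaced_by):
    """Follow edit links from an original hash to the version to display.

    `replaced_by` maps an original hash to the hash of its verified edit.
    The `seen` set guards against accidental loops in the mapping.
    """
    seen = set()
    while twt_hash in replaced_by and twt_hash not in seen:
        seen.add(twt_hash)
        twt_hash = replaced_by[twt_hash]
    return twt_hash


twts = {"A": "original text", "B": "edited text", "R": "a reply"}
reply_subject = {"R": "A"}   # reply R still references hash A
replaced_by = {"A": "B"}     # B was verified as an edit of A

thread_root = reply_subject["R"]                 # the thread stays keyed on A
print(twts[resolve(thread_root, replaced_by)])   # edited text
```

Replies keep pointing at A, so the thread does not fork, while the reader sees B.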
(edit:…) and (delete:…) into feeds. It's not just a simple "add this to your cache" or "replace the cache with this set of messages" anymore. Hmm. We might need to think about the consequences of that: can this be exploited somehow, etc.
@lyse@lyse.isobeef.org I'm all for dropping delete btw, or at least not making it mandatory, as in "clients should" rather than "clients must". But yes, I agree, let's explore all the possible ways this can be exploited (if at all).
@movq@www.uninformativ.de I think not.
What about edits of edits? Do we want to "chain" edits or does the latest edit simply win?
This gets too complicated if we start to support this kind of nonsense.
@movq@www.uninformativ.de Thank you!
@lyse@lyse.isobeef.org Walk me through this? I get what you're saying, but I'm too stupid to be a "hacker".
But yes, at the end of the day, if the edit request is invalid or cannot be verified, it should be ignored and treated as "malicious".
@lyse@lyse.isobeef.org @movq@www.uninformativ.de So a client that has the idea of a cache/archive wouldn't necessarily have to re-check whether the Twt being marked as "edited" belongs to that feed or not; the client would already know that for sure. At least this is how yarnd works, and I'm sure jenny can make similar assertions too.
@lyse@lyse.isobeef.org @falsifian@www.falsifian.org Contributions to search.twtxt.net, which runs yarns (not to be confused with yarnd), are always welcome. I don't have as much "spare time" as I used to due to the nature of my job (Staff Engineer), but I try to make improvements every now and again.
@falsifian@www.falsifian.org You make good points though, I made similar arguments about this too back in the day. Twtxt v2 / Yarn.social being at least ~4 years old now.
@falsifian@www.falsifian.org Do you have specifics about the GDPR law about this?
Would the GDPR apply to a one-person client like jenny? I seriously hope not. If someone asks me to delete an email they sent me, I don't think I have to honour that request, no matter how European they are.
I'm not sure myself now. So let's find out whether parts of the GDPR actually apply to a truly decentralised system?
LOL. This:
anyone could claim that some feed contained a certain message which was then removed again by just creating the hash over the fake message in said feed and invented timestamp themselves
I'd like to see a step-by-step reproduction of this. I don't buy it.
Admittedly yarnd had a few implementation security bugs, but I'm not sure this is actually possible, unless I'm missing something?
@david@collantes.us Very nice!
And they have arrived (well, they did around 3 hours ago, LOL). Buttery smooth, my 16 Pro (one with dark cover). It took a bit over an hour to transfer all my data.
Ah, and now he is "conveniently" sleeping. How, well, convenient! LOL.
@lyse@lyse.isobeef.org yeah, tell us, @prologic@twtxt.net, what isn't true? You can't just go around, "that's not true, and that's not true; and that, and that!" without spelling out exactly what isn't, and why? For the love of god, why?!
@falsifian@www.falsifian.org Something similar exists over at https://search.twtxt.net/. But a usable search engine would be actually nice (to be fair, yarns improved a bit). :-) I don't care about feed changes over time. In fact, it would even feel creepy to me. Of course, anyone could still surveil, but I'm not looking forward to these stats.
@movq@www.uninformativ.de We could still let the client display a warning if it cannot verify it. But yeah.