Ever wondered what it would cost to self-host vs. use the cloud? I doubt myself every time I look at hardware prices, and I know I have to do a hardware refresh soon™ for the Mills DC (something I don't have a regular plan or budget for), so here's a rough ballpark:

The Mills DC has cost me roughly $15k to build and maintain over the last ~10 years. I've never actually put together a Bill of Materials or anything, but I could if anyone is interested in more specifics.

The equivalent resources, if run in the "Cloud", would cost around:

  • ~$1,000 for virtual machines
  • ~$1,000 for storage

So around ~$2,000/month to run.

Keep this in mind anytime anyone ever tries to con you into believing "Cloud is cheaper". It's not.
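For anyone who wants to redo the comparison with their own numbers, here is a minimal sketch of the arithmetic. It only uses the rough figures quoted above (~$15k over ~10 years, ~$2,000/month for the cloud); the function names are made up for illustration, and it assumes nothing about what the $15k does or doesn't include (power, bandwidth, time).

```python
# Rough figures from the post; treat them as ballpark inputs, not a bill of materials.
SELF_HOSTED_TOTAL_USD = 15_000   # build + maintenance over the whole period
SELF_HOSTED_YEARS = 10
CLOUD_MONTHLY_USD = 2_000


def self_hosted_monthly(total_usd: float, years: float) -> float:
    """Average monthly cost of the self-hosted setup over its lifetime."""
    return total_usd / (years * 12)


def months_until_cloud_exceeds(total_usd: float, cloud_monthly_usd: float) -> float:
    """Months of cloud spend needed to match the whole self-hosted total."""
    return total_usd / cloud_monthly_usd


monthly = self_hosted_monthly(SELF_HOSTED_TOTAL_USD, SELF_HOSTED_YEARS)
cloud_total = CLOUD_MONTHLY_USD * 12 * SELF_HOSTED_YEARS
months = months_until_cloud_exceeds(SELF_HOSTED_TOTAL_USD, CLOUD_MONTHLY_USD)

print(f"self-hosted: ~${monthly:,.0f}/month")
print(f"cloud over {SELF_HOSTED_YEARS} years: ~${cloud_total:,.0f}")
print(f"cloud spend matches the self-hosted total after {months:.1f} months")
```

Swap in your own totals; the interesting number is usually how quickly the monthly cloud bill catches up with the one-off hardware spend.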

In-reply-to » @movq @falsifian @prologic Maybe I don't know what I'm talking about and you've probably already read this: Everything you need to know about the "Right to be forgotten" coming straight out of the EU's GDPR Website itself. It outlines the specific circumstances under which the right to be forgotten applies as well as reasons that trump one's right to erasure ...etc.

@aelaraji@aelaraji.com This is one of the reasons why yarnd has a couple of settings with some sensible/sane defaults:

I can already imagine a couple of extreme cases where, somewhere in this peaceful world, one's exercise of freedom of speech could get them in real trouble (if not danger) if found out. It wouldn't necessarily have to involve the law or legal authorities. So, if someone asks, maybe fearing for… let's just say 'their well-being', would it hurt if a pod just purged their content if it's serving it publicly (maybe relaying the info to other pods) and called it a day? It doesn't have to be about some law/convention somewhere … 🤷 I know! Too extreme, but I've seen news of people who'd gone to jail or had their lives ruined for as little as a silly joke. And it doesn't even have to be about any of this.

There are two relevant settings (--max-cache-items and --max-cache-ttl):

$ ./yarnd --help 2>&1 | grep max-cache
      --max-cache-fetchers int        set maximum numnber of fetchers to use for feed cache updates (default 10)
  -I, --max-cache-items int           maximum cache items (per feed source) of cached twts in memory (default 150)
  -C, --max-cache-ttl duration        maximum cache ttl (time-to-live) of cached twts in memory (default 336h0m0s)

So by default, yarnd pods are designed to keep Twts publicly visible (on the anonymous Frontpage, the Discover view, your Timeline, or a feed's Timeline) for up to 2 weeks (336h) with a maximum of 150 items per feed, whichever is exceeded first. Any Twts beyond this are considered "old" and drop off the active cache.
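To make the "whichever is exceeded first" rule concrete, here is a rough sketch of that kind of pruning. This is not yarnd's actual code (yarnd is written in Go); the `Twt` class and `prune` function are hypothetical, and only the two default values come from the help output above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

MAX_CACHE_ITEMS = 150                 # mirrors the --max-cache-items default
MAX_CACHE_TTL = timedelta(hours=336)  # mirrors the --max-cache-ttl default (2 weeks)


@dataclass(frozen=True)
class Twt:
    created: datetime
    text: str


def prune(twts, now, max_items=MAX_CACHE_ITEMS, max_ttl=MAX_CACHE_TTL):
    """Return the Twts of one feed that stay in the active cache, newest first."""
    # Drop everything older than the TTL ...
    fresh = [t for t in twts if now - t.created <= max_ttl]
    # ... then keep only the newest max_items of what is left.
    fresh.sort(key=lambda t: t.created, reverse=True)
    return fresh[:max_items]


now = datetime(2024, 9, 20, tzinfo=timezone.utc)
# A busy hypothetical feed: one Twt every 2 hours, 200 Twts in total.
feed = [Twt(now - timedelta(hours=h), f"twt {h}") for h in range(0, 400, 2)]
kept = prune(feed, now)
print(len(feed), "->", len(kept))
```

For a busy feed the item limit bites first; for a quiet feed the 2-week TTL does.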

It's a feature that my old man @off_grid_living@twtxt.net was very strongly in support of, as was I back in the day of yarnd's design (nothing particularly to do with Twtxt per se), and one I've stuck by to this day, even though there are some 😉 that have different views on this 🤣

In-reply-to » @falsifian Do you have specifics about the GDPR law about this?

@movq@www.uninformativ.de @falsifian@www.falsifian.org @prologic@twtxt.net Maybe I don't know what I'm talking about and you've probably already read this: Everything you need to know about the "Right to be forgotten" coming straight out of the EU's GDPR Website itself. It outlines the specific circumstances under which the right to be forgotten applies as well as reasons that trump one's right to erasure …etc.

I'm no lawyer, but my uneducated guess would be that:

A) twts are already publicly available/public knowledge and such… just don't process children's personal data and MAYBE you're good? Since there's this:

… an organization's right to process someone's data might override their right to be forgotten. Here are the reasons cited in the GDPR that trump the right to erasure:

  • The data is being used to exercise the right of freedom of expression and information.
  • The data is being used to perform a task that is being carried out in the public interest or when exercising an organization's official authority.
  • The data represents important information that serves the public interest, scientific research, historical research, or statistical purposes and where erasure of the data would likely to impair or halt progress towards the achievement that was the goal of the processing.

B) What I love about the TWTXT sphere is its Human/Humane element! No deceptive algorithms, no Corpo B.S. …etc. Just Humans. So maybe … if we thought about it this way, it wouldn't hurt to be even nicer to others and offer strangers an even safer space.
I can already imagine a couple of extreme cases where, somewhere in this peaceful world, one's exercise of freedom of speech could get them in real trouble (if not danger) if found out. It wouldn't necessarily have to involve the law or legal authorities. So, if someone asks, maybe fearing for… let's just say 'their well-being', would it hurt if a pod just purged their content if it's serving it publicly (maybe relaying the info to other pods) and called it a day? It doesn't have to be about some law/convention somewhere … 🤷 I know! Too extreme, but I've seen news of people who'd gone to jail or had their lives ruined for as little as a silly joke. And it doesn't even have to be about any of this.

P.S: Maybe make X tool check out robots.txt? Or maybe make long-term archives Opt-in? Opt-out?
P.P.S: Already way too many MAYBEs in a single twt! So I'll just shut up. 😅

In-reply-to » Another thing: At the moment, anyone could claim that some feed contained a certain message which was then removed again by just creating the hash over the fake message in said feed and invented timestamp themselves. Nobody can ever verify that this was never the case in the first place and completely made up. So, our twt hashes have to be taken with a grain of salt.

@movq@www.uninformativ.de Yes, that's true, they are only integrity checks. But beyond a malicious pod (ignore yarnd's gossiping protocol for now), how does what @lyse@lyse.isobeef.org presented work exactly? 😅

In-reply-to » Another thing: At the moment, anyone could claim that some feed contained a certain message which was then removed again by just creating the hash over the fake message in said feed and invented timestamp themselves. Nobody can ever verify that this was never the case in the first place and completely made up. So, our twt hashes have to be taken with a grain of salt.

@prologic@twtxt.net I only saw your previous twt right now. You said:

In order for this to be true, yarnd would have to be maliciously fabricating a Twt with the Hash D.

Yep, that's one way.

Now, I have no idea how any of the gossiping stuff in Yarn works, but maybe a malicious pod could also inject such a fabricated twt into your cache by gossiping it?

Either way, hashes are just integrity checks basically, not proof that a certain feed published a certain twt.

In-reply-to » @falsifian Do you have specifics about the GDPR law about this?

This has specifically come up before in the form of "informal complaints" against yarnd because of the way it permanently stores and archives Twts: even if you change your mind and delete that line from your feed, if my pod or @xuu or @abucci@anthony.buc.ci or @eldersnake@we.loveprivacy.club (or any of the handful of other pods still around?) saw the Twt, it'd be permanently archived.

In-reply-to » @falsifian Do you have specifics about the GDPR law about this?

Yeah, I'm curious to find out too, beyond just "hearsay". But regardless of whether we're strictly required to care about this or comply, we should IMO. I'd hate to build something that horrendously violates someone's rights in another country.

In-reply-to » Another thing: At the moment, anyone could claim that some feed contained a certain message which was then removed again by just creating the hash over the fake message in said feed and invented timestamp themselves. Nobody can ever verify that this was never the case in the first place and completely made up. So, our twt hashes have to be taken with a grain of salt.

@movq@www.uninformativ.de Care to explain how this exploit/attack works for me? 🤣

In-reply-to » I've also put up this PR Add compatible methods for Index to behave as the Archiver (transition) #1177 that will act as a transition from the old naive archiver to the new bluge-based search/index. I will switch my pod over to this soon to test it before anyone else does.

Well that was bloody awful. This PR broke my pod for some strange reason; I can't figure out why or how 😱 The process just kept getting terminated by something, somewhere (no panic). Weird. I've reverted this PR for now @xuu

In-reply-to » Another thing: At the moment, anyone could claim that some feed contained a certain message which was then removed again by just creating the hash over the fake message in said feed and invented timestamp themselves. Nobody can ever verify that this was never the case in the first place and completely made up. So, our twt hashes have to be taken with a grain of salt.

@lyse@lyse.isobeef.org Yeah, makes sense. You don't even need hash collisions for that. 🤔 (I guess only individually signed twts would prevent that. 🙈 Yet another can of worms.)

In-reply-to » @falsifian Do you have specifics about the GDPR law about this?

@falsifian@www.falsifian.org I'm curious myself now and might look it up (or even ask some of our legal guys/gals 😅).

I think none of this matters to people outside the EU anyway. These aren't your laws. Even if you were to start a company in the US, it would only be a marketing instrument for you: "Hey, look, we follow GDPR!" EU people might then be more inclined to become your customers. But that's it.

That said, I'm not sure anymore if there are any other treaties between the EU and the US which cover such things …

In-reply-to » @falsifian Do you have specifics about the GDPR law about this?

@prologic@twtxt.net I have no specifics, only hopes. (I have seen some articles explaining the GDPR doesn't apply to a "purely personal or household activity" but I don't really know what that means.)

I don't know if it's worth giving much thought to the issue unless either you expect to get big enough for the GDPR to matter a lot (I imagine making money is a prerequisite) or someone specifically brings it up. Unless you enjoy thinking through this sort of thing, of course.

In-reply-to » And we're back. Sorry about that 😅

For those curious, the archive on this pod had reached around ~22GB in size. I had to pull it down to my more powerful Mac Studio to clean it up and remove a bunch of junk, then copy all the data back. This is what my local network traffic looked like for the last few hours 😱

In-reply-to » Another thing: At the moment, anyone could claim that some feed contained a certain message which was then removed again by just creating the hash over the fake message in said feed and invented timestamp themselves. Nobody can ever verify that this was never the case in the first place and completely made up. So, our twt hashes have to be taken with a grain of salt.

@lyse@lyse.isobeef.org Hmmm, I'm not sure I get what you're getting at here. In order for this to be true, yarnd would have to be maliciously fabricating a Twt with the Hash D.

In-reply-to » Another thing: At the moment, anyone could claim that some feed contained a certain message which was then removed again by just creating the hash over the fake message in said feed and invented timestamp themselves. Nobody can ever verify that this was never the case in the first place and completely made up. So, our twt hashes have to be taken with a grain of salt.

@prologic@twtxt.net Let me try:

Invent anything you want, say feed A writes message text B at timestamp C. You simply create the hash D for it and reply to precisely that D as subject in your own feed E with your message text F at timestamp G. This gets hashed to H.

Now then, some client J fetches your feed E. It sees your response from time G with text F where in the subject you reference hash D. Since client J does not know about hash D, it simply asks some peers about it. If it happens to query your yarnd for it, you could happily serve it your invention: "You wanna know about hash D? Oh, that's easy, feed A wrote B at time C."

The client J then verifies it and since everything lines up, it looks legitimate and puts this record in its cache or displays it to the user or whatever. It does not even matter if the client J follows feed A or not. The message text B at C with hash D could have just been deleted or edited in the meantime.

Congrats, you successfully spread rumors. :-D
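A small sketch of why the verification in that walkthrough passes. It assumes the hash is built the way I understand the Twt Hash extension to describe it (BLAKE2b-256 over feed URL, timestamp and text joined by newlines, base32 without padding, lowercased, last 7 characters); treat that construction as an assumption and check the spec before relying on it. The feed URL, text, timestamp and the `client_verify` helper are all made up.

```python
import base64
import hashlib


def twt_hash(feed_url: str, timestamp: str, text: str) -> str:
    """Hash over (feed URL, timestamp, text); construction assumed, see lead-in."""
    payload = f"{feed_url}\n{timestamp}\n{text}".encode("utf-8")
    digest = hashlib.blake2b(payload, digest_size=32).digest()
    encoded = base64.b32encode(digest).decode("ascii").rstrip("=").lower()
    return encoded[-7:]


# The invention: feed A never wrote text B at time C.
feed_a = "https://example.com/twtxt.txt"   # hypothetical feed URL
text_b = "something feed A never said"
time_c = "2024-09-18T12:00:00Z"
hash_d = twt_hash(feed_a, time_c, text_b)  # anyone can compute this


def client_verify(claimed_hash: str, feed_url: str, timestamp: str, text: str) -> bool:
    """What client J can check: the record is self-consistent, nothing more."""
    return twt_hash(feed_url, timestamp, text) == claimed_hash


# The fabricated record "verifies", because no secret of feed A is involved.
print(hash_d, client_verify(hash_d, feed_a, time_c, text_b))
```

Nothing in the hash input is known only to feed A, so the check proves integrity of the record and not authorship.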

In-reply-to » @movq Thanks for the summary!

If OTOH your client doesn't store individual Twts in a cache/archive or some kind of database, then verification becomes quite hard and tedious. However, I think of this as an implementation detail. The spec should just call out that clients must validate/verify the edit request and that the matching hash actually exists in that feed, not how the client should implement that.
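As a rough illustration of that check for a client that does keep a cache: an edit is only accepted if the hash it points at is known and was seen in the same feed that is now asking for the edit. The cache layout and the `is_valid_edit` name are hypothetical, not from any spec or client.

```python
def is_valid_edit(cache: dict, editing_feed: str, target_hash: str) -> bool:
    """Accept an edit only if the target Twt is cached and came from the same feed."""
    original = cache.get(target_hash)
    return original is not None and original["feed"] == editing_feed


# Hypothetical cache: hash -> the feed it was seen in plus its text.
cache = {
    "abcdefg": {"feed": "https://example.com/twtxt.txt", "text": "helo world"},
}

print(is_valid_edit(cache, "https://example.com/twtxt.txt", "abcdefg"))   # own Twt
print(is_valid_edit(cache, "https://evil.example/twtxt.txt", "abcdefg"))  # someone else's Twt
print(is_valid_edit(cache, "https://example.com/twtxt.txt", "zzzzzzz"))   # unknown hash
```

A client without any cache would have to re-fetch and re-hash the feed to answer the same question, which is the tedious part.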

In-reply-to » @movq Thanks for the summary!

@lyse@lyse.isobeef.org Yes you do. You keep both versions in your cache. They have different hashes. So you have Twt A, and a feed indicates Twt B is an edit of A. Your client has already seen A and cached and archived it; now your client fetches B, which is indicated as an edit of A. You cache/archive B as well, but now indicate in your display that B replaces A (maybe display or link both) or just display B or whatever. Essentially you now have both, plus an indicator of one being an edit of the other.

The right thing to do here of course is to keep A in the "thread" but display B. Why? So the thread/chain doesn't actually break or fork (forking is a natural consequence of editing, or is it the other way around? 🤔).
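Something like this is what I have in mind, as a minimal sketch only: both versions stay in the store under their own hashes, and a separate "replaced by" link decides what gets displayed. `TwtStore` and its methods are invented for illustration and don't reflect any existing client.

```python
class TwtStore:
    """Keeps every version of a Twt; edits are links, not overwrites."""

    def __init__(self):
        self.twts = {}          # hash -> text
        self.replaced_by = {}   # hash of A -> hash of B (B edits A)

    def add(self, twt_hash: str, text: str) -> None:
        self.twts[twt_hash] = text

    def add_edit(self, new_hash: str, text: str, edits: str) -> None:
        """Store B and record that it replaces A; A itself is kept."""
        self.add(new_hash, text)
        self.replaced_by[edits] = new_hash

    def resolve(self, twt_hash: str) -> str:
        """Follow edit links to the newest version (guards against loops)."""
        seen = set()
        while twt_hash in self.replaced_by and twt_hash not in seen:
            seen.add(twt_hash)
            twt_hash = self.replaced_by[twt_hash]
        return twt_hash

    def text_for(self, twt_hash: str) -> str:
        """Text to display for a thread that still references the old hash."""
        return self.twts[self.resolve(twt_hash)]


store = TwtStore()
store.add("aaaaaaa", "helo world")                            # Twt A
store.add_edit("bbbbbbb", "hello world", edits="aaaaaaa")     # Twt B edits A
print(store.text_for("aaaaaaa"))   # replies to A still resolve, but show B's text
```

Replies that reference A's hash keep working, so the thread doesn't fork, while the reader sees B.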

In-reply-to » It just occurs to me we're now building some kind of control structures or commands with (edit:…) and (delete:…) into feeds. It's not just a simple "add this to your cache" or "replace the cache with this set of messages" anymore. Hmm. We might need to think about the consequences of that, can this be exploited somehow, etc.

@lyse@lyse.isobeef.org I'm all for dropping delete btw, or at least not making it mandatory, as in "clients should" rather than "clients must". But yes, I agree, let's explore all the possible ways this can be exploited (if at all).

In-reply-to » Another thing: At the moment, anyone could claim that some feed contained a certain message which was then removed again by just creating the hash over the fake message in said feed and invented timestamp themselves. Nobody can ever verify that this was never the case in the first place and completely made up. So, our twt hashes have to be taken with a grain of salt.

@lyse@lyse.isobeef.org Walk me through this? 🤔 I get what you're saying, but I'm too stupid to be a "hacker" 🤣

In-reply-to » @david Thanks, that's good feedback to have. I wonder to what extent this already exists in registry servers and yarn pods. I haven't really tried digging into the past in either one.

@lyse@lyse.isobeef.org @falsifian@www.falsifian.org Contributions to search.twtxt.net, which runs yarns (not to be confused with yarnd), are always welcome 🤗 I don't have as much "spare time" as I used to due to the nature of my job (Staff Engineer), but I try to make improvements every now and again 💪

In-reply-to » @prologic Do you have a link to some past discussion?

@falsifian@www.falsifian.org Do you have specifics about the GDPR law about this?

Would the GDPR apply to a one-person client like jenny? I seriously hope not. If someone asks me to delete an email they sent me, I don't think I have to honour that request, no matter how European they are.

I'm not sure myself now. So let's find out whether parts of the GDPR actually apply to a truly decentralised system? 🤔

In-reply-to » @lyse I don't think this is true.

LOL 😂 This:

anyone could claim that some feed contained a certain message which was then removed again by just creating the hash over the fake message in said feed and invented timestamp themselves

I'd like to see a step-by-step reproduction of this. I don't buy it 🤣

Admittedly yarnd had a few implementation security bugs, but I'm not sure this is actually possible, unless I'm missing something? 🤔

In-reply-to » @david Thanks, that's good feedback to have. I wonder to what extent this already exists in registry servers and yarn pods. I haven't really tried digging into the past in either one.

@falsifian@www.falsifian.org Something similar exists over at https://search.twtxt.net/. But a usable search engine would actually be nice (to be fair, yarns improved a bit). :-) I don't care about feed changes over time. In fact, it would even feel creepy to me. Of course, anyone could still surveil, but I'm not looking forward to these stats.
