Yarn

Recent twts in reply to #tuizh4q

@nexeq@twtxt.net given the lightweight nature of yarnd and of the twtxt protocol in general, which rely on a text file and a cache layer, a pod operator who wanted an n(x) timeline (i.e. all tweets from 2017 to today) would have to invest heavily in infrastructure, and both the twtxt protocol and the yarnd client would have to be redesigned from the ground up.

Now, that being said, let’s say there’s a post that turns into a yarn and people respond to it frequently: it may be more prevalent and show up in your feed if you are indeed engaged in said yarn.

@caesar@twtxt.net What if that text file is 1MB in size? How do you display this in any reasonable way? What if it was recently rotated (something that occurs once feeds reach a certain size)? Moreover, even if the feed file itself was relatively small, you would burn processing resources, as you would have to parse it over and over just to serve the purpose. Which is what? To view the entire contents of someone’s feed? 😅

Hope this helps 😅

@caesar@twtxt.net

Pagination? Like Yarn uses elsewhere. Or infinite scroll, but from the server side that’s still pagination.

Sure. Possible. Infinite scroll on an SSR isn’t really possible without significant use of JS AFAIK.

Exactly. Every other social network has that feature; I’ve missed it here several times already and it looks like I’m not the only one.

We don’t 😀 See philosophical reasons.

I still don’t get the difficulty from a technical point of view I’m afraid. 🤔

It’s a design decision…

Feeds are periodically fetched, the cache is updated, and views are rendered or API responses are served from the cache. The cache is limited by size per feed and by TTL.
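
To make the "size per feed and TTL" limit concrete, here is a minimal Python sketch. It is not yarnd's code (yarnd is written in Go); the function names and the `(created, text)` tuple shape are made up for illustration, and the 150-item / 240h values simply mirror the defaults mentioned later in this yarn.

```python
from datetime import datetime, timedelta, timezone

# Assumed limits, mirroring the defaults quoted later in this yarn.
MAX_ITEMS_PER_FEED = 150
MAX_AGE = timedelta(hours=240)


def prune(twts, now, max_items=MAX_ITEMS_PER_FEED, max_age=MAX_AGE):
    """Keep only twts younger than max_age, newest first, capped at max_items.

    `twts` is a list of (created, text) tuples; this shape is hypothetical.
    """
    fresh = [t for t in twts if now - t[0] <= max_age]
    fresh.sort(key=lambda t: t[0], reverse=True)  # newest first
    return fresh[:max_items]


def update_cache(cache, feed_url, fetched_twts, now):
    """Replace the cached entry for one feed after a periodic fetch."""
    cache[feed_url] = prune(fetched_twts, now)
    return cache
```

Views would then be rendered from `cache` alone, so anything older than the TTL or beyond the item cap simply never shows up.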

@mutefall@twtxt.net pagination is also kind of tricky to do in the first place, because the entire feed has to be parsed, loaded into memory, then paginated. it’s terribly inefficient. one could argue you could use a giant SQL database and come up with some kind of schema, but I don’t think that’s really the point, nor is it desirable, for many reasons.
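
A rough Python sketch of why this is costly: to return even a single page, every line of the feed is parsed and sorted first. It assumes the usual twtxt line shape (timestamp, a tab, then the text, with `#` comment lines); `parse_feed` and `page` are hypothetical names, not yarnd functions.

```python
from datetime import datetime


def parse_feed(raw):
    """Parse every line of a twtxt feed into (timestamp, text) pairs."""
    twts = []
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/metadata lines
        stamp, sep, text = line.partition("\t")
        if not sep:
            continue  # not a "timestamp<TAB>text" line
        created = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
        twts.append((created, text))
    return twts


def page(raw, number, per_page=25):
    """Return one page of twts, newest first; note the whole feed is parsed."""
    twts = sorted(parse_feed(raw), key=lambda t: t[0], reverse=True)
    start = (number - 1) * per_page
    return twts[start:start + per_page]
```

Asking for page 40 of a large feed does exactly as much parsing as asking for page 1, which is the inefficiency described above.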

I also totally get what you’re saying about a twtxt file potentially growing to be huge. I guess that, and the fact that it’s necessary to work around it with a significant caching architecture, is a major downside to the model of twtxt itself which I hadn’t considered.

@caesar@twtxt.net

but I’m a little puzzled why the same issues with a feed being huge don’t present an issue every time you want to poll for updates?

They do! As I said in af4el2q, Pods will refuse to fetch feeds over the --max-fetch-limit in size. Feeds are also rotated on Pods. There is also a spec for this.
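
As an illustration only, a size limit like this could be enforced while reading the response body. The thread does not show how yarnd actually does it; `read_limited` and `FeedTooLarge` are hypothetical names, and the stream can be any binary file-like object.

```python
import io


class FeedTooLarge(Exception):
    """Raised when a feed body exceeds the configured limit."""


def read_limited(stream, max_bytes):
    """Read a feed body, giving up once it is larger than max_bytes."""
    # Ask for one byte more than allowed: if we get it, the feed is too big.
    body = stream.read(max_bytes + 1)
    if len(body) > max_bytes:
        raise FeedTooLarge(f"feed exceeds {max_bytes} bytes")
    return body
```

Reading at most `max_bytes + 1` bytes means an oversized feed is rejected without being pulled into memory in full.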

@caesar@twtxt.net

Particularly with the apparent convention of the newest posts being at the bottom of the file.

This is generally the convention, yes. And folks like @lyse@lyse.isobeef.org @xuu@txt.sour.is @movq@www.uninformativ.de and I have considered and talked about formalizing the “direction” of a feed, including supporting “Range” requests. These are both things that I will likely do myself at some point, because it further helps with optimizing the traffic/bandwidth used and helps keep things running smoothly as the network scales over time.
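
To show why the feed "direction" matters for Range requests, here is a hedged Python sketch of what a client could do if feeds are append-only with the newest twts at the bottom. Nothing here is an agreed spec or yarnd behaviour; the function names are hypothetical, and only the standard HTTP `Range` header and the 200/206/416 status codes are assumed.

```python
def range_header(bytes_already_seen):
    """Ask only for the bytes appended since the last fetch."""
    return {"Range": f"bytes={bytes_already_seen}-"}


def merge(cached, status, body):
    """Combine a response with the locally cached copy of the feed."""
    if status == 206:
        # Partial Content: body holds only the new tail of the feed.
        return cached + body
    if status == 200:
        # Server ignored the Range header and sent the whole feed.
        return body
    if status == 416:
        # Range Not Satisfiable: nothing past our offset. A real client
        # would also have to detect a rotated (shrunken) feed here.
        return cached
    raise ValueError(f"unexpected status {status}")
```

If new twts were written at the top of the file instead, the byte offset of the old content would shift on every post and this trick would not work.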

@caesar@twtxt.net

As for pagination, sure, it can be hard, but why would it be harder in this case than in the cases where Yarn already does it?

It’s done as a background job. See this Dashboard for a visual:

(As for infinite scroll, if you have pagination on the server side already, it’s trivial on the client side. Yes you need JS of course, but not a lot)

Remember the builtin Web Interface (an SSR) is designed to be usable without JavaScript (graceful degradation).

@mutefall@twtxt.net

you’re reading from cache, so it’s quicker. memory will always have significantly faster iops vs disk-bound read operations. also recommend giving the codebase a look. there’s always room for contributors. i’m planning to take a crack at a few issues.

It’s even more than just “memory is faster than disk”. The Cache is designed to have O(1) lookups on all Profile (think Feed) and User Timeline as well as Pod Discover views. This is very important for the UX.
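
One way to picture O(1) lookups is to precompute each view when feeds are ingested, so serving a view is a single dictionary access. This Python sketch is an assumption about the general idea, not yarnd's actual cache layout; `build_views` and the `(timestamp, text)` tuples are hypothetical.

```python
def build_views(feeds, follows):
    """Precompute views at ingestion time.

    feeds:   {feed_url: [(timestamp, text), ...]}
    follows: {username: [feed_url, ...]}
    """
    newest_first = lambda twts: sorted(twts, reverse=True)
    return {
        # Profile view: one entry per feed.
        "profiles": {url: newest_first(twts) for url, twts in feeds.items()},
        # User Timeline: everything a user follows, merged once up front.
        "timelines": {
            user: newest_first(t for url in urls for t in feeds.get(url, []))
            for user, urls in follows.items()
        },
        # Discover view: every twt known to the pod.
        "discover": newest_first(t for twts in feeds.values() for t in twts),
    }
```

The sorting and merging cost is paid in the background job; a request for, say, `views["timelines"]["alice"]` is then a plain hash lookup regardless of how many feeds exist.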

There are very good technical reasons for this design, but there are also very good human reasons for it.

As my old man said to me many moons ago when I was first designing this (he helped and contributed ideas here!):

If I said something X ago, I don’t want someone to say “Hey but X ago you said this”. What if I’ve changed my mind since then and now have a different opinion?

I’m paraphrasing here of course, we talk regularly on the phone, but a lot of ideas and inspiration have come from my Dad 👌 – The idea here is that Humans forget, so should Yarn.social

One more thing @caesar@twtxt.net I forgot to add here is that the Cache Size and TTL are actually configurable at a Pod level via the -I, --max-cache-items and -C, --max-cache-ttl options, which default to 150 and 240h respectively. As you are a user on my pod at twtxt.net, these settings directly impact you. If you were to run your own pod (for example) you could choose to tweak these to your “taste”. @david@netbros.com for example runs his pod netbros.com with quite high Cache settings.

Sorry I’m late to the discussion, but as I see it, a big Redis cluster would solve that issue (Twitter uses it), plus a bit of JS for client-side pagination. BUT the ability to edit a post (impossible on Twitter) makes it hard to have a big Redis cluster.
The hardest part will mainly be for the command-line client app.

@mutefall@twtxt.net It’s pretty easy to delete or even edit a Twt you posted on Yarn.social 😂 – But it has unintended side-effects: due to the decentralised nature, you end up with UX problems where, for example, someone makes a Twt A, realizes they’ve made a typo or mistake or something, then edits it (which is equivalent to delete + repost) and posts a new Twt A’.

Dealing with this is hard™ But I have some ideas 😅

@mutefall@twtxt.net The ideas I have in mind to deal with this are basically to get good at “detecting edits” in the first place at ingestion time. I’ve played around with a few “text similarity” algorithms and I think we can reasonably (with high confidence) say that Twt A’ was an edit of Twt A – We would cache and archive them both, but in the User Interface collapse them and show the Twt A’ (with a visual indication/link that it was an edit of Twt A).
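
For a feel of what “detecting edits” could look like, here is a small Python sketch using the standard library's `difflib`. The thread does not say which similarity algorithms were tried, so this is a stand-in; `looks_like_edit` and the 0.8 threshold are assumptions for illustration.

```python
from difflib import SequenceMatcher


def looks_like_edit(old_text, new_text, threshold=0.8):
    """Guess whether new_text is an edited version of old_text.

    The 0.8 threshold is an assumed value; a real one would need measuring.
    """
    if old_text == new_text:
        return False  # identical text is a repost, not an edit
    similarity = SequenceMatcher(None, old_text, new_text).ratio()
    return similarity >= threshold
```

At ingestion time, a new Twt A’ from the same feed could be compared against that feed's recent twts, and a match would let the UI collapse the pair while both stay in the cache and archive.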

@mutefall@twtxt.net Re RFC 3339 timestamps, if I understand you correctly, I think it’s extremely unlikely for someone to repost a Twt (an edit) within the same second (at least not humanly possible). In any case, I’ve only validated the ideas so far in isolation; the algorithm(s) need to be built, feature gated, measured, understood and finally put in place with some UX (I like @ullarah@txt.quisquiliae.com’s suggestion).
