@mckinley@twtxt.net here is the dump https://github.com/tkanos/we-are-twtxt (in the tarball all-twtxt.tar.xz)
@tkanos@twtxt.net Thank you very much. I thought a collection of every twtxt feed would weigh more than 14 MiB uncompressed.
Here are the top ten feeds by size. @prologic@twtxt.net is artificially low on the list because it’s separated into chunks, and @movq@www.uninformativ.de is listed twice: once as www.uninformativ.de and once as uninformativ.de. I blame yarnd.
du -b * | sort -nr | head -n 10
5253921 www.lord-enki.net.txt
842733 cnbeta-com-rssding-yue.txt
755925 search.twtxt.net.txt
654717 prologic.txt
394380 jlj.txt
371632 assets.txt
246520 off_grid_living.txt
243953 mckinley.txt
225256 www.uninformativ.de.txt
225256 uninformativ.de.txt
cnbeta-com-rssding-yue.txt seems to be a syndication feed for https://cnbeta.com/ in twtxt format, assets.txt is @maya@maya.land, and the rest are fairly self-explanatory.
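The two uninformativ.de entries have exactly the same byte count, which suggests they are the same feed stored twice. A minimal sketch for checking that, assuming the tarball has been extracted into a directory of .txt files; the function name find_duplicate_feeds is made up for illustration and is not part of the we-are-twtxt repository:

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicate_feeds(directory):
    """Group the .txt files in `directory` by the SHA-256 digest of their contents.

    Returns a list of groups; each group is a list of two or more file
    names whose contents are byte-for-byte identical.
    """
    by_digest = defaultdict(list)
    # Sort so the file names inside each group come out in a stable order.
    for path in sorted(Path(directory).glob("*.txt")):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        by_digest[digest].append(path.name)
    # Keep only digests shared by more than one file.
    return [names for names in by_digest.values() if len(names) > 1]


if __name__ == "__main__":
    # Assumption: run from inside the extracted dump directory.
    for group in find_duplicate_feeds("."):
        print(" == ".join(group))
```

Comparing hashes rather than sizes avoids false positives from two unrelated feeds that happen to have the same length.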
Ah, cnbeta-com-rssding-yue.txt is hosted on feeds.twtxt.cc which isn’t blacklisted like feeds.twtxt.net is.
@mckinley@twtxt.net my bad, I’ve blacklisted it in the code for next time (github-commit-fix)
Actually I should blacklist search as well.
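A host-based blacklist like the one discussed here could look roughly like the sketch below. This is not the code from the we-are-twtxt repository; the names BLACKLISTED_HOSTS and is_blacklisted are hypothetical, and the host list is only assembled from the hosts mentioned in this thread (search.twtxt.net is assumed from the file name search.twtxt.net.txt).

```python
from urllib.parse import urlsplit

# Hypothetical blacklist built from the hosts named in this thread.
BLACKLISTED_HOSTS = {"feeds.twtxt.net", "feeds.twtxt.cc", "search.twtxt.net"}


def is_blacklisted(feed_url, blacklist=BLACKLISTED_HOSTS):
    """Return True if the feed URL's host is in the blacklist."""
    # urlsplit lowercases the hostname; it is None when the URL has no host part.
    host = urlsplit(feed_url).hostname
    return host is not None and host in blacklist
```

Matching on the host rather than on the feed name catches every feed served by a syndication or search service at once.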
blacklisted because those are not real people? 🤔
@prologic@twtxt.net yep. In that experiment I was only interested in real people who talk with others and whom I can follow.
@tkanos@twtxt.net Gotcha 👌