@prologic@twtxt.net I’m taking a look at the yarn.social search codebase. Just created a tiny PR to add some documentation around setup. I was thinking that I might start with “Document how to use the query language really well” since that would be a good learning exercise. It looks like maybe https://search.twtxt.net/help would be a good place to put that work. Were you wanting to put any search help text anywhere else in the UI?
@brasshopper@twtxt.net Oh! 🤦‍♂️ That was you 😅 I should have known 🤣 – The biggest thing I want to see improved as well is the memory usage of this thing, it keeps triggering “memory pressure” alarms and it’s kind of annoying, not really sure what to do about it yet…
@brasshopper@twtxt.net Yup a /help endpoint sounds good to me 👌
Also note that whatever improvements you make here in the global search engine, I’d like to bring across to yarnd itself – So let’s build it like a library as well as a search engine and crawler (server) 🙏
@prologic@twtxt.net Yeah, I figured I’d finally dox myself (my git.mills.io profile). 😉
re memory usage: is there some memory target you’re trying to hit? I would suspect that it partly depends on the data set, but maybe that plus search stats could give me enough to figure out a good set of test data.
@brasshopper@twtxt.net So right now the code uses bleve to index documents (feeds and twts here), and bleve is also used as the query engine. It has this thing called MemoryNeededForSearchResult that might be useful here.
Alternatively there is another good Go library for indexing and searching called bluge that I’ve been following, it might be worthwhile looking into…
Also to give you an idea… This pile of crappy ass code I whacked together on a weekend I think could be a lot more efficient 😅 The Twtxt search space is really not that large to warrant this kind of resource utilisation:
– I don’t mind the daily CPU spikes – But even that could be improved I think.