↳
In-reply-to
»
I played around with parsers. This time I experimented with parser combinators for twt message text tokenization. Basically, extract mentions, subjects, URLs, media and regular text. It's kinda nice, although my solution is not completely elegant, I have to say. Especially my communication protocol between different steps for intermediate results is really ugly. Not sure about performance, I reckon a hand-written state machine parser would be quite a bit faster. I need to write a second parser and then benchmark them.
⤋ Read More
making a note here to check this out.