tl;dr: Comments search works again. It is much, much faster. Also, search keywords for comments now only match whole words.
This all began forty-eight hours ago. The WPD server was down, I'd already broken out the bourbon (as is customary when your server is down), and someone was trying to search for comments containing the word `china`. Comments search was always slow because we used to do an exhaustive search through the full text of all 2.2 million comments on the site. But, man, `china` was even slower than slow: it was grinding the site to a halt for minutes at a time. We don't normally log search queries, but `china` was so slow it was crashing the server, which showed up in the crash error message.
It was 2 AM, we were busy dealing with WPD, and I wasn't exactly sober enough to debug, so we disabled comments search until we had time to look at it. The next day, a dozen of you let us know comment search wasn't working. A dozen-minus-one of you didn't scroll in the bugs thread to notice that's what everyone else was reporting and we'd already explained why.
Anyway, it's re-enabled, and it's a lot faster. `china` takes half a second, not two minutes. This comes with some minor changes in functionality. First, word substring searches don't work on comments now: `carp` only finds the exact word `carp` (or `Carp`), not `carpathianflorist` or `escarpment` (this probably breaks nwordcountbot; sorry @geese_suck). Also, I have no idea what `"exact search"` syntax with `"`s does any more; probably just guarantees you get zero results. Report weird search results here and we'll iron it out soon. Just wanted to get comment search back online.
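For anyone curious why whole-word matching and speed tend to go together, here is a minimal Python sketch. It is not the site's actual code, which isn't shown in this post. It contrasts an exhaustive substring scan with a lookup in a word index. The function names (`tokenize`, `build_index`, `search_substring`, `search_word`) and the sample comments are made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical tokenizer: a "word" is a run of letters, digits, or underscores.
WORD_RE = re.compile(r"[a-z0-9_]+")


def tokenize(text):
    """Lowercase the text and split it into whole words."""
    return WORD_RE.findall(text.lower())


def build_index(comments):
    """Map each whole word to the set of comment ids that contain it."""
    index = defaultdict(set)
    for comment_id, body in comments.items():
        for word in tokenize(body):
            index[word].add(comment_id)
    return index


def search_substring(comments, needle):
    """Exhaustive scan: read every comment body and test for a substring."""
    needle = needle.lower()
    return {cid for cid, body in comments.items() if needle in body.lower()}


def search_word(index, word):
    """Index lookup: exact whole-word matches only, case-insensitive."""
    return set(index.get(word.lower(), set()))


# Made-up sample data.
comments = {
    1: "Carp is a fish",
    2: "ask carpathianflorist about it",
    3: "standing on the escarpment",
}
index = build_index(comments)

print(search_substring(comments, "carp"))  # all three comments
print(search_word(index, "carp"))          # only comment 1
```

With this toy data the substring scan for `carp` returns all three comments, while the index lookup returns only comment 1. The lookup costs one dictionary access instead of a pass over every comment body, which is the general shape of the trade-off described above.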
by the way, we still love you for reporting it as many times as you did
even though it had already been reported thirty times first <3
I reported several times pls give bug report badges
There are seriously users here who need to search for and reread a comment made by another r-slurred user months ago? Why? Do you really need to see that Cirno post again from 6 months ago?
I used marseysearch trying to find a post I made on reddit years ago. Turns out I imagined making that post, but I found that I'd generated so much content that I've already forgotten. I can just peddle it as being new here as long as nobody checks the expiration date. It'll be like cleaning out my refrigerator by unloading everything from 2019 onto a stupid neighbor.
That's just being a lazy dramatard imho. I don't want post dated expired drama the lolcows have long moved on or died from AIDs by now.
i sit down and re-read all of my comments all of the time. i like to look at @SERGE's greatest hits to waste a few hours away.
You can just visit your profile page, nerd.
I also have a list of my favourite snappy quotes and i like to see what replies they get when their turn comes around - my favourite one is this:
anyway im bopping rn:
ILL SEE YOU WHEN YOU GET THERE - IF YOU EVER GET THERE
Ma'am we've been over this before. You need to stop.
I used it to find the "escape from Predditor Mansion" art post so I could post it on PCM, and then the "Marsey in her room" and "Marsey the country cat girl" posts when they asked if there were any similar pictures.
I used it to find a post about Japanese tweets I didn't get to finish reading the other day
@Transgender_spez
I don't know anything about search algorithms but I'm shocked it took this long for this to become a problem.
Apparently some people didn't have a father who would beat them at age 9 if they failed to optimize their database indexes.
avoids thinking about a 60 GB unoptimized SQLite db that has random json fields strewn about that I have
#justtransgirlthings
yeah it's super dumb but i do code things sometimes
I'll let LLM know
He already knows
`marseyschizos` finds comments
`marseyschizosa` or `marseyschizosal` finds nothing
it's not a length thing, `marseyace` finds nothing
Actual footage of @TwoLargeSnakesMating while he was making this post.
search doesn't really work? searching for marseybeanquestion gives no results for example
Should be fixed now. There was a minor issue with long words and composite words like marsey names. (cc: @grizzly, same bug you reported)
it looks like it works modulo english affixes, so `schizos` also yields `schizo` and vice versa
Postgres has this neat way of vectorizing text that's language-aware. On one hand, I'm not sure the lexeme analyzer knows how to really parse some agglutination like `marseyschizogetogetolove`, but I think it can at least do it consistently. … Also means I don't need to set up a real search service. I worked with Solr once years ago, and I hope it can be many more years before I have to touch it again.
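To illustrate the lexeme idea in plain terms, here is a toy Python sketch. It is a stand-in built on assumptions, not Postgres's actual analyzer: the hypothetical `toy_stem` strips a few English suffixes, and both the query and the comment text go through the same normalization before comparison. The point is the consistency mentioned above: a made-up compound word is stored and queried in the same normalized form, even if that form is not linguistically meaningful.

```python
# Hypothetical suffix list; a real stemmer is far more careful than this.
SUFFIXES = ("ing", "ed", "s")


def toy_stem(word):
    """Lowercase a word and strip one common English suffix, if present."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def to_lexemes(text):
    """Normalize every whitespace-separated word to its toy lexeme."""
    return {toy_stem(word) for word in text.split()}


def matches(query, comment):
    """True when every query lexeme also appears among the comment's lexemes."""
    return to_lexemes(query) <= to_lexemes(comment)


print(matches("schizos", "what a schizo post"))       # plural query, singular text
print(matches("schizo", "the schizos are posting"))   # singular query, plural text
print(matches("marseyschizosa", "marseyschizos"))     # extra letter, different lexeme
```

Under these assumptions the first two calls match and the third does not, which lines up with the behavior reported earlier in the thread: affix variants find each other, but an arbitrary extra letter produces a different lexeme.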
Shit is useless for people who know what they're looking for, I hate it
Neither does searching for "the" (without quotes)
Ah, yeah, that was another casualty of the fix. Insignificant words like "the" and "a" get optimized out now.
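A rough Python sketch of what "optimized out" means for a query. The stop-word list here is a small hypothetical subset chosen for illustration, and `query_terms` is a made-up helper, not the site's code.

```python
# Hypothetical subset of English stop words, for illustration only.
STOP_WORDS = {"the", "a", "an", "and", "of"}


def query_terms(query):
    """Lowercase the query and drop stop words before searching."""
    return [word for word in query.lower().split() if word not in STOP_WORDS]


print(query_terms("the"))         # nothing left to search for
print(query_terms("The Batman"))  # same terms as searching "Batman"
print(query_terms("Batman"))
```

A query made up only of stop words ends up with no terms at all, so there is nothing left to match against the index.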
But how can I distinguish between "Batman" and "The Batman"?????
Ok!
Does this mean the jannies can see what we're searching?
you're good king
I think that's acceptable here. I only need this capability when I'm searching for pirated stuff on P2P networks where they've got weird naming conventions or when I'm doing some kind of repetitive data entry where I only need to type in 3 characters for the computer to know what I'm saying.
thank you king
This makes it much harder to dig up months old comments that I vaguely remember, but anything in the name of
Oh boy can’t wait to see if the minetest server is back up
I swear it will be back this weekend
hope so I need my minecraft fix and I'm not about to join some zoomer server
chinagate
good thing no one uses it lol
tl;dr search is still broken
You need to build a prefix table for all the words in your comments table so that the search will work correctly as well as fast. I'll tell you how for DC
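One possible reading of the prefix table suggestion, sketched in Python. This is an interpretation, not something the commenter or the site spelled out: every prefix of every indexed word (from a hypothetical minimum length of three characters) maps back to the full words that start with it. The names `build_prefix_table` and `lookup` are made up.

```python
from collections import defaultdict


def build_prefix_table(words, min_len=3):
    """Map each prefix of at least min_len characters to the words starting with it."""
    table = defaultdict(set)
    for word in words:
        word = word.lower()
        for end in range(min_len, len(word) + 1):
            table[word[:end]].add(word)
    return table


def lookup(table, prefix):
    """Return the indexed words that start with the given prefix."""
    return set(table.get(prefix.lower(), set()))


# Made-up vocabulary taken from the examples in the post.
table = build_prefix_table(["carp", "carpathianflorist", "escarpment"])

print(lookup(table, "carp"))    # words that begin with "carp"
print(lookup(table, "escarp"))  # words that begin with "escarp"
print(lookup(table, "scarp"))   # empty: "scarp" is not the start of any word
```

Note that a table like this only restores prefix matches. Searching `carp` would find `carpathianflorist` again but still not `escarpment`, since `carp` sits in the middle of that word, and the table grows with the total length of the vocabulary.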
cant you just use elasticsearch?
if it doesnt have substrings or any kind of stemming i dont think you get to sound too pleased with yourself
even postgres builtin full-text indexing gives you stemming and the like. are you using mysql or something lol