need help with "simps" and "simps for"

so yesterday I restored "simps" and "simps for" but made them only count post votes

the reason is that the queries that would include comment votes take way too long when they run on powerusers, which causes timeouts that slow down the whole site

here are 2 examples that use my user id


QUERY 1 (for my simps): 25 seconds

explain analyze SELECT commentvotes.user_id AS commentvotes_user_id, count(commentvotes.user_id) AS count_1
        FROM commentvotes JOIN comments ON comments.id = commentvotes.comment_id
        WHERE comments.ghost = false AND comments.is_banned = false AND comments.deleted_utc = 0
          AND commentvotes.vote_type = 1 AND comments.author_id = 1
        GROUP BY commentvotes.user_id
        ORDER BY count(commentvotes.user_id) DESC;
                                                                                 QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=101326.43..101331.71 rows=2111 width=12) (actual time=25292.671..25371.754 rows=1844 loops=1)
   Sort Key: (count(commentvotes.user_id)) DESC
   Sort Method: quicksort  Memory: 135kB
   ->  Finalize GroupAggregate  (cost=100675.05..101209.87 rows=2111 width=12) (actual time=25266.703..25370.338 rows=1844 loops=1)
         Group Key: commentvotes.user_id
         ->  Gather Merge  (cost=100675.05..101167.65 rows=4222 width=12) (actual time=25266.210..25368.234 rows=4089 loops=1)
               Workers Planned: 2
               Workers Launched: 2
               ->  Sort  (cost=99675.02..99680.30 rows=2111 width=12) (actual time=25174.301..25174.732 rows=1363 loops=3)
                     Sort Key: commentvotes.user_id
                     Sort Method: quicksort  Memory: 113kB
                     Worker 0:  Sort Method: quicksort  Memory: 113kB
                     Worker 1:  Sort Method: quicksort  Memory: 111kB
                     ->  Partial HashAggregate  (cost=99537.35..99558.46 rows=2111 width=12) (actual time=25172.482..25173.130 rows=1363 loops=3)
                           Group Key: commentvotes.user_id
                           Batches: 1  Memory Usage: 241kB
                           Worker 0:  Batches: 1  Memory Usage: 241kB
                           Worker 1:  Batches: 1  Memory Usage: 241kB
                           ->  Nested Loop  (cost=119.48..99291.83 rows=49104 width=4) (actual time=70.552..25064.475 rows=25093 loops=3)
                                 ->  Parallel Bitmap Heap Scan on comments  (cost=119.04..12311.25 rows=4943 width=4) (actual time=57.412..246.591 rows=4380 loops=3)
                                       Recheck Cond: (author_id = 1)
                                       Filter: ((NOT ghost) AND (NOT is_banned) AND (deleted_utc = 0))
                                       Rows Removed by Filter: 263
                                       Heap Blocks: exact=3882
                                       ->  Bitmap Index Scan on comments_user_index  (cost=0.00..116.08 rows=12353 width=0) (actual time=36.276..36.277 rows=14518 loops=1)
                                             Index Cond: (author_id = 1)
                                 ->  Index Scan using commentvotes_pkey on commentvotes  (cost=0.44..17.28 rows=32 width=8) (actual time=1.447..5.654 rows=6 loops=13141)
                                       Index Cond: (comment_id = comments.id)
                                       Filter: (vote_type = 1)
                                       Rows Removed by Filter: 0
 Planning Time: 0.764 ms
 JIT:
   Functions: 60
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 19.575 ms, Inlining 0.000 ms, Optimization 6.942 ms, Emission 122.562 ms, Total 149.079 ms
 Execution Time: 25453.921 ms
(36 rows)






QUERY 2 (for the ppl i simp for): 13 seconds

explain analyze SELECT comments.author_id AS comments_author_id, count(comments.author_id) AS count_1
        FROM comments JOIN commentvotes ON comments.id = commentvotes.comment_id
        WHERE comments.ghost = false AND comments.is_banned = false AND comments.deleted_utc = 0
          AND commentvotes.vote_type = 1 AND commentvotes.user_id = 1
        GROUP BY comments.author_id
        ORDER BY count(comments.author_id) DESC;

                                                                                         QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=234059.08..234064.81 rows=2293 width=12) (actual time=13511.566..13632.448 rows=3074 loops=1)
   Sort Key: (count(comments.author_id)) DESC
   Sort Method: quicksort  Memory: 241kB
   ->  Finalize GroupAggregate  (cost=233350.16..233931.10 rows=2293 width=12) (actual time=13502.703..13630.725 rows=3074 loops=1)
         Group Key: comments.author_id
         ->  Gather Merge  (cost=233350.16..233885.24 rows=4586 width=12) (actual time=13502.611..13628.135 rows=6597 loops=1)
               Workers Planned: 2
               Workers Launched: 2
               ->  Sort  (cost=232350.14..232355.87 rows=2293 width=12) (actual time=13404.805..13405.287 rows=2199 loops=3)
                     Sort Key: comments.author_id
                     Sort Method: quicksort  Memory: 203kB
                     Worker 0:  Sort Method: quicksort  Memory: 199kB
                     Worker 1:  Sort Method: quicksort  Memory: 197kB
                     ->  Partial HashAggregate  (cost=232199.23..232222.16 rows=2293 width=12) (actual time=13398.724..13399.556 rows=2199 loops=3)
                           Group Key: comments.author_id
                           Batches: 1  Memory Usage: 369kB
                           Worker 0:  Batches: 1  Memory Usage: 369kB
                           Worker 1:  Batches: 1  Memory Usage: 369kB
                           ->  Nested Loop  (cost=0.87..231821.34 rows=75577 width=4) (actual time=81.541..13275.534 rows=52615 loops=3)
                                 ->  Parallel Index Scan using cvote_user_index on commentvotes  (cost=0.44..129890.24 rows=78687 width=4) (actual time=64.216..9154.821 rows=55000 loops=3)
                                       Index Cond: (user_id = 1)
                                       Filter: (vote_type = 1)
                                       Rows Removed by Filter: 90
                                 ->  Index Scan using comments_pkey on comments  (cost=0.43..1.30 rows=1 width=8) (actual time=0.072..0.072 rows=1 loops=165000)
                                       Index Cond: (id = commentvotes.comment_id)
                                       Filter: ((NOT ghost) AND (NOT is_banned) AND (deleted_utc = 0))
                                       Rows Removed by Filter: 0
 Planning Time: 0.721 ms
 JIT:
   Functions: 60
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 8.837 ms, Inlining 0.000 ms, Optimization 11.450 ms, Emission 167.933 ms, Total 188.220 ms
 Execution Time: 13635.714 ms
(33 rows)

so i need help: how do I make them run faster, or what indexes do I make? i didn't actually read the explain analyze outputs I posted here cuz im too lazy and need to take a shower rn, so the answer might be pretty obvious (correct answer gets 50k marseybuckerinos and a code contributor badge)


Have you tried making them stored procs or precompiled views?


In awe at the level of 'do it for free' that's been attained here


> In awe at the level of 'do it for free' that's been attained here

I have a theory that half of the local userbase is slightly on the spectrum; it explains the fascination with trains, too.


Only half? :marseyautism:


It turns out that rdrama actually became the only viable autonomous collective just because we help each other spit outwardly at wingcucks.


Just code better?

:#marseyshrug::#marseysmug2:


console.log('Make faster');


My man. My neighbor. Stick them both in temp tables so you're not fricking around with the entire dataset before your where clause.


reorder columns to retvrn the 13.52% wasted padding bytes :marseycapymerchant: :marseycapymerchant: :marseycapymerchant: :marseymerchantsoy: :chadfixedkingcapy: (if you skimped out :marseycapymerchant: on ram)

    CREATE TABLE public.commentvotes (
        comment_id integer NOT NULL,
        user_id integer NOT NULL,
        created_utc integer NOT NULL,
        app_id smallint,
        coins smallint DEFAULT 1 NOT NULL,
        vote_type char NOT NULL,
        "real" boolean DEFAULT true NOT NULL
    );

u trying my index suggestions?


This is a FAKE grizzly, do not be fooled by this menace

![](/images/16749383190339346.webp)


:#capysneedboat2::#capyantischizo::#space:


Reading his comments in this thread made me sick :marseyill: :marseysick: :marseypuke:


Replicate all the votes data into a graph database (you'll have to install another DBMS) and run the query on it. Then you would have to query for the number of all incoming edges userA-[:vote]->userB

:#marseyantischizo:


@BARD_BOT Why do you hate Science?


I hecking love science!

I hate meme shit like graph databases as a replacement for bad relational database design


Run the query and make a networkx graph and train a link prediction model on it then pickle that model and put it in the database then for new queries make a new networkx graph and run the model on it :marseycapygigabrain:

Then sell this workflow to tech companies as "the machine learning solution to social media", but put it on a blockchain first


Have you betrayed your SVM faith?


my friend here says he doesn't know what the simps thing means


Really should not take that long for the amount of data shown in the query plan. A quick look at the schema suggests you need more indexes, composite indexes, to cover this query. It's relying on the comments.author_id index, and looping for the rest.

Fact table would work as people suggest, but a lot of overhead to achieve something that could be fixed with better DB architecture.
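
For concreteness, a minimal sketch of what covering composite indexes for these two queries might look like, going only by the plans above (untested, and assuming nothing else relies on the existing single-column indexes):

    -- Query 1: join on comment_id, filter vote_type, output user_id
    CREATE INDEX commentvotes_comment_vote_user_idx
        ON commentvotes (comment_id, vote_type, user_id);

    -- Query 2: probe by user_id, filter vote_type, output comment_id
    CREATE INDEX commentvotes_user_vote_comment_idx
        ON commentvotes (user_id, vote_type, comment_id);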


This is the right answer. If any query is too slow the answer is always to add indexes until you don't see any more table scans


what indexes would u make


My test data is not very valid (hard to tell without actual data), but I tried inserting hundreds of thousands of comments with about 70 votes each from random users.

-- See if this new index alone fixes issue


CREATE INDEX commentvotes_commentid_userid_votetype_idx ON public.commentvotes USING btree(comment_id ASC, user_id ASC, vote_type ASC);


-- Alternative way to write query, but didn't see a difference in query plan with my sample data


SELECT
    cv.user_id
    ,count(cv.comment_id) votes
FROM public.commentvotes cv
WHERE 
    cv.vote_type = 1
AND cv.comment_id IN (
    SELECT id 
    FROM public.comments
    WHERE author_id = 1
        AND deleted_utc = 0
        AND is_banned = false
        AND ghost = false
)
GROUP BY cv.user_id
ORDER BY count(cv.comment_id) DESC;

> CREATE INDEX commentvotes_commentid_userid_votetype_idx ON public.commentvotes USING btree(comment_id ASC, user_id ASC, vote_type ASC)

i really idk what i'm talking about,

but doesn't this create a btree entry for every permutation of (comment_id, user_id, vote_type)?


I'll take a stab at it later today. Been a few years since I've used Postgres.


Sorting 2293 rows takes almost 14 seconds? Neighbor wat??? A cpu from the 90s could do better than that.

EDIT: Oh wait im looking at the query plan wrong. Bruh have you ever considered sharding?


average python dev :marseycapyshy:


Give me your money and I'll annoy people with it :space: https://i.rdrama.net/images/16965516366194396.webp


Can't help you with the code, apart from saying maybe make the calculations less frequent rather than real-time?

I can however reintroduce the latest rendition of my "common sense" proportional system which Snakes was considering before he resigned as janny! :marseygiveup:

It would be interesting to see the data as a proportion of total votes made by the person giving the votes. For example if I've upvoted someone (say LLM) a thousand times, that sounds like a big number, but if I've given away a million upvotes to all users in total, then 1000 upmarseys is barely anything. I'm hardly a simp. On the other hand, if I've given LLM a hundred upvotes, but have only given out 120 upvotes to everyone in total, that's very significant, and makes me a giant simp for LLM. The "haters" section is similar but with downvotes of course. That would really spice things up as you might imagine.

That's "Simps" and "Haters" covered. As for the "Simps for", and "Hates" section, well, you're apparently the third biggest user I simp for. But you do post more than most people, so perhaps your comments have caught my eye and made me upvote you more simply because of that, so we should divide the score by the total number of comments (instead of votes like the previous paragraph) you made for a fairer comparison.

To recap and to simplify things even further, the maths is as simple as just saying (with my profile as an example):

Simps2: n% of OTHERUSER's total upvotes are for my comments.

Haters2: n% of OTHERUSER's total downvotes are for my comments.

Simps for2: I upvote n% of OTHERUSER's total comments.

Hates2: I downvote n% of OTHERUSER's total comments.
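
A rough sketch of what "Simps2" could look like in SQL, assuming the OP's tables, upvotes stored as vote_type = 1, and leaving out the ghost/is_banned/deleted filters for brevity:

    -- % of each voter's total upvotes that went to author 1's comments
    WITH voter_totals AS (
        SELECT user_id, count(*) AS total_upvotes
        FROM commentvotes
        WHERE vote_type = 1
        GROUP BY user_id
    )
    SELECT cv.user_id,
           100.0 * count(*) / t.total_upvotes AS pct_of_their_upvotes
    FROM commentvotes cv
    JOIN comments c ON c.id = cv.comment_id
    JOIN voter_totals t ON t.user_id = cv.user_id
    WHERE cv.vote_type = 1
      AND c.author_id = 1
    GROUP BY cv.user_id, t.total_upvotes
    ORDER BY pct_of_their_upvotes DESC;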


![](/images/1669576960061573.webp)


Have you owned the libs yet?


Ages ago.


Needs more racial slurs

SELECT neighbor FROM straggot;


https://i.rdrama.net/images/17142302820498302.webp


How often are you querying it? Can't you just compute it when the Indians/Euros are active and store the value daily or something


?


im too stupid to help :3


!slots100


!slots100


!slots100


!slots100


Learn to code


Reject :marseytunaktunak:, embrace :marseymerchant:

Make viewing numbers of simps free, like how it used to be

Make viewing the specific comments and posts that were simped for require all-seeing eye or a new profile upgrade

Also bring back old search, I would unironically pay dramacoin per search to use old search again; the new one is very unneedful


:#capysneedboat2::#capyantischizo::#space:


> Also bring back old search, I would unironically pay dramacoin per search to use old search again; the new one is very unneedful

u mean comment search?


:#chadyescapy:


:#capysneedboat2::#capyantischizo::#space:


It is obvious that the code class runs into a server error when trying to parse the matrix operations applied to global variables. I would suggest that you implement a new library script that would unlock and boot the mainframe before docking the executed commands to the automation subroutines. Additionally you could disable the back-end index inhibitor arrays and reroute the computational power to overclock the server CPU ticks

Child's play really


lol


Shut it nerd.


Switch to rust


:#marseytransflag:


>using sql in 2023

NoSQL bros ww@


What's up!


you could try changing the index scans to index-only scans (no access to the table) by adding indexes like (vote_type, comment_id) and (vote_type, user_id). or try the other way around, (comment_id, vote_type) and (user_id, vote_type); then you could probably delete the existing superfluous user_id index on commentvotes. i'm curious how much that would change
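
A sketch of the second variant; adding INCLUDE (Postgres 11+) puts the column each query outputs into the index as well, which is what actually makes the scan index-only (untested):

    CREATE INDEX commentvotes_comment_vote_idx
        ON commentvotes (comment_id, vote_type) INCLUDE (user_id);
    CREATE INDEX commentvotes_user_vote_idx
        ON commentvotes (user_id, vote_type) INCLUDE (comment_id);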


We actually discussed it on groomercord a couple of days ago; it looks like we can't really speed up the counting, so we need a table for totals and to update it using a trigger (a rare time when it's justified).
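
A minimal sketch of that totals-table-plus-trigger idea, with made-up names, counting upvotes only (unvotes and deletions would need their own handling, and EXECUTE FUNCTION needs Postgres 11+):

    CREATE TABLE simp_counts (
        voter_id  integer NOT NULL,
        author_id integer NOT NULL,
        upvotes   integer NOT NULL DEFAULT 0,
        PRIMARY KEY (voter_id, author_id)
    );

    CREATE FUNCTION bump_simp_count() RETURNS trigger AS $$
    BEGIN
        IF NEW.vote_type = 1 THEN
            INSERT INTO simp_counts (voter_id, author_id, upvotes)
            SELECT NEW.user_id, c.author_id, 1
            FROM comments c
            WHERE c.id = NEW.comment_id
            ON CONFLICT (voter_id, author_id)
            DO UPDATE SET upvotes = simp_counts.upvotes + 1;
        END IF;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER commentvotes_simp_count
        AFTER INSERT ON commentvotes
        FOR EACH ROW EXECUTE FUNCTION bump_simp_count();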


Storing events and updating the numbers periodically can be an alternative if handling these updates online (i.e. when a comment is published) proves to be too cumbersome.


> (a rare time when it's justified).

I swear if you racist fricks put the site down for maintenance at some hour when I am most commonly shitposting, I will ask for my simp numbers all day


You sound triggered

:#marseysting:


timezone lives matter


I think the other users are right and it's time for a non-live fact table. You could probably add some indexes to speed things up, but the ultimate cost of that is going to be slower and slower writes to the comments and commentvotes tables.

You could populate an aggregated fact table during quiet time with:

user_id, user_id_fk, upvote_count, downvote_count; refresh it weekly or daily, whatever, and run the queries against that.
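
One possible shape for that, as a materialized view (names made up, and assuming downvotes are stored as vote_type = -1):

    CREATE MATERIALIZED VIEW simp_facts AS
    SELECT cv.user_id,
           c.author_id AS user_id_fk,
           count(*) FILTER (WHERE cv.vote_type = 1)  AS upvote_count,
           count(*) FILTER (WHERE cv.vote_type = -1) AS downvote_count
    FROM commentvotes cv
    JOIN comments c ON c.id = cv.comment_id
    WHERE NOT c.ghost AND NOT c.is_banned AND c.deleted_utc = 0
    GROUP BY cv.user_id, c.author_id;

    -- CONCURRENTLY needs a unique index to work against
    CREATE UNIQUE INDEX simp_facts_uq ON simp_facts (user_id, user_id_fk);

    -- the daily/weekly cron job is then just:
    REFRESH MATERIALIZED VIEW CONCURRENTLY simp_facts;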


This user collects creepshots he takes of obese people in public


You must be fat if you still have a problem with that after all this time. We already established that I feel no shame in photographing the surprisingly obese. The problem is yours, not mine.


same cope even after I posted a picture of my body proving i'm thin


I've obviously made more of an impression on you than you have on me if you think I remember some literally who's body pics. Move on from this fixation you autist.


I agree with caching the calculated value like others have said.

Another thing you could do is denormalize by including the comment author ID in commentvotes. That would eliminate the join and allow you to index commentvotes by (user_id, author_id) and (author_id, user_id).
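
A sketch of that denormalization (the one-time backfill will be slow on a table this size, so batch it in production; and as noted further down, you'd still need the ghost/is_banned/deleted flags somewhere to match the current logic):

    ALTER TABLE commentvotes ADD COLUMN comment_author_id integer;

    -- one-time backfill
    UPDATE commentvotes cv
    SET comment_author_id = c.author_id
    FROM comments c
    WHERE c.id = cv.comment_id;

    CREATE INDEX commentvotes_voter_author_idx
        ON commentvotes (user_id, comment_author_id);
    CREATE INDEX commentvotes_author_voter_idx
        ON commentvotes (comment_author_id, user_id);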


@Deghoster discuss


Ok fair, you'd need to pull in a few other fields to match the current logic


> (cost=0.44..17.28 rows=32 width=8) (actual time=1.447..5.654 rows=6 loops=13141)

postgres assumes you get 32 upvotes on average but it's actually 6. you've got to make better comments :marseyitsover:


This doesn't need to be perfectly accurate or live. It's a good use case for some kind of key-value lookup cache or a daily/weekly batch query that calculates it for each user and then stores the result.


I think this is a good idea. I'm a sql brainlet :marseyretard2: but there doesn't seem to be much fat to trim in the request, it's just handling a lot of comments.

Once the table is built, updating it every week with a cron job that looks at the new comments of the week, as @keepmeposted said, would be less costly than starting from the beginning every time


Run a cron job once or twice a day to set values for every user.


Employ some indians to count manually


@Snappy @Abby discuss


:#capysneedboat2::#capyantischizo::#space:


or furries

![](https://media.giphy.com/media/a8E3O2Fr2yuRRpYsq1/giphy.webp)


Generate random users and vote counts if the query takes more than a couple of seconds; no one can verify the counts without DB access, so they can't deboonk it


I'm barely a codecel and not going to read all that shit, but why isn't simping handled as part of the upvoting/downvoting stuff?


Instead of running it every time a user clicks run it once per day and cache the result


Instead of checking every vote the user has made every time the function is called, store the result of the previous call and only check the votes which have been made since then.
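
A sketch of that incremental approach, reusing the hypothetical simp_counts table from above plus a made-up one-row watermark table keyed off commentvotes.created_utc (ignoring races and the empty-table edge case for brevity):

    CREATE TABLE simp_refresh_state (last_utc integer NOT NULL);

    -- each run: fold in only the votes cast since the last run
    INSERT INTO simp_counts (voter_id, author_id, upvotes)
    SELECT cv.user_id, c.author_id, count(*)
    FROM commentvotes cv
    JOIN comments c ON c.id = cv.comment_id
    WHERE cv.vote_type = 1
      AND cv.created_utc > (SELECT last_utc FROM simp_refresh_state)
    GROUP BY cv.user_id, c.author_id
    ON CONFLICT (voter_id, author_id)
    DO UPDATE SET upvotes = simp_counts.upvotes + EXCLUDED.upvotes;

    UPDATE simp_refresh_state
    SET last_utc = (SELECT max(created_utc) FROM commentvotes);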


bout to sleep so didn't read but I'm sure this can be solved by hashing :marseyclueless:


You could aggregate the comment votes into a summary table and run that as a scheduled task every x hours. Or have insert / updates also update the summary table.


https://i.rdrama.net/images/17132860030441194.webp In the femme darkness of the far future, there is only rightoid seethe.


Not a codecel, but would limiting the scope to the last X comments or Y time improve it? Might be adding complexity but if the query only checked comments made since the last time that it checked, it would probably solve the power user problem.

Good duck :platygirlmagic: for good luck!


Index the table for power users. Update it every 2 hours or so.


lmao i was using the simps for panel a few days ago to check if i could make the server go unresponsive :marseytroublemaker:

easiest solution is to save the results and don't update them more than once a week.


make a separate commentvotes table for power users like @chiobu


ANALYZE commentvotes ?

The loop counts do not look healthy whatsoever.


> the reason is that the queries that would include comment votes take way too long when they run on powerusers, which causes timeouts that slow down the whole site

Ban powerusers :chadfixedkingcapy:


![](https://media.giphy.com/media/Rl9Yqavfj2Ula/giphy.webp)


@FormerLurker WONDER ABOUT THIS BECAUSE @FormerLurker LIKE TO FREQUENTLY REWARD SIMPS WITH DRAMACOIN. @FormerLurker GLAD TO BE ABLE TO DO THIS AGAIN.

@FormerLurker ALSO IMPRESSED TO LEARN EGYPTIANS SHOWER. @FormerLurker WAS UNDER IMPRESSION EGYPTOIDS TAKE SAND BATHS OR WASH IN NILE.

TRANS LIVES MATTER


@FormerLurker BEWILDERED TO LEARN @Vivec IS NUMBER ONE POST UPMARSEY SIMP. @FormerLurker NEVER HEARD OF THIS @Vivec. @FormerLurker MAKE 28 POSTS AND @Vivec UPVOTE 27 OF THEM. @FormerLurker KIND OF FRIGHTENED BY THIS BUT @FormerLurker GRATEFUL SO @FormerLurker SEND 300 MARSEY COINS TO @Vivec

TRANS LIVES MATTER


@FormerLurker SEE THAT @AraAra HAS ALSO UPMARSEYED 27 OF 28 OF @FormerLurker'S POSTS. SO @AraAra ALSO GET 300 COIN. WHO ARE THESE STRANGE LURKERS HAHA @FormerLurker NEVER LURK

TRANS LIVES MATTER


:#marseyretardchad:


> so the answer might be pretty obvious (correct answer gets 50k marseybuckerinos)

You need to transition

:marseyfemboy:

HRT and programmer thigh highs will deffo solve this :marseytrans2:


![](/images/16723086722507198.webp)


got dang snappy that's hard


:#marseyitsover:


https://i.rdrama.net/images/16841388463758194.webp

