Intermittent 'High Load', during low CPU and RAM usage

Question

I have a WordPress learning platform (LearnDash) site hosted on a Vultr HF 8-CPU server. It's overkill, and my site doesn't currently go near either the CPU or RAM limits. However, at times the site 'load' goes high and disk operations spike. The site still functions, but more slowly for some time.

I'm still investigating, but I believe it happens when a class of 30 or so students all sign up at the same time (using the WordPress plugin Uncanny Groups' enrol codes, where their accounts get created and then immediately assigned to LearnDash courses and groups).

Considering the high spec server, is this simultaneous sign-up scenario really going to max out the disk read/write? Or is it unlikely that it would cause a spike?

Surely it should be similar when, say, 200 users are all taking a quiz at the same time, given the constant reading from and writing to the DB, yet that happens without any site issues and my server handles it fine.

I imagine there are other sites out there that handle many simultaneous sign-ups without the site slowing down (temporary high load)?

The issue is that, if I'm correct, these students then immediately start exploring the site, so the load stays high for their first session and impacts others on the site for that time.

Normally my site can handle 100s of concurrent users without issue but it seems that a group signing up together is problematic.

I'm quite new to managing my own server, so please be kind. I would really appreciate it if someone would be willing to give a little advice on 1. whether simultaneous user sign-ups could be the issue and 2. how to mitigate it.

I have contacted both Vultr and my control panel provider, 'RunCloud', neither of whom was overly helpful.

Does this answer your question? Can you help me with my capacity planning? — Romeo Ninov, Commented Jan 13 at 20:29
Additional DB information request from your Vultr instance, please. RAM size, any SSD or NVME devices on MySQL Host server? Post TEXT data on justpaste.it and share the links. From your SSH login root, Text results of: A) SELECT COUNT(*) FROM information_schema.tables; B) SHOW GLOBAL STATUS; after minimum 24 hours UPTIME C) SHOW GLOBAL VARIABLES; D) SHOW FULL PROCESSLIST; E) STATUS; not SHOW STATUS, just STATUS; G) SHOW ENGINE INNODB STATUS; for server workload tuning analysis to provide suggestions. — Wilson Hauck, Commented Jan 14 at 13:57
@sw123456 Have you considered providing Additional DB information requested Jan 14, 2024? Suggestions to improve response time will be posted as an Answer, after analysis. — Wilson Hauck, Commented Mar 31 at 14:59

math · Accepted Answer · 2024-01-15 23:37:05Z

Remember what load is: the number of processes that are running on a CPU, or are ready to run but waiting for a resource, usually CPU and/or disk (on Linux, tasks blocked in uninterruptible I/O wait are counted too). Many people assume it's CPU, but waiting for disk is often the cause of load. (I sometimes wish load for CPU and disk was measured separately, but that's complex.)
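To get a rough feel for that split on a Linux box, here is a minimal Python sketch that reads /proc and counts how many processes are currently in state R (running or runnable) versus D (uninterruptible sleep, usually disk I/O). The helper names `task_state` and `count_states` are made up for illustration. It is only a snapshot and only looks at each process's main thread, so it will not match the load average exactly.

```python
import glob
import os
from collections import Counter


def task_state(stat_line):
    """Return the one-letter state from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces or
    # ')', so split on the LAST ')' before reading the state field.
    return stat_line.rsplit(")", 1)[1].split()[0]


def count_states():
    """Count process states; only meaningful on Linux with /proc mounted."""
    counts = Counter()
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as fh:
                counts[task_state(fh.read())] += 1
        except (OSError, ValueError, IndexError):
            continue  # process exited (or was unreadable) while we looked
    return counts


if __name__ == "__main__" and os.path.exists("/proc/loadavg"):
    with open("/proc/loadavg") as fh:
        load1 = float(fh.read().split()[0])  # first field: 1-minute load
    states = count_states()
    print(f"1-min load {load1:.2f} on {os.cpu_count()} CPUs")
    print(f"running/runnable (R): {states['R']}  blocked, usually I/O (D): {states['D']}")
```

If D regularly outnumbers R while the load is high, the bottleneck is more likely disk than CPU.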

Use vmstat 3 to see what the activity is - ensure you are not thrashing swap. (I don't use swap at all on my servers - it's a risk to allow swap to thrash or spend a long time paging in when trying to shutdown a process nicely - I'd rather 'fail-fast' than limp along with a server running unusably slow.)
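If you save some vmstat output during a spike, a small Python sketch like the one below can flag the rows worth a closer look. It assumes the usual procps column names (`si`, `so`, `wa`); the 20% I/O-wait threshold is an arbitrary assumption, and the sample numbers are invented for illustration, not taken from the poster's server.

```python
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 812340  40212 901124    0    0     4    20  310  520  3  1 95  1  0
 2  9  20480  15320   1024 120448  850 1200  9100 14200 2100 4800  6  4 30 60  0
"""


def parse_vmstat(text):
    """Turn vmstat output into a list of {column: int} dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("procs"):
            continue  # blank line or the decorative banner line
        if parts[0] == "r":
            header = parts  # vmstat repeats this header periodically
            continue
        if header and len(parts) == len(header):
            try:
                rows.append(dict(zip(header, map(int, parts))))
            except ValueError:
                continue  # not a plain numeric sample row
    return rows


def diagnose(row, wa_threshold=20):
    """Return a list of warnings for one sample row."""
    notes = []
    if row.get("si", 0) > 0 or row.get("so", 0) > 0:
        notes.append("swapping")  # pages moving in from / out to swap
    if row.get("wa", 0) >= wa_threshold:
        notes.append("waiting on I/O")  # CPU idle while I/O is outstanding
    return notes


if __name__ == "__main__":
    for row in parse_vmstat(SAMPLE):
        print(row["r"], row["b"], row["wa"], diagnose(row))
```

Replace `SAMPLE` with text captured from something like `vmstat 3 20 > /tmp/vmstat.log` while a class is signing up.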

Try using the iotop(1) utility to see if you can figure out what's using the disk. However, if it is many short-lived processes hammering the disk, you may not capture them in the act. Tying the event to activity in logs can also assist you.

There may be some pathologically poor multi-threading aspects to your plugin, depending on how the database is used. If it's mysql/mariadb, try echo "show full processlist;" | mysql | tee /tmp/somelogfile.log, for example, and investigate whether there are long-running queries that are hitting disk or even cpu. (I have found this is often to do with a lack of indexes on tables or poorly-constructed join clauses. You can use 'explain' on queries in mysql to see what they're up to. See articles on mysql performance for details.)
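Once you have a few of those processlist dumps, something like the following Python sketch can pull out the queries that have been running for a while. It assumes the file holds the tab-separated batch output the mysql client produces when piped, with the usual `Command`, `Time`, `State` and `Info` columns. The function name `busy_queries` and the 2-second cut-off are illustrative choices, not anything from mysql itself.

```python
import csv
import io
import sys


def busy_queries(tsv_text, min_seconds=2):
    """Return (seconds, state, query) for non-idle threads, slowest first."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    busy = []
    for row in reader:
        if row.get("Command") == "Sleep":
            continue  # idle connection, not doing any work
        try:
            seconds = int(row.get("Time") or 0)
        except ValueError:
            continue  # malformed or truncated line
        if seconds >= min_seconds:
            busy.append((seconds, row.get("State"), row.get("Info")))
    return sorted(busy, key=lambda item: item[0], reverse=True)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 busy.py /tmp/somelogfile.log
    with open(sys.argv[1], newline="") as fh:
        for seconds, state, query in busy_queries(fh.read()):
            print(f"{seconds:>5}s  {state}  {query}")
```

The queries that keep showing up here during a sign-up burst are the ones worth running through 'explain'.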

If you can survive data-loss for seconds in your transactions in case of a crash of mysql/the server, you can tune down mysql's default ACID-compliant settings (see DBA stackexchange for details) which may be overly aggressive for workloads that do not require it, causing pathological syncing of small writes.

Ensure you have your database on a fast SSD/nvme, not an HDD. Tune your caches in mysql (or any DB, postgres, etc you're using). Myriad articles describe how.

Software is responsible for the load; the platform/server itself is probably not the issue if it's fast for other uses.

Thank you so much for your fantastic response. Having looked at top -c I've narrowed the issue to mysql/mariadb and it is when multiple users (a class) are asked to sign up to the site, probably all together, via teacher instruction and it overloads the CPU with mysql queries. A few moments after (but longer than I would like) things settle but it takes time to process the backlog. I will take your excellent advice and try to see if I can optimise any problem queries. However, I also wonder whether these spikes may just be indicative of whole class simultaneous signups. All other behaviours ok — sw123456, Commented Jan 15 at 19:39
I had similar issues with the exact same type of scenario: a class (for a driving school framework) of students all finish their exams and log in to record their progress, which hammers the site until it becomes unusable. I got their coder to batch these writes into a single transaction in his code, waiting 5 seconds for any others to come in, up to a maximum of some # before committing. This reduced load drastically. While you probably don't want to rewrite someone's terrible plugin code, you might want to tweak their db setup with a CREATE INDEX ON statement or two. — math, Commented Jan 15 at 23:35
Oh, and as I say, use show full processlist; in mysql: it'll tell you what it's busy with. You can isolate specific problem queries. Also turn on / tune up your mysql-slow.log settings. — math, Commented Jan 16 at 3:40
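The batching idea from the comments above can be sketched roughly as follows. This is not the driving-school code or anything from the plugin. It is a minimal Python illustration that uses an in-memory SQLite database as a stand-in for MySQL, and the table, class, and parameter names (`progress`, `BatchWriter`, `max_batch`, `max_wait`) are all hypothetical.

```python
import sqlite3
import time


class BatchWriter:
    """Collect rows and commit them together instead of one commit per row."""

    def __init__(self, conn, max_batch=50, max_wait=5.0, clock=time.monotonic):
        self.conn = conn
        self.max_batch = max_batch  # commit once this many rows are pending
        self.max_wait = max_wait    # or once the oldest row is this old (s)
        self.clock = clock
        self.pending = []
        self.first_at = None
        self.commits = 0

    def add(self, user_id, progress):
        if not self.pending:
            self.first_at = self.clock()  # start the wait window
        self.pending.append((user_id, progress))
        self.flush_if_due()

    def flush_if_due(self):
        if not self.pending:
            return
        full = len(self.pending) >= self.max_batch
        stale = self.clock() - self.first_at >= self.max_wait
        if full or stale:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        with self.conn:  # one transaction, hence one commit for the batch
            self.conn.executemany(
                "INSERT INTO progress (user_id, progress) VALUES (?, ?)",
                self.pending,
            )
        self.pending.clear()
        self.first_at = None
        self.commits += 1


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE progress (user_id INTEGER, progress TEXT)")
    writer = BatchWriter(db, max_batch=10)
    for student in range(30):  # a class of 30 arriving together
        writer.add(student, "exam finished")
    writer.flush()  # push out anything still waiting
    print("rows:", db.execute("SELECT COUNT(*) FROM progress").fetchone()[0])
    print("commits:", writer.commits)
```

A real version would also need a timer calling `flush_if_due()` so a half-full batch is not left waiting indefinitely, and rows still pending are lost if the process dies, which is the same durability trade-off mentioned in the answer.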
