Score:0

Diagnosing mySQL Random CPU Spikes

zw flag

Our organization has a fairly advanced (as in many moving pieces) web application that's been working great up until recently, yet no noticeable changes have been made.

There is the Apache Web Server, the mySQL Server (Data Processing), and then another mySQL Server that handles the repetitive requests for the public. Traffic influx to the main web server would almost only impact the secondary SQL server. The repetitive requests server is a slave of the data processing server, but no write queries are sent to the repetitive requests server - readonly.

The issue I'm facing is the data processing mySQL server seems to randomly spike to 100% CPU usage when typically operating at 15-20%, even under heavy load. The 100% CPU usage lasts for about 8 seconds, sometimes in spurts of every 2-3 minutes until it goes away randomly on its own.

There are no queries in the slow query log, aside from the ones that are attempted to be processed when at 100%. The queries leading up to the log are the same ones that are typically running without issue before the random spike.

The only activity shown by HTOP is mySQL, there are no crons scheduled on this server as well, and no increase in other query activity during these times. The open connection thread account remains steady around 3-5, and the PROCESS LIST also only has 3-5 queries in it before, during or after.

SELECT * FROM audio
WHERE associated_incident IS null
AND archive IS NULL
AND temp_skip IS null
AND length >= 3
AND (locked <> 1 or locked IS NULL)
AND timestamp > (NOW() - INTERVAL 4 HOUR)
ORDER BY `audio`.`id` DESC
LIMIT 1;

UPDATE audio SET locked=1, lockexpr = '$ulockexpr', lockuser = '$ulockuser' WHERE id = $audio_id;

The query above is ran repetitively in the software our staff use, but at a given time no more than 1-2 per second, but is the closest I can narrow down the issue. This query will also show in the slow_query_log during the high spikes. But just to reclarify, this query is used throughout th entire day ad we don't always encounter these issues.

I can stress-test the server with this query, running it 100x within 5 seconds, and still get an average result of .002 to .005 seconds for a return time.

In /var/log/mysql I have 8.2Gb of mysql-bin files.

Only possible modifications to my.cnf

key_buffer_size         = 16M
myisam-recover-options  = BACKUP
log_error = /var/log/mysql/error.log
slow_query_log          = 1
slow_query_log_file     = /var/log/mysql/mysql-slow.log
long_query_time = 2

 server-id              = 1
 log_bin                        = /var/log/mysql/mysql-bin.log
max_binlog_size   = 100M
binlog_do_db            = incident_log

Everything else is unspecified or commented out.

Mysql Tuner

 >>  MySQLTuner 2.1.1
         * Jean-Marie Renouard <[email protected]>
         * Major Hayden <[email protected]>
 >>  Bug reports, feature requests, and downloads at http://mysqltuner.pl/
 >>  Run with '--help' for additional options and output filtering

[--] Skipped version check for MySQLTuner script
[OK] Logged in using credentials from Debian maintenance account.

[OK] Currently running supported MySQL version 8.0.32-0ubuntu0.20.04.2
[OK] Operating on 64-bit architecture

-------- Log file Recommendations ---------------------------------------------
[OK] Log file /var/log/mysql/error.log exists
[--] Log file: /var/log/mysql/error.log (0B)
[--] Log file /var/log/mysql/error.log is empty. Assuming log-rotation. Use --server-log={file} for explicit file

-------- Storage Engine Statistics ---------------------------------------------
[--] Status: +ARCHIVE +BLACKHOLE +CSV -FEDERATED +InnoDB +MEMORY +MRG_MYISAM +MyISAM +PERFORMANCE_SCHEMA
[--] Data in InnoDB tables: 1.5G (Tables: 107)
[OK] Total fragmented tables: 0

-------- Analysis Performance Metrics ---------------------------------------------
[--] innodb_stats_on_metadata: OFF
[OK] No stat updates during querying INFORMATION_SCHEMA.

-------- Views Metrics ---------------------------------------------

-------- Triggers Metrics ---------------------------------------------

-------- Routines Metrics ---------------------------------------------

-------- Security Recommendations ---------------------------------------------
[--] Skipped due to unsupported feature for MySQL 8.0+

-------- CVE Security Recommendations ---------------------------------------------
[--] Skipped due to --cvefile option undefined

-------- Performance Metrics ---------------------------------------------
[--] Up for: 12h 41m 34s (3M q [73.289 qps], 1M conn, TX: 5G, RX: 445M)
[--] Reads / Writes: 81% / 19%
[--] Binary logging is enabled (GTID MODE: OFF)
[--] Physical Memory     : 1.9G
[--] Max MySQL memory    : 10.1G
[--] Other process memory: 0B
[--] Total buffers: 176.0M global + 65.9M per thread (151 max threads)
[--] Performance_schema Max memory usage: 239M
[--] Galera GCache Max memory usage: 0B
[!!] Maximum reached memory usage: 10.2G (528.62% of installed RAM)
[!!] Maximum possible memory usage: 10.1G (525.28% of installed RAM)
[!!] Overall possible memory usage with other process exceeded memory
[OK] Slow queries: 0% (3/3M)
[!!] Highest connection usage: 100% (152/151)
[OK] Aborted connections: 0.09% (1203/1391547)
[!!] Name resolution is active: a reverse name resolution is made for each new connection which can reduce performance
[--] Query cache has been removed since MySQL 8.0
[OK] Sorts requiring temporary tables: 0% (0 temp sorts / 10K sorts)
[!!] Joins performed without indexes: 5915
[OK] Temporary tables created on disk: 0% (0 on disk / 8K total)
[OK] Thread cache hit rate: 94% (75K created / 1M connections)
[OK] Table cache hit rate: 99% (1M hits / 1M requests)
[OK] table_definition_cache (2000) is greater than number of tables (434)
[OK] Open file limit used: 0% (3/10K)
[OK] Table locks acquired immediately: 100% (1K immediate / 1K locks)
[OK] Binlog cache memory access: 100.00% (346520 Memory / 346520 Total)

-------- Performance schema ---------------------------------------------
[--] Performance_schema is activated.
[--] Memory used by Performance_schema: 239.2M
[--] Sys schema is installed.

-------- ThreadPool Metrics ---------------------------------------------
[--] ThreadPool stat is disabled.

-------- MyISAM Metrics ---------------------------------------------
[--] MyISAM Metrics are disabled since MySQL 8.0.

-------- InnoDB Metrics ---------------------------------------------
[--] InnoDB is enabled.
[--] InnoDB Thread Concurrency: 0
[OK] InnoDB File per table is activated
[!!] InnoDB buffer pool / data size: 128.0M / 1.5G
[!!] Ratio InnoDB log file size / InnoDB Buffer pool size (75%): 48.0M * 2 / 128.0M should be equal to 25%
[OK] InnoDB buffer pool instances: 1
[--] Number of InnoDB Buffer Pool Chunk: 1 for 1 Buffer Pool Instance(s)
[OK] Innodb_buffer_pool_size aligned with Innodb_buffer_pool_chunk_size & Innodb_buffer_pool_instances
[OK] InnoDB Read buffer efficiency: 99.87% (403976971 hits / 404504329 total)
[!!] InnoDB Write Log efficiency: 59.62% (1713194 hits / 2873563 total)
[OK] InnoDB log waits: 0.00% (0 waits / 1160369 writes)

-------- Aria Metrics ---------------------------------------------
[--] Aria Storage Engine not available.

-------- TokuDB Metrics ---------------------------------------------
[--] TokuDB is disabled.

-------- XtraDB Metrics ---------------------------------------------
[--] XtraDB is disabled.

-------- Galera Metrics ---------------------------------------------
[--] Galera is disabled.

-------- Replication Metrics ---------------------------------------------
[--] Galera Synchronous replication: NO
[--] No replication slave(s) for this server.
[--] Binlog format: ROW
[--] XA support enabled: ON
[--] Semi synchronous replication Master: Not Activated
[--] Semi synchronous replication Slave: Not Activated
[--] This is a standalone server

-------- Recommendations ---------------------------------------------
General recommendations:
    MySQL was started within the last 24 hours: recommendations may be inaccurate
    Reduce your overall MySQL memory footprint for system stability
    Dedicate this server to your database for highest performance.
    Reduce or eliminate persistent connections to reduce connection usage
    Configure your accounts with ip or subnets only, then update your configuration with skip-name-resolve=1
    We will suggest raising the 'join_buffer_size' until JOINs not using indexes are found.
             See https://dev.mysql.com/doc/internals/en/join-buffer-size.html
             (specially the conclusions at the bottom of the page).
    Buffer Key MyISAM set to 0, no MyISAM table detected
    Before changing innodb_log_file_size and/or innodb_log_files_in_group read this: https://bit.ly/2TcGgtU
Variables to adjust:
  *** MySQL's maximum memory usage is dangerously high ***
  *** Add RAM before increasing MySQL buffer variables ***
    max_connections (> 151)
    wait_timeout (< 28800)
    interactive_timeout (< 28800)
    skip-name-resolve=1
    join_buffer_size (> 256.0K, or always use indexes with JOINs)
    key_buffer_size=0
    innodb_buffer_pool_size (>= 1.5G) if possible.
    innodb_log_file_size should be (=16M) if possible, so InnoDB total log file size equals 25% of buffer pool size.

Open for all advice.

Edits:

mysql> SHOW TABLE STATUS WHERE name LIKE "audio";

| Name  | Engine | Version | Row_format | Rows   | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time         | Check_time | Collation          | Checksum | Create_options | Comment |

| audio | InnoDB |      10 | Dynamic    | 969276 |            164 |   159039488 |               0 |     90898432 |   7340032 |        1644653 | 2023-04-13 14:41:27 | 2023-04-17 14:56:21 | NULL       | utf8mb4_0900_ai_ci |     NULL |                |         |

PHP

$sql = "SELECT * FROM audio
WHERE associated_incident IS null
AND archive IS NULL
AND temp_skip IS null
AND length >= 3
AND (locked <> 1 or locked IS NULL)
ORDER BY `audio`.`id` DESC
LIMIT 1;
";

$result = mysqli_query($conn, $sql) or die(mysqli_error());
$audio = mysqli_fetch_assoc($result);
if ($audio) {
    $audio_id = $audio['id'];
    $audio_timestamp = $audio['timestamp'];
    $audio_hash = $audio['hash'];
    $audio_length = $audio['length'];

$ulockexpr = date("Y-m-d H:i:s", strtotime('+ 5min'));
$ulockuser = $_SESSION['login_user'];
$sql_lock = "UPDATE audio SET locked=1, lockexpr = '$ulockexpr', lockuser = '$ulockuser' WHERE id = $audio_id";
Wilson Hauck avatar
jp flag
Please add to your code how $audio_id got set prior to the UPDATE request. Also, please post TEXT results of SHOW TABLE STATUS WHERE name LIKE "audio"; for analysis.
Johnathan Martin avatar
zw flag
@WilsonHauck Added, thanks.
Wilson Hauck avatar
jp flag
Johnathan, Is there a reason this line AND timestamp > (NOW() - INTERVAL 4 HOUR) if missing from the PHP recently posted? Please also post TEXT results of SHOW CREATE TABLE audio; for analysis. I suspect this will show us your audio.id is an INT datatype and your update is matching on string data which takes longer. Delays could be caused by volume of changes causing index rebuilds on the fly and with your volume of rows in audio could take a few seconds, when busy. View profile for contact info, we could talk if needed.
Wilson Hauck avatar
jp flag
Also, MySQLTuner report indicates there is no REPLICATION slave, you have SERVER-ID as 1 in your config, and might be ok, if you are managing the unused bin log's from time to time to avoid filling your storage with data that will never be used.
ua flag
Lower `long_query_time` to `0.5` and set `log_slow_admin_statements = ON` and `log_slow_slave_statements = ON`. More: http://mysql.rjweb.org/doc.php/mysql_analysis#slow_queries_and_slowlog
ua flag
"The queries leading up to the log are the same ones..." -- Do they take longer? Do lots of them all end at the same time?
Score:0
ua flag

Change to innodb_buffer_pool_size = 500M -- the old default of 128M is terribly low.

If that SELECT is done to find a row to work on and then you do that UPDATE, then use a transaction:

START TRANSACTION;
SELECT ... FROM audio ... FOR UPDATE;
... process ...
UPDATE audio SET ... WHERE id = ...;
COMMIT;

The START - FOR UPDATE - COMMIT will keep two threads from grabbing the same item to work on.

Consider using ORDER BY timestamp DESC instead of ORDER BY id DESC.

This config may be necessary to see the offending statement:

log_slow_admin_statements = ON

What indexes are on the table? Please provide SHOW CREATE TABLE.

Lower [yes, lower] max_connections to only 50. And lower the Apache settings so that it won't spawn more than 40 concurrent children. I see that Max_used_connections hit max_connections. When this happens, MySQL will thrash until the problem unwinds. Lowering the limit will let it start unwinding sooner.

I sit in a Tesla and translated this thread with Ai:

mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.