Percona Server got slow with increasing data

Hello,

We installed Percona Server with Mysql 8 and it worked well, for a few months we began to notice certain slowdowns in the pages reported by Google Analytics.

We have almost 20GB of data, over 4 million records on a single table.
It is a streaming platform that displays Videos in connection with categories, actors, producers and tags.

Displaying just the first page of videos linked to a tag now takes at least 6 seconds.

Before, we were less than a second away, but we had 4 times less data.

We had a server:

  • Intel Xeon D1540 - 8c/16t - 2GHz/2.6GHz
  • 64Gb DDR4 ECC
  • NVMe SSD Hard Drive

We have now moved to a larger server:

  • AMD Epyc 7351p - 16c/32t - 2.4GHz/2.9GHz
  • 128 GB DDR4 ECC
  • NVMe SSD Hard Drive

The result remained the same unchanged despite the increase in server power.

The queries are well optimized with all the necessary indexes…

Anyone have an idea how to unlock this? We have been trying to play with the configuration for several days now but without result.

Thanks for your help.

Here is the current configuration that we added after the installation:

Operating System: Ubuntu 20.04.3 LTS

[mysqld]
#bind_address                   = 0.0.0.0
default_authentication_plugin	= mysql_native_password
skip_name_resolve
max_allowed_packet             	= 1G
#max_connect_errors             = 1000000
max_connections                	= 1000

character_set_server            = utf8mb4
collation_server                = utf8mb4_unicode_ci

innodb_lock_wait_timeout		= 600
key_buffer_size                 = 64M
innodb_buffer_pool_size         = 102G
innodb_buffer_pool_instances    = 64
thread_pool_size                = 16
innodb_log_file_size            = 13G
internal_tmp_mem_storage_engine = MEMORY
#sort_buffer_size
#innodb_dedicated_server
1 Like

Hello @ChristopherHS,
Honestly, 20GB isn’t a lot of data (compared to what I see on a regular basis), so hearing that you’ve optimized queries still raises my eyebrows with disbelief. Your buffer pool is set at 102G, which means your data can completely fit into memory. Side note, that setting of 102GB will go largely unused since InnoDB won’t ever need to store in memory more data than already exists.

Your config looks good, but since we can’t see anything else, I highly advise you to install and configure Percona Monitoring and Management (PMM) to truly get insight into your queries, CPU, memory, etc.

1 Like

Hello @matthewb,

Thank you for your answer.

I understand your surprise, we had the same reaction before moving from a 32Gb server to a 128Gb server.

To give you a simple example of a page that displays the videos of a tag without pagination and without filter and link to the translation table:

Rows: 40
Took (ms): 2802

SELECT 
  Videos.id AS Videos__id, 
  Videos.videokey AS Videos__videokey, 
  Videos.title AS Videos__title, 
  Videos.title_prefix AS Videos__title_prefix, 
  Videos.thumb AS Videos__thumb, 
  Videos.thumbs_path AS Videos__thumbs_path, 
  Videos.video_preview AS Videos__video_preview, 
  Videos.duration AS Videos__duration, 
  Videos.views AS Videos__views, 
  Videos.rating AS Videos__rating, 
  Videos.published AS Videos__published, 
  Videos.segment AS Videos__segment, 
  Videos.deleted AS Videos__deleted 
FROM 
  videos Videos 
  INNER JOIN tags_videos TagsVideos ON Videos.id = TagsVideos.video_id 
  INNER JOIN tags Tags ON (
    Tags.active = TRUE 
    AND Tags.segment = 1 
    AND Tags.id = 4633 
    AND Tags.id = TagsVideos.tag_id
  ) 
WHERE 
  (
    Videos.active = TRUE 
    AND Videos.segment = 1 
    AND (Videos.deleted) IS NULL
  ) 
ORDER BY 
  Videos.views DESC 
LIMIT 
  40 OFFSET 0

Thanks for your help.

1 Like

Couple things here:

  • Using temporary; Using filesort - Two big red flags. Filesort is a very slow and CPU intensive sorting algorithm used by the optimizer when no other index is available. This usually happens when you are doing different GROUP BY or ORDER BY columns, or when sorting in reverse (like you are). Since you are using MySQL 8, I would look at using the new DESC indexes if this query should always return in that order.
  • Look at the number of rows. 1 row * 1M rows * 1 row means this query is examining 1M rows each execution. While the query is indexed, you’re still looking at “a lot of data” each time. You should optimize this query by reworking the JOIN. You should be filtering out as much data as possible in the base table before joining with the remainder tables. Or, you need to utilize multi-column indexes over single column since MySQL cannot use more than 1 index per table.

Good luck!

1 Like

Hello @matthewb,

Thanks again for your help and insightful feedback.

Indeed in the queries we use the reverse order, I followed your advice and we applied new DESC indexes on all the tables which use the order on columns.

Unfortunately no visible change.

We even tried to use the EverSQL service and did a test on the same query, but the proposed optimized query did not return the expected result.

The problem comes from the ORDER BY Videos.views DESC, because when we remove this instruction the result is fast.

The indexes I think are good, is this the limit of Mysql in terms of speed/performance or should we look for other optimizations?

Here is the schema of the videos table:

CREATE TABLE IF NOT EXISTS `videos` (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
  `channel_id` INT UNSIGNED NULL,
  `videokey` VARCHAR(25) NOT NULL,
  `title_prefix` VARCHAR(120) NULL DEFAULT NULL,
  `title` VARCHAR(120) NULL DEFAULT NULL,
  `title_niche` VARCHAR(120) NULL DEFAULT NULL,
  `thumb` VARCHAR(100) NULL DEFAULT NULL,
  `thumbs_path` VARCHAR(100) NULL DEFAULT NULL,
  `video_preview` VARCHAR(100) NULL DEFAULT NULL,
  `duration` MEDIUMINT UNSIGNED NOT NULL DEFAULT 0,
  `views` INT UNSIGNED NOT NULL DEFAULT 0,
  `rating` DECIMAL(5,2) NOT NULL DEFAULT 0.00,
  `ratings` INT UNSIGNED NOT NULL DEFAULT 0,
  `segment` TINYINT NOT NULL DEFAULT 1,
  `related_count` INT UNSIGNED NOT NULL DEFAULT 0,
  `active` TINYINT(1) NULL DEFAULT NULL,
  `initialized` TINYINT(1) NOT NULL DEFAULT 0,
  `related` TINYINT(1) NOT NULL DEFAULT 0,
  `translated` TINYINT(1) NOT NULL DEFAULT 0,
  `published` DATETIME NULL DEFAULT NULL,
  `deleted` DATETIME NULL DEFAULT NULL,
  `modified` DATETIME NULL DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE INDEX `videokey_UNIQUE` (`videokey` ASC) VISIBLE,
  INDEX `videokey_INDEX` (`videokey` ASC, `deleted` ASC, `active` ASC, `segment` ASC) VISIBLE,
  INDEX `channel_id_INDEX` (`channel_id` ASC, `deleted` ASC, `active` ASC, `segment` ASC) VISIBLE,
  INDEX `duration_INDEX` (`duration` DESC, `deleted` ASC, `active` ASC, `segment` ASC) VISIBLE,
  INDEX `views_INDEX` (`views` DESC, `deleted` ASC, `active` ASC, `segment` ASC) VISIBLE,
  INDEX `published_INDEX` (`published` DESC, `deleted` ASC, `active` ASC, `segment` ASC) VISIBLE,
  INDEX `deleted_INDEX` (`deleted` ASC, `active` ASC, `segment` ASC) VISIBLE,
  FULLTEXT INDEX `title_FULLTEXT` (`title_prefix`, `title`) VISIBLE,
  INDEX `title_niche_INDEX` (`title_niche` ASC, `active` ASC, `deleted` ASC) VISIBLE,
  INDEX `initialized_INDEX` (`initialized` ASC, `active` ASC, `deleted` ASC) VISIBLE,
  INDEX `related_INDEX` (`related` ASC, `active` ASC, `deleted` ASC) VISIBLE,
  INDEX `translated_INDEX` (`translated` ASC, `active` ASC, `deleted` ASC) VISIBLE)
ENGINE = InnoDB;

Thank you.

1 Like

what indexes you have for tags_videos table? also what is the cardinality of Videos.active and Videos.segment?

1 Like

Hi @Ivan_Groenewold,

Here is the Schema of the tags_videos table:

CREATE TABLE IF NOT EXISTS `tags_videos` (
  `video_id` INT UNSIGNED NOT NULL,
  `tag_id` INT UNSIGNED NOT NULL,
  PRIMARY KEY (`video_id`, `tag_id`),
  INDEX `tag_video_INDEX` (`tag_id` ASC, `video_id` ASC) VISIBLE)
ENGINE = InnoDB;

Videos.active & Videos.segment have numeric values and are part of the indexes.

CREATE TABLE IF NOT EXISTS `videos` (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
  `channel_id` INT UNSIGNED NULL,
  `videokey` VARCHAR(25) NOT NULL,
  `title_prefix` VARCHAR(120) NULL DEFAULT NULL,
  `title` VARCHAR(120) NULL DEFAULT NULL,
  `title_niche` VARCHAR(120) NULL DEFAULT NULL,
  `thumb` VARCHAR(100) NULL DEFAULT NULL,
  `thumbs_path` VARCHAR(100) NULL DEFAULT NULL,
  `video_preview` VARCHAR(100) NULL DEFAULT NULL,
  `duration` MEDIUMINT UNSIGNED NOT NULL DEFAULT 0,
  `views` INT UNSIGNED NOT NULL DEFAULT 0,
  `rating` DECIMAL(5,2) NOT NULL DEFAULT 0.00,
  `ratings` INT UNSIGNED NOT NULL DEFAULT 0,
  `segment` TINYINT NOT NULL DEFAULT 1,
  `related_count` INT UNSIGNED NOT NULL DEFAULT 0,
  `active` TINYINT(1) NULL DEFAULT NULL,
  `initialized` TINYINT(1) NOT NULL DEFAULT 0,
  `related` TINYINT(1) NOT NULL DEFAULT 0,
  `translated` TINYINT(1) NOT NULL DEFAULT 0,
  `published` DATETIME NULL DEFAULT NULL,
  `deleted` DATETIME NULL DEFAULT NULL,
  `modified` DATETIME NULL DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE INDEX `videokey_UNIQUE` (`videokey` ASC) VISIBLE,
  INDEX `videokey_INDEX` (`videokey` ASC, `deleted` ASC, `active` ASC, `segment` ASC) VISIBLE,
  INDEX `channel_id_INDEX` (`channel_id` ASC, `deleted` ASC, `active` ASC, `segment` ASC) VISIBLE,
  INDEX `duration_INDEX` (`duration` DESC, `deleted` ASC, `active` ASC, `segment` ASC) VISIBLE,
  INDEX `views_INDEX` (`views` DESC, `deleted` ASC, `active` ASC, `segment` ASC) VISIBLE,
  INDEX `published_INDEX` (`published` DESC, `deleted` ASC, `active` ASC, `segment` ASC) VISIBLE,
  INDEX `deleted_INDEX` (`deleted` ASC, `active` ASC, `segment` ASC) VISIBLE,
  FULLTEXT INDEX `title_FULLTEXT` (`title_prefix`, `title`) VISIBLE,
  INDEX `title_niche_INDEX` (`title_niche` ASC, `active` ASC, `deleted` ASC) VISIBLE,
  INDEX `initialized_INDEX` (`initialized` ASC, `active` ASC, `deleted` ASC) VISIBLE,
  INDEX `related_INDEX` (`related` ASC, `active` ASC, `deleted` ASC) VISIBLE,
  INDEX `translated_INDEX` (`translated` ASC, `active` ASC, `deleted` ASC) VISIBLE)
ENGINE = InnoDB;

Thank you.

1 Like

What I meant by cardinality is what % of the rows actually have active=true and what % have segment=1 . Also what is the execution plan if you remove the ORDER BY Videos.views ?

1 Like

For Videos.active

active count
-1 652319
0 14515
1 3219435
null 69675

For Videos.segment

segment count
1 3559340
2 396604

Without the ORDER BY, the request goes from 2.6746 seconds to 0.0238 second

Thanks.

1 Like

Ok based on the info you posted, I think most likely the slowdown is coming from tmp tables being created on disk. You can try increasing the tmp_table_size and max_heap_table_size parameters to try and get the sort happen in memory.

2 Likes

Hi @Ivan_Groenewold,

Thank you for your help.

I just tried but unfortunately the result remains unchanged.

I put 128M at the beginning and I restarted Mysql then I tried with 512M but nothing either, the execution time remains the same.

First test

tmp_table_size                  = 128M
max_heap_table_size             = 128M

Second test

tmp_table_size                  = 512M
max_heap_table_size             = 512M

I don’t know if I should increase more than that ?

1 Like

You can increase that even further to 1G or more but be careful to not run out of memory. You can check wether the query used a disk-based tmp table by checking the value for your session of created_tmp_disk_tables before and after you run the query.

1 Like

It’s incredible, the result remains the same. I am now on 5G

Here is the result of tmp table before switching to 5G
Created_tmp_disk_tables: 2137
Created_tmp_files: 982
Created_tmp_tables: 23058

I have a headache…

1 Like

So, instead of examining 1M rows each time, how about you look at less data? Modify the query to be more selective so it examines less data which requires less CPU and less temp table space.

1 Like

Hello @matthewb,

Yes, we have already thought about it but unfortunately we have no choice on sorting, we have to.

We increased the values to 20G but the result remains the same.

tmp_table_size                  = 20G
max_heap_table_size             = 20G

We are thinking of taking another more powerful server and we are hesitating between two configurations:

Current server:

CPU: AMD Epyc 7351p - 16c/32t - 2.4GHz/2.9GHz
Memory: 128 GB DDR4 ECC
Storage: NVMe SSD

First choice: With more memory

CPU: AMD Epyc 7351p - 16c/32t - 2.4GHz/2.9GHz
Memory: 256 GB DDR4 ECC
Storage: NVMe SSD

Second choice: With more cores but same amount of memory as the current server

CPU: AMD Epyc 7451 - 24c/48t - 2.3GHz/2.9GHz
Memory: 128 GB DDR4 ECC
Storage: NVMe SSD

Do you think it will solve the problem of changing servers to a more powerful one?

Thank you again for your help.

1 Like

Let’s think about this: As I said above, you already have 4x more memory than data, so we know that even more memory won’t help. Also, MySQL cannot use more than 1 thread per query execution, so we now know that a CPU with more threads/cores also won’t help. A faster CPU helps in query execution speed, however going to a 3.4Ghz CPU might gain you 5-10% faster execution. Throwing hardware at this is not the solution.

Let’s also consider growth of the data. Say you double the data in the next 2 months. Now this query is examining 2M rows and is slower. Why? Because it has to examine more data.

You need to re-read my comment. I didn’t say to remove the sorting. I said you need to examine less data. Each time this query runs 1M+ rows are examined. Why? Have you tried other query patterns? Have you tried using LEFT JOINs? Have you tried re-ordering the JOINs? Have you tried STRAIGHT_JOIN modifier? Have you analyzed the tables* (to update index stats)? You need to think about the query. Your query should be taking the “largest table” and filtering the most data to give you what you need. Then from that set, join the next table and filter away some more, then take that set and join again. Don’t just write queries expecting the database to magically “figure it out”.

*Set innodb_stats_persistent_sample_pages=200 before you analyze

2 Likes

Hello,

Yes the analysis of the tables is done at the end of each update automatically.

Thanks to your advice, we found the cause (the optimizer), as soon as I added the STRAIGHT_JOIN modifier the query ran quickly.

We couldn’t believe it, after several days of headaches, finally relieved.

I wanted to thank you all for your help, because without you we would not have found the cause and the solution, we did not have the right diagnoses…

A VERY BIG THANK YOU.

1 Like

Just FYI, STRAIGHT_JOIN is considered a “band-aid” solution. The correct solution is to rewrite your query so that the optimizer better understands how your tables should be joined. You might want to enable optimizer_tracing as well to better understand how the optimizer is making decisions.

2 Likes

Noted, we’ll try to work on that :wink:
Thank you.

1 Like