Query optimization

I don’t know if this is the right forum to use.
I’ve moved a Percona 5.7 database from our on-premises server to a Google Cloud instance.
After the move, queries that were executed in 0.000 seconds on our local server immediately started running in 0.030-0.060 seconds.

I have to optimize this. On our local server I used a huge (1 GB) query cache, but due to mutex contention on GCE I had to disable it completely. I had tons of queries waiting on the cache, but disabling it increased query execution time a lot.

Any idea or suggestion?

Hi there, you are in the right place!
I’ll see if the team has any suggestions for you.

Now I’m trying with the query cache enabled and just 100M of query_cache_size.
Our slowest procedure (which issues hundreds of queries) dropped from 2.5 seconds to 1.6-1.7 seconds.

Current config:

[mysqld]
skip-name-resolve
performance_schema = OFF

max_connections = 400

query_cache_limit = 1M
query_cache_size = 100M
query_cache_type = 1
#query_cache_size = 0
#query_cache_type = 0

table_open_cache = 10240
tmp_table_size = 512M
max_heap_table_size = 512M

thread_cache_size = 128
thread_pool_size = 32

open_files_limit = 8192

wait_timeout = 30
interactive_timeout = 30

join_buffer_size = 4M

innodb_file_per_table = ON
innodb_buffer_pool_size = 4G
innodb_buffer_pool_instances = 4
innodb_log_file_size = 128M
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT
innodb_stats_on_metadata = OFF
innodb_lock_wait_timeout = 50

Any hint? I’m really stuck trying to optimize this server.

Hi gandalf

The best option is to disable the query cache and focus your attention on why the queries are slow; the query cache is disabled by default for a reason: it doesn’t scale (as you experienced) and has in fact been removed in MySQL 8.0. You can read more about this on the MySQL Server Team blog (https://mysqlserverteam.com/mysql-8-0-retiring-support-for-the-query-cache/).

For example, how much data do you have? There is only a 4G buffer pool and so if you have more than that amount of data then you may be needing to pull from disk. Similarly, if you are executing queries that need to sort on disk then queries once again become slower. The following resources should help you:
[LIST]
[*]Upcoming Webinars
[*]https://www.percona.com/doc/percona-server/5.7/diagnostics/slow_extended.html
[/LIST]
You could also use PMM (Percona Monitoring and Management) to investigate what is behind the queries being slower than you expect.

Kind regards

Ceri

The whole DB is less than 500 MB. The query cache solved a lot of issues; I only had trouble with it after moving from our server to Google Cloud. Keep in mind that I have about 1,600-1,800 selects per second vs. fewer than 7-10 (ten) writes per second.

In case it’s useful: I don’t have any MyISAM tables, so everything related to MyISAM can be reduced as much as possible.

I’ve disabled the query cache (execution time immediately increased a lot) and added the following:


log_slow_filter = full_scan,full_join,tmp_table_on_disk,filesort,filesort_on_disk
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 10
slow_query_log = ON

I don’t see anything logged, so presumably all executed queries are properly using indexes and not doing full scans or similar.

This is the other query:

select * from orders where id IN (
SELECT orders_drivers.order_id
FROM orders_drivers
WHERE driver_id = 134
AND ignored = 0
AND FROM_UNIXTIME(delivery_at, GET_FORMAT(DATE, 'ISO')) = FROM_UNIXTIME(1537561200, GET_FORMAT(DATE, 'ISO'))
AND delivery_at > 1537561200
) order by delivery_at asc limit 1;
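
(For reference, the same lookup written as an explicit join — a sketch only; it assumes `orders.id` is the column `orders_drivers.order_id` references. MySQL 5.7 usually rewrites an IN () subquery into a semijoin on its own, so this is mainly an easier form to EXPLAIN, not a guaranteed speed-up.)

```sql
-- Sketch: explicit-join form of the query above. Table/column names are
-- taken from the thread; the join condition is an assumption.
SELECT o.*
FROM orders AS o
JOIN orders_drivers AS od
  ON od.order_id = o.id
WHERE od.driver_id = 134
  AND od.ignored = 0
  AND FROM_UNIXTIME(od.delivery_at, GET_FORMAT(DATE, 'ISO')) =
      FROM_UNIXTIME(1537561200, GET_FORMAT(DATE, 'ISO'))
  AND od.delivery_at > 1537561200
ORDER BY o.delivery_at ASC
LIMIT 1;
```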

Hi, judging by the query response time that you had concerns about, setting

long_query_time = 10

will show you nothing. Try something like the following:

long_query_time=0
log_slow_rate_limit=100
log_slow_rate_type=query
log_slow_verbosity=full
log_slow_admin_statements=ON
log_slow_slave_statements=ON
slow_query_log_always_write_time=1
slow_query_log_use_global_control=all
innodb_monitor_enable=all
userstat=1

For the query that you have posted here

FROM_UNIXTIME(delivery_at, GET_FORMAT(DATE,"ISO")) = FROM_UNIXTIME(1537561200, GET_FORMAT(DATE,"ISO")) 

If this is aiming to look for an order that is for delivery on the current day, then you would be better off checking that delivery_at falls within a range of constants, e.g.

mysql> SET @ts := UNIX_TIMESTAMP(), @day_start_ts := UNIX_TIMESTAMP(CURDATE()), @day_end_ts := UNIX_TIMESTAMP(CURDATE() + INTERVAL 86399 SECOND);
Query OK, 0 rows affected (0.01 sec)

mysql> SELECT @ts BETWEEN @day_start_ts AND @day_end_ts, @ts, @day_start_ts, @day_end_ts;
+-------------------------------------------+------------+---------------+-------------+
| @ts BETWEEN @day_start_ts AND @day_end_ts | @ts        | @day_start_ts | @day_end_ts |
+-------------------------------------------+------------+---------------+-------------+
|                                         1 | 1537532265 |    1537484400 |  1537570799 |
+-------------------------------------------+------------+---------------+-------------+
1 row in set (0.00 sec)

The use of variables here is just to demonstrate, but you would use delivery_at and could just use the functions in the comparison, i.e.:

delivery_at BETWEEN UNIX_TIMESTAMP(CURDATE()) AND UNIX_TIMESTAMP(CURDATE() + INTERVAL 86399 SECOND)
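
Plugging that range back into the original query would give something like this (a sketch that keeps your subquery shape; the driver_id and timestamp literals are from your earlier post):

```sql
SELECT *
FROM orders
WHERE id IN (
    SELECT order_id
    FROM orders_drivers
    WHERE driver_id = 134
      AND ignored = 0
      AND delivery_at BETWEEN UNIX_TIMESTAMP(CURDATE())
                          AND UNIX_TIMESTAMP(CURDATE() + INTERVAL 86399 SECOND)
      AND delivery_at > 1537561200
)
ORDER BY delivery_at ASC
LIMIT 1;
```

Because delivery_at is now compared directly against constants, an index containing delivery_at (e.g. on (driver_id, ignored, delivery_at)) becomes usable for the whole WHERE clause.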

Hopefully that will help you find the queries.

Ceri

Thank you for the reply.


long_query_time = 10

This was a typo; I’ve set it to 0 and log_slow_rate_limit=50 just to increase verbosity.

I’ve replaced the above query with 2 single queries like the following:



SELECT `orders_drivers`.`order_id` 
FROM `orders_drivers` FORCE INDEX(`driver_id_AND_ignored_AND_delivery_at`)
WHERE `driver_id` = 46 
AND `ignored` = 0 
AND `delivery_at` BETWEEN UNIX_TIMESTAMP(CURDATE()) AND UNIX_TIMESTAMP(CURDATE() + INTERVAL 86399 SECOND)
AND `delivery_at` < 1537549200 

ORDER BY `delivery_at` DESC
LIMIT 1


and then just a simple



SELECT * FROM `orders` WHERE `id` = :id


:id is the order_id returned by the previous query. Thanks to the FORCE INDEX(driver_id_AND_ignored_AND_delivery_at) hint I was able to remove the filesort:



+----+-------------+----------------+------------+-------+---------------------------------------+---------------------------------------+---------+------+------+----------+-----------------------+
| id | select_type | table          | partitions | type  | possible_keys                         | key                                   | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+----------------+------------+-------+---------------------------------------+---------------------------------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | orders_drivers | NULL       | range | driver_id_AND_ignored_AND_delivery_at | driver_id_AND_ignored_AND_delivery_at | 11      | NULL |    3 |   100.00 | Using index condition |
+----+-------------+----------------+------------+-------+---------------------------------------+---------------------------------------+---------+------+------+----------+-----------------------+



This should be faster, right? The aim is to find the first delivery immediately before the provided timestamp. (The WHERE on CURDATE() is to avoid searching across the whole table and instead fetch only the current day’s orders. I don’t know if this is useful in some way.)

Now there are at least 2 more queries to optimize, if possible.
The biggest one (called tons of times):



EXPLAIN

select * from `delivery_zones`
where ST_Contains(zone, POINT(x, y))
  and `restaurant_id` = 100232
  and (
        (`weekdays` & 16 or `weekdays` is null)
    and (   (`hour_starts_at` is null and `hour_ends_at` is null)
         or (`hour_starts_at` is null and `hour_ends_at` >= '18:50:00')
         or (`hour_starts_at` <= '18:50:00' and `hour_ends_at` is null)
         or ('18:50:00' BETWEEN `hour_starts_at` AND `hour_ends_at`))
    and (   (`starts_at` is null and `ends_at` is null)
         or (`starts_at` is null and `ends_at` >= '2018-09-21 18:50:00')
         or (`starts_at` <= '2018-09-21 18:50:00' and `ends_at` is null)
         or ('2018-09-21 18:50:00' BETWEEN `starts_at` AND `ends_at`)))
  and `active` = 1
  and `delivery_zones`.`deleted_at` is null
order by `layer` asc;

+----+-------------+----------------+------------+------+-----------------------------------------------------------------+---------------+---------+-------+------+----------+----------------------------------------------------+
| id | select_type | table          | partitions | type | possible_keys                                                   | key           | key_len | ref   | rows | filtered | Extra                                              |
+----+-------------+----------------+------------+------+-----------------------------------------------------------------+---------------+---------+-------+------+----------+----------------------------------------------------+
|  1 | SIMPLE      | delivery_zones | NULL       | ref  | zone,active,weekdays,restaurant_id,deleted_at,hours,starts_ends | restaurant_id | 5       | const |    1 |     5.00 | Using index condition; Using where; Using filesort |
+----+-------------+----------------+------------+------+-----------------------------------------------------------------+---------------+---------+-------+------+----------+----------------------------------------------------+
1 row in set, 1 warning (0.00 sec)




is making a filesort every time due to the ORDER BY layer

and



EXPLAIN 

select * from `products_categories_promotions`
where `products_categories_promotions`.`category_id` = 1333
  and `products_categories_promotions`.`category_id` is not null
  and `active` = 1
  and (`starts_at` is null or `starts_at` < '2018-09-21 17:08:00')
  and (`ends_at` is null or `ends_at` > '2018-09-21 17:08:00')
  and `products_categories_promotions`.`deleted_at` is null
order by `created_at` desc;

+----+-------------+--------------------------------+------------+------+-------------------------------------------------------+-------------------------------------------------------+---------+-------+------+----------+----------------------------------------------------+
| id | select_type | table                          | partitions | type | possible_keys                                         | key                                                   | key_len | ref   | rows | filtered | Extra                                              |
+----+-------------+--------------------------------+------------+------+-------------------------------------------------------+-------------------------------------------------------+---------+-------+------+----------+----------------------------------------------------+
|  1 | SIMPLE      | products_categories_promotions | NULL       | ref  | FK_products_categories_promotions_products_categories | FK_products_categories_promotions_products_categories | 4       | const |    1 |     5.00 | Using index condition; Using where; Using filesort |
+----+-------------+--------------------------------+------------+------+-------------------------------------------------------+-------------------------------------------------------+---------+-------+------+----------+----------------------------------------------------+
1 row in set, 1 warning (0.05 sec)




that is still doing a filesort, but it is called fewer times, so it’s not as bad.
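
One possible way to remove both filesorts — a sketch only; the index names and column choices are my assumption based on the EXPLAIN output, not something from the actual schema. Both queries filter with equalities on two columns and then ORDER BY a third, so a composite index ending in the ORDER BY column lets MySQL read the rows already sorted:

```sql
-- Hypothetical indexes; verify with EXPLAIN before/after. The spatial and
-- OR'd time-window predicates still get filtered row by row, but the
-- equality prefix (restaurant_id/category_id + active) plus the trailing
-- ORDER BY column means no separate sort pass is needed.
ALTER TABLE delivery_zones
  ADD INDEX idx_restaurant_active_layer (restaurant_id, active, layer);

ALTER TABLE products_categories_promotions
  ADD INDEX idx_category_active_created (category_id, active, created_at);
```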

Any help with my latest two queries? I’m still trying to optimize this app one query at a time.

Is it safe to replace all occurrences like the following



and (
(`starts_at` is null and `ends_at` is null) 
or (`starts_at` is null and `ends_at` >= '2018-09-26 18:30:00') 
or (`starts_at` <= '2018-09-26 18:30:00' and `ends_at` is null) 
or ('2018-09-26 18:30:00' BETWEEN `starts_at` AND `ends_at`)
)


to something like this:



AND (
'2018-09-26 18:30:00' BETWEEN COALESCE(`starts_at`, (NOW() - INTERVAL 1 DAY)) AND COALESCE(`ends_at`,(NOW() + INTERVAL 1 DAY))
)


Obviously in the real code ‘2018-09-26 18:30:00’ is the current datetime.
The aim is to retrieve all records where starts_at is in the past (or is null) AND ends_at is in the future (or is null).

In other words, all records that are neither explicitly expired (ends_at in the past) nor still to come (starts_at in the future).
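
A quick way to sanity-check the rewrite (a self-contained sketch; the sample rows and aliases are invented for the demo):

```sql
-- No tables needed: five representative (starts_at, ends_at) cases, each
-- evaluated against NOW() with both predicate forms. Every row should show
-- the same value in or_form and coalesce_form.
SELECT starts_at, ends_at,
       (   (starts_at IS NULL AND ends_at IS NULL)
        OR (starts_at IS NULL AND ends_at >= NOW())
        OR (starts_at <= NOW() AND ends_at IS NULL)
        OR (NOW() BETWEEN starts_at AND ends_at))                  AS or_form,
       (NOW() BETWEEN COALESCE(starts_at, NOW() - INTERVAL 1 DAY)
                  AND COALESCE(ends_at,   NOW() + INTERVAL 1 DAY)) AS coalesce_form
FROM (          SELECT CAST(NULL AS DATETIME) AS starts_at, CAST(NULL AS DATETIME) AS ends_at
      UNION ALL SELECT NULL,                    NOW() + INTERVAL 1 HOUR
      UNION ALL SELECT NOW() - INTERVAL 1 HOUR, NULL
      UNION ALL SELECT NOW() - INTERVAL 1 HOUR, NOW() + INTERVAL 1 HOUR
      UNION ALL SELECT NOW() + INTERVAL 1 HOUR, NULL) AS cases;
```

Note that the two forms only stay equivalent because the compared value is the current datetime, which always falls inside the NOW() ± 1 DAY fallback window; for an arbitrary past or future timestamp the COALESCE form could reject rows the OR form accepts.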

Any help?

Hey there, thanks for reaching out. Hopefully someone else from the community might be able to chime in on this for you.

If you have LOTS of queries coming up one at a time, it’s not going to be practical or realistic for the Percona team to be helping you out on a case-by-case basis, as the questions are going to be too specific to your own environment. We’ve given it a shot, but really once it becomes particular to an application we’re at the end of where it’s appropriate to provide professional assistance in an ‘open source and free’ forum.

If you need some professional support, though, I am happy to put you in touch with the sales engineers here. They’d be able to look at your case and work out how best to help you out.

The whole system is hard to describe here.
What I’m trying to do is optimize a couple of queries that are called hundreds of times per page access.