Spatial Index rebuilds

mikes · February 11, 2021, 12:28pm

Hi,

Is there any way, other than doing a DROP INDEX followed by an ALER TABLE x ADD SPATIAL INDEX of carrying out SPATIAL INDEX maintanence ? I had hoped that using pt-online-schema-change might do the trick, but it seems not.

Also, is there any good method of showing the state of a spatial index. PMM revealed the underlying issue with a particular query, but is there any general test to show the state of an index.

The query in question’s performance improved by a couple of orders of magnitude, and needs to be done soon on our production setup.

Version | 5.7.22-22
Platform | Linux
Release | CentOS Linux release 7.7.1908 (Core)
Kernel | 3.10.0-1062.9.1.el7.x86_64

Thanks

Michael_Coburn · February 11, 2021, 4:40pm

Hi @mikes - welcome back to the Percona Forums!

What specific maintenance are you looking to achieve?

Generally you can use the following statement to gather information about your indexes on a per table basis - which specific data are you looking for?

SHOW INDEXES FROM <TABLENAME>;

PMM doesn’t have information regarding index usage but this sounds like a good feature request. If you’re using Percona Server you’ll find the userstat variable, when enabled, reveals index usage information User Statistics — Percona Server 8.0 Documentation

Can you show us the exact failure exhibited by pt-online-schema-change ?

mikes · February 12, 2021, 10:14am

Thanks Michael

The process was basically me trying to understand why certain queries were taking as long as they did. PMM does a great job of showing exactly which queries they are, but you then need to do a bit of detective work.

I was using our test DB servers to try and understand the logic being used in the spatial query that was reported as slow. It was using a bounding box, based on geographic points stored in spatial indexes.

All that

SHOW INDEXES FROM ;

Does is show me that I have 3 spatial indexes defined ( I did want to check that was the case) and that we have a sub_part setting of 32.

It was only by chance that running the same command against the test instance, which is a small t2.medium in AWS, did I eventually see the major performance difference.

I think the logic was that as both servers were showing a similar bad performance, then I may as well try an index rebuild, which brought an instant 100x speed up.

This is quite a small table, less than 3 GB, and took less than 2 mins to rebuild the index. That is likely to be much quicker on the main production server with SSD’s, but I tried one other thing.

The reasoning being that I hoped to be able to use pt-online-schema-change to copy the table and gain the performance increase that way.

I tried this on a second test instance, the command worked, and saved a little space, but no magic performance increase, which I think is to be expected. That only improved with an index rebuild also.

So my ultimate questions

Can you do an offline index rebuild - expect that the answer to that is no - use a secondary and promote it perhaps
How do I find out that an index needs a rebuild. What stats can I use to indicate that an index might need some maintenance. Waiting till it gets to the top of the slow queries list in PMM is perhaps not the best way to find out.

Kind regards,

Mike

Michael_Coburn · February 12, 2021, 8:49pm

Hi @mikes , thanks for your reply - In general you don’t need to schedule maintenance on indexes in general, nor on spatial indexes specifically. The best way to tell if your query is optimized is to use the EXPLAIN function, which will show you what the MySQL query optimizer would do when it executes the query. This ought to answer your question regarding

understand why certain queries were taking as long as they did

Regarding your questions:

An “offline” build could mean you pull the server from taking production queries, and run a no-op ALTER TABLE ... ENGINE=InnoDB. This has the benefit of rebuilding not just the indexes but the data rows as well, and will reduce the size on disk as the data will be sorted in Primary Key order. You can also do this while the instance is serving queries, however ALTER TABLE will block writes until it completes.
Generally examining the query performance over time is the best way to tell when you need to perform maintenance such as index rebuilds. You can do this via PMM Query Analytics and plot over the last 30 days - how has the avg performance changed? This is more pro-active than waiting until the query bubbles to the top of Query Analytics.

If you could share the full output of the SHOW INDEXES and SHOW CREATE TABLE, along with some of the queries you’re running we can potentially investigate this further.

mikes · February 13, 2021, 12:30pm

Hi Michael,

That is the strange thing here, the ALTER TABLE …ENGINE=InnoDB was the first option I tried, it reduced the table in size, but didn’t increase the performance. That only changed when the index was dropped and rebuilt.

I’ll send you more details later on.

Thanks,

Mike

mikes · March 3, 2021, 9:31am

Hi Michael,

An update. I restored a recent backup to a new DB server yesterday, and thought I would test out index rebuilding etc, before running a comparison test. To my surprise, the new server, which is vastly more powerful the the AWS instance used to compare against, showed little improvement, and was approx 100x slower. The same query was being used, and the same result generated. An EXPLAIN statement showed that the larger machine was not selecting the spatial index. Instead, it seemed to pick up on the final ORDER BY instead.

Forcing the query to use the spatial index did the expected, and reduced the run time from 8-9 sec to 0.1 sec. I’ve checked the settings for both DB servers, run SHOW WARNINGS after the EXPLAIN, and done an ANALYZE TABLE and EXPLAIN againt the table, there appear to be no differences.

Where else might I look to see what is causing this difference in behaviour ?

Code snippet, results and EXPLAIN follow, startpoint, endpoint, and startpointFipped are the spatial indexes.

SELECT
r0_.routeId AS routeId0,
IF(
length(r0_.startpoint) = 0,
“”,
ST_AsText(r0_.startpoint)
) AS startpoint1,
IF(
length(r0_.startpointFlipped) = 0,
“”,
ST_AsText(r0_.startpointFlipped)
) AS startpointFlipped2,
IF(
length(r0_.endpoint) = 0,
“”,
ST_AsText(r0_.endpoint)
) AS endpoint3,

r0 defs follow

FROM
route_metadata r0_
WHERE
MBRWITHIN(
r0_.startpoint,
ST_GeomFromText(
‘MULTIPOINT(50.758851 5.122327, 51.909877 6.964633)’
)
)
AND
(
r0_.owner = ‘1234567’
OR r0_.isPublic = 1
)
AND r0_.category IN (‘10’, ‘41’, ‘72’, ‘30’, ‘11’)
AND r0_.difficulty IN (0, 3, 2, 5, 1, 4)
AND r0_.length >= ‘500’
AND r0_.length < ‘1000’
AND r0_.rating >= ‘0’
AND (
r0_.owner = ‘1234567’
OR r0_.price <= ‘0’
)
ORDER BY
r0_.qualityScore DESC
LIMIT
10 OFFSET 0;

RESULT

Primary DB
±-------------±--------------------------±--------------------------±--------------------------±--------±--------±--------±----------±----------±------------------------±---------------±---------------------------±------------------------------------------------------+±-------------------------------------------------------±--------±-----------±-------------±----------±-----------+
| routeId0 | startpoint1 | startpointFlipped2 | endpoint3 | length4 | rating5 | price6 | hasImage7 | hasAudio8 | shortDescriptionLength9 | qualityScore10 | name11 | shortDescription12 || coverImage27 | owner28 | category29 | difficulty30 | surface31 | supplier32 |
±-------------±--------------------------±--------------------------±--------------------------±--------±--------±--------±----------±----------±------------------------±---------------±---------------------------±------------------------------------------------------+±-------------------------------------------------------±--------±-----------±-------------±----------±-----------+
| 841_23509889 | POINT(51.258987 6.732483) | POINT(6.732483 51.258987) | POINT(51.258987 6.732483) | 765.83 | 0 | 0.00000 | 1 | 0 | 222 | 45000 | CARAVAN SALON Route Nord 1 | CARAVAN SALON Route N1 startet in || http ://img/280/210/36868436/.jpg | 1234567 | 10 | 0 | 3 | 12600 |
±-------------±--------------------------±--------------------------±--------------------------±--------±--------±--------±----------±----------±------------------------±---------------±---------------------------±------------------------------------------------------+±-------------------------------------------------------±--------±-----------±-------------±----------±-----------+
1 row in set (9.11 sec)

AWS Instance
±-------------±--------------------------±--------------------------±--------------------------±--------±--------±--------±----------±----------±------------------------±---------------±---------------------------±------------------------------------------------------+±-------------------------------------------------------±--------±-----------±-------------±----------±-----------+
| routeId0 | startpoint1 | startpointFlipped2 | endpoint3 | length4 | rating5 | price6 | hasImage7 | hasAudio8 | shortDescriptionLength9 | qualityScore10 | name11 | shortDescription12 || coverImage27 | owner28 | category29 | difficulty30 | surface31 | supplier32 |
±-------------±--------------------------±--------------------------±--------------------------±--------±--------±--------±----------±----------±------------------------±---------------±---------------------------±------------------------------------------------------+±-------------------------------------------------------±--------±-----------±-------------±----------±-----------+
| 841_23509889 | POINT(51.258987 6.732483) | POINT(6.732483 51.258987) | POINT(51.258987 6.732483) | 765.83 | 0 | 0.00000 | 1 | 0 | 229 | 45000 | CARAVAN SALON Route Nord 1 |

CARAVAN SALON Route N1 startet in || http ://img/280/210/25277664/.jpg | 1234567 | 10 | 0 | 3 | 12600 |
±-------------±--------------------------±--------------------------±--------------------------±--------±--------±--------±----------±----------±------------------------±---------------±---------------------------±------------------------------------------------------+±-------------------------------------------------------±--------±-----------±-------------±----------±-----------+
1 row in set (0.10 sec)

EXPLAIN

Mike

Topic		Replies	Views
How to use Percona server for mysql indexing? Other MySQL® Questions pmm , mysql , percona	1	473	December 29, 2023
pt-online-schema-change and disk space Percona Toolkit	1	1411	March 4, 2014
Add index on huge table when using pt-online-schema-change Percona Toolkit mysql , percona	1	1733	July 9, 2021
Index prefix optimisation Percona Server for MySQL 8.0	3	635	October 17, 2019
Queries performing worse in 5.6 with optimizations Percona Server for MySQL 5.6	3	751	June 12, 2014

Spatial Index rebuilds

Related topics