Any suggestions to tackle a slow query on a large table with composite PK?

Actual query:SELECT SQL_NO_CACHE country_abbreviation, country_name, region, city, isp, latitude, longitude FROM ip2location WHERE ip_first <= 3335238065 AND ip_last >= 3335238065; 4.6 secondsProfile:Status Time(initialization) 0.000004checking query cache for query 0.000073Opening tables 0.000014System lock 0.000022Table lock 0.00001init 0.000032optimizing 0.000015statistics 0.000213preparing 0.000131executing 0.000011Sending data 4.651696 <— What? it’s 1 recordend 0.000047query end 0.000008freeing items 0.000019closing tables 0.000011logging slow query 0.000004Explain:1 SIMPLE ip2location range PRIMARY,ip_last,ip_first PRIMARY 4 NULL 1886656 Using whereResearch:SELECT SQL_NO_CACHE ip_first FROM ip2location WHERE ip_last >= 3335238065; .0007 secondsSELECT SQL_NO_CACHE ip_last FROM ip2location WHERE ip_first <= 3335238065; .0007 secondsSELECT SQL_NO_CACHE count(ip_last) FROM ip2location WHERE ip_first <=3335238065; 2,251,823 rows (1.3 seconds)SELECT SQL_NO_CACHE count(ip_first) FROM ip2location WHERE ip_last >= 3335238065; 1,406,113 rows (.65 seconds)SELECT SQL_NO_CACHE max(ip_first) AS first_max FROM ip2location WHERE ip_last >= 3335238065; 4278190080 (1 second)SELECT SQL_NO_CACHE min(ip_first) AS first_min FROM ip2location WHERE ip_last >= 3335238065; 3335237632 (.69 seconds)SELECT SQL_NO_CACHE max(ip_last) AS last_max FROM ip2location WHERE ip_first <= 3335238065; 3335239167 (1.4 seconds)SELECT SQL_NO_CACHE min(ip_last) AS last_min FROM ip2location WHERE ip_first <= 3335238065; 33996343 (1.37 seconds)

The table itself has a composite PK on ip_first and ip_last. I wasn’t sure if it was not correctly using the keys because it was a composite, so i added a normal index on both ip_last and ip_first separately. Initially I thought I saw a 2-4 second improvement in query performance, however, that was months ago. to my knowledge queries like this have not ever run under 4 seconds though. This is a query which runs when someone first visits the site which causes a 5-10 second delay in some cases. It would be great if I could find a way to tune this down somehow to around or below 1 second.

Thanks in advance for any help!

The schema, for reference:

ip_first int(10) unsigned NOT NULL,
ip_last int(10) unsigned NOT NULL,
country_abbreviation varchar(2) NOT NULL default ‘’,
country_name varchar(42) NOT NULL default ‘’,
region varchar(45) NOT NULL default ‘’,
city varchar(37) NOT NULL default ‘’,
latitude float default NULL,
longitude float default NULL,
isp varchar(103) NOT NULL default ‘’,
PRIMARY KEY (ip_first,ip_last),
KEY ip_last (ip_last),
KEY ip_first (ip_first)

The problem is the “WHERE first key part … AND second key part …” which MySQL handles by performing a range scan on the entire index which in your case is the primary key which in InnoDB basically means a table scan since data is stored in leaves.

Some info needed to try to solve your problem:
Do you have ip_first,ip_last combinations that overlap anywhere?
If you don’t then by tweaking your data a bit (if you have “holes” in you sequence) you can for example write your query like this:

…WHERE ip_first >= 123456ORDER BY ip_first DESCLIMIT 1;

Which MySQL will be able to optimize to use only the ip_first index and return a result very fast.

Try to add a key: first_last(first, last) and force mysql to use it. Then show us an explain results along with timing.

Notice: to force mysql to use some key, try USE INDEX(keyname) after the table name.

That would be essentially the PK since it is a composite between ip_first and ip_last. I deleted the two other indexes for this set, but left the PK

Status Time
(initialization) 0.000005
checking query cache for query 0.000195
Opening tables 0.000032
System lock 0.000025
Table lock 0.000016
init 0.000043
optimizing 0.000019
statistics 0.000138
preparing 0.000087
executing 0.000015
Sending data 6.511645
end 0.000044
query end 0.000007
freeing items 0.000017
closing tables 0.000009
logging slow query 0.001079

1 SIMPLE ip2location range PRIMARY PRIMARY 4 NULL 1864658 Using where

(1 total, Query took 4.7419 sec)

Did you read my post three posts up?

[B]sterin wrote on Fri, 05 October 2007 17:52[/B]

Some info needed to try to solve your problem:
Do you have ip_first,ip_last combinations that overlap anywhere?

SELECT COUNT(*) AS total_rows, ip_first
FROM ip2location
GROUP BY ip_first
HAVING total_rows >1;
0 results

SELECT COUNT(*) AS total_rows, ip_last
FROM ip2location
GROUP BY ip_last
HAVING total_rows >1;
0 results

SELECT COUNT(*) AS total_rows, CONCAT_WS(’_’,ip_first,ip_last) as first_last
FROM ip2location_disk
GROUP BY first_last
HAVING total_rows > 1;
0 results (as you’d expect from a pk)

I didn’t realize before this that ip_first and ip_last were also unique on their own. Need to test more queries.

[B]sterin wrote on Sun, 07 October 2007 08:47[/B]
Did you read my post three posts up?

I’ve run several queries with different ips and the results are identical to ip_first / last combo WHERE only you’re right, mysql optimized out the difference.

.00007 seconds is much improved. Thanks for helping me tackle this )