Curious apparent index lookup failure. Can anyone explain this unexpected behavior?

I have run across what appears to be a strange indexing failure, and a rather counter-intuitive workaround for it. This is in Percona Server 5.6.29-76.2-log, amd64 platform, running on RHEL6.7.

Consider the following table definition:

CREATE TABLE ip2location (
ip_from int(10) unsigned zerofill NOT NULL DEFAULT ‘0000000000’,
ip_to int(10) unsigned zerofill NOT NULL DEFAULT ‘0000000000’,
country_code char(2) NOT NULL DEFAULT ‘’,
country_name varchar(255) NOT NULL DEFAULT ’ ',
PRIMARY KEY (ip_from,ip_to)
)

This table contains 378,261 rows. Each ip_from and ip_to value appears in the table exactly once, as the first and last address in INET_ATON form of a specific IP range assigned to a particular country. So not only is (ip_from, ip_to) unique, but ip_from and ip_to are themselves also unique. ip_from and ip_to both increase strictly monotonically, ip_to is always strictly greater than ip_from, and there are no overlapping ranges. Any IP address will fall between the ip_from and ip_to fields of exactly one row.

Now, we do a lookup of an IP in this table as follows:

SELECT COUNT(*) FROM ip2location WHERE ip_from <= INET_ATON(‘65.78.23.18’) AND ip_to >= INET_ATON(‘65.78.23.18’);

This should and does return a single row. It should also need to SCAN only a single row. INET_ATON(‘65.78.23.18’) resolves to 1095636754, and there is exactly one row in which ip_from <= 1095636754 <= ip_to. That is this row:

mysql> SELECT * FROM ip2location WHERE ip_from <= INET_ATON(‘65.78.23.18’) AND ip_to >= INET_ATON(‘65.78.23.18’)\G
*************************** 1. row ***************************
ip_from: 1095628288
ip_to: 1096224919
country_code: US
country_name: United States
1 row in set (0.04 sec)

MySQL should be able use the index to resolve both WHERE conditions. However, let’s look at the execution plan:

mysql> EXPLAIN SELECT * FROM ip2location WHERE ip_from <= INET_ATON(‘65.78.23.18’) AND ip_to >= INET_ATON(‘65.78.23.18’)\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: ip2location
type: range
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: NULL
rows: 151312
Extra: Using where
1 row in set (0.00 sec)

We can also write the query this way, with the exact same execution plan and results:

mysql> SELECT * FROM ip2location WHERE INET_ATON(‘65.78.23.18’) BETWEEN ip_from AND ip_to\G
*************************** 1. row ***************************
ip_from: 1095628288
ip_to: 1096224919
country_code: US
country_name: United States
1 row in set (0.04 sec)

mysql> EXPLAIN SELECT * FROM ip2location WHERE INET_ATON(‘65.78.23.18’) BETWEEN ip_from AND ip_to\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: ip2location
type: range
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: NULL
rows: 151312
Extra: Using where
1 row in set (0.00 sec)

Whichever way we choose to write the query, we should be able to resolve this to a single row from the index. However, mysqld is scanning 151,000 rows. Why is this?

Here’s a clue:

mysql> EXPLAIN SELECT * FROM ip2location WHERE ip_from <= INET_ATON(‘65.78.23.18’)\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: ip2location
type: range
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: NULL
rows: 151312
Extra: Using where
1 row in set (0.00 sec)

That’s exactly the same number of rows. Apparently mysqld is looking up the first WHERE clause on ip_from against the primary key and getting 151,312 possible rows. But it is then scanning those 151,312 rows for the ip_to limit, instead of now checking the second WHERE clause against the same index to narrow it down to a single row.

I was able to devise the following somewhat counter-intuitive workaround, which exploits the query optimizer and a direct index lookup to actually get BETTER performance by adding a subquery in which the subquery’s table lookup is optimized out:

mysql> EXPLAIN SELECT * FROM ip2location WHERE ip_from = (select max(ip_from) FROM ip2location WHERE ip_from <= INET_ATON(‘65.78.23.18’))\G
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: ip2location
type: ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: const
rows: 1
Extra: Using where
*************************** 2. row ***************************
id: 2
select_type: SUBQUERY
table: NULL
type: NULL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: NULL
Extra: Select tables optimized away
2 rows in set (0.00 sec)

This resolves the query with only a single-row table lookup. But it should already be only needing to look up a single row, without having to use this slightly cryptic workaround.

Can anyone explain to me why the index is only resolving the ip_from condition in the “normal” query, instead of both the ip_from and ip_to conditions? Is it because the conditions are expressed as <= and >=?