Prefix table query performance

Hi all,

I’ve an interesting problem and could use some suggestions for improving the performance…

I have a MyISAM table containing about 650,000 text prefixes. These prefixes are telephone numbers such as international dialling codes, local exchange codes and also complete phone numbers.

create table numbers
{
number_id bigint unsigned auto_increment,
number char(32),
unique key (number_id),
key ( number, number_id)
}

There are some more columns of meta-info in the actual implementation but they are not important…

I am searching this table to find all matches to a complete telephone number, for example the search will return any matching prefixes and the complete number if present:

select number_id from numbers where number = left(“12345678”, length(number))

This is taking too long, in the order of seconds (on a fairly underpowered 1 CPU server to be honest) but is using the index according to EXPLAIN.

I have tried adding an extra column containing the length of the number and using that instead of the length(number) calculation in the query which helped allot, but now its not using the index.

create table numbers
{
number_id bigint unsigned auto_increment,
number char(32),
number_length int unsigned,
unique key (number_id),
key ( number, number_id)
}

select number_id from numbers where number = left(“12345678”, number_length)

Any suggestions for speeding up this kind of wierd reverse-prefix lookup greatfuly received )

Cheers!

Rupert

Questons-

  1. Does The number field contain both CountryCode and Operator Code? e.g. 92 (Pakistan Contry Code) and Operator Code (333 UFone). 92333.
    2.Do you first identify vendor/customer (Using EndPoint or Prefix) before you search for a dialed destination?

Hi,

The number field contains the national code and any more digits required to identify the prefix, eg for Central London, UK the number field would contain ‘0207’ and for the example you gave it would contain ‘0092333’ ( based on UK dialled numbers )

There is no need for a search for vendor etc, its just a one-shot lookup on this table.

Cheers,

Rupert

Try using varchar instead of char. It can affect how indexes perform.

You could try expressing your prefixes as an integer range. For example, the prefix 1-800 could be expressed as the integer range 18000000000 to 18009999999. Then your query would become SELECT * FROM prefixes WHERE 18005555555 >= range_start AND 18005555555 <= range_end.

I don’t know enough about phone numbers to be able to say if you can express all your prefixes in the same integer space, but if you can, this should help.

I have yet another alternative for you.

Keep your number_length column and add an index on
(number_length, number, number_id):

alter table numbers add index num_ix_len_num_id (number_length, number, number_id);

Then you query with a query like this:

SELECT number_idFROM numbersWHERE number IN ([full length of number], … ,‘12345678’,‘1234567’,‘123456’,‘12345’,‘1234’,‘123’,‘12’)ORDER BY number_length DESCLIMIT 1

That way it will perform a range scan of the index starting with the longest possible match and stop at the first matching row.

Hi guys

Thanks very much for the ideas, I’m looking at something else at the moment but hopefully I can get back to this and try these different approaches out and let you know how it goes.

Cheers!

Rupert