Bounding box search with ORDER BY and LIMIT

Basically I have a site which needs to do a bounding box search, order the results by price, and apply a limit, so I have a query like …

SELECT * FROM table WHERE longitude BETWEEN -3 AND -1 AND latitude BETWEEN 50 AND 52 ORDER BY price LIMIT 20,30

The table has several hundred thousand rows in it, but searches are very slow. I set up an index called location that is a BTREE index on longitude and latitude, plus an index on price. I also tried a combined location index with longitude, latitude, and price in it, and it is still very slow. Obviously my main application has a tad more to the query (a small join and more in the WHERE clause, but I know these are not the slow point in my query).
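Roughly speaking, the indexes I set up look like this (the table name here is just a stand-in for my real one):

ALTER TABLE places ADD INDEX location (longitude, latitude);
ALTER TABLE places ADD INDEX price (price);
-- the combined variant I also tried:
ALTER TABLE places ADD INDEX location_price (longitude, latitude, price);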

I know that for a highly populated area my bounding ‘box’ will sometimes contain around 40000 results, which need to be sorted before LIMIT can operate. However, because MySQL uses an index for the bounding box created by the lat/longs, it doesn’t seem to be able to use any index on price, so a filesort is needed on 40000-odd rows at times, even if the location index has latitude, longitude and price inside it.

Is there anything I can do here to allow MySQL to make better use of indexes? I know FORCE INDEX on price would help a little on the slow queries where the number of results within the bounds is high, but it’s still too slow for my needs (it’s a web app). I do know that if I switched to PostgreSQL this particular query would be a lot quicker, as it is able to combine several indexes when multiple criteria are needed. However, I’d rather stick with MySQL because all my other queries are very simple and thus faster on MySQL, as PostgreSQL has a little more overhead when processing queries.
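For reference, the forced-index variant I mean is something like this (assuming the single-column index on price is simply named price):

SELECT * FROM table FORCE INDEX (price) WHERE longitude BETWEEN -3 AND -1 AND latitude BETWEEN 50 AND 52 ORDER BY price LIMIT 20,30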

Any ideas, anyone? I would be very grateful!

Try:

INDEX(longitude, latitude, price)

Yes, I tried that, and had no luck… It just uses the index on the location bounds like before, but still sorts manually, as EXPLAIN still states it has to do a filesort.

Right, I think I understand…

The majority of rows in this table are actually within your range, thus MySQL is resorting to a basic filesort.

So is there no way round this? Am I going to have to migrate to PostgreSQL?

Migrating to Postgres won’t help unless their server isn’t limited by the laws of mathematics.

Your best bet is to have a HEAP/MEMORY table with these values, run the query against the memory table, and then join the rest of the information onto its results.

The memory table will undoubtedly perform quite quickly.
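A rough sketch of what I mean, with table and column names as placeholders (adjust them to your schema):

CREATE TABLE location_cache (
  id        INT UNSIGNED NOT NULL,
  longitude DOUBLE NOT NULL,
  latitude  DOUBLE NOT NULL,
  price     INT UNSIGNED NOT NULL,
  PRIMARY KEY (id),
  KEY loc USING BTREE (longitude, latitude)
) ENGINE=MEMORY
  SELECT id, longitude, latitude, price FROM your_table;

-- run the bounding-box search against the memory table,
-- then join back to the main table for the full rows:
SELECT t.*
FROM (
  SELECT id, price FROM location_cache
  WHERE longitude BETWEEN -3 AND -1 AND latitude BETWEEN 50 AND 52
  ORDER BY price LIMIT 20,30
) AS hits
JOIN your_table AS t ON t.id = hits.id
ORDER BY hits.price;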

Let’s imagine lat is a DOUBLE, and so is lon: 16 bytes for those.

What’s your MAX value for price? It might fit in a SMALLINT (2 bytes) or require a full INT (4 bytes). Then the primary ID: 4 bytes?

So 24+ bytes per row. Depending on your number of rows, putting these values in a MEMORY table should be possible.

The memory needed for one row in a MEMORY table is calculated using the following expression:

SUM_OVER_ALL_BTREE_KEYS(max_length_of_key + sizeof(char*) * 4)
+ SUM_OVER_ALL_HASH_KEYS(sizeof(char*) * 2)
+ ALIGN(length_of_row+1, sizeof(char*))

More info here:
http://dev.mysql.com/doc/refman/5.0/en/memory-storage-engine.html
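As a rough worked example (only an estimate, assuming a 64-bit server where sizeof(char*) is 8 bytes): a BTREE key on (longitude, latitude) costs about 16 + 8*4 = 48 bytes, a HASH primary key about 8*2 = 16 bytes, and the row itself ALIGN(24+1, 8) = 32 bytes, so roughly 96 bytes per row. Even at 500,000 rows that is only around 48MB.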

What is your sort_buffer_size server variable set to?
The default 2MB is pretty small.
If it is too small, MySQL will write a temporary file to disk and perform the sort on the file instead, which is a huge performance hit.
Try increasing it to about 10MB (or calculate how large your result set would be before the LIMIT, to set a more accurate value).
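For example, you can check the current value and raise it for your own connection first, before deciding on a global change (10MB here is just a starting point):

SHOW VARIABLES LIKE 'sort_buffer_size';
SET SESSION sort_buffer_size = 10 * 1024 * 1024;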

My general notion is that if you have a join and an ORDER BY in the query, then MySQL will almost always create a temporary table that is sorted. The only thing you can do then is make sure the sort buffer is large enough and that you have enough CPU power.