Range scan question


Been reading through the High Performance book and stumbled on the part suggestion using IN(…) in order to make subsequent columns in the index accessible.

I have some questions - given the table parts with columns (id, sizes, created_at) and an index on (sizes, created_at) - if the column sizes can have values 1 through 4, would the following perform equally well:

SELECT * FROM parts WHERE sizes IN (3,4)
SELECT * FROM parts WHERE sizes > 2

This is to understand the case where there’s no need for not doing the range scan, ie. there’s no need to use a subsequent column in the index.

Another range scan question: Using != will result in a “bad range scan” right? Example:

– Bad
SELECT * FROM parts WHERE sizes != 4 AND created_at > …

– Good
SELECT * FROM parts WHERE sizes IN (1,2,3) AND created_at > …

Is this understanding correct or did I not get the subtleties right?

Thank you.

Regarding your first question, both queries should perform essentially the same range scan, so there should be no difference in performance as far as I know.

The second question I’m not entirely sure about. Assuming your != statement eliminates enough rows for the opitmizer not to prefer a table scan, then a range scan will performed. I’m not sure if it would use the created_at criteria or not.