subquery problem need help

dear all,

can anybody suggest me the way to improve the following query (it take long long time to execute now…)

select b.productname,((select unitprice from pricehistory where priceeffectivedate <= a.issuedate and productcode=a.productcode limit 1)*qty) as amount
from invoice as a
left join product as b on a.productcode=b.productcode

Thanks,

Useful information is missing in your question.

[URL]http://www.catb.org/esr/faqs/smart-questions.html[/URL]

i found in the blog that the subquery can use IN/… to improve the speed, but there is different for my case!

Please and thanks.

Uh, knowing the structure of the tables would certainly help!

Each invoice has a single productcode? Or is each record a line from the invoice? Is this query supposed to return results for a single invoice (line?) or all?

Your query is likely slow due to the query inside the SELECT clause which runs once for each invoice/product combination (and retrieves a list). This query is likely wrong any way because it limits the result - but probably arbitrarily since there’s no ordering of the list. What you probably want is the last price of the product before the issue date on the invoice. This can be fun if you’re using date columns vs datetime but that’s another issue. Realistically, why isn’t the price stored with the quantity? It would make this whole ordeal much simpler. Storing the extended price (total) would be even better.

There are a few ways to solve the problem - if the problem can be defined in more detail.

Troy

Dear Troy,

Thanks for helping this newbie.

Lets change a bit for the question, there are 4 tables:
1 “po” :
table structure → id,productcode,pono,deliverydate,qty
2 “pricehistory” :
table structure → id,productcode,effectivedate,curr,unitprice
3 “invoice” :
table structure → id,invoiceno,issuedate
4 “invoicedetail” :
table structure → id,invoiceid,poid,qty

Query to list one invoice :
select invoiceno,pono,issuedate,
(select curr from pricehistory where effectivedate <= a.issuedate and productcode=c.productcode order by effectivedate desc limit 1) as curr,
(select unitprice from pricehistory where effectivedate <= a.issuedate and productcode=c.productcode order by effectivedate desc limit 1)*c.qty as amount
from invoice as a
left join invoicedetail as b on a.id=b.invoiceid
left join po as c on b.poid=c.id
where invoiceno=“$invoiceno”

The result comes with a acceptable execute time (for me).
But, if we cancel the ‘where invoiceno=“$invoiceno”’…it takes incredible time…

How can I speed up this kind of query ?! :o

Best,

additional info,

explain query result as below

id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY a ALL NULL NULL NULL NULL 28646
1 PRIMARY b ref invoceid invoiceid 5 a.invoiceid 7
1 PRIMARY c eq_ref PRIMARY PRIMARY 4 b.poid 1
3 DEPENDENT SUBQUERY unitprice index productcode effectivedate 3 NULL 1 Using where
2 DEPENDENT SUBQUERY unitprice index productcode effectivedate 3 NULL 1 Using where


query 1 record : 0.6sec
query 50 records : 4.7sec
query all : hang… confused:

Do you have any indexes? Is it common that a product can be priced in different currencies over time (assuming the “curr” column is currency)? Can a single invoice have details with different currencies? Can you have an invoice without any details? If not, why are you using a left join between the two? Can you have an invoice detail without a PO? I guess not since the only place the productcode exists is in the PO table. Why is the “amount” calculated using the PO quantity instead of the detail quantity?

I still believe your biggest problem is not 1 but 2 queries within the SELECT clause - which will run for every invoice/detail/po combination. This also explains why it gets considerably slower when not filtering by a specific invoice. This is a schema design problem in my mind so if you’re designing something new I’d rethink it. On the other hand, if it’s existing code and you’re stuck with it then be prepared for inefficient queries and performance.

A second problem has to do with key/filter/join columns. It appears that the productcode may be a string (knowing the length would help) so the join to the history table and the PO is not very efficient. Assuming there’s a product table somewhere you should be using the id OR, at the very least, an indexed numeric column. Filtering by invoiceno also requires a string search but - as long as there’s an index on the column - it isn’t a big deal (although it would still be better if you were using the Primary Key column from the table to filter).

It would be helpful to read the currency and price from the same record in a single look up. Even better if we can get all product information for all details on all invoices (or a single detail on a single invoice without a completely new query). The problems here are that the only way to figure out which price we need is to find the last one prior to the invoice date and the LIMIT clause only works for one product at a time. Using an aggregate will help but a single pass won’t get us all the information we need. So, we need a combination. Something like this maybe:

SELECT i.invoiceno , p.pono , i.issuedate , h.curr , (h.unitprice * p.qty) AS amountFROM invoice AS i INNER JOIN invoicedetail AS d ON i.id = d.invoiceid INNER JOIN po AS p ON d.poid = p.id INNER JOIN ( SELECT li.id AS invoiceid , lp.productcode , MAX(lh.effectivedate) AS lastdate FROM invoice AS li INNER JOIN invoicedetail AS ld ON li.id = ld.invoiceid INNER JOIN po AS lp ON ld.poid = lp.id INNER JOIN pricehistory lh ON lp.productcode = lh.productcode WHERE lh.effectivedate <= li.issuedate AND li.invoiceno = “$invoiceno” GROUP BY li.id , lp.productcode ) AS iph ON i.id = iph.invoiceid AND p.productcode = iph.productcode INNER JOIN pricehistory h ON p.productcode = h.productcode AND iph.lastdate = h.effectivedateWHERE i.invoiceno = “$invoiceno”

You can remove the filters for the invoice number to get all information for all invoices but this may still not be incredibly fast. The subquery that gets the last history date for each invoice/product combination is not really optimized. It can be much more efficient IF the records are entered into the table in date order because you could get the record ID instead of the date which would make the second join to the table much nicer. As mentioned, storing the price and total amount (along with currency) with the detail would make a lot more sense.

Troy

Hello Troy,

Thank you very much for your suggestion. But it takes 47sec for 1 record…

I’m then give up using one query to do this. Using ‘where id in(…’ to create a temp table for the information of invoice,detail,po. It take under 5sec for 180K’s record.

create table abc (productcodevarchar(cool:, deliverydate date, qty int(10), key productcode(productcode), key deliverydate (deliverydate)
) select …


then

select productcode,deliverydate,qty,
(select unitp from productprice where a.deliverydate >= effdctivedate and productcode=a.productcode order by effdctivedate desc limit 1) as unitp
from abc as a

machine HANG again…

I think the main problem is the subquery… is there anyway to achieve it ?!

Thanks,

It’s because it’s still wrong. You want to remove the subquery from the SELECT clause - not make it more complex. You can do this with an additional join if you have the right data/structure.

I guessed at the query since I didn’t know the column definitions or indexes (and I still don’t - see my first and third paragraphs). Is 47 seconds faster than “incredible time”?

Again, I think your schema needs work. Storing price history like this is a recipe for disaster. The reason I’m so familiar with this problem is that I inherited a system which used the exact same structure to store changes in expected delivery dates over time. Having the history is important BUT using history for a relatively heavily used query will cause problems. Store the price with the invoice detail and you’re life will be a lot easier. OR, at the very least, store the price every day (whether it changes or not) as I mentioned earlier and you’ll only need to look up a specific date instead of trying to find the last change before a certain date. This is your (current) bottleneck - assuming you have good indexes on your tables of course!

I’m also questioning the arrangement of PO, invoice, details, and products. It could be much better (and probably more useful).

Troy

Hello Troy,

I do absolutely agree your point that put unit price into each row of the invoice detail so that can save my life. But, besides the detail, there are forecast and inventory need to link with the history of unit price, and not only the past but forecasted price… And also, it will be hard to make some revision on the invoice, such as cost down or discount the last month’s bill.

This bottleneck confusing me for long time, i can only make a monthly or half year sales report with a long long execute time.

Is that there are some other schema can suggest me ?! such as using temp table, join, union,…i believe you (experts) can help me to overcome it… ( :frowning: :frowning: (

thanks