mysql tables crashing

[B]allworknoplay wrote on Fri, 14 September 2007 01:28[/B]

±----------±-------------±-----±----±--------±------+
| Field | Type | Null | Key | Default | Extra |
±----------±-------------±-----±----±--------±------+
| device_id | int(10) | NO | PRI | | |
| timestamp | int(10) | NO | PRI | | |
| data_type | varchar(32) | NO | PRI | | |
| object | varchar(120) | NO | PRI | | |
| id | varchar(10) | NO | MUL | | |
| actual | int(1) | NO | | 0 | |
| gdata | bigint(20) | NO | | 0 | |
| gdata1 | bigint(20) | NO | | 0 | |
±----------±-------------±-----±----±--------±------+

Wait! eek: As per mySQL standards, you should have only ONE primary key per table! [URL="http://dev.mysql.com/doc/refman/5.0/en/create-table.html"]http://dev.mysql.com/doc/refman/5.0/en/create-table.html[/URL] [I]"A PRIMARY KEY is a unique index where all key columns must be defined as NOT NULL. If they are not explicitly declared as NOT NULL, MySQL declares them so implicitly (and silently). [B]A table can have only one PRIMARY KEY."[/B][/I]

So you should have only one PRI, the other indexes should be noted as MUL, just like the field “id”. confused:
That’s why your table was crashing man! I think we found out! Because the “timestamp” may not be unique (as I said earlier in message #1777 in this topic), same for the “data_type” and “object” fields! So be sure that “device_id” is UNIQUE in your design! (i.e: using AUTO_INCREMENT, not being generated by your script)

If you had had the scripting error_reporting, you would have had an error like “can’t INSERT duplicate keys” or “can’t DELETE duplicate keys”. cool:

[B]allworknoplay wrote on Fri, 14 September 2007 01:28[/B]

Plus this stuff is PRODUCTION. If I mess up at all in any which way, I am in TROUBLE!!!

Well, it can't be worse than MyISAM )....at least if you design your tables & write scripts [U]properly[/U]! D

Well I was hoping I didn’t have to explain what the table was all about but here goes…

The first 4 columns are unique as a WHOLE when I built the table, at the end I basically said,
UNIQUE(device_id, timestamp,data_type,object) so data in all 4 columns have to be a unique mix. Below is the SHOW CREATE TABLE.

CREATE TABLE data_day (
device_id int(10) NOT NULL,
timestamp int(10) NOT NULL,
data_type varchar(32) NOT NULL,
object varchar(120) NOT NULL,
id varchar(10) NOT NULL default ‘’,
actual int(1) NOT NULL default ‘0’,
gdata bigint(20) NOT NULL default ‘0’,
gdata1 bigint(20) NOT NULL default ‘0’,
UNIQUE KEY device_id (device_id,timestamp,data_type,object),
KEY id (id)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

Let me ask you this, say I am inserting a lot of data that is already unique so mysql has to reject them, do you think that in itself could be causing the crash?

You can have an INDEX that is NOT UNIQUE. But you can’t have more than ONE PRIMARY INDEX, and that PRIMARY INDEX should be UNIQUE.

UNIQUE = there is no duplicate (in the same column). You can’t consider 4 columns being unique as a whole for a PRIMARY INDEX.

I don’t know if I answered your question?..

For example, if I do:
SELECT * FROM data_day WHERE device_id = 1175214041;
it should always return 0 or 1 result, even if the other columns are different as a whole.

The idea behind a PRIMARY INDEX is that it identifies each row uniquely. Let’s imagine you have a list of prices, and your items have a bar code. The bar code is a PRIMARY INDEX because you can only have one price per bar code (even if the packaging changes, or the name of the product changes, or where it comes from changes; it doesn’t matter, because we’re talking about the same product).

What cause the crash is not the INSERT, it’s the DELETE. I don’t know how you make your query, but I guess MySQL wants to delete every row that as the same device_id (or any of the other 3 PRIMARY INDEXes being the same), resulting in a crash. When mySQL uses a PRIMARY INDEX for deleting, only ONE ROW can be deleted… whereas your table may contains more than a row with the same device_id (or more than a row with the same timestamp, etc).

Imagine that I ask you: “delete the product which has the bar code 1175214041”, then you start to search on your list “which product has the bar code 1175214041?” and you end up with more than a single result, which one would you delete? (as you can only delete one) eek: … => crash :o

UNIQUE(device_id, timestamp,data_type,object) means:

  • device_id contains no duplicate ;
  • timestamp contains no duplicate ;
  • data_type contains no duplicate ;
  • object contains no duplicate.

UNIQUE(device_id, timestamp,data_type,object) doesn’t mean: “there’s only ONE UNIQUE row having a defined: device_id AND timestamp AND data_type AND object”.

If you want the SQL commands to fix your indexes, ask me.

[B]jcn50 wrote on Fri, 14 September 2007 10:28[/B]

UNIQUE(device_id, timestamp,data_type,object) means:

  • device_id contains no duplicate ;
  • timestamp contains no duplicate ;
  • data_type contains no duplicate ;
  • object contains no duplicate.

UNIQUE(device_id, timestamp,data_type,object) doesn’t mean: “there’s only ONE UNIQUE row having a defined: device_id AND timestamp AND data_type AND object”.

HMMM!!! Very very interesting…

Let me find some of my DELETE commands to see exactly how I’m doing it. But what you said is not how I intended my scripts
to work, that would explain a lot…

I always thought that including 4 unique columns would mean that the combination needs to be unique which is what my intentions were. so to make it simpler, let’s play with 2 columns…

So I could insert a row with:
device_id = 10
timestamp = 12345677

But if I wanted to insert again:

device_id = 10
timestamp = 12345677

It shouldn’t let me do that…atleast that is what I thought
I was doing…

I will confirm my DELETES, but usually the deletes are based
on timestamp, not even the device_id…

So if I have 10 records with device_id 10,12,13,14,15 etc…

My cleanup script would just delete any record that has a timestamp older than say 24 hours…

So I didn’t think that I needed to be specific in that sense when deleting???

Thoughts?

OK, if you didn’t intend to do what I wrote… it simply means that you have NO PRIMARY INDEX! :smiley:

I’m not saying that what you want you to do is wrong, I’m just saying that the columns which are set to UNIQUE (or “PRIMARY INDEX”) are wrong/misused.

SQL commands are:
ALTER TABLE data_day DROP PRIMARY KEY
(should delete all primary keys)

ALTER TABLE data_day DROP INDEX device_id
ALTER TABLE data_day DROP INDEX timestamp
ALTER TABLE data_day DROP INDEX data_type
ALTER TABLE data_day DROP INDEX object
(should remove the unique index)

ALTER TABLE data_day ADD INDEX device_id
ALTER TABLE data_day ADD INDEX timestamp
ALTER TABLE data_day ADD INDEX data_type
ALTER TABLE data_day ADD INDEX object
(should add a simple index)

And you’re done!

Copy/paste your fields’ type again, to show 5 beautiful “MUL” and no more “PRI”.

To answer your question, yes the data_day table should not have any column that is a primary key. The device_id is a primary key in another table and we use that in the data_day table for other purposes. Basically we dump a lot of data into the data_day table and I guess, the reason why we used Unique( 4 columns ) is that I didn’t want any duplicate entry if any of the INSERTS all matched the 4 columns exactly…

Does that make any sense? So…

col1, col2, col3, col4 could have values 1,2,3,4 respectively
and if I wanted to INSERT those again, it should reject it…

But 1,2,2,3 or 1,1,1,1 any other combo would work…

I think we’re pretty much on the same page, just want to confirm…

You have suggested other things to try earlier in this thread
so I want to try those first…you’ve given me plenty of troubleshooting ideas!!!

[B]allworknoplay wrote on Fri, 14 September 2007 19:18[/B]
To answer your question, yes the data_day table should not have any column that is a primary key. The device_id is a primary key in another table and we use that in the data_day table for other purposes. Basically we dump a lot of data into the data_day table and I guess, the reason why we used Unique( 4 columns ) is that I didn't want any duplicate entry if any of the INSERTS all matched the 4 columns exactly..

Does that make any sense? So…

Ok I got it... but it doesn't work like that. (see below)
[B]allworknoplay wrote on Fri, 14 September 2007 19:18[/B]
col1, col2, col3, col4 could have values 1,2,3,4 respectively and if I wanted to INSERT those again, it should reject it..

But 1,2,2,3 or 1,1,1,1 any other combo would work…

No it will not work: - if you already have 1,2,3,4 AND - if you try to insert 1,1,1,1 => mySQL won't proceed the INSERT (because the first value is not distinct in the column), and mySQL will simply dump 1,1,1,1.
[B]allworknoplay wrote on Fri, 14 September 2007 19:18[/B]

You have suggested other things to try earlier in this thread
so I want to try those first…you’ve given me plenty of troubleshooting ideas!!!

Debugging is part of programming )... good luck!

Ok here is one of my codes…

INSERT INTO data_day (device_id,timestamp,data_type,object,actual,gdata) VALUES (‘“.$device_id.”’,‘“.$timestamp.”’,‘disk’,‘“.$VLetter.”’,‘1’,’ “.$PercentUsedDisk.”')

The first 4 obviously are the ones in question, and the last 2 (actual,gdata) could be of any value, that doesn’t and shouldn’t have any affect on the INSERTS…

If you don’t really need to have 4 seperated columns, you could merge columns device_id,timestamp,data_type,object into a single UNIQUE column (let’s call it “merged”).
You will separate the values by a separator (let’s pick up the semicolon).
So your “merged” column’s data could look like this:
1175214041;1189629601;disk;C

I have no other suggestion for now (at the mySQL level) I could think of (yet).

This particular script doesn’t have any DELETES.

I have another script that loops through this table 2.5 million rows and just deletes anything with a timestamp older than a certain period like 24 hours…

So I don’t think I have any WHERE clauses in the DELETE statements because I assume I didn’t have to be specific, all I care is that if any record with a timestamp older than say 24 hours, then DELETE…

[B]jcn50 wrote on Fri, 14 September 2007 15:48[/B]
If you don't really need to have 4 seperated columns, you could merge columns device_id,timestamp,data_type,object into a single UNIQUE column (let's call it "merged"). You will separate the values by a separator (let's pick up the semicolon). So your "merged" column's data could look like this: 1175214041;1189629601;disk;C

I have no other suggestion for now (at the mySQL level) I could think of (yet).

Well I can’t merge them because I have other programs that call this table that look for a specific column and do something else with it, like the data_type and object.

That’s what makes this whole thing difficult, because everything is production and one change to one column or table could have affects on other scripts that I completely forgot uses it…

I think my first two steps at this point are to drop the index on the ID column and start using DELETE QUICK everywhere I can think of!!!

[B]allworknoplay wrote on Fri, 14 September 2007 19:45[/B]
Ok here is one of my codes...

INSERT INTO data_day (device_id,timestamp,data_type,object,actual,gdata) VALUES (‘“.$device_id.”’,‘“.$timestamp.”’,‘disk’,‘“.$VLetter.”’,‘1’,’ “.$PercentUsedDisk.”')

The first 4 obviously are the ones in question, and the last 2 (actual,gdata) could be of any value, that doesn’t and shouldn’t have any affect on the INSERTS…

I bet you won’t have a lot of completed INSERTs, because data_type is not UNIQUE on this line of the script!

How many results do you have for this query?
SELECT * FROM data_day WHERE data_type = “disk”;

[B]allworknoplay wrote on Fri, 14 September 2007 19:49[/B]
I have another script that loops through this table 2.5 million rows and just deletes anything with a timestamp older than a certain period like 24 hours...

So I don’t think I have any WHERE clauses in the DELETE statements because I assume I didn’t have to be specific, all I care is that if any record with a timestamp older than say 24 hours, then DELETE…

2.5 million rows?! Wow, DELETE QUICK is a must D… Because after each DELETE, your indexes are rebuilt! That’s a waste of resources.
I hope your mySQL server automatically locks the table “data_day”, because if you have an INSERT when the INDEX is being rebuilt, you are likely to have a bunch of crashes!

Here is one of my DELETE scripts, short and small, I guess all I have to do is add QUICK right?

$timeframe_day = strtotime(“-24 hours”);
$query_day = “DELETE FROM data_day WHERE timestamp <='”.$timeframe_day."’ ";
$result_day = @mysql_query($query_day);
unset($query_day,$result_day);

So…

DELETE FROM data_day WHERE timestamp

Becomes…

DELETE QUICK FROM data_day WHERE timestamp

Also, do you think the UNSET variable is necessary?
I use it in my other program that has over 2000 lines of code. I just want to release any memory PHP is using and avoid and possible overlapping of variables…

[B]allworknoplay wrote on Fri, 14 September 2007 20:25[/B]
Also, do you think the UNSET variable is necessary? I use it in my other program that has over 2000 lines of code. I just want to release any memory PHP is using and avoid and possible overlapping of variables..
If your script does something else that this DELETE QUICK thereafter, yes it could help. Otherwise, if it's only a cron job that run 1 time each day (a "short script" as you call it) the variables will be unset anyway at the end of the script's execution. If you use $query_day and $result_day again later, it's a waste of time because you destroy the memory allocated by it, and you re-create a memory space again after. I guess UNSET is usefull if your script is running for, let's say, a week...

Not to derail the topic, but how well do you know PHP?

[B]allworknoplay wrote on Fri, 14 September 2007 20:38[/B]
Not to derail the topic, but how well do you know PHP?

Well, I use it very much, also doing PHP executable for Windows. But I didn’t upgrade myself to PHP5 yet (… Also, I never used the object-oriented programming (for me the code should be sequential).

OK, when I think about it, the crash may come from the “thread_concurrency” variable set to >1.

  • Imagine that you have one of your script is running a lot of DELETE queries;
  • At the same time, another script is running a lot of INSERT queries.

Crash can occurs when an INSERT query occurs while rebuilding the INDEXes: the server will miss some inserted keys, ending with more rows than the number of keys.
mysql> check table data_day;
±-------------------±------±---------±------------------ ------------+
| Table | Op | Msg_type | Msg_text |
±-------------------±------±---------±------------------ ------------+
| collector.data_day | check | warning | Table is marked as crashed |
| collector.data_day | check | error | Found 2434089 keys of 2434281 |
| collector.data_day | check | error | Corrupt |
±-------------------±------±---------±------------------ ------------+

This is the only logical case I could think of! :o
As we said earlier, the advantage of using DELETE QUICK is that the INDEX is not rebuilt. Also, when you run OPTIMIZE TABLE, the entire table will be locked, as per mySQL documentation:
[URL=“http&#58;&#47;&#47;MySQL :: MySQL 8.0 Reference Manual :: 13.7.3.4 OPTIMIZE TABLE Statement”]http://dev.mysql.com/doc/refman/5.0/en/optimize-table.html[/URL]

So my #1 suggestion will be to set thread_concurrency to 1, even if you have 2 CPUs…
Also, previous mySQL version (< v5) didn’t have this thread_concurrency variable!..

I have NO IDEA what this means…

DELETE QUICK is not useful when deleted values lead to underfilled index blocks spanning a range of index values for which new inserts occur again. In this case, use of QUICK can lead to wasted space in the index that remains unreclaimed. Here is an example of such a scenario: