"Table is full" error at around 350 million rows

Hi,

we are using MySQL Cluster for storing financial data like stock quotes which results in massive INSERTs and UPDATEs. Lately, we encountered strange “table full” errors while inserting data, although data and index memory are ok (50 % free each).

The table full errors happen at around 350 million rows in 2 specific tables, but always exactly around that boundary. All other tables are fine and it is possible to INSERT without any issue. We used the table create option “max_rows=600000000”. Seems like we hit a limit of some kind (see detailed config below). Any idea?

Infrastructure:

  • 4 data nodes, 128 GB RAM each
  • 2 mysqld instances for writing data
  • 4 mysqld instances which are configured master(1)-slave(3) and are connected to the cluster for reading

NDB MGM config:
[NDBD DEFAULT]
NoOfReplicas=2
Datadir=/mnt/data/cluster
FileSystemPathDD=/mnt/data/cluster
#FileSystemPathUndoFiles=/mnt/data/cluster
#FileSystemPathDataFiles=/mnt/data/cluster
DataMemory=85000M
IndexMemory=25000M
LockPagesInMainMemory=1

MaxNoOfConcurrentOperations=1500000

StringMemory=25
MaxNoOfTables=4096
MaxNoOfOrderedIndexes=2048
MaxNoOfUniqueHashIndexes=512
MaxNoOfAttributes=24576
MaxNoOfTriggers=14336
DiskCheckpointSpeedInRestart=100M
FragmentLogFileSize=128M
InitFragmentLogFiles=SPARSE
NoOfFragmentLogFiles=300
RedoBuffer=1G

TimeBetweenLocalCheckpoints=20
TimeBetweenGlobalCheckpoints=1000
TimeBetweenEpochs=100

MemReportFrequency=30
BackupReportFrequency=10

Params for setting logging

LogLevelStartup=15
LogLevelShutdown=15
LogLevelCheckpoint=8
LogLevelNodeRestart=15

Params for increasing Disk throughput

BackupMaxWriteSize=1M
BackupDataBufferSize=16M
BackupLogBufferSize=4M
BackupMemory=20M
#Reports indicates that odirect=1 can cause io errors (os err code 5) on some systems. You must test.
ODirect=1

Watchdog

TimeBetweenWatchdogCheckInitial=60000

TransactionInactiveTimeout - should be enabled in Production

TransactionInactiveTimeout=60000

CGE 6.3 - REALTIME EXTENSIONS

#RealTimeScheduler=1
#SchedulerExecutionTimer=80
#SchedulerSpinTimer=40

DISK DATA

SharedGlobalMemory=20M
DiskPageBufferMemory=64M

Multithreading

MaxNoOfExecutionThreads=4

Increasing the LongMessageBuffer b/c of a bug (20090903)

LongMessageBuffer=32M

Hi,

From your configuration you appear to be using disk based tables.

DISK DATA

SharedGlobalMemory=20M
DiskPageBufferMemory=64M

If so are your tablespaces large enough for the number of rows?.

You may need to add more data files to expand the tablespace.

http://dev.mysql.com/doc/refman/5.1/en/alter-tablespace.html

Cheers,

Ewen

Hi,
if it is not the free space in the disk table space like Ewen suggested, I have seen with some (older) versions that you need to add the “max_rows” predicate to the create table sql. In your case, it is likely something like max_rows=1000000000 or even bigger.

Regards,

Yves

Thanks for your reply. You’re right, space for disk tables is configured, but so far we don’t use them.

Hi,

Can you give us the CREATE statements for the tables in question?.

After you receive the table is full, can you do show warnings.

If this is a recent version of NDB, can you query the ndbinfo tables?.

SELECT * FROM ndbinfo.logbuffers;
SELECT * FROM ndbinfo.logspaces;

Thanks,

Ewen

Hi again,

seems like re-creating the tables with partition info

“CREATE TABLE IF NOT EXISTS 5_quote_intraday_bid (
pfk_instrument_id int(11) unsigned NOT NULL DEFAULT ‘0’,
pk_tick mediumint(5) unsigned NOT NULL DEFAULT ‘0’,
pk_datetime timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
value float unsigned NOT NULL DEFAULT ‘0’,
PRIMARY KEY (pfk_instrument_id,pk_tick,pk_datetime)
) ENGINE=ndbcluster DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci MAX_ROWS=600000000
/*!50100 PARTITION BY KEY (pfk_instrument_id)
PARTITIONS 20 */;”

worked. Using the “MAX_ROWS=600000000” setting only does not seem to force NDB creating more partitions.

ndb_desc for the table now looks like this:
/usr/local/mysql/bin/ndb_desc -p -d instruments 5_quote_intraday_bid
– 5_quote_intraday_bid –
Version: 6
Fragment type: 9
K Value: 6
Min load factor: 78
Max load factor: 80
Temporary table: yes
Number of attributes: 4
Number of primary keys: 3
Length of frm data: 348
Row Checksum: 1
Row GCI: 1
SingleUserMode: 0
ForceVarPart: 1
FragmentCount: 20
TableStatus: Retrieved
– Attributes –
pfk_instrument_id Unsigned PRIMARY KEY DISTRIBUTION KEY AT=FIXED ST=MEMORY
pk_tick Mediumunsigned PRIMARY KEY AT=FIXED ST=MEMORY
pk_datetime Timestamp PRIMARY KEY AT=FIXED ST=MEMORY
value Float NOT NULL AT=FIXED ST=MEMORY DEFAULT 0.000000

– Indexes –
PRIMARY KEY(pfk_instrument_id, pk_tick, pk_datetime) - UniqueHashIndex
PRIMARY(pfk_instrument_id, pk_tick, pk_datetime) - OrderedIndex

– Per partition info –
Partition Row count Commit count Frag fixed memory Frag varsized memory Extent_space Free extent_space
0 4181903 4181903 167936000 0 0 0
8 4046133 4046133 162496512 0 0 0
16 4167396 4167396 167378944 0 0 0
4 4222819 4222819 169607168 0 0 0
12 4097123 4097123 164528128 0 0 0
6 4263389 4263389 171212800 0 0 0
14 4138372 4138372 166199296 0 0 0
7 4166652 4166652 167346176 0 0 0
15 4126240 4126240 165707776 0 0 0
3 4121599 4121599 165511168 0 0 0
11 4138689 4138689 166199296 0 0 0
19 4125794 4125794 165707776 0 0 0
2 4079302 4079302 163840000 0 0 0
10 4174602 4174602 167641088 0 0 0
18 4094415 4094415 164429824 0 0 0
1 4196405 4196405 168525824 0 0 0
9 4217046 4217046 169345024 0 0 0
17 4099769 4099769 164659200 0 0 0
5 4174561 4174561 167641088 0 0 0
13 4129547 4129547 165838848 0 0 0

Thanks for all your help.