Hmmm ok so I’m not quite sure where to start here…
A bit of background to our setup…
2 Datacentres, 15 servers in each, only one datacentre active.
In the active DC, the master DB is an HP DL360 (2 x 6-core 4800MHz CPU, 64GB RAM); all other servers in the DC are slaves to this.
12 of the slaves are also DL360s with exactly the same config (RAM, CPU); the other 2 servers are DL385s: 64GB RAM, 2 x 16-core 3500MHz CPU.
In the standby DC we have exactly the same: 13 DL360s and 2 DL385s, all the same spec.
One of the DL360s is a slave to the master in the active DC; all other servers in the standby DC are slaved from that DL360.
Everything is good so far…
We’re using our standby DC to try and bottom out some performance issues. Specifically, the two DL385s are underperforming by orders of magnitude compared to the DL360s.
In addition, on one of the DL385s, if I raise innodb_buffer_pool_size to 35GB, MySQL won’t start, yet on the other DL385 it’s fine.
In terms of my.cnf parameters, apart from the obvious binary logging enabled on the masters, everything is the same and controlled by Puppet.
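Since the claim is that the configs are identical, one quick way to double-check it is to diff the effective my.cnf from two hosts. Below is a rough Python sketch of that idea. The function names (parse_cnf, diff_settings) and the log-bin value in the example are made up for illustration; it merges all sections together and does not follow !include directives or strip inline comments, so treat it as a starting point only.

```python
MISSING = "<not set>"


def parse_cnf(text):
    """Parse key=value lines of a my.cnf-style file into a dict.

    Section headers, comments and !include lines are skipped, so settings
    from all sections end up in one dict (a simplification).
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";", "[", "!")):
            continue
        key, sep, value = line.partition("=")
        # MySQL treats dashes and underscores in option names as equivalent.
        key = key.strip().replace("-", "_")
        settings[key] = value.strip() if sep else ""
    return settings


def diff_settings(a, b):
    """Return {option: (value_in_a, value_in_b)} for options that differ."""
    return {
        key: (a.get(key, MISSING), b.get(key, MISSING))
        for key in sorted(set(a) | set(b))
        if a.get(key, MISSING) != b.get(key, MISSING)
    }


if __name__ == "__main__":
    # Hypothetical excerpts; only the buffer pool value comes from this post.
    master = "[mysqld]\ninnodb_buffer_pool_size=37580963840\nlog-bin=mysql-bin\n"
    slave = "[mysqld]\ninnodb_buffer_pool_size = 37580963840\n"
    for option, (left, right) in diff_settings(parse_cnf(master), parse_cnf(slave)).items():
        print(f"{option}: {left} vs {right}")
```

Running the same comparison on the two DL385s would at least confirm whether the file on disk really is the same on both.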
I’m kind of lost as to:
a) why won’t MySQL start with a buffer pool of 35GB or more on one server, when on another identical one it’s fine?
b) why are the DL385s performing so badly?
I know the information I have provided is probably only a fraction of what is needed for a much more detailed investigation but just as a top level guess, can anyone think of anything that I’m missing?
We’re using Percona Server 5.5.30-rel30.2.500 on all boxes.
Key my.cnf params as follows (this is from a server that starts fine with a 35GB buffer pool):
innodb_additional_mem_pool_size=33554432
innodb_buffer_pool_size=37580963840
innodb_log_buffer_size=16777216
join_buffer_size=131072
key_buffer_size=8388608
sort_buffer_size=2097152
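As a sanity check on those sizes, here is a small Python sketch that converts the byte values to binary units. It only uses the numbers listed above; the helper names to_gib and to_mib are made up for illustration.

```python
# Values copied from the my.cnf excerpt above (all in bytes).
PARAMS = {
    "innodb_additional_mem_pool_size": 33554432,
    "innodb_buffer_pool_size": 37580963840,
    "innodb_log_buffer_size": 16777216,
    "join_buffer_size": 131072,
    "key_buffer_size": 8388608,
    "sort_buffer_size": 2097152,
}

MIB = 1024 ** 2
GIB = 1024 ** 3


def to_mib(n_bytes):
    """Convert a byte count to MiB (hypothetical helper)."""
    return n_bytes / MIB


def to_gib(n_bytes):
    """Convert a byte count to GiB (hypothetical helper)."""
    return n_bytes / GIB


if __name__ == "__main__":
    for name, value in PARAMS.items():
        print(f"{name:35s} {value:>12d} bytes = {to_mib(value):10.3f} MiB")
    print(f"buffer pool = {to_gib(PARAMS['innodb_buffer_pool_size']):.1f} GiB")
```

The buffer pool value works out to exactly 35 GiB, which matches the "size = 35.0G" line in the error log.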
This is the error we get when we increase the buffer pool to 35GB or more on one of the DL385s:
130711 13:50:55 mysqld_safe mysqld from pid file /var/lib/mysql/mysqld.pid ended
130711 13:50:56 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql/data
130711 13:50:56 [Note] Plugin 'FEDERATED' is disabled.
130711 13:50:56 InnoDB: The InnoDB memory heap is disabled
130711 13:50:56 InnoDB: Mutexes and rw_locks use GCC atomic builtins
130711 13:50:56 InnoDB: Compressed tables use zlib 1.2.3
130711 13:50:56 InnoDB: Using Linux native AIO
130711 13:50:56 InnoDB: Error: Linux Native AIO is not supported on tmpdir.
InnoDB: You can either move tmpdir to a file system that supports native AIO
InnoDB: or you can set innodb_use_native_aio to FALSE to avoid this message.
130711 13:50:56 InnoDB: Error: Linux Native AIO check on tmpdir returned error
130711 13:50:56 InnoDB: Warning: Linux Native AIO disabled.
130711 13:50:56 InnoDB: Initializing buffer pool, size = 35.0G
130711 13:50:58 InnoDB: Assertion failure in thread 47165255037984 in file ut0mem.c line 103
InnoDB: Failing assertion: ret || !assert_on_error
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
12:50:58 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any bugs at http://bugs.percona.com/

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=0
max_threads=3002
thread_count=0
connection_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 6577353 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x7b1b75]
/usr/sbin/mysqld(handle_fatal_signal+0x4b4)[0x68d494]
/lib64/libpthread.so.0[0x2ae58190fbe0]
/lib64/libc.so.6(gsignal+0x35)[0x2ae582bb4285]
/lib64/libc.so.6(abort+0x110)[0x2ae582bb5d30]
/usr/sbin/mysqld[0x87a4a3]
/usr/sbin/mysqld[0x927136]
/usr/sbin/mysqld[0x85a09e]
/usr/sbin/mysqld[0x8a3783]
/usr/sbin/mysqld[0x8a3c51]
/usr/sbin/mysqld[0x856850]
/usr/sbin/mysqld[0x816953]
/usr/sbin/mysqld(_Z24ha_initialize_handlertonP13st_plugin_int+0x48)[0x68fe68]
/usr/sbin/mysqld[0x59742a]
/usr/sbin/mysqld(_Z11plugin_initPiPPci+0xa1d)[0x59b3ad]
/usr/sbin/mysqld[0x51a5fb]
/usr/sbin/mysqld(_Z11mysqld_mainiPPc+0x46d)[0x51e0fd]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x2ae582ba1994]
/usr/sbin/mysqld[0x513339]
You may download the Percona Server operations manual by visiting
http://www.percona.com/software/percona-server/. You may find information
in the manual which will help you identify the cause of the crash.
130711 13:50:58 mysqld_safe mysqld from pid file /var/lib/mysql/mysqld.pid ended
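In case it helps anyone comparing crash logs across several boxes, here is a minimal Python sketch that pulls the key facts out of a log like the one above. The names PATTERNS and summarize are made up for illustration, SAMPLE is a handful of lines copied from the log, and the regexes assume one log entry per line.

```python
import re

# A few lines copied from the error log above.
SAMPLE = """\
130711 13:50:56 InnoDB: Initializing buffer pool, size = 35.0G
130711 13:50:58 InnoDB: Assertion failure in thread 47165255037984 in file ut0mem.c line 103
InnoDB: Failing assertion: ret || !assert_on_error
12:50:58 UTC - mysqld got signal 6 ;
"""

# Each pattern captures exactly one field of interest.
PATTERNS = {
    "buffer_pool": re.compile(r"Initializing buffer pool, size = (\S+)"),
    "assert_file": re.compile(r"Assertion failure in thread \d+ in file (\S+) line \d+"),
    "assert_line": re.compile(r"Assertion failure in thread \d+ in file \S+ line (\d+)"),
    "assertion": re.compile(r"Failing assertion: (.+)"),
    "signal": re.compile(r"mysqld got signal (\d+)"),
}


def summarize(log_text):
    """Return the first match of each pattern found in the log text."""
    summary = {}
    for line in log_text.splitlines():
        for key, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match and key not in summary:
                summary[key] = match.group(1).strip()
    return summary


if __name__ == "__main__":
    for key, value in summarize(SAMPLE).items():
        print(f"{key}: {value}")
```

To use it on a real log, read the file contents and pass them to summarize() instead of SAMPLE.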