Given that description, it sounds like MySQL is getting backed up like you have seen in the past: it uses up all available memory, which pushes the workload out to disk, which then backs up the I/O and causes the CPU to spike until the server eventually falls over.
Were you ever able to get that monitoring bash script set up to write the processlist and InnoDB stats out to a file when the MySQL connections spiked up? That is still your best bet to see what is going on when things go sideways.
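If not, something along these lines is all it really needs to be. This is just a rough sketch, not the exact script we talked about before: it assumes your credentials are in ~/.my.cnf, and the connection threshold, poll interval, and output directory are placeholders you would want to adjust for your box.

    #!/bin/bash
    # Dump processlist + InnoDB status whenever connections spike.
    THRESHOLD=100        # connection count that counts as a "spike" (placeholder)
    INTERVAL=10          # seconds between checks (placeholder)
    OUTDIR=/var/log/mysql-spike
    mkdir -p "$OUTDIR"

    while true; do
        CONN=$(mysql -N -B -e "SHOW GLOBAL STATUS LIKE 'Threads_connected'" | awk '{print $2}')
        if [ "${CONN:-0}" -ge "$THRESHOLD" ]; then
            TS=$(date +%Y%m%d-%H%M%S)
            {
                echo "=== $TS  Threads_connected: $CONN ==="
                mysql -e "SHOW FULL PROCESSLIST\G"
                mysql -e "SHOW ENGINE INNODB STATUS\G"
            } >> "$OUTDIR/spike-$TS.log"
        fi
        sleep "$INTERVAL"
    done

Leave that running in screen or as a systemd service, and the next time things go sideways you will have a snapshot of exactly what MySQL was chewing on.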
Regardless of what your free memory says, if the server gets slammed it will end up in the same situation: if memory is at 50% and MySQL gets swamped like you are describing, it is going to use up all the memory anyway. Whether it starts at 50% or 100% does not really matter in this case; the only difference may be a slightly longer ramp-up period before the server crashes.
So the first step (as it has always been) is still to find out what queries (and their source) are causing MySQL to get backed up. One temporary thing you could do would be to set up a script and/or use pt-kill to kill long-running queries (see the example below), which may stop MySQL from getting backed up by killing any queries that are blocked or simply taking too long. But you would of course lose any work the query was attempting to do (how bad that is depends on how well the application handles rollbacks), and it would just be a band-aid until you find the source of the problem.
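For reference, something like this with pt-kill from Percona Toolkit; the 60-second busy-time threshold and the log path are just placeholders, and I would start in print-only mode so nothing actually gets killed until you have seen what it matches:

    # Dry run: only print queries that have been running longer than 60s
    pt-kill --busy-time 60 --interval 10 --print

    # Once you are comfortable with what it matches, let it kill (and log) them
    pt-kill --busy-time 60 --interval 10 --kill --print --daemonize --log /var/log/pt-kill.log

Again, treat that as a stopgap while you track down where the problem queries are coming from, not as the fix.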