Multi-Threaded Production Load Test

I’m looking for a way to run a multi-threaded benchmark of MySQL using production data. I have a server populated with my production dataset, and I have a log of the queries that were run in production just after the snapshot finished.

I need to be able to test with anywhere from 1 to 128 threads, resetting the data after each run. I have tried pt-log-player, but it does not do a good job, and Percona does not recommend it for this purpose. Someone suggested I try xargs, but with it I cannot reproduce the performance I get from running a raw SQL file. Are there any other tools out there that can do what I need?
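
To make the requirement concrete, here is a rough sketch of the kind of harness I have in mind. It is only an illustration, not something I have working against MySQL: `run_query` and `reset_data` are hypothetical placeholders for "execute one logged query on a connection" and "restore the snapshot", and the round-robin split does not preserve the original ordering between queries that land on different threads.

```python
import threading
import time


def replay(queries, num_threads, run_query):
    """Replay queries across num_threads workers and return elapsed seconds."""
    # Round-robin partition: worker i gets queries i, i+n, i+2n, ...
    chunks = [queries[i::num_threads] for i in range(num_threads)]

    def worker(chunk):
        # run_query is a hypothetical callable supplied by the caller.
        for query in chunk:
            run_query(query)

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


def benchmark(queries, run_query, reset_data,
              thread_counts=(1, 2, 4, 8, 16, 32, 64, 128)):
    """Run one replay per thread count, resetting the data before each run."""
    results = {}
    for n in thread_counts:
        reset_data()  # hypothetical: restore the production snapshot
        results[n] = replay(queries, n, run_query)
    return results
```

In a real run, each worker would presumably need its own database connection, and `queries` would be parsed from the production query log.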