anybody know of an option to cause LOAD DATA INFILE to replicate purely as a statement and not transfer the contents of the file through the binlog?
e.g. you could out-of-band pre-populate your master & slaves with a horrendously large .CSV file in /tmp and then issue the LOAD DATA INFILE without incurring a binlog and replication lag hit, just IO?
we sometimes have some very large bulk updates happening, and the congestion they cause an otherwise healthy replication chain would be nice to avoid.
i seem to remember such an option or default behavior in the misty past of MySQL 3, but it seems to replicate contents now in 5.0.
yes, i know doing such a thing fairly obviates the point of a binlog to completely recover your state.
if not, the only alternative i can think of is to do LOAD DATA INFILE into replication excluded tables and manually issue the statement on the master & each slave.
I don’t believe there is an option for that, but I have had similar needs and solved it with a interim load. Since the interim load effectively doubles the I/O and processing required, I use a dev server. Here is a pseudo-transcript of what the process may look like:
workstation:~$ scp new_data.csv devbox:/dev/shm #because I like using tmpfs (~RAMDISK) for thisworkstation:~$ ssh devboxdevbox:~$ mysqldump -h prodbox --no-data schema_name table_name | mysql test_schemadevbox:~$ mysql -e "create schema temp_schema; use temp_schema; load data local infile ‘/dev/shm/new_data.csv’ into table table_name"devbox:~$ mysqldump --no-create-info temp_schema table_name | mysql -h prodbox schema_namedevbox:~$ mysql -e “drop schema temp_schema”
Another alternative which I have used in the past is to write a custom AWK script that converts each line of the csv file into an insert statement. I’ll post it here if I can find it.