Could someone see any way to make this query faster ?
Any help or ideas are welcome
UPDATE customscan_track_attributes INNER JOIN tracks on tracks.url=customscan_track_attributes.url and customscan_track_attributes.musicbrainz_id is null set customscan_track_attributes.track=tracks.id where customscan_track_attributes.track!=tracks.id
I’ve tried this version also, but it is a little bit slower:
UPDATE tracks,customscan_track_attributes SET customscan_track_attributes.track=tracks.id where customscan_track_attributes.musicbrainz_id is null and tracks.url=customscan_track_attributes.url and customscan_track_attributes.track!=tracks.id
The tables look like this:
CREATE TABLE tracks ( id int(10) unsigned NOT NULL auto_increment, url text collate utf8_unicode_ci NOT NULL, title blob, titlesort text collate utf8_unicode_ci, titlesearch text collate utf8_unicode_ci, customsearch text collate utf8_unicode_ci, album int(10) unsigned default NULL, tracknum int(10) unsigned default NULL, content_type varchar(255) collate utf8_unicode_ci default NULL, timestamp int(10) unsigned default NULL, filesize int(10) unsigned default NULL, audio_size int(10) unsigned default NULL, audio_offset int(10) unsigned default NULL, year smallint(5) unsigned default NULL, secs float unsigned default NULL, cover blob, thumb blob, vbr_scale varchar(255) collate utf8_unicode_ci default NULL, bitrate float unsigned default NULL, samplerate int(10) unsigned default NULL, samplesize int(10) unsigned default NULL, channels tinyint(1) unsigned default NULL, block_alignment int(10) unsigned default NULL, endian tinyint(1) default NULL, bpm smallint(5) unsigned default NULL, tagversion varchar(255) collate utf8_unicode_ci default NULL, drm tinyint(1) default NULL, rating tinyint(1) unsigned default NULL, disc tinyint(1) unsigned default NULL, playCount int(10) unsigned default NULL, lastPlayed int(10) unsigned default NULL, audio tinyint(1) default NULL, remote tinyint(1) default NULL, lossless tinyint(1) default NULL, lyrics text collate utf8_unicode_ci, moodlogic_id int(10) unsigned default NULL, moodlogic_mixable tinyint(1) default NULL, musicbrainz_id varchar(40) collate utf8_unicode_ci default NULL, musicmagic_mixable tinyint(1) default NULL, replay_gain float default NULL, replay_peak float default NULL, PRIMARY KEY (id), KEY trackTitleIndex (title(255)), KEY trackAlbumIndex (album), KEY ctSortIndex (content_type), KEY trackSortIndex (titlesort(255)), KEY trackSearchIndex (titlesearch(255)), KEY trackCustomSearchIndex (customsearch(255)), KEY trackBitrateIndex (bitrate), KEY trackDiscIndex (disc), KEY trackFilesizeIndex (filesize), KEY trackTimestampIndex (timestamp), KEY trackTracknumIndex (tracknum), KEY trackRatingIndex (rating), KEY trackPlayCountIndex (playCount), KEY trackLastPlayedIndex (lastPlayed), KEY trackAudioIndex (audio), KEY trackLyricsIndex (lyrics(255)), KEY trackRemoteIndex (remote), KEY trackLosslessIndex (lossless), KEY urlIndex (url(255)), KEY trackStatMBIndex (musicbrainz_id), CONSTRAINT tracks_ibfk_1 FOREIGN KEY (album) REFERENCES albums (id) ON DELETE CASCADE) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
CREATE TABLE customscan_track_attributes ( id bigint(20) unsigned NOT NULL auto_increment, track int(10) default NULL, url varchar(511) collate utf8_unicode_ci NOT NULL, musicbrainz_id varchar(40) collate utf8_unicode_ci default NULL, module varchar(40) collate utf8_unicode_ci NOT NULL, attr varchar(255) collate utf8_unicode_ci NOT NULL, value varchar(255) collate utf8_unicode_ci default NULL, valuesort varchar(255) collate utf8_unicode_ci default NULL, extravalue varchar(255) collate utf8_unicode_ci default NULL, valuetype varchar(255) collate utf8_unicode_ci default NULL, PRIMARY KEY (id), UNIQUE KEY id (id), KEY track_attr_idx (track,module,attr,id), KEY musicbrainzIndex (musicbrainz_id), KEY urlIndex (url(255)), KEY module_attr_value_idx (module,attr,value), KEY attr_module_idx (attr,module), KEY extravalue_attr_module_track_idx (extravalue,attr,module,track), KEY track_module_attr_extravalue_idx (track,module,attr,extravalue), KEY module_attr_extravalue_idx (module,attr,extravalue)) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
When the tracks table contains about 40000 rows and the customscan_track_attributes table contains about 160000 rows the UPDATE statements takes about 5 seconds to execute on Linux, but on Windows it takes about 15 seconds. A user has also reported that it takes several minutes when the customscan_track_attributes table is larger. This is the case even if the update doesn’t actually update any rows.
I tried to make a SELECT first to check if the update was required, like this:
SELECT count(*) FROM customscan_track_attributes INNER JOIN tracks on tracks.url=customscan_track_attributes.url and customscan_track_attributes.musicbrainz_id is null where customscan_track_attributes.track!=tracks.id
When running this select with explain I get the following result:
id select_type table type possible_keys key key_len ref rows Extra1 SIMPLE tracks ALL urlIndex (null) (null) (null) 40541 1 SIMPLE customscan_track_attributes ref musicbrainzIndex,urlIndex urlIndex 767 slimserver65_large.tracks.url 1 Using where
This SELECT statement takes approximately the same time to execute as the UPDATE statement.
I have full control of the customscan_track_attributes table, so if some changes in column types or indexes would make it better that is possible. The tracks table is from an external product, I can add index to this but I can’t change any column definitions.