@tylercrocker tylercrocker commented Sep 19, 2018

While upgrading a number of gems in our system, we noticed that acts-as-taggable-on has performance issues when running a match_all query.

Below is a fairly simple example that fetches images carrying two tags:

SELECT `assets`.* 
FROM   `assets` 
       INNER JOIN `taggings` `asset_taggings_1c1918a` 
               ON `asset_taggings_1c1918a`.`taggable_id` = `assets`.`id` 
                  AND `asset_taggings_1c1918a`.`taggable_type` = 'Asset' 
                  AND `asset_taggings_1c1918a`.`tag_id` IN (SELECT `tags`.`id` 
                                                            FROM   `tags` 
                                                            WHERE 
                          Lower(`tags`.`name`) LIKE 'school!_logos' escape '!') 
       INNER JOIN `taggings` `asset_taggings_0430ff7` 
               ON `asset_taggings_0430ff7`.`taggable_id` = `assets`.`id` 
                  AND `asset_taggings_0430ff7`.`taggable_type` = 'Asset' 
                  AND `asset_taggings_0430ff7`.`tag_id` IN (SELECT `tags`.`id` 
                                                            FROM   `tags` 
                                                            WHERE 
                          Lower(`tags`.`name`) LIKE 'uofdenver' escape '!') 
       LEFT OUTER JOIN `taggings` 
                    ON `taggings`.`taggable_id` = `assets`.`id` 
                       AND `taggings`.`taggable_type` = 'Asset' 
WHERE  `assets`.`type` IN ( 'Assets::Image' ) 
GROUP  BY `assets`.`id` 
HAVING Count(`taggings`.`taggable_id`) = (SELECT Count(*) 
                                          FROM   `tags` 
                                          WHERE  ( 
              Lower(`tags`.`name`) LIKE 'school!_logos' escape '!' 
               OR Lower(`tags`.`name`) LIKE 'uofdenver' escape '!' )) 
LIMIT  1;

After fiddling with the query to see which part was causing the inefficiency, I landed on the subquery comparisons in the two INNER JOINs, i.e. the conditions of the form tag_id IN (SUBQUERY). Simply changing the IN to an equals (=) gave a performance improvement of ~110ms.

This change should be fine as long as we're not using wildcards: the tags table has a unique constraint on name, so each subquery should return at most one id.
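To illustrate the equivalence argument (not the performance difference), here is a minimal sketch that runs one of the joins with both operators. It assumes an in-memory SQLite database as a stand-in for MySQL, a cut-down schema, and made-up rows; `JOIN_TEMPLATE` and `matching_asset_ids` are hypothetical helpers, not part of the gem.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL, with a cut-down schema and sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE assets (id INTEGER PRIMARY KEY, type TEXT NOT NULL);
    CREATE TABLE taggings (
        id INTEGER PRIMARY KEY,
        tag_id INTEGER NOT NULL,
        taggable_id INTEGER NOT NULL,
        taggable_type TEXT NOT NULL
    );
    INSERT INTO tags (id, name) VALUES
        (1, 'school_logos'), (2, 'uofdenver'), (3, 'schoolXlogos');
    INSERT INTO assets (id, type) VALUES
        (10, 'Assets::Image'), (11, 'Assets::Image');
    INSERT INTO taggings (tag_id, taggable_id, taggable_type) VALUES
        (1, 10, 'Asset'), (2, 10, 'Asset'), (3, 11, 'Asset');
""")

# One of the INNER JOINs from the query above, with the operator left open.
JOIN_TEMPLATE = """
    SELECT assets.id
    FROM assets
    INNER JOIN taggings t
            ON t.taggable_id = assets.id
           AND t.taggable_type = 'Asset'
           AND t.tag_id {op} (SELECT tags.id FROM tags
                              WHERE LOWER(tags.name) LIKE ? ESCAPE '!')
    ORDER BY assets.id
"""


def matching_asset_ids(op, pattern):
    """Run the join with the given operator ('IN' or '=') and LIKE pattern."""
    rows = conn.execute(JOIN_TEMPLATE.format(op=op), (pattern,)).fetchall()
    return [row[0] for row in rows]


# Escaped underscore: the pattern matches exactly one tag name.
ids_in = matching_asset_ids("IN", "school!_logos")
ids_eq = matching_asset_ids("=", "school!_logos")
assert ids_in == ids_eq

# Unescaped underscore acts as a wildcard and matches two tags, so the
# subquery is no longer single-row and only the IN form is safe.
ids_wildcard = matching_asset_ids("IN", "school_logos")
```

With an exact, escaped pattern both forms select the same asset. Once a wildcard matches several tags, the subquery returns more than one row; MySQL rejects `=` against such a subquery with a "Subquery returns more than 1 row" error, which is why the change only applies to non-wildcard searches.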

For context on the performance numbers: we're working with about 6,500 tags, 1.8M taggings, and 330k assets, on MySQL 5.7, Ruby 2.5.1, and Rails 5.2, and I'm running on a newer MacBook Pro.

I don't know what you usually do about version bumps in pull requests, but I bumped the version, since that's what we do for our internal gems.
