MySQL has been naughty for me lately.
First, I ran into a neat little issue on FreshBooks’ production servers last week involving the table cache and an O(n) algorithm for selecting a table to close. I wrote up a little explanation over on the FreshBooks blog that you might find interesting if you find any of this interesting. The short version is that if you’re going to be running with a full table cache and still opening tables regularly, you’ll be better off with a much smaller table cache, because finding the least-recently-used table to close is big-O of the size of the table cache. Smaller table cache = fewer tables to determine the LRU.
And then last night, out of the blue, a web forum about tinwhistles that I host hit a wall. About 8:30, my mostly-idle Linode went heavily IO-bound — as in one of the four CPUs spinning in diskwait all the time. What had originally been complex but fast (and common) queries were suddenly taking minutes and minutes to run: things like “get a list of topics in a forum”, and especially “get a list of posts for a forum’s RSS feed”.
There’s a lot of EXPLAIN output here, so I’d better put this behind a cut.
I took a look at the RSS feed query, which seemed the worst not only because it took a long time — 600+ seconds — but because it read-locked every important table in the database. Here’s the query, a big inner join:
SELECT t.topic_title, t.topic_last_post_id, t.forum_id, f.forum_name, p.post_time, pt.post_text, pt.bbcode_uid, u.username, u.user_id FROM phpbb_topics t, phpbb_posts p, phpbb_posts_text pt, phpbb_users u, phpbb_forums f WHERE t.topic_status != 1 AND p.post_id = t.topic_last_post_id AND pt.post_id = p.post_id AND u.user_id = p.poster_id AND t.forum_id = f.forum_id ORDER BY t.topic_last_post_id DESC LIMIT 0, 15;
Here’s the mk-visual-explain output. I’ve replaced the table aliases with readable things:
Filesort +- TEMPORARY table temporary(forums,topics,posttext,posts,users) +- JOIN +- Bookmark lookup | +- Table | | table users | | possible_keys PRIMARY | +- Unique index lookup | key users->PRIMARY | possible_keys PRIMARY | key_len 3 | ref chiffbb.posts.poster_id | rows 1 +- JOIN +- Bookmark lookup | +- Table | | table posts | | possible_keys PRIMARY,poster_id | +- Unique index lookup | key posts->PRIMARY | possible_keys PRIMARY,poster_id | key_len 3 | ref chiffbb.topics.topic_last_post_id | rows 1 +- JOIN +- Bookmark lookup | +- Table | | table posttext | | possible_keys PRIMARY | +- Unique index lookup | key posttext->PRIMARY | possible_keys PRIMARY | key_len 3 | ref chiffbb.topics.topic_last_post_id | rows 1 +- JOIN +- Filter with WHERE | +- Bookmark lookup | +- Table | | table topics | | possible_keys forum_id,topic_status,topic_last_post_id | +- Index lookup | key topics->forum_id | possible_keys forum_id,topic_status,topic_last_post_id | key_len 2 | ref chiffbb.forums.forum_id | rows 2579 +- Table scan rows 23 +- Table table forums possible_keys PRIMARY
See that temporary table at the top that gets used in a filesort? Well…
The whole thing was multiple joins which were then ORDERed and LIMITed. So that meant that it had to find all posts to the forum, ever, and shove them in a temporary table, sort that, and take the 15 most recent posts.
“All posts to the forum, ever” is about 500MB of data. That made the temporary table big enough to go to disk. So every time this query ran and couldn’t be answered from the query cache, it had to write that 500MB file. And the cached query was invalidated whenever someone posted to the forum, which is pretty often.
The problem in this case wasn’t (entirely) the SQL. MySQL was optimizing the query poorly because the key distribution statistics were off. An ANALYZE TABLE on the affected tables fixed that, and gave us:
JOIN +- Bookmark lookup | +- Table | | table forums | | possible_keys PRIMARY | +- Unique index lookup | key forum->PRIMARY | possible_keys PRIMARY | key_len 2 | ref chiffbb.topics.forum_id | rows 1 +- JOIN +- Bookmark lookup | +- Table | | table users | | possible_keys PRIMARY | +- Unique index lookup | key users->PRIMARY | possible_keys PRIMARY | key_len 3 | ref chiffbb.posts.poster_id | rows 1 +- JOIN +- Bookmark lookup | +- Table | | table posttext | | possible_keys PRIMARY | +- Unique index lookup | key posttext->PRIMARY | possible_keys PRIMARY | key_len 3 | ref chiffbb.posttext.post_id | rows 1 +- JOIN +- Bookmark lookup | +- Table | | table posts | | possible_keys PRIMARY,poster_id | +- Unique index lookup | key posts->PRIMARY | possible_keys PRIMARY,poster_id | key_len 3 | ref chiffbb.topics.topic_last_post_id | rows 1 +- Filesort +- Filter with WHERE +- Bookmark lookup +- Table | table topics | possible_keys forum_id,topic_status,topic_last_post_id +- Index range scan key topics->topic_status possible_keys forum_id,topic_status,topic_last_post_id key_len 1 rows 57912
There’s still a filesort, but it’s now a filesort of a single 57k-row table that’s already been filtered. That table is about 5MB, and fits in tmp_table_size, so doesn’t go to disk. The joins all stack, and the ORDER BY just follows that one-table filesort. The query takes about 0.15s now, or about 4000x as fast.
Incidentally, it can still be improved: that “filter with WHERE” is because of the “WHERE t.topic_status != 1” in the query, and that means “where the topic is not locked”. The idea was that locked topics aren’t going to appear in the last-15-posts anyhow, so may as well exclude them. But if they’re not going to appear because of the sorting, and since you’re sorting anyhow, unless there are a LOT of locked posts that doesn’t matter. Taking out that restriction gets us:
JOIN +- Bookmark lookup | +- Table | | table users | | possible_keys PRIMARY | +- Unique index lookup | key users->PRIMARY | possible_keys PRIMARY | key_len 3 | ref chiffbb.posts.poster_id | rows 1 +- JOIN +- Bookmark lookup | +- Table | | table posts | | possible_keys PRIMARY,poster_id | +- Unique index lookup | key posts->PRIMARY | possible_keys PRIMARY,poster_id | key_len 3 | ref chiffbb.topics.topic_last_post_id | rows 1 +- JOIN +- Bookmark lookup | +- Table | | table posttext | | possible_keys PRIMARY | +- Unique index lookup | key posttext->PRIMARY | possible_keys PRIMARY | key_len 3 | ref chiffbb.topics.topic_last_post_id | rows 1 +- JOIN +- Bookmark lookup | +- Table | | table forums | | possible_keys PRIMARY | +- Unique index lookup | key forums->PRIMARY | possible_keys PRIMARY | key_len 2 | ref chiffbb.t.forum_id | rows 1 +- Bookmark lookup +- Table | table topics | possible_keys forum_id,topic_last_post_id +- Index scan key topics->topic_last_post_id possible_keys forum_id,topic_last_post_id key_len 3 rows 59900
And with that there isn’t even a filesort and the query finishes in <0.01 seconds, 60000x as fast as the original problem and 15x as fast as the post-ANALYZE optimization. Nice.
One response to “Fun with ANALYZE TABLE”
I know this isn’t the point of this entry, but I must admit that I hate hate hate that style of writing inner joins. I much prefer them like this:
SELECT …
FROM phpbb_topics t
inner join phpbb_posts p on p.post_id = t.topic_last_post_id
inner join phpbb_posts_text pt on pt.post_id = p.post_id
inner join phpbb_users u on u.user_id = p.poster_id
inner join phpbb_forums f on f.forum_id = t.forum_id
WHERE t.topic_status != 1
ORDER BY t.topic_last_post_id DESC
LIMIT 0, 15;