MySQL 中的索引可以按一定顺序引用多个列,这种索引叫做联合索引。形式上,一个联合索引是一个有序元组 <a1, a2, …, an>,其中每个元素是表中的一列;单列索引可视为元素数为 1 的特例。

下文以 MySQL 8.0 + Employees Sample Database 为实验环境。MySQL 5.7 的历史情况会在相关章节穿插说明 —— 5.7 已结束官方支持,但很多旧项目仍在用,行为差异值得标出来。

employees.titles 为例,查看其索引。注意 8.0 比 5.7 多了 VisibleExpression 两列:

SHOW INDEX FROM employees.titles;

+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table  | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| titles |          0 | PRIMARY  |            1 | emp_no      | A         |      301292 |     NULL | NULL   |      | BTREE      |         |               | YES     | NULL       |
| titles |          0 | PRIMARY  |            2 | title       | A         |      442605 |     NULL | NULL   |      | BTREE      |         |               | YES     | NULL       |
| titles |          0 | PRIMARY  |            3 | from_date   | A         |      442605 |     NULL | NULL   |      | BTREE      |         |               | YES     | NULL       |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
  • Visible:8.0 引入的 Invisible IndexALTER TABLE ... ALTER INDEX idx INVISIBLE 让索引对优化器不可见但物理结构仍在,适合「灰度删除索引」的场景 —— 先标为不可见观察一段时间,没出问题再真删。
  • Expression:8.0.13+ 的 Functional Key Parts(函数索引)会在这一列展示表达式,后文会用到。

全列匹配

EXPLAIN SELECT * FROM employees.titles
WHERE emp_no = '10009'
  AND title = 'Senior Engineer'
  AND from_date = '1995-02-18';

+----+-------------+--------+------------+-------+---------------+---------+---------+-------------------+------+----------+-------+
| id | select_type | table  | partitions | type  | possible_keys | key     | key_len | ref               | rows | filtered | Extra |
+----+-------------+--------+------------+-------+---------------+---------+---------+-------------------+------+----------+-------+
|  1 | SIMPLE      | titles | NULL       | const | PRIMARY       | PRIMARY | 159     | const,const,const |    1 |   100.00 | NULL  |
+----+-------------+--------+------------+-------+---------------+---------+---------+-------------------+------+----------+-------+

对索引所有列进行精确匹配(这里精确匹配指 =IN)时,索引被完整用到。

理论上索引对顺序敏感,但优化器会自动调整 WHERE 子句的条件顺序去匹配可用索引。把上面三个条件颠倒顺序,执行计划完全一致:

EXPLAIN SELECT * FROM employees.titles
WHERE from_date = '1995-02-18'
  AND emp_no IN ('10009')
  AND title = 'Senior Engineer';
-- 结果同上:const, key_len=159

最左前缀匹配

EXPLAIN SELECT * FROM employees.titles WHERE emp_no = '10009';

+----+-------------+--------+------------+------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table  | partitions | type | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
+----+-------------+--------+------------+------+---------------+---------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | titles | NULL       | ref  | PRIMARY       | PRIMARY | 4       | const |    3 |   100.00 | NULL  |
+----+-------------+--------+------------+------+---------------+---------+---------+-------+------+----------+-------+

精确匹配索引最左连续若干列时,索引可被用到 —— 但只用到匹配的那部分前缀。这里 key_len=4 说明只走了第一列 emp_no

中间某个条件未提供

查询用到了索引中列的精确匹配,但中间列没有提供:

EXPLAIN SELECT * FROM employees.titles WHERE emp_no = '10009' AND from_date = '1995-02-18';

+----+-------------+--------+------------+------+---------------+---------+---------+-------+------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys | key     | key_len | ref   | rows | filtered | Extra       |
+----+-------------+--------+------------+------+---------------+---------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | titles | NULL       | ref  | PRIMARY       | PRIMARY | 4       | const |    1 |    10.00 | Using where |
+----+-------------+--------+------------+------+---------------+---------+---------+-------+------+----------+-------------+

默认情况下,中间列 title 缺失导致 from_date 无法拼回最左前缀,索引只走了第一列,from_date 退化为 Using where 过滤。MySQL 8 提供了三种解法:

方法一:补辅助索引

最直接的解法是为这种查询模式建一个辅助索引 <emp_no, from_date>,跳过 title

ALTER TABLE titles ADD INDEX idx_emp_from (emp_no, from_date);

方法二:手工填坑(IN 枚举)

如果中间列基数很低,可以用 IN 枚举把它的空位填上,让查询重新形成完整的最左前缀。title 只有 7 个不同值:

SELECT DISTINCT title FROM employees.titles;
-- 7 rows

EXPLAIN SELECT * FROM employees.titles
WHERE emp_no = '10009'
  AND title IN ('Senior Engineer', 'Staff', 'Engineer', 'Senior Staff', 'Assistant Engineer', 'Technique Leader', 'Manager')
  AND from_date = '1995-02-18';

+----+-------------+--------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table  | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+--------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | titles | NULL       | range | PRIMARY       | PRIMARY | 159     | NULL |    7 |   100.00 | Using where |
+----+-------------+--------+------------+------+---------------+---------+---------+------+------+----------+-------------+

key_len=159 说明三列全用到,但 type=range 暴露了实际行为:7 次 key 查找。如果 title 取值很多,这条路就走不通了。

方法三:让优化器自己跳 —— Index Skip Scan(8.0.13+)

MySQL 8.0.13 起加入 Index Skip Scan,思路与方法二一致:优化器自动枚举被跳列的所有可能值,重组前缀。命中时 EXPLAINExtra 会出现 Using index for skip scan

触发条件比较苛刻(参见 Range Optimization · Skip Scan):

  • 单表查询、覆盖索引
  • 不使用 GROUP BY / DISTINCT
  • 被跳列前后都有等值或范围条件
  • 被跳列基数低(高基数枚举代价反而高)

不满足就回到方法一或方法二。可用 hint 强制 / 禁用:

SELECT /*+ SKIP_SCAN(titles PRIMARY) */ ... ;
SELECT /*+ NO_SKIP_SCAN(titles PRIMARY) */ ... ;

5.7 历史:没有 Skip Scan,遇到「中间列缺失」只有方法一、方法二两条路。

查询条件没有指定索引第一列

EXPLAIN SELECT * FROM employees.titles WHERE from_date = '1995-02-18';

+----+-------------+--------+------------+------+---------------+------+---------+------+--------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra       |
+----+-------------+--------+------------+------+---------------+------+---------+------+--------+----------+-------------+
|  1 | SIMPLE      | titles | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 442605 |    10.00 | Using where |
+----+-------------+--------+------------+------+---------------+------+---------+------+--------+----------+-------------+

没有最左前缀,常规情况下索引用不上,只能全表扫。理论上 8.0 的 Skip Scan 也能跳过前导列,但 emp_no 基数高达 30 万,枚举代价比全表扫还贵 —— 优化器会拒绝。Skip Scan 不是万灵药,前导列基数低是硬性前提。

匹配某列的前缀字符串

EXPLAIN SELECT * FROM employees.titles WHERE emp_no = 10009 AND title LIKE 'Senior%';

+----+-------------+--------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table  | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+--------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | titles | NULL       | range | PRIMARY       | PRIMARY | 156     | NULL |    1 |   100.00 | Using where |
+----+-------------+--------+------------+------+---------------+---------+---------+------+------+----------+-------------+

通配符 % 不在开头时可以走索引,被识别为范围扫描。

范围查询

EXPLAIN SELECT * FROM employees.titles
WHERE emp_no < '10010' AND title = 'Senior Engineer';

+----+-------------+--------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table  | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+--------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | titles | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL |   14 |    10.00 | Using where |
+----+-------------+--------+------------+-------+---------------+---------+---------+------+------+----------+-------------+

范围列本身可用索引,但范围列之后的列在同一索引中无法继续利用 —— 一个索引最多只能用一个范围列。

EXPLAIN SELECT * FROM employees.titles
WHERE emp_no < '10010'
  AND title = 'Senior Engineer'
  AND from_date BETWEEN '1986-01-01' AND '1986-12-31';
-- key_len=4,只有 emp_no 走了索引

EXPLAIN 不区分「范围扫描」和「多值精确匹配」,两者在 type 都显示为 rangeBETWEEN 也不必然是范围 —— 区间窄且值连续时,优化器可能把它当 IN 处理:

EXPLAIN SELECT * FROM employees.titles
WHERE emp_no BETWEEN '10001' AND '10010'
  AND title = 'Senior Engineer'
  AND from_date BETWEEN '1986-01-01' AND '1986-12-31';
-- key_len=159,三列全用到

emp_no BETWEEN 10001 AND 10010 被展开成 10 个等值点,于是后面的列也跟上来了。对比闭区间和开区间:

-- key_len=4
WHERE emp_no > 10000 AND emp_no < 10011 AND title = 'Senior Engineer' AND from_date BETWEEN ...

-- key_len=159
WHERE emp_no >= 10001 AND emp_no <= 10010 AND title = 'Senior Engineer' AND from_date BETWEEN ...

整数类型上,闭区间 >= AND <= 等价于可枚举集合,优化器能把它当多值匹配;开区间 > AND < 走纯 range,后续列无法跟上。结论:写整数闭区间用 >= / <=,给优化器留出多值匹配的可能。

查询条件中含有函数或表达式

EXPLAIN SELECT * FROM employees.titles
WHERE emp_no = '10009' AND LEFT(title, 6) = 'Senior';

+----+-------------+--------+------------+------+---------------+---------+---------+-------+------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys | key     | key_len | ref   | rows | filtered | Extra       |
+----+-------------+--------+------------+------+---------------+---------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | titles | NULL       | ref  | PRIMARY       | PRIMARY | 4       | const |    3 |   100.00 | Using where |
+----+-------------+--------+------------+------+---------------+---------+---------+-------+------+----------+-------------+

被函数包裹的列默认无法走索引(这里 title 退化为 Using where)。WHERE emp_no - 1 = 10000 同理 —— 即便它等价于 emp_no = 10001,优化器也不会做常量折叠,只能全表扫。

解法:Functional Key Parts(8.0.13+)

8.0.13 起可以为表达式建索引:

ALTER TABLE titles ADD INDEX idx_title_left6 ((LEFT(title, 6)));

EXPLAIN SELECT * FROM employees.titles WHERE LEFT(title, 6) = 'Senior';
-- 现在能命中 idx_title_left6

底层实现是 隐式虚拟生成列 + 普通 B-Tree 索引。需要注意:

  • 表达式必须和查询里完全一致才会被匹配 —— LEFT(title, 6)SUBSTRING(title, 1, 6) 不互通。
  • 函数索引继承所有生成列的限制(字符集、长度上限、不能引用其他生成列等),且会占用表的列数上限。
  • 改 SQL 也是一条路:把 emp_no - 1 = 10000 写成 emp_no = 10001,不需要新索引就能命中。能改 SQL 优先改 SQL。

5.7 历史:没有 Functional Key Parts,只能改 SQL 或者用「显式生成列 + 普通索引」曲线救国。

EXPLAIN ANALYZE:从执行计划到实测耗时

要量化两种写法的差距,5.7 时代常用 SHOW PROFILES(已 deprecated),8.0.18+ 推荐 EXPLAIN ANALYZE

EXPLAIN ANALYZE
SELECT * FROM employees.titles
WHERE emp_no = '10009' AND from_date = '1995-02-18';

输出会给出每个迭代器节点的 实际行数 / 实际耗时 / 循环次数,而不是 EXPLAIN 的「估算」。配合 FORMAT=TREE 还能直观看出 join 顺序与算子嵌套,是 8.0 调优的首选工具。

References

– EOF –