这个 join 类型一共包括 7 个词:cartesian left outer anti semi self join,第一次见的时候被这叠 buff 一样的前缀干晕了。
实际理解的话并不很困难,记录一下。
有参考:https://riptutorial.com/sql/example/22934/join-terminology–inner–outer–semi–anti—
基础概念
我们使用 mysql 8.0 来进行测试。
mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.25 |
+-----------+
1 row in set (0.00 sec)
mysql> create database test;
Query OK, 1 row affected (0.01 sec)
mysql> use test;
Database changed
mysql> create table t(a int);
Query OK, 0 rows affected (0.03 sec)
mysql> create table s(a int);
Query OK, 0 rows affected (0.03 sec)
mysql> insert into t values(1),(2),(3);
Query OK, 3 rows affected (0.02 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> insert into s values(3),(4),(5);
Query OK, 3 rows affected (0.01 sec)
Records: 3 Duplicates: 0 Warnings: 0
cartesian product
笛卡尔积,指两个表(关系)中每条记录各自组合一次生成新纪录的表。
为什么 cartesian 会翻译为 笛卡尔积?
- 法国数学家笛卡尔的本名为 René Descartes.
- 当时拉丁语是学者常用的语言,于是他的作品署名也会使用拉丁化的名字 Cartesius.
- Cartesius 的衍生形容词 Cartesian 用来命名一些他的理论产物了,比如 Cartesian Product。
- 中文翻译中,我们直接称作笛卡尔积。
在 mysql 中直接使用 (cross) join 不写等值条件,就会执行一个 cartesian product.
mysql> select * from t join s;
+------+------+
| a | a |
+------+------+
| 3 | 3 |
| 2 | 3 |
| 1 | 3 |
| 3 | 4 |
| 2 | 4 |
| 1 | 4 |
| 3 | 5 |
| 2 | 5 |
| 1 | 5 |
+------+------+
9 rows in set (0.00 sec)
mysql> select * from t cross join s;
+------+------+
| a | a |
+------+------+
| 3 | 3 |
| 2 | 3 |
| 1 | 3 |
| 3 | 4 |
| 2 | 4 |
| 1 | 4 |
| 3 | 5 |
| 2 | 5 |
| 1 | 5 |
+------+------+
9 rows in set (0.00 sec)
Join
更常见的 join 会带有等值条件,也被称作 inner join,有若干种写法。
mysql> select * from t join s on t.a=s.a;
+------+------+
| a | a |
+------+------+
| 3 | 3 |
+------+------+
1 row in set (0.00 sec)
mysql> select * from t join s using (a);
+------+
| a |
+------+
| 3 |
+------+
1 row in set (0.01 sec)
mysql> select t.a, s.a from t join s using (a);
+------+------+
| a | a |
+------+------+
| 3 | 3 |
+------+------+
1 row in set (0.00 sec)
mysql> select t.a, s.a from t,s where t.a=s.a;
+------+------+
| a | a |
+------+------+
| 3 | 3 |
+------+------+
1 row in set (0.00 sec)
outer join
left (outer) join 表示即使左表的数据匹配不到右表,也会输出到结果集中。
mysql> select * from t left join s on t.a=s.a;
+------+------+
| a | a |
+------+------+
| 1 | NULL |
| 2 | NULL |
| 3 | 3 |
+------+------+
3 rows in set (0.00 sec)
mysql> select * from t left outer join s on t.a=s.a;
+------+------+
| a | a |
+------+------+
| 1 | NULL |
| 2 | NULL |
| 3 | 3 |
+------+------+
3 rows in set (0.00 sec)
right (outer) join 类似,但右表成为了主表。
mysql> select * from t right outer join s on t.a=s.a;
+------+------+
| a | a |
+------+------+
| 3 | 3 |
| NULL | 4 |
| NULL | 5 |
+------+------+
3 rows in set (0.00 sec)
在 full (outer) join 中,两张表都是主表。
-- mysql 不支持
mysql> select * from t full outer join s on t.a=s.a;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'outer join s on t.a=s.a' at line 1
semi join
半连接,也就是只选取一张表的记录。
语法如下:
mysql> select * from t where t.a in (select * from s);
+------+
| a |
+------+
| 3 |
+------+
1 row in set (0.00 sec)
anti (semi) join
反连接,又称反半连接,只返回一张表的记录,且只返回没有匹配成功的。
mysql> select * from t where t.a not in (select * from s);
+------+
| a |
+------+
| 1 |
| 2 |
+------+
2 rows in set (0.00 sec)
self join
自连接,其实就是自己和自己进行连接。
mysql> select * from t where t.a in (select * from t);
+------+
| a |
+------+
| 1 |
| 2 |
| 3 |
+------+
3 rows in set (0.00 sec)
组合
cartesian left outer anti semi self join:
explain select a from t1 where t1.a>1 or t1.a not in (select a from t2);
TBD