简介
MySQL 5.7支持 JSON 的做法是,在server层提供了一堆便于操作 JSON 的函数,至于存储,就是简单地将 JSON 编码成 BLOB,然后交由存储引擎层进行处理,也就是说,MySQL 5.7的JSON 支持与存储引擎没有关系。
mysql json代码,主要在sql目录。
5.7.27与json相关:
item_json_func.cc
item_json_func.h
json_binary.cc
json_binary.h
json_dom.cc
json_dom.h
json_path.cc
json_path.h
opt_explain_json.cc
opt_explain_json.h
8.0又新增了:
json_diff.cc
json_diff.h
json_schema.cc
json_schema.h
json_syntax_check.cc
json_syntax_check.h
json_binary.h
json_binary.h包含了json类型的存储结构,比较简单:
The actual keys and values will come after the header, in the same
order as in the header.
object ::= element-count size key-entry* value-entry* key* value*
array ::= element-count size value-entry* value*
// number of members in object or number of elements in array
element-count ::=
uint16 | // if used in small JSON object/array
uint32 // if used in large JSON object/array
// number of bytes in the binary representation of the object or array
size ::=
uint16 | // if used in small JSON object/array
uint32 // if used in large JSON object/array
key-entry ::= key-offset key-length
key-offset ::=
uint16 | // if used in small JSON object
uint32 // if used in large JSON object
key-length ::= uint16 // key length must be less than 64KB
value-entry ::= type offset-or-inlined-value
// This field holds either the offset to where the value is stored,
// or the value itself if it is small enough to be inlined (that is,
// if it is a JSON literal or a small enough [u]int).
offset-or-inlined-value ::=
uint16 | // if used in small JSON object/array
uint32 // if used in large JSON object/array
注意,key的长度要小于64KB。
再看一张网上的图

partial update
5.7.x的json update,即便指定更新某个path,底层实现是读取整个json字段,修改完后再回写整个json字段。 对于大型的json,效率显然不高。 8.0版本引入partial update,能够只更新对应字段。
从binlog来观察对比json update的实现,就很直观,具体见Partial (Optimised) JSON updates and Replication in MySQL 8.0。
partial update的好处:
- 减少IO,提供update tps
- 减少binlog文件大小
支持partial update的函数
JSON_SET()
JSON_REPLACE()
JSON_REMOVE()
相关源码:json_diff.h
The Json_diff class is used to represent a logical change in a JSON column,
so that a replication master can send only what has changed, instead of
sending the whole new value to the replication slave when a JSON column is
updated.
json的变更,抽象到一个向量。最终的二进制结构:
/**
Vector of logical diffs describing changes to a JSON column.
*/
class Json_diff_vector
/**
The binary format has this form:
+--------+--------+--------+ +--------+
| length | diff_1 | diff_2 | ... | diff_N |
+--------+--------+--------+ +--------+
**/
开启partial update,修改my.cnf
binlog_format=ROW
binlog_row_image=MINIMAL
binlog_row_value_options=partial_json
binlog_row_value_options=PARTIAL_JSON takes effect only when binary logging is enabled and binlog_format is set to ROW or MIXED.