JSON in Traditional Databases
Using JSON in traditional databases is possible! Check out how to do it properly in MySQL and Postgres.
In the early 2010s, one of the most popular emerging databases was MongoDB. It was popular because it was a document (JSON-based) database with a flexible schema. Schema-less was enticing in those days of moving fast and breaking things. Even though Mongo has crossed the chasm of hype, it is still one of the most popular databases. It also tried to bring back SQL and ACID, as these are critical for a well-positioned database. In the RDBMS world, the rise of Mongo gave an opportunity to embed JSON in traditional RDBMSs. All traditional databases such as MySQL, Postgres, Oracle and SQL Server started providing JSON support. Although JSON as a column type is still not a great choice in general, due to performance penalties and losing a well-defined domain, it is a good fit for some scenarios.
When JSON can be a good fit
There are some scenarios where JSON columns are worth considering in a traditional DB. These are:
Tenant-specific data
Say you have a white-label product where most things fit a traditional table format, but some fields differ between tenants and do not overlap. For example, in an application like Shopify, some merchants would want additional product metadata, which can lead to too many columns. In such a case, you could use a JSON column for storing the metadata instead of having a lot of NULL-filled columns for the rest of the tenants.
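As a minimal sketch (the table and column names here are hypothetical, in Postgres syntax): shared attributes stay as regular columns, while tenant-specific extras live in a single JSON column.

-- Common attributes are normal columns; tenant-specific extras go into JSONB
create table products (
  id          bigserial primary key,
  tenant_id   bigint  not null,
  name        text    not null,
  price_cents integer not null,
  metadata    jsonb   -- e.g. {"warranty_months": 24, "battery_type": "AAA"}
);

-- Only tenants that need the extra fields populate them
INSERT into products (tenant_id, name, price_cents, metadata)
values (42, 'Wall Clock', 1999, '{"warranty_months": 24, "battery_type": "AAA"}');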
Telemetry data
You should normally use a time-series database for this use case, but if you want to leverage a database like Postgres for storing metric or telemetry data, you could use a JSON column.
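For example, a rough sketch in Postgres (table and column names are illustrative, and this is not a recommendation over a real time-series DB): each reading keeps its variable labels in a JSONB column.

create table metrics (
  id          bigserial primary key,
  recorded_at timestamptz not null default now(),
  name        text not null,               -- e.g. 'cpu_usage'
  value       double precision not null,
  labels      jsonb                        -- e.g. {"host": "web-1", "region": "eu"}
);

-- Filter on a label without needing a dedicated column for it
SELECT recorded_at, value from metrics
WHERE name = 'cpu_usage' and labels->>'host' = 'web-1';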
Feature flags
If you are doing a lot of A/B testing and you want to store feature flags, you could go with a JSON column. Similar to tenant-specific data, it helps avoid a lot of BOOLEAN columns. You can have a single column like this:
{
  "SHOW_PRICES": true,
  "SHOW_DYNAMIC_PRICING": true,
  "SHOW_MFG_DATE": false,
  "SHOW_STOCK": true
}
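Reading a single flag back out is then a one-liner. A sketch in Postgres, assuming the flags above live in a JSONB column called feature_flags on a hypothetical tenants table:

-- Using the ->> text extractor and a boolean cast
SELECT id from tenants
WHERE (feature_flags->>'SHOW_DYNAMIC_PRICING')::boolean = true;

-- Or, equivalently for JSONB, using the containment operator
SELECT id from tenants
WHERE feature_flags @> '{"SHOW_DYNAMIC_PRICING": true}';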
But keep in mind when not to use JSON. Just because you can, does not mean you should.
JSON type Support
Postgres
Postgres has two JSON types, JSON and JSONB. Both accept JSON input.
JSON Introduced in Postgres 9.2 in 2012. It is stored as a plain string, with additional tools to do some JSON validation.
JSONB Introduced in Postgres 9.4 in 2014. JSONB is stored in a binary format with indexing capabilities to optimize reads. The other advantage of JSONB over JSON is its efficient JSON parsing.
JSON vs JSONB table
| What | JSON | JSONB |
|------|------|-------|
| Whitespace | Preserves whitespace | Does not preserve whitespace, as it is a binary format |
| Ordering of keys | Preserves key ordering | Does not preserve key ordering |
| Duplicate keys | Preserves duplicate keys. This is invalid JSON, so issues might arise when reading and validating it later | Removes duplicate keys and keeps only the last one |
| Ingestion performance | Faster | Slower |
| JSON processing | Slower, as the text field has to be parsed on each access | Faster, as the binary format has efficient storage and lookup of JSON keys |
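You can see the whitespace and key-ordering rows of this table directly by casting the same literal to both types. A quick sketch (the exact output formatting may vary slightly by Postgres version):

SELECT '{"b": 1,   "a": 2}'::json  as as_json,
       '{"b": 1,   "a": 2}'::jsonb as as_jsonb;
-- as_json : {"b": 1,   "a": 2}   -- text kept verbatim, order and whitespace preserved
-- as_jsonb: {"a": 2, "b": 1}     -- normalized binary form, keys reordered and whitespace dropped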
Caveats of Postgres JSON types
Malformed JSON: Both types throw an error if keys and string values are not properly quoted (e.g. '{"foo":bar}' is rejected).
Duplicate keys: No error!
JSON: Stores the document as-is, duplicates included.
JSONB: Stores only the last occurrence of the key, which might lead to confusion and loss of information if JSON validation is not done at the application layer before the value is stored in the database. (MySQL, by contrast, normalizes duplicate keys on insert rather than raising an error.)
SQL Script
INSERT into json_postgres (json_col_json_type) values ('{"foo":"bar","foo":"bar"}');
INSERT into json_postgres (json_col_jsonb_type) values ('{"foo":"bar","foo":"bar"}');
OUTPUT:
id|json_col_json_type        |json_col_jsonb_type|
--+--------------------------+-------------------+
 1|{"foo":"bar","foo":"bar"} |                   |
 2|                          |{"foo": "bar"}     |
MySQL
JSON has been supported since v5.7.8.
Features include:
- Validating JSON documents during insert.
If you have invalid JSON for a JSON-typed column, MySQL will throw an error.
Example for this query: INSERT into json_mysql (json_col_json_type) values ('{"foo":bar}')
The error comes back as:
SQL Error [3140] [22001]: Data truncation: Invalid JSON text: "Invalid value." at position 7 in value for column 'json_mysql.json_col_json_type'.
- Storage in Binary format for fast access
The JSON type is stored in a binary format, which helps with faster access to JSON keys and members. If you store the JSON in a text column and run a SELECT statement, it does the following:
SELECT "foo" key -> JSON parser -> scan the JSON for "foo" sequentially
The JSON type, on the other hand, has a lookup table. The lookup table contains pointers to all the keys in the JSON document, which can be queried efficiently with binary search.
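One way to see this difference with the json_mysql table defined later in this post: extraction from the JSON column works directly on the binary format, while the same extraction from the TEXT column has to parse the string each time (a sketch, using only standard MySQL JSON functions):

-- JSON column: value is already in binary format, key lookup uses the internal pointer table
SELECT `json_col_json_type`->>'$.foo' from json_mysql;

-- TEXT column: the string must be parsed into a JSON value on every call before the key can be found
SELECT JSON_UNQUOTE(JSON_EXTRACT(`json_col_text_type`, '$.foo')) from json_mysql;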
| Database | JSON support |
|----------|--------------|
| Postgres | 2 data types: JSON and JSONB |
| MySQL | 2 data types: JSON and TEXT |
| Oracle | 3 data types: OSON (Oracle's binary format, introduced in 21c); can also store in BLOB or CLOB |
| Microsoft SQL Server | No specific JSON type; store in NVARCHAR |
SQL example
MySQL
- Create table
create table json_mysql (
id bigint auto_increment primary key,
-- JSON is stored in binary format with metadata. Has storage overhead, but no runtime invalid JSON issues
`json_col_json_type` JSON,
`json_col_text_type` text
);
- Insert data
INSERT into json_mysql (json_col_json_type) values ('{"foo":"bar"}');
INSERT into json_mysql (json_col_text_type) values ('{"foo":"bar"}');
select `json_col_json_type`->>'$.foo' from json_mysql;
select `json_col_text_type`->>'$.foo' from json_mysql;
- Insert invalid JSON
INSERT into json_mysql (json_col_json_type) values ('{"foo":bar}') -- Throws error
select `json_col_json_type`->>'$.foo' from json_mysql;
INSERT into json_mysql (json_col_text_type) values ('{"foo":bar}') -- No error
select `json_col_text_type`->>'$.foo' from json_mysql;
JSON Methods
MySQL provides a number of JSON functions; a few of them:
- JSON_OBJECT
SELECT JSON_OBJECT('foo', 'bar');
-- Output
|JSON_OBJECT('foo', 'bar')|
|-------------------------+
|{"foo": "bar"} |
- JSON_ARRAY and JSON_OBJECT
SELECT JSON_OBJECT('employeeIds', JSON_ARRAY('e1', 'e2', 'e3'));
-- Output
JSON_OBJECT('employeeIds', JSON_ARRAY('e1', 'e2', 'e3'))|
--------------------------------------------------------+
{"employeeIds": ["e1", "e2", "e3"]} |
- JSON_CONTAINS_PATH
SELECT `json_col_json_type`, JSON_CONTAINS_PATH(`json_col_json_type`, 'one', '$.foo') as count_foo from json_mysql;
-- Output
json_col_json_type|count_foo|
------------------+---------+
{"foo": "bar"} | 1|
- Check size
SELECT
max(JSON_STORAGE_SIZE(`json_col_json_type`)) as max_cell_json_size,
max(JSON_STORAGE_SIZE(`json_col_text_type`)) as max_cell_text_size
-- max(char_length(sample_varchar_1)) as max_cell_varchar_size -- only if the table has such a VARCHAR column
FROM json_mysql;
Postgres
- Create table
create table json_postgres (
id serial primary key,
json_col_json_type JSON, -- JSON is stored as text but validated on insert, unlike a plain text column
json_col_text_type text,
json_col_jsonb_type JSONB -- JSONB is stored in a binary format: some storage overhead, but faster key access
);
- Insert data
INSERT into json_postgres (json_col_json_type) values ('{"foo":"bar","foo":"bar"}'); -- no error. stores as it is
INSERT into json_postgres (json_col_jsonb_type) values ('{"foo":"bar1","foo":"bar2"}');-- no error. but stores only last key
JSON Methods
Postgres JSON functions can be found here: Postgres JSON Functions
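A few of them as a quick taste, run against the json_postgres table created above (these are standard Postgres functions; see the docs linked above for the full list):

-- Extract a key as text
SELECT json_col_jsonb_type->>'foo' from json_postgres;

-- Build JSON from values
SELECT jsonb_build_object('employeeIds', jsonb_build_array('e1', 'e2', 'e3'));

-- Return the document with one key replaced
SELECT jsonb_set(json_col_jsonb_type, '{foo}', '"baz"') from json_postgres;

-- Pretty-print a JSONB value
SELECT jsonb_pretty(json_col_jsonb_type) from json_postgres;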
Index on JSON
JSON columns can be indexed for faster access to the data inside them.
MySQL
- Step 1: Check the POSSIBLE_KEYS attribute, which shows the index key the query can use.
EXPLAIN SELECT * FROM json_mysql;
Output:
id|select_type|table |type|possible_keys|key|key_len|ref|rows|
--+-----------+----------+----+-------------+---+-------+---+----+
1|SIMPLE |json_mysql|ALL | | | | | 3|
- Step 2: MySQL supports indexing a JSON value only if it is exposed as a generated column.
ALTER TABLE json_mysql ADD COLUMN foo VARCHAR(5)
GENERATED ALWAYS as (json_col_json_type->>'$.foo');
- Step 3: Add the index. Note that if you have an array like {"foo":["bar"]}, this will lead to a B-Tree issue, because the extracted value ["bar"] is not a valid scalar for a B-Tree index. Therefore, make sure your JSON key maps to a plain scalar value and not an array.
-- Check for NULL values to avoid incorrect B-Tree Indexes
SELECT JSON_UNQUOTE(JSON_EXTRACT(json_col_json_type, '$.foo')) FROM json_mysql;
-- Create index using BTree
ALTER TABLE json_mysql ADD INDEX foo (foo) USING BTREE;
- Step 4: Check POSSIBLE_KEYS again to see the index used by MySQL.
EXPLAIN SELECT * FROM json_mysql jm WHERE foo = 'bar';
Output:
id|select_type|table |type|possible_keys|key|key_len|ref |rows|
--+-----------+----------+----+-------------+---+-------+-----+----+
1|SIMPLE |json_mysql|ref |foo |foo|23 |const| 1|
Postgres
As discussed before, the JSONB type in Postgres allows indexing of JSON data. There are 3 types of indexes:
B-Tree The most common index data structure. It creates a tree that is kept balanced as new nodes are added. While reading, the B-Tree is traversed to find the key and return the node with the data you are querying for.
GIN (Generalized Inverted Index) B-Trees are great for ordinary operators on scalar values such as strings. But Postgres can also store composite data types such as JSONB, arrays, range types and full-text search documents; GIN indexes the individual keys and elements inside such values, which is what makes JSONB operators like containment (@>) fast.
Hash indexes Useful for equality operators, i.e. exact matches. For such operations they can be faster than a B-Tree, with a smaller memory footprint compared to B-Tree and GIN indexes.
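As a sketch, here is how each of these index types can be created on the json_postgres table from earlier (index names are illustrative):

-- GIN index over the whole JSONB document; speeds up containment queries such as @>
CREATE INDEX idx_jsonb_gin ON json_postgres USING GIN (json_col_jsonb_type);

-- B-Tree expression index on a single extracted key
CREATE INDEX idx_jsonb_foo_btree ON json_postgres ((json_col_jsonb_type->>'foo'));

-- Hash expression index for pure equality lookups on that key
CREATE INDEX idx_jsonb_foo_hash ON json_postgres USING HASH ((json_col_jsonb_type->>'foo'));

-- Queries that can use them
SELECT * from json_postgres WHERE json_col_jsonb_type @> '{"foo": "bar"}';  -- GIN
SELECT * from json_postgres WHERE json_col_jsonb_type->>'foo' = 'bar';      -- B-Tree or Hash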
Memory consideration
Postgres
Postgres stores rows in fixed-size pages (8KB by default, or another size as configured at build time). Since JSON values can be large, a row can easily exceed the roughly 2KB threshold beyond which Postgres no longer keeps everything in-line. Postgres handles such overflow/oversize values with TOAST (The Oversized-Attribute Storage Technique): if data goes beyond the threshold, it is moved to out-of-line storage.
Postgres solves the row-oversize issue via various techniques:
- Compression
- TOAST table for out-of-line storage
- A combination of compression and the TOAST table
There are 4 strategies of TOAST storage:
- Plain
- Extended
- External
- Main
| Storage type | Compression | Out-of-line storage | Usage | Order of compression + out-of-line storage |
|--------------|-------------|---------------------|-------|--------------------------------------------|
| Plain | No | No | Short rows that do not go beyond the TOAST limit | Not applicable |
| Extended | Yes | Yes | JSONB and other data types | Compression is done first; if the row is still too big, move it out-of-line |
| Main | Yes | No | If TOAST data is to be stored in the same place as the normal non-overflow data | Only compression |
| External | No | Yes | If you want faster lookups than MAIN, at a storage-size penalty | Only out-of-line storage |
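The storage strategy is set per column and can be inspected and changed. A short sketch against the json_postgres table (psql's \d+ also shows the current Storage setting per column):

-- Force out-of-line storage without compression for the JSONB column
ALTER TABLE json_postgres ALTER COLUMN json_col_jsonb_type SET STORAGE EXTERNAL;

-- Find the TOAST table that backs the main table
SELECT reltoastrelid::regclass from pg_class WHERE relname = 'json_postgres';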
MySQL
MySQL has decoupled storage engines such as InnoDB and MyISAM, so overflow handling differs per engine. Considering the most popular and default storage engine, InnoDB, there are some configurations for handling large JSON objects.
Similar to Postgres, if a row does not fit in the page (16KB by default in InnoDB), the JSON document can be moved to external overflow storage with a pointer kept in the original row.
There are 4 row formats for handling overflows:
Redundant The legacy row format, and the only one available before MySQL 5.0.3. It can keep at most a 768-byte prefix of any large column such as VARCHAR, BLOB or TEXT in the main page. The overflow is put into a new overflow page, with a 20-byte pointer from main page -> overflow page. If the overflow page itself overflows (inception!), then a new pointer is created from overflow page 1 -> overflow page 2 -> ...
Compact Similar to Redundant, but with storage savings of up to 20% due to a more compact representation.
Dynamic The dynamic row format tries to keep all data in one page. If the data is too big, the whole value is stored in a separate overflow page, instead of breaking the data up and using pointers to map from the main page to the overflow page. The index key prefix length is 3072 bytes instead of 768 bytes.
Compressed Has the same 3072-byte index key prefix length, and additionally does page-level compression, which the previous row formats do not.
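The row format is set per table and can be checked and changed. A quick sketch for the json_mysql table used earlier (DYNAMIC is the default in recent MySQL versions):

-- See which row format the table currently uses
SELECT table_name, row_format from information_schema.tables
WHERE table_name = 'json_mysql';

-- Switch the table to a different row format
ALTER TABLE json_mysql ROW_FORMAT = DYNAMIC;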
Conclusion
JSON is supported in relational databases, but the implementation and the nuances of its usage need to be understood in order to decide whether and how to use JSON in databases such as Oracle, MySQL and Postgres!