It’s no secret that my database of choice for most “normal” types of development situations is PostgreSQL. Today, I want to walk you thru two of the more advanced datatypes available in Postgres, JSON and JSONB, via Active Record. Let’s get the elephant out of the room, Active Record is primarily a technology used with Ruby on Rails. While it’s true that it can be used with other frameworks, I am going to assume that you’re primarily working in a Rails environment, since that’s the most common use-case; in the coming months, however, I am going to be exploring other Ruby frameworks beyond Rails. Let’s also get the second elephant out of the room. Unlike most of my posts on specific technologies, this is going to be from a twenty-thousand-foot level and be more of a quick intro. The reason for that is that so many things can affect database performance and what data types/schema make sense for your project. Further, I am assuming you are not at Twitter scale and am therefore not focused on the highest-end concerns for this post.
The plain old JSON data type is exactly what it says on the tin, text that must be formatted as valid JSON. Taking the most naive view possible, JSON is just the ActiveRecord text type with a few validators; put down your keyboards — I know that’s missing some nuance. The JSONB type is actually a binary datatype. What that means is that the JSON string or object if you like that you save is converted into a binary type by ActiveRecord.
You’re probably thinking: “Binary! Hot dog! That’s got to be the right choice!” It’s true that in a lot of cases JSONB has some significant advantages of plain JSON. The largest one is that it has indexes, unlike JSON. If you’ve worked with SQL databases of any scale, you know how important indexes can be for reasonable performance when used correctly. It’s also a binary format and can (probably is in most cases, but there are cases where this isn’t true) be faster on queries in general than text JSON.
There are of course disadvantages with the binary format, the largest being that on the initial write there’s a conversion process that needs to happen that can perform poorer than the regular old JSON type: again, I am using weasel wording here to avoid the “well, actually” type arguments that so often happen when performance is brought up. It’s also a little more complex to reason about as there are small pitfalls here and that do not exist with the text type. JSONB tables have a bit more overhead than regular JSON and therefore have larger tables. The binary type also doesn’t use a database feature called statistics, leading to potentially slower queries; digging into what PostgreSQL statistics are is beyond the scope of this, but it’s basically a way that the database tries to optimize queries. Now, I know what you’re shouting at your screen: “You just said JSONB was faster, now you’re saying it’s not!” Well, yeah. Life is complicated. Compared to a traditional database table using only old-fashioned data types (ie no JSON of either kind), then JSONB will not win that race; I am assuming a well-structured table here of course.
Both JSON types are extremely useful and have their place, but are like cumin. They can make the dish, but overused and that’s a problem. Going on a limb here, I generally recommend that they be considered as a sort of last resort data type. In general, I try to only use them when the data I want to store is actual JSON, not just serialized Ruby objects. That might seem a little odd to more experienced developers for me to even say, but an anti-pattern that I’ve seen all to frequently is these types used as a way to just shove some data into a column with little regard for structure. This is just laziness and, if done frequently enough, is going to bog down your database. JSON and JSONB are great tools to have, but they are just that tools with very specific use cases and are no means a replacement for good upfront database design and architecture work. I hope you found this helpful and please reach out to me on Twitter with any feedback or questions or just to say ‘hey.’ Oh! And be sure to subscribe to my podcast!