Why I Use TINYINT Columns Instead Of BIT Columns For Boolean Data In A MySQL Application
Since the theme of my week appears to be database interactions (having looked at index structure query-performance on "membership" tables and prepared statements in the JDBC driver), I figured I would end the week on more database-related "hot take": I no longer use BIT
columns to store "Boolean" data in MySQL. Instead, I use a TINYINT
column.
CAUTION: There is no "right" answer on this topic - this is just my opinion based on my experience and the choices that have come back to haunt me / make my database interactions more challenging.
Semantically speaking, a BIT
field is no more clear or meaningful than a TINYINT
field when it comes to storing Boolean data. Because, at least in MySQL, a BIT
field isn't a true-false data type - it's a binary data type. A BIT
field contains "bits": N-number of bits, between 1 and 64. The only reason that it can represent a true-false value within an application is because the developers working on that application have all agreed to treat it as such.
A TINYINT
field is also not a true-false data type - it's a numeric data type. And, like the BIT
field, the only reason that it can represent a true-false value within an application is because the developers working on that application have all agreed to treat it as such.
So, semantically speaking, neither BIT
nor TINYINT
are "Booleans"; but, they can both be overloaded to represent a Boolean value if, and only if, the developers working on that application all agree on said social contract.
Now, storage-wise, there's also no difference between a BIT(1)
and a TINYINT
because a BIT
field is actually stored in bytes. Meaning, in order to store a single bit, you have to store 8-bits. Which is also the amount of storage required by a TINYINT
.
So, from a physical stand-point, you're not "saving space" by using a BIT
instead of a TINYINT
when attempting to represent a Boolean value.
To recap:
Neither
BIT
norTINYINT
is a "Boolean" value. We can only treat them as Boolean values when our developers agree to treat them as Boolean values.Both
BIT(1)
andTINYINT
require a byte of storage.
Since there's no semantically meaningful difference between the two data-types when it comes to representing Boolean values, neither data-type is "more correct". But, there are downsides to using a binary field in order to represent a true-false value:
The MySQL driver for Node.js returns
BIT
data as aBuffer
; because, of course it would. After all, aBIT
column represents binary data, which is what theBuffer
represents in Node.js. As such, in order to translate aBIT(1)
result into atrue
/false
data-type, you have add special type-casing logic to your database client configuration.The MySQL driver for Java returns binary data when
BIT
is used in aCOALESCE()
call. As such, if you are performing aLEFT OUTER JOIN
on aBIT
field and attempt to provide a default value withCOALESCE()
, you have toCAST()
the resultant value back to anUNSIGNED
type in order to use the value as a Truthy within your application code.
The reason I have articles written on the above two issues is because I've personally run into these two issues and wasted hours trying to understand the issue and come up with a work-around.
On the other hand, I have never personally run into an issue when using a TINYINT
column to represent a "Boolean" value.
With all that said, I will conceit that MySQL will allow you to store 2
in a TINYINT
field; but, will prevent you from storing 2
in a BIT(1)
field because a 2
would require more bits than are defined on the column-type. As such, the BIT(1)
type does place more constraints on the stored value.
That said, in my ColdFusion applications, if I'm persisting a true
/ false
value, I'm generally not using numeric types in the code - I'm using Boolean types that get persisted as TINYINT
types at write-time (using CFQueryParam
):
public void function setIsFavorite(
required numeric movieID,
required boolean isFavorite // THIS IS A BOOLEAN ABSTRACTION!
) {
```
<cfquery>
UPDATE
movie
SET
isFavorite = <cfqueryparam value="#isFavorite#" sqltype="tinyint" />
WHERE
id = <cfqueryparam value="#movieID#" sqltype="integer" />
</cfquery>
```
}
Of course, being that ColdFusion is a very dynamic, flexible language, you can pass a numeric value in for a Boolean type. As such, one could easily pass-in a value such as 2
for the isFavorite
argument above and ColdFusion would:
Not complain that it's a numeric value.
Persist a
2
into theisFavorite
field.
To fix this issue, you can normalize the isFavorite
boolean at write time in a variety of ways such as the double-bang operator, or the booleanFormat()
, yesNoFormat()
, or trueFalseFormat()
functions:
public void function setIsFavorite(
required numeric movieID,
required boolean isFavorite
) {
```
<cfquery>
UPDATE
movie
SET
-- Normalize using Double-Bang operator:
isFavorite = <cfqueryparam value="#( !! isFavorite )#" sqltype="tinyint" />,
-- Or, booleanFormat:
isFavorite = <cfqueryparam value="#booleanFormat( isFavorite )#" sqltype="tinyint" />,
-- Or, yesNo:
isFavorite = <cfqueryparam value="#yesNoFormat( isFavorite )#" sqltype="tinyint" />,
-- Or, trueFalse
isFavorite = <cfqueryparam value="#trueFalseFormat( isFavorite )#" sqltype="tinyint" />
WHERE
id = <cfqueryparam value="#movieID#" sqltype="integer" />
</cfquery>
```
}
In the end, neither the BIT
type nor the TINYINT
type are a prefect choice when representing a Boolean value within a MySQL database. They are both semantic overloads that take up the same amount of storage space. In the end, the choice is personal; and for me, the TINYINT
makes the most sense since a TINYINT
never causes issues in my SQL queries (in the way that BIT
has and does). And for me and my style of ColdFusion development, that's the right trade-off.
Your mileage may vary.
Epilogue On The MySQL JDBC Driver
In writing this article, I discovered that the MySQL JDBC Driver has a few Boolean-related properties that can be set:
tinyInt1isBit
: Should the driver treat the datatypeTINYINT(1)
as theBIT
type (because the server silently convertsBIT
->TINYINT(1)
when creating tables)? Default: true.When I was experimenting with this, what I noticed is that the value comes into the ColdFusion application as a
0
or1
; however, the value may still be persisted as a larger integer in the data-table.transformedBitIsBoolean
: If the driver convertsTINYINT(1)
to a different type, should it useBOOLEAN
instead ofBIT
for future compatibility with MySQL-5.0, as MySQL-5.0 has aBIT
type? Default: false.
I had never seen these before, so I have no feelings about them.
Want to use code from this post? Check out the license.
Reader Comments
Very interesting. I have always used TINYINT, but, not because I am smart. It's just the way I have always done it. I guess it's maybe because the word bit frightens me!
What I don't really understand, is why you would want to do:
Wouldn't this just return true or false, when in fact, you want to submit either a 0 or 1
And wouldn't this throw an exception? Because you are trying to submit true/false into a TINYINT field?
@Charles,
Good question - the
CFQueryParam
tag is actually converting the strings"true"
and"false"
into the numeric equivalent for theTINYINT
parameter type. In this case "true" gets converted to "1" and "false" gets converted to "0". The conversions are only there to protect against someone trying sneak a number through as one of the boolean values. The reality is, nothing would really be "damaged" by having a value like2
in the database. The downside would be that a SQL query that only ever checks for0
or1
will just never match against that record.You should have an asterisk next to 1 bit uses 1 byte of storage. It is my understanding that every set of 8 bit-fields uses the same 1 byte in a row of data. I know this doesn't sound like a big difference but I guess there are databases where this could really add up.
cnt bits tinyint
1 1 byte 1 byte
4 1 byte 4 bytes
8 1 byte 8 bytes
9 2 bytes 9 bytes
15 2 bytes 15 bytes
32 4 bytes 32 bytes
I only realized this because I had the same thought awhile ago. Sometimes rules change where a column may start out as a yes/no but could turn into a type or how many. I looked into why not just make them all tinyint from the start. Well that is why.
Some good points here. For me it's this 5+ year old workbench bug which prevents it from using bit fields in any productive way: https://bugs.mysql.com/bug.php?id=79604
Is there a blog explaining why should we even use tinyint/bit over boolean? Early in career i learnt that has performance benefits. I'm yet to find a concrete explanation on how it is better. Another thing , as you have mentioned multiple times about using other options to represent boolean works because "the developers working on that application have all agreed to treat it as such" . Isn't it then difficult from maintenance point of view?
Appreciate your help on the topic.
@Aarzoo,
MySQL BOOL(EAN) type is the same as TINYINT(1). I think the MySQL maintainers said they planned on implementing real BOOL type like 10 years ago, to comply with the ANSI SQL standard, but still have not, as far as I know.
https://www.tutorialspoint.com/does-mysql-converts-bool-to-tinyint-1-internally#:~:text=Yes%2C%20MySQL%20internally%20convert%20bool,the%20smallest%20integer%20data%20type.
Your link, "a BIT field is actually stored in bytes", does not actually link to documentation that states this. See here instead for that info.
Additionally, if you are using NDB, you should be aware that a multiple of 4 bytes is reserved for all datatypes to facilitate its 4-byte alignment.
@Jivan,
Thank you for providing more appropriate links 💪 I don't think I've ever heard of NBD storage - I'll have to look that up.
Bits use less storage than tinyint (significantly less, as the number of bit columns increases). Bits process faster than tinyints. A programmer who uses tinyint is signaling to the next programmer, and the next programmer will assume, the data is meant to contain values greater than 1. If 1s and 0s only represent true and false, and are not actually true and false, then 1 more accurately represents true and 0 false that reversing them would not make sense. However, 1s and 0s do not represent true and false. 1 and 0 are true and false, as math itself is an abstraction for logic. If it were not the case, Karnaugh maps would not work, and math itself would fail. To suggest their interpretation could be reversed is to suggest 5 could represent 6. At that point, further discussion is not helpful.
@Johnny,
At least in MySQL, I don't believe you are correct about the storage requirements. From what I am seeing in the documentation for InnoDB storage, bits are stored as multiples of bytes. So, a
TINYINT
andBIT(1)
both use a single byte.As far as "bit are processed faster", I am not sure I understand what that means. What processing are you talking about? In this post, I am specifically talking about using
TINYINT
to represent True and False values, so I am not doing any processing on the value itself (other than casting it to Truthy/Falsey -- but that is happening in the application code, not in the database code).And, as far as signaling intent to the next developer, I would only offer up that this has not been an issue for the teams that I work on. Normally, when an engineer adds a datatable, they also add the data-access layer that consumes that datatable. The data access layer deals in Boolean values; and, only casts to
TINYINT
when writing to the database. As such, signaling from the code is usually very clear. Not to mention that the database column names choose a truthy-style semantic, likeisActive
,isArchived
,isApproved
, etc. As such, to try and put a5
into anisActive
column wouldn't make any sense.I definitely see your points; but, at least in my experience, they feel more grounded in theory than in practice. But, again, your mileage may vary.