In my 15+ years of web development, there are very few things I can say are unequivocally a good idea. It almost always does depend. Storing timestamps instead of booleans, however, is one of those things I can go out on a limb and say it doesn’t really depend all that much. You might as well timestamp it. There are pl...
JackbyDev

A CRDT Boolean is also pretty easy to write. Essentially it is just a last-write-wins element set with a single possible element, except that the two timestamps represent true and false instead of added and removed. To get the current state, you take the value with the latest timestamp. To merge two values, you update the true and false timestamps to the latest of each.
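A minimal sketch of that idea in Python, for anyone who wants to see it spelled out. The class name `LWWBool`, the `write` method and the tie-breaking rule (ties resolve to false) are my own assumptions, not something from the comment above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LWWBool:
    """Last-write-wins boolean: one timestamp per possible value (hypothetical name)."""
    true_at: float = 0.0   # latest time the flag was written as True
    false_at: float = 0.0  # latest time the flag was written as False

    def write(self, value: bool, now: float) -> "LWWBool":
        # Record the write on the matching side; a timestamp never moves backwards.
        if value:
            return LWWBool(max(self.true_at, now), self.false_at)
        return LWWBool(self.true_at, max(self.false_at, now))

    @property
    def value(self) -> bool:
        # Assumption: ties (including the initial state) resolve to False.
        return self.true_at > self.false_at

    def merge(self, other: "LWWBool") -> "LWWBool":
        # Keep the latest timestamp on each side.
        return LWWBool(max(self.true_at, other.true_at),
                       max(self.false_at, other.false_at))


# Two replicas that saw different writes converge after merging.
a = LWWBool().write(True, now=1.0)
b = LWWBool().write(False, now=2.0)
merged = a.merge(b)
```

Because `merge` only takes a maximum per side, it does not matter in which order or how often replicas are merged.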

key

Only really makes any sense for flags that go from false to true and don’t go back often. And even then it has huge semantic cost. How do you distinguish a “boolean timestamp” from an actual timestamp? Is “modified at” a flag indicating a pending modification or a timestamp when it was last modified?

Much better to just have two columns, so e.g. you can see “enabled” and an “enabled_date” that indicates when you last enabled/disabled the entity.

Domi

Much better to just have two columns, so e.g. you can see “enabled” and an “enabled_date” that indicates when you last enabled/disabled the entity.

That sounds good until you realize you now have two sources of truth: do you trust enabled or enabled_date? If you really want to go this route, enabled should be a virtual field that checks enabled_date in the background, so you keep the boolean semantics but still store a single field.
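Roughly what I mean by a virtual field, sketched in plain Python rather than any particular ORM. The `Entity` class is hypothetical, and the choice to keep the original timestamp when enabling something that is already enabled is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Entity:
    # The only stored field: None means disabled, a timestamp means enabled since then.
    enabled_date: Optional[datetime] = None

    @property
    def enabled(self) -> bool:
        # Derived on every read, so it cannot disagree with enabled_date.
        return self.enabled_date is not None

    @enabled.setter
    def enabled(self, value: bool) -> None:
        if value and self.enabled_date is None:
            # Assumption: re-enabling an already enabled entity keeps the first timestamp.
            self.enabled_date = datetime.now(timezone.utc)
        elif not value:
            self.enabled_date = None


entity = Entity()
entity.enabled = True   # stores a timestamp behind the scenes
```

Callers read and write `enabled` like a boolean, while only `enabled_date` is ever persisted.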

I also used booleans a lot previously, but since using Laravel I have come to enjoy the updated_at, created_at and deleted_at fields that it automatically creates, and I now follow this format whenever I need more.

@Steeve@lemmy.ca

But if it can be disabled we’d also need a disabled_date. However, this implies that the state can switch from enabled to disabled and vice versa an infinite number of times, so we should create n*2 fields (enabled_date_1, disabled_date_1, …, enabled_date_n, disabled_date_n) where n is the maximum number of state switches/2. Of course we’ll have to implement stream logging of events into a database, or at least some sort of counter, to determine the value of n, and then dynamically create new fields as needed.

Problem solved!

I think having an enabled_at field as nullable timestamp is enough.
If it’s present, it’s enabled. If it’s null, it’s disabled.
It’s a Boolean with context.

If you really need to track the history of a record being enabled/disabled, I’d suggest this should be in another table. With postgres (not sure if it’s all DBs) you could create a trigger that when a record’s enabled_at field is updated, it creates a record in the log table with a from state, a to state, a timestamp, even a role/user.

That way, you could then extract the history of that record if required.
Tbh, if using postgres, you could just make a logging table that stores a JSON of the entire old record, and a JSON of the entire new record.
Would let you rewind the history of a record, see who did what, etc.
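Here is a rough sketch of the trigger idea. To keep it self-contained it uses SQLite through Python's built-in sqlite3 module instead of Postgres, so the syntax differs from what you would write there. The table and trigger names (`items`, `items_log`, `items_enabled_log`) are made up, and it leaves out the role/user column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (
    id         INTEGER PRIMARY KEY,
    enabled_at TEXT            -- NULL means disabled
);

CREATE TABLE items_log (
    item_id    INTEGER NOT NULL,
    old_value  TEXT,           -- the "from" state
    new_value  TEXT,           -- the "to" state
    changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);

-- Fires only when enabled_at actually changes; IS NOT is a NULL-safe comparison in SQLite.
CREATE TRIGGER items_enabled_log
AFTER UPDATE OF enabled_at ON items
WHEN OLD.enabled_at IS NOT NEW.enabled_at
BEGIN
    INSERT INTO items_log (item_id, old_value, new_value)
    VALUES (OLD.id, OLD.enabled_at, NEW.enabled_at);
END;
""")

conn.execute("INSERT INTO items (id) VALUES (1)")
conn.execute("UPDATE items SET enabled_at = '2024-01-01T00:00:00Z' WHERE id = 1")  # enable
conn.execute("UPDATE items SET enabled_at = NULL WHERE id = 1")                    # disable

history = conn.execute(
    "SELECT old_value, new_value FROM items_log ORDER BY rowid"
).fetchall()
```

The application only ever touches `enabled_at`; the history accumulates in `items_log` without any extra columns on the main table.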

Saves having an enabled and an enabled_at where there are potentially multiple sources of truth, or faffing around with arrays, multiple fields, or over-pulling data.

@Steeve@lemmy.ca

Yes my comment was definitely just a joke lol

How do you update it to unpublish? Add another timestamp column and whichever is latest wins, or just set published_at to null?

@pie@programming.dev

You wouldn’t store this information on the same table (unless you’re using a wide row db like dynamo/Cassandra). In a SQL world, you’d store version information in a separate table - one table for the HEAD state and another for history.
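One possible shape for this, sketched with SQLite via Python's sqlite3 module. The table names (`users`, `user_history`), the `enabled_at` column and the `save` helper are all assumptions for illustration. In particular, this sketch keeps the full current state in the HEAD table, which is only one way to split the columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (              -- HEAD: one row per user, current state only
    id         INTEGER PRIMARY KEY,
    version    INTEGER NOT NULL,
    enabled_at TEXT               -- NULL means disabled
);
CREATE TABLE user_history (       -- one row per version ever written
    user_id    INTEGER NOT NULL,
    version    INTEGER NOT NULL,
    enabled_at TEXT,
    PRIMARY KEY (user_id, version)
);
""")


def save(user_id, enabled_at):
    """Write a new version to the history table and make it the HEAD row."""
    with conn:  # commits both writes together, or rolls both back on error
        row = conn.execute(
            "SELECT version FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        version = 1 if row is None else row[0] + 1
        conn.execute(
            "INSERT INTO user_history VALUES (?, ?, ?)",
            (user_id, version, enabled_at),
        )
        conn.execute(
            "INSERT OR REPLACE INTO users VALUES (?, ?, ?)",
            (user_id, version, enabled_at),
        )


save(1, "2024-01-01T00:00:00Z")  # enable
save(1, None)                    # disable again; version 1 stays in user_history
```

Reads of the current state hit `users` only, and the older versions stay queryable in `user_history`.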

So, the history table has every column, but the user table has only user id and version, right?

[Figure: user_history table and user table]

@canpolat@programming.dev

Good point. However, approaching this problem from a “YAGNI” point of view is a bit misleading, I think. If you are not going to need the timestamp, you shouldn’t add it to your code base.

In my opinion, hastiness is the culprit. When a property appears to be binary, we jump to a boolean way too quickly. We should instead stop and ask ourselves if we are really dealing with a situation that can be reduced to a single bit. The point raised by the article is a good example: you may want to record the state change as a timestamp. Moreover, in a lot of cases, the answer is not even binary. The values for is_published may be “Yes”, “No” or “I don’t know” (and then we will be too quick to assign null to “I don’t know”). The underlying problem is that we don’t spend enough time modeling our problems, and this is a sure way of accumulating technical debt.

I think this timestamp-as-a-boolean is a good idea if the field is always going to be interpreted as either True or False and nothing more. If the field in question allows for a 3rd (uncertain) value, then using a timestamp would be extremely confusing.

And it all depends on the problem at hand. Any of those solutions can be acceptable as long as you have a well thought out model.

@rmam@programming.dev

Good point. However, approaching this problem from a “YAGNI” point of view is a bit misleading, I think. If you are not going to need the timestamp, you shouldn’t add it to your code base.

I don’t agree it was a good point. It sounds like the blog author missed a requirement a few times, and after getting repeatedly burned in the requirements gathering stage he now overcompensates for previous failings with ill-advised usages of timestamps instead of booleans.

YAGNI is always true. Always. The author’s point, even when timestamps end up being required, is moot.

Also, if state changes are required, then you don’t tack on a timestamp to a row. You instead track events, including switching stuff on and off.

I feel this blog post is bad advice fueled by trauma.
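To illustrate what tracking events could look like, here is a small Python sketch. `ToggleEvent` and `is_enabled` are hypothetical names, and treating an entity with no events as disabled is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class ToggleEvent:
    entity_id: int
    enabled: bool   # True for "switched on", False for "switched off"
    at: datetime


def is_enabled(events: List[ToggleEvent], entity_id: int) -> bool:
    """Current state is whatever the most recent event for the entity says."""
    latest: Optional[ToggleEvent] = None
    for event in events:
        if event.entity_id == entity_id and (latest is None or event.at >= latest.at):
            latest = event
    # Assumption: an entity with no events counts as disabled.
    return latest.enabled if latest is not None else False


log = [
    ToggleEvent(1, True, datetime(2024, 1, 1, tzinfo=timezone.utc)),
    ToggleEvent(1, False, datetime(2024, 2, 1, tzinfo=timezone.utc)),
    ToggleEvent(2, True, datetime(2024, 3, 1, tzinfo=timezone.utc)),
]
```

The row itself carries no timestamp at all; both the current state and the full on/off history come out of the event log.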

Jim

Ehhh, I don’t quite agree with this. I’ve done the same thing where I used a timestamp field to replace a boolean. However, they are technically not the same thing. In databases, boolean fields can be nullable, so you actually have 3-valued boolean logic: true, false, and null. You can technically only replace a non-nullable boolean field with a timestamp column, because you are treating null in the timestamp as false.

Two examples:

  1. A table of generated documents for employees to sign. There’s a field where they need to agree to something, but it’s optional. You want to differentiate between employees who agreed, employees who disagreed, and employees who have yet to agree. You can’t change the column from is_agreed to agreed_at.

  2. Adding a boolean column to an existing table. These columns need to either default to a value (which is fair) or be nullable.
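A tiny Python sketch of why the first example breaks. The `to_timestamp` function is a hypothetical, naive migration from is_agreed to agreed_at, not anything from a real codebase.

```python
from datetime import datetime, timezone
from typing import Optional


def to_timestamp(is_agreed: Optional[bool], now: datetime) -> Optional[datetime]:
    """Naive conversion of a nullable boolean into a nullable timestamp."""
    return now if is_agreed else None


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
agreed = to_timestamp(True, now)       # a timestamp
disagreed = to_timestamp(False, now)   # None
undecided = to_timestamp(None, now)    # None as well: indistinguishable from "disagreed"
```

Three input states go in and only two come out, which is the information loss described above.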

@Kissaki@feddit.de

To add to this: the DBMS may treat null as unknown rather than not set. This may not be immediately obvious or noticeable, but it means a check requires different syntax and even double checks for set and value. Using null as ‘cleared’ goes against the DBMS definition of what null means.
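A quick way to see the effect, sketched with SQLite through Python's sqlite3 module (so not SQL Server, though the comparisons shown follow standard SQL three-valued logic). The `docs` table and the `ids` helper are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, is_agreed INTEGER)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?)",
    [(1, 1), (2, 0), (3, None)],  # agreed, disagreed, no answer yet
)


def ids(where):
    """Return the ids of rows matching a WHERE clause (hypothetical helper)."""
    query = f"SELECT id FROM docs WHERE {where} ORDER BY id"
    return [row[0] for row in conn.execute(query)]


agreed = ids("is_agreed = 1")
not_agreed = ids("is_agreed <> 1")                   # the NULL row is silently skipped
unknown = ids("is_agreed IS NULL")                   # NULL needs its own syntax
either = ids("is_agreed <> 1 OR is_agreed IS NULL")  # the "double check"
```

The `<>` comparison against NULL evaluates to unknown rather than true, so row 3 only shows up once the query asks for NULL explicitly.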

SQL Server docs

As a side note: this difference in null behavior can become especially problematic when you use Entity Framework to map tables and SQL queries to C#, because it’s not obvious and may not be known or seen.

@delial@lemmy.sdf.org

Yeah, this feels like “premature optimization”. When you design your applications and databases, it should reflect your understanding of the problem and how you solved it as best as possible. Using DATETIMEOFFSET NULL when you actually mean BIT NOT NULL isn’t saying what you mean. If you already understand that you have a boolean option and you think you might need a timestamp to track it, use 2 columns. Or an audit table. So sayeth the holy SRP.

Victron

Completely agree, I cram a timestamp column in every table, but booleans have their purpose too.

@lukad@programming.dev

Using a nullable Boolean to represent 3 distinct states just adds confusion and complexity to your system. In most cases I would prefer to use a non-nullable enum with 3 values.
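Something along these lines, as a Python sketch. The `Agreement` enum, its member names and the `describe` function are made up for illustration.

```python
from enum import Enum


class Agreement(Enum):
    # Three explicit states instead of True / False / NULL.
    PENDING = "pending"
    AGREED = "agreed"
    DECLINED = "declined"


def describe(state: Agreement) -> str:
    """Every state has a name, so no branch has to guess what NULL means."""
    if state is Agreement.AGREED:
        return "signed"
    if state is Agreement.DECLINED:
        return "refused"
    return "waiting for an answer"


# The stored string round-trips back to the enum member.
loaded = Agreement("pending")
```

An unexpected value such as `Agreement("maybe")` raises a ValueError instead of quietly passing as a third state.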
