15

Simply put, I have date and time attributes on an Orders table. These attributes use MySQL's DATE data type. But every time I echo the date and time in PHP, it just gives me a string, not a date object. I don't understand why the DATE data type exists if you can casually store date objects as VARCHAR in the db.

$date = date("M d, Y");
$time = date("H:i A");
15
  • 18
    What happens if you try to find the difference of two VARCHAR objects? Commented Feb 9, 2021 at 12:21
  • 14
    Or "how many days are there between date1 and date2?" or "what ISO week number is date1?" or "select all records where date1 is a Monday?" etc etc etc Commented Feb 9, 2021 at 12:33
  • 33
    I suggest you benchmark your method against doing the same comparisons natively in the database. Don't be surprised if the database is 100x quicker. Commented Feb 9, 2021 at 12:41
  • 3
    Let's just say you have 100 orders a year for the last 10 years. If you wanted all the orders within the last year, what is more logical and optimised: get the database to return only the records you need, OR loop through all the records yourself and collect the ones that match your criteria?
    – Crazy Dino
    Commented Feb 10, 2021 at 0:24
  • 8
    "In comparing such objects I would just convert them into datetime objects in php and then do the comparing." Why are you using an SQL database if you're not going to use SQL? Commented Feb 10, 2021 at 5:34

6 Answers

62

I don't understand why the DATE data type exists

Always store Date values in Date Fields.

What PHP gives you back when you retrieve those values is a Character Representation of the Date value but, within the database, you can (and should) perform Date operations on Date fields that you simply cannot do effectively with Strings.

For example, let's look at some similar-looking entries:

select 
  char_date
, date_date 
from table1 
order by char_date ; 

+-------------+-------------+
| char_date   | date_date   | 
+-------------+-------------+
| 01-Mar-2021 | 01-Mar-2021 |
| 12-Feb-2021 | 12-Feb-2021 |
| 23-Jan-2021 | 23-Jan-2021 | 
+-------------+-------------+

select 
  char_date
, date_date 
from table1 
order by date_date ; 

+-------------+-------------+
| char_date   | date_date   | 
+-------------+-------------+
| 23-Jan-2021 | 23-Jan-2021 | 
| 12-Feb-2021 | 12-Feb-2021 |
| 01-Mar-2021 | 01-Mar-2021 |
+-------------+-------------+

See the difference?
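The same effect can be reproduced on the PHP side. This is only a sketch: the array below is hypothetical sample data mirroring the three rows above, and the `d-M-Y` format is an assumption about how the strings were written.

```php
<?php
// Hypothetical sample data mirroring the char_date / date_date rows above.
$charDates = ['23-Jan-2021', '12-Feb-2021', '01-Mar-2021'];

// Sorting as plain strings compares character by character, so "01" < "12" < "23".
$asStrings = $charDates;
sort($asStrings, SORT_STRING);

// Sorting as dates: parse each string first, then compare the parsed values.
$asDates = $charDates;
usort($asDates, function (string $a, string $b): int {
    // The leading "!" resets unparsed fields (such as the time) to zero.
    $da = DateTimeImmutable::createFromFormat('!d-M-Y', $a);
    $db = DateTimeImmutable::createFromFormat('!d-M-Y', $b);
    return $da <=> $db;
});

print_r($asStrings); // 01-Mar-2021, 12-Feb-2021, 23-Jan-2021
print_r($asDates);   // 23-Jan-2021, 12-Feb-2021, 01-Mar-2021
```

The string sort only looks at characters, so the day-of-month prefix decides the order; the date sort has to parse every value before it can compare anything.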

And because applications tend to apply far less control over how Character data is entered, your char_date data very quickly gets awfully messed up and even more difficult to interpret.
For example, when is '01/04/07'? January 4th? April 1st? April 7th (2001)? Depends where you are in the world!
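To make the ambiguity concrete, here is a small PHP sketch that parses the same string under three formats. The three format strings are my own picks for illustration; real input could follow other conventions too.

```php
<?php
$input = '01/04/07';

// Three plausible readings of the same characters (formats chosen for illustration).
$formats = [
    'm/d/y' => 'month/day/year',
    'd/m/y' => 'day/month/year',
    'y/m/d' => 'year/month/day',
];

$readings = [];
foreach ($formats as $format => $label) {
    // The leading "!" resets unparsed fields (such as the time) to zero.
    $parsed = DateTimeImmutable::createFromFormat('!' . $format, $input);
    $readings[$label] = $parsed->format('Y-m-d');
}

print_r($readings);
// month/day/year => 2007-01-04
// day/month/year => 2007-04-01
// year/month/day => 2001-04-07
```

All three parses succeed without complaint, which is exactly the problem: nothing in the string itself says which reading was intended.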

OK, you could say that your application will convert the entered Character Representation of a Date value into a proper Date and store a consistently-formatted [Character] version of that but, as soon as you start doing Date operations on those values, your database has to start doing Date conversions on the fly, which can be horribly slow.

Use the Right Tool for the Right Job ...

10
  • 31
    That's great if you're converting the data values from a single row that you can otherwise identify. What I'm talking about is when you need to /find/ rows based on Date values - say orders that are more than a year old. If your dates are stored as Character data, especially in undefined formats, this sort of simple, Date query gets tortuous and unreliable. Are your HTML controls Locale aware? If so, we're back to my '01/04/07' example and your supposedly reliable date formatting has gone right out of the window.
    – Phill W.
    Commented Feb 9, 2021 at 13:32
  • 1
    If you're parsing a date to validate it, check assumptions about d/m/y versus y/m/d versus m/d/y, whether the year is a leap year, and whether it is in a sensible range (no 10,000 BC, etc.). Typically, databases will allow you to store a date object as such and then permit queries like "select * from txn where tdate between sysdate-7 and sysdate;" and other stuff like (paraphrasing) 'get sales which billed during business hours of /this/ client's timezone'.
    – jmullee
    Commented Feb 9, 2021 at 22:22
  • 20
    @mizstereo What happens if someone needs to connect a second, non-php application to your database? What happens if you have to update this thing to another tech? Are you going to port all that conversion logic, too? It may look easy, but by storing varchar instead of date, you are slowly feeding a big, bad dog, called "Technical Debt". Someday it will break out of its pen and go right for your buttocks, and you will wish it wasn't that big...
    – T. Sar
    Commented Feb 10, 2021 at 0:21
  • 3
    @mizstereo: Honestly, a string in YYYY-MM-DD format is almost as good as a proper date field, since it's unambiguous and works properly in comparisons. What you still gain with a real date field is mostly that 1) you can do date arithmetic in the DB (e.g. to find all dates at most X days before today), 2) the database rejects any invalid or ill-formed dates, so you don't have to rely on the client alone for validation, and 3) you'll save a few bytes per date due to a more compact storage format, which could matter if your database was huge and had lots of dates in it. Commented Feb 10, 2021 at 11:06
  • 1
    I'm tempted to downvote; about half of the answer is invalidated if you just use an ISO 8601 date/time format. Date-as-characters is a solved problem. I suggest focusing on the advantages that the date format has over proper varchar usage rather than ways one could misuse varchar. Commented Feb 10, 2021 at 14:28
24

In programming, we are always working with abstractions and representations:

  • The text string 2021-02-09 12:47:14 UTC is a representation of a particular point in time.
  • The integer 1612874834 is a different representation of that same point in time.
  • Both of those will ultimately be represented in binary form; a text string and a written numeral are already abstractions that save us from thinking about that fact.
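As a quick check that the two representations in the list really do name the same instant, here is a minimal PHP sketch (the variable names are just for illustration):

```php
<?php
// The two representations from the list above.
$fromText = new DateTimeImmutable('2021-02-09 12:47:14 UTC');
$fromInteger = new DateTimeImmutable('@1612874834'); // "@" means a Unix timestamp

// Both describe the same instant, so they compare as equal.
var_dump($fromText == $fromInteger);  // bool(true)
var_dump($fromText->getTimestamp());  // int(1612874834)

// Converting the integer back to text, expressed in UTC.
$asText = $fromInteger->setTimezone(new DateTimeZone('UTC'))->format('Y-m-d H:i:s');
echo $asText, "\n"; // 2021-02-09 12:47:14
```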

We can convert between those and many other representations relatively easily, but we also want to be able to perform operations on them: take input, display output, sort dates, find the number of days between two dates, etc.

For those, we want to use appropriate abstractions - we don't want to code a bunch of binary logic for every operation, we want to build up readable code.

  • For user input and output, the abstraction of "a date is a text string that matches some pattern" is a useful one.
  • However, there may be many such patterns, depending on the context - a user in the USA will want to see and enter dates in "month / day / year" format, users elsewhere in the world will want "day / month / year". So we immediately need a neutral abstraction which represents the date rather than just storing the user's input directly.
  • For finding the number of days between two dates, we want a higher-level abstraction that is not about what the date looks like but what it represents.

Using the wrong abstraction for the job leads to unnecessarily complex code. It can also lead to bugs, e.g. if you pass all your dates around as strings, and accidentally pass in something incorrectly formatted, you'll probably get "garbage in, garbage out" rather than a clear error.

So, wherever possible, you should represent everything with higher-level abstractions. With date-time values, that means DateTimeImmutable objects in PHP, and DateTime columns in MySQL.

There is one unfortunate hurdle: when sending data to or from the database, you need it to be in a representation that both sides understand. Most database drivers use a "lowest common denominator" approach, where that representation is a text string. As soon as you have that string from the database, though, you can and should convert it to PHP's DateTimeImmutable type, so you have the best abstraction for working with it.
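A minimal sketch of that conversion step follows. The `$row` array, its column names, and the assumption that values are stored in UTC are all hypothetical; substitute whatever your driver and schema actually give you.

```php
<?php
// Hypothetical strings, as a MySQL driver might return them for DATETIME columns.
$row = ['ordered_at' => '2021-01-23 09:15:00', 'shipped_at' => '2021-02-12 17:40:00'];

// Assumption: the values were stored in UTC.
$utc = new DateTimeZone('UTC');
$orderedAt = DateTimeImmutable::createFromFormat('!Y-m-d H:i:s', $row['ordered_at'], $utc);
$shippedAt = DateTimeImmutable::createFromFormat('!Y-m-d H:i:s', $row['shipped_at'], $utc);

// With real date objects, date arithmetic is a method call rather than string handling.
$daysToShip = $orderedAt->diff($shippedAt)->days;
echo "Shipped after {$daysToShip} days\n"; // Shipped after 20 days
```

Doing the conversion once, right where the row is read, means the rest of the code never has to know what the string looked like.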

8
  • 1
    Nitpicking: parts of the world want to see dates as YYYY-MM-DD; there are a lot of variations.
    – ghellquist
    Commented Feb 10, 2021 at 4:18
  • 1
    @ghellquist Yes, I didn't want to go into too much detail; by "elsewhere", I just mean "in at least one other part of the world", rather than "absolutely everywhere else", although it is probably the most common order.
    – IMSoP
    Commented Feb 10, 2021 at 8:49
  • 2
    And 1612874834 is itself (only) a representation of an integer Commented Feb 10, 2021 at 21:35
  • 2
    @DavorŽdralo Heh, true, the string should have a timezone. The integer, assuming the normal Unix convention, does represent a point in time, since it's a number of seconds since a particular moment of time defined as "the epoch". (The epoch happens to be midnight in a particular time zone, but the timestamp itself doesn't have a timezone as such, it's just the number of seconds a clock would have ticked counting from that moment.)
    – IMSoP
    Commented Feb 11, 2021 at 11:47
  • 1
    @IMSoP, note that there are also some use cases where you actually want to represent the wall-clock time, which means a different point in time in each time zone.
    – Jan Hudec
    Commented Feb 11, 2021 at 17:49
13

But every time I echo the date and time in PHP it just gives me a string, not a date object.

That's basically just the quirk of your DB access library. When it comes down to it, all data are reducible to strings, and it's just a matter for the DB access library to convert between database data types and the programming language's data types.

why the DATE data type exists if you can casually store date objects as VARCHAR in the db

You can store dates in a VARCHAR column, but you can also store non-dates in that VARCHAR column. Databases have data types for columns to enforce data validation: when you use DATE columns, you know that it is impossible for the column to contain anything other than dates, so you can simplify your application logic because it never has to worry about invalid data.

Additionally, data types allow the database to use type-specific functions, which let you build more complex operations such as aggregate/group by queries that involve the column. If you need to write aggregate queries for reports like "find the total number of transactions grouped by the day of the week", you'll need to use the database's date/time functions, because loading the entire table into PHP so you can parse the dates using PHP date functions will be very slow, and you can't then use those as intermediate results for a subquery.

Data types are data plus validations and operations.

In theory, you only ever need the varchar type to store any data, but you'll then have to deal with all the validations and operations yourself in the application code, which will make it much harder to write performant queries.

11

Size

A MySQL DATE is 3 bytes. Each char in a VARCHAR is 1 byte, which means a VARCHAR date of 8 digits is at least 2.67 times the size of a DATE, or 3.33 times if you include separators (10 characters). It is much larger still if your string dates are not formatted sets of digits. That's not huge, and probably not a concern for space on disk. But for space in memory, when you need to fetch or sort the data, it adds up.
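The ratios are simple arithmetic; a rough PHP sketch of it is below. It assumes a single-byte character encoding and ignores the VARCHAR length prefix, so treat the results as lower bounds. The spelled-out sample string is my own example.

```php
<?php
// Assumptions: single-byte encoding; the VARCHAR length prefix is ignored.
$dateBytes = 3; // size of a MySQL DATE, as stated above

$digitsOnly = strlen('20210209');       // 8 bytes
$withSeparators = strlen('2021-02-09'); // 10 bytes
$spelledOut = strlen('Feb 09, 2021');   // 12 bytes (hypothetical free-form value)

printf("%.2f\n", $digitsOnly / $dateBytes);     // 2.67
printf("%.2f\n", $withSeparators / $dateBytes); // 3.33
printf("%.2f\n", $spelledOut / $dateBytes);     // 4.00
```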

Data Integrity

If someone tries to INSERT a date of 'Duodecary 32nd 1999' (or even a formatted 1999/14/32), a DATE field will not allow it. It will throw an error as is good and proper. Your VARCHAR field won't. Which means you need to code against that potential, and you need to guarantee that anyone else who ever codes against the database does the same. You are creating either potential issues or more work for yourself. Usually both.
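For a sense of what "coding against that potential" means, here is a sketch of the kind of check every client would need if the column were a VARCHAR. `isValidIsoDate` is a hypothetical helper, and it assumes you have settled on the YYYY-MM-DD format, so the second sample uses dashes rather than slashes.

```php
<?php
// Hypothetical helper: accept only strings that are real calendar dates in YYYY-MM-DD form.
function isValidIsoDate(string $value): bool
{
    $parsed = DateTimeImmutable::createFromFormat('Y-m-d', $value);
    // Out-of-range parts either fail to parse or roll over into another date,
    // so the value must also survive a round trip unchanged.
    return $parsed !== false && $parsed->format('Y-m-d') === $value;
}

var_dump(isValidIsoDate('1999-12-31'));          // bool(true)
var_dump(isValidIsoDate('1999-14-32'));          // bool(false)
var_dump(isValidIsoDate('Duodecary 32nd 1999')); // bool(false)
```

With a DATE column, the database performs this check for every writer, whether or not they remembered to call a helper like this.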

In addition, depending on the maximum length of your VARCHAR, you can get string truncation errors, or bad data if you suppress those errors. Unless your VARCHAR is longer than it needs to be, in which case your size issues may be worse.

Simplicity

This is mentioned in other answers but bears repeating. There are many functions that deal with DATE data types: adding a certain amount of time, determining the difference, converting to other languages (English/Spanish/French, not C#/JavaScript/PHP), and converting to other date formats. These do not exist for VARCHAR, meaning you have to either do it yourself or convert the VARCHAR to a DATE in the first place. That costs extra cycles, and it is something you need to remember every time you need to manipulate your date.

See also:

https://stackoverflow.com/questions/4759012/when-to-use-varchar-and-date-datetime
https://dba.stackexchange.com/questions/208716/which-column-is-better-to-be-used-here-varchar-or-datetime/208724

Also, one last thing I want to add. You should not ever design your database based on the programming language you are using to access it. You should design your database based on what your data is and proper database design standards. You might switch to Ruby or TypeScript next week. Or someone else who doesn't use PHP may need to access the data. A good data structure will serve you far better than changing your data to meet a language or library's quirks.

2
  • 2
    If you've got millions of rows and multiple date columns, it really adds up.
    – OrangeDog
    Commented Feb 10, 2021 at 11:08
  • 1
    Nitpick: Each char in a VARCHAR is at least 1 byte. Actual size depends on encoding.
    – Taemyr
    Commented Feb 11, 2021 at 9:13
3

You can of course do whatever you want. But there are reasons for having the data types for dates and time and using them.

  • sorting according to date (might work anyway, but depends on how you decide on textual representation)
  • calculations: how many days between two dates

Most important to me is that a web application may have simultaneous users who expect different representations. Personally, I like 2021-02-10; some like February 10, 2021. Conversion routines are needed to produce what each user expects. This is messy, but the standard routines available for the date type help. Similarly, I prefer 24-hour time, while another user might prefer AM/PM (basically unheard of in my language zone). This area is sometimes called I18N, an abbreviation that replaces the 18 letters between the I and the N in Internationalization, and it is a well-known drawer filled with issues.
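Once the value is a real date object, producing each user's preferred representation is one call per format. A minimal PHP sketch, with a sample timestamp of my own choosing:

```php
<?php
// Sample value for illustration only.
$moment = new DateTimeImmutable('2021-02-10 15:30:00');

$iso = $moment->format('Y-m-d');       // 2021-02-10
$longForm = $moment->format('F j, Y'); // February 10, 2021
$clock24 = $moment->format('H:i');     // 15:30
$clock12 = $moment->format('g:i A');   // 3:30 PM

echo $iso, "\n", $longForm, "\n", $clock24, "\n", $clock12, "\n";
```

Note that format() only produces English month names; for properly localized output you would look at something like the intl extension's IntlDateFormatter instead.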

0

Take the following string: "12-01-02". What date does that refer to? December 1, 2002? January 2, 2012? Something else? This really depends on what part of the world you're living in.

For that reason, you really need to store dates as dates, and not as strings. As others have mentioned, it also helps with operations performed on the date values, but the #1 reason is that it helps programmers read the code more easily and avoid introducing bugs.
