4

I'm trying to set up what I thought would be a fairly simple versioning system for static HTML pages on a site. The goal is to keep previous versions of the content, restore them if needed (I guess basically by creating a new version that's a duplicate of an old one), and optionally discard data older than X versions.

The table's setup is fairly straightforward:

  • id
  • reference_id (string/used to determine what page the item pertains to)
  • content (document/html page sized amount of data)
  • e_user (user who changed it last)
  • e_timestamp (when it was changed)

I just want to have something set up that creates a previous version for each edit to the content, and then be able to restore it if needed.

What's the best method for accomplishing this? Should everything be in the same table, or spread across a few different ones?

I read through a few pages on the subject, but a lot of them seemed like overkill for what I'm trying to accomplish (e.g. http://www.jasny.net/articles/versioning-mysql-data/).

Are there any platforms or guides around that will help me in this endeavor?

3 Answers

5

Ideally you would keep everything in the same table and add a condition to your query to get the correct version. Be careful how you do this, though, as an inefficient query will put extra load on your server. If you would normally select a single item like this:

SELECT * FROM your_table WHERE id = 42

This would then become:

SELECT * FROM your_table
WHERE id = 42
AND e_timestamp < '2010-10-12 15:23:24'
ORDER BY e_timestamp DESC
LIMIT 1

Add an index on (id, e_timestamp) so that this query performs efficiently.
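As a rough sketch of the point-in-time lookup, here is a small runnable version that uses Python's built-in sqlite3 module as a stand-in for MySQL. The table name `page_versions`, the surrogate key `row_id`, the index name, and the sample rows are all assumptions made for illustration; `id` identifies the page, as in the queries above.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page_versions (
        row_id      INTEGER PRIMARY KEY,  -- hypothetical surrogate key
        id          INTEGER NOT NULL,     -- identifies the page
        content     TEXT NOT NULL,
        e_user      TEXT NOT NULL,
        e_timestamp TEXT NOT NULL         -- 'YYYY-MM-DD HH:MM:SS' sorts correctly as text
    );
    -- Composite index so the filter and the ORDER BY can both use it.
    CREATE INDEX idx_id_timestamp ON page_versions (id, e_timestamp);
""")

# Made-up sample history for page 42.
conn.executemany(
    "INSERT INTO page_versions (id, content, e_user, e_timestamp) VALUES (?, ?, ?, ?)",
    [
        (42, "<p>first draft</p>", "sue", "2010-10-10 09:00:00"),
        (42, "<p>second draft</p>", "steve", "2010-10-12 12:00:00"),
        (42, "<p>third draft</p>", "sue", "2010-10-13 08:30:00"),
    ],
)


def version_as_of(conn, page_id, moment):
    """Return the newest version of a page saved before `moment`, or None."""
    return conn.execute(
        """
        SELECT content, e_user, e_timestamp
        FROM page_versions
        WHERE id = ? AND e_timestamp < ?
        ORDER BY e_timestamp DESC
        LIMIT 1
        """,
        (page_id, moment),
    ).fetchone()


as_of = version_as_of(conn, 42, "2010-10-12 15:23:24")
print(as_of)
```

Passing a timestamp far in the future returns the current version, so the same function can serve normal page loads.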

Selecting multiple rows in a single query is trickier and requires a groupwise-maximum approach, but it can be done.
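One common way to write a groupwise maximum is to join the table against a subquery that finds the latest timestamp per id. The sketch below again uses sqlite3 as a stand-in for MySQL; the `page_versions` table and its rows are made up for illustration, and it assumes no two versions of the same page share a timestamp.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE page_versions (
        id          INTEGER NOT NULL,  -- identifies the page
        content     TEXT NOT NULL,
        e_timestamp TEXT NOT NULL
    )
    """
)
# Made-up sample data: two pages with two versions each.
conn.executemany(
    "INSERT INTO page_versions (id, content, e_timestamp) VALUES (?, ?, ?)",
    [
        (1, "home v1", "2010-10-01 10:00:00"),
        (1, "home v2", "2010-10-05 10:00:00"),
        (2, "about v1", "2010-10-02 10:00:00"),
        (2, "about v2", "2010-10-20 10:00:00"),
    ],
)


def versions_as_of(conn, moment):
    """Return, for every page, the newest version saved before `moment`."""
    return conn.execute(
        """
        SELECT v.id, v.content, v.e_timestamp
        FROM page_versions AS v
        JOIN (
            -- Latest qualifying timestamp per page.
            SELECT id, MAX(e_timestamp) AS latest
            FROM page_versions
            WHERE e_timestamp < ?
            GROUP BY id
        ) AS newest
          ON newest.id = v.id AND newest.latest = v.e_timestamp
        ORDER BY v.id
        """,
        (moment,),
    ).fetchall()


snapshot = versions_as_of(conn, "2010-10-12 15:23:24")
print(snapshot)
```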

2
  • So you're saying keep all the versions in one table, and use a version & timestamp column to track them? This probably sounds sophomoric, but how would you insert a new version? Grab the latest row, then add one to the version and insert a new row?
    – Jane Panda
    Commented Nov 8, 2010 at 20:27
  • The ORDER BY will put them in reverse order and the LIMIT 1 will grab only the first one, so this will grab the most recent version. The whole date filtering thing isn't really necessary at all. Commented Feb 16, 2012 at 15:15
4

You can use a technique called "auditing". You would set up audit tables, then either write the logic into your code or set up triggers on the DB side so that every time a change is made, an entry is added to the appropriate audit table. Then you can go back through the audit table and see things like "Oh, yesterday Sue went in and fixed a typo" or "Uh oh, Steve wiped out an entire paragraph by accident earlier today while trying to rewrite this section".

Your primary table that stores the data doesn't keep all that history, so it can stay slim. If you ever need to look at that data and, say, roll something back, you can go look in your audit table and do that. You can set up the audit table however you want, so each audit row can hold the entire content BEFORE the edit, and not just what was changed. That should make "rolling back" fairly easy.
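A minimal sketch of the trigger-based variant, using sqlite3 as a stand-in for MySQL (MySQL's trigger syntax differs slightly). The table names `pages` and `pages_audit`, the trigger name, and the helpers `save` and `restore` are hypothetical. The trigger copies the full pre-edit row into the audit table, and a restore is simply another save, so the content being replaced is audited too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages (
        reference_id TEXT PRIMARY KEY,
        content      TEXT NOT NULL,
        e_user       TEXT NOT NULL,
        e_timestamp  TEXT NOT NULL
    );
    CREATE TABLE pages_audit (
        audit_id     INTEGER PRIMARY KEY,
        reference_id TEXT NOT NULL,
        content      TEXT NOT NULL,
        e_user       TEXT NOT NULL,
        e_timestamp  TEXT NOT NULL
    );
    -- Copy the row as it was BEFORE the edit into the audit table.
    CREATE TRIGGER pages_before_update
    BEFORE UPDATE OF content ON pages
    FOR EACH ROW
    BEGIN
        INSERT INTO pages_audit (reference_id, content, e_user, e_timestamp)
        VALUES (OLD.reference_id, OLD.content, OLD.e_user, OLD.e_timestamp);
    END;
""")


def save(conn, reference_id, content, user, timestamp):
    """Overwrite the live page; the trigger archives the previous content."""
    conn.execute(
        "UPDATE pages SET content = ?, e_user = ?, e_timestamp = ? WHERE reference_id = ?",
        (content, user, timestamp, reference_id),
    )


def restore(conn, audit_id, user, timestamp):
    """Roll back by saving an archived version as the new live content."""
    row = conn.execute(
        "SELECT reference_id, content FROM pages_audit WHERE audit_id = ?",
        (audit_id,),
    ).fetchone()
    if row is None:
        raise LookupError(f"no audit row {audit_id}")
    save(conn, row[0], row[1], user, timestamp)


# Made-up example: Steve breaks the page, Sue rolls it back.
conn.execute(
    "INSERT INTO pages VALUES ('home', '<p>original</p>', 'sue', '2010-11-08 09:00:00')"
)
save(conn, "home", "<p>oops</p>", "steve", "2010-11-08 10:00:00")
restore(conn, 1, "sue", "2010-11-08 11:00:00")

live_content = conn.execute(
    "SELECT content FROM pages WHERE reference_id = 'home'"
).fetchone()[0]
audit_contents = [
    r[0] for r in conn.execute("SELECT content FROM pages_audit ORDER BY audit_id")
]
print(live_content, audit_contents)
```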

1

Add a version column and a delete column (bool), and create some functions that compare the versions of rows with the same id. You'll definitely want to be able to easily find the current version and the previous version. To get rid of old data, you'll want to write another function that sorts all of the versions of an id, figures out which are old enough to be deleted, and marks them for deletion by another function. You'll probably want to have an option to make certain pages immune to deletion or to postpone it.
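One way this could look, sketched with sqlite3 as a stand-in for MySQL. The table layout, the column name `deleted`, and the helpers `add_version`, `mark_old_versions`, and `purge_marked` are all hypothetical, and "old enough" is interpreted here as "not among the newest `keep` versions". In a real multi-user setup, computing the next version number with MAX + 1 would need a lock or a retry on the primary-key conflict.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE page_versions (
        reference_id TEXT NOT NULL,
        version      INTEGER NOT NULL,
        content      TEXT NOT NULL,
        deleted      INTEGER NOT NULL DEFAULT 0,  -- 1 = marked for deletion
        PRIMARY KEY (reference_id, version)
    )
    """
)


def add_version(conn, reference_id, content):
    """Insert content as the next version of a page and return its number."""
    with conn:  # commit on success, roll back on error
        (next_version,) = conn.execute(
            "SELECT COALESCE(MAX(version), 0) + 1 FROM page_versions WHERE reference_id = ?",
            (reference_id,),
        ).fetchone()
        conn.execute(
            "INSERT INTO page_versions (reference_id, version, content) VALUES (?, ?, ?)",
            (reference_id, next_version, content),
        )
    return next_version


def mark_old_versions(conn, reference_id, keep):
    """Flag every version older than the newest `keep` ones; return how many."""
    with conn:
        cursor = conn.execute(
            """
            UPDATE page_versions
            SET deleted = 1
            WHERE reference_id = ?
              AND deleted = 0
              AND version <= (
                  SELECT MAX(version) FROM page_versions WHERE reference_id = ?
              ) - ?
            """,
            (reference_id, reference_id, keep),
        )
    return cursor.rowcount


def purge_marked(conn):
    """Physically remove rows that were marked for deletion; return how many."""
    with conn:
        return conn.execute("DELETE FROM page_versions WHERE deleted = 1").rowcount


# Made-up example: five edits to one page, keep only the newest three.
for n in range(1, 6):
    add_version(conn, "home", f"<p>draft {n}</p>")
marked = mark_old_versions(conn, "home", keep=3)
purged = purge_marked(conn)
remaining = [
    r[0]
    for r in conn.execute(
        "SELECT version FROM page_versions WHERE reference_id = 'home' ORDER BY version"
    )
]
print(marked, purged, remaining)
```

Splitting marking from purging leaves room for the "immune to deletion" option: a page could be skipped in `mark_old_versions` without touching the purge step.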
