I've inherited a system backed by an Oracle relational database in which a couple of tables are modeled as I've sketched below: an entire child table exists only to store a single status code, in a one-to-one relationship with its parent.
Here's a generalized sketch of the schema I'm talking about:
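In code form, the layout is roughly as follows. This is a minimal sketch under assumptions: SQLite (via Python's sqlite3) stands in for Oracle so that the snippet runs anywhere, and every column name except COLLECTION_STATUS_CODE is a placeholder, as is the foreign key from ITEM to COLLECTION.

```python
import sqlite3

# SQLite stands in for Oracle. All column names except
# COLLECTION_STATUS_CODE are assumed placeholders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE COLLECTION (
        COLLECTION_ID INTEGER PRIMARY KEY,
        NAME          TEXT
    );
    -- One-to-one child: its primary key is also the foreign key to the parent.
    CREATE TABLE COLLECTION_STATUS (
        COLLECTION_ID          INTEGER PRIMARY KEY
                               REFERENCES COLLECTION (COLLECTION_ID),
        COLLECTION_STATUS_CODE TEXT NOT NULL
    );
    -- Assumed: each ITEM belongs to one COLLECTION.
    CREATE TABLE ITEM (
        ITEM_ID       INTEGER PRIMARY KEY,
        COLLECTION_ID INTEGER NOT NULL
                      REFERENCES COLLECTION (COLLECTION_ID),
        PAYLOAD       TEXT
    );
    -- Same one-to-one pattern for ITEM.
    CREATE TABLE ITEM_STATUS (
        ITEM_ID          INTEGER PRIMARY KEY
                         REFERENCES ITEM (ITEM_ID),
        ITEM_STATUS_CODE TEXT NOT NULL
    );
""")

conn.execute("INSERT INTO COLLECTION VALUES (1, 'first')")
conn.execute("INSERT INTO COLLECTION_STATUS VALUES (1, 'NEW')")

# Reading a collection's status always requires a join to the child table.
rows = conn.execute("""
    SELECT c.COLLECTION_ID, s.COLLECTION_STATUS_CODE
    FROM COLLECTION c
    JOIN COLLECTION_STATUS s ON s.COLLECTION_ID = c.COLLECTION_ID
""").fetchall()
print(rows)
```

Because the child's primary key doubles as its foreign key, each parent row can have at most one status row, which is what makes the relationship one-to-one.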
The ITEM table is quite large, with hundreds of millions of rows. COLLECTION is a few hundred rows, and its COLLECTION_STATUS does not actually appear to be used anywhere in the application (it is always initialized with the same value, which is never changed).
What would be the benefits or motivations for designing a schema like this? If I were designing this schema, I would have just made COLLECTION_STATUS_CODE a column on COLLECTION, and would not have bothered with the COLLECTION_STATUS table. However, there's a pattern here, and it seems to date to the earliest history of this application. Unfortunately, the system is more than a decade old, and all the original developers have long since left the company (in a site closure).
Could it be something to do with performance? The system regularly creates new COLLECTIONs that are mostly full copies of previous COLLECTIONs, and I believe that process is IO bound in the DB (on the SELECT part, not the INSERT part), most of which happens in one DB transaction. My naive gut feel is that creating the need for a join in this way wouldn't be worth it, though.
We need to add a new kind of "status code" to COLLECTION, so I'm faced with the choice of whether to add a new column to COLLECTION for it, or follow the unused COLLECTION_STATUS design and add it there.
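To make the two options concrete, here is a minimal sketch of each. It rests on assumptions: SQLite (via Python's sqlite3) stands in for Oracle, and REVIEW_CODE, NAME, COLLECTION_ID and the sample values are hypothetical names, not the real ones.

```python
import sqlite3


def build_db():
    """Create the parent and its one-to-one status table (placeholder columns)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE COLLECTION (
            COLLECTION_ID INTEGER PRIMARY KEY,
            NAME          TEXT
        );
        CREATE TABLE COLLECTION_STATUS (
            COLLECTION_ID          INTEGER PRIMARY KEY
                                   REFERENCES COLLECTION (COLLECTION_ID),
            COLLECTION_STATUS_CODE TEXT NOT NULL
        );
        INSERT INTO COLLECTION VALUES (1, 'first'), (2, 'second');
        INSERT INTO COLLECTION_STATUS VALUES (1, 'INIT'), (2, 'INIT');
    """)
    return conn


def option_a(conn):
    """Option A: put the new code directly on COLLECTION."""
    conn.execute(
        "ALTER TABLE COLLECTION ADD COLUMN REVIEW_CODE TEXT DEFAULT 'NONE'"
    )
    # Reading the new code needs only the parent table.
    return conn.execute(
        "SELECT COLLECTION_ID, REVIEW_CODE FROM COLLECTION ORDER BY COLLECTION_ID"
    ).fetchall()


def option_b(conn):
    """Option B: follow the existing pattern and extend COLLECTION_STATUS."""
    conn.execute(
        "ALTER TABLE COLLECTION_STATUS ADD COLUMN REVIEW_CODE TEXT DEFAULT 'NONE'"
    )
    # Reading the new code now requires a join from parent to child.
    return conn.execute("""
        SELECT c.COLLECTION_ID, s.REVIEW_CODE
        FROM COLLECTION c
        JOIN COLLECTION_STATUS s ON s.COLLECTION_ID = c.COLLECTION_ID
        ORDER BY c.COLLECTION_ID
    """).fetchall()


rows_a = option_a(build_db())
rows_b = option_b(build_db())
print(rows_a)
print(rows_b)
```

Both options return the same rows; the visible difference is that option B needs a join on every read that also touches COLLECTION, and relies on a COLLECTION_STATUS row existing for every COLLECTION.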
Edit 1: Just to clarify something: as best I can tell, the COLLECTION, COLLECTION_STATUS, ITEM and ITEM_STATUS tables were all created at around the same time very early in the system's history (before it was ever in production) as part of some kind of refactor. This particular case is not a situation of trying to graft new functionality onto an old crusty system, since it was like this almost from the start.
Also, no applications, besides two that my team controls (that were originally one app), connect directly to this database.
Edit 2: Also, none of these tables are very wide. COLLECTION and ITEM have only about 5 non-key columns. COLLECTION_STATUS and ITEM_STATUS only have one or two non-key columns.