3

I have the following (and more non-related) columns in my MSSQL database table: Id, ImageUrl and ImageId. The combination of ImageUrl and ImageId can occur multiple times. ImageUrl belongs to ImageId and vice versa.

The following examples use dummy values for clarity, for the real database table ImageId: int and ImageUrl: nvarchar are used as data types.

ImageId = a that belongs to ImageUrl = a.com can occur multiple times, but ImageId = a cannot be used for other ImageUrl's.

ImageUrl = a.com can occur multiple times with the same ImageId, but ImageUrl = a.com cannot occur with multiple ImageId's such as a and b.

The following data must be possible:

Id ImageId ImageUrl
1 a a.com
2 a a.com
3 b b.com

The following two examples are not allowed:

Id ImageId ImageUrl
1 a a.com
2 b a.com
Id ImageId ImageUrl
1 a a.com
2 a b.com

I don't know, and haven't found MSSQL solutions for this.

Question: Is there a solution to enforce this in MSSQL? And is there a word/term for this constraint/enforcement?

Edit: This is for my existing database with a lot of data, where this table is the "primary" table. I would rather not change the structure of this table by changing or removing columns. Adding columns is fine. That is why the answer of @LeppyR64 would not be a solution for my use-case.

0

3 Answers 3

7

Using a relational database model would have avoided this issue. The pain of converting the data and adapting the application is the price to pay for the incorrect initial design.

That said, you can enforce the constraint needed on the existing arrangement using a materialised view:

CREATE TABLE dbo.T 
(
    Id integer NOT NULL,
    ImageId char(1) NOT NULL, 
    ImageUrl varchar(900) NOT NULL
);
GO
CREATE OR ALTER VIEW dbo.ValidComboT
WITH SCHEMABINDING AS
SELECT 
    ImageId, 
    ImageUrl,
    NumRows = COUNT_BIG(*)
FROM dbo.T
GROUP BY 
    ImageId,
    ImageUrl;
GO
CREATE UNIQUE CLUSTERED INDEX
    [CUQ dbo.ValidComboT ImageId, ImageUrl]
ON dbo.ValidComboT 
    (ImageId, ImageUrl);
GO
CREATE UNIQUE NONCLUSTERED INDEX
    [UQ dbo.ValidComboT ImageId]
ON dbo.ValidComboT 
    (ImageId);
GO
CREATE UNIQUE NONCLUSTERED INDEX
    [UQ dbo.ValidComboT ImageUrl]
ON dbo.ValidComboT 
    (ImageUrl);
GO

Test:

-- Succeeds
INSERT dbo.T (Id, ImageId, ImageUrl) 
VALUES (1, 'a', 'a.com');

INSERT dbo.T (Id, ImageId, ImageUrl) 
VALUES (2, 'a', 'a.com');

INSERT dbo.T (Id, ImageId, ImageUrl) 
VALUES (3, 'b', 'b.com');
GO
-- Reset
DELETE dbo.T;

-- Fails
INSERT dbo.T (Id, ImageId, ImageUrl) 
VALUES (1, 'a', 'a.com');

INSERT dbo.T (Id, ImageId, ImageUrl) 
VALUES (2, 'b', 'a.com');
GO
-- Reset
DELETE dbo.T;

-- Fails
INSERT dbo.T (Id, ImageId, ImageUrl) 
VALUES (1, 'a', 'a.com');

INSERT dbo.T (Id, ImageId, ImageUrl) 
VALUES (2, 'a', 'b.com');

Output:

(3 rows affected)

Msg 2601, Level 14, State 1
Cannot insert duplicate key row in object 'dbo.ValidComboT' 
  with unique index 'UQ dbo.ValidComboT ImageUrl'. 
  The duplicate key value is (a.com).
The statement has been terminated.

Msg 2601, Level 14, State 1
Cannot insert duplicate key row in object 'dbo.ValidComboT' 
  with unique index 'UQ dbo.ValidComboT ImageId'. 
  The duplicate key value is (a).
The statement has been terminated.

The limit for clustered index key length is 900 bytes, so this will only work if the combination of ImageId and ImageUrl fits within that.

This solution adds a small cost to every data modification that affects the view. There is also a storage requirement for the materialised rows. On the other hand, you may find the information in the view helpful for other queries. The database engine may also be able to make use of it too.

db<>fiddle demo

Considerations

The database engine takes care of all the details needed to keep the materialised view synchronised with the base table.

If you code your own solution (e.g. using a trigger) you take responsibility for covering all the edge cases that can occur under high concurrency. This is not always trivial. Two sessions might check for a conflicting row at the same time, conclude that it does not exist, then both insert on that basis resulting in an undetected duplicate.

On the other hand, the extra safety built into view maintenance comes at a cost to potential concurrency. If many rows in the base table are summarised in a single view row, additional blocking can occur due to this 'lock concentration'. Deadlocks are also a possibility, though correct indexing can mitigate this.

Maintaining an indexed view is typically slightly faster than a trigger because the plan operators necessary to maintain the view are inserted into the plan for the data-changing statement. A trigger requires row versioning, and its logic is executed in a separate SQL context. The cost is similar to executing additional SQL commands.

The materialised view validates all data in the base table when it is created. Triggers only apply to rows added or modified after the triggers are created, and while they remain enabled. Bulk inserts to the table will skip firing triggers by default.

These are all generalisations, and not a complete list of the considerations. You need to test the various approaches in your environment. Or transition to a good design where none of this messing around is required.

0
6

As someone else suggested, the best thing is to fix the model. It is often possible to use view + instead of triggers for backward compatibility.

If not possible and your purpose is to prevent adding malformed data, you could use a trigger:

create table T
( someid int not null primary key
, imageid char(1) not null
, imageurl varchar(100) not null
);  

CREATE TRIGGER trg ON T  
FOR INSERT
AS
IF  EXISTS(SELECT NULL FROM T x 
           JOIN inserted y 
               ON x.imageid = y.imageid 
               AND x.imageurl <> y.imageurl)
BEGIN
        RAISERROR('Image exists with conflicting url', 16, 1)
        ROLLBACK TRAN
        RETURN
END;

You probably should add one for the update as well

insert into T (someid, imageid, imageurl) 
values (1, 'a', 'a.com'), (2, 'a', 'a.com'), (3, 'b', 'b.com');

insert into T (someid, imageid, imageurl)
values (4, 'b', 'a.com');

Msg 50000 Level 16 State 1 Line 10
Image exists with conflicting url
Msg 3609 Level 16 State 1 Line 1
The transaction ended in the trigger. The batch has been aborted.

Fiddle

4

Create an Images table. ImageId is the Primary Key. Create an additional Unique index on ImageUrl column.

ImageId ImageUrl
a a.com
b b.com

The Data table can then have Id, and ImageId as a Foreign Key to the Images table.

Id ImageId
1 a
2 a
3 b
2
  • This would be a great solution for a new database. Unfortunately there is al lot of existing data in my MSSQL database. Furthermore this change would impact applications that use this database structure. So I would rather not change the database structure in such a way. I was thinking about a Index, Constraint, etc, that would not require changes to data/applications. For context, this is a new requirement to an existing application/database.
    – Max
    Commented May 28 at 8:10
  • 1
    That's understandable. Paul White has already posted the natural next option to that option. It still likely ends up with issues at the application level because of rejecting data that was previously accepted and might not be properly handled at the application level. Good luck!
    – LeppyR64
    Commented May 28 at 19:56

Not the answer you're looking for? Browse other questions tagged or ask your own question.