Pseudo Random Repeatable Sort in SQL Server (not NEWID() and not RAND())

Question

I would like to randomly sort a result in a repeatable fashion for purposes such as paging. For this NEWID() is too random in that the same results cannot be re-obtained. Order by Rand(seed) would be ideal as with the same seed the same random collection would result. Unfortunately, the Rand() state resets with every row, does anyone have a solution?

declare @seed as int;
set @seed = 1000;

create table temp (
id int,
date datetime)

insert into temp (id, date) values (1,'20090119')
insert into temp (id, date) values (2,'20090118')
insert into temp (id, date) values (3,'20090117')
insert into temp (id, date) values (4,'20090116')
insert into temp (id, date) values (5,'20090115')
insert into temp (id, date) values (6,'20090114')

-- re-seeds for every item
select *, RAND(), RAND(id+@seed) as r from temp order by r
--1 2009-01-19 00:00:00.000 0.277720118060575   0.732224964471124
--2 2009-01-18 00:00:00.000 0.277720118060575   0.732243597442382
--3 2009-01-17 00:00:00.000 0.277720118060575   0.73226223041364
--4 2009-01-16 00:00:00.000 0.277720118060575   0.732280863384898
--5 2009-01-15 00:00:00.000 0.277720118060575   0.732299496356156
--6 2009-01-14 00:00:00.000 0.277720118060575   0.732318129327415
-- Note how the last column is +=~0.00002

drop table temp

-- interestingly this works:
select RAND(@seed), RAND()
--0.732206331499865 0.306382810665955

Note, I tried Rand(ID) but that just turns out to be sorted. Apparently Rand(n) < Rand(n+1)

ccook · Accepted Answer · 2009-01-24 02:16:27Z

Building off of gkrogers hash suggestion this works great. Any thoughts on performance?

declare @seed as int;
set @seed = 10;

create table temp (
id int,
date datetime)

insert into temp (id, date) values (1,'20090119')
insert into temp (id, date) values (2,'20090118')
insert into temp (id, date) values (3,'20090117')
insert into temp (id, date) values (4,'20090116')
insert into temp (id, date) values (5,'20090115')
insert into temp (id, date) values (6,'20090114')

-- re-seeds for every item
select *, HASHBYTES('md5',cast(id+@seed as varchar)) r
from temp order by r
--1 2009-01-19 00:00:00.000 0x6512BD43D9CAA6E02C990B0A82652DCA
--5 2009-01-15 00:00:00.000 0x9BF31C7FF062936A96D3C8BD1F8F2FF3
--4 2009-01-16 00:00:00.000 0xAAB3238922BCC25A6F606EB525FFDC56
--2 2009-01-18 00:00:00.000 0xC20AD4D76FE97759AA27A0C99BFF6710
--3 2009-01-17 00:00:00.000 0xC51CE410C124A10E0DB5E4B97FC2AF39
--6 2009-01-14 00:00:00.000 0xC74D97B01EAE257E44AA9D5BADE97BAF

drop table temp

EDIT: Note, the declaration of @seed as it's use in the query could be replace with a parameter or with a constant int if dynamic SQL is used. (declaration of @int in a TSQL fashion is not necessary)

May I ask if you tried my method? It does not require any extra variables or a stored proc. — JosephStyons, Commented Jan 23, 2009 at 20:51
I just tried, unfortunately it didnt provide the reliability. The declarations above are just to provide a complete environment in order to replicate the problem. The temp table is to be representative. — ccook, Commented Jan 24, 2009 at 2:14

Jack Ryan · Accepted Answer · 2009-01-19 16:28:55Z

1

You can use a value from each row to re-evaluate the rand function:

Select *, Rand(@seed + id) as r from temp order by r

adding the ID ensures that the rand is reseeded for each row. But for a value of seed you will always get back the same sequence of rows (provided that the table does not change)

answered Jan 19, 2009 at 16:28

Jack Ryan

8,4344 gold badges40 silver badges76 bronze badges

Thank you JayArr. I did try this, but unfortunately it ends up being ascended. Apparently the first random value is fairly predictable. I will update the question with that note as well.
– ccook
Commented Jan 19, 2009 at 16:31

Add a comment |

MatBailie · Accepted Answer · 2009-01-19 18:45:54Z

1

Creating a hash can be much more time consuming than creating a seeded random number.

To get more variation in the ourput of RAND([seed]) you need to make the [seed] vary significantly too. Possibly such as...

SELECT
    *,
    RAND(id * 9999)    AS [r]
FROM
   temp
ORDER BY
   r

Using a constant ensures the replicability you asked for. But be careful of the result of (id * 9999) causing an overflow if you expect your table to get big enough...

answered Jan 19, 2009 at 18:45

MatBailie

85.8k19 gold badges108 silver badges141 bronze badges

1

I think this runs the same problem as the above, where the values are incremental.
– ccook
Commented Jan 19, 2009 at 20:34
SELECT RAND(9999 * 1), RAND(9999 * 2), RAND(9999 * 3), RAND(9999 * 4), RAND(9999 * 5) 0.899884439852407 0.0861955322535983 0.27250661186434 0.458817691475082 0.645128771085824 Isn't incremental, but isn't truely random either...
– MatBailie
Commented Jan 20, 2009 at 9:36
Which database are you running this against? SQL2008?
– ccook
Commented Jan 21, 2009 at 13:10

Add a comment |

dummy · Accepted Answer · 2009-02-02 18:05:59Z

1

SELECT *, checksum(id) AS r FROM table ORDER BY r

This kind of works. Although the output from checksum() does not look all that random to me. The MSDN Documentation states:

[...], we do not recommend using CHECKSUM to detect whether values have changed, unless your application can tolerate occasionally missing a change. Consider using HashBytes instead. When an MD5 hash algorithm is specified, the probability of HashBytes returning the same result for two different inputs is much lower than that of CHECKSUM.

But may be it faster.

edited Feb 2, 2009 at 18:05

answered Jan 30, 2009 at 15:36

dummy

4,2743 gold badges26 silver badges36 bronze badges

Clarified my answer a bit. But it is the solution you already came up with anyway. Kind of.
– dummy
Commented Feb 2, 2009 at 18:07

Add a comment |

Jack Ryan · Accepted Answer · 2009-01-19 16:45:39Z

0

After doing some reading this is an accepted method.

Select Rand(@seed) -- now rand is seeded

Select *, 0 * id + Rand() as r from temp order by r

Having id in the expression causes it to be reevaluated every row. But multiplying it by 0 ensures that it doesnt not affect the outcome of rand.

What a horrible way of doing things!

answered Jan 19, 2009 at 16:45

Jack Ryan

8,4344 gold badges40 silver badges76 bronze badges

Did this work for you? I get a constant r column. It also generates an extra result. Maybe it has to be within a sp? insane though, wow.
– ccook
Commented Jan 19, 2009 at 16:58
In my test (on SQL Server 2008 R2), RAND() is only ever evaluated once. Under what conditions does it ever evaluate to something different in each row?
– binki
Commented Feb 3, 2015 at 19:44

Add a comment |

Gabe · Accepted Answer · 2010-10-19 01:09:14Z

0

create table temp (
id int,
date datetime)

insert into temp (id, date) values (1,'20090119')
insert into temp (id, date) values (2,'20090118')
insert into temp (id, date) values (3,'20090117')
insert into temp (id, date) values (4,'20090116')
insert into temp (id, date) values (5,'20090115')
insert into temp (id, date) values (6,'20090114')

-- re-seeds for every item
select *, NEWID() r
from temp order by r

drop table temp

edited Oct 19, 2010 at 1:09

Gabe

86.1k12 gold badges141 silver badges238 bronze badges

answered Sep 24, 2009 at 23:17

Qıyu Chu

Add a comment |

JosephStyons · Accepted Answer · 2009-01-23 20:28:59Z

-1

This has worked well for me in the past, and it can be applied to any table (just bolt on the ORDER BY clause):

SELECT *
FROM MY_TABLE
ORDER BY  
  (SELECT ABS(CAST(NEWID() AS BINARY(6)) % 1000) + 1);

answered Jan 23, 2009 at 20:28

JosephStyons

58.1k63 gold badges165 silver badges237 bronze badges

2

It sorts randomly, however, it does not give repeatable results. Same behavior as order by newid()?
– ccook
Commented Jan 24, 2009 at 2:13

Add a comment |

Collectives™ on Stack Overflow

Pseudo Random Repeatable Sort in SQL Server (not NEWID() and not RAND())

7 Answers 7

Not the answer you're looking for? Browse other questions tagged
sql
sql-server
t-sql
random
paging
or ask your own question.

Linked

Hot Network Questions

Collectives™ on Stack Overflow

7 Answers 7

Not the answer you're looking for? Browse other questions tagged sqlsql-servert-sqlrandompaging or ask your own question.

Linked

Related

Not the answer you're looking for? Browse other questions tagged
sql
sql-server
t-sql
random
paging
or ask your own question.