308

I am using MSSQL Server 2005. In my db, I have a table "customerNames" which has two columns "Id" and "Name" and approx. 1,000 results.

I am creating a functionality where I have to pick 5 customers randomly every time. Can anyone tell me how to create a query which will get random 5 rows (Id, and Name) every time when query is executed?

2

14 Answers 14

832
SELECT TOP 5 Id, Name FROM customerNames
ORDER BY NEWID()

That said, everybody seems to come to this page for the more general answer to your question:

Selecting a random row in SQL

Select a random row with MySQL:

SELECT column FROM table
ORDER BY RAND()
LIMIT 1

Select a random row with PostgreSQL:

SELECT column FROM table
ORDER BY RANDOM()
LIMIT 1

Select a random row with Microsoft SQL Server:

SELECT TOP 1 column FROM table
ORDER BY NEWID()

Select a random row with IBM DB2

SELECT column, RAND() as IDX 
FROM table 
ORDER BY IDX FETCH FIRST 1 ROWS ONLY

Select a random record with Oracle:

SELECT column FROM
( SELECT column FROM table
ORDER BY dbms_random.value )
WHERE rownum = 1

Select a random row with sqlite:

SELECT column FROM table 
ORDER BY RANDOM() LIMIT 1
6
  • 34
    Does this become very expensive on large tables, where each row gets a random number, and then a large unindexed random number set is sorted?
    – Andrey
    Commented Apr 19, 2014 at 16:04
  • 1
    This is perhaps obvious to most people, but it wasn't obvious to me... the following query will not get a new random value for each row: update tbl_vouchers set tbl_UsersID = (select top(1) id from tbl_Users order by NEWID()) - edit: I can't get formatting to work in comments :(
    – Mir
    Commented Dec 10, 2015 at 18:35
  • Why does this fail on Google Cloud SQL? We only get partially random results. Nearly 80% of the time we get the same row back.
    – Praxiteles
    Commented Nov 25, 2016 at 22:38
  • 8
    Warning: For big databases this method will have a bad performance. Can you imagine the time it will take to generate a random value for each row if the database have a million of entry? You can have more information about and a better alternativ here. Commented Dec 15, 2016 at 9:33
  • Thanks for the solution. Just wanted to know if we can assign some kind of a declare variable to the value after the 'limit' keyword. I am trying to find solutions in bigquery but haven't had much luck yet.
    – Ajay Kumar
    Commented Jan 20, 2020 at 9:52
39
SELECT TOP 5 Id, Name FROM customerNames ORDER BY NEWID()
0
16

In case someone wants a PostgreSQL solution:

select id, name
from customer
order by random()
limit 5;
0
13

I have found this to work best for big data.

SELECT TOP 1 Column_Name FROM dbo.Table TABLESAMPLE(1 PERCENT);

TABLESAMPLE(n ROWS) or TABLESAMPLE(n PERCENT) is random but need to add the TOP n to get the correct sample size.

Using NEWID() is very slow on large tables.

1
11

Maybe this site will be of assistance.

For those who don't want to click through:

SELECT TOP 1 column FROM table
ORDER BY NEWID()
1
  • 4
    should have at least replaced 1 with 5 :)
    – roman m
    Commented Feb 24, 2009 at 6:40
8

There is a nice Microsoft SQL Server 2005 specific solution here. Deals with the problem where you are working with a large result set (not the question I know).

Selecting Rows Randomly from a Large Table http://msdn.microsoft.com/en-us/library/cc441928.aspx

7

This is an old question, but attempting to apply a new field (either NEWID() or ORDER BY rand()) to a table with a large number of rows would be prohibitively expensive. If you have incremental, unique IDs (and do not have any holes) it will be more efficient to calculate the X # of IDs to be selected instead of applying a GUID or similar to every single row and then taking the top X # of.

DECLARE @minValue int;
DECLARE @maxValue int;
SELECT @minValue = min(id), @maxValue = max(id) from [TABLE];

DECLARE @randomId1 int, @randomId2 int, @randomId3 int, @randomId4 int, @randomId5 int
SET @randomId1 = ((@maxValue + 1) - @minValue) * Rand() + @minValue
SET @randomId2 = ((@maxValue + 1) - @minValue) * Rand() + @minValue
SET @randomId3 = ((@maxValue + 1) - @minValue) * Rand() + @minValue
SET @randomId4 = ((@maxValue + 1) - @minValue) * Rand() + @minValue
SET @randomId5 = ((@maxValue + 1) - @minValue) * Rand() + @minValue

--select @maxValue as MaxValue, @minValue as MinValue
--  , @randomId1 as SelectedId1
--  , @randomId2 as SelectedId2
--  , @randomId3 as SelectedId3
--  , @randomId4 as SelectedId4
--  , @randomId5 as SelectedId5

select * from [TABLE] el
where el.id in (@randomId1, @randomId2, @randomId3, @randomId4, @randomId5)

If you wanted to select many more rows I would look into populating a #tempTable with an ID and a bunch of rand() values then using each rand() value to scale to the min-max values. That way you do not have to define all of the @randomId1...n parameters. I've included an example below using a CTE to populate the initial table.

DECLARE @NumItems int = 100;

DECLARE @minValue int;
DECLARE @maxValue int;
SELECT @minValue = min(id), @maxValue = max(id) from [TABLE];
DECLARE @range int = @maxValue+1 - @minValue;

with cte (n) as (
   select 1 union all
   select n+1 from cte
   where n < @NumItems
)
select cast( @range * rand(cast(newid() as varbinary(100))) + @minValue as int) tp
into #Nt
from cte;

select * from #Nt ntt
inner join [TABLE] i on i.id = ntt.tp;

drop table #Nt;
4
  • @Protiguous, the edit you proposed broke the random selection. Using min() and max() applied to the dbo.Tally64k table would not allow the user to select a row with a pk id > 65556.
    – RIanGillis
    Commented Sep 23, 2019 at 13:41
  • The table name change was simply an artifact from testing. The actual table name doesn't matter, as long as the correct table is used. min() and max() can both be queried in one query rather than two, which is what I was trying to show.
    – Protiguous
    Commented Jan 14, 2020 at 21:48
  • @Protiguous Ah, I see that now, I was confused because you used the 0-65k when doing the min-max but not later. After your most recent edit I actually wanted to ask you about the performance implications of the changes you made, as performance tuning is one of my interests and seemingly meaningless decisions like which side of the equals sign you place something can actually have a significant impact --- Would the same thing apply to the 5 SET @randomId## calls? Or is that different because it is not SELECTing FROM an actual table?
    – RIanGillis
    Commented Jan 15, 2020 at 4:49
  • I'm not sure I understand your question. Are you asking why there are 5 SET instead of just 1 SELECT @id1=rand(), @id2=rand().. ? It's because multiple calls to a rand() in 1 statement will produce the same result, hence the separated SET. (rand() on SQL Server is a deterministic function, I believe.) I would guess that 1 select vs 5 set is in the nanosecond range performance-wise.
    – Protiguous
    Commented Mar 6, 2020 at 23:43
6

If you have a table with millions of rows and care about the performance, this could be a better answer:

SELECT * FROM Table1
WHERE (ABS(CAST(
  (BINARY_CHECKSUM
  (keycol1, NEWID())) as int))
  % 100) < 10

https://msdn.microsoft.com/en-us/library/cc441928.aspx

1
  • Note that this will select approximately 10% of the rows in the table. If you need to select an exact number of rows, or at least N rows, this approach won't work.
    – LarsH
    Commented Jun 3, 2019 at 14:49
6
SELECT * FROM TABLENAME ORDER BY random() LIMIT 5; 
0
3

In order to shuffle the SQL result set, you need to use a database-specific function call.

Note that sorting a large result set using a RANDOM function might turn out to be very slow, so make sure you do that on small result sets.

If you have to shuffle a large result set and limit it afterward, then it's better to use something like the Oracle SAMPLE(N) or the TABLESAMPLE in SQL Server or PostgreSQL instead of a random function in the ORDER BY clause.

So, assuming we have the following database table:

Song database table

And the following rows in the song table:

| id | artist                          | title                              |
|----|---------------------------------|------------------------------------|
| 1  | Miyagi & Эндшпиль ft. Рем Дигга | I Got Love                         |
| 2  | HAIM                            | Don't Save Me (Cyril Hahn Remix)   |
| 3  | 2Pac ft. DMX                    | Rise Of A Champion (GalilHD Remix) |
| 4  | Ed Sheeran & Passenger          | No Diggity (Kygo Remix)            |
| 5  | JP Cooper ft. Mali-Koa          | All This Love                      |

Oracle

On Oracle, you need to use the DBMS_RANDOM.VALUE function, as illustrated by the following example:

SELECT
    artist||' - '||title AS song
FROM song
ORDER BY DBMS_RANDOM.VALUE

When running the aforementioned SQL query on Oracle, we are going to get the following result set:

| song                                              |
|---------------------------------------------------|
| JP Cooper ft. Mali-Koa - All This Love            |
| 2Pac ft. DMX - Rise Of A Champion (GalilHD Remix) |
| HAIM - Don't Save Me (Cyril Hahn Remix)           |
| Ed Sheeran & Passenger - No Diggity (Kygo Remix)  |
| Miyagi & Эндшпиль ft. Рем Дигга - I Got Love      |

Notice that the songs are being listed in random order, thanks to the DBMS_RANDOM.VALUE function call used by the ORDER BY clause.

SQL Server

On SQL Server, you need to use the NEWID function, as illustrated by the following example:

SELECT
    CONCAT(CONCAT(artist, ' - '), title) AS song
FROM song
ORDER BY NEWID()

When running the aforementioned SQL query on SQL Server, we are going to get the following result set:

| song                                              |
|---------------------------------------------------|
| Miyagi & Эндшпиль ft. Рем Дигга - I Got Love      |
| JP Cooper ft. Mali-Koa - All This Love            |
| HAIM - Don't Save Me (Cyril Hahn Remix)           |
| Ed Sheeran & Passenger - No Diggity (Kygo Remix)  |
| 2Pac ft. DMX - Rise Of A Champion (GalilHD Remix) |

Notice that the songs are being listed in random order, thanks to the NEWID function call used by the ORDER BY clause.

PostgreSQL

On PostgreSQL, you need to use the random function, as illustrated by the following example:

SELECT
    artist||' - '||title AS song
FROM song
ORDER BY random()

When running the aforementioned SQL query on PostgreSQL, we are going to get the following result set:

| song                                              |
|---------------------------------------------------|
| 2Pac ft. DMX - Rise Of A Champion (GalilHD Remix) |
| JP Cooper ft. Mali-Koa - All This Love            |
| Ed Sheeran & Passenger - No Diggity (Kygo Remix)  |
| HAIM - Don't Save Me (Cyril Hahn Remix)           |
| Miyagi & Эндшпиль ft. Рем Дигга - I Got Love      |

Notice that the songs are being listed in random order, thanks to the random function call used by the ORDER BY clause.

MySQL

On MySQL, you need to use the RAND function, as illustrated by the following example:

SELECT
  CONCAT(CONCAT(artist, ' - '), title) AS song
FROM song
ORDER BY RAND()

When running the aforementioned SQL query on MySQL, we are going to get the following result set:

| song                                              |
|---------------------------------------------------|
| HAIM - Don't Save Me (Cyril Hahn Remix)           |
| Ed Sheeran & Passenger - No Diggity (Kygo Remix)  |
| Miyagi & Эндшпиль ft. Рем Дигга - I Got Love      |
| 2Pac ft. DMX - Rise Of A Champion (GalilHD Remix) |
| JP Cooper ft. Mali-Koa - All This Love            |

Notice that the songs are being listed in random order, thanks to the RAND function call used by the ORDER BY clause.

1

If you are using large table and want to access of 10 percent of the data then run this following command: SELECT TOP 10 PERCENT * FROM Table1 ORDER BY NEWID();

0

If you use Yandex Database then you should use

select column from table order by random (TableRow()) limit 1;
0

If you don't want to use NEWID() and the primary key column is int, then you can just select a random primary key like this:

with a as 
(
select count(id) as row_count
from mytable
)

select *
from mytable , a
where id = round(rand() * row_count, 0)
0

If you need just to shuffle sequential values then you don't need always to use random (as it's non sql standard), you can try to use some tricks, like using reverse(PK)

SELECT PK FROM products ORDER BY REVERSE(concat('', PK))

So let's say we have a values: 123 124 125 223 224 225 323 324 325

then after reverse we will see them in the following order:

321 - 1
322 - 4
323 - 7
421 - 2
422 - 5
423 - 8
521 - 3
522 - 6
523 - 9

Not the answer you're looking for? Browse other questions tagged or ask your own question.