Generate a Random Number for Each Row in a Query {a better way}

A couple of years ago I wrote a post on how to generate a random number in SQL Server.

https://sqlundercover.com/2017/06/22/generating-random-numbers-for-each-row-over-a-specified-range-and-other-funky-stuff-you-can-do-with-them/

It’s ok and it works, it’s also a method that I used for quite a while.

Why not use RAND() I’ve heard people say, well the issue with RAND() is that it will generate the same number of every row, lets look at the result from this rather arbitrary query.

SELECT database_id, RAND() AS RandomNumber
FROM sys.databases

You see the problem? Well in the post that I mentioned, I adressed this issue and offered a solution using CHECKSUM() and NEWID().

That’s ok but can be a bit cumbersome. I recently stumbled across a function that I hadn’t realised was there before (and of course this is where you all tell me, we’ve known about that all the time), CRYPT_GEN_RANDOM().

That function will create a random (and supposedly cryptographically strong) binary. You do need to pass in how long you want that binary to be in bytes. So lets see how that looks if we replace RAND() in our above example with CRYPT_GEN_RANDOM(1).

SELECT database_id, CRYPT_GEN_RANDOM(1) AS RandomNumber
FROM sys.databases
Interesting, all the randomly generated binary values are different. But you want a number? Simples, just CAST that binary to an INT and Bob’s your Uncle!
SELECT database_id, CAST(CRYPT_GEN_RANDOM(1) AS INT) AS RandomNumber
FROM sys.databases

Obviously you can change the length of the binary that you create if you want to increase the maximum number that you generate.

This is my new go to random number generator, hopefully you’ll find it a good way to create a random number of each row in your result set. The only question that I have is why didn’t I know about this function sooner?!

6 thoughts on “Generate a Random Number for Each Row in a Query {a better way}

Add yours

  1. Thanks for the article. I think that anyone that takes the time to share information is aces in my book.

    Shifting gears to the topic at hand….

    Here’s some code to generate random numbers using the “new” way and the “old” way.

    SET STATISTICS TIME OFF;
    PRINT REPLICATE(‘@’,50);
    PRINT ‘***** CRYPT_GEN_RANDOM *****’
    SET STATISTICS TIME ON;
    GO
    DECLARE @BitBucket INT;
    SELECT TOP 1000000
    @BitBucket = ABS(CONVERT(INT,CRYPT_GEN_RANDOM(4))%1000)+1
    FROM sys.all_columns ac1 CROSS JOIN sys.all_columns ac2;
    GO 3
    SET STATISTICS TIME OFF;
    PRINT REPLICATE(‘@’,50);
    PRINT ‘***** Old GUID method *****’
    SET STATISTICS TIME ON;
    GO
    DECLARE @BitBucket INT;
    SELECT TOP 1000000
    @BitBucket = ABS(CHECKSUM(NEWID())%1000)+1
    FROM sys.all_columns ac1 CROSS JOIN sys.all_columns ac2;
    GO 3
    SET STATISTICS TIME OFF;

    And here are the results of that run…

    @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
    ***** CRYPT_GEN_RANDOM *****
    Beginning execution loop
    SQL Server parse and compile time:
    CPU time = 0 ms, elapsed time = 0 ms.

    SQL Server Execution Times:
    CPU time = 141 ms, elapsed time = 148 ms.
    SQL Server parse and compile time:
    CPU time = 0 ms, elapsed time = 0 ms.

    SQL Server Execution Times:
    CPU time = 156 ms, elapsed time = 146 ms.
    SQL Server parse and compile time:
    CPU time = 0 ms, elapsed time = 0 ms.

    SQL Server Execution Times:
    CPU time = 141 ms, elapsed time = 146 ms.
    Batch execution completed 3 times.
    SQL Server parse and compile time:
    CPU time = 0 ms, elapsed time = 0 ms.
    @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
    ***** Old GUID method *****
    Beginning execution loop
    SQL Server parse and compile time:
    CPU time = 13 ms, elapsed time = 13 ms.

    SQL Server Execution Times:
    CPU time = 125 ms, elapsed time = 138 ms.
    SQL Server parse and compile time:
    CPU time = 12 ms, elapsed time = 12 ms.

    SQL Server Execution Times:
    CPU time = 125 ms, elapsed time = 127 ms.
    SQL Server parse and compile time:
    CPU time = 0 ms, elapsed time = 0 ms.

    SQL Server Execution Times:
    CPU time = 125 ms, elapsed time = 128 ms.
    Batch execution completed 3 times.
    SQL Server parse and compile time:
    CPU time = 0 ms, elapsed time = 0 ms.

    Now I’ll admit that the difference is quite small, especially considering that the tests all involve the generation of 10 Million rows each (and I dump the output to variables to take display and disk time out of the picture) but… the “old” way is consistently faster and the old code is shorter to write.

    That being said, I’ll continue to use the “old” way. 😀

    Liked by 1 person

  2. With Cte_Randomness as
    (
    Select top 200

    CAST(CRYPT_GEN_RANDOM(1) AS INT ) AS RandomNumber,
    xxx as columnName
    From sourcetable
    )

    Select *,
    ROW_NUMBER () over( order by RandomNumber Asc) as Orderid

    from Cte_Randomness

    I have above better to generate unique row ids as the function still generates some rows with same random number.

    Like

    1. Heh… tests show that not every random number is unique???? What exactly do you think a random number actually is?

      What you’re talking about is a sequence of numbers that are returned in random order and that’s a whole ‘nuther ball game because you first have to make sure that you generate enough numbers and then you need set them into random order and then you need to keep track of which once you’ve “issued”.

      Like

Leave a comment

Create a website or blog at WordPress.com

Up ↑