Random String Generation in C#

Some programming problems seem like they should be easy but aren't (centering content in a DIV, anyone?). Generating random unique strings is one of them.

On a recent project we had a requirement to create several types of unique strings: a readable user code, item numbers, and suggested strings for invite codes.

Random is Not Random

The first thing you learn when you start reading up on generating random numbers is that "Random" is not really random. With the .NET Random class, multiple instances created with the same seed value will generate the same sequence of values. For example, the following test will pass.

[Fact]
public void RandomStuff()
{
    var r1 = new Random(15);
    var r2 = new Random(15);    
    
    Assert.Equal(r1.NextDouble(), r2.NextDouble());
}
In .NET Core the default seed value used by the default constructor has been improved, but I still read a lot of suggestions saying not to use Random, or that it's not "cryptographically secure".

More 'Random' Solutions

In all of our scenarios for this project we don't require the random strings we generate to be globally unique or non-guessable. We still need to ensure these values are unique in our database with the appropriate constraints. So we can be a little more relaxed, but I still wanted to see if there was something we could easily do beyond Random, and I also wanted more insight into how good Random really is.

After some reading, the simple alternative to Random appears to be the RandomNumberGenerator.GetInt32 method introduced in .NET Core 3.
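As a quick illustration of that API, here is a minimal sketch. The class and method names (GetInt32Demo, RollDie, PickIndex) are made up for this example; only RandomNumberGenerator.GetInt32 comes from the framework.

```csharp
using System.Security.Cryptography;

// Hypothetical helper class, for illustration only.
public static class GetInt32Demo
{
    // Returns a value from 1 to 6; the upper bound passed to GetInt32 is exclusive.
    public static int RollDie() => RandomNumberGenerator.GetInt32(1, 7);

    // Returns a valid index into an array of the given length: 0 up to length - 1.
    public static int PickIndex(int length) => RandomNumberGenerator.GetInt32(length);
}
```

Because the upper bound is exclusive, passing an array's Length directly always yields a valid index, which is how the generation methods in this post use it.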

Two Solutions

To see if there was any real difference between random key generation using RandomNumberGenerator and the default Random constructor, I created the following two generation methods.

public string GetKey(int size, char[] chars)
{   
    var result = new StringBuilder(size);
    for (var i = 1; i <= size; i++)
    {
        var idx = RandomNumberGenerator.GetInt32(chars.Length);
        result.Append(chars[idx]);
    }
    
    return result.ToString();
}

public string GetKeyWithRandom(int size, char[] chars)
{
    var result = new StringBuilder(size);
    var random = new Random();
    for (var i = 1; i <= size; i++)
    {
        var idx = random.Next(chars.Length);
        result.Append(chars[idx]);
    }

    return result.ToString();
}

Then I wrote a little unit test to see how many unique keys I could generate before we created a duplicate. 100,000 unique keys seemed like enough for my current usage.

[Fact]
public void QuickUniquenessCheck()
{
    const int size = 6;
    const int iterations = 100000;
    var keys = new Dictionary<string, int>();
    // A local variable: field modifiers such as "private readonly" do not compile inside a method body.
    var chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789".ToCharArray();

    for (var i = 0; i < iterations; i++)
    {
        var key = GetKey(size, chars);
        //var key = GetKeyWithRandom(size, chars);
        if (keys.ContainsKey(key))
        {
            throw new ApplicationException($"Key {key} duplicated at iteration {i}");
        }
        keys.Add(key, i);
        _output.WriteLine($"{i}. Key {key}");
    }

    Assert.Equal(iterations, keys.Count);
}

With a sample size of 100,000 keys, the test using RandomNumberGenerator succeeded almost every time, while the test using Random produced a duplicate in about two out of three runs, or more often.
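For context on those numbers, the birthday approximation gives a rough idea of how often a duplicate should show up if every key were drawn uniformly and independently. The sketch below is a back-of-the-envelope check under that uniformity assumption, not part of the test above; CollisionEstimate and DuplicateProbability are hypothetical names.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class CollisionEstimate
{
    // Approximates the chance of at least one duplicate among `draws` keys
    // drawn uniformly from alphabetSize^keyLength possible keys, using the
    // birthday approximation 1 - exp(-n(n-1) / (2N)).
    public static double DuplicateProbability(int alphabetSize, int keyLength, long draws)
    {
        var keySpace = Math.Pow(alphabetSize, keyLength);
        var pairs = draws * (draws - 1) / 2.0;
        return 1.0 - Math.Exp(-pairs / keySpace);
    }
}
```

Calling CollisionEstimate.DuplicateProbability(62, 6, 100000) comes out at roughly 8%, so under this model an occasional failed run would be expected even from an ideal generator.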

That makes RandomNumberGenerator the winner for my usage.

Real World

How did I begin using this?

To use this key generation code in my application to create short user codes, I first abstracted the generation logic into an IKeyGenerator interface and then created a UserCodeGenerator that generates the key and makes sure it is unique in the database.

public interface IKeyGenerator
{
    string GetKey(int size);

    string GetKey(int size, char[] chars);
}

public class UserCodeGenerator
{
    private readonly IKeyGenerator _keyGenerator;
    private readonly ApplicationDbContext _dbContext;

    public UserCodeGenerator(IKeyGenerator keyGenerator, ApplicationDbContext dbContext)
    {
        _keyGenerator = keyGenerator;
        _dbContext = dbContext;
    }

    public async Task<string> GetUserCode()
    {
        var size = 4;
        string code = null;
        var codeIsUnique = false;
        while (!codeIsUnique)
        {
            code = _keyGenerator.GetKey(size);
            codeIsUnique = !(await _dbContext.Users.AnyAsync(x => x.UserCode == code));
        }

        return code;
    }
}
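The post does not show an implementation of IKeyGenerator, so here is one possible sketch built on the GetKey method from earlier. The class name CryptoKeyGenerator and the default alphabet are assumptions made for this example, and the interface is repeated only so the snippet compiles on its own.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Same shape as the interface shown above, repeated so the sketch is self-contained.
public interface IKeyGenerator
{
    string GetKey(int size);

    string GetKey(int size, char[] chars);
}

// Hypothetical implementation; the name and default alphabet are assumptions.
public class CryptoKeyGenerator : IKeyGenerator
{
    private static readonly char[] DefaultChars =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789".ToCharArray();

    // Uses the default alphabet when the caller does not supply one.
    public string GetKey(int size) => GetKey(size, DefaultChars);

    public string GetKey(int size, char[] chars)
    {
        if (size < 0) throw new ArgumentOutOfRangeException(nameof(size));
        if (chars == null || chars.Length == 0)
            throw new ArgumentException("At least one character is required.", nameof(chars));

        var result = new StringBuilder(size);
        for (var i = 0; i < size; i++)
        {
            // GetInt32's upper bound is exclusive, so idx is always a valid index.
            var idx = RandomNumberGenerator.GetInt32(chars.Length);
            result.Append(chars[idx]);
        }

        return result.ToString();
    }
}
```

Keeping the alphabet as a parameter makes it easy to drop look-alike characters (such as O and 0) for codes that people have to read aloud or type.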

Conclusion

Generating unique strings as short as 4 characters requires keeping track of which codes have already been used. As we use more and more codes, the likelihood of generating a code that has already been used increases, and that will require a more complicated solution. That being said, I think the task of generating a random string is easily solved using this method. It's what I'll be using moving forward.
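To put a rough number on that: if a fraction p of all possible codes is already taken and each new code is drawn uniformly, the number of attempts needed to find a free code follows a geometric distribution with mean 1 / (1 - p). The sketch below uses hypothetical names (CodeSpace, TotalCodes, ExpectedAttempts), and the 62-character alphabet in the usage note is borrowed from the test earlier in the post, which may not match the alphabet actually used for user codes.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class CodeSpace
{
    // Total number of distinct codes for a given alphabet size and code length.
    public static long TotalCodes(int alphabetSize, int length)
    {
        long total = 1;
        for (var i = 0; i < length; i++)
        {
            total = checked(total * alphabetSize); // throws on overflow instead of wrapping
        }
        return total;
    }

    // Expected number of uniform draws until an unused code turns up.
    public static double ExpectedAttempts(long usedCodes, long totalCodes)
    {
        if (usedCodes >= totalCodes) return double.PositiveInfinity;
        return (double)totalCodes / (totalCodes - usedCodes);
    }
}
```

With 62 characters and 4 positions there are 14,776,336 possible codes. Under this model a new code takes about 2 attempts on average once half of them are used, and about 10 once 90% are used.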

Other Scenarios

  • The UserCodeGenerator in this post will run forever if we run out of unique keys in the database. How can that situation be avoided?
  • How could I alert myself that I am close to running out of unique 4 character keys? And at what number of used keys will I begin to have performance issues finding a new key using this method?
  • There is a non-zero chance that the unique user code created in GetUserCode is generated by another caller and added to the database ahead of my call.  How could I prevent or seamlessly handle this situation?
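As one possible starting point for the first question, the loop can be given an upper bound on attempts. This is a sketch under assumptions, not the code used in the project: BoundedCodeSearch and TryFindUnusedAsync are hypothetical names, and the database lookup is replaced by a caller-supplied delegate so the snippet stands alone.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
public static class BoundedCodeSearch
{
    // Tries up to maxAttempts candidates and returns the first one that the
    // caller-supplied check reports as unused; returns null if none is found.
    public static async Task<string> TryFindUnusedAsync(
        Func<string> generate,
        Func<string, Task<bool>> isTakenAsync,
        int maxAttempts)
    {
        for (var attempt = 0; attempt < maxAttempts; attempt++)
        {
            var candidate = generate();
            if (!await isTakenAsync(candidate))
            {
                return candidate;
            }
        }

        // Nothing free was found; the caller decides whether to lengthen
        // the code, raise an alert, or fail the request.
        return null;
    }
}
```

A null result (or a rising average attempt count) could also serve as the signal for the second question. This does not address the third question: two callers can still pick the same free code, so the database unique constraint remains the final check, and the insert would need to be retried with a fresh code if it is rejected.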

Cover image by Markus Spiske on Unsplash