Friend Codes in the Context of Social Networks: A Concrete Scheme

Metadata:
Published: 2023-09-17
Last modified: 2023-11-05 (made interactive generation button)


Context

In my previous post Friend Codes in the Context of Social Networks, I evaluated the idea of using friend codes as primary user identifiers on social networks. Usernames/handles and display names would be freed from being unique because the friend code would be the actual unique identifier. I shied away from presenting a concrete scheme in that post. However, I did end up making one. This is my formalized summary.

Note about patents: Friend codes act as a unique identifier generated by the system, rather than the user; entering it starts the flow to establish a friend relationship. Though patented by Nintendo (US9931571B2, US11083971B2, US8568239B2), obvious prior art exists (ICQ and phone numbers). I will therefore continue talking about this concept, but the patents may be an issue that could have discouraged Discord from pursuing this angle. I strongly suggest doing a patent search – especially if you plan to deploy your scheme in the United States of America.

Exploring the length limits

A good unique identifier needs to be unique, so it cannot be particularly short. For example, if friend codes were the numbers from 1 through 10, the platform could never have more than 10 users. Having enough space is therefore critically important. On the other hand, friend codes become increasingly inconvenient the longer they get. On mobile devices in particular, copying and pasting as well as typing both take real effort. Shortness is thus of the essence as well. This is a problem that friend codes, digital rights management (DRM) license keys and phone numbers share. However, DRM license keys exist in a somewhat different space: unlike friend codes and phone numbers, you only rarely have to interact with them. Using license files instead of keys has therefore become common practice.

Nintendo uses a 12-digit scheme. ICQ started at a 5-digit number and then just increased it sequentially. Nintendo does not even mention the length of the string in patent US8568239B2.

Microsoft famously uses 25-character product keys; even their gift cards use a 25-character format. Microsoft does not justify their 25-character choice in their patents US6209093B1 or US7512232B2.

The International Telecommunication Union Telecommunication Standardization Sector (ITU-T) recommends in E.164 (11/10):

ITU-T recommends that the maximum number of digits for the international geographic, global services, Network and groups of countries applications should be 15 (excluding the international prefix). Administrations are invited to do their utmost to limit the digits to be dialled to the degree possible consistent with the service needs.

Clearly, ITU-T realizes that this is of relevance, but then fails to actually justify their seemingly arbitrary choice of 15 digits. Checking over the Wikipedia list of national conventions for writing telephone numbers, there is an observable tendency towards creating smaller groups of digits.

George A. Miller presented in his 1956 paper (The magical number seven, plus or minus two: some limits on our capacity for processing information. Psych. Rev. 63, pp. 81–97) the notion that human short-term memory span is about 7 items. These items must be chunks, roughly units of something the reader is familiar with (such as words in a language they know, digits or characters in a script they know). Nelson Cowan revisited this paper in 2001 (The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behav. Brain Sci. 24(1), pp. 87–114). He arrived at the conclusion that memory span only covers about 3 to 5 chunks.

This means an approach using words would be exceedingly efficient, as noted in my original post. But they present the issues outlined therein (keyboard layout, not everyone speaking and using the same language). Using purely numeric digits would mean either trivial user discovery (leading to rapid enumeration of all possibilities) and resource exhaustion or alternatively a relatively long string of digits. For these reasons, I choose to go with alphanumeric digits, which have the greatest chance of being recognizable by the most people and thus hopefully easy enough to remember. These should be grouped into short units.

As of the time of writing, the human population on earth is estimated to be about 8,000,000,000 people. Not every single one of them will use or even be able to use your service. On the other hand, some people will require multiple accounts (e. g. for their bots or because they are a plural system). Thus, aiming to represent about 8 billion identifiers seems approximately ideal.

Putting it together

Using the base 23 alphabet CDFGJKLMNQRSTVWXZ234579, all the characters are at least somewhat visually distinct. The goal of 8 billion requires a 33-bit integer to represent. 7 digits of base 23 can only represent a little under 32 bits (23^7 = 3,404,825,447). An extra digit adds about 4.5 bits (23^8 = 78,310,985,281), far exceeding the required 8 billion.
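The arithmetic above can be checked with a few lines of Python. This is only a sketch; the names ALPHABET, TARGET and digits_needed are mine and purely illustrative.

```python
import math

ALPHABET = "CDFGJKLMNQRSTVWXZ234579"  # the 23 symbols from the post
BASE = len(ALPHABET)                  # 23
TARGET = 8_000_000_000                # roughly one identifier per human

bits_per_digit = math.log2(BASE)      # about 4.52 bits per base 23 digit
bits_needed = TARGET.bit_length()     # 33 bits to hold 8 billion


def digits_needed(target: int, base: int) -> int:
    """Smallest n with base**n >= target, using exact integer arithmetic."""
    n, capacity = 0, 1
    while capacity < target:
        capacity *= base
        n += 1
    return n


print(BASE ** 7)                     # 3404825447, short of 8 billion
print(BASE ** 8)                     # 78310985281, comfortably above
print(digits_needed(TARGET, BASE))   # 8
```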

As I mentioned in my original post on the topic, a check digit is exceedingly useful for giving the user feedback, since it allows checking for common human input mistakes. Specifically, I use the scheme from Yanling Chen et al., A general check digit system based on finite groups, Des. Codes, Cryptogr. 80, pp. 149–163, 2015. Since 23 is prime, no matrix arithmetic is required for the implementation. I use α ∈ {5, 7, 10, 11, 14, 15, 17, 19, 20, 21}, arbitrarily preferring to use 5 in particular, leading to the following cycle of coefficients for the check digit equation: (5, 2, 10, 4, 20, 8, 17, 16, 11, 9, 22, 18, 21, 13, 19, 3, 15, 6, 7, 12, 14, 1).
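To make the coefficient cycle concrete, here is a minimal sketch of one way such a check digit could be computed. It assumes the check equation is a plain weighted sum modulo 23 that must come out to zero, that position i (counting from 1) gets the weight 5^i mod 23, and that the characters map to values in alphabet order (C = 0, D = 1, and so on). The post does not spell out any of these details, so treat them as my assumptions rather than as the scheme from the paper.

```python
ALPHABET = "CDFGJKLMNQRSTVWXZ234579"
P = len(ALPHABET)   # 23, prime
ALPHA = 5           # primitive root mod 23, as chosen in the post
# Assumed mapping of characters to values: C=0, D=1, ..., 9=22.
VALUE = {ch: i for i, ch in enumerate(ALPHABET)}


def weights(n: int) -> list[int]:
    """First n coefficients ALPHA**1, ALPHA**2, ... reduced mod P."""
    return [pow(ALPHA, i, P) for i in range(1, n + 1)]


def check_digit(payload: str) -> str:
    """Pick the final symbol so the weighted sum of all symbols is 0 mod P."""
    w = weights(len(payload) + 1)
    partial = sum(wi * VALUE[ch] for wi, ch in zip(w, payload)) % P
    # Solve partial + w[-1] * c == 0 (mod P) for c via the modular inverse.
    c = (-partial * pow(w[-1], -1, P)) % P
    return ALPHABET[c]


def is_valid(code: str) -> bool:
    """True if all symbols are in the alphabet and the weighted sum is 0 mod P."""
    if any(ch not in VALUE for ch in code):
        return False
    w = weights(len(code))
    return sum(wi * VALUE[ch] for wi, ch in zip(w, code)) % P == 0


print(weights(22))  # reproduces the cycle (5, 2, 10, 4, 20, ..., 14, 1)
```

Under these assumptions, every single-character substitution changes the sum by a nonzero multiple of a nonzero weight, and swapping two different neighbouring characters changes it because neighbouring weights always differ, so both kinds of typo are caught.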

Therefore, I choose to work with 3 groups of 3 digits in base 23 each: 8 data digits followed by a final check digit.
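The data part of such a code could be produced from a numeric identifier roughly like this. It is a sketch under my own assumptions: most significant digit first, hyphens as group separators, and input normalized to upper case. The check digit is deliberately left out here, and all function names are hypothetical.

```python
ALPHABET = "CDFGJKLMNQRSTVWXZ234579"
BASE = len(ALPHABET)
PAYLOAD_LEN = 8  # 8 data symbols; the ninth symbol would be the check digit


def to_payload(n: int) -> str:
    """Encode 0 <= n < 23**8 as exactly 8 symbols, most significant first."""
    if not 0 <= n < BASE ** PAYLOAD_LEN:
        raise ValueError("identifier out of range")
    symbols = []
    for _ in range(PAYLOAD_LEN):
        n, r = divmod(n, BASE)
        symbols.append(ALPHABET[r])
    return "".join(reversed(symbols))


def from_payload(payload: str) -> int:
    """Inverse of to_payload; raises ValueError on symbols outside the alphabet."""
    n = 0
    for ch in payload:
        n = n * BASE + ALPHABET.index(ch)
    return n


def group(code: str, size: int = 3) -> str:
    """Insert a hyphen after every `size` symbols for display."""
    return "-".join(code[i:i + size] for i in range(0, len(code), size))


def ungroup(text: str) -> str:
    """Strip separators and normalize case before validating user input."""
    return text.replace("-", "").replace(" ", "").upper()


print(group(to_payload(0) + "C"))  # CCC-CCC-CCC
```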

This should be coupled with a scheme to turn the hash value of the identifier into a color. For example, the interface that displays the current user's friend code could look like this:

Your friend code is:


When the user clicks on the Copy button, copy not only the friend code, but also roughly the expected color. In this example, the clipboard would read: MVE-G7Q-C32 (teal).

When the recipient then enters it into the UI manually (e. g. on a phone or something), also render the same color. That way, people can immediately visually see any kind of typo.
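One possible way to derive such a color is sketched below. The post only says that the hash value of the identifier is turned into a color, so everything specific here is my assumption: SHA-256 as the hash, the first two bytes as the hue, fixed lightness and saturation, and the hue boundaries and color names in HUE_BUCKETS.

```python
import colorsys
import hashlib

# Hypothetical hue buckets: (upper bound in degrees, rough color name).
HUE_BUCKETS = [
    (15, "red"), (45, "orange"), (70, "yellow"), (165, "green"),
    (195, "teal"), (255, "blue"), (290, "purple"), (345, "pink"), (360, "red"),
]


def hue_name(degrees: float) -> str:
    """Map a hue angle in [0, 360) to a rough color name."""
    for upper, name in HUE_BUCKETS:
        if degrees < upper:
            return name
    return "red"


def code_color(code: str) -> tuple[str, str]:
    """Return (hex color, rough color name) derived from a hash of the code."""
    canonical = code.replace("-", "").upper()
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    hue = int.from_bytes(digest[:2], "big") / 65536    # fraction in [0, 1)
    r, g, b = colorsys.hls_to_rgb(hue, 0.45, 0.65)     # fixed lightness, saturation
    hex_color = "#{:02x}{:02x}{:02x}".format(
        round(r * 255), round(g * 255), round(b * 255)
    )
    return hex_color, hue_name(hue * 360)


def clipboard_text(code: str) -> str:
    """Text placed on the clipboard: the code plus its rough color name."""
    return f"{code} ({code_color(code)[1]})"
```

Because the color depends on every character of the code, a typo will usually, though not always, land on a visibly different hue; the check digit remains the reliable test.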

Conclusion

I spent way more time than I should have on something that will help neither me nor anybody else do anything. And I probably created a patent mine.