Why YouTube will never run out of Video IDs.

Video Transcript

Every YouTube video has a unique ID. It’s up in the URL: a string of eleven characters that uniquely identifies which video you want. Now, YouTube has millions and millions of videos. The last stats that they released said they have 400 hours of video being uploaded every minute. So: are they ever going to run out of those IDs?

Well, to find out, let’s talk about counting systems. People count in base 10. 0 to 9. That’ll be, hopefully, familiar to you. Computers count in base 2, in binary, but that’s difficult for humans to read, it gets too long to write really, really quickly, so often computers will display it in base 16, hexadecimal. You have 0 to 9, and then A to F, and then you start adding to the next column. Humans can’t understand that easily, but it’s efficient if we have to type it in somewhere, and 16 (2 to the power of 4) is also easy for computers to deal with.

So how about Base 64? That’d be a ridiculous counting system, right?
Except. 64 is another one of those easy numbers for computers, it is 2 to the power of 6. And humans can get to 64 very easily: 0 to 9, then capital letters A to Z, then small letters a to z, and two other characters. Most Base 64 uses slash and plus, but they don’t work so well in URLs, so YouTube uses hyphen and underscore.

That YouTube URL, that unique ID, is really just a random number in base 64. They could have picked base 10 or base 16, but they didn’t: they went with 64, because it will let you cram a huge number into a small space and still make it vaguely human readable. Author and programmer Sam Hughes, by the way, pushed this to the limit, and invented Base 65,536, which includes basically every character from every language. It is ridiculous and unnecessary, but when has that ever stopped programmers?

So why didn’t YouTube just start counting
at 1 and work up? Well, first, they would have to synchronise their counting between all the servers handling the video uploads, or they’d have to assign each server a block of numbers. Either way, there’s a lot of tracking to do, a lot of making sure that it’s never duplicated. Instead, they just generate a random number for each video, see if it’s already taken, and if not, use it.

And secondly, it is a really, really bad idea to just count 1, 2, 3 and so on in URLs. Incremental counters, as they’re called, can be a big security flaw: if you see video 283 up there, then you might wonder: what’s video 284? Or video 282? It’s easy to enumerate, as it’s called, to run through the entire list. YouTube Unlisted videos, the ones that don’t appear publicly but that you can send the link to people, those wouldn’t work.

And by the way? Lots of badly designed sites
do use incremental counters. And it is a terrible idea. It might tell your competitors exactly how many customers you have, ’cos they can just count them. It might let people download all your records easily, ’cos they can just run through them. And in one site that someone in Florida emailed me about this week, it lets you look at other people’s personal details. Don’t use incremental counters if you’re building a web site. Use a random number.

Which brings me to the question: just how big are the numbers that YouTube
uses? Well, let’s work it out. One character of base 64 lets you have 64 ID numbers. Two characters? That’s 64 by 64, or 4,096. Three characters? 64 times 64 times 64, or 64 to the power of 3. That is already more than a quarter of a million. And if we go to four? Well, now we’re above 16 million. If you use Base 64, then you can assign an ID number to everyone who lives in London down there twice over, and you’ll only need four characters. This gets big fast. We can keep on doing this, and by seven characters we’re already at four trillion.

Now, I assume that YouTube checks through
a dictionary, and doesn’t allow any actual words to appear up there, particularly anything rude. But that is going to be a tiny minority of the URLs, so for our purposes, we can pretty much just ignore that.

At YouTube’s 11 characters, we are at 73 quintillion 786 quadrillion 976 trillion 294 billion 838 million 206 thousand and 464 videos. That’s enough for every single human on planet Earth to upload a video every minute for around 18,000 years.

YouTube planned ahead. Can they run out of URLs? Technically, yes. Practically? No. And if they did? They could just add one more character.

Ha! One take! One take! Yes!
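
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the idea described in the transcript: pick a random 11-character ID from a 64-character URL-safe alphabet, retry if it is already taken, and count how many IDs are possible. It is an illustration only, not YouTube’s actual code. The alphabet ordering, the names `ALPHABET`, `id_space`, `new_video_id` and `years_to_exhaust`, the in-memory `taken` set, and the population figure are all assumptions made for this example.

```python
import secrets

# 64 URL-safe characters: digits, capitals, lowercase, hyphen, underscore.
# The ordering here is an assumption; the transcript does not give one.
ALPHABET = (
    "0123456789"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "-_"
)
ID_LENGTH = 11  # length of a video ID, as stated in the transcript


def id_space(length, base=len(ALPHABET)):
    """Number of distinct IDs of the given length: base to the power of length."""
    return base ** length


def new_video_id(taken):
    """Draw random IDs until one is not in `taken`, then record and return it.

    Choosing 11 characters uniformly at random is the same as choosing a
    uniformly random number below 64**11 and writing it in base 64.
    `taken` is a hypothetical stand-in for whatever store tracks used IDs.
    """
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(ID_LENGTH))
        if candidate not in taken:
            taken.add(candidate)
            return candidate


def years_to_exhaust(population, length=ID_LENGTH):
    """Whole years until the ID space is used up if every person uploads
    one video per minute (assumes 365-day years)."""
    minutes_per_year = 365 * 24 * 60
    return id_space(length) // (population * minutes_per_year)


if __name__ == "__main__":
    print(id_space(4))    # 16777216
    print(id_space(11))   # 73786976294838206464
    print(new_video_id(set()))
    # Assumed world population of 7.5 billion; the result is roughly 18,000 to 19,000.
    print(years_to_exhaust(7_500_000_000))
```

With random IDs, a collision check is still needed, but at this size almost every draw is free on the first try, so the retry loop rarely runs more than once.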

Video Description

In the URL of each YouTube video is the 11-character video ID, unique for each video. Can they ever run out? Just how many videos can YouTube handle? To work it out, we need to talk about counting systems, and about something called Base 64.

Want to know how the single camera shot was done? “Matt Bought a DJI Osmo and It’s Surprisingly Good” is today’s video over on the Park Bench: https://www.youtube.com/watch?v=Dyy41yAs8nc

I’m at https://tomscott.com/
or on Twitter at http://twitter.com/tomscott
or on Facebook at http://facebook.com/tomscott
or on Instagram as tomscottgo.

Filmed by Matt Gray, who’s at http://mattg.co.uk
or @unnamedculprit on basically everything everywhere.

Author: dhobson