How Binary Search Makes Computers Much, Much Faster
Featuring binary versus linear search, and non-clustered indexes. Uh, indices. However you want to say it. • MORE BASICS: fiblock.info/player/PL96C35uN7xGLLeET0dOWaKHkAlPsrkcha.html
Written with Sean Elliott SeanMElliott/ • Camera by Tomek • Graphics by William Marler wmad.co.uk
I'm at tomscott.com
on Twitter at tomscott
on Facebook at tomscott
and on Instagram as tomscottgo
Published 6 months ago
I am, as ever, extremely thankful for animator William Marler for handling all the graphics here!
Information, i.e. books, mostly changes on a 3-5 year cycle. So why not sort by release date?
Hmm, the video was uploaded a month ago, but Tom wrote this comment 3 months ago? Weird
how can there be a 3-month old comment on a video uploaded 2 months ago?
This video is great, as per usual, but it's particularly awesome that you added that bit about Dewey. This is what needs to happen, an extra 10s to give context, not much, but so worth the time. Thanks
A113, eh?
4:21 Don’t tell me that screen has T-Series on it
*Wow, Very good. ... Muito bom. ... Très bien. ... Beetho, da "Boomerang Flowers Band ®", de Belo Horizonte (MG), Brasil.*
Now do one on the B-tree.
I don't know why you felt that you had to denounce Dewey... weird
It's simple. Would you look for a phone number in a phone book starting at page one, and reading every single page until you find the name? No. You break the phone book in half, see which half the name is in then keep repeating.
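A minimal sketch of that phone book halving, assuming a small made-up list of surnames that is already sorted (the `binary_search` name and the sample data are illustrative, not from the video):

```python
def binary_search(sorted_names, target):
    """Return the index of target in sorted_names, or -1 if it is absent."""
    lo, hi = 0, len(sorted_names) - 1
    while lo <= hi:
        mid = (lo + hi) // 2          # open the "book" in the middle
        if sorted_names[mid] == target:
            return mid
        if sorted_names[mid] < target:
            lo = mid + 1              # target is in the later half
        else:
            hi = mid - 1              # target is in the earlier half
    return -1


# Hypothetical phone book; it must be sorted for this to work.
phone_book = ["Adams", "Baker", "Clark", "Dewey", "Evans", "Scott", "Young"]

print(binary_search(phone_book, "Scott"))
print(binary_search(phone_book, "Marler"))
```

Each pass through the loop throws away half of what is left, which is the "break the phone book in half" step.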
Real OGs get the curious George reference
On a smaller scale linear search is often faster because of cpu pipelines
0:19 The attention to detail here is incredible. Besides the obvious joke about Tom’s red shirts, John Scalzi is an actual author that wrote a book called _Redshirts_
Binary search only works on ordered lists, not unordered ones
Pathetic human race. Arranging their knowledge by category just made it easier to absorb. Dewey, you fool! Your decimal system has played right into my hands!
Brilliant !!!! 👍🏻👌🏻👍🏻👌🏻
Don't databases typically use balanced trees, not binary ones? They are very much related, but balanced trees make sure sorted input doesn't result in a lopsided tree. They also use the fact that data is usually bunched together on storage media.
Aside from the design of google's indexes, the reason that their searches are so fast is because they store 100% of their indexes in RAM -- and very fast RAM on very fast computers. They build their indexes off-line. Once built, they copy them to one of their RAM-based index servers. Once copied there, they then add that index server to their client pool. it is the computers in the client pool that we interact with when we make a search request (not directly, because we go through firewalls and load balancers -- but our search requests hit those RAM index servers). So people making search requests never see the work, and delays, behind the scenes, that go into creating the indexes. We are simply given access to those indexes, when someone at google adds a now-ready RAM index server to their pool of computers that process our requests.
Content
Same as genders, if you treat it as binary, you can advance society much much faster
Things I learned today: How binary searches work. Dewey was a terrible human being.
Don't recommend WhatsApp.
Laughing in hashmap
Tom, I do not know how you find the time and effort to produce these videos but they are brilliant! Top one. Nice one. Get sorted.
Massive respect to Google for optimizing the algorithm. I wish it was open source.
Girls eyes
I don't disagree with any of Tom's criticisms of the man's moral failures, but hearing a guy like Tom complain about how evil a historical figure was makes me want to die a little bit.
All the schools and libraries I’ve been to still use dewey’s system for nonfiction
Excellent explanation Tom, I never found one like this on binary search in youtube. Thanks for the content
smart guy from the past starter pack: Eugenics prof That's it
Don't you hate it when you notice Tom Scott say you can't do something, and you say to yourself "Wrong! You can use multiple indices and there should be a tiny overlap, if any, to sort through after" and then Tom says "Unless you use indexes" and it makes your entire train of thought redundant.
Still Google search isn't perfect
What kind of computers are those next to you? They look just like the ones I used at the phone company from 1998-2002ish (when we finally got color monitors.)
The tame robert epidemiologically rot because headlight hisologically untidy down a broad tights. utopian, encouraging sail
BIG TO SMOL
“It doesn’t make much difference for eleven cards”, if anything it actually makes searching small datasets slower than linear search since linear search is cache and prefetch friendly while binary search is not. You need enough items in the dataset so that the search time is dominated by the items rather than cache thrashing.
Well seven months later the search take longer but returns 100,000,000 more results!!
God I'd love to talk with Tom.
Sorting by colour is psychopathic and stupid. You cant change my mind. It doesnt make sense. Breaks as soon as a series has different colours.
4:02 genuinely expecting Tom to say “a cluster ****” 😂
Can we point out how great the "The Basics" sound is?
Goes to show terrible people can still have good ideas
When I selected the video, I didn't know what it was going to be about. And suddenly I remembered every time I had to access data from an SQL table. The phrase "did you put an index on that table" was said out loud every time a process was taking too long.
this is what we call it a video thanks so much for the efforts
Another thing to remember: make sure your computer isn't programmed to index the index. I was wondering why my index file was 263 GB on my 500 GB hard drive, and someone told me to make sure the index was excluded from itself. Whoops! :)
I like the way you explain things to me as if I were a 3-year-old with a malfunctioning brain, so I can understand everything
0:18 this looks like a book Tom would read
Truly great information.
Dewey - The Andy Rubin of his time...
Dewey is a fantastic example of an uncomfortable truth. Sometimes terrible people can do great things.
mmmmMMMmM bogos binted
Nice tip of the hat using A113!
It's quite simple: Database indexes contain the indices of the data, but in a different order, so you can quickly search the index to find the index of the original data you wanted.
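A rough sketch of that idea, assuming a toy in-memory "table" (the rows, titles, and the `find_by_title` helper are all hypothetical, and a real database would use an on-disk tree structure rather than a Python list):

```python
import bisect

# Hypothetical table: rows sit in insertion order, so the row
# position (the list index) says nothing about the title.
rows = [
    {"title": "Hypothetical Title M", "shelf": 3},
    {"title": "Hypothetical Title B", "shelf": 7},
    {"title": "Hypothetical Title X", "shelf": 1},
    {"title": "Hypothetical Title F", "shelf": 4},
]

# The index: the same titles in sorted order, each paired with
# the position of its row in the original table.
title_index = sorted((row["title"], pos) for pos, row in enumerate(rows))
sorted_titles = [title for title, _ in title_index]


def find_by_title(title):
    """Binary-search the index, then jump straight to the row."""
    i = bisect.bisect_left(sorted_titles, title)
    if i < len(sorted_titles) and sorted_titles[i] == title:
        return rows[title_index[i][1]]
    return None


print(find_by_title("Hypothetical Title F"))
```

The table itself never gets re-sorted; only the small index does, which is why you can keep several indexes (title, author, colour) over the same rows.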
Chinese dictionaries also have indexing by radicals, on top of the stroke count and pages. Radicals are memorable visual components of many Chinese characters, and often carry a unified meaning to characters who share them, e.g. characters with 亻are usually human-related, or characters with 氵are usually water-related. Certainly has more meaning as an index than colour (to me), though ofc not all indexes need a deeper meaning anyway, as long as they get the job done....
Oh wow when i got taught about this in school i didnt think dewey was a sexpest
linear search is very often faster than binary search because of cache prefetching.
Oh, this video gave me a big wave of high school flashbacks. That sequence of cards really brings back memories, haha.
Wow, Dewey seems like a cool gu-... oh god..
*Indexes. Indizes = multiple registers; Indexes = multiple indexes in a database. Taught by my university professor in Germany
who says sexsism and rasism is bad, it can be an indication of wrong that is ignored by some like you apparenty
Well, you normally use a hash table for the primary key index, not rely on binary search for that. Binary search is common for other indexes though.
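For comparison, a minimal sketch of a hash-based lookup, using Python's built-in `dict` as the hash table. The rows and the `find_by_id` helper are made up for illustration, and whether a given database engine actually uses a hash or a tree for its primary key varies:

```python
# Hypothetical table rows, stored in arbitrary order.
rows = [
    {"id": 4021, "name": "row A"},
    {"id": 17, "name": "row B"},
    {"id": 950, "name": "row C"},
]

# Hash index: primary key -> position of the row in the table.
primary_key_index = {row["id"]: pos for pos, row in enumerate(rows)}


def find_by_id(row_id):
    """Look the key up directly; no sorting or halving involved."""
    pos = primary_key_index.get(row_id)
    return None if pos is None else rows[pos]


print(find_by_id(950))
```

The trade-off is that a hash table only answers "is exactly this key here?". It cannot cheaply answer range questions such as "every id between 100 and 1000", which a sorted index can.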
Neat video! Thanks for uploading!
>there are two ways a computer can search, linear or binary Laughs in hashmap
pedobyte (n.) the amount of data storage used by the average nonce.
That’s some GREAT green screen
If life is a Search for Meaning it better be Binary Search
A113 easter egg on a Tom Scott Video? More likely than you think.
2:01 HP: 1000 T-SERIES anyone notice this? i thought it was just a mistake but nevermind.
I always have to use linear search, it's because I'm non binary
T-Series easter egg? 2:00
What
5:36 I'm using the triangle to describe the struggle between CPU, ram, and disk space. The size of the triangle is showing the size of the impact on the system that the program will have.
But Wiggle Ferret
Dewey based system lmao cry harder about it and NYW
life is about 3 things, getting bitches, getting money, and the dewey decimal system - Bo Burnham
How can anyone dislike a video like this? Great work as always!
nicely described.
*Indexing:* Do you want to be lazy now or in the future?
Can you use this on Microsoft folder sorting?
Amazing
Yet Windows takes forever to search my computer?
Most likely windows explorer is set to search file contents as well--if you turn that off, it'll only search filenames, which is far quicker.
Great to see matt parker in a tom scott video
I'm interested in Dewey's sexual deviences now
long live racism! you will never silence the truth!
Great joke
Thanks to Melvil Dewey we don't need to ask the librarian where a specific adult book is.
I don't know how they do it, but I've got a search program that requires no indexing and is able to return over 2 million results across 4.68TB of data, in 5.63 seconds. Ultra Search, it's free too.
offtopic, but in my opinion: index, as the whole list - plural from that (many lists) is indexes. index, as a one number (id) - plural (many numbers where each serves as an id) indices.
But Google doesn't use binary search, does it?
Could you explain Page search next ?
When a corporate type like TS says someone was bad you know he's just covering himself
Dewey would have become president today 🤣
.. What kind of idiot would go into a book store, and ask by color?
The most amazing thing about binary search: if searching through x items takes (a maximum of) n comparisons, doubling the number of items takes only n+1 comparisons. So with 100,000 items it's 17 comparisons at most, and 200,000 takes 18.
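A quick way to check the doubling claim, assuming one probe per halving step (the `worst_case_probes` helper is illustrative; other ways of counting comparisons can differ by one):

```python
def worst_case_probes(n):
    """Most probes a binary search needs on n sorted items.

    Each probe discards at least half of the remaining items,
    so keep halving until nothing is left.
    """
    probes = 0
    while n > 0:
        n //= 2
        probes += 1
    return probes


for size in (100_000, 200_000, 400_000):
    print(size, worst_case_probes(size))
```

Under this convention 100,000 items need at most 17 probes and 200,000 need 18, so doubling the list costs exactly one more step.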
There are many more strategies than just linear/binary to search a list. Easy Example: Probabilistic based upon guessing the distribution of the remaining items to be searched and where you should guess to maximize the speed at which you narrow the search space.
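One well-known strategy along those lines is interpolation search, which guesses the position from the values at both ends instead of always picking the middle. A minimal sketch, assuming sorted integers that are spread roughly evenly (the function name and sample data are illustrative):

```python
def interpolation_search(sorted_nums, target):
    """Return the index of target in sorted_nums, or -1 if absent."""
    lo, hi = 0, len(sorted_nums) - 1
    while lo <= hi and sorted_nums[lo] <= target <= sorted_nums[hi]:
        low_val, high_val = sorted_nums[lo], sorted_nums[hi]
        if low_val == high_val:
            # Every remaining value is equal, and the loop condition
            # already guarantees it equals the target.
            return lo
        # Guess where the target "should" be if values are evenly spaced.
        pos = lo + (target - low_val) * (hi - lo) // (high_val - low_val)
        if sorted_nums[pos] == target:
            return pos
        if sorted_nums[pos] < target:
            lo = pos + 1
        else:
            hi = pos - 1
    return -1


evenly_spaced = list(range(0, 1000, 10))   # 0, 10, 20, ..., 990
print(interpolation_search(evenly_spaced, 730))
print(interpolation_search(evenly_spaced, 735))
```

On evenly spread data the first guess often lands on or right next to the answer. On badly skewed data it can do worse than plain binary search, so it is a bet on the distribution rather than a free win.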
Please tom, 60 fps. My poor eyes
I still remember the time when the transition happened from Yahoo or Altavista (or metacrawler or however that was called), yielding results in 2-3 seconds, to that new thing, Google, without anything on the landing page except that search bar - and giving the results in *fractions* of a second. And they were so much better, so much more accurate, they found so many more sites! It was mind-blowing.
For a minute I thought you were talking about John Dewey and I was so confused lmfao
except when it doesn't.. (cache, memory prefetch)
Good video.
So Melvil Dewey is the R. Kelly of librarians?
1:40 As we will Michael Jackson's as well. If you do something good, it stays good regardless of what bad you have done.
Hi Tom. Having a pedantic moment. The computer's speed does not change. What has been achieved is a faster algorithm that provides results quicker. The poor old computer is none the wiser; it still runs at its designed speed regardless of what is being executed.
That's where the TOC comes in
No Scott, it's from big to small.