#1
|
|||
|
|||
|
As an experiment I did a search for the single letter 'a' on Google.
It gave 25,130,000,000 results in 0.28 seconds. How is this speed achieved across all its indexed pages worldwide? Do all the data centres muck in and each do part of the search for 'a', or does just one data centre achieve this speed? Sorry to be so ignorant! |
|
#2
|
|||
|
|||
|
But I would imagine that every time you search, the results come from one data center only. So each data center would be conducting its own separate indexing(?) and most likely returning results to you on its own, without the help of other data centers(?)
The learned and the knowledgeable can correct the above. BTW, 1,800,000,000 for china (0.08 seconds) - and everything was not about me? |
|
#3
|
||||
|
||||
|
Those searches are done in a quick index. That quick index already knows how many occurrences there are of a word (approximately, as of the last full index update) because that number is stored alongside the word itself, so it can be reported without actually retrieving any documents. Then Google only retrieves the first 1000 links at most. Other search engines will retrieve a different maximum number of links.
(* Disclaimer: Ok, I'm speculating about the actual storage and retrieval mechanism of that index, but it's based on what I know historically of how indexing works in general. My knowledge may be obsolete. *) And yes, the results are from the particular datacenter you landed on.
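To make the idea concrete, here is a minimal sketch in Python of an index that stores a count next to each word. It is only an illustration of the general technique described above, not how Google actually does it; the class name `QuickIndex`, its methods and the sample documents are all made up for this example.

```python
class QuickIndex:
    """Toy inverted index that keeps a per-word count beside each posting list.

    Hypothetical illustration only; names and layout are assumptions.
    """

    def __init__(self):
        self.postings = {}  # word -> list of document ids containing it
        self.counts = {}    # word -> number of documents containing it

    def add(self, doc_id, text):
        # Count each word once per document, however often it repeats.
        for word in set(text.lower().split()):
            self.postings.setdefault(word, []).append(doc_id)
            self.counts[word] = self.counts.get(word, 0) + 1

    def estimated_results(self, word):
        # Reads the stored counter only; no posting list is touched.
        return self.counts.get(word.lower(), 0)

    def search(self, word, limit=1000):
        # Returns at most `limit` links, mirroring the 1000-link cap.
        return self.postings.get(word.lower(), [])[:limit]


index = QuickIndex()
index.add(1, "a quick index")
index.add(2, "a word and a count")
index.add(3, "no match here")

print(index.estimated_results("a"))  # count comes from the stored number
print(index.search("a", limit=1))    # retrieval is capped separately
```

The point of the design is that the "results found" figure and the list of links come from two different lookups, which is why the count can be huge while only a capped number of links is ever fetched.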
__________________
Christina >>Forum Moderator<< Please do not PM me for support. The forum is here for that. Last edited by webado; 07-01-2006 at 04:16 PM. |
|
#4
|
|||
|
|||
|
I think Christina's guess is pretty close to the way it works. There are probably some pretty hi-tech solutions in place too, like server clustering. Still pretty amazing how this all works together.
|
|
#5
|
|||
|
|||
|
Google built their own databases (at least originally) instead of trying to customize any of the off-the-shelf dbs. It was (and, presumably, is still) optimized for the kinds of searches they do.
There are a lot of hints out there about how they manage to run multiple processes with just a limited amount of information to start with, but once they've parsed the query string (split it into its components) it all has to be sent to the kinds of "inverted indexes" that Christina mentioned. It sounds to me (and I'm also just guessing, based on what little I've read and actually understood) that they send each word in a query off to a separate process that brings back matches, which are then ordered by yet another process using different rules.
My guess is that Microsoft uses some kind of highly customized version of SQL Server for their searches. It's notable that they spent years developing the version they called "TerraServer", which was (and is) used for mapping, but it was also a research project that allowed them to make sure that the DB app could quickly handle terabytes of data and many thousands of simultaneous searches.
I've never heard what Yahoo! uses, but they bought AltaVista before launching their own search engine. AltaVista started out as a research project by the late Digital Equipment Corporation, using custom db solutions on their (then) super-fast chip (which isn't made anymore). I suppose it's possible that Yahoo! still uses some of what the Digital folks developed. |
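As a rough sketch of the guess above (split the query, look each word up separately, then order the matches in a different step), something like the following Python would do it on a toy scale. Everything here is an assumption for illustration: the `INDEX` and `SCORES` data are invented, threads stand in for the separate processes, and combining the per-word matches with an intersection (all words must appear) is just one plausible choice.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy inverted index: word -> set of document ids.
INDEX = {
    "data": {1, 2, 3},
    "centre": {2, 3},
    "google": {3, 4},
}

# Hypothetical per-document scores standing in for the real ranking rules.
SCORES = {1: 0.2, 2: 0.9, 3: 0.5, 4: 0.7}


def parse_query(query):
    # Split the query string into its component words.
    return query.lower().split()


def lookup(word):
    # One lookup per word; unknown words match nothing.
    return INDEX.get(word, set())


def rank(doc_ids):
    # Separate ordering step: highest score first.
    return sorted(doc_ids, key=lambda d: SCORES.get(d, 0.0), reverse=True)


def search(query):
    words = parse_query(query)
    if not words:
        return []
    # Each word is looked up by its own worker (threads here, not processes).
    with ThreadPoolExecutor(max_workers=len(words)) as pool:
        per_word = list(pool.map(lookup, words))
    # Assumption: a document must contain every word to match.
    matches = set.intersection(*per_word)
    return rank(matches)


print(search("data centre"))
```

The lookups know nothing about ranking and the ranking step knows nothing about the index, which is what lets the two stages be run and tuned separately.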
|
#6
|
|||
|
|||
|
Quote:
Trying to stay happy happy, but Google, and admittedly Yahoo, "is" the evil enemies.... Guess not too happy happy here...........
__________________
He Profits Most Who Serves Best “Remember that great missions are serious undertakings. Do not expect to perform great missions in a day.” |
|
#7
|
|||
|
|||
|
Thanks for all the replies.
I am now tempted to ask how it is all stored. Do they just have mega numbers of hard drives? It would surely need hundreds of thousands of them, even with the biggest capacity available. Has anybody seen around a data centre? Are the HDs all on racks in air-conditioned rooms, or installed in one big PC ...... 1 mile x 1 mile x 10 miles in size? |
|
#8
|
||||
|
||||
|
This is not from Google, obviously, it's from The Planet - a big datacenter in Texas: http://www.theplanet.com/facilities/tour/index.html
But you get the idea of what facilities might look like. I think I saw a Google presentation once.
__________________
Christina >>Forum Moderator<< Please do not PM me for support. The forum is here for that. |
|
#9
|
|||
|
|||
|
Be no good for you and your ciggies in there, Christina
Alan. |
|
#10
|
|||
|
|||
|
Impressive.
|