
Thread: How does Google index a page?

  1. #1
    Junior Member
    Join Date
    Mar 2011
    Posts
    5

    How does Google index a page?

    Please give me a short answer: what is Google indexing, and how does it work?

  2. #2
    Junior Member
    Join Date
    Jul 2011
    Location
    Bangalore, India
    Posts
    2
    Google has three distinct parts:

    * Googlebot, a web crawler that finds and fetches web pages.
    * The indexer that sorts every word on every page and stores the resulting index of words in a huge database.
    * The query processor, which compares your search query to the index and recommends the documents that it considers most relevant.
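
As a rough illustration of the indexer and query processor described above, here is a minimal sketch of an inverted index. It is not how Google actually does it: the `pages` dictionary, the example.com URLs, and the function names are all made up for the example, and there is no ranking at all, only a match on every query word.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each word to the set of page URLs that contain it (an inverted index)."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index

def search(index, query):
    """Return the sorted URLs that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)

# Hypothetical pages standing in for what a crawler has already fetched.
pages = {
    "http://example.com/cars": "Classic cars and vintage engines",
    "http://example.com/seo": "How search engines index web pages",
}
index = build_index(pages)
print(search(index, "engines"))          # both pages mention "engines"
print(search(index, "classic engines"))  # only the cars page has both words
```

A real query processor would also score and order the matches; this sketch only shows why looking a word up in a prebuilt index is cheaper than rereading every page for each query.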

  4. #3
    Moderator Tubby's Avatar
    Join Date
    Nov 2003
    Location
    Outback Queensland Australia
    Posts
    3,756
    If you consider Google to be just another user, one with an enhanced ability to recommend your site, the intricacies of Google do not really matter that much.

    How does it work? Three main steps:
    1. Do not treat Google like a moron (consider it to have an IQ equal to your users').
    2. Offer your users (this includes Google) a page of first-rate, on-topic substance.
    3. Provide some clear and logical pathways (links) to your page. Indexing will likely "just happen".


  5. #4
    Senior Member deepsand's Avatar
    Join Date
    May 2004
    Location
    State College, PA
    Posts
    16,482
    Quote Originally Posted by veecreate View Post
    Google has three distinct parts:

    * Googlebot, a web crawler that finds and fetches web pages.
    * The indexer that sorts every word on every page and stores the resulting index of words in a huge database.
    * The query processor, which compares your search query to the index and recommends the documents that it considers most relevant.
    It should be clarified that a crawler/robot/spider does not actively seek out new resources to be indexed, but simply fetches them on command from the indexing engine.

  6. #5
    Administrator weegillis's Avatar
    Join Date
    Oct 2003
    Posts
    5,793
    "What" is Google indexing?, or "What is Google indexing?" are two different questions.

    The what part is simple. Just about any and every piece of useful information it can find, and where it can be found.

    The logical path from a new website to its first robot crawl is: A. The site is launched. B. The site receives a link from another site that has been indexed and is likely to be crawled regularly; a robot finds that link and reports back to the indexing engine. The engine then adds the URL to its crawl queue and eventually sends a robot to give it a peek, starting with the robots.txt file. (This isn't literally true; the engine simply requests robots.txt from the domain.) Assuming that file is present and valid, it records the restrictions and goes about requesting the home page, usually index.html, and a short list of typical pages such as privacy, about, contact, and very likely the sitemap, according to what is and is not restricted by robots.txt.
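
To make the robots.txt step a little more concrete, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleBot` user agent, the example.com site, and the `plan_crawl` helper are assumptions for illustration only; nothing here reflects how Google's own queue is built.

```python
from collections import deque
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body; a real engine would request it from the domain first.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def plan_crawl(site, candidate_paths, robots_txt, user_agent="ExampleBot"):
    """Return the queue of URLs the robot is allowed to request, in order."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())  # record the restrictions
    queue = deque()
    for path in candidate_paths:
        url = site + path
        if rules.can_fetch(user_agent, url):  # skip anything robots.txt restricts
            queue.append(url)
    return queue

queue = plan_crawl(
    "http://example.com",
    ["/", "/index.html", "/about.html", "/private/notes.html", "/sitemap.xml"],
    ROBOTS_TXT,
)
print(list(queue))  # the /private/ page is left out of the queue
```

The restrictions are read once and then checked for every candidate URL, which mirrors the "records the restrictions and goes about requesting" order described above.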

    Now the site is discovered. The engine may at first request only whatever is on the domain root (domain name only, no target file) to record the response headers as well as pull the page. The responding document is cached and at some point parsed into the search index. Assuming the site passes the quality checks against the recommended guidelines, it begins to get more attention as it is added to subsequent queues and crawled more extensively over time. Indexed pages will be added to the queue as well, for recrawling.

    The process of caching, parsing and grading is covered by the search engine's own patents, and much of it is highly protected intellectual property, so we can't just go off and say they do this and that, and the other thing. We can easily suspect, though, that they parse out all the text, meaning any resource that carries text, such as CSS, JS, HTML, PDF, SWF, and so on, and drill down to what is actually text in the page components, or text in link phrases. They see HTML comments but grade them differently on their quality scale. They see class and id attributes, along with all the other attributes, since these are text too. They can understand CSS.

    What are the things on a web page worth indexing? Well, we hope everything, but logically we can look at the various elements and how they come into play. We can start with the HTML element TITLE. It is directly tied to the H1 element, and both are tied to the body text of the page. The body text (along with the other two) contains words, many of which occur regularly in search phrases. Then we have images and other media in our page. They can be indexed separately. Outbound and internal links can be indexed (and graded, ranked or some such rigor).
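
For anyone who wants to see those elements pulled out of a page, here is a minimal sketch with the standard `html.parser` module. The `PageExtractor` class and the sample HTML are invented for the example, and it is only a guess at the kind of parsing an engine might do: it collects TITLE, H1, body words and link targets, and ignores comments, attributes other than href, and media.

```python
from html.parser import HTMLParser

class PageExtractor(HTMLParser):
    """Collect the TITLE text, H1 text, body words, and link targets of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.h1 = ""
        self.body_words = []
        self.links = []
        self._open = []  # stack of currently open tag names

    def handle_starttag(self, tag, attrs):
        self._open.append(tag)
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in self._open:
            # Pop back to (and including) the matching open tag.
            while self._open.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if "title" in self._open:
            self.title += text
        elif "body" in self._open:
            if "h1" in self._open:
                self.h1 += text
            self.body_words.extend(text.split())

# Hypothetical page used only to exercise the extractor.
HTML = """<html><head><title>Classic Cars</title></head>
<body><h1>Classic Cars Directory</h1>
<p>Restored engines and parts. <a href="/engines.html">See engines</a></p>
<!-- comments are skipped by this sketch -->
</body></html>"""

page = PageExtractor()
page.feed(HTML)
print(page.title)
print(page.h1)
print(page.links)
print(page.body_words)
```

Note that the H1 words land in the body words as well, which is one simple way to picture the TITLE, H1 and body text being tied together.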

    How our pages get ranked is not for this discussion and is the point of this entire forum, but the above gives us an idea of what is indexed. Now, what is Google Indexing? Your guess is as good as mine.

  7. #6
    Administrator LD's Avatar
    Join Date
    Apr 2006
    Location
    Still the same.
    Posts
    4,271
    Since the red text message below is not being adhered to, this thread is closed.

    This thread has not had a response in the last two months. Consider starting a new thread.
    Last edited by LD; 07-17-2012 at 10:00 AM.
