WebProWorld
Search Engine Optimization Forum SEO is much easier with help from peers and experts! The WebProWorld SEO forum is for the discussion and exploration of various search engine optimization topics. Any non (engine) specific SEO or SEM topics should go here.

  #1 (permalink)  
Old 05-07-2009, 10:21 AM
WebProWorld New Member
 
Join Date: Feb 2008
Posts: 18
Vithe RepRank 0
Question SEO and PDFs

The company I work for is about to upload a shedload of PDFs to the website, and I would like them to be as optimised as possible. I've never optimised PDFs before, though.

I did search on this forum before posting but couldn't find any information about this topic at all.

I have a couple of questions which I hope you can help me answer:

Firstly, does Google read and rank pdfs pretty well?

Secondly, what are your best pdf optimisation tips?

Many thanks in advance for any help you can give me.
  #2 (permalink)  
Old 05-07-2009, 10:48 AM
kgun's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: May 2005
Location: Norway
Posts: 5,944
kgun RepRank 10
Default Re: SEO and PDFs

Quote:
Originally Posted by Vithe View Post
Firstly, does Google read and rank pdfs pretty well?
I don't think GoogleBOT is able to read PDF documents (that would imply an OCR-like scanning ability).

Ranking is another question. It is based on citations and references, which translate to IBLs (inbound links) on the web.

Quote:
Originally Posted by Vithe View Post
Secondly, what are your best pdf optimisation tips?
Write quality documents that other people link to with different anchor text.

Other members may say, submit, write about your documents etc. etc.

As always it is a question about semantics and context. Write and link where it is natural and useful for the surfer. Finding your own niche may be important.
__________________
Mini Network:: Financial information at your fingertips
Learn object oriented programming where it started

I will use a search engine before I ask dumb questions.
  #3 (permalink)  
Old 05-07-2009, 10:51 AM
Dubbya's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: Nov 2006
Location: Steinbach, Manitoba, Canada
Posts: 1,307
Dubbya RepRank 6
Default Re: SEO and PDFs

As long as the PDF is rasterized (flattened) to an image and the text is embedded, it's pretty much a cinch.

There are a few caveats though. If you save out a PDF and apply security preferences to it, the contents are encrypted and can't be read by the Search Engines.

If it doesn't matter that the end user can copy text or print the page, don't lock it up.
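A quick way to spot locked files before uploading a batch is to look for the /Encrypt entry that a secured PDF declares in its trailer. The sketch below is only a byte-level heuristic, not a real PDF parser; the function name and the two sample byte strings are made up for illustration.

```python
import re

def looks_encrypted(pdf_bytes: bytes) -> bool:
    """Heuristic: a secured PDF declares an /Encrypt entry in its trailer."""
    # Assumption: scanning the raw bytes is good enough for a quick check.
    # A proper PDF library would read the trailer dictionary instead.
    return re.search(rb"/Encrypt\b", pdf_bytes) is not None

# Hypothetical, heavily shortened samples standing in for real files.
plain_pdf = b"%PDF-1.4\ntrailer\n<< /Root 1 0 R /Size 4 >>\n%%EOF"
locked_pdf = b"%PDF-1.4\ntrailer\n<< /Root 1 0 R /Encrypt 5 0 R /Size 6 >>\n%%EOF"

for name, data in (("plain", plain_pdf), ("locked", locked_pdf)):
    print(name, "encrypted" if looks_encrypted(data) else "not encrypted")
```

For real files you would read the bytes with open(path, "rb").read() and pass them to the same function.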

Here are a couple of articles of interest:

SEO Your PDF's - Does This Work?

Optimizing PDFs for SEO

Good Luck!
  #4 (permalink)  
Old 05-07-2009, 10:55 AM
kgun's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: May 2005
Location: Norway
Posts: 5,944
kgun RepRank 10
Default Re: SEO and PDFs

Quote:
Originally Posted by Dubbya View Post
"Can the search engines read PDF files?

Yes, most of the major search engines now can read the basic contents of PDF files, though getting these pages to rank as well as HTML files is still questionable".

New to me.
  #5 (permalink)  
Old 05-07-2009, 11:17 AM
Dubbya's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: Nov 2006
Location: Steinbach, Manitoba, Canada
Posts: 1,307
Dubbya RepRank 6
Default Re: SEO and PDFs

Quote:
Originally Posted by kgun View Post
New to me.
We've all been seeing PDFs and content snippets listed alongside web pages in the SERPs for some time, so it would only stand to reason that the PDF files are being spidered and indexed as well.
  #6 (permalink)  
Old 05-07-2009, 11:53 AM
kgun's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: May 2005
Location: Norway
Posts: 5,944
kgun RepRank 10
Default Re: SEO and PDFs

Quote:
Originally Posted by Dubbya View Post
We've all been seeing PDFs and content snippets listed alongside web pages in the SERPs for some time, ...
  1. Spidered is different from being indexed.
  2. PDF files have been indexed for a long time. I thought that was based on IBLs only.
Quote:
Originally Posted by Dubbya View Post
... so it would only stand to reason that the PDF files are being spidered and indexed as well.
So it is based on opinions and not on facts, Dubbya?

Natural search term:

does googlebot spider pdf documents site:google.com

Last edited by kgun; 05-07-2009 at 11:56 AM.
  #7 (permalink)  
Old 05-07-2009, 01:01 PM
WebProWorld Pro
 
Join Date: Jul 2003
Posts: 117
Peter RepRank 2
Default Re: SEO and PDFs

Google does rank the PDF pages. However, it is very useful to input all those optional fields under File -> Properties to help placement, so you need to use Acrobat or a full-fledged PDF editor rather than just a PDF writer.

For a product-specific search term with 453k results, our PDF page ranks 14th on Google, and the HTML page is listed underneath it. It might rank higher if we had a better page title.

The quality of copy etc I believe still applies, along with the basics of keywords.
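To check which of those File -> Properties fields actually made it into a file, something like the following could work. It is a rough sketch under clear assumptions: it only finds fields stored as plain literal strings in uncompressed parts of the file (many PDFs use hex or UTF-16 strings, compressed object streams, or XMP metadata instead), and the function name and sample bytes are invented for the example.

```python
import re

# Fields commonly shown under File -> Properties.
INFO_FIELDS = ("Title", "Subject", "Keywords", "Author")

def read_info_fields(pdf_bytes: bytes) -> dict:
    """Return the Info fields that appear as plain literal strings."""
    found = {}
    for field in INFO_FIELDS:
        # Matches entries such as: /Title (Widget 3000 Data Sheet)
        pattern = rb"/" + field.encode("ascii") + rb"\s*\(([^()]*)\)"
        match = re.search(pattern, pdf_bytes)
        if match:
            found[field] = match.group(1).decode("latin-1")
    return found

# Hypothetical fragment of a PDF Info dictionary.
sample = (
    b"1 0 obj\n"
    b"<< /Title (Widget 3000 Data Sheet) /Keywords (widget, data sheet) >>\n"
    b"endobj"
)

fields = read_info_fields(sample)
missing = [name for name in INFO_FIELDS if name not in fields]
print(fields)   # {'Title': 'Widget 3000 Data Sheet', 'Keywords': 'widget, data sheet'}
print(missing)  # ['Subject', 'Author']
```

Running this over a folder of PDFs gives a quick list of documents that still need a title or keywords before upload.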
  #8 (permalink)  
Old 05-07-2009, 01:33 PM
kgun's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: May 2005
Location: Norway
Posts: 5,944
kgun RepRank 10
Default Re: SEO and PDFs

Quote:
Originally Posted by Peter View Post
Google does rank the PDF pages,
Based on spidered file content is my question.

Quote:
Originally Posted by Peter View Post
however it is very useful to input all those optional fields under File -> Properties to help placement, so you need to use Acrobat or a full-fledged PDF editor rather than just a PDF writer.
I have the 2007 professional version of Acrobat Reader. It offers the possibility to describe the subject, give keywords, etc.

I looked at some of the above hits that I suggested in my search term and could not find an article on Google.com that said that GoogleBot is able to scan PDF documents and rank content based on such a scanning.

Last edited by kgun; 05-07-2009 at 01:44 PM.
  #9 (permalink)  
Old 05-07-2009, 07:36 PM
WebProWorld Member
 
Join Date: Nov 2008
Location: South East USA
Posts: 42
seomagician RepRank 1
Default Re: SEO and PDFs

They definitely rank. I have a client that had several pdf files that had pretty high rankings. I actually spent time replacing those rankings with other html pages. The reason? The files were too big. It took forever to download them and people just bounced off the site. The moral of the story for me is to create pdf files that have a pretty fast download.
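To put a number on "took forever", the download time can be estimated from file size and connection speed. This is simple arithmetic rather than a measurement; the 12 MB file, the 1.5 Mbit/s link and the 10-second budget are made-up figures for illustration.

```python
def download_seconds(size_bytes: int, link_mbit_per_s: float) -> float:
    """Estimate transfer time, ignoring latency and protocol overhead."""
    bits = size_bytes * 8
    return bits / (link_mbit_per_s * 1_000_000)

def too_slow(size_bytes: int, link_mbit_per_s: float, budget_s: float = 10.0) -> bool:
    """True when the estimated download exceeds the (hypothetical) budget."""
    return download_seconds(size_bytes, link_mbit_per_s) > budget_s

# A 12 MB PDF on a 1.5 Mbit/s connection.
print(download_seconds(12_000_000, 1.5))  # 64.0
print(too_slow(12_000_000, 1.5))          # True
```

At that speed a file would have to stay under roughly 1.9 MB to meet a 10-second budget, which is one argument for compressing images before export.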
  #10 (permalink)  
Old 05-07-2009, 09:10 PM
WebProWorld Member
 
Join Date: Oct 2005
Location: Manchester
Posts: 83
Psychobel RepRank 0
Default Re: SEO and PDFs

I would tend to put the content in flat HTML pages and have a .pdf hanging off that for download if required.
Why?
HTML pages have your navigation around them and the opportunity to place calls to action. And you know HTML gets indexed and ranked
  #11 (permalink)  
Old 05-08-2009, 12:43 AM
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: Oct 2006
Posts: 1,029
innominds RepRank 5
Default Re: SEO and PDFs

Sorry for going against the topic. What should one do if PDFs are not to be indexed? Will zipping the file help?
  #12 (permalink)  
Old 05-08-2009, 01:32 AM
Banned
 
Join Date: Jul 2006
Posts: 73
sck4784 RepRank 0
Default Re: SEO and PDFs

Search engines can crawl & index PDF files. For optimizing PDF files, a few points are worth considering.

1. Use keyword-rich content in the PDF file.

2. Use headings, bold and italics properly in the PDF file (the equivalents of <H1>, <b> and <I> in HTML).

3. Include targeted keywords in the page URLs.

4. Use Acrobat 6.0 or a later version for creating PDF files.

5. Don't use too many graphics/images in PDF files.

6. Keep PDF files as short as possible.

These are some tips from my side.
  #13 (permalink)  
Old 05-08-2009, 01:53 AM
NetProwler's Avatar
WebProWorld Member
 
Join Date: Jan 2007
Posts: 97
NetProwler RepRank 2
Default Re: SEO and PDFs

@innominds:

When you are creating/saving your pdf file - select the security tab from the application you are using and select the 'Encrypt the PDF document' option which will prompt you for a password. Your saved file can't be indexed by any search engine.
  #14 (permalink)  
Old 05-08-2009, 02:12 AM
WebProWorld Member
 
Join Date: Jun 2004
Location: Cape Town
Posts: 34
Pico_Train RepRank 2
Default Re: SEO and PDFs

I just searched for an arbitrary PDF file on the web. I took a snippet of text from it and searched for that snippet in "" in Google. The document was found.
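The same snippet test can be repeated for many documents by building the quoted query programmatically. A minimal sketch, assuming the usual https://www.google.com/search?q=... address format; the function name is invented, and the optional site argument simply appends the site: operator used elsewhere in this thread.

```python
from typing import Optional
from urllib.parse import quote_plus

def exact_phrase_query(snippet: str, site: Optional[str] = None) -> str:
    """Build a search URL for an exact-phrase query, optionally limited to one site."""
    query = f'"{snippet}"'           # the quotes request an exact match
    if site:
        query += f" site:{site}"     # restrict results to a single domain
    return "https://www.google.com/search?q=" + quote_plus(query)

print(exact_phrase_query("managed care"))
# https://www.google.com/search?q=%22managed+care%22
print(exact_phrase_query("managed care", site="google.com"))
# https://www.google.com/search?q=%22managed+care%22+site%3Agoogle.com
```

Paste the resulting address into a browser. If the PDF shows up for a sentence that exists nowhere else on the web, that is at least circumstantial evidence that its text was read.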

Many PDF files are offered as HTML. Does that not imply that Google can read them?

I think they can. They don't rank as well, in my opinion, for reasons I don't understand and can't explain. Just an opinion there. They do rank though.

I will write a PDF file and link to it from one of my sites. I'll put arbitrary content in it and see what happens, as an experiment, because it could be useful in the future. Oh, and just to get KGUN to be a bit less confrontational in the future -
  #15 (permalink)  
Old 05-08-2009, 03:30 AM
ericajoieake's Avatar
WebProWorld Pro
 
Join Date: Feb 2008
Posts: 109
ericajoieake RepRank 3
Default Re: SEO and PDFs

Nowadays Google can crawl PDF files and rank them.
__________________
Club Flyers
California Web Design
  #16 (permalink)  
Old 05-08-2009, 04:49 AM
dburdon's Avatar
WebProWorld 1,000+ Club
 
Join Date: Oct 2004
Location: Kent, England
Posts: 1,461
dburdon RepRank 2
Default Re: SEO and PDFs

Google has been reading and ranking PDFs for years.

Optimising for a pdf is the same as optimising for any other form of document. Think target market, do keyword research, build keywords into content, build content into meta tags.

I've got a 2005 pdf document out there that still - rather embarrassingly - ranks highly.
__________________
Simply Clicks | SEO | SEO Training| Pay Per Click Advertising | Search Engine Powered Marketing
  #17 (permalink)  
Old 05-08-2009, 04:55 AM
WebProWorld New Member
 
Join Date: Jul 2008
Posts: 11
m-chapman RepRank 0
Cool Re: SEO and PDFs

PDFs can be optimised and do rank. I am just reviewing the rankings report for one of our clients. This firm publishes a regular newsletter that we re-save in Adobe Acrobat with title, description, keyword tags. Links can be added to the text in the document.

How well does it work? Currently, this client has several PDFs on page 1 and many others in the top 50 results.

mark chapman.
  #18 (permalink)  
Old 05-08-2009, 05:18 AM
WebProWorld Pro
 
Join Date: Mar 2009
Location: Cardiff, UK
Posts: 174
nickoran RepRank 2
Default Re: SEO and PDFs

Quote:
Originally Posted by sck4784 View Post
Search engines can crawl & index PDF files. For optimizing PDF files, a few points are worth considering.

1. Use keyword-rich content in the PDF file.

2. Use headings, bold and italics properly in the PDF file (the equivalents of <H1>, <b> and <I> in HTML).

3. Include targeted keywords in the page URLs.

4. Use Acrobat 6.0 or a later version for creating PDF files.

5. Don't use too many graphics/images in PDF files.

6. Keep PDF files as short as possible.

These are some tips from my side.
Really handy to know, thanks!
__________________
Peace, through superior firepower. "Roach" SAP Jobs : Search Engine Optimisation : SAP
  #19 (permalink)  
Old 05-08-2009, 05:34 AM
WebProWorld Pro
 
Join Date: Apr 2009
Posts: 100
loosapphire RepRank 1
Default Re: SEO and PDFs

Firstly, does Google read and rank pdfs pretty well?

Yes, most of the major search engines now can read the basic contents of PDF files, though getting these pages to rank as well as HTML files is still questionable.


Secondly, what are your best pdf optimisation tips?

The simple answer is, yes. The title tag and body copy can still be optimized and the major search engines will index it accordingly. As for the Keywords and Description meta tags, Google ignores them in PDFs just as it does in HTML documents, and Yahoo!, which does use the description tag, is only halfway to where it needs to be.
__________________
Cubic zirconia | Chinese medicine
  #20 (permalink)  
Old 05-08-2009, 05:42 AM
WebProWorld Pro
 
Join Date: Dec 2003
Location: Eastleigh, Hampshire, UK
Posts: 185
Clarrie RepRank 3
Default Re: SEO and PDFs

We run a site for a client that has a lot of PDF White Paper downloads. All the main search engines, including Google can index PDFs.

Google doesn't really seem to give them as much priority in SERPS as HTML - they mostly come up in the results for longer tail / very topic specific searches, but they do still get returned some of the time.

This site also has Google site search in it, and the PDFs routinely show in the results.

Personally I don't think Google has put the same sort of effort into analysing / indexing the PDF format with a view to ranking, or maybe PDFs don't tend to get as much PageRank - probably fewer links etc.

Have found that Live and Yahoo tend to show PDFs more often - I'd say Live are currently the best at being able to deliver relevant PDFs in results.

Quote:
Originally Posted by seomagician View Post
They definitely rank. I have a client that had several pdf files that had pretty high rankings. I actually spent time replacing those rankings with other html pages. The reason? The files were too big. It took forever to download them and people just bounced off the site. The moral of the story for me is to create pdf files that have a pretty fast download.
The way we get round that is to have an HTML intro page - this has an Abstract and / or outline of the content in the PDF together with other relevant data, and a download button link to the PDF.
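An intro page of that kind is easy to generate from a small record per document. The sketch below is only one possible layout; the function name, the field names and the example white paper are assumptions for illustration, not Clarrie's actual template.

```python
from html import escape

def intro_page(title: str, abstract: str, pdf_url: str, size_mb: float) -> str:
    """Render a minimal HTML intro block with an abstract and a download link."""
    return (
        "<article>\n"
        f"  <h1>{escape(title)}</h1>\n"
        f"  <p>{escape(abstract)}</p>\n"
        # Showing the file size lets visitors judge the download up front.
        f'  <a href="{escape(pdf_url, quote=True)}">Download the PDF ({size_mb:.1f} MB)</a>\n'
        "</article>"
    )

# Hypothetical white paper record.
print(intro_page(
    title="Storage & Backup White Paper",
    abstract="An outline of the options covered in the full document.",
    pdf_url="/papers/storage-backup.pdf",
    size_mb=2.34,
))
```

The HTML page carries the navigation and the indexable summary, while the PDF stays one click away for readers who want the full document.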
__________________
Clarrie
www.dvisions.co.uk - lose the camouflage and stand out...
  #21 (permalink)  
Old 05-08-2009, 05:49 AM
WebProWorld New Member
 
Join Date: Feb 2008
Posts: 18
Vithe RepRank 0
Default Re: SEO and PDFs

Many thanks for all the replies. By the looks of the webpages I had been looking at before posting, it seems that this topic had been overlooked for a while - nice to get a fresh view on it.

Thanks for the optimisation tips too, very useful and I will definitely be putting them into practice!

Edit: just seen Clarrie's post. That format was exactly what I had planned - a list of optimised intro snippets on one page with the PDFs all linked off.
  #22 (permalink)  
Old 05-08-2009, 08:23 AM
kgun's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: May 2005
Location: Norway
Posts: 5,944
kgun RepRank 10
Default Re: SEO and PDFs

Quote:
Originally Posted by Pico_Train View Post
I just searched for an arbitrary PDF file on the web. I took a snippet of text from it and searched for that snippet in "" in Google. The document was found.
Give us the proof please.

That is no proof that PDF documents are crawled by GoogleBOT.
  1. Can anybody explain why?
  2. Why is it difficult to find information / evidence on Google's own site?
Quote:
Originally Posted by Pico_Train View Post
I will write a PDF file and link to it from one of my sites. I'll put arbitrary content in it and see what happens, as an experiment, because it could be useful in the future. Oh, and just to get KGUN to be a bit less confrontational in the future -
I have still not got a good enough answer. Show me a citation from Google's official site please, not from their blogs, groups or third-party sites. There is enough speculation and opinion on the internet. So far, this thread is a good example.

Last edited by kgun; 05-08-2009 at 08:35 AM.
  #23 (permalink)  
Old 05-08-2009, 09:00 AM
WebProWorld Member
 
Join Date: Oct 2005
Posts: 30
gavinscott RepRank 0
Default Re: SEO and PDFs

From a user point of view I find pdfs frustrating; on my set-up (Mac Tiger/Safari) there's around a 50/50 success rate for them loading within the browser (I often have to download them first and then open them in Acrobat), and when they do work they load very slowly. Yes, this may be a browser issue, but there are many Safari users out there.

It doesn't matter what Google thinks if your visitors are put off by slow load times or documents failing to load altogether. If it is too great a task to convert all the documents to HTML it might be worth offering the first page of each in HTML with a link to the full document so your visitors know it's worth the trouble.
  #24 (permalink)  
Old 05-08-2009, 09:05 AM
kgun's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: May 2005
Location: Norway
Posts: 5,944
kgun RepRank 10
Default Re: SEO and PDFs

Quote:
Originally Posted by gavinscott View Post
It doesn't matter what Google thinks if your visitors are put off by slow load times or documents failing to load altogether. If it is too great a task to convert all the documents to HTML it might be worth offering the first page of each in HTML with a link to the full document so your visitors know it's worth the trouble.
That is one good example of why you find the same keywords in a PDF document as in a search query.
  #25 (permalink)  
Old 05-08-2009, 11:29 AM
WebProWorld New Member
 
Join Date: Nov 2005
Location: Wilmington, DE
Posts: 4
wacjr66 RepRank 0
Default Re: SEO and PDFs

Did a test, searched for... Impact of Managed Care in the Developmental Disabilities Sector ...and PDFs come up peppered throughout the SERPs. Looks like copy is cited and Google gives the option to view the PDF as HTML. Don't know if this would prove that some level of "spidering" occurs.

Some of our clients have successful, high level traffic for their PDFs. We do encourage them to develop HTML instead of, or in addition to and in support of, the PDFs. IMO HTML equivalents will fare better in results.
  #26 (permalink)  
Old 05-08-2009, 11:37 AM
kgun's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: May 2005
Location: Norway
Posts: 5,944
kgun RepRank 10
Default Re: SEO and PDFs

I googled:

Impact of Managed Care in the Developmental Disabilities Sector

Result HTML document starting like this:

Ten Dimensions of Public-Sector Managed Care

Michael A. Hoge, Ph.D., Selby Jacobs, M.D., Neil M. Thakur, M.Phil. and Ezra E.H. Griffith, M.D.

First hit:

Ten Dimensions of Public-Sector Managed Care -- Hoge et al. 50 (1): 51 -- Psychiatr Serv

Second hit:

http://psychservices.psychiatryonlin...nt/50/1/51.pdf

Do you see any similarities?

<quote>
Looks like copy is cited and Google gives the option to view the PDF as HTML. Don't know if this would prove that some level of "spidering" occurs.
</quote>
My bolding.

At least not the above example.

This example comes under the same category as that mentioned in my above post.

I am sure that there are many more complex examples if text from a PDF document is written on n other sites on the internet. Do you find that likely?

Can

<link rel="Canonical" href="http://www.yourdomain.com">

prevent it? No, it cannot, since the (stolen / duplicated) HTML document can rank higher as "the original document" (especially if the PDF version cannot be spidered). So far I have seen no proof that it can.

Last edited by kgun; 05-08-2009 at 11:55 AM.
  #27 (permalink)  
Old 05-08-2009, 12:05 PM
chrisJumbo's Avatar
WebProWorld Veteran
 
Join Date: Oct 2005
Location: California
Posts: 342
chrisJumbo RepRank 3
Default Re: SEO and PDFs

I ran the suggested search and pulled up this link:
NLCDD Policy Insights Bulletin (March 2009).pub

At the top of the page, Google inserted a message that it automatically creates HTML versions of PDF documents that it crawls. That should be enough proof.

However, the PDF version loaded so much slower, and I believe users would appreciate having an HTML version more. Also, as others noted, the HTML version can have your navigation, which I also believe gives the user a better experience, especially when it comes to a search. If you open the PDF from the search, it is much more difficult to get to the home page or other pages on the site.

cd :O)
__________________
CD Rates | CD Rates Blog | Banking Online
  #28 (permalink)  
Old 05-08-2009, 12:13 PM
kgun's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: May 2005
Location: Norway
Posts: 5,944
kgun RepRank 10
Default Re: SEO and PDFs

Quote:
Originally Posted by chrisJumbo View Post
At the top of the page, Google inserted a message that it automatically creates HTML versions of PDF documents that it crawls. That should be enough proof.
My bolding. Proof to whom?

<quote>
This is the html version of the file http://www.nasddds.org/pdf/PolicyInsightsBulletin(March2009).pdf.
Google automatically generates html versions of documents as we crawl the web.
</quote>
Where are these documents placed?

I use Opera's principle and do not rely on any site on the internet. Do you rely on the above?

Compare the hits:

Google automatically generates html versions of documents as we crawl the web

Google automatically generates html versions of documents as we crawl the web site:google.com

Do you find any proof on google.com?

Last edited by kgun; 05-08-2009 at 12:23 PM.
  #29 (permalink)  
Old 05-08-2009, 12:32 PM
WebProWorld Pro
 
Join Date: Jul 2003
Posts: 117
Peter RepRank 2
Default Re: SEO and PDFs

Does this link help answer it?

Official Google Webmaster Central Blog: First date with the Googlebot: Headers and compression

In there is some text

Googlebot: Website, let me give a bit more background. After actually downloading a file, I use the Content-Type header to check whether it really is HTML, an image, text, or something else. If it's a special data type like a PDF file, Word document, or Excel spreadsheet, I'll make sure it's in the valid format and extract the text content. Maybe it has a virus; you never know. If the document or data type is really garbled, there's usually not much to do besides discard the content.
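The check described in that passage (trust the Content-Type header, then confirm the file really is what it claims to be) can be illustrated with a few lines. This is a toy sketch, not Googlebot's actual logic; the function name and return labels are invented, and the only real-world fact relied on is that a PDF file begins with the marker %PDF-.

```python
def classify_download(content_type: str, body: bytes) -> str:
    """Decide how to treat a fetched file based on its header and first bytes."""
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    declared = content_type.split(";")[0].strip().lower()
    if declared == "application/pdf":
        # A file that claims to be a PDF but lacks the marker is treated as garbled.
        return "pdf" if body.startswith(b"%PDF-") else "discard"
    if declared == "text/html":
        return "html"
    return "other"

print(classify_download("application/pdf", b"%PDF-1.4\n..."))     # pdf
print(classify_download("application/pdf", b"<html>oops</html>"))  # discard
print(classify_download("text/html; charset=utf-8", b"<html>"))    # html
```

For site owners the practical point is to make sure the server sends application/pdf for PDF files, since a mislabelled file may never reach the text-extraction step.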

My understanding is that Googlebot visits a site and crawls all the links for file types. It then decides to look at particular types of files that claim to be of a type, interrogates the file further by looking at the content, and then decides via the algorithm how to rank it or include it in the index.

Is this what you are looking for, Kgun? Or am I missing the point again?
  #30 (permalink)  
Old 05-08-2009, 12:38 PM
kgun's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: May 2005
Location: Norway
Posts: 5,944
kgun RepRank 10
Default Re: SEO and PDFs

Quote:
Originally Posted by Peter View Post
My understanding is that Googlebot visits a site and crawls all the links for file types. It then decides to look at particular types of files that claim to be of a type, interrogates the file further by looking at the content, and then decides via the algorithm how to rank it or include it in the index.
Very interesting if that is correct.

Quote:
Originally Posted by Peter View Post
Is this what you are looking for, Kgun? Or am I missing the point again?
No, reread my above posts. I look for evidence from google.com.

I have seen enough nonsense from informal Google sites. Here is an example:

AdSense: Support bad without responsibility?

More precisely here: http://www.google.com/support/forum/...c9227b77&hl=en

That is even from the Google.com forum. The people participating there are not Google employees. They have no responsibility on behalf of Google as far as I know.

Last edited by kgun; 05-08-2009 at 12:45 PM.
  #31 (permalink)  
Old 05-08-2009, 12:48 PM
flhu's Avatar
WebProWorld Member
 
Join Date: May 2008
Location: NY
Posts: 97
flhu RepRank 2
Default Re: SEO and PDFs

Proof that google understands and spiders pdfs:

1. search google for manual .pdf
2. Click any of the pdf result's VIEW AS HTML
3. The resulting link is the PDF, presented as HTML, with all search words highlighted served from google's site. ( ie: The Manual The attached manual was located by the Manchester (England) Metropolitan Police during a search an al Qaeda) (Google is google)

As far as ranking and optimizing for spidering, that's a different story.

Not all PDFs are just a collection of images. Those originally generated directly from publishing layout software usually contain the text content with instructions on how to position and display it. If your PDFs are scans, consider running them through OCR so the text is in the file as, well, text. Otherwise, you could also use related text around the link and descriptive link text pointing to the PDF, but that won't be nearly as effective.

Here's some fun: find a pdf that has text that can be highlighted as text, change its extension to .txt and load it in your favorite text editor.
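The same experiment can be scripted to sort a pile of PDFs into "has real text" and "probably a scan". What follows is a rough heuristic under stated assumptions: text PDFs normally reference fonts (/Font) while image-only scans usually do not, and the Tj operator is only visible when the content stream is stored uncompressed, which many files are not. Function names and sample bytes are made up.

```python
import re

def has_text_hints(pdf_bytes: bytes) -> bool:
    """Heuristic: PDFs with real text reference fonts; pure scans usually don't."""
    return b"/Font" in pdf_bytes

def visible_strings(pdf_bytes: bytes) -> list:
    """Collect literal strings drawn with the Tj operator in uncompressed streams."""
    return [m.decode("latin-1") for m in re.findall(rb"\(([^()]*)\)\s*Tj", pdf_bytes)]

# Hypothetical fragments: one text page, one scanned page drawn as an image.
text_pdf = b"<< /Font << /F1 5 0 R >> >>\nBT /F1 12 Tf (Hello PDF) Tj ET"
scan_pdf = b"<< /XObject << /Im0 7 0 R >> >>\nq 612 0 0 792 0 0 cm /Im0 Do Q"

print(has_text_hints(text_pdf), visible_strings(text_pdf))  # True ['Hello PDF']
print(has_text_hints(scan_pdf), visible_strings(scan_pdf))  # False []
```

Files that fail both checks are the candidates for an OCR pass before they go on the site.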
__________________
I liken SEO to voodoo and make a sacrifice of rum and decapitate a chicken to Papa Legba, spirit of communications and crossroads, before every site launch.

Last edited by flhu; 05-08-2009 at 01:01 PM.
  #32 (permalink)  
Old 05-08-2009, 01:02 PM
WebProWorld Pro
 
Join Date: Jul 2003
Posts: 117
Peter RepRank 2
Default Re: SEO and PDFs

OK, my link was to the Google webmaster blog; however, the link below is directly on the Google.com website.

Site not appearing in search results, or appearing lower - Webmasters/Site owners Help

That isn't a Google forum message answer, so I take it as an "official employee" answer.

The relevant part is

Googlebot processes each of the pages it crawls in order to compile a massive index of all the words it sees and their location on each page. In addition, we process information included in key content tags and attributes, such as title tags and alt attributes. Google can process many types of content. However, while we can process HTML, PDF, and Flash files, we have a more difficult time understanding (e.g. crawling and indexing) other rich media formats, such as Silverlight.
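The "index of all the words it sees and their location on each page" in that quote is what is usually called an inverted index. A minimal sketch of the idea follows; it says nothing about how Google really stores things, and the two page texts are invented. The point is that once text has been extracted, a PDF and an HTML page end up in the same structure.

```python
import re
from collections import defaultdict

def build_index(pages: dict) -> dict:
    """Map each word to the (url, position) pairs where it occurs."""
    index = defaultdict(list)
    for url, text in pages.items():
        words = re.findall(r"[a-z0-9]+", text.lower())
        for position, word in enumerate(words):
            index[word].append((url, position))
    return dict(index)

# Hypothetical extracted text from one HTML page and one PDF.
pages = {
    "/brief.html": "Managed care report",
    "/brief.pdf": "Care in the developmental disabilities sector",
}

index = build_index(pages)
print(index["care"])  # [('/brief.html', 1), ('/brief.pdf', 0)]
```

A phrase search then amounts to finding documents where the words occur at consecutive positions, which would explain why an exact snippet from a PDF can be found.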
  #33 (permalink)  
Old 05-08-2009, 01:07 PM
kgun's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: May 2005
Location: Norway
Posts: 5,944
kgun RepRank 10
Default Re: SEO and PDFs

Google:

"C++ Builder 2009 Professional::: Getting started PDF download."

Results 1 - 1 of 1 for "C++ Builder 2009 Professional::: Getting started PDF download.". (0.33 seconds)

Learn object oriented programming, at OopSchool.com

C++ Builder 2009 Professional::: Getting started PDF download. Kjell Gunnar Bleivik 05.01.2009::: Computing or drawing? Today, French-Russian mathematician ...
Learn object oriented programming, at OopSchool.com -

It cannot be viewed as an HTML document today, May 8, 2009.
  1. I am not convinced.
  2. If it is correct, it is not the default that Google automatically generates html versions of documents as we crawl the web.
Here is my robots.txt

User-agent: *
Disallow: /error_log
Disallow: /.ftpquota
Disallow: /cgi-bin/
Disallow: /include/
Disallow: /javascript/
Disallow: /styling/
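Whether rules like these keep a crawler away from a given PDF can be checked with the robots.txt parser in Python's standard library. A small sketch, assuming a placeholder domain (example.com) and a made-up PDF path; only rules whose paths begin with "/" are used here because the parser matches on URL path prefixes. It shows how this particular parser reads the rules, which is not necessarily identical to how every search engine does.

```python
from urllib.robotparser import RobotFileParser

# A subset of the rules above, written with leading slashes.
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /include/
""".splitlines()

parser = RobotFileParser()
parser.parse(RULES)

def allowed(url: str) -> bool:
    """True if a crawler identifying as Googlebot may fetch the URL."""
    return parser.can_fetch("Googlebot", url)

print(allowed("http://example.com/cgi-bin/form.pl"))            # False
print(allowed("http://example.com/docs/getting-started.pdf"))   # True
```

Nothing in these rules blocks PDF files, so a PDF that fails to show up as crawled cannot be blamed on this robots.txt.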

and here

#The following line allow html extension pages to act as php pages
AddType application/x-httpd-php .php .html .htm
#The following line removes the identifier telling the user that the page uses PHP
#Header unset X-Powered-By
#Only include this line once to enable the rewriting engine
RewriteEngine on
# Restricts access
## File paths are relative to the Document Root (/)
# '404 Not Found' error
ErrorDocument 404 /404.htm
# '403 Forbidden' error
ErrorDocument 403 /my.htm
# '401 Unauthorized' error
ErrorDocument 401 /401.htm
# Or..
# ErrorDocument 401 "The webserver could not authorise you for content access.
order deny,allow
allow from all

is my addon domain .htaccess

Here are the most important parts of my main domain .htaccess

#The following line allow html extension pages to act as php pages
#AddType application/x-httpd-php .php .html
#The following line removes the identifier telling the user that the page uses PHP
#Header unset X-Powered-By
#Only include this line once to enable the rewriting engine
RewriteEngine on
# Restricts access
## File paths are relative to the Document Root (/)
# '404 Not Found' error
ErrorDocument 404 /404.htm
# '403 Forbidden' error
ErrorDocument 403 "Sorry: We have no capacity to allow you access now. Please, try later.
#ErrorDocument 403 "Sorry: We are upgrading our forum. Please, try later.
# Or..
#ErrorDocument 403 /403.htm
# '401 Unauthorized' error
ErrorDocument 401 "The webserver could not authorise you for content access.
# Or..
#ErrorDocument 401 /401.htm
#
# Managing server access
#
<Files "config.php">
Order Allow,Deny
Deny from All
</Files>
#
<Files "common.php">
Order Allow,Deny
Deny from All
</Files>
order deny,allow
deny from all
#
# White list start

..............................

# White list end
# The next line is commented out when the above white list is used.
allow from all
__________________
Mini Network:: Financial information at your fingertips
Learn object oriented programming where it started

I will use a search engine before I ask dumb questions.

Last edited by kgun; 05-08-2009 at 01:41 PM.
  #34 (permalink)  
Old 05-08-2009, 01:07 PM
WebProWorld New Member
 
Join Date: Nov 2005
Location: Wilmington, DE
Posts: 4
wacjr66 RepRank 0
Default Re: SEO and PDFs

You're right kgun, it is difficult to find definitive documentation from Google that they officially crawl the content of PDF files and rank specifically on content instead of IBL influence. I still believe they can do it.

There are only two links to the PDF at www dot nlcdd dot org/managedcare/policy-bulletin-short.pdf. One is from the site www dot nlcdd dot org, as follows:

[a href="www dot nlcdd dot org/managedcare/policy-bulletin-short.pdf" target="_blank" title="PDF file opens in a new window."][img src="images/nlcdd-policy.gif" alt="NLCDD Policy Brief on Managed Care in DD Field - Click Here!" width="163" height="181" id="brief" /][/a] (Linked from an image, not text. The alt attribute has "Managed Care in DD Field", which is not an exact match to the search phrase. That may have something to do with IBL, though I doubt it.)

and one from a larger PDF at www dot nlcdd dot org/pdf/PolicyInsightsBulletin(March2009).pdf

--from PDF doc--

"...which is posted on the website of the National Leadership Consortium on Developmental Disabilities at www dot nlcdd dot org/managedcare."

So why would a search for "managed care in the developmental disabilities sector" bring this PDF up as #2 in SERP with snip of content including the phrase if Google hadn't crawled the document in some way?
  #35 (permalink)  
Old 05-08-2009, 01:18 PM
kgun's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: May 2005
Location: Norway
Posts: 5,944
kgun RepRank 10
Default Re: SEO and PDFs

Quote:
Originally Posted by wacjr66 View Post
So why would a search for "managed care in the developmental disabilities sector" bring this PDF up as #2 in SERP with snip of content including the phrase if Google hadn't crawled the document in some way?
  1. For years, desktop search tools have had the ability to search inside some types of documents.
  2. Screen readers may have functions that a browser doesn't have. Screen readers are for disabled people.
  3. Google may give extra service to disabled people based on keywords.
  #36 (permalink)  
Old 05-08-2009, 01:29 PM
chrisJumbo's Avatar
WebProWorld Veteran
 
Join Date: Oct 2005
Location: California
Posts: 342
chrisJumbo RepRank 3
Default Re: SEO and PDFs

Quote:
Originally Posted by kgun View Post
My bolding. Proof to whom?

Do you find any proof on google.com?
I searched using Google.com. The message was inserted by google.com.

I typed "google indexing pdf" into Google's search engine and found tons of links from sites that I would consider authoritative:
Google Now Indexing Text Within Scanned Adobe PDF Files
Official Google Blog: A picture of a thousand words?
Google Does PDF & Other Changes - Search Engine Watch (SEW)

So, Kgun, I'm not sure what "proof" you are looking for, but the evidence seems spot on to me.

But, like I said, I think the user experience is better if the pdf is converted into html and then the pdf can be offered as a download for a potentially fancier, portable presentation/document.
cd :O)
__________________
CD Rates | CD Rates Blog | Banking Online
  #37 (permalink)  
Old 05-08-2009, 01:30 PM
WebProWorld New Member
 
Join Date: Nov 2005
Location: Wilmington, DE
Posts: 4
wacjr66 RepRank 0
Default Re: SEO and PDFs

When a client insists on a PDF I will continue to suggest no security measures on the doc and that it have great content full of targeted keywords/phrases and links to their site(s), if possible. And I will stick to my theory that Google can crawl and index PDFs based on that content.
  #38 (permalink)  
Old 05-08-2009, 02:56 PM
WebProWorld Pro
 
Join Date: Jun 2008
Location: Leeds, West Yorkshire
Posts: 125
cbosleeds RepRank 1
Default Re: SEO and PDFs

I have some information on SEO for PDFs and a video here: Link building with PDFs


To be honest, after reading the above I'm not sure what damage password protecting a PDF might do, so maybe ignore that bit.
  #39 (permalink)  
Old 05-08-2009, 04:52 PM
WebProWorld Member
 
Join Date: Jun 2004
Location: Cape Town
Posts: 34
Pico_Train RepRank 2
Default Re: SEO and PDFs

The phrase I searched for was at the bottom of the pdf. There were only 2 results returned. First place was the document found.

I didn't post links because in the other forums I frequent it is frowned upon.

PDFs are crawled to be converted into HTML, they are indexed and they are ranked.
  #40 (permalink)  
Old 05-08-2009, 08:29 PM
texxs's Avatar
WebProWorld Veteran
 
Join Date: Jul 2005
Location: Somewhere in scrub of Florida
Posts: 391
texxs RepRank 2
Default Re: SEO and PDFs

Please do the internet community a favor and talk your client out of using pdf's.

They suck. And they just keep getting worse.

Here are a couple of links:

pdfs suck - Yahoo! Search Results

(I changed it to the search engine results because there are just so many sites out there explaining the numerous ways PDFs aren't fit for use except in very, very limited circumstances.)
__________________
Take a break and watch some stupid video clips
  #41 (permalink)  
Old 05-09-2009, 02:03 AM
lindaczelusniak's Avatar
WebProWorld New Member
 
Join Date: Oct 2005
Posts: 3
lindaczelusniak RepRank 0
Default Re: SEO and PDFs

I agree with texxs; PDF's are awful as web content. It seems everyone on this post is off on academic discussions on PDF's getting crawled or not when the first question to Vithe should be why all the PDFs?

PDFs were never meant to replace HTML, and they just break people's surfing experience. I often see them used out of pure laziness; people don't want to recreate marketing material in both print and online formats...

Now, for manuals and other documents that mostly get printed, yeah, use PDF's but for web content they are the ultimate party poopers
  #42 (permalink)  
Old 05-09-2009, 09:45 AM
WebProWorld New Member
 
Join Date: Feb 2008
Posts: 18
Vithe RepRank 0
Default Re: SEO and PDFs

Well to give you a bit of background, my company has never uploaded its press releases to the web so we have loads (read: like 60) stored up.

My choices seem to be a) upload them as they are or b) take the content and make a load of brand new html pages.

Unfortunately I'm not that technical - I don't ever create new pages, I have to get one of the IT guys to do that - but I can add to existing pages through the CMS.

So, option a) would be fine for me to do because I can upload the pdfs using our FTP, whilst option b) would involve having to get one of our technical guys to spend ages creating new pages.

What I was hoping was that PDFs could be read by search engines quite well, so option a) would suffice and I could do the work myself. This is why I was planning to go down the snippet + PDF route.

I realise that this might seem a bit of a lame reason - and probably makes me look quite daft for those of you with much more technical knowledge - but the practical side of it is quite important for me.

I really appreciate all of the comments that have been made here and I can see the benefits of adding the page in proper online format. I'm not keen on PDFs when I'm surfing the web either.

I think I will try to go down the html route as it really seems that that is going to be the best for both search engines and visitors. Pity the poor overworked technical team

Thanks for all your comments. I hope you don't think me too naive but my background is mostly as an SEO copywriter and I'm having to do some catch up on the technical side of things.
  #43 (permalink)  
Old 05-09-2009, 11:33 AM
Peter (IMC)'s Avatar
WebProWorld MVP
WebProWorld MVP
 
Join Date: Dec 2003
Posts: 1,485
Peter (IMC) RepRank 4
Default Re: SEO and PDFs

Quote:
Originally Posted by Vithe View Post
The company I work for are about to upload a shedload of pdfs to the website and I would like them to be as optimised as possible. I've never gone about optimising pdfs before though.

I did search on this forum before posting but couldn't find any information about this topic at all.

I have a couple of questions which I hope you can help me answer:

Firstly, does Google read and rank pdfs pretty well?

Secondly, what are your best pdf optimisation tips?

Many thanks in advance for any help you can give me.
If the PDFs contain real content (i.e. they're not just technical data sheets and a bunch of images), then I would consider the option to create HTML versions of these documents. Each page in the PDF becomes a page in your website, with "next page" and "previous page" links and a list of links to all pages in the left column.

Doing this will give you the best result, but it might be an awful lot of work.
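
The navigation described above can be generated rather than written by hand. What follows is only a minimal sketch in Python: the file naming scheme (page-1.html, page-2.html, ...), the CSS class names, and the function names are assumptions made for illustration, not something from this thread or from any particular CMS.

```python
from html import escape


def page_filename(index):
    """Hypothetical naming scheme: page-1.html, page-2.html, ..."""
    return f"page-{index + 1}.html"


def build_navigation(titles, current):
    """Return the navigation HTML for the page at position `current` (0-based)."""
    # List of links to every page, meant for the left column.
    items = "".join(
        f'<li><a href="{page_filename(i)}">{escape(title)}</a></li>'
        for i, title in enumerate(titles)
    )
    # "Previous page" and "Next page" links, omitted at either end.
    links = []
    if current > 0:
        links.append(f'<a href="{page_filename(current - 1)}">Previous page</a>')
    if current < len(titles) - 1:
        links.append(f'<a href="{page_filename(current + 1)}">Next page</a>')
    pager = " | ".join(links)
    return f'<ul class="toc">{items}</ul><p class="pager">{pager}</p>'


if __name__ == "__main__":
    # Example titles are made up; use one title per page of the PDF.
    titles = ["Introduction", "Details", "Contact"]
    for position in range(len(titles)):
        print(page_filename(position))
        print(build_navigation(titles, position))
```

The same snippet can be pasted into each generated page, so the list of pages only has to be maintained in one place.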
__________________
FREE SEO ! Really? YES! All you have to do is implement it!
Follow me on Twitter PeterIMC
  #44 (permalink)  
Old 05-09-2009, 12:07 PM
texxs's Avatar
WebProWorld Veteran
 
Join Date: Jul 2005
Location: Somewhere in scrub of Florida
Posts: 391
texxs RepRank 2
Red face Re: SEO and PDFs

Quote:
Originally Posted by Vithe View Post
My choices seem to be a) upload them as they are or b) take the content and make a load of brand new html pages.
There's software that converts .pdf's to html automatically, takes less than a second per page usually.

Hate to keep spouting off negativity, but PDFs aren't good for printing either. The only time you should use them is when you are required to have an encrypted document that the average person can't edit, as in a contract. Even then you should take the time to learn how to "lock" it in the software you created the document with in the first place.

The whole PDF thing just doesn't make sense to use EVER in my book:

Here's a typical efficient workflow:
  • Open MS Word ($200) Or OpenOffice (free)
  • Create document
  • Lock document , save it
  • Distribute document
  • Need to find a document out of the hundreds or thousands you have? Easy, just use any normal search tool, like Windows Search or Google Desktop (not recommended because it's spyware)


Here's the PDF workflow:
  • Open MS Word ($200) Or OpenOffice (free)
  • Create document, save it
  • Open document with Adobe Acrobat ($300)
  • Create a pdf
  • Distribute document
  • If distributing via the internet, pay extra for bandwidth, and live with the fact that you are most definitely crashing computers left and right as well as slowing down the rest of your web site.
  • Answer complaints about how Adobe Reader crashes your users' computers
  • Always wonder: did they read my material or did it crash their computer?
  • Always wonder: do they think I don't know tech because I'm using PDFs?
  • Need to find a document out of the hundreds or thousands you have? You're out of luck, unless you know the file name.

Now why does anyone use PDF's again?

Please, there must be some reason people use PDFs? Is it really because they don't want to take 5 minutes to learn how to lock documents in their word processor? Seriously?
  #45 (permalink)  
Old 05-09-2009, 02:02 PM
kgun's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: May 2005
Location: Norway
Posts: 5,944
kgun RepRank 10
Default Re: SEO and PDFs

Quote:
Originally Posted by lindaczelusniak View Post
Now, for manuals and other documents that mostly get printed, yeah, use PDF's but for web content they are the ultimate party poopers
Agree. See my example above.

Quote:
Originally Posted by texxs View Post
Now why does anyone use PDF's again?
IMO, Because it is portable, platform independent. That was the main reason, at least as far as I understand, that Adobe made Acrobat reader and the professional version that can make PDF documents.

I create them very fast in MS Word. When the Word document is finished, I click the PDF button and a document of 50 pages is created in a few minutes.

Quote:
Originally Posted by texxs View Post
  • Need to find a document out of the hundreds or thousands you have? Easy, just use any normal search tool, like Windows Search or Google Desktop (not recommended because it's spyware)
My bolding. Another example of WWW rumor or fact?
  #46 (permalink)  
Old 05-10-2009, 06:51 AM
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: Oct 2006
Posts: 1,029
innominds RepRank 5
Default Re: SEO and PDFs

Quote:
Originally Posted by NetProwler View Post
@innominds:

When you are creating/saving your pdf file - select the security tab from the application you are using and select the 'Encrypt the PDF document' option which will prompt you for a password. Your saved file can't be indexed by any search engine.
I don't think the free ones like Cute PDF have this feature.

How about zipping the PDF file? Will it be indexed then?
  #47 (permalink)  
Old 05-10-2009, 08:03 AM
WebProWorld Member
 
Join Date: Apr 2009
Posts: 26
granite4less RepRank 0
Default Re: SEO and PDFs

Mozilla now supports PDF as well. Maybe Google is also able to crawl PDF content.
__________________
Granite Worktops London
  #48 (permalink)  
Old 05-11-2009, 03:25 AM
WebProWorld Member
 
Join Date: Jul 2006
Location: Stillwater, Oklahoma
Posts: 80
LinksAndTraffic RepRank 1
Default Re: SEO and PDFs

Quote:
Originally Posted by kgun View Post
"Can the search engines read PDF files?

Yes, most of the major search engines now can read the basic contents of PDF files, though getting these pages to rank as well as HTML files is still questionable".

New to me.
I agree. I have worked with pdf software that can open and change around pdf files, so google will not have a problem opening them either. But let's take it one step further. Google can read pdf files, or else, pdf files would not be in their index - and they are.

I have never heard about using pdf's for search engine marketing, but I guess it could be done. It is certainly worth exploring.
  #49 (permalink)  
Old 05-11-2009, 03:36 AM
WebProWorld Member
 
Join Date: Jul 2006
Location: Stillwater, Oklahoma
Posts: 80
LinksAndTraffic RepRank 1
Default Re: SEO and PDFs

Quote:
Originally Posted by NetProwler View Post
@innominds:

When you are creating/saving your pdf file - select the security tab from the application you are using and select the 'Encrypt the PDF document' option which will prompt you for a password. Your saved file can't be indexed by any search engine.
You could also put your pdf files into a special directory, then tell the robots in your robots.txt that the directory is off limits to them.
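
A minimal sketch of that robots.txt approach, assuming a hypothetical directory called /private-pdfs/ and the placeholder host example.com (neither comes from this thread). It uses Python's standard urllib.robotparser module to check which URLs a well-behaved crawler would skip. Keep in mind that robots.txt is only a request to crawlers; it does not protect the files from anyone who knows the URL.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: the directory name is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private-pdfs/
"""


def is_crawlable(url, robots_txt=ROBOTS_TXT, agent="*"):
    """Return True if the given robots.txt lets `agent` fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://example.com/private-pdfs/report.pdf",
        "http://example.com/press/release.pdf",
    ):
        status = "crawlable" if is_crawlable(url) else "blocked"
        print(url, "->", status)
```

The trailing slash in the Disallow line matters: the rule is a path prefix, so it covers everything inside the directory.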
  #50 (permalink)  
Old 05-11-2009, 11:22 AM
Webnauts's Avatar
WebProWorld 1,000+ Club
WebProWorld MVP
 
Join Date: Aug 2003
Location: Worldwide
Posts: 8,463
Webnauts RepRank 9
Default Re: SEO and PDFs

Try out the free A-PDF INFO Changer [A-PDF.com] for your PDFs. It is a utility for reading and changing the properties of PDF files, including author, title, subject, and keywords.
Create a PDF file properly, get it indexed, and tell us what happens.

Here are some tips that may help: http://searchengineland.com/eleven-t...-engines-12156
__________________
"Being an expert isn't telling other people what you know. It's understanding what questions to ask, and flexibly applying your knowledge to the specific situation at hand. Being an expert means providing sensible, highly contextual direction." Jeff Atwood
SEO Workers - Search Engine Optimization Consulting Company | SEO Analysis Tool | Webnauts Net SEO

Last edited by Webnauts; 05-11-2009 at 11:26 AM.


All times are GMT -4. The time now is 05:40 PM.


