Whenever Google first becomes aware of a piece of information, whether it's an article or anything else, it is noted in its database.
All subsequent findings of the same content will have less value than the original, everything else being equal.
However, Google knows that an article may be posted on the same site in both HTML and PDF formats, so the originator of the information won't get penalized for that. Anyone else using the exact same copy most likely won't see as much benefit.
As for a duplicate content penalty, I don't believe your site would get banned or blacklisted for having the same content; it just won't rank as well for that article as the source does.