 |

06-02-2006, 01:11 PM
|
|
WebProWorld Member
|
|
Join Date: May 2005
Location: Wisco
Posts: 45
|
|
Special Characters in URL (Ü,ö,ä)
I am not sure where this problem is but I am even having a hard time posting a URL here with special characters.
So far MSN has indexed these pages
search.msn.com/results.aspx?q=site%3Awww.optek.com%2Fde%2FApplication_Notezen&Form=MSNH
Notice how the SERP displays the URL correctly with any special characters. But if you click the result, it sends you to a URL with the special characters replaced.
For example (you have to copy and paste these URLs):
http://www.optek.com/de/Application_...Kühlwasser.asp
http://www.optek.com/de/Application_...ühlwasser.asp
Both links send the user to the same page, but with my system you will notice that the 1st example has the proper Title, whereas the 2nd one has my default Title.
Somehow my 404 page is routing both requests correctly, but the SEO system I built sees these as 2 different pages since the URLs are different.
I can easily set the 2nd one up to display the same Title information as the first, but what I am wondering is: will these pages get flagged as duplicates?
I am hoping someone can help. I would like to fix this before G gets on it.
|

06-02-2006, 03:24 PM
|
|
WebProWorld Member
|
|
Join Date: May 2005
Location: Wisco
Posts: 45
|
|
No one has any ideas?
|

06-02-2006, 03:54 PM
|
 |
Moderator
|
|
Join Date: Aug 2004
Location: Playing with fire!
Posts: 3,220
|
|
If I'm not mistaken, the source code tells me those are 2 different pages.
If you don't 301 one page to the other, one will likely be seen as a duplicate and cause problems.
Dave
|

06-02-2006, 04:05 PM
|
|
WebProWorld Member
|
|
Join Date: May 2005
Location: Wisco
Posts: 45
|
|
You are correct, Dave. Looking at the source code, they do appear to be different pages.
I developed a system that dynamically creates all the title, keyword, and description tags.
Essentially, when a user comes to a page on my site, there is a db lookup for the requested URL. If the lookup finds the requested URL, it pulls the title, keyword, and description tags from other tables and displays them.
I could 301 one of the pages, but I think a better solution would be to figure out why/how MSN (and probably other SEs) display the special characters in the URL but link to a URL with encoded special characters,
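A minimal sketch of how such a lookup could treat the encoded and unencoded forms of a URL as the same key. This is not the actual system described above: `normalize_path`, `lookup_title`, the `TITLES` dict standing in for the database, and the example path are all hypothetical, and it assumes requests arrive either unencoded, UTF-8 percent-encoded, or Latin-1 percent-encoded.

```python
import unicodedata
from urllib.parse import unquote_to_bytes


def normalize_path(path: str) -> str:
    """Return one canonical form for a requested path, encoded or not."""
    # Decode %XX escapes to raw bytes (non-ASCII str input is taken as UTF-8).
    raw = unquote_to_bytes(path)
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 was sent as Latin-1.
        text = raw.decode("latin-1")
    # Compose characters so "u" + combining diaeresis equals a single "ü".
    return unicodedata.normalize("NFC", text)


# Hypothetical stand-in for the title table; the path is made up.
TITLES = {
    "/de/beispiel/Kühlwasser.asp": "Spuren von Schmieröl im Kühlwasser",
}


def lookup_title(path: str, default: str = "Default Title") -> str:
    """Look up the title by normalized path, falling back to the default."""
    return TITLES.get(normalize_path(path), default)
```

With the key normalized first, all three spellings of the request would hit the same record instead of one of them falling through to the default title.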
for example: ö = ö
You know what I am trying to say?
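To illustrate what "encoded special characters" can look like in a URL, here is a small sketch using Python's standard `urllib.parse`. Which charset MSN or any other engine actually used is not known from this thread; the two charsets below are only the common candidates, and `encoded_forms` is a made-up helper name.

```python
from urllib.parse import quote, unquote


def encoded_forms(ch: str) -> dict:
    """Percent-encode one character under two common charsets."""
    return {
        "utf-8": quote(ch, encoding="utf-8"),
        "latin-1": quote(ch, encoding="latin-1"),
    }


forms = encoded_forms("ö")
print(forms)  # {'utf-8': '%C3%B6', 'latin-1': '%F6'}

# Decoding only round-trips when the same charset is assumed.
print(unquote("%F6", encoding="latin-1"))  # ö
print(unquote("%C3%B6"))                   # ö (UTF-8 is the default)
```

The same visible character can therefore arrive at the server as at least three different strings, which would explain a URL-keyed lookup treating them as different pages.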
|

06-02-2006, 04:19 PM
|
 |
Moderator
|
|
Join Date: Aug 2004
Location: Playing with fire!
Posts: 3,220
|
|
Quote:
|
Originally Posted by pablowerk
You are correct dave, looking at the source code they do appear to be different pages.
I developed a system that dynamically creates all the title, keyword, description tags.
Essentially when a user comes to a page on my site, there is a db lookup for the requested URL, if the lookup finds the URL requested it pulls all the Title, keyword, description tags from other tables and displays them.
I could 301 one of the pages but I think a better solution would be to figure out why/how msn (and probably other SE's) are displaying the special characters in the url, but linking to a URL with encoded special characters i.e. öl = ö
You know what I am trying to say?
|
Yes, but in the meantime, using a 301 would save you potential problems while you're trying to figure out why/how. You could always remove it.
I'm not being taken to any pages with the special characters being replaced. They are taking me right to the page they list in all 8 examples.
Sorry I can't be of much more help. You might want to send Faglork a PM about this thread. He'll likely be of better help than I.
Dave
|

06-02-2006, 04:22 PM
|
|
WebProWorld Member
|
|
Join Date: May 2005
Location: Wisco
Posts: 45
|
|
Thanks dave I will try and PM Faglork
|

06-02-2006, 06:55 PM
|
|
WebProWorld Veteran
|
|
Join Date: Dec 2005
Location: In Your Mind
Posts: 663
|
|
Possible reasons:
Different encoding based upon the country code in your head tags?
Another:
MSN runs on Windows,
Yahoo on Unix, if I remember.
Different server environments and
three completely different crawlers between the top 3 search engines.
Have you looked at Inktomi and Teoma and how they handle the URLs?
|

06-02-2006, 08:10 PM
|
 |
WebProWorld Veteran
|
|
Join Date: Feb 2005
Location: Forchheim, Germany
Posts: 990
|
|
No easy answer. I have to look into that. But I need some time ... which I don't have right now. Sorry.
I never have these problems, because I do not use umlauts in my URLs:
ä --> ae
ö --> oe
ü --> ue
ß --> ss
...
Even if you tackle the problem with your URLs in SEs and browsers, you never know where the problems will come back to you. You may try to burn a backup CD of this site and your CD burning program will freak out. Or your internal server backup will get hiccups. Or you try to mail a URL to a prospective client, whose mail program will mess it up. Or or or ...
As far as URLs are concerned: Do not use umlauts. It should be quite easy to modify your CMS to take care of that.
Cheers,
faglork
BTW: Who did the translation of that site? It is a bit awkward ... it is *understandable*, ok, but it is not good German. Do you know the German term "radebrechen"? It means something like "a foreign visitor with a limited knowledge of the language is desperately trying to speak German". This makes the website "look" cheap ... better get an experienced native speaker to look over the text.
|

06-02-2006, 09:21 PM
|
|
WebProWorld Veteran
|
|
Join Date: Dec 2005
Location: In Your Mind
Posts: 663
|
|
In reply to your first post: yes, Google will see one page as a duplicate of the other.
Second, both title tags from the pages are the same:
<title>Spuren von Schmieröl im Kühlwasser</title>
<title>Spuren von Schmieröl im Kühlwasser</title>
Did you have another title in mind?
================================================== ===
Yes, the search engine displays the URL just as the search spider "read" it, the same as the rest of the text and links on your page that it reads and displays.
However, processing the URL is a server-side action, not something the spider reads.
----------------------------------------------------
Next, for what reason are you going to these lengths with your "SEO system"?
a search of the keyword term
Spuren von Schmieröl im Kühlwasse
Returns just 164 pages
Ergebnisse 1 - 10 von ungefähr 164 für Spuren von Schmieröl im Kühlwasse
on the other term
Spuren von Schmieröl im Kühlwasser
Es wurden keine mit Ihrer Suchanfrage - Spuren von Schmieröl im Kühlwasser - übereinstimmenden Dokumente gefunden.
There are no competitive pages.
I cannot imagine that the number of searches for these terms would justify the time you have spent trying to build an SEO system.
Also, as Faglork stated, no matter what you do, there is always going to be an issue with using special characters: some search spiders can handle them, others cannot.
Lastly, I have to go back to your website and its purpose. You are not selling these items in an e-commerce fashion, so the object of your site should be to push the visitor to your offline sales efforts. Perhaps I am mistaken.
And since what you deal in does not seem to be standard fare, people who search for you should find you easily if you use simple, basic SEO steps and less of a "system".
My thoughts
|

06-03-2006, 12:16 AM
|
 |
Moderator
|
|
Join Date: Jan 2004
Location: Live in Cincy Now
Posts: 7,733
|
|
Just to back up a bit, why are you using special characters in the URL in the first place? Would it be better to fix that from the beginning?
If you are doing it because that is the way those characters are displayed in German, then consider not doing it.
Also, I am sure this is not a new issue for the SEs; they probably each handle (translate) those characters differently, and it should not be an issue. Sort of like when people mistakenly leave spaces in the middle of URLs.
|

06-03-2006, 05:11 AM
|
 |
WebProWorld Veteran
|
|
Join Date: Feb 2005
Location: Forchheim, Germany
Posts: 990
|
|
Re: Search Umlauts urls
Quote:
|
Originally Posted by TrafficProducer
I believe domain registrars were discussing issues about domain names with umlauts in them, not sure what is happening about this.
e.g.
|
You can register domains with umlauts, depending on the tld. Here is a list, and a very good explanatory article as well:
http://en.wikipedia.org/wiki/Interna...d_domain_names
Problem is, mailservers don't handle this, so it will be confusing - you need a maildomain without umlaut.
I personally do not recommend umlaut-domains, unless you got a special reason.
Cheers,
faglork
|

06-05-2006, 06:29 PM
|
|
WebProWorld Member
|
|
Join Date: May 2005
Location: Wisco
Posts: 45
|
|
Quote:
|
Originally Posted by Faglork
BTW: Who did the translation of that site? It is a bit awkward ... it is *understandable*, ok, but it is not good German. Do you know the German term "radebrechen"? It means something like "a foreign visitor with a limited knowledge of the language is desperately trying to speak German". This makes the website "look" cheap ... better get an experienced native speaker to look over the text.
|
It's funny you say that, because the translation was done by one of our employees in our German headquarters (a native speaker!). I am having another person review these pages over here; he also noticed the improper German.
I am also checking with our German office about substituting those characters for the Umlaut characters. Thanks for the tip.
|

06-05-2006, 06:59 PM
|
|
WebProWorld Member
|
|
Join Date: May 2005
Location: Wisco
Posts: 45
|
|
Sorry for not replying quicker.
Just to give everyone a little more background info, I created a CMS that allows both our German and American salespeople to enter these application notes. The URL is created according to the title of the application along with its language and industry codes.
All these notes were originally written in English then translated into German. This is when I first learned about the issue with the special characters.
Currently I feel the best solution would be to switch these special characters using the substitutes given by Faglork, then 301 any of the incorrectly indexed URLs.
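A rough sketch of the 301 step, assuming a hand-maintained table from already-indexed paths to their replacements. The paths, the `REDIRECTS` table, and the `redirect_for` helper are hypothetical and only show the shape of the idea; a real site would wire this into whatever handles the request (here, apparently the 404 page).

```python
from typing import Optional, Tuple
from urllib.parse import unquote

# Hypothetical table: old indexed path -> new path without umlauts.
REDIRECTS = {
    "/de/alt/Kühlwasser.asp": "/de/neu/Kuehlwasser.asp",
}


def redirect_for(path: str) -> Optional[Tuple[int, str]]:
    """Return (301, new_path) for a known old path, else None."""
    # Decode UTF-8 percent-escapes so encoded requests match the table too.
    # Assumption: Latin-1 encoded requests would need separate handling.
    target = REDIRECTS.get(unquote(path))
    if target is None:
        return None
    return 301, target


print(redirect_for("/de/alt/K%C3%BChlwasser.asp"))  # (301, '/de/neu/Kuehlwasser.asp')
print(redirect_for("/de/neu/Kuehlwasser.asp"))      # None
```

Returning `None` for the new path keeps the redirect from looping once a visitor or spider lands on the substituted URL.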
Quote:
|
Originally Posted by Faglork
ä --> ae
ö --> oe
ü --> ue
ß --> ss
|
Is it common for German searchers to search using these substitutions?
Regarding the "SEO system": this system handles every page on the website, and it makes it easy to track changes made to any page. With a simple query I can check when and by whom the title, description, and keyword tags of any page were changed.
|

06-06-2006, 05:52 AM
|
 |
WebProWorld Veteran
|
|
Join Date: Feb 2005
Location: Forchheim, Germany
Posts: 990
|
|
Quote:
|
Originally Posted by pablowerk
It's funny you say that, because the translation was done by one of our employees in our German headquarters (a native speaker!). I am having another person review these pages over here, he also noticed the improper German.
|
Perhaps he was trying to keep too close to the original, which often results in a sort of "word by word" translation.
The best solution would be to get an acknowledged translator. It is not very expensive, but you sure get the best results.
Cheers,
faglork
|

06-06-2006, 05:58 AM
|
 |
WebProWorld Veteran
|
|
Join Date: Feb 2005
Location: Forchheim, Germany
Posts: 990
|
|
Quote:
|
Originally Posted by pablowerk
Currently I feel the best solution would be to switch these special characters using the substitutes given by Faglork, then 301 any of the incorrectly indexed URLs.
Quote:
|
Originally Posted by Faglork
ä --> ae
ö --> oe
ü --> ue
ß --> ss
|
|
Almost all CMSes do it that way.
Quote:
|
Originally Posted by pablowerk
Is it common for German searchers to search using these substitutions?
|
No, they use the umlauts. Don't confuse this: the substitution is just for the filename, so the umlauts stay in the text. No problem for search engines.
hth,
faglork
|

06-06-2006, 09:53 AM
|
|
WebProWorld Member
|
|
Join Date: May 2005
Location: Wisco
Posts: 45
|
|
Sounds good, thanks for all the help!
|

06-06-2006, 11:28 AM
|
|
WebProWorld Member
|
|
Join Date: May 2005
Location: Wisco
Posts: 45
|
|
Faglork, I was wondering if there is a list of these character substitutions somewhere? I have been trying to find one, but I don't actually know what these substitutions would be called.
|

06-06-2006, 12:15 PM
|
 |
WebProWorld Veteran
|
|
Join Date: Feb 2005
Location: Forchheim, Germany
Posts: 990
|
|
I don't know of any list, and the "conversion" is just a convention, sort of.
These are all, as far as the German language is concerned:
ä --> ae
ö --> oe
ü --> ue
Ä --> AE
Ö --> OE
Ü --> UE
ß --> ss
The last one ("sz-ligatur") has no capital form.
On the other hand, in the HTML code of your copy text and navigation links, you should use the correct HTML/XHTML entities, as listed e.g. in
http://www.cookwood.com/html/extras/entities.html
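For the copy text, a sketch of turning umlauts into named HTML entities using Python's standard `html.entities` table. The helper name `to_named_entities` is hypothetical, and the function deliberately leaves ASCII characters such as `<` and `&` alone, so it is not a substitute for proper HTML escaping.

```python
from html.entities import codepoint2name


def to_named_entities(text: str) -> str:
    """Replace non-ASCII characters with named HTML entities where one exists."""
    out = []
    for ch in text:
        # Only look up non-ASCII characters; ASCII passes through unchanged.
        name = codepoint2name.get(ord(ch)) if ord(ch) > 127 else None
        out.append(f"&{name};" if name else ch)
    return "".join(out)


print(to_named_entities("Spuren von Schmieröl im Kühlwasser"))
# Spuren von Schmier&ouml;l im K&uuml;hlwasser
```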
hth,
faglork
|