
Page 2 of 4
Results 11 to 20 of 35

Thread: Basic .html webpages and encoding types when saving; does it make any difference?

  1. #11
    WebProWorld MVP deepsand's Avatar
    Join Date
    May 2004
    Location
    State College, PA
    Posts
    16,446

    Re: Basic .html webpages and encoding types when saving; does it make any difference?

    Well, if the encoding used is different from that specified via said meta tag parameter, and the content contains anything other than bytes with values in the decimal range of 0 to 127, then there is certainly the possibility that the indexing engine will be unable to properly parse the content.
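
To make the point above concrete, here is a minimal Python sketch (not from the thread; the sample word is just an assumption for illustration). It saves a string containing one accented letter as UTF-8, then reads the bytes back under a different declared encoding. The ASCII letters survive, and only the byte values above 127 get garbled.

```python
# Hypothetical sample text: "cafe" with an acute accent on the final e.
text = "caf\u00e9"

# Save the text as UTF-8, the way a text editor might.
raw = text.encode("utf-8")

# Pure-ASCII content only has byte values 0..127; the accented letter does not.
has_high_bytes = any(b > 127 for b in raw)

# Decode with the encoding that was really used, and with a mismatched one.
right = raw.decode("utf-8")
wrong = raw.decode("iso-8859-1")

print(list(raw))
print(repr(right), repr(wrong))
```

If the content had been plain ASCII, `right` and `wrong` would have been identical, which is why a mismatch often goes unnoticed on English-only pages.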

    Who is your host? Big5 is a Chinese encoding, used for displaying Traditional Chinese characters.

  2. #12
    WebProWorld MVP Clint1's Avatar
    Join Date
    Jun 2003
    Location
    Sitting down in a chair
    Posts
    2,585

    Re: Basic .html webpages and encoding types when saving; does it make any difference?

    Quote Originally Posted by deepsand View Post
    Well, if the encoding used is different from that specified via said meta tag parameter,
    What if you don't use the meta tag parameters?


    and the content contains anything other than bytes with values in the decimal range of 0 to 127, then there is certainly the possibility that the indexing engine will be unable to properly parse the content.
    I don't think they do, I'm still not quite clear on that. Like I said, the HTML code is just the English 26 letter alphabet and 0-9, and some punctuation. So is that within the range of 0-127?


    Who is your host? Big5 is a Chinese encoding, used for displaying Traditional Chinese characters.
    Web Site Hosting and Business Web Hosting Plans from Reliable web site hosting Provider . Yeah I thought the font kind of looked like "Chinese fonts in English" (for lack of a better term). They kind of have that "oriental look" to them, even though they are English.

    Thanks.

    It took about 20 minutes for me to get to this frickin' page due to the refresh loop problem.
    God Bless,
    -Clint
    (Join Date: 2003)

  3. #13
    WebProWorld MVP wige's Avatar
    Join Date
    Jun 2006
    Posts
    3,138

    Re: Basic .html webpages and encoding types when saving; does it make any difference?

    Quote Originally Posted by Clint1 View Post
    What if you don't use the meta tag parameters?
    There are a few different places where the character encoding is specified - a meta tag, the <html> tag, and the headers as sent by the server. Generally, the server headers take priority as they are set by the server, based on the encoding of the document itself.
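
As a rough illustration of that priority order, here is a small Python sketch. The function names and the ISO-8859-1 fallback are assumptions made for this example, not how any particular browser or search engine actually does it. It looks for a charset in the server's Content-Type header first and only falls back to a meta tag if the header does not name one.

```python
import re

def charset_from_header(content_type):
    """Return the charset named in a Content-Type header value, or None."""
    if not content_type:
        return None
    m = re.search(r'charset\s*=\s*"?([\w.:-]+)"?', content_type, re.IGNORECASE)
    return m.group(1).lower() if m else None

def charset_from_meta(html_bytes):
    """Return the charset named in a meta tag near the top of the page, or None."""
    # Tag names and attribute syntax are ASCII, so an ASCII-only view is enough here.
    head = html_bytes[:1024].decode("ascii", errors="ignore")
    m = re.search(r'<meta[^>]*charset\s*=\s*["\']?([\w.:-]+)', head, re.IGNORECASE)
    return m.group(1).lower() if m else None

def pick_charset(content_type, html_bytes, default="iso-8859-1"):
    """Server header wins, then the meta tag, then an assumed default."""
    return (charset_from_header(content_type)
            or charset_from_meta(html_bytes)
            or default)

page = b'<html><head><meta http-equiv="Content-Type" content="text/html; charset=Big5"></head></html>'
print(pick_charset("text/html; charset=UTF-8", page))  # header names a charset
print(pick_charset("text/html", page))                 # falls back to the meta tag
```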

    I don't think they do, I'm still not quite clear on that. Like I said, the HTML code is just the English 26 letter alphabet and 0-9, and some punctuation. So is that within the range of 0-127?
    Generally, the lower 128 values (0 to 127) are the same from one encoding to another. These characters are the English alphabet and punctuation, plus a few other characters such as file separators and keyboard commands (DEL, Newline, TAB, etc). Since most other charsets only differ from ASCII in the extended portion, you won't notice a difference between charsets. For example, in ASCII, "A" is |01000001|. In ISO-8859-1, "A" is still |01000001|. In UTF-8, "A" is still |01000001|. The difference between these three encodings is the characters in the extended portion of the set. The easiest way I can think to explain this is that when the leading bit is 0, most encodings are the same. When the leading bit is not 0, such as for European accent characters, the encodings differ. Also, some encodings are longer: for example, instead of the single byte per character of ASCII and ISO-8859-1, UTF-16 uses at least 16 bits per character.
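
The "A" example is easy to check in Python. This is only a quick sketch using the standard codecs; the choice of the accented e as the second test character is an assumption for illustration.

```python
encodings = ["ascii", "iso-8859-1", "utf-8"]

# "A" encodes to the same single byte in all three encodings.
a_bits = {enc: format("A".encode(enc)[0], "08b") for enc in encodings}
print(a_bits)

# An accented letter (e with acute accent) is where the encodings part ways.
e_latin1 = "\u00e9".encode("iso-8859-1")  # one byte
e_utf8 = "\u00e9".encode("utf-8")         # two bytes
print(e_latin1, e_utf8)

# UTF-16 spends two bytes even on "A" (big-endian, no byte order mark).
a_utf16 = "A".encode("utf-16-be")
print(a_utf16)
```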

    Yeah I thought the font kind of looked like "Chinese fonts in English" (for lack of a better term). They kind of have that "oriental look" to them, even though they are English.
    I should note that the font is generally separate from the character set. The character set determines whether those eight ones and zeros are an "A" or a "B" - whether that A or B is displayed in Courier or Verdana is up to the software. It's possible that the editor (or browser) is applying the oriental styling so that you can tell that this other encoding is in use.

    It took about 20 minutes for me to get to this frickin' page due to the refresh loop problem.
    Still looking into this. Probably the best clue from here would be a header capture from that add-on I mentioned during the last issue.
    The best way to learn anything, is to question everything.
    WigeDev - Freelance web and software development

  4. #14
    WebProWorld MVP Clint1's Avatar
    Join Date
    Jun 2003
    Location
    Sitting down in a chair
    Posts
    2,585

    Re: Basic .html webpages and encoding types when saving; does it make any difference?

    Quote Originally Posted by wige View Post
    There are a few different places where the character encoding is specified - a meta tag, the <html> tag, and the headers as sent by the server. Generally, the server headers take priority as they are set by the server, based on the encoding of the document itself.
    How can the headers sent by the server be controlled, or can they? If they can't be controlled so-to-speak, then I guess it's "controlled" by the original raw encoding that the text editor gave it.


    Generally, the lower 128 values (0 to 127) are the same from one encoding to another. These characters are the English alphabet and punctuation, plus a few other characters such as file separators and keyboard commands (DEL, Newline, TAB, etc). Since most other charsets only differ from ASCII in the extended portion, you won't notice a difference between charsets. For example, in ASCII, "A" is |01000001|. In ISO-8859-1, "A" is still |01000001|. In UTF-8, "A" is still |01000001|. The difference between these three encodings is the characters in the extended portion of the set. The easiest way I can think to explain this is that when the leading bit is 0, most encodings are the same. When the leading bit is not 0, such as for European accent characters, the encodings differ. Also, some encodings are longer: for example, instead of the single byte per character of ASCII and ISO-8859-1, UTF-16 uses at least 16 bits per character.
    Yeah I gotcha there, good explanation, thanks.


    Still looking into this. Probably the best clue from here would be a header capture from that add-on I mentioned during the last issue.
    Now the notification emails are coming in again in the raw unformatted plain text! No line breaks, no paragraphs, everything all smashed together, and no links are clickable. It's like the emails were sent through some kind of "stripper".

    I don't remember if the refresh problem was happening in FF. I THINK, maybe Deepsand is using FF, and he said it happened to him. So Deepsand maybe you could check that header response with that plugin. I'll try to remember to try the page in FF when it happens again.

    So, all this good information, and still no one has told me if these encodings have any bearing on the way a SE bot spiders the pages, and, if mine are not all the same is that a bad thing.
    God Bless,
    -Clint
    (Join Date: 2003)

  5. #15
    WebProWorld MVP wige's Avatar
    Join Date
    Jun 2006
    Posts
    3,138

    Re: Basic .html webpages and encoding types when saving; does it make any difference?

    ISO-8859-1 is kind of the preferred encoding of the web. However, most encodings such as UTF-8 and US-ASCII should be well supported, and won't adversely affect your search engine placement - as long as the search engine can read it, you will be fine. Differences from document to document should also not be a problem. If worst comes to worst and a user agent doesn't support the specific character set, the user agent will treat the document as though it is in US-ASCII. Also, your server may even transcode the document (change its encoding on the fly to correspond with the encodings that the user agent supports) to maximize compatibility.

    The only one I might be concerned about would be the files that are in Big5. I am not sure how well that is supported, and it should probably be changed unless you are using Chinese characters. Big5 uses two bytes for each Chinese character, although plain ASCII characters still take one byte each, so English-only content is no larger than it would be in ISO-8859-1. Just like ISO-8859-1 and UTF-8, Big5 keeps the ASCII set for byte values 0 to 127, meaning ASCII-only content reads the same in any of these encodings.
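
It is worth checking the byte counts directly. The sketch below uses Python's built-in `big5` codec; the sample strings are assumptions chosen for illustration (an English word and the single Chinese character U+4E2D).

```python
english = "Hello"
chinese = "\u4e2d"  # one Traditional Chinese character

# ASCII-only text produces the same bytes under Big5, ISO-8859-1 and UTF-8.
same_bytes = (english.encode("big5")
              == english.encode("iso-8859-1")
              == english.encode("utf-8"))

# Byte counts per string under Big5.
english_len = len(english.encode("big5"))
chinese_len = len(chinese.encode("big5"))

print(same_bytes, english_len, chinese_len)
```

So a page of plain English HTML that a file manager labels as Big5 should contain exactly the same bytes as one labeled ISO-8859-1.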

    That being said, in the long run I would try to go for consistency, probably converting documents to 8859.
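
For anyone who would rather convert files in bulk than through a file manager, here is a minimal Python sketch of that kind of conversion. The `convert_file` helper and the sample file name are hypothetical, and it assumes you already know the encoding each file is really saved in.

```python
import tempfile
from pathlib import Path

def convert_file(path, src_encoding, dst_encoding="iso-8859-1"):
    """Re-save a text file in a different encoding, in place."""
    p = Path(path)
    # Raises UnicodeDecodeError if src_encoding is not the file's real encoding.
    text = p.read_bytes().decode(src_encoding)
    # Raises UnicodeEncodeError if a character has no equivalent in dst_encoding.
    p.write_bytes(text.encode(dst_encoding))

# Small demonstration on a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    page = Path(tmp) / "index.html"  # hypothetical file name
    page.write_bytes("<p>caf\u00e9</p>".encode("utf-8"))
    convert_file(page, "utf-8")
    result = page.read_bytes()

print(result)
```

Reading and writing raw bytes avoids any line-ending translation, and the two exceptions act as a safety net: the conversion fails loudly instead of silently writing garbled text.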

    I should probably ask, when you upload documents onto the server, are you using FTP, and uploading the document in ASCII mode?
    The best way to learn anything, is to question everything.
    WigeDev - Freelance web and software development

  6. #16
    WebProWorld MVP Clint1's Avatar
    Join Date
    Jun 2003
    Location
    Sitting down in a chair
    Posts
    2,585

    Re: Basic .html webpages and encoding types when saving; does it make any difference?

    Quote Originally Posted by wige View Post
    The only one I might be concerned about would be the files that are in Big5. I am not sure how well that is supported, and it should probably be changed unless you are using Chinese characters. Big5 uses two bytes for each Chinese character, although plain ASCII characters still take one byte each, so English-only content is no larger than it would be in ISO-8859-1. Just like ISO-8859-1 and UTF-8, Big5 keeps the ASCII set for byte values 0 to 127, meaning ASCII-only content reads the same in any of these encodings.
    Thanks again. Ok, I mentioned before that I tried changing a "big5" page to us-ascii and it would not change. I even took the source code and copy/pasted it into the file of the File Manager! No telling how long that's been going on because as I mentioned this was never an option and never seen in cPanel x2 themes, only x3. (This is the cPanel Demo login for the host I think I'm going to go with), and you can see the x3 theme. Click "Legacy File Manager" under "Files", then click any of those ".wysiwygPro_edit" files or the htaccess file, then "Edit" at the right pane, and see the encoding prompt. http://74.52.116.98:2082/login?user=demo&pass=demo (demo & demo if those don't show in the URL here).


    I should probably ask, when you upload documents onto the server, are you using FTP, and uploading the document in ASCII mode?
    I don't use FTP for that. I use the File Manager to make page changes, and I also use it to upload programs and PDF files. I use FTP to make an FTP backup of everything, and it (old WS_FTP) is set to "Auto" so it automatically switches between Binary and ASCII as needed. Rarely, maybe once, I've restored the site using FTP and it was also set that way.

    When I create a new page, I just start it in Notepad, Metapad more recently, then paste that into the blank area of the File Manager for the new page. If the new page is similar to others, I just copy/paste the code from another page, then modify it. So that's why this doesn't make any sense to me why there would be more than one encoding type showing up in that File Manager area. I am consistent, but for some reason x3 File Manager sees it differently!
    God Bless,
    -Clint
    (Join Date: 2003)

  7. #17
    WebProWorld MVP Clint1's Avatar
    Join Date
    Jun 2003
    Location
    Sitting down in a chair
    Posts
    2,585

    Re: Basic .html webpages and encoding types when saving; does it make any difference?

    Ahhh, being new to the x3 theme, I didn't know this: Once you click to Edit the file and the source code page opens up to edit it, there's a toolbar at the top (you'll see it). Since the file is already opened, I didn't see the need to click the "Open" button. But this time I took a page that was claimed as ansi_x3.110-1983, and in the drop-down I THEN changed that to ISO-8859-1, THEN I clicked "Open" again, and it opened in that encoding. THEN I was able to save it in that encoding. So obviously that's the trick to changing pages' encoding.
    God Bless,
    -Clint
    (Join Date: 2003)

  8. #18
    WebProWorld MVP deepsand's Avatar
    Join Date
    May 2004
    Location
    State College, PA
    Posts
    16,446

    Re: Basic .html webpages and encoding types when saving; does it make any difference?

    In the "File Manager," as opposed to the "Legacy File Manager," selecting a file and then clicking on "Edit" automatically pops up a panel re. encodings, with the option to disable auto-detect of encoding and/or manual selection of encoding to be used.

  9. #19
    WebProWorld MVP deepsand's Avatar
    Join Date
    May 2004
    Location
    State College, PA
    Posts
    16,446

    Re: Basic .html webpages and encoding types when saving; does it make any difference?

    Quote Originally Posted by Clint1 View Post
    Now the notification emails are coming in again in the raw unformatted plain text! No line breaks, no paragraphs, everything all smashed together, and no links are clickable. It's like the emails were sent through some kind of "stripper".
    Odd that you should mention that just now, as it only a short time ago occurred to me that I'd not seen that particular problem for quite some time now.

    Quote Originally Posted by Clint1 View Post
    I don't remember if the refresh problem was happening in FF. I THINK, maybe Deepsand is using FF, and he said it happened to him. So Deepsand maybe you could check that header response with that plugin. I'll try to remember to try the page in FF when it happens again.
    Which "plugin" are you referring to?

  10. #20
    WebProWorld MVP Clint1's Avatar
    Join Date
    Jun 2003
    Location
    Sitting down in a chair
    Posts
    2,585

    Re: Basic .html webpages and encoding types when saving; does it make any difference?

    Quote Originally Posted by deepsand View Post
    In the "File Manager," as opposed to the "Legacy File Manager," selecting a file and then clicking on "Edit" automatically pops up a panel re. encodings, with the option to disable auto-detect of encoding and/or manual selection of encoding to be used.
    I know, see my first post on this thread. ".....(I found out that prompt can be bypassed by changing a setting in the current non-legacy File Manager). But the options are still there if you want to change encoding."

    You can still see the drop-down menu of the encodings, and the encoding it "chose", that's how I found out about the odd encodings.
    God Bless,
    -Clint
    (Join Date: 2003)
