Random Useful Info

Correcting why foreign characters are not Đisplaying correctly

Some reasons why your website is showing 'garbage' characters instead of the foreign language you hoped to see, and some cures. By garbage we mean question marks '?' or the like displaying where you hoped to see an accent or distinctive character from the language in use.

The characters that end up on your visible website come from two distinct places.

  • The first is the page’s source document. In other words, the xxx.html, xxx.php, xxx.asp etc. encoded pages that help render your site in someone's browser.
  • The second may be from a database that sits behind your website and stores a variety of content and settings. This database may be a custom designed one that your web developer built for your site. Alternatively, it may be the database of a Content Management System (CMS) such as Joomla! or the like.

Page's source document correction.

Let’s address the possibility of the first source first... It seems natural to do so... directly encoding the characters in the page’s source document.

While coding your page in English doesn’t seem to present any issues, the first time you try to use the Vietnamese ‘Đ’, for example, you end up with ‘?’, or a question mark in a diamond, or something else unexpected.
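
To see where the ‘?’ and the question mark in a diamond come from, here is a minimal Python sketch. It is an illustration only and not part of the fix. The function names are made up for this example, and Latin-1 stands in for whatever non-Unicode encoding your editor or server happened to use.

```python
# Illustration only: how 'Đ' becomes '?' or a diamond question mark.
# Function names are hypothetical; Latin-1 is an assumed stand-in for
# whichever legacy encoding the editor or server actually used.


def save_in_legacy_encoding(text: str, encoding: str = "latin-1") -> bytes:
    """Encode text, substituting '?' for characters the encoding lacks."""
    return text.encode(encoding, errors="replace")


def read_as_utf8(raw: bytes) -> str:
    """Decode bytes as UTF-8, substituting U+FFFD for invalid sequences."""
    return raw.decode("utf-8", errors="replace")


# Latin-1 has no 'Đ', so saving in it replaces the letter with '?'.
print(save_in_legacy_encoding("Đ"))                 # b'?'

# 'à' saved as Latin-1 is the single byte 0xE0, which is not valid UTF-8
# on its own; a browser reading it as UTF-8 shows U+FFFD (the diamond).
print(ascii(read_as_utf8("à".encode("latin-1"))))  # '\ufffd'

# Saved and read as UTF-8, 'Đ' survives as the two bytes C4 90.
print("Đ".encode("utf-8"))                         # b'\xc4\x90'
```

Once a character has been saved as a literal ‘?’, the original letter is gone from the file, which is why the fix further down includes retyping your special characters after changing the encoding.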

In a nutshell, the document needs to be encoded as ‘UTF-8 without BOM’. If you’re into why the watch works follow the link, otherwise, if you only want to set the correct time read on. To set this encoding, you need a text editor other than Notepad or WordPad, one that lets you control the encoding. Some that work well and are freeware are Notepad++, TopGun and BabelPad.
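
If you want to check whether a file already starts with a BOM before you open an editor, the sketch below shows one way in Python. The BOM is the three bytes EF BB BF at the very start of a UTF-8 file. In a PHP page those bytes get sent to the browser ahead of everything else, which is one reason to save without them. The function name is hypothetical.

```python
import codecs


def starts_with_utf8_bom(raw: bytes) -> bool:
    """Return True if the data begins with the UTF-8 byte order mark."""
    # codecs.BOM_UTF8 is the byte string b'\xef\xbb\xbf'
    return raw.startswith(codecs.BOM_UTF8)


# Two in-memory samples standing in for file contents.
with_bom = codecs.BOM_UTF8 + "Đ".encode("utf-8")
without_bom = "Đ".encode("utf-8")

print(starts_with_utf8_bom(with_bom))     # True
print(starts_with_utf8_bom(without_bom))  # False
```

For a real file, read it in binary mode (for example `open(path, "rb").read()`) and pass the result to the function.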

The easiest way is to open the file at the source, change the encoding and save it back directly. A no fuss way to do this is to use FireFTP in Firefox, as it allows you to open, edit and save a file directly from your website’s file structure. You’ll need to set Notepad++ as one of the “open with...” options for html, htm, php etc type files.

So, the no fuss way to fix this particular issue, after arming yourself with the tools mentioned:

  1. Browse to your source document on your website using FireFTP
  2. Right click the file in question
  3. Select Open with...
  4. Choose Notepad++
  5. Choose Format... UTF8 without BOM
  6. And if your site is hosted on a unix system, just to cover the bases choose Format... convert to UNIX format.
  7. Edit your special characters as you tried before you knew you had this problem
  8. You should then be able to save the document.
  9. Browse to the page and refresh.
  10. Say thank you Recycler.
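
If you would rather script the conversion than click through an editor, steps 5 and 6 can be sketched in Python as below. This is a rough equivalent under stated assumptions, not what Notepad++ does internally. The function names are invented for this example, and `source_encoding` is your own guess at what the file is currently saved in. Work on a copy of the file first.

```python
from pathlib import Path


def to_utf8_no_bom_unix(raw: bytes, source_encoding: str = "utf-8-sig") -> bytes:
    """Re-encode bytes as UTF-8 without BOM, with Unix (LF) line endings.

    The default "utf-8-sig" reads UTF-8 and drops a leading BOM if present.
    Pass e.g. "cp1252" if the file was saved in a legacy Windows encoding.
    """
    text = raw.decode(source_encoding)
    # Normalise Windows (CRLF) and old Mac (CR) line endings to LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Python's "utf-8" codec never writes a BOM.
    return text.encode("utf-8")


def convert_file(path: str, source_encoding: str = "utf-8-sig") -> None:
    """Convert one file in place (hypothetical helper; back up first)."""
    file = Path(path)
    file.write_bytes(to_utf8_no_bom_unix(file.read_bytes(), source_encoding))
```

A call might look like `convert_file("index.php")`, where the file name is only a placeholder. If the wrong `source_encoding` is given, decoding may raise an error or produce wrong letters, so check the result in a browser as in step 9.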

Database collation selection.

After determining that the characters you are trying to output are actually coming from the CMS database, you'll need to take the following (or similar) steps. We will work through the process assuming that you have Joomla! on a MySQL database and access to phpMyAdmin for manipulating it.

The first thing you need to do with phpMyAdmin is to determine which database is to be fixed. A clue for Joomla!: look in configuration.php if you have more than one. If you cannot identify which database it is, you need to stop now and call a friend who may be a little geekier than yourself.
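
As a sketch of where to look: Joomla! keeps its database settings in configuration.php as PHP variables, with the database name in `$db`. The Python below pulls that value out of the file's text. The function name and the sample values are made up for illustration, and the pattern assumes the value is written in single quotes on one line, which you should check against your own file.

```python
import re
from typing import Optional


def find_joomla_db_name(config_text: str) -> Optional[str]:
    """Return the value assigned to $db, or None if no match is found.

    Assumes a line of the form  $db = 'name';  with single quotes.
    The pattern does not match $dbtype or $dbprefix.
    """
    match = re.search(r"\$db\s*=\s*'([^']*)'", config_text)
    return match.group(1) if match else None


# Hypothetical excerpt; real files contain many more settings.
sample = """
public $dbtype = 'mysql';
public $db = 'example_joomla';
public $dbprefix = 'jos_';
"""

print(find_joomla_db_name(sample))  # example_joomla
```

The name it prints is the one to look for in phpMyAdmin's left frame.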

  1. Find your database on the left frame, and click it. The database information should come up on the right, and the left frame should change to a list of tables.
  2. Click 'Operations' - the right most tab on the right hand pane.
  3. Down the bottom of the right pane you will now see Collation and a drop down list with it. Choose 'utf8_general_ci' and then 'Go'.
  4. Remember that step 3 will only take effect on new tables created from this time forward. The next steps will help with existing tables.
  5. Continuing with the Joomla! example, click the 'content' table on the left. Now the table's information will appear on the right with a list of the fields. If you don't see that then click the 'Structure' tab on the right.
  6. Click 'Check All' under the list of fields, and all the fields should now be selected.
  7. Click the 'Pencil' icon, to the right of `With Selected`.
  8. You will now be presented with a form which allows you to modify the selected fields. What you're interested in is the 'Collation' row. Change each collation to utf8_general_ci (only for those with an existing collation already). Press save to lock away your changes.
  9. Repeat 5-8 for each of the tables you think may be causing your issue.
  10. Edit your articles using the new characters (ăâêôđĐàảãáạa etc).
  11. Thank the Recycler for this article.
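
The same changes can also be made from phpMyAdmin's SQL tab instead of clicking through each table. The Python sketch below only builds the MySQL statements as text; it does not connect to anything. `ALTER DATABASE ... COLLATE` corresponds to step 3, and `ALTER TABLE ... CONVERT TO CHARACTER SET` roughly to steps 5 to 8, with the difference that it converts every text column in the table at once. The function names, database name and table names are placeholders, not taken from your site, and you should back up the database before running anything like this.

```python
from typing import List


def quote_identifier(name: str) -> str:
    """Wrap a MySQL identifier in backticks, doubling any inside it."""
    return "`" + name.replace("`", "``") + "`"


def alter_database_sql(database: str, collation: str = "utf8_general_ci") -> str:
    """Statement that sets the default for tables created from now on."""
    charset = collation.split("_")[0]  # "utf8_general_ci" -> "utf8"
    return (f"ALTER DATABASE {quote_identifier(database)} "
            f"CHARACTER SET {charset} COLLATE {collation};")


def convert_table_sql(table: str, collation: str = "utf8_general_ci") -> str:
    """Statement that converts an existing table and its text columns."""
    charset = collation.split("_")[0]
    return (f"ALTER TABLE {quote_identifier(table)} "
            f"CONVERT TO CHARACTER SET {charset} COLLATE {collation};")


def build_statements(database: str, tables: List[str]) -> List[str]:
    """One database statement followed by one statement per table."""
    return [alter_database_sql(database)] + [convert_table_sql(t) for t in tables]


# Placeholder names for illustration only.
for statement in build_statements("example_joomla", ["jos_content", "jos_menu"]):
    print(statement)
```

Paste the printed statements into the SQL tab one database at a time. Text that was already stored as ‘?’ is not recovered by the conversion, so step 10 still applies.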

Comments   

 
0 #1 thank you recycler 2013-08-20 16:44
worked