I have a web server administration question. On this website, http://www.mirkaphoto.hu/, all PHP-generated pages contain the following line:

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-2" />

But this is apparently disregarded, probably somewhere in the PHP/Apache processing, and browsers display the page as UTF-8. As a result, replacement characters (�) are shown in the page text instead of the accented characters (éáöőóüűúí). I tested this in Firefox, IE, Chrome and SeaMonkey.

The strangest part is that this symptom started only yesterday, after I upgraded my server from Debian 7.0 Wheezy to Debian 8.0 Jessie. During the upgrade I also upgraded all other packages, including Apache and PHP, and selected "yes" to overwrite the config files with the package defaults. Afterwards I fine-tuned my config files to have everything the way I like, but I did not find a way to fix this. Before the upgrade, the page displayed just fine.

Here is a screenshot showing that Firefox sees the "charset=iso-8859-2" declaration but still displays the page with UTF-8 encoding.

[screenshot]

My suspicion is that this is a server configuration issue, but it could also be that one of the processing components (Apache, PHP) changed with the upgrade in a way that causes this behavior. The problem is that I can't pinpoint what is causing it.

Can anyone solve this mystery? What could be going wrong during the processing of the page?

2 Answers

The server’s HTTP headers say

Content-Type: text/html; charset=UTF-8

which browsers give precedence over the <meta> declaration inside the file. Why not just use UTF-8? It’s an established encoding on all platforms.

Also, there’s garbage text before the HTML declaration:

[M _2<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
  • It would make sense to use UTF-8 instead; the problem is that I just host this site, I am not its developer. The web pages were created a long time ago and are not really maintained by the developer. The client says "It worked before, make it work again", and to be honest she is right to expect that.
    – giny8i8
    Commented Dec 3, 2015 at 8:38
  • A quick fix would probably be to use Apache’s mod_headers to change the Content-Type header in .htaccess or similar. Also, unmaintained applications/CMS/whatever are insanely dangerous. They must absolutely be kept up to date. If that can’t be done, static pages must be used.
    – Daniel B
    Commented Dec 3, 2015 at 8:52
  • I enabled the mod_headers module in Apache. I also added the following to the VirtualHost configuration: Header set charset iso-8859-2. Then I reloaded the server with service apache2 reload. But the page still displays as UTF-8 :( Maybe I did not use the right syntax? Not sure... I am not very familiar with using modules for apache2. Do you have a specific suggestion for what I should put where?
    – giny8i8
    Commented Dec 3, 2015 at 13:34
  • There is no charset header. There’s only the Content-Type header. So you’ll somehow need to restrict this to relevant files only. There are various methods to match files or URIs. You wouldn’t want to change image files’ headers. They aren’t HTML, after all. It may well be that this isn’t the solution. You’d be better off researching how to change the PHP application.
    – Daniel B
    Commented Dec 3, 2015 at 14:12
  • Thanks for your support @Daniel B, I think I managed to find the right way to fix this. I created an answer post for the solution, feel free to vote it up, if you think it is ok. It did the trick for me.
    – giny8i8
    Commented Dec 3, 2015 at 16:32
After a lot of searching I managed to find the right solution. Many thanks to @Daniel B for pointing me in the right direction. :)

It seems that, since the upgrade, apache2 serves all text/html content with a UTF-8 charset in the Content-Type header, disregarding the <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-2" /> statement in the actual HTML/PHP files. I am not sure why this is supposed to be a good thing (please explain if you can). Nevertheless, the solution for getting rid of the question mark characters (�) was the following:

The Solution: I added the line below to the apache2 VirtualHost definition of my website in /etc/apache2/sites-available/MySiteName.conf, then reloaded the server configuration with the service apache2 reload command. After this, the files are served with the proper Content-Type: text/html; charset=iso-8859-2 header.

<VirtualHost * >

# [...Some other configurations before this line]

    # Fix for the encoding problem: pages were served with a UTF-8 header although they are encoded in iso-8859-2 - giny8i8 2015-12-03
    Header set Content-Type "text/html; charset=iso-8859-2"
    # Source: http://superuser.com/questions/1008480/charset-iso-8859-2-webpage-displays-with-utf-8-header-question-marks-inste/1008482?noredirect=1#comment1397150_1008482

</VirtualHost>
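
One caveat: placed directly in the VirtualHost, the directive above is applied to every response, including images and stylesheets, which is the concern raised in the comments. Here is a minimal sketch of a narrower variant. It assumes that mod_headers is enabled and that the affected pages end in .php, .html or .htm; the file pattern is an assumption, not something taken from the actual site, so adjust it as needed.

```apache
<VirtualHost * >

# [...Some other configurations before this line]

    # Assumption: only files with these extensions contain iso-8859-2 encoded HTML.
    # Images, CSS and JavaScript keep the Content-Type that Apache assigns to them.
    <FilesMatch "\.(php|html?)$">
        Header set Content-Type "text/html; charset=iso-8859-2"
    </FilesMatch>

</VirtualHost>
```

A possible explanation for the changed behavior, not verified on this server: Debian 8 ships PHP 5.6, in which the default_charset setting defaults to UTF-8, so PHP itself adds charset=UTF-8 to the Content-Type header of the pages it generates. If that is the cause here, setting default_charset to iso-8859-2 for this site may be an alternative to rewriting the header in Apache.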

Let me know if this works for you too if you encounter the same challenge after a Debian 8.0 Jessie upgrade! I searched the internet for this but did not find it spelled out like this, hence this answer.
