Asked  2 Years ago    Answers:  5   Viewed   215 times

I have a very strange problem. I have a site that contains some German letters. When it is plain HTML without PHP, the symbols are displayed properly, but when I change the encoding to UTF-8 they don't display, and instead of Ö I get ?. When I put the HTML inside PHP and run it from Zend Studio on WAMP with the charset=iso-8859-1 encoding, I get � instead of Ö (I want to add that this same Ö is the value of a radio button). When it's in a tag it displays properly. Can you tell me how to fix this issue? I looked at other sites that use UTF-8 encoding, and they display the same symbol properly. I tried changing the PHP editor's encoding, but I suppose it doesn't matter, since everything displays properly inside Zend Studio's editor. Thank you in advance.

 Answers

5

You have probably mixed encodings. For example, a page that is sent as iso-8859-1 but receives UTF-8 encoded text from MySQL or XML will typically fail.
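As a rough illustration of such a mix-up, the sketch below takes the single byte that represents Ö in iso-8859-1 and checks it against UTF-8. It assumes the mbstring extension is loaded; the variable names are made up for the example.

```php
<?php
// "Ö" stored as a single ISO-8859-1 byte (0xD6).
$latin1 = "\xD6";

// On its own, 0xD6 is not a valid UTF-8 sequence, so a page declared
// as UTF-8 cannot render it and shows a replacement symbol instead.
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)

// Converting from the real source encoding yields the two-byte UTF-8 form.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
echo bin2hex($utf8), "\n";                     // c396
var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)
```

The bytes themselves are fine in both cases; what goes wrong is the mismatch between the bytes and the charset the page declares.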

To solve this problem, you must keep track of the encoding of every input relative to the encoding you have chosen to use internally.

If you send it as an iso-8859-1, your input from the user is also iso-8859-1.

header("Content-Type: text/html; charset=iso-8859-1");

And if MySQL sends latin1, you do not have to do anything.

But if your input is not iso-8859-1, you must convert it before sending it to the user, or adapt it to MySQL before storing it.

mb_convert_encoding($text, mb_internal_encoding(), 'UTF-8'); // If it's UTF-8 to internal encoding

In short, you must always convert input to fit the internal encoding, and convert output to match the external encoding.
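The rule can be sketched as two small helpers, one at each boundary. This is only an outline under stated assumptions: the names to_internal() and to_output() are hypothetical, the internal encoding is iso-8859-1 as in this answer, and the source is assumed to deliver UTF-8.

```php
<?php
// Assumption: the application works internally in ISO-8859-1.
mb_internal_encoding('ISO-8859-1');

// Hypothetical helper: bring text from a known source encoding
// into the internal encoding.
function to_internal(string $text, string $from): string
{
    return mb_convert_encoding($text, mb_internal_encoding(), $from);
}

// Hypothetical helper: convert internal text to the encoding
// that is declared to the client.
function to_output(string $text, string $to): string
{
    return mb_convert_encoding($text, $to, mb_internal_encoding());
}

// "Öl" as UTF-8 bytes, for example from a UTF-8 database connection.
$fromDb   = "\xC3\x96l";
$internal = to_internal($fromDb, 'UTF-8');

echo bin2hex($internal), "\n";                    // d66c
echo bin2hex(to_output($internal, 'UTF-8')), "\n"; // c3966c
```

Keeping the conversions at the edges means the rest of the code only ever sees one encoding.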


This is the internal encoding I have chosen to use.

mb_internal_encoding('iso-8859-1'); // Internal encoding

This is the code I use.

mb_language('uni'); // Mail encoding (UTF-8)
mb_internal_encoding('iso-8859-1'); // Internal encoding
mb_http_output('pass'); // Pass HTTP output through without converting it

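// Convert $text from $from_code to $to_code.
// An empty $from_code is auto-detected; an empty $to_code means the internal encoding.
function convert_encoding($text, $from_code='', $to_code='')
{
    if (empty($from_code))
    {
        $from_code = mb_detect_encoding($text, 'auto');
        // Treat plain ASCII, or a failed detection, as iso-8859-1
        if ($from_code === false || $from_code == 'ASCII')
        {
            $from_code = 'iso-8859-1';
        }
    }

    if (empty($to_code))
    {
        return mb_convert_encoding($text, mb_internal_encoding(), $from_code);
    }
    return mb_convert_encoding($text, $to_code, $from_code);
}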

// Encode special characters in $text as HTML entities.
// If $code is given, $text is in that encoding and the result is converted to the internal encoding.
function encoding_html($text, $code='')
{
    if (empty($code))
    {
        return htmlentities($text, ENT_NOQUOTES, mb_internal_encoding());
    }

    return mb_convert_encoding(htmlentities($text, ENT_NOQUOTES, $code), mb_internal_encoding(), $code);
}

// Decode HTML entities in $text; $code works as in encoding_html().
function decoding_html($text, $code='')
{
    if (empty($code))
    {
        return html_entity_decode($text, ENT_NOQUOTES, mb_internal_encoding());
    }

    return mb_convert_encoding(html_entity_decode($text, ENT_NOQUOTES, $code), mb_internal_encoding(), $code);
}
Tuesday, August 2, 2022
2

I'm not sure that the ' characters wrapping the charset are necessary, or even correct. Try removing them:

Content-Type: text/plain; charset=UTF-8
Monday, November 7, 2022
 
nkn
 
5

Use cURL. This function is an alternative to file_get_contents.

function url_get_contents($Url) {
    if (!function_exists('curl_init')) {
        die('CURL is not installed!');
    }
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $Url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $output = curl_exec($ch);
    curl_close($ch);
    return $output;
}
$data = url_get_contents("http://www.bratm.sk/trochu-harlem-inspiracie-zaslite-nam-svoj-super-harlem-shake-ak-bude-husty-zverejnime-ho/");
print_r($data);
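The function above returns false without explanation when the request fails. A possible variant, sketched here with the hypothetical name url_get_contents_checked() and an assumed timeout parameter, reports the cURL error instead. It is not part of the original answer.

```php
<?php
// Hypothetical variant of url_get_contents() that throws on failure
// instead of silently returning false.
function url_get_contents_checked(string $url, int $timeoutSeconds = 10): string
{
    if (!function_exists('curl_init')) {
        throw new RuntimeException('The cURL extension is not installed.');
    }

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);      // return the body as a string
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);      // follow HTTP redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);  // give up after this many seconds

    $output = curl_exec($ch);
    $error  = curl_error($ch);
    // The handle is released when $ch goes out of scope.

    if ($output === false) {
        throw new RuntimeException('cURL request failed: ' . $error);
    }
    return $output;
}
```

Wrapping the call in try/catch for RuntimeException lets the caller decide whether to retry, log the message, or show a fallback page.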
Thursday, September 1, 2022
 
devsnd
 
5

index.php expects the displayFood() method to return a string, which it concatenates to the HTML and then prints. But displayFood() is echoing its results instead of returning them as a string.

So you have to either change displayFood() to return a string, or change index.php to print the beginning HTML, call displayFood(), and then print the ending HTML, e.g.

echo '

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
...
<div id="breadcrumbs">
';

$poc->displayFood();

echo '

</div>
</div>





<div id="footer">
&copy; 2014 - FoodManagement
</div>


</body>
</html>';
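The other option, changing displayFood() to return a string, could look roughly like the sketch below. The class name FoodList, its constructor, and the list markup are assumptions made for the example, since the real class behind $poc is not shown.

```php
<?php
// Hypothetical stand-in for the class behind $poc.
class FoodList
{
    private $items;

    public function __construct(array $items)
    {
        $this->items = $items;
    }

    // Build the markup and return it instead of echoing it,
    // so the caller can concatenate it into the page.
    public function displayFood(): string
    {
        $html = "<ul>\n";
        foreach ($this->items as $item) {
            $html .= '  <li>' . htmlspecialchars($item, ENT_QUOTES, 'UTF-8') . "</li>\n";
        }
        return $html . "</ul>\n";
    }
}

$poc = new FoodList(['Soup', 'Fish & chips']);
echo '<div id="breadcrumbs">' . "\n" . $poc->displayFood() . '</div>' . "\n";
```

With this shape, index.php can keep building one string and print it once, which is what it originally expected.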
Saturday, November 5, 2022
 
dig
 
2

The word in question is "ESPAÑOL". This can be encoded correctly in ISO-8859-1 since all characters in the word are represented in ISO-8859-1.

You can see this for yourself using the following simple program:

using System;
using System.Diagnostics;
using System.Text;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            Encoding enc = Encoding.GetEncoding("ISO-8859-1");
            string original = "ESPAÑOL";
            byte[] iso_8859_1 = enc.GetBytes(original);
            string roundTripped = enc.GetString(iso_8859_1);
            Debug.Assert(original == roundTripped);
            Console.WriteLine(roundTripped);
        }
    }
}

What this tells you is that you need to properly diagnose where the erroneous character comes from. By the time that you have a � character, it is too late. The information has been lost. The presence of the � character indicates that, at some point, a conversion was performed into a character set that did not contain the character Ñ.

A conversion from ISO-8859-1 to a Unicode encoding will correctly handle "ESPAÑOL" because that word can be encoded in ISO-8859-1.

The most likely explanation is that somewhere along the way, the text "ESPAÑOL" is being converted to a character set that does not contain the letter Ñ.
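For readers working in PHP rather than C#, the same point can be sketched with mbstring (assumed to be loaded). The round trip through ISO-8859-1 is lossless, while a conversion to ASCII, used here as an example of a character set without Ñ, destroys the letter.

```php
<?php
// "ESPAÑOL" as UTF-8; \u{D1} is the code point of Ñ.
$original = "ESPA\u{D1}OL";

// Round trip through ISO-8859-1: lossless, because Ñ exists there.
$latin1       = mb_convert_encoding($original, 'ISO-8859-1', 'UTF-8');
$roundTripped = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
var_dump($roundTripped === $original); // bool(true)

// Use "?" explicitly as the substitute for unrepresentable characters.
mb_substitute_character(0x3F);

// Conversion to ASCII: Ñ has no representation, so it is replaced.
$ascii = mb_convert_encoding($original, 'ASCII', 'UTF-8');
echo $ascii, "\n"; // ESPA?OL
```

Once the text has become ESPA?OL, no later conversion can restore the Ñ, which is why the faulty step has to be found upstream.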

Saturday, December 3, 2022
 