I'm trying to parse an XML file using PHP, but I get an error message:

parser error : Char 0x0 out of allowed range in

I think the content of the XML is the cause: it seems to contain a special symbol that shows up as "?". Any ideas on how to fix it?

I also get:

parser error : Premature end of data in tag item line

What might be causing that error?

I'm using simplexml_load_file.

Update:

I found the line reported in the error and pasted its content into a separate XML file, and that file parses fine, so I still cannot figure out what makes the full file fail. PS: it's a huge XML file, over 100 MB. Could the size itself cause the parse error?

 Answers

1

Do you have control over the XML? If so, ensure the data is enclosed in <![CDATA[ .. ]]> blocks.

And you also need to clear the invalid characters:

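/**
 * Replaces characters that are not allowed in XML 1.0 with a space.
 *
 * Works byte by byte, so in practice it strips ASCII control characters
 * (such as 0x0) and passes the bytes of multibyte sequences through.
 *
 * @param string $value
 * @return string
 */
function stripInvalidXml($value)
{
    $ret = "";
    // Compare explicitly: empty() would also discard the string "0".
    if ($value === null || $value === "")
    {
        return $ret;
    }

    $length = strlen($value);
    for ($i = 0; $i < $length; $i++)
    {
        // Use $value[$i]; the $value{$i} syntax was removed in PHP 8.
        $current = ord($value[$i]);
        // ord() returns 0-255, so the ranges above 0xFF never match here;
        // they are kept to mirror the character ranges XML 1.0 allows.
        if (($current == 0x9) ||
            ($current == 0xA) ||
            ($current == 0xD) ||
            (($current >= 0x20) && ($current <= 0xD7FF)) ||
            (($current >= 0xE000) && ($current <= 0xFFFD)) ||
            (($current >= 0x10000) && ($current <= 0x10FFFF)))
        {
            $ret .= chr($current);
        }
        else
        {
            $ret .= " ";
        }
    }
    return $ret;
}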
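
As a rough sketch of how the cleanup could be wired into parsing: the helper name loadCleanXml and the sample document below are made up for illustration, and a byte-wise regular expression stands in for the loop above (it targets the same ASCII control bytes). It assumes the SimpleXML extension is enabled.

```php
<?php
// Hypothetical helper: strip control bytes that XML 1.0 forbids, then parse.
function loadCleanXml(string $raw)
{
    // Replace 0x00-0x08, 0x0B, 0x0C and 0x0E-0x1F with a space.
    // No /u flag: the pattern works on raw bytes, like the loop above.
    $clean = preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', ' ', $raw);

    // Collect parser errors instead of emitting PHP warnings.
    libxml_use_internal_errors(true);
    $xml = simplexml_load_string($clean);
    if ($xml === false) {
        foreach (libxml_get_errors() as $error) {
            error_log(trim($error->message) . ' on line ' . $error->line);
        }
        libxml_clear_errors();
    }
    return $xml;
}

// Sample document with an embedded NUL byte (made up for illustration).
$raw = "<items><item>foo\x00bar</item></items>";
$xml = loadCleanXml($raw);
echo $xml === false ? "parse failed\n" : (string) $xml->item . "\n";
```

For a real file the raw string would come from something like file_get_contents(), which holds the whole document in memory; with a file over 100 MB that may itself be a concern.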
Sunday, September 25, 2022
2

If the text comes from a database, the column may not be stored as UTF-8; try converting it with iconv.
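
A minimal sketch of such a conversion, assuming the column actually holds ISO-8859-1 (Latin-1) bytes; the real source encoding has to be checked against the database, and the sample value is made up.

```php
<?php
// Assumption: the value read from the database is ISO-8859-1 encoded.
$fromDb = "caf\xE9";                          // "café" as Latin-1 bytes

// Convert to UTF-8 before writing the value into the XML document.
$utf8 = iconv('ISO-8859-1', 'UTF-8', $fromDb);

// preg_match with the /u flag fails on invalid UTF-8, so 1 means "valid".
$isValidUtf8 = preg_match('//u', $utf8) === 1;
var_dump($isValidUtf8);
```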

Wednesday, December 14, 2022
 
keilly
 
2

Yes, you can use SimpleXML with XPath in this case:

$xml = simplexml_load_file('path/to/xml/file.xml');
$name = 'Tipul licentei';
$product_code = '70-14UF44-00';
$products = $xml->xpath("//Product/ProductCode[contains(text(), '$product_code')]/following-sibling::AttrList/element[@Name='$name']");
if(count($products) > 0) { // if found

    $value = (string) $products[0]->attributes()->Value;
    echo $value; // Full Package

}

Also possible with DOMDocument:

$dom = new DOMDocument();
$dom->load('path/to/xml/file.xml');
$xpath = new DOMXpath($dom);

$name = 'Tipul licentei';
$product_code = '70-14UF44-00';
$value = $xpath->evaluate("string(//Product/ProductCode[contains(text(), '$product_code')]/following-sibling::AttrList/element[@Name='$name']/@Value)");
echo $value; // Full Package
Sunday, September 18, 2022
4

The proper way to escape INI values is to enclose them in "double quotes". If your string doesn't contain double quotes, you can use it as a value as long as it is enclosed in double quotes.

Escaping single quotes with a backslash seems to work as long as there are not two consecutive double quotes in the value, as per http://php.net/manual/en/function.parse-ini-file.php#100046

If you want to do your own escaping, you certainly can:

htmlspecialchars / htmlspecialchars_decode escapes <, >, & and ".

htmlentities / html_entity_decode will escape very aggressively (but also very safely) to HTML entities.

urlencode / urldecode will escape all non-alphanumeric characters except -_. (spaces become +).

base64_encode / base64_decode will ensure the encoded string contains only alphanumeric characters and +=/. This might be optimal for encoding binary data but doesn't preserve readability.
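
As an illustration of the base64 option, here is a small round trip through parse_ini_string. The section name, key name and sample string are made up for the example.

```php
<?php
// A value containing characters that are awkward in INI files.
$raw = "He said \"hi\"; it's 100% <ok> = fine";

// Encode before writing: base64 output uses only A-Z, a-z, 0-9, +, / and =,
// so it is safe inside a double-quoted INI value.
$ini = "[demo]\nvalue = \"" . base64_encode($raw) . "\"\n";

// Read it back and decode.
$parsed  = parse_ini_string($ini, true);
$decoded = base64_decode($parsed['demo']['value']);

var_dump($decoded === $raw);
```

The trade-off is the one noted above: the stored value is no longer human-readable in the INI file.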

Tuesday, August 30, 2022
3

Make sure you send the XML with an appropriate content-type and encoding, e.g.

<?php header("Content-Type: application/xml; charset=utf-8"); ?>

Also check that the XML prolog contains the proper encoding, e.g.

<?xml version="1.0" encoding="UTF-8"?>

The message about the style information refers to a missing processing instruction, e.g.

<?xml-stylesheet type="text/xsl" href="something"?>

which would tell the browser how to style/format the output. It is not necessarily needed if you just want to display the raw XML.

Friday, September 30, 2022
 