Asked  2 Years ago    Answers:  5   Viewed   409 times

I am having a little trouble parsing XML from a Google Checkout response. The XML comes straight from Google's server, so there is no problem with the XML itself.

I want to get hold of all the new-order-notification tags.

I tried the following, but each query returns an empty array every time.

$xml = new SimpleXmlElement($raw_xml);
$notifications = $xml->xpath('notifications');
$notifications = $xml->xpath('/notification-history-response/notifications/new-order-notification');
$notifications = $xml->xpath('//new-order-notification');

An XML snippet (just the beginning):

<notification-history-response xmlns="http://checkout.google.com/schema/2" serial-number="c5cda190-0eb1-4f91-87cd-e656e5598d38">
  <notifications>
    <new-order-notification serial-number="271578974677716-00001-7">
      <buyer-billing-address>
        <address1>19 sandbox st</address1>
        <address2></address2>

 Answers

2

The issue is likely the default namespace. See

  • SimpleXMLElement::registerXPathNamespace
    Creates a prefix/ns context for the next XPath query

Example:

$sxe->registerXPathNamespace('x', 'http://checkout.google.com/schema/2');
$result = $sxe->xpath('//x:notifications');

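Alternatively, if the document uses no other namespaces, you can simply strip the default namespace declaration:

$raw_xml = str_replace('xmlns="http://checkout.google.com/schema/2"', '', $raw_xml);

Do this before feeding the XML to your SimpleXmlElement. Note that str_replace returns the modified string, so the result has to be assigned.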
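
To make the namespace point concrete, here is a minimal sketch run against a trimmed-down copy of the XML from the question. The closing tags are assumed (the question only shows the beginning of the document), and the prefix "gc" is an arbitrary choice for illustration; any prefix works as long as it is bound to the same namespace URI.

```php
<?php
// Trimmed-down stand-in for the Google Checkout response; closing tags are assumed.
$raw_xml = <<<XML
<notification-history-response xmlns="http://checkout.google.com/schema/2" serial-number="c5cda190-0eb1-4f91-87cd-e656e5598d38">
  <notifications>
    <new-order-notification serial-number="271578974677716-00001-7">
      <buyer-billing-address>
        <address1>19 sandbox st</address1>
      </buyer-billing-address>
    </new-order-notification>
  </notifications>
</notification-history-response>
XML;

$xml = new SimpleXMLElement($raw_xml);

// In XPath 1.0 an unprefixed name only matches elements in no namespace,
// so this query finds nothing even though the element is clearly there.
$unprefixed = $xml->xpath('//new-order-notification');

// Bind a prefix of our own choosing to the document's default namespace URI.
$xml->registerXPathNamespace('gc', 'http://checkout.google.com/schema/2');
$notifications = $xml->xpath('//gc:new-order-notification');

foreach ($notifications as $notification) {
    // Attributes without a prefix are in no namespace, so array access works directly.
    echo $notification['serial-number'], PHP_EOL;
}
```

The prefix only exists inside the XPath expression; the XML itself does not need to use it.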

Sunday, September 4, 2022
2

All you need is

$data = new SimpleXMLElement($xml);
$data->registerXPathNamespace('ns1','http://endpoint.websitecom/');
$part = $data->xpath("//ns1:return");
var_dump($part[0]->children("ns1",true));

Output

object(SimpleXMLElement)[3]
  public 'campaignID' => string '0' (length=1)
  public 'categoryID' => string '200230455' (length=9)
  public 'categoryName' => string 'Promotion' (length=9)
  public 'linkID' => string '10001599' (length=8)
  public 'linkName' => string 'KFL-20% off No Min' (length=18)
  public 'mid' => string '3071' (length=4)
  public 'nid' => string '1' (length=1)
  public 'clickURL' => string '
            http://someurl
        ' (length=36)
  public 'endDate' => string 'Oct 15, 2012' (length=12)
  public 'height' => string '250' (length=3)
  public 'iconURL' => string '
            http://someurl
        ' (length=36)
  public 'imgURL' => string '
            http://someurl
        ' (length=36)
  public 'landURL' => string '
            http://someurl
        ' (length=36)
  public 'serverType' => string '22' (length=2)
  public 'showURL' => string '
            http://someurl
        ' (length=36)
  public 'size' => string '13' (length=2)
  public 'startDate' => string 'Oct 14, 2012' (length=12)
  public 'width' => string '300' (length=3)
Sunday, November 13, 2022
 
3

You've been fooled (and had me fooled) by the oldest trick in the SimpleXML book: SimpleXML doesn't parse the whole document into a PHP object, it presents a PHP API to an internal structure. Functions like var_dump can't see this structure, so don't always give a useful idea of what's in the object.

The reason it looks "empty" is that it is listing the children of the root element which are in the default namespace - but there aren't any, they're all in the "soapenv:" namespace.

To access namespaced elements, you need to use the children() method, passing in the full namespace name (recommended) or its local prefix (simpler, but could be broken by changes in the way the file is generated the other end). To switch back to the "default namespace", use ->children(null).

So you could get the ID attribute of the first stationV2 element like this (live demo):

// Define a constant for the namespace URI rather than relying on the remote service keeping its prefix stable
define('NS_SOAP', 'http://schemas.xmlsoap.org/soap/envelope/');

// Download the XML
$rawxml = file_get_contents("http://opendap.co-ops.nos.noaa.gov/axis/webservices/activestations/response.jsp?v=2&format=xml&Submit=Submit");
// Parse it
$ob = simplexml_load_string($rawxml);

// Use it!
echo $ob->children(NS_SOAP)->Body->children(null)->ActiveStationsV2->stationsV2->stationV2[0]['ID'];
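
The same navigation can be tried without a network call. The sketch below uses a hypothetical, heavily trimmed envelope: the element names follow the access path above, but the real response is not reproduced here and the ID values are made up. It assumes the payload elements carry no prefix.

```php
<?php
// Namespace URI of SOAP 1.1 envelopes; the prefix may vary, the URI does not.
define('NS_SOAP', 'http://schemas.xmlsoap.org/soap/envelope/');

// Hypothetical stand-in for the service response; IDs are invented for illustration.
$rawxml = <<<XML
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
  <soapenv:Body>
    <ActiveStationsV2>
      <stationsV2>
        <stationV2 ID="example-1"/>
        <stationV2 ID="example-2"/>
      </stationsV2>
    </ActiveStationsV2>
  </soapenv:Body>
</soapenv:Envelope>
XML;

$ob = simplexml_load_string($rawxml);

// Plain property access only sees unprefixed children, so Body appears to be missing.
$bodyVisible = isset($ob->Body);

// Select the SOAP namespace to reach Body, then switch back to unprefixed elements.
$stations = $ob->children(NS_SOAP)->Body->children(null)->ActiveStationsV2->stationsV2;
$firstId = (string) $stations->stationV2[0]['ID'];

echo $firstId, PHP_EOL;
```

Here $bodyVisible is false even though the Body element exists, which is the same effect that makes the object look "empty" in var_dump.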

I've written some debugging functions to use with SimpleXML which should be much less misleading than var_dump etc. Here's a live demo with your code and simplexml_dump.

Monday, August 1, 2022
 
2

The ArrayAdapter tries to display your Location objects as strings by calling Object.toString(), which is what produces the hex values. Its default implementation returns:

[...] a string consisting of the name of the class of which the object is an instance, the at-sign character `@', and the unsigned hexadecimal representation of the hash code of the object.

To make the ArrayAdapter show something useful in the item list, you can override the toString() method to return something meaningful:

@Override
public String toString(){
  return "Something meaningful here...";
}

Another way to do this is to extend BaseAdapter and implement SpinnerAdapter to create your own adapter, one that knows the elements in your ArrayList are objects and how to use their properties.

[Revised] Implementation Example

I was playing around a bit and I managed to get something to work:

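import android.app.Activity;
import android.graphics.Color;
import android.os.Bundle;
import android.view.View;
import android.view.ViewGroup;
import android.widget.AbsListView;
import android.widget.AdapterView;
import android.widget.BaseAdapter;
import android.widget.Spinner;
import android.widget.SpinnerAdapter;
import android.widget.TextView;
import android.widget.Toast;

import java.util.ArrayList;
import java.util.List;

public class Main extends Activity {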

    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        // Create and display a Spinner:
        Spinner s = new Spinner(this);
        AbsListView.LayoutParams params = new AbsListView.LayoutParams(
                ViewGroup.LayoutParams.FILL_PARENT, ViewGroup.LayoutParams.WRAP_CONTENT
        );
        this.setContentView(s, params);
        // fill the ArrayList:
        List<Guy> guys = new ArrayList<Guy>();
        guys.add(new Guy("Lukas", 18));
        guys.add(new Guy("Steve", 20));
        guys.add(new Guy("Forest", 50));
        MyAdapter adapter = new MyAdapter(guys);
        // apply the Adapter:
        s.setAdapter(adapter);
        // React to selection changes with an OnItemSelectedListener:
        s.setOnItemSelectedListener(new AdapterView.OnItemSelectedListener() {
            /**
             * Called when a new item was selected (in the Spinner)
             */
            public void onItemSelected(AdapterView<?> parent,
                                       View view, int pos, long id) {
                Guy g = (Guy) parent.getItemAtPosition(pos);
                Toast.makeText(
                        getApplicationContext(),
                        g.getName()+" is "+g.getAge()+" years old.",
                        Toast.LENGTH_LONG
                ).show();
            }

            public void onNothingSelected(AdapterView<?> parent) {
                // Do nothing.
            }
        });
    }

    /**
     * This is your own Adapter implementation which displays
     * the ArrayList of "Guy"-Objects.
     */
    private class MyAdapter extends BaseAdapter implements SpinnerAdapter {

        /**
         * The internal data (the ArrayList with the Objects).
         */
        private final List<Guy> data;

        public MyAdapter(List<Guy> data){
            this.data = data;
        }

        /**
         * Returns the Size of the ArrayList
         */
        @Override
        public int getCount() {
            return data.size();
        }

        /**
         * Returns one Element of the ArrayList
         * at the specified position.
         */
        @Override
        public Object getItem(int position) {
            return data.get(position);
        }

        @Override
        public long getItemId(int i) {
            return i;
        }
        /**
         * Returns the View that is shown when a element was
         * selected.
         */
        @Override
        public View getView(int position, View recycle, ViewGroup parent) {
            TextView text;
            if (recycle != null){
                // Re-use the recycled view here!
                text = (TextView) recycle;
            } else {
                // No recycled view, inflate the "original" from the platform:
                text = (TextView) getLayoutInflater().inflate(
                        android.R.layout.simple_dropdown_item_1line, parent, false
                );
            }
            text.setTextColor(Color.BLACK);
            text.setText(data.get(position).name);
            return text;
        }


    }

    /**
     * A simple class which holds some information-fields
     * about some Guys.
     */
    private class Guy{
        private final String name;
        private final int age;

        public Guy(String name, int age){
            this.name = name;
            this.age = age;
        }

        public String getName() {
            return name;
        }

        public int getAge() {
            return age;
        }
    }
}

I fully commented the code; if you have any questions, don't hesitate to ask.

Tuesday, December 20, 2022
3

The website actually checks for the User-Agent header.

See what it returns if you don't specify it:

$ scrapy shell 'http://www.bbb.org/central-western-massachusetts/business-reviews/auto-repair-and-service/toms-automotive-in-fitchburg-ma-211787'
In [1]: print(response.body)
Out[1]: 123

In [2]: response.xpath('//*[@id="business-accreditation-content"]/p[2]').extract()
Out[2]: []

Yes, that's right: the site returns a body containing only 123 when the request carries an unexpected user agent.

Now with the header (note the specified -s command-line argument):

$ scrapy shell 'http://www.bbb.org/central-western-massachusetts/business-reviews/auto-repair-and-service/toms-automotive-in-fitchburg-ma-211787' -s USER_AGENT='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.80 Safari/537.36'
In [1]: response.xpath('//*[@id="business-accreditation-content"]/p[2]').extract()
Out[1]: [u'<p itemprop="description">BBB has determined that Tom's Automotive meets <a href="http://www.bbb.org/central-western-massachusetts/for-businesses/about-bbb-accreditation/bbb-code-of-business-practices-bbb-accreditation-standards/" lang="LS30TPCERNY5b60c87311af50cf82720b237d8ef866">BBB accreditation standards</a>, which include a commitment to make a good faith effort to resolve any consumer complaints. BBB Accredited Businesses pay a fee for accreditation review/monitoring and for support of BBB services to the public.</p>']

This was an example from the shell. In a real Scrapy project, you would need to set the USER_AGENT project setting. Or, you may also use user agent rotation with the help of this middleware: scrapy-fake-useragent.

Sunday, October 2, 2022
 
anshu
 