I've been reading up on message queueing lately, and I'd like to implement a simple, extensible system for my app. While there's a lot of good information out there on setting up an MQ system, I can't find much about the actual implementation.

I'm looking for patterns and best practices on how to properly format messages for a queue, and ways to execute the jobs in PHP. Should I use JSON, serialized objects, text, URLs or XML? What information should I send? Is a worker with a switch($job['command']) {} (or something like that) the way to go, or are there any established patterns out there to implement a worker?

Help greatly appreciated!

 Answers

2

You can pick any of the following MQ implementations for PHP, so you don't have to roll your own, and you can look at their source code to learn how they are implemented. For general integration, have a look at the ActiveMQ page on Enterprise Integration Patterns.

  • http://sourceforge.net/projects/beanstalk/

    A PHP Client Library for beanstalkd. BeanStalk allows PHP developers to make use of the beanstalkd in-memory workqueue server (http://xph.us/software/beanstalkd).

  • http://kr.github.com/beanstalkd/

    Beanstalk is a simple, fast workqueue service. Its interface is generic, but was originally designed for reducing the latency of page views in high-volume web applications by running time-consuming tasks asynchronously.

  • http://activemq.apache.org/

    Apache ActiveMQ is the most popular and powerful open source messaging and Integration Patterns provider. Apache ActiveMQ is fast, supports many Cross Language Clients and Protocols, comes with easy to use Enterprise Integration Patterns and many advanced features while fully supporting JMS 1.1 and J2EE 1.4. Apache ActiveMQ is released under the Apache 2.0 License

  • http://memcachedb.org/memcacheq/

    Memcachedb is a distributed key-value storage system designed for persistence. It is not a cache solution, but a persistent storage for high-frequency writing and reading. It conforms to the memcache protocol (not completely), so any memcached client can connect to it. Memcachedb uses Berkeley DB as a storage backend, so many features including transactions and replication are supported.

  • http://www.zend.com/en/products/server/

    Zend Server 5.0 incorporates Job Queue, providing full support for creating, executing and managing jobs to optimize application performance and reduce server load, minimizing application bottlenecks and improving the end-user experience.

  • https://www.dropr.org/

    dropr is a distributed message queue framework written in PHP. The main goals are:

    • reliable and durable (failsafe) messaging over networks
    • decentralized architecture without a single server instance as a point of failure
    • easy to set up and use
    • modularity for queue storage and message transports (currently filesystem storage and curl-upload are implemented)
  • http://gearman.org/

    Gearman provides a generic application framework to farm out work to other machines or processes that are better suited to do the work. It allows you to do work in parallel, to load balance processing, and to call functions between languages.

  • http://www.zeromq.org/

    ØMQ (also spelled ZeroMQ, 0MQ or ZMQ) is a high-performance asynchronous messaging library aimed at use in scalable distributed or concurrent applications. It provides a message queue, but unlike message-oriented middleware, a ØMQ system can run without a dedicated message broker. The library is designed to have a familiar socket-style API.
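
None of the libraries above dictates what goes inside a message or how a worker picks the code to run, which is what the question asks about. Here is a minimal sketch of one common shape, written in Python for brevity (the same idea ports to PHP with json_encode/json_decode and an array of callables): a JSON envelope with a command name and a payload, plus a handler table in place of a growing switch. The field names `command` and `payload` and the `send_email` handler are illustrative assumptions, not part of any library listed here.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical handler registry: command name -> callable taking the payload.
HANDLERS: Dict[str, Callable[[Dict[str, Any]], Any]] = {}

def handler(command: str):
    """Register a function as the handler for one command name."""
    def register(func):
        HANDLERS[command] = func
        return func
    return register

@handler("send_email")
def send_email(payload):
    # Placeholder work; a real handler would talk to a mail service.
    return f"email to {payload['to']}"

def encode_job(command: str, payload: Dict[str, Any]) -> str:
    """Producer side: build the JSON envelope that goes onto the queue."""
    return json.dumps({"command": command, "payload": payload})

def run_job(raw: str):
    """Worker side: decode one message and dispatch it by command name."""
    job = json.loads(raw)
    try:
        func = HANDLERS[job["command"]]
    except KeyError:
        raise ValueError(f"unknown command: {job.get('command')!r}")
    return func(job["payload"])
```

Adding a new job type then means registering one more function rather than editing the worker loop. JSON keeps the message readable by workers written in other languages, which serialized objects would not.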

Saturday, September 3, 2022
4

Using crontab to run asynchronous tasks (asynchronous from your PHP code) is the basic approach; using a job/task queue manager is the more elaborate one, and it gives you more control, power, and scalability/elasticity.

Crontab is very easy to deal with but does not offer much functionality. It is best for scheduled jobs rather than for asynchronous tasks.

On the other hand, deploying a task queue (and its message broker) requires more time. You have to choose the right tools first, then learn how to use them from your PHP code. But this is the way to go in 2011.

Thank God, I don't do PHP, but I have played around with Celery (coupled with RabbitMQ) on Python projects; I am sure you can find something similar in the PHP world.

Thursday, September 15, 2022
 
3

My suggestions basically boil down to: Keep it simple!

With that in mind my first suggestion is to drop the DispatcherWorker. From my current understanding, the sole purpose of the worker is to listen to the MAIN queue and forward messages to the different task queues. Your application should take care of enqueuing the right message onto the right queue (or topic).

Answering your questions:

My workers would be written in PHP. Do they all have to be polling the cloud queue service? That could get expensive, especially when you have a lot of workers.

Yes, there is no free lunch. Of course, you could adapt your worker poll rate to application usage (poll more often when more messages arrive), to the time of day or week (if your users are active at specific times), and so on. Keep in mind that the engineering cost of such tuning might soon exceed the cost of unoptimized polling.
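
As a rough illustration of adapting the poll rate to load, here is a minimal Python sketch: poll at the fastest rate while messages keep arriving, and double the wait each time the queue comes back empty. The `fetch` callable is a hypothetical stand-in for whatever receive call your queue client offers, and the 1 and 30 second bounds are arbitrary assumptions.

```python
import time
from typing import Callable, List, Optional

def next_interval(current: float, got_messages: bool,
                  min_s: float = 1.0, max_s: float = 30.0) -> float:
    """Poll faster while messages arrive, back off while the queue is idle."""
    if got_messages:
        return min_s                  # busy: reset to the fastest rate
    return min(current * 2, max_s)    # idle: double the wait, capped at max_s

def poll_loop(fetch: Callable[[], List[str]],
              handle: Callable[[str], None],
              max_polls: Optional[int] = None,
              sleep: Callable[[float], None] = time.sleep) -> None:
    """Repeatedly call `fetch` (hypothetical queue receive) and handle results."""
    interval = 1.0
    polls = 0
    while max_polls is None or polls < max_polls:
        messages = fetch()
        for message in messages:
            handle(message)
        interval = next_interval(interval, bool(messages))
        sleep(interval)
        polls += 1
```

The `max_polls` and `sleep` parameters exist only so the loop can be exercised without waiting; a real worker would leave them at their defaults and run under a supervisor.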

Instead, you might consider push queues (see below).

I was thinking maybe have 1 worker just for polling the queue, and if there are messages, notify the other workers that they have jobs. I just have to keep this 1 worker online, using supervisord perhaps? Is this polling method better than using an MQ that can notify? How should I poll the MQ, once every second or as fast as it can poll? And then increase the polling workers if I see it slowing down?

This sounds too complicated. Communication between processes is unreliable, whereas reliable message queues already exist. If you don't want to lose data, stick to the message queues and don't invent custom protocols.

I was also thinking of having a single queue for all the messages, then a worker monitoring it that distributes the messages to other cloud MQs depending on where they need to be processed, since 1 message might need to be processed by 2 different workers.

As already mentioned, the application should enqueue your message to multiple queues as needed. This keeps things simple and in place.
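
A minimal Python sketch of that idea, with in-memory deques standing in for the cloud queues: the application keeps a small routing table and puts one copy of the message on every queue that needs it, so no dispatcher worker is required. The queue names, event names, and the `ROUTES` table are made up for illustration.

```python
import json
from collections import deque
from typing import Dict, List

# In-memory stand-ins for the per-task cloud queues (names are made up).
QUEUES: Dict[str, deque] = {"crm": deque(), "analytics": deque()}

# Which queues each event type must reach; the application owns this table.
ROUTES: Dict[str, List[str]] = {
    "user_signed_up": ["crm", "analytics"],
    "page_viewed": ["analytics"],
}

def publish(event: str, payload: dict) -> List[str]:
    """Enqueue one copy of the message on every queue that needs it."""
    body = json.dumps({"event": event, "payload": payload})
    targets = ROUTES[event]
    for name in targets:
        QUEUES[name].append(body)
    return targets
```

With a real queue service, the `append` call would be replaced by that client's send operation; a broker with topics or fan-out exchanges can do the same routing on the server side.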

Would I still need gearman to manage my workers or can I just use supervisord to spin workers up and down?

There are so many message queues and even more ways to use them. In general, if you are using poll queues you'll need to keep your workers alive by yourself. If however you are using push queues, the queue service will call an endpoint specified by you. Thus you'll just need to make sure your workers are available.

Basically how could I check the MQ as efficiently and as effectively as possible?

This depends on your business requirements and the job your workers do. What time spans are critical: seconds, minutes, hours, days? If you use workers to send emails, delivery shouldn't take hours; ideally it takes a couple of seconds. Is there a difference (for the user) between polling every 3 seconds and every 15 seconds?

Solving your problem (with push queues):

My goal is to offload messages / data that need to be sent to multiple third-party APIs, so accessing them doesn't slow down the client. So sending the data to a message queue is ideal. I considered using just Gearman to hold the MQ/jobs, but I wanted to use a cloud queue service like SQS or Rackspace Cloud Queues so I wouldn't have to manage the messages.

Indeed, the scenario you describe is a good fit for message queues. As you mentioned, you don't want to manage the message queue itself; maybe you do not want to manage the workers either? This is where push queues come in.

Push queues basically call your worker. For example, AWS Elastic Beanstalk worker environments do the heavy lifting (polling) in the background and simply call your application with an HTTP request containing the queue message (refer to the docs for details). I have personally used the AWS push queues and have been happy with how easy they are. Note that there are other push queue providers, like Iron.io.
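
To make the push model concrete, here is a minimal sketch of a worker endpoint using only Python's standard library. It assumes the queue service POSTs the message body as JSON and treats a 200 response as "processed"; how other status codes are retried varies by provider, so check its docs. The `command` field, the status codes chosen, and the port are assumptions for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def process(body: bytes) -> int:
    """Handle one pushed message; return the HTTP status to send back."""
    try:
        job = json.loads(body)
    except ValueError:
        return 400            # body is not valid JSON
    if not isinstance(job, dict) or "command" not in job:
        return 400            # valid JSON, but not the expected envelope
    # ... do the actual work for job["command"] here ...
    return 200                # tell the service the message was processed

class PushHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the pushed body, then answer with the resulting status.
        length = int(self.headers.get("Content-Length", 0))
        status = process(self.rfile.read(length))
        self.send_response(status)
        self.end_headers()

def serve(port: int = 8080) -> None:
    """Run the endpoint; the push service must be pointed at this port."""
    HTTPServer(("", port), PushHandler).serve_forever()
```

Keeping the logic in `process` and the HTTP plumbing in `PushHandler` means the job handling can be tested without starting a server.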

As you mentioned you are using PHP, there is the QPush Bundle for Symfony, which handles incoming message requests. You may have a look at the code to roll your own solution.

Saturday, October 29, 2022
 
bryuk
 
5

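You just need double quotes around the literal T. Note also that minutes are MI (MM is the month) and that HH24 gives the 24-hour clock:

to_char(date_updated, 'YYYY-MM-DD"T"HH24:MI:SS')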
Friday, September 9, 2022
3

An anonymous type is like any other class, but it's created by the compiler. What the compiler generates is something like:

class AnonymousType1 {
  public AnonymousType2 Key { get; set; }
  public int ProductCount { get; set; }
}

class AnonymousType2 {
  public int y { get; set; }
  public int m { get; set; }
  public int d { get; set; }
}

Those classes are not accessible to you, so if you want to keep strong typing you have no choice but to use a custom class matching the definition of AnonymousType1. You'll then use it like this: new MyClass { Key = myGroup.Key, ProductCount = myGroup.Count() }.

Monday, November 21, 2022
 