
I work on a somewhat large web application, and the backend is mostly in PHP. There are several places in the code where I need to complete some task, but I don't want to make the user wait for the result. For example, when creating a new account, I need to send the user a welcome email. But when they hit the 'Finish Registration' button, I don't want to make them wait until the email is actually sent; I just want to start the process and return a message to the user right away.

Up until now, in some places I've been using what feels like a hack with exec(). Basically doing things like:

exec("doTask.php $arg1 $arg2 $arg3 >/dev/null 2>&1 &");

Which appears to work, but I'm wondering if there's a better way. I'm considering writing a system which queues up tasks in a MySQL table, and a separate long-running PHP script that queries that table once a second, and executes any new tasks it finds. This would also have the advantage of letting me split the tasks among several worker machines in the future if I needed to.

Am I re-inventing the wheel? Is there a better solution than the exec() hack or the MySQL queue?

 Answers

2

I've used the queuing approach, and it works well: you can defer that processing until your server is idle, which lets you manage load quite effectively if you can easily partition off "tasks which aren't urgent".
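For illustration, here is a minimal sketch of that table-backed queue. It is written in Python rather than PHP, and it uses the standard-library sqlite3 module as a stand-in for MySQL so that it runs on its own. The table layout, the status values, and the function names (open_queue, enqueue, run_pending, worker_loop) are all assumptions made up for this example, not something from the question.

```python
import sqlite3
import time


def open_queue(path=":memory:"):
    """Create the (hypothetical) jobs table if it does not exist yet."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " task TEXT NOT NULL,"
        " payload TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )
    db.commit()
    return db


def enqueue(db, task, payload):
    """Called from the web request: record the job and return right away."""
    cur = db.execute(
        "INSERT INTO jobs (task, payload) VALUES (?, ?)", (task, payload)
    )
    db.commit()
    return cur.lastrowid


def run_pending(db, handlers):
    """One polling pass: run every pending job, oldest first."""
    rows = db.execute(
        "SELECT id, task, payload FROM jobs WHERE status = 'pending' ORDER BY id"
    ).fetchall()
    for job_id, task, payload in rows:
        try:
            handlers[task](payload)  # an unknown task name raises KeyError
            status = "done"
        except Exception:
            status = "failed"
        db.execute("UPDATE jobs SET status = ? WHERE id = ?", (status, job_id))
        db.commit()
    return len(rows)


def worker_loop(db, handlers, poll_seconds=1.0):
    """Long-running worker process: poll the table forever."""
    while True:
        run_pending(db, handlers)
        time.sleep(poll_seconds)
```

The web request would only call enqueue, while a separate long-running process runs worker_loop. With several workers on one table, each worker would also need to claim a row atomically before running it, which this sketch leaves out.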

Rolling your own isn't too tricky, but here are a few other options to check out:

  • Gearman - this answer was written in 2009, and since then Gearman looks like a popular option; see comments below.
  • ActiveMQ if you want a full blown open source message queue.
  • ZeroMQ - this is a pretty cool socket library which makes it easy to write distributed code without having to worry too much about the socket programming itself. You could use it for message queuing on a single host - you would simply have your webapp push something to a queue that a continuously running console app would consume at the next suitable opportunity
  • beanstalkd - only found this one while writing this answer, but looks interesting
  • dropr is a PHP based message queue project, but hasn't been actively maintained since Sep 2010
  • php-enqueue is a recently (2017) maintained wrapper around a variety of queue systems
  • Finally, a blog post about using memcached for message queuing

Another, perhaps simpler, approach is to use ignore_user_abort - once you've sent the page to the user, you can do your final processing without fear of premature termination, though this does have the effect of appearing to prolong the page load from the user perspective.

Sunday, October 23, 2022
2

Task.Run is a shorthand for Task.Factory.StartNew with specific safe arguments:

Task.Factory.StartNew(
    action, 
    CancellationToken.None, 
    TaskCreationOptions.DenyChildAttach, 
    TaskScheduler.Default);

It was added in .NET 4.5 to help with the increasingly frequent usage of async and offloading work to the ThreadPool.

Task.Factory.StartNew (added with TPL in .NET 4.0) is much more flexible. You should only use it if Task.Run isn't enough, for example when you want to use TaskCreationOptions.LongRunning (though that's unnecessary when the delegate is async; more on that on my blog: LongRunning Is Useless For Task.Run With async-await). More on Task.Factory.StartNew in Task.Run vs Task.Factory.StartNew

Don't ever create a Task and call Start() unless you find an extremely good reason to do so. It should only be used if you have some part that needs to create tasks but not schedule them and another part that schedules without creating. That's almost never an appropriate solution and could be dangerous. More in "Task.Factory.StartNew" vs "new Task(...).Start"

In conclusion, mostly use Task.Run, use Task.Factory.StartNew if you must and never use Start.

Thursday, September 1, 2022
5

Well, if you're on Linux, you can use pcntl_fork to fork children off. The "master" then watches the children. Each child completes its task and then exits normally.

Personally, in my implementations I've never needed a message queue. I simply used an array in the "master" with locks. When a child got a job, it would write a lock file with the job id number. The master would then wait until that child exited. If the lock file still existed after the child exited, I knew the task hadn't been completed, and re-launched a child with the same job (after removing the lock file).

Depending on your situation, you could implement the queue in a simple database table. Insert jobs into the table, and check the table in the master every 30 or 60 seconds for new jobs. Then delete them from the table only once the child has finished (and has removed the lock file). This would have issues if you had more than one "master" running at a time, but you could implement a global "master pid file" to detect and prevent multiple instances...
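The lock-file scheme can be sketched roughly as follows. This is Python's os.fork rather than PHP's pcntl_fork, it only runs on Unix-like systems, and the function name run_job_with_retry, the lock file naming, and the max_attempts limit are assumptions added for the example.

```python
import os


def run_job_with_retry(job_id, work, lock_dir, max_attempts=3):
    """Fork one child per attempt; a leftover lock file means the job failed.

    Returns the attempt number that succeeded, or None after max_attempts.
    """
    lock_path = os.path.join(lock_dir, "job-%s.lock" % job_id)
    for attempt in range(1, max_attempts + 1):
        pid = os.fork()
        if pid == 0:
            # Child: write the lock file, do the work, and remove the lock
            # only if the work finished without raising.
            exit_code = 1
            try:
                with open(lock_path, "w") as fh:
                    fh.write(str(job_id))
                work(job_id)
                os.remove(lock_path)
                exit_code = 0
            finally:
                # Always leave via _exit so the child never runs master code.
                os._exit(exit_code)
        # Master: wait until this child has exited.
        os.waitpid(pid, 0)
        if not os.path.exists(lock_path):
            return attempt  # lock is gone, so the child completed the job
        os.remove(lock_path)  # clean up before re-launching the same job
    return None
```

A sturdier master would probably also look at the child's exit status, since a child that dies before it manages to write the lock file looks like a success here.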

And I would not suggest forking with FastCGI. It can result in some very obscure problems since the environment is meant to persist. Instead, use CGI if you must have a web interface, but ideally use a CLI app (a daemon). To interface with the master from other processes, you can either use sockets for TCP communication, or create a FIFO file for communication.

As for detecting hung workers, you could implement a "heart-beat" system, where the child sends SIGUSR1 to the master process every so many seconds. Then if you haven't heard from the child in two or three times that interval, it may be hung. But the thing is, since PHP isn't multi-threaded, you can't tell if a child is hung or if it's just waiting on a blocking resource (like a database call)... As for implementing the "heart-beat", you could use a tick function to automate it (but keep in mind that the tick function still won't run while the child is inside a blocking call)...
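The master-side bookkeeping for that rule might look something like the sketch below (Python again, for consistency with the other examples here). It only tracks when each child was last heard from; how the beat is delivered (a SIGUSR1 handler, a socket, a FIFO) is left out, and the class name HeartbeatMonitor and its parameters are made up for the example.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat per child and flags children that went quiet."""

    def __init__(self, interval, missed_allowed=3):
        # A child is suspect once it has been silent for more than
        # missed_allowed heartbeat intervals.
        self.timeout = interval * missed_allowed
        self.last_seen = {}

    def register(self, pid, now=None):
        """Record that pid was heard from at time now (defaults to the clock)."""
        self.last_seen[pid] = time.monotonic() if now is None else now

    # A heartbeat simply refreshes the timestamp.
    beat = register

    def suspects(self, now=None):
        """Return the pids that have been silent for longer than the timeout."""
        now = time.monotonic() if now is None else now
        return sorted(
            pid for pid, seen in self.last_seen.items()
            if now - seen > self.timeout
        )
```

As noted above, a pid showing up in suspects only means it has been quiet: the child may be hung, or it may just be stuck in a long blocking call.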

Saturday, September 3, 2022
 
2

It should be clear, but the language specification never said that mismatched return types would have any effect during overload resolution. Because of that, there was no rule that said to prefer Action over Func<Task>. If Action were picked, sure, it would work. If Func<Task> were picked, then sure, you'd get an error. But to pick either, overload resolution has to succeed, and it isn't taking this into account.

This is supposed to be fixed with new overload resolution in C# 7.3.

Thursday, November 3, 2022
 
 
4

Yes, you would want to combine boost::bind and boost::function; it's very powerful stuff.

This version now compiles, thanks to Slava!

#include <boost/function.hpp>
#include <boost/bind.hpp>
#include <iostream>
#include <vector>

class CClass1
{
public:
    void AMethod(int i, float f) { std::cout << "CClass1::AMethod(" << i <<");\n"; }
};

class CClass2
{
public:
    void AnotherMethod(int i) { std::cout << "CClass2::AnotherMethod(" << i <<");\n"; }
};

int main() {
    boost::function< void (int) > method1, method2;
    CClass1 class1instance;
    CClass2 class2instance;
    method1 = boost::bind(&CClass1::AMethod, class1instance, _1, 6.0) ;
    method2 = boost::bind(&CClass2::AnotherMethod, class2instance, _1) ;

    // does class1instance.AMethod(5, 6.0)
    method1(5);

    // does class2instance.AnotherMethod(5)
    method2(5);


    // stored in a vector of functions...
    std::vector< boost::function<void(int)> > functionVec;
    functionVec.push_back(method1);
    functionVec.push_back(method2);

    for (std::size_t i = 0; i < functionVec.size(); ++i)
    {
        functionVec[i](5);
    }
    return 0;
}
Wednesday, November 16, 2022