Retry Lost or Failed Tasks (Celery, Django and RabbitMQ)


Is there a way to determine whether a task has been lost, and to retry it?

I think a task could be lost because of a dispatcher bug or a worker thread crash.

I was planning to retry them, but I'm not sure how to determine which tasks need to be retried.

And how can this process be done automatically? Can I use my own custom scheduler to create the new tasks?

Edit: I have learned from the documentation that RabbitMQ never loses messages, but what happens if a worker thread crashes in the middle of task execution?

What you need is to set

CELERY_ACKS_LATE = True

"Late" means that the task message will be acknowledged only after the task has been executed, not just before (which is the default behavior). This way, if the worker crashes, RabbitMQ will still have the message.
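A minimal sketch of where this setting goes, assuming a standalone Celery configuration module (the commented alternatives are the lowercase setting names used by newer Celery versions; whether you need them depends on your Celery version):

```python
# celeryconfig.py -- a sketch, not the answerer's actual configuration.

# Acknowledge the message only AFTER the task finishes, so a message
# held by a crashed worker is redelivered instead of being lost:
CELERY_ACKS_LATE = True

# In Celery 4+ the equivalent lowercase setting is:
# task_acks_late = True
# and, to requeue the message if the worker process dies mid-task:
# task_reject_on_worker_lost = True
```

With acks_late enabled, keep tasks idempotent where possible, since a task interrupted after partial work may be executed again.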

Obviously, there is no way to recover from a total crash (RabbitMQ + worker) at the same time. To handle that, I personally log the task start and the task end (to an independent store), so I can find out what was interrupted by analyzing the Mongo log.

You can do this easily by overriding the __call__ and after_return methods of the Celery base Task class.

Below you can see a piece of my code: the task class uses a TaskLogger context manager (with entry and exit points) that simply writes a record containing the task information into MongoDB.

    def __call__(self, *args, **kwargs):
        """Called when the task is invoked; you can set up some
        environment here before the task runs."""
        # Initialize the context manager (entry point)
        self.taskLogger = TaskLogger(args, kwargs)
        self.taskLogger.__enter__()
        return self.run(*args, **kwargs)

    def after_return(self, status, retval, task_id, args, kwargs, einfo):
        # Exit point for the context manager
        self.taskLogger.__exit__(status, retval, task_id, args, kwargs, einfo)
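For reference, here is a self-contained sketch of what such a TaskLogger might look like. The class name comes from the answer above, but the field names are assumptions, and an in-memory list stands in for the MongoDB collection the answer writes to:

```python
from datetime import datetime, timezone

# Stand-in for the MongoDB collection the answer writes to (an assumption).
LOG = []

class TaskLogger:
    """Records a task's start and end, so that interrupted tasks
    (a start record with no end) can be found later."""

    def __init__(self, args, kwargs):
        self.args, self.kwargs = args, kwargs

    def __enter__(self):
        # Entry point: write a record marking the task as started.
        self.record = {
            "args": self.args,
            "kwargs": self.kwargs,
            "started": datetime.now(timezone.utc),
            "status": "STARTED",
        }
        LOG.append(self.record)
        return self

    def __exit__(self, status, retval=None, task_id=None,
                 args=None, kwargs=None, einfo=None):
        # Exit point: mark the task as finished with its final status.
        self.record["status"] = status
        self.record["retval"] = retval
        self.record["finished"] = datetime.now(timezone.utc)

# Used the way the task class above uses it:
logger = TaskLogger((2, 3), {})
logger.__enter__()
result = 2 + 3          # the task body would run here
logger.__exit__("SUCCESS", retval=result)
```

A task that crashed mid-execution would leave a record whose status is still "STARTED", which is exactly what you would scan for when deciding what to retry.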

I hope this helps.
