Feedback Of Conditional Updates In PostgreSQL

Wednesday, November 20, 2024

Imagine, if you will, a service responsible for dispatching jobs to a pool of workers. The worker reports when a job has started, and will send updated while the job is in progress, and when it’s finished. The dispatcher tracks this in a PostgreSQL database and forwards the completion message upstream. This message should only be sent if the job exists in the database and should only be ever sent once (i.e. duplicates should be ignored).

Everything’s working well, until a bit of load is applied and… Oh oh! You discovered a race condition. The job in-progress report came in after the job finished report. Not to worry: you just change your query to update the job state if it’s not already completed:

UPDATE job
SET status = $1
WHERE job_id = $2 AND status != 'completed';

But what about the completion message? Could you know if PostgreSQL applied the update so as to know whether to send the completion message upstream?

Looking a such a problem today, I found out that PostgreSQL allows one to return the rows involved in the update. This could be return the updated rows as they are:

UPDATE job
SET status = $1
WHERE job_id = $2 AND status != 'completed'
RETURNING *;

But there’s nothing stopping you from returning any expression (so long as it’s not an aggregate function), and simply returning 1 to indicate that an update occurred is totally doable:

UPDATE job
SET status = $1
WHERE job_id = $2 AND status != 'completed'
RETURNING 1 AS was_updated;

The only problem with this is that this will be returned for every updated row. So three updated rows will result in three rows set to 1:

was_updated
-----------
         1
         1
         1

It is possible to wrap the update in an outer query and return a count of the rows, but much like this contrived example, I was only expecting at most one row to be updated at a time. So I simply returned the 1‘s and just counted them in code. If the result set is empty, then I could assume that no update was made.

res := sql.exec(updateSQL, newState, jobID)
if newState == "completed" && len(res) > 0 {
  sendMessage(jobID, "job is done")
} else {
  // job not completed, or it was already completed
}

This RETURNING clause also works for inserts, and I typically use it to return any generated IDs. I guess a similar pattern could be used if the inserts are conditional, or were the result of a SELECT.