Data consistency with findFirst then ->create()

Question

slothstronaut Sep '17

Created Sep '17	Last Reply Sep '17	Replies 2	Views 415	Votes 0

I'm consistently having an issue in a production environment where I'm doing a query to see if the row exists, if so update, if not insert.

Here is the code

$hasExistingVisit = \Model\UserVisit::findFirst(array(
    'fromUserId = :fromUserId: AND toUserId = :toUserId:',
    "bind" => array(
        'fromUserId' => $this->currentUser->userId,
        'toUserId' => $this->profileUser->userId,
    ),
));

if($hasExistingVisit) {
    $hasExistingVisit->visitCount++;
    $hasExistingVisit->seen = 0;
    $hasExistingVisit->lastVisit = new \Phalcon\Db\RawValue('NOW()');               
    $hasExistingVisit->save();              
} else {
    $newVisit = new \Model\UserVisit;
    $newVisit->fromUserId = $this->currentUser->userId;
    $newVisit->toUserId = $this->profileUser->userId;
    $newVisit->lastVisit = new \Phalcon\Db\RawValue('NOW()');
    $newVisit->create();
}

About .001% of the time it will get this error:

SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry '977070-935378' for key 'PRIMARY'

I've tried everything, including using $hasExistingVisit !== false.

This is probably a symptom of another problem, but I'm not calling it twice, and even if it were a race condition, the insert runs directly after the query, so it would be unlikely to trigger an error.

Any help/thoughts would be appreciated, thank you!

Wojciech Ślawski · Answer 1 · 2017-09-24T12:47:00-07:00

Simply put: if someone refreshes twice before the first request has added its row to the database, both requests will find no existing visit and take the insert branch. By the time the second insert actually runs, the row can already be there, which is why you get the error. For this case you should simply put the existing-visit check in some kind of cache, to be honest.
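
To make that interleaving concrete, here is a minimal sketch that replays the two requests step by step. The array is only a hypothetical stand-in for the table, and insertRow() is an illustrative helper, not Phalcon code:

```php
<?php
declare(strict_types=1);

// Illustrative helper: behaves like an INSERT into a table with a primary key.
function insertRow(array &$table, string $key): void
{
    if (isset($table[$key])) {
        throw new RuntimeException("Duplicate entry '$key' for key 'PRIMARY'");
    }
    $table[$key] = ['visitCount' => 1]; // starting count of 1 is an assumption
}

$table = [];             // stand-in for the table, keyed by "fromUserId-toUserId"
$key = '977070-935378';

// Step 1: both requests run their findFirst() before either has inserted.
$requestAFoundRow = isset($table[$key]);
$requestBFoundRow = isset($table[$key]);

// Step 2: each request acts on what it saw earlier. By the time request B
// inserts, its answer from step 1 is stale.
$error = null;
foreach ([$requestAFoundRow, $requestBFoundRow] as $foundRow) {
    if (!$foundRow) {
        try {
            insertRow($table, $key);
        } catch (RuntimeException $e) {
            $error = $e->getMessage();
        }
    }
}

echo $error, PHP_EOL;
```

The second insert fails even though each request did its check "directly before" inserting, because nothing ties the check and the insert together.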

Dylan Anderson · Answer 2 · 2017-09-25T13:52:28-07:00

What about locking beforehand?

Alternatively you could go down to raw SQL and do a REPLACE, an INSERT IGNORE, or an INSERT ... ON DUPLICATE KEY UPDATE. This page explains the differences: https://chartio.com/resources/tutorials/how-to-insert-if-row-does-not-exist-upsert-in-mysql/

If it were me, I'd do an INSERT IGNORE. In this case, the data is only going to be off by one, and you don't really need to record both visits anyway.
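
If you do want to keep the count exact, a rough sketch of the INSERT ... ON DUPLICATE KEY UPDATE variant with plain PDO could look like this. The table name user_visit, the constant, and the function name are assumptions (the thread only shows the model class). It also assumes the primary key is (fromUserId, toUserId), as the error message suggests, and that visitCount and seen have column defaults:

```php
<?php
declare(strict_types=1);

// Assumed table name; the columns are the ones used in the question.
// The insert and the "row already exists" handling are a single statement,
// so there is no separate check whose result can go stale.
const VISIT_UPSERT_SQL = <<<'SQL'
INSERT INTO user_visit (fromUserId, toUserId, lastVisit)
VALUES (:fromUserId, :toUserId, NOW())
ON DUPLICATE KEY UPDATE
    visitCount = visitCount + 1,
    seen = 0,
    lastVisit = NOW()
SQL;

// Hypothetical helper. Assumes $pdo is a MySQL connection that throws
// exceptions on errors (PDO::ERRMODE_EXCEPTION).
function recordVisit(PDO $pdo, int $fromUserId, int $toUserId): void
{
    $stmt = $pdo->prepare(VISIT_UPSERT_SQL);
    $stmt->execute([
        ':fromUserId' => $fromUserId,
        ':toUserId'   => $toUserId,
    ]);
}
```

You would call it as recordVisit($pdo, $fromUserId, $toUserId) in place of the findFirst / save / create block.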

slothstronaut · Answer 3 · 2020-06-04T11:55:52-07:00

Answering myself after several years: the challenge with this problem was actually an unavoidable race condition. The solution for these kinds of problems is to publish the change to a queue and have a worker run the queries. As long as the worker (the queue consumer) is single-threaded, the queries are guaranteed to run in order and there is no race condition.
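
A minimal sketch of the idea, using in-memory stand-ins only: SplQueue plays the message queue and a plain array plays the table. In production these would be a real queue and the real model; publishVisit() and runWorker() are hypothetical names, and starting visitCount at 1 is an assumption:

```php
<?php
declare(strict_types=1);

// Web requests only publish the event; they never touch the table.
function publishVisit(SplQueue $queue, int $fromUserId, int $toUserId): void
{
    $queue->enqueue(['fromUserId' => $fromUserId, 'toUserId' => $toUserId]);
}

// The single worker handles one job at a time, so the "does it exist?"
// check and the write that follows cannot interleave with another job.
function runWorker(SplQueue $queue, array $visits): array
{
    while (!$queue->isEmpty()) {
        $job = $queue->dequeue();
        $key = $job['fromUserId'] . '-' . $job['toUserId'];

        if (isset($visits[$key])) {
            $visits[$key]['visitCount']++;
            $visits[$key]['seen'] = 0;
        } else {
            $visits[$key] = ['visitCount' => 1, 'seen' => 0];
        }
    }

    return $visits;
}

// Two near-simultaneous visits for the same pair, plus an unrelated one.
$queue = new SplQueue();
publishVisit($queue, 977070, 935378);
publishVisit($queue, 977070, 935378);
publishVisit($queue, 1, 2);

$visits = runWorker($queue, []);
```

The find-then-create logic is the same as in the question; what changes is that only one consumer ever runs it.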