9

I'm trying to find a way to do a conditional DELETE on an InnoDB table which contains millions of records, without locking it (thus not bringing the website down).

I've tried to find information on mysql.com, but to no avail. Any tips on how to proceed?

  • Are there a lot of indexes on this table? Commented Feb 13, 2010 at 14:51
  • Index the columns used in the delete condition. Then perform the delete when traffic to the site is low (at night, probably).
    – Sergey
    Commented Feb 13, 2010 at 14:54
  • see also stackoverflow.com/questions/23193761/… which mentions gap locking
    – rogerdpack
    Commented Sep 5, 2016 at 18:08

3 Answers

8

I don't think it is possible to delete without locking. That said, I don't think locking the record you want to delete is a problem. What would be a problem is locking other rows.

I found some information on that subject here: http://dev.mysql.com/doc/refman/5.0/en/innodb-locks-set.html

What I would suggest is to do a million single-row deletes. If you do all of those in a single transaction, performance should not suffer too much. So you would get something like:

START TRANSACTION;

DELETE FROM tab WHERE id = 1;
..
..
DELETE FROM tab WHERE id = x;

COMMIT;

You can generate the required statements by doing something like:

SELECT CONCAT('DELETE FROM tab WHERE id = ', id)
FROM   tab
WHERE  <some intricate condition that selects the set you want to delete>

So the advantage of this method over simply doing:

DELETE FROM tab 
WHERE  <some intricate condition that selects the set you want to delete>

is that in the first approach you only ever lock the records you are actually deleting, whereas in the second approach you run the risk of locking other records that happen to be in the same range as the rows you are deleting.
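For example, a minimal sketch of generating such a script file, assuming the MySQL server is allowed to write to /tmp (the account needs the FILE privilege) and that id is the primary key; the path and database name below are just illustrations:

-- write one single-row DELETE statement per matching id to a file
SELECT CONCAT('DELETE FROM tab WHERE id = ', id, ';')
FROM   tab
WHERE  <some intricate condition that selects the set you want to delete>
INTO OUTFILE '/tmp/deletes.sql';

You can then wrap the file's contents in START TRANSACTION; ... COMMIT; and feed it back to the server, e.g. mysql your_db < /tmp/deletes.sql.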

  • +1, promising solution! Have you ever used this approach in practice?
    – Frunsi
    Commented Feb 13, 2010 at 15:44
  • @frunsi: I honestly admit that I have not. Typically, I don't have much of an issue with deleting a million rows, but then again the systems I manage are not being hit by many users at the same time. Commented Feb 13, 2010 at 15:58
5

If it fits your application, you could limit the number of rows to delete and set up a cron job to repeat the deletion. E.g.:

DELETE FROM tab WHERE .. LIMIT 1000

I found this to be a good compromise in a similar scenario.
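If you would rather do all the batches in one go (for example from a single cron invocation) while still keeping each transaction small, here is a minimal sketch of a loop that repeats the limited DELETE until nothing is left; the table name, column, cutoff and batch size are placeholders:

DELIMITER //
CREATE PROCEDURE purge_in_batches()
BEGIN
  DECLARE affected INT DEFAULT 1;
  WHILE affected > 0 DO
    -- with autocommit on, each batch commits (and releases its row locks)
    -- before the next batch starts
    DELETE FROM tab WHERE created_time < '2018-01-01 00:00:00' LIMIT 1000;
    SET affected = ROW_COUNT();
  END WHILE;
END //
DELIMITER ;

Each iteration then holds its locks only for the duration of one small batch.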

  • I think the problem with this approach is that the LIMIT is evaluated after isolating the set. So the WHERE is applied, which could lock rows that happen to be in the same range as the rows you are deleting. The fact that you then only delete a limited number does not change the fact that the records have already been locked. What this method does do is keep the transaction size small, which is also good. But it does not prevent spurious locking as far as I understand. Commented Feb 13, 2010 at 15:40
  • @Roland: You could be right. This solution may not be ideal, but it works fine in practice (the limited delete is fast, so the locks are held for very short time spans, and the limit parameter can be tuned to the real-life data). I did some benchmarks before using this approach, and the accumulated time of all the limited deletes was less than a single non-limited one (but at that time it was MySQL 3.x and MyISAM tables). It still works, but a benchmark on current MySQL with InnoDB tables would shed some light on the details.
    – Frunsi
    Commented Feb 13, 2010 at 15:49
  • 1
    yes I think it's safe to say that the scenario you describe, a table locking MyISAM engine on a more than 9 yrs old version of the product as compared to the InnoDB engine wich features both multi-version concurrency control and row-level locking, might indeed be somewhat different :p Commented Feb 13, 2010 at 15:57
1

I use a stored procedure to do the delete:

DELIMITER //

create procedure delete_last_year_data()
begin
  DECLARE del_row varchar(255);
  DECLARE done INT DEFAULT 0;

  -- cursor that yields one single-row DELETE statement per matching row
  declare del_rows cursor for select CONCAT('DELETE FROM table_name WHERE id = ', id)
                              from table_name
                              where created_time < '2018-01-01 00:00:00';
  -- set done = 1 once the cursor is exhausted
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

  open del_rows;

  repeat
    fetch del_rows into del_row;
    if not done
    then
      -- run each generated DELETE through a prepared statement
      set @del = del_row;
      prepare stmt from @del;
      execute stmt;
      DEALLOCATE PREPARE stmt;
    end if;
  until done end repeat;

  close del_rows;

end //

DELIMITER ;
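With the procedure created as above, you would then run it with:

CALL delete_last_year_data();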
