I have a MySQL database with 5,000,000 records in the trades table and 5,000,000 records in the registries table.

Few rows share the same uni value, i.e. the values of uni are almost all distinct. The same holds for the other columns.

  update trades t
  inner join registries r on
    ( t.uni = r.uni and r.nationalid <> '' )
    or ( t.account = r.account and r.nationalid = '' )
  set
    t.registry_id = r.rowid;

This statement takes about 1 hour to complete.

Indices:

create index idx1 on trades (uni);
create index idx2 on trades (account);
create index idx3 on trades (account);
create index idx1 on registries (nationalid, uni);
create index idx2 on registries (account);
create index idx3 on registries (uni);

Which indices or configuration settings should I apply to get the best performance?

1 Answer

Use two separate updates instead, to get rid of the OR operator.
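One way to see what the OR costs is to compare execution plans. The sketch below assumes MySQL 5.6.3 or later, where EXPLAIN accepts UPDATE statements; it only shows the plan and does not modify any rows.

```sql
-- Plan for the original statement, with OR in the join condition.
explain
update trades t
inner join registries r on
    ( t.uni = r.uni and r.nationalid <> '' )
    or ( t.account = r.account and r.nationalid = '' )
set t.registry_id = r.rowid;

-- Plan for the uni-only half, for comparison.
explain
update trades t
inner join registries r on t.uni = r.uni and r.nationalid <> ''
set t.registry_id = r.rowid;
```

In the output, the type and key columns indicate whether each table is read through an index lookup (for example ref) or a full scan (ALL).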

First update

update trades t
inner join registries r on t.uni = r.uni and r.nationalid <> ''
set t.registry_id = r.rowid;

The following indexes would speed up the first update:

create index un_regid on trades (uni, registry_id);
create index un_regid on registries (uni, nationalid, rowid);

Second update

update trades t
inner join registries r on t.account = r.account and r.nationalid = ''
set t.registry_id = r.rowid;

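The following indexes would speed up the second update (they are named differently from the first pair, because MySQL does not allow two indexes with the same name on one table):

create index acc_regid on trades (account, registry_id);
create index acc_regid on registries (account, nationalid, rowid);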
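If the tables are InnoDB, the two updates could be wrapped in one transaction so that other sessions never see trades half-updated. This is only a sketch under that assumption: MyISAM ignores transactions, and a single transaction over millions of rows makes the undo log grow, so running the statements separately may be preferable on a constrained server.

```sql
-- Assumes InnoDB; on MyISAM each statement takes effect immediately.
start transaction;

-- Pass 1: match on uni where the registry row has a national id.
update trades t
inner join registries r on t.uni = r.uni and r.nationalid <> ''
set t.registry_id = r.rowid;

-- Pass 2: match on account where the registry row has no national id.
-- If a trade matches in both passes, the value from this pass is kept.
update trades t
inner join registries r on t.account = r.account and r.nationalid = ''
set t.registry_id = r.rowid;

commit;
```

With the original OR join, a trade matching both conditions would be updated once with whichever registry row MySQL happened to pick; with two statements the order of the passes decides, so swap them if the uni match should take priority.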
  • I could not get a clean profile. The same statement runs slowly in the first and second rounds. In the third round, both (my method and your recommendation) become really fast, so no speed difference can be seen. I will let you know when I manage to identify and remove the other confounding factors, though I know my Ubuntu PC does not run anything heavy in the background except the MySQL server. Commented Jul 4 at 7:52
  • Switching to MyISAM brought a huge performance gain in my case: dba.stackexchange.com/a/16405/295351 Commented Jul 4 at 10:35
  • I finally used a smaller subset of the data and found your recommendation useful: the update with mixed conditions took 10 minutes, while the two separate statements that avoid the OR operator took less than 3 minutes. Thanks. Commented Jul 4 at 11:39
  • @AliTavakol Note that the more indexes are present, the more time update/delete... operations need. Please do not use MyISAM; it has too many disadvantages and I cannot think of any advantages. Commented Jul 4 at 12:55
  • @ErgestBasha I also recommend to stop using MyISAM, mostly because it does not support any of the properties of an ACID database. But it does have one genuine advantage: it can store the same data in about half the storage space as InnoDB. At a job a few years ago, I was forced to use MyISAM for some large tables because the management was penny-pinching and refused to upgrade the db server storage. I think that was false economy, but it was their decision. Commented Jul 4 at 18:41