5

In short my question is this: Why is this

SELECT r.x, r.y FROM `base` AS r
WHERE r.l=50 AND AND r.n<>'name' AND 6=(SELECT COUNT(*) FROM surround AS d 
    WHERE d.x >= r.x -1 AND d.x <= r.x +1 AND 
          d.y>=r.y -1 AND d.y<=r.y +1 AND d.n='name')

a lot slower than this:

$q="SELECT x,y FROM `base` WHERE l=50 AND n<>'name'";
$sr=mysql_query($q);
if(mysql_num_rows($sr)>=1){
    while($row=mysql_fetch_assoc($sr)){
        $q2="SELECT x,y FROM surround WHERE n='name' AND x<=".
            ($row["x"]+1)." AND x>=".($row["x"]-1).
            " AND y<=".($row["y"]+1)." AND y>=".($row["y"]-1)." ";
        $sr2=mysql_query($q2);
        if(mysql_num_rows($sr2)=6){
            echo $row['x'].','.$row[y].'\n';
        }
    }
}

The php version takes about 300 ms to complete, if I run the "pure SQL" version, be it via phpadmin or via php, that takes roughly 5 seconds (and even 13 seconds when I used BETWEEN for those ranges of x and y)

I would suspect that the SQL version would in general be faster, and more efficient at least, so I wonder, am I doing something wrong, or does it make sense?

EDIT: I added the structure of both tables, as requested:

CREATE TABLE IF NOT EXISTS `base` (
  `bid` int(12) NOT NULL COMMENT 'Base ID',
  `n` varchar(25) NOT NULL COMMENT 'Name',
  `l` int(3) NOT NULL,
  `x` int(3) NOT NULL,
  `y` int(3) NOT NULL,
  `LastModified` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  UNIQUE KEY `coord` (`x`,`y`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;


CREATE TABLE IF NOT EXISTS `surround` (
  `bid` int(12) NOT NULL COMMENT 'Base ID',
  `n` varchar(25) NOT NULL COMMENT 'Name',
  `l` int(3) NOT NULL,
  `x` int(3) NOT NULL,
  `y` int(3) NOT NULL,
  `LastModified` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  UNIQUE KEY `coord` (`x`,`y`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

EDIT 2:

EXPLAIN SELECT for the query above: (the key coord is the combination of x and y)

id  select_type         table   type    possible_keys   key     key_len     ref     rows    Extra   
1   PRIMARY             r       range   coord,n         coord   4           NULL    4998    Using where
2   DEPENDENT SUBQUERY  d       ALL     coord           NULL    NULL        NULL    57241   Range checked for each record (index map: 0x1)
4
  • 5
    Most likely to do with the way indexes are being used. Best to do an explain sql and post that here too, for all 3 statements.
    – bumperbox
    Commented Nov 17, 2013 at 3:14
  • I would think doing it with the Query is faster. At the end, you will have a MYSQL result set and that will be counted via PHP. I would rather have the Database count the records and provide me the count since it is not heavy transaction.
    – Hozikimaru
    Commented Nov 17, 2013 at 3:22
  • is the name field indexed in both tables?
    – Sebas
    Commented Nov 17, 2013 at 4:08
  • the n (name) field was not indexed in the tables, I added that for testing purposes, but it didn't result in a change in loading times. And all fields are not unique by them self (except for bid, but I don't really use that field), only the combination of x and y is unique, so that's why I chose this key.
    – rolfv1
    Commented Nov 17, 2013 at 14:18

1 Answer 1

2

You are joinning two tables by yourself. you're an optimizer. you choice 'base' table is outer table for nested loop join. I guess MySQL's optimizer produced execution plan and it was not same as you.

so people want EXPLAIN output to see join order and to check index was used.

by the way, can you try this query?:

SELECT r.x, r.y
FROM `base` AS r, surround AS d
WHERE r.l=50
  AND r.n<>'name'
  AND d.x >= r.x -1
  AND d.x <= r.x +1
  AND d.y>=r.y -1
  AND d.y<=r.y +1
  AND d.n='name'
GROUP BY r.x, r.y
HAVING COUNT(*) = 6

UPDATED

how your original query works

It was first time seeing Range checked for each record (index map: 0x1) so I can't figure out how your query works. MySQL Manual gives us some information about it. It seems like that every row in surround (surround has 57k rows?) is compare to base's x,y. If so, your query is evaluated using 3 depth nested loop join. (base => surround => base) and moreover every row in surround is compared (this is inefficient)

I will make more effort to find how it works later. It's time to work.

3
  • And you are missing d.n = 'name' in PHP version.
    – Jason Heo
    Commented Nov 18, 2013 at 0:37
  • You are right about doing difficult things with joining, that's where I started this entire query, by trying to figure out the appropriate join, but in the end came up with this. So I added the explain to the original post, doesn't really tell me much, but your query works beautifully, very fast (0.15 sec) and gives the same results as the php queries and the original (slow) query. Thanks a lot! (And d.n='name' is in the PHP version, right after the WHERE of the subquery)
    – rolfv1
    Commented Nov 18, 2013 at 20:13
  • @rolfv1 good to see your comment. that makes me happy ;-) but sorry for not giving exact answer about EXPLAIN output.
    – Jason Heo
    Commented Nov 19, 2013 at 0:27

Not the answer you're looking for? Browse other questions tagged or ask your own question.