mongoDBとMySQLのlimit offset skip

mongoDBにはMySQLのoffset limitに似たskip limitという機能がありますが、名前だけでなく機能的にも差異があります。

意外と知られていないようなのですが、MySQLでは以下のようなSQLで最初の10件を取り出す場合でもテーブルフルスキャンされてしまいます。

select * from hoge limit 10 offset 0 ;

mysql> explain select * from hoge limit 10 offset 0 ;
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
|  1 | SIMPLE      | hoge  | ALL  | NULL          | NULL | NULL    | NULL | 1000 |       |
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
1 row in set (0.00 sec)

mongoDBでは期待通り10件目で検索を打ち切ってくれます。

db.Hoge.find().skip(0).limit(10);

> db.Hoge.find().skip(0).limit(10).explain();
{
	"cursor" : "BasicCursor",
	"nscanned" : 10,
	"nscannedObjects" : 10,
	"n" : 10,
	"millis" : 0,
	"nYields" : 0,
	"nChunkSkips" : 0,
	"isMultiKey" : false,
	"indexOnly" : false,
	"indexBounds" : {
		
	}
}

MySQL等のRDBのデータモデルは集合モデルであり、テーブルには順番という概念が存在していません。このため、まず最初に必要なデータを全て取得して並べた後でなければ、offset limitを適用することができないのです。

一方、mongoDBは、先日ブログに書いたようにnatural orderという順序付けをコレクションが持っているので、必要な件数だけ取得して処理を打ち切ることができます。もちろん、mongoDBでもfindにsort条件があると必要なデータを取り出して並び替えるまで順番が不明になるので、途中で検索を打ち切ることはできなくなります。

> db.Hoge.find().sort({ date:1 }).skip(0).limit(10).explain();
{
	"cursor" : "BasicCursor",
	"nscanned" : 1000,
	"nscannedObjects" : 1000,
	"n" : 10,
	"scanAndOrder" : true,
	"millis" : 9,
	"nYields" : 0,
	"nChunkSkips" : 0,
	"isMultiKey" : false,
	"indexOnly" : false,
	"indexBounds" : {
		
	}
}

このような場合は、sort条件となるフィールドにB-Treeインデックスを作成しておくと、取得時にインデックスの順番が利用されてフルスキャンが回避されます。

> db.Hoge.ensureIndex({date: 1});
> db.Hoge.find().sort({ date:1 }).skip(0).limit(10).explain();
{
	"cursor" : "BtreeCursor date_1",
	"nscanned" : 10,
	"nscannedObjects" : 10,
	"n" : 10,
	"millis" : 7,
	"nYields" : 0,
	"nChunkSkips" : 0,
	"isMultiKey" : false,
	"indexOnly" : false,
	"indexBounds" : {
		"date" : [
			[
				{
					"$minElement" : 1
				},
				{
					"$maxElement" : 1
				}
			]
		]
	}
}