Cost-based query optimization allows you to assign costs to operations and assign costs to partial plans, which allows them to find better search plans with lower costs.
Possible because query plans are simple, they have data statistics and they have meta-data.
Cost of an operation is proportional to disk accesses. A cost unit equals a random read of data page (4Kb).
Variable length fields (BLOB, VARCHAR, etc.) require extra guess work and adds to the cost of that operation. I would assume this means variable length fields would be a bad idea when used in an index.
MySQL does internal optimization on your queries to decide how it should do searches and joins based on the cost of each way to execute the query. Each choice is called a “plan” and each plan has a cost (see above).
The problem with this approach is that the more joins you add the more plans (options) are added to the mix for optimization, which means that as you add joins the more time the “optimizer” takes to optimize your query. Evidently, this limit is somewhere around 5-7 joins in a single table.
With 5.0 they’ve introduced a “Greedy search” that lets you suggest the plan, which is a trade off as the optimizer is skipped (at least this is how it sounds from his explanation).
Users can influence the choice of indexes with the USE INDEX, FORCE INDEX and IGNORE INDEX sytnax. These should be used with extreme care as indexes can be added and dropped.
You can also check out optimizer_search_depth whith tells the optimizer how much effort to put into looking for the best search plan. The default is automatic.
Another option is optermizer_prune_level.
The third way to force optimization on joins is to use the STRAIGHT_JOIN syntax to force join order.
The range optimizer can detect ranges that can be merged, such as SELECT * FROM tb1 WHERE foo < 10 OR foo < 20. In doing so it finds the minimal sequence of smallest possible disjoint intervals.