19 commonly used MySQL optimizations in projects


Last updated 5/2/2022 5:33 PM
anchovy

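Before starting, I would like to thank teacher Lu (陆老师) of 飞友科技 for providing the documentation.

A note up front: all of the optimizations below assume MySQL indexes of the BTree type.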

1. EXPLAIN

When optimizing MySQL, make good use of EXPLAIN to inspect the SQL execution plan.

In the EXPLAIN output, these five columns deserve the most attention:

  1. type column: the join (access) type. A good SQL statement should reach at least the range level. Avoid the ALL level.
  2. key column: the name of the index used. If no index is selected, the value is NULL. An index can be forced if necessary.
  3. key_len column: the length of the index used.
  4. rows column: the number of rows scanned. This value is an estimate.
  5. extra column: additional details. Common values that signal trouble are Using filesort and Using temporary.
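
As a minimal sketch of how this is used, EXPLAIN is simply prefixed to the statement under review. The table and column names here are placeholders rather than a real schema, and the comments only restate which columns to read.

```sql
-- Hypothetical table and columns, for illustration only.
EXPLAIN
SELECT id, name
FROM table_name
WHERE id BETWEEN 100 AND 200;

-- In the output, check in particular:
--   type    : aim for range or better; ALL means a full table scan
--   key     : the index actually chosen (NULL if none)
--   key_len : how many bytes of the index are used
--   rows    : estimated number of rows examined
--   Extra   : watch for "Using filesort" and "Using temporary"
```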

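2. Do not put too many values in an IN list

MySQL optimizes IN by storing all the constants of the list in a sorted array. However, if the list holds many values, the cost is still considerable. For example, in select id from table_name where num in(1,2,3) the values are consecutive, so use between instead of in; alternatively, replace the IN with a join.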
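
The two rewrites mentioned here might look as follows. This is a sketch that assumes num is an integer column; wanted_nums is a hypothetical helper table introduced only for the join variant.

```sql
-- Original form: a list of consecutive constants
SELECT id FROM table_name WHERE num IN (1, 2, 3);

-- Equivalent for an integer column: BETWEEN is inclusive on both ends
SELECT id FROM table_name WHERE num BETWEEN 1 AND 3;

-- When the values come from another table, a join can replace a long list.
-- wanted_nums is a hypothetical table holding one distinct num per row.
SELECT t.id
FROM table_name AS t
JOIN wanted_nums AS w ON w.num = t.num;
```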

3. SELECT statements must name the fields explicitly

SELECT * adds a lot of unnecessary cost (CPU, IO, memory, network bandwidth) and reduces the chance of using a covering index; naming the fields increases that chance. In addition, with SELECT *, the calling code may need to be updated whenever the table structure changes. Therefore, list the field names directly after select.
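
To illustrate the covering-index point, here is a sketch that assumes a composite index on two hypothetical columns, user_name and age. The index name is made up for the example.

```sql
-- Hypothetical index for illustration; user_name and age are assumed columns.
ALTER TABLE table_name ADD INDEX idx_user_name_age (user_name, age);

-- All requested columns live in the index, so EXPLAIN can report
-- "Using index" in Extra (the row data itself need not be read).
SELECT user_name, age FROM table_name WHERE user_name = 'zhangsan';

-- SELECT * asks for every column, so the index alone generally cannot
-- answer the query and the full rows must be fetched as well.
SELECT * FROM table_name WHERE user_name = 'zhangsan';
```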

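4. Use limit 1 when only one row is needed

LIMIT 1 lets MySQL stop scanning as soon as one matching row is found. Ideally the type column in EXPLAIN reaches the const level, which requires the lookup to be on a primary key or unique index.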

5. If the sort field has no index, sort as little as possible

6. If the other fields in the condition have no index, use or as little as possible

If one of the fields on either side of an or is not indexed, and none of the other conditions is on an indexed field either, the query will not use an index. In many cases, replacing or with union all (or union, when duplicates must be removed) gives better results.
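
A sketch of that rewrite, assuming two hypothetical columns col_a and col_b that each have their own index:

```sql
-- Original form
SELECT id FROM table_name WHERE col_a = 1 OR col_b = 2;

-- Rewritten so that each branch can use its own index.
-- The second branch excludes rows already returned by the first, so
-- UNION ALL does not produce duplicates. <=> is MySQL's NULL-safe equality.
SELECT id FROM table_name WHERE col_a = 1
UNION ALL
SELECT id FROM table_name WHERE col_b = 2 AND NOT (col_a <=> 1);
```

If excluding the overlap by hand is awkward, plain UNION removes the duplicates instead, at the cost of the extra deduplication step.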

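7. Prefer union all to union

The main difference between union and union all is that the former must merge the result sets and then filter them for uniqueness. That involves sorting, adds a lot of CPU work, and increases resource consumption and latency. Of course, union all is only appropriate when the two result sets contain no duplicate rows.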

8. Do not use ORDER BY RAND()

select id from `table_name` order by rand() limit 1000;

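The above sql statement can be optimized to the following, which picks a random starting id and returns the rows after it (so the rows are typically consecutive rather than individually random):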

select id from `table_name` t1 join (select  rand() * (select max(id) from `table_name`) as nid) t2 on t1.id > t2.nid limit 1000;

9. Distinguish between in and exists, not in and not exists

select * from 表A where id in (select id from 表B)

The above sql statement is equivalent to

select * from 表A where exists(select * from 表B where 表B.id=表A.id)

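The distinction between in and exists matters mainly because it changes the driving order, which is the key to the performance difference. With exists, the outer table is the driving table and is accessed first; with IN, the subquery is executed first. So IN suits the case where the outer table is large and the inner table is small, and EXISTS suits the case where the outer table is small and the inner table is large.

As for not in and not exists, not exists is recommended. This is not only a matter of efficiency: not in can also have logic problems. How can we write an efficient SQL statement that replaces not exists?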

Original sql statement

select colname …  from A表 a where a.id not in (select b.id from B表 b)

Efficient sql statement

select colname …  from A表 a left join B表 b on a.id = b.id where b.id is null

The result set is the data in table A that is not in table B.
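
The logic problem with not in appears when the subquery can return NULL: in that case not in matches no rows at all. A minimal sketch, using hypothetical tables a_table and b_table that are not part of the original article:

```sql
-- Hypothetical tables: a_table(id), b_table(id), where b_table.id may be NULL.

-- If any b_table.id is NULL, this returns no rows at all, because
-- "x NOT IN (..., NULL)" is never TRUE.
SELECT a.id FROM a_table AS a
WHERE a.id NOT IN (SELECT b.id FROM b_table AS b);

-- NOT EXISTS is unaffected by NULLs in b_table.id
SELECT a.id FROM a_table AS a
WHERE NOT EXISTS (SELECT 1 FROM b_table AS b WHERE b.id = a.id);

-- The LEFT JOIN ... IS NULL form gives the same rows as NOT EXISTS
SELECT a.id FROM a_table AS a
LEFT JOIN b_table AS b ON b.id = a.id
WHERE b.id IS NULL;
```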

10. Use reasonable paging methods to improve paging efficiency

select id,name from table_name limit 866613, 20

When paging with the statement above, you may find that as the table grows, plain limit paging becomes slower and slower.

The optimization is as follows: take the largest id on the previous page and use it as the starting point of the next page. In this example, the largest id on the previous page is 866612, so the SQL can be written as:

select id,name from table_name where id> 866612 limit 20
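
The two forms side by side, as a sketch. The ORDER BY id is an addition made here so that "the largest id on the previous page" is well defined; it assumes id is the primary key (or otherwise indexed and unique).

```sql
-- Offset paging: MySQL must read and discard the first 866613 rows
SELECT id, name FROM table_name ORDER BY id LIMIT 866613, 20;

-- Keyset paging: start right after the last id of the previous page
SELECT id, name FROM table_name WHERE id > 866612 ORDER BY id LIMIT 20;
```

The application remembers the last id it displayed and passes it into the next query, so the cost of each page stays roughly constant regardless of how deep the page is.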

11. Segmented queries

On pages where users choose the query range themselves, some users may select too large a time range, resulting in slow queries. The main cause is the excessive number of rows scanned. In this case, the program can query in segments, loop over them, and merge the results for display.

Segmented queries can be used when the number of rows scanned exceeds one million.
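
A minimal sketch of the idea, assuming a hypothetical table orders(id, created_at, amount) with an index on created_at. The month-sized segment is an arbitrary choice for illustration.

```sql
-- Instead of one query over the whole year:
--   SELECT id, amount FROM orders
--   WHERE created_at >= '2022-01-01' AND created_at < '2023-01-01';
-- the application issues one query per month and merges the results.

SELECT id, amount FROM orders
WHERE created_at >= '2022-01-01' AND created_at < '2022-02-01';

SELECT id, amount FROM orders
WHERE created_at >= '2022-02-01' AND created_at < '2022-03-01';

-- ... and so on for the remaining months
```

Half-open ranges (>= start, < next start) keep the segments from overlapping, so no row is returned twice when the results are merged.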

12. Avoid testing fields for null in the where clause

A null test can cause the engine to abandon the index and perform a full table scan.

13. Avoid fuzzy queries with a leading %

For example, LIKE "%name" or LIKE "%name%" makes the index unusable and causes a full table scan. LIKE "name%", however, can use the index.

How to query %name%?

Although an index has been added to the secret field, the explain result shows that it is not used.

So how do we solve this? The answer: use a full-text index.

Queries such as select id,fnum,fdst from table_name where user_name like '%zhangsan%'; are common, and an ordinary index cannot serve them. Fortunately, MySQL provides full-text indexes to help.

The sql syntax for creating a full-text index is:

ALTER TABLE `table_name` ADD  FULLTEXT INDEX  `idx_user_name` (`user_name`);

The sql statements that use full-text indexing are:

select id,fnum,fdst from table_name where match(user_name) against('zhangsan' in boolean mode);

**Note: before creating a full-text index, check with the DBA whether it can be created. Also note that the query syntax differs from that used with ordinary indexes.**

14. Avoid expression operations on fields in the where clause

For example:

select user_id,user_project from table_name where age*2=36;

Here an arithmetic operation is performed on the field, which causes the engine to give up using the index. It is recommended to change it to

select user_id,user_project from table_name where age=36/2;

15. Avoid implicit type conversions

A type conversion occurs when the type of the column in the where clause does not match the type of the parameter passed in. Confirm the parameter type used in the where clause first.
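
A typical case is a string column compared with a number. The sketch below assumes a hypothetical VARCHAR column phone with an index on it; the column is not part of the original article.

```sql
-- Numeric literal against a string column: MySQL compares the values as
-- numbers, which means converting every phone value, so the index on
-- phone cannot be used for the lookup.
SELECT id FROM table_name WHERE phone = 13800000000;

-- String literal matches the column type; the index can be used
SELECT id FROM table_name WHERE phone = '13800000000';
```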

16. For composite indexes, follow the leftmost prefix rule

For example, suppose the index contains the fields id, name, school. A query can use id alone, or id, name in that order, but a query on name, school cannot use this index. Therefore, when creating a composite index, pay attention to the order of the index fields and put the most frequently queried fields first.
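
The rule can be sketched with the index from the text. The table name student, the index name, and the literal values are hypothetical.

```sql
-- Index from the text, on a hypothetical table student
ALTER TABLE student ADD INDEX idx_id_name_school (id, name, school);

-- Can use the index: the conditions start from the leftmost column
SELECT id FROM student WHERE id = 1;
SELECT id FROM student WHERE id = 1 AND name = 'zhangsan';
SELECT id FROM student WHERE id = 1 AND name = 'zhangsan' AND school = 'school_a';

-- Cannot use the index for the lookup: the leftmost column id is missing
SELECT id FROM student WHERE name = 'zhangsan' AND school = 'school_a';
```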

17. If necessary, use force index to make a query use a particular index

The MySQL optimizer executes a SQL statement with the index it considers appropriate, but that may not be the index we want. In that case, force index can make the optimizer use the index we specify.
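
A minimal sketch of the syntax, assuming a hypothetical index idx_age on the age column:

```sql
-- idx_age is a hypothetical index name; the hint goes right after the table
SELECT user_id, user_project
FROM table_name FORCE INDEX (idx_age)
WHERE age = 18;
```

It is worth comparing the EXPLAIN output with and without the hint before keeping it, since a forced index stays forced even after the data distribution changes.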

18. Pay attention to range queries

With a composite index, if the query contains a range condition such as between, >, or <, the index fields after the range field will not be used.
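
As a sketch, assume a hypothetical composite index on (age, user_name); the index name and values are made up for the example.

```sql
ALTER TABLE table_name ADD INDEX idx_age_user_name (age, user_name);

-- age is a range condition, so only the age part of the index narrows the
-- scan; user_name, which comes after it in the index, does not help locate rows.
SELECT user_id FROM table_name WHERE age > 18 AND user_name = 'zhangsan';

-- With equality on the leading column, both index columns can be used
SELECT user_id FROM table_name WHERE age = 18 AND user_name = 'zhangsan';
```

One common design response is to order the index so that the equality columns come before the range column.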

19. About JOIN Optimization

  • LEFT JOIN: table A is the driving table
  • INNER JOIN: MySQL automatically picks the table with less data as the driving table
  • RIGHT JOIN: table B is the driving table

Note: MySQL has no full join. It can be emulated as follows:

select * from A left join B on B.name = A.name
 union all
select * from A right join B on B.name = A.name
where A.name is null;

**Prefer inner join and avoid left join**

A join query involves at least two tables, and they usually differ in size. With an inner join and no other filter conditions, MySQL automatically selects the small table as the driving table. A left join, however, always drives from left to right: the table on the left side of left join is the driving table.

**Use indexes sensibly**

Use an indexed field of the driven table as the join condition in the on clause.

**Use small tables to drive large tables**

If the driving table can be made smaller, the number of iterations in the nested loop drops, which reduces the total IO and the number of CPU operations.

**Use STRAIGHT_JOIN skillfully**

With an inner join, MySQL chooses the driving table itself, but in some special situations another table needs to be the driving table, for example when group by or order by produces "Using filesort" or "Using temporary". STRAIGHT_JOIN enforces the join order: the table on the left of STRAIGHT_JOIN is the driving table, and the table on the right is the driven table. A prerequisite for using STRAIGHT_JOIN is that the query is an inner join. STRAIGHT_JOIN is not recommended for other join types, because the query results may be inaccurate.

This method can sometimes make a query about three times faster.
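
A minimal sketch of the syntax, using hypothetical tables small_table(id) and big_table(small_id) with an index on big_table.small_id:

```sql
-- Let the optimizer pick the driving table
SELECT s.id, b.small_id
FROM small_table AS s
INNER JOIN big_table AS b ON b.small_id = s.id;

-- Force small_table (left side) to be read first as the driving table
SELECT s.id, b.small_id
FROM small_table AS s
STRAIGHT_JOIN big_table AS b ON b.small_id = s.id;
```

Both statements return the same rows; only the join order differs, which can be confirmed by comparing the row order of the two EXPLAIN outputs.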

Only the optimizations above are listed here. There are of course other methods, which you can explore and try for yourself. Thank you for reading.

Finally, you are welcome to follow: http://fenxianglu.cn/
