过年这段时间由于线上数据库经常压力过大导致响应非常缓慢甚至死机,咬咬牙下大决心来解决效率不高的问题!
首先是由于公司秉承快速开发原则,频繁上线,导致每次忽视了性能问题!日积月累,所以导致系统越来越慢,所以如果你的系统查询语句本来就优化的很好了可能参考意义不大!
提取慢查询日志文件,应该在你的DataDir目录下面
通过程序处理慢查询文件,将文件格式的慢查询导入到数据库中:
1 mysql> desc slow_query;
2 +---------------+-------------+------+-----+---------+-------+
3 | Field | Type | Null | Key | Default | Extra |
4 +---------------+-------------+------+-----+---------+-------+
5 | Date | varchar(32) | NO | | | | 查询发生的时间
6 | user | varchar(64) | NO | | | |
7 | host | varchar(64) | NO | | | |
8 | content | text | NO | | | | 将Statement进行Mask后的语句,便于Group By
9 | query_time | int(11) | NO | | | | 查询所用时间,直接性能指标
10 | lock_time | int(11) | YES | | 0 | | 等待锁定的时间
11 | rows_sent | int(11) | YES | | 0 | | 返回的结果行数
12 | rows_examined | int(11) | YES | | 0 | | 扫描行数
13 | statement | text | YES | | NULL | | 实际查询语句
14 +---------------+-------------+------+-----+---------+-------+
然后发挥您的想象力在这个表中尽力捕捉你想捕捉的,那类型语句压力最大、扫描行数最多、等锁最久……
比如:
优化后:
mysql> select sum(query_time)/count(*),count
(*),sum(query_time),min(Date),Max(Date) from slow where Date>'2008-02-20 22:50:52'and Date<'2008-02-21 17:34:35';
+--------------------------+----------+-----------------+---------------------+---------------------+
| sum(query_time)/count(*) | count(*) | sum(query_time) | min(Date) | Max(Date) |
+--------------------------+----------+-----------------+---------------------+---------------------+
| 5.7233 | 2197 | 12574 | 2008-02-20 22:51:16 | 2008-02-21 17:34:10 |
+--------------------------+----------+-----------------+---------------------+---------------------+
1 row in set (0.09 sec)
优化前:
mysql> select sum(query_time)/count(*),count(*),sum(query_time),min(Date),Max(Date) from slow where Date>'2008-02-17 22:50:52' and Date<'2008-02-18 17:34:35';
+--------------------------+----------+-----------------+---------------------+---------------------+
| sum(query_time)/count(*) | count(*) | sum(query_time) | min(Date) | Max(Date) |
+--------------------------+----------+-----------------+---------------------+---------------------+
| 2.5983 | 16091 | 41810 | 2008-02-17 22:50:58 | 2008-02-18 17:34:34 |
+--------------------------+----------+-----------------+---------------------+---------------------+
1 row in set (0.15 sec)
再比如,优化前:
基本信息:
慢查询统计从 2008-02-17 17:59:34 到2008-02-18 22:45:22时间段,接近29个小时的数据;
总共有慢查询28914个,平均一小时有1000个慢查询;(花了一天优化降到每小时100个的样子了,成就感啊)
所有慢查询耗费总时间75690秒;
慢查询时间设置是大于2秒
参数说明:
sum--总执行时间(秒);
count--执行次数;
avg--平均执行时间(秒);
content--类似SQL语句的表达通式,其中'DD'代表数字;
statement--某一条具体执行的SQL语句
由于访问时的锁,导致update非常慢:
1 mysql> select count(*) as n,sum(query_time) as s, sum(query_time)/count(*) as avg,substring_index(statement,' ',2) as u from slow where statement like 'update%' and query_time>14 group by u;
2 +-----+------+---------+--------------------------+
3 | n | s | avg | u |
4 +-----+------+---------+--------------------------+
5 | 7 | 112 | 16.0000 | update conversation |
6 | 151 | 2413 | 15.9801 | update user |
7 | 4 | 65 | 16.2500 | update user_modification |
8 +-----+------+---------+--------------------------+
说明程序中还是存在一些忘记释放事务锁的情况
最耗费资源的10个查询:
其中第1,2,5应该是同一类查询,这样的话这一类查询占总查询的一半以上,每分钟出现10个以上这样的慢查询,需要重点解决!
1 mysql> select sum(query_time) as sum, count(*) as count, sum(query_time)/count(*) as avg,statement from slow wher
2 e host like '%69.12.23.%' group by content order by sum desc limit 0,10\G
3 *************************** 1. row ***************************
4 sum: 27326
5 count: 11681
6 avg: 2.3394
7 …………