经常会遇到这样一个场景:
业务那边觉得数据库“很慢”,上去通过show processlist查看发现大量State为在等待lock,如:
Waiting for table metadata/level lock等
比如在执行一个DDL时,发现被hang住,查看到目前进程状态,有MDL
mysql> SHOW PROCESSLIST;
+-----+------+-----------+------+---------+------+---------------------------------+--------------------------------------------+
| Id | User | Host | db | Command | Time | State| Info|
+-----+------+-----------+------+---------+------+---------------------------------+--------------------------------------------+
| 585 | root | localhost | test | Sleep | 1658 || NULL |
| 586 | root | localhost | test | Query | 1654 | Waiting for table metadata lock | alter table t change name name varchar(32) |
| 590 | root | localhost | test | Query | 0| starting| show processlist|
+-----+------+-----------+------+---------+------+---------------------------------+--------------------------------------------+
3 rows in set (0.00 sec)
在实验环境里,我们很快就能定位到,应该是id为585这个线程,但是无法知道正在执行什么sql:
通过information_schema.innodb_trx\G,也不能查询到具体执行了什么sql。
通过简单的kill的确可以解决眼前的问题,但如果继续遇到该问题,也难以定位具体内容。
但其实,performance_schema.events_statements_current提供了相关信息,此处复现一下:
session1> BEGIN;
Query OK, 0 rows affected (0.00 sec)
session1> UPDATE t SET name='fasdfsad';
Query OK, 3 rows affected (0.00 sec)
Rows matched: 3Changed: 3Warnings: 0
session2> ALTER TABLE t CHANGE name name varchar(32)
发现被hang住
查看一下是否有事务未提交,可以发现的确有:
该事务内的语句执行完毕(处于Sleep),但未提交,就会看不到对应的trx_query:
session3> SELECT * FROM information_schema.innodb_trx\G
*************************** 1. row ***************************
trx_id: 9614
trx_state: RUNNING
trx_started: 2017-09-19 15:58:05
trx_requested_lock_id: NULL
trx_wait_started: NULL
trx_weight: 2
trx_mysql_thread_id: 585
trx_query: NULL
trx_operation_state: NULL
trx_tables_in_use: 0
trx_tables_locked: 1
trx_lock_structs: 2
trx_lock_memory_bytes: 1136
trx_rows_locked: 4
trx_rows_modified: 0
trx_concurrency_tickets: 0
trx_isolation_level: REPEATABLE READ
trx_unique_checks: 1
trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
trx_adaptive_hash_latched: 0
trx_adaptive_hash_timeout: 0
trx_is_read_only: 0
trx_autocommit_non_locking: 0
1 row in set (0.00 sec)
只能根据trx_mysql_thread_id看到未提交的事务的process id,看一下processlist,INFO内也没有具体内容:
session3> SHOW PROCESSLIST;
+-----+------+-----------+------+---------+------+---------------------------------+--------------------------------------------+
| Id | User | Host| db | Command | Time | State| Info|
+-----+------+-----------+------+---------+------+---------------------------------+--------------------------------------------+
| 585 | root | localhost | test | Sleep | 42 || NULL |
| 586 | root | localhost | test | Query | 37 | Waiting for table metadata lock | ALTER TABLE t CHANGE name name varchar(32) |
| 590 | root | localhost | test | Query | 0 | starting| SHOW PROCESSLIST|
+-----+------+-----------+------+---------+------+---------------------------------+--------------------------------------------+
3 rows in set (0.00 sec)
但只要打开了P_S,就可以通过performance_schema.events_statements_current来查看到对应的sql,包括已经执行完,但没有提交的。
session3> SELECT sql_text FROM performance_schema.events_statements_current;
+-------------------------------------------------------------------+
| sql_text |
+-------------------------------------------------------------------+
| UPDATE t SET name='fasdfsad' |
| ALTER TABLE t CHANGE name name varchar(32) |
| select sql_text from performance_schema.events_statements_current |
+-------------------------------------------------------------------+
3 rows in set (0.00 sec)
当然,在复杂的生产环境中,光凭上面的语句查出来的信息,是远远不够的。
通过如下语句,可以扩展show processlist的显示结果,并提供对应的SQL。
SELECT b.processlist_id, c.db, a.sql_text, c.command, c.time, c.state
FROM performance_schema.events_statements_current a JOIN performance_schema.threads b USING(thread_id)
JOIN information_schema.processlist c ON b.processlist_id = c.id
WHERE a.sql_text NOT LIKE '%performance%';
结果:
+----------------+------+--------------------------------------------+---------+------+---------------------------------+
| processlist_id | db | sql_text | command | time | state |
+----------------+------+--------------------------------------------+---------+------+---------------------------------+
| 585| test | UPDATE t SET name='fasdfsad'| Sleep | 243 | |
| 586| test | ALTER TABLE t CHANGE name name varchar(32) | Query | 238 | Waiting for table metadata lock |
+----------------+------+--------------------------------------------+---------+------+---------------------------------+
2 rows in set (0.01 sec)
也可以很容易定位到执行的内容是做了一个update操作未提交。
作者微信公众号(持续更新)