前言

数据库管理员或者运维人员经常需定期对数据进行归档和清除,我们可以使用percona的pt-archiver工具能完成这一功能,使得数据归档变得方便简单。

归档之前准备

  1. pt-archiver归档前,需要先建立归档表(备份表)且表结构要一样。
  2. pt-archiver操作的表必须有主键。

1.查询表、数据信息

MySQL [pttest1]> show table status like 'demo_table'\G;
*************************** 1. row ***************************
           Name: demo_table
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 7964647
 Avg_row_length: 75
    Data_length: 601882624
Max_data_length: 0
   Index_length: 0
      Data_free: 37432066048
 Auto_increment: NULL
    Create_time: 2018-04-27 05:42:56
    Update_time: NULL
     Check_time: NULL
      Collation: utf8_general_ci
       Checksum: NULL
 Create_options: 
        Comment: 
1 row in set (0.03 sec)

2.查看表DDL(表结构)

MySQL [pttest1] show create table demo_table;
+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table      | Create Table                                                                                                                                                                                                                                               |
+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| demo_table | CREATE TABLE `demo_table` (
  `url` varchar(200) NOT NULL COMMENT 'url',
  `src` varchar(50) NOT NULL DEFAULT '',
  `count` bigint(20) NOT NULL DEFAULT '0',
  `day` date NOT NULL,
  PRIMARY KEY (`url`,`day`,`src`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

3.建立归档表

CREATE TABLE `demo_table_bak` (
  `url` varchar(200) NOT NULL COMMENT 'url',
  `src` varchar(50) NOT NULL DEFAULT '',
  `count` bigint(20) NOT NULL DEFAULT '0',
  `day` date NOT NULL,
  PRIMARY KEY (`url`,`day`,`src`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

pt-archiver常用参数

--limit10000       每次取1000行数据用pt-archive处理,Number of rows to fetch and archive per statement.
--txn-size  1000   设置1000行为一个事务提交一次,Number of rows pertransaction.
--where‘id<3000‘   设置操作条件
--progress5000     每处理5000行输出一次处理信息
--statistics       输出执行过程及最后的操作统计。(只要不加上--quiet,默认情况下pt-archive都会输出执行过程的)
--charset=UTF8     指定字符集为UTF8
--bulk-delete      批量删除source上的旧数据(例如每次1000行的批量删除操作)
--bulk-insert      批量插入数据到dest主机 (看dest的general log发现它是通过在dest主机上LOAD DATA LOCAL INFILE插入数据的)
--replace          将insert into 语句改成replace写入到dest库
--sleep120         每次归档了limit个行记录后的休眠120秒(单位为秒)
--file ‘/root/test.txt‘:数据存放的文件,最好指定绝对路径,文件名可以灵活地组合
--purge             删除source数据库的相关匹配记录
--header            输入列名称到首行(和--file一起使用)
--no-check-charset  不指定字符集
--check-columns    检验dest和source的表结构是否一致,不一致自动拒绝执行(不加这个参数也行。默认就是执行检查的)
--no-check-columns    不检验dest和source的表结构是否一致,不一致也执行(会导致dest上的无法与source匹配的列值被置为null或者0)
--chekc-interval      默认1s检查一次
--local            不把optimize或analyze操作写入到binlog里面(防止造成主从延迟巨大)
--retries         超时或者出现死锁的话,pt-archiver进行重试的间隔(默认1s)
--no-version-check   目前为止,发现部分pt工具对阿里云RDS操作必须加这个参数
--analyze=ds      操作结束后,优化表空间(d表示dest,s表示source)
 默认情况下,pt-archiver操作结束后,不会对source、dest表执行analyze或optimize操作,因为这种操作费时间,并且需要你提前预估有足够的磁盘空间用于拷贝表。一般建议也是pt-archiver操作结束后,在业务低谷手动执行analyze table用以回收表空间。

归档常用方法

归档(复制)数据到归档表,且不删除源表的数据(指定字符集),删除源数据的话把--no-delete选项换成--purge就可以了。

[root@db-work dbback]# pt-archiver --source h=localhost,u=root,p=qQSVdqzBUlT5TbeI,P=3306,D=pttest1,t=demo_table --dest h=localhost,P=3306,u=root,p=qQSVdqzBUlT5TbeI,D=pttest_bak,t=demo_table_bak  --progress 5000 --where "day < '2017-12-01'"  --statistics --charset=UTF8 --limit=10000 --txn-size 1000 --no-delete
TIME                ELAPSED   COUNT
2018-04-27T07:09:22       0       0
2018-04-27T07:09:23       1    5000
2018-04-27T07:09:24       2   10000
2018-04-27T07:09:25       3   15000
2018-04-27T07:09:26       4   20000
2018-04-27T07:09:27       5   25000
2018-04-27T07:09:28       6   30000
2018-04-27T07:09:29       7   35000
2018-04-27T07:09:30       8   40000
2018-04-27T07:09:31       9   45000
2018-04-27T07:09:32      10   50000
2018-04-27T07:09:33      11   55000
2018-04-27T07:09:34      12   60000
2018-04-27T07:09:35      13   65000
2018-04-27T07:09:36      14   70000
2018-04-27T07:09:37      15   75000
2018-04-27T07:09:38      16   80000
2018-04-27T07:09:39      17   85000
2018-04-27T07:09:40      18   90000
2018-04-27T07:09:41      19   95000
2018-04-27T07:09:42      20  100000
2018-04-27T07:09:44      21  105000
2018-04-27T07:09:45      22  110000
2018-04-27T07:09:46      24  115000
2018-04-27T07:09:47      25  120000
2018-04-27T07:09:48      26  125000
2018-04-27T07:09:49      27  130000
2018-04-27T07:09:50      28  135000
2018-04-27T07:09:51      29  140000
2018-04-27T07:09:53      31  145000
2018-04-27T07:09:54      32  150000
2018-04-27T07:09:55      33  155000
2018-04-27T07:09:56      34  160000
2018-04-27T07:09:57      35  165000
2018-04-27T07:09:58      36  170000
2018-04-27T07:10:00      37  175000
2018-04-27T07:10:01      38  180000
2018-04-27T07:10:02      39  185000
2018-04-27T07:10:02      40  190000
2018-04-27T07:10:03      41  195000
2018-04-27T07:10:04      42  200000
2018-04-27T07:10:05      43  205000
2018-04-27T07:10:06      44  210000
2018-04-27T07:10:08      46  215000
2018-04-27T07:10:09      47  220000
2018-04-27T07:10:10      48  225000
2018-04-27T07:10:11      49  230000
2018-04-27T07:10:12      50  235000
2018-04-27T07:10:13      51  240000
2018-04-27T07:10:14      52  245000
2018-04-27T07:10:15      53  250000
2018-04-27T07:10:17      54  255000
2018-04-27T07:10:18      55  260000
2018-04-27T07:10:19      57  265000
2018-04-27T07:10:20      58  270000
2018-04-27T07:10:21      59  275000
2018-04-27T07:10:22      60  280000
2018-04-27T07:10:23      61  285000
2018-04-27T07:10:24      62  290000
2018-04-27T07:10:25      63  295000
2018-04-27T07:10:26      64  300000
2018-04-27T07:10:27      65  305000
2018-04-27T07:10:28      66  310000
2018-04-27T07:10:29      67  315000
2018-04-27T07:10:30      68  320000
2018-04-27T07:10:31      69  325000
2018-04-27T07:10:32      70  330000
2018-04-27T07:10:34      71  335000
2018-04-27T07:10:35      72  340000
2018-04-27T07:10:36      73  345000
2018-04-27T07:10:37      75  350000
2018-04-27T07:10:38      76  355000
2018-04-27T07:10:39      77  360000
2018-04-27T07:10:40      78  365000
2018-04-27T07:10:41      79  370000
2018-04-27T07:10:42      80  375000
2018-04-27T07:10:43      81  380000
2018-04-27T07:10:45      82  385000
2018-04-27T07:10:46      83  390000
2018-04-27T07:10:47      84  395000
2018-04-27T07:10:48      85  400000
2018-04-27T07:10:49      87  405000
2018-04-27T07:10:50      88  410000
2018-04-27T07:10:51      89  415000
2018-04-27T07:10:52      90  420000
2018-04-27T07:10:53      91  425000
2018-04-27T07:10:54      92  430000
2018-04-27T07:10:55      93  435000
2018-04-27T07:10:56      94  440000
2018-04-27T07:10:57      95  445000
2018-04-27T07:10:58      96  450000
2018-04-27T07:10:59      97  455000
2018-04-27T07:11:00      98  458522
Started at 2018-04-27T07:09:22, ended at 2018-04-27T07:11:00
Source: A=UTF8,D=pttest1,P=3306,h=localhost,p=...,t=demo_table,u=root
Dest:   A=UTF8,D=pttest_bak,P=3306,h=localhost,p=...,t=demo_table_bak,u=root
SELECT 458522
INSERT 458522
DELETE 0
Action         Count       Time        Pct
inserting     458522    60.4543      61.66
select            47     9.7850       9.98
commit           918     1.1202       1.14
other              0    26.6915      27.22

参考资料:

  1. 码迷
  2. zengxuewen2045的博客
  3. For DBA
文章目录