Support for multiple indexes and middleware processing #26

Open: wants to merge 3 commits into base: master
21 changes: 21 additions & 0 deletions Dockerfile
@@ -0,0 +1,21 @@
FROM rackspacedot/python37
LABEL MAINTAINER="xunhanliu<[email protected]>"
# Set the timezone; pip > 10 supports `pip config` for switching the package index
RUN echo "Asia/Shanghai" > /etc/timezone \
&& pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple/ \
&& python -m pip install --upgrade pip

RUN apt-get update && apt-get install mysql-client -y


RUN mkdir -p /usr/src/middleware

WORKDIR /usr/src

COPY es_sync/* ./es_sync/
COPY run.py .

ADD requirements.txt /usr/src
RUN pip install -r /usr/src/requirements.txt
CMD ["python","run.py", "config.yaml"]
# docker build example: docker build -t py-mysql-elasticsearch-sync:latest .
2 changes: 1 addition & 1 deletion MANIFEST.in
@@ -1,2 +1,2 @@
include README.md LICENSE
-recursive-include src *.yaml
+recursive-include es_sync *.yaml
18 changes: 7 additions & 11 deletions README.md
@@ -1,7 +1,9 @@
tips: original project: [original project](https://github.com/zhongbiaodev/py-mysql-elasticsearch-sync)

# py-mysql-elasticsearch-sync
Simple and fast MySQL to Elasticsearch sync tool, written in Python.

-[中文文档](https://github.com/zhongbiaodev/py-mysql-elasticsearch-sync/blob/master/README_CN.md)
+[中文文档](https://github.com/xunhanliu/py-mysql-elasticsearch-sync/blob/master/README_CN.md)

## Introduction
This tool performs the initial load of a MySQL table into Elasticsearch by parsing mysqldump output, then incrementally syncs the table to Elasticsearch by processing the MySQL binlog.
@@ -38,7 +40,7 @@ pip install py-mysql-elasticsearch-sync
```

## Configuration
-There is a [sample config](https://github.com/zhongbiaodev/py-mysql-elasticsearch-sync/blob/master/es_sync/sample.yaml) file in repo, you can start by editing it.
+There is a [sample config](https://github.com/xunhanliu/py-mysql-elasticsearch-sync/blob/master/es_sync/sample.yaml) file in repo, you can start by editing it.

## Running
Simply run command
@@ -60,14 +62,8 @@ es-sync path/to/your/config.yaml --fromfile
to start the sync; when the XML sync is over, it also starts the binlog sync.

## Deployment
-We provide an [upstart script]((https://github.com/zhongbiaodev/py-mysql-elasticsearch-sync/blob/master/upstart.conf)) to help you deploy this tool, you can edit it for your own condition, besides, you can deploy it in your own way.
+We provide an [upstart script]((https://github.com/xunhanliu/py-mysql-elasticsearch-sync/blob/master/upstart.conf)) to help you deploy this tool, you can edit it for your own condition, besides, you can deploy it in your own way.

## MultiTable Supporting
Multiple tables are now supported by setting `tables` in the config file; by default the first table is the master and the others are slaves.

The master table and the slave tables must use the same primary key, which is defined via `_id`.

When both are set, `table` takes priority over `tables`.

## TODO
- [ ] MultiIndex Supporting
Already supported.
See the Chinese documentation: [中文文档](https://github.com/xunhanliu/py-mysql-elasticsearch-sync/blob/master/README_CN.md)
42 changes: 30 additions & 12 deletions README_CN.md
@@ -1,3 +1,10 @@
Docker deployment requires mounting the middleware directory and the config.yaml and binlog.info files.
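
As a rough sketch of those three mounts, the snippet below builds a `docker run` command line in Python. The container paths are assumptions inferred from the Dockerfile in this PR (`WORKDIR /usr/src`, `mkdir -p /usr/src/middleware`, and `CMD ["python","run.py", "config.yaml"]`). The location of `binlog.info` in particular depends on what your config file points at. The host directory and the helper name `build_docker_run_args` are hypothetical.

```python
from pathlib import PurePosixPath


def build_docker_run_args(host_dir, image="py-mysql-elasticsearch-sync:latest"):
    """Build a `docker run` argument list with the three required mounts.

    Assumption: inside the container everything lives under /usr/src,
    matching the WORKDIR in the Dockerfile.
    """
    host = PurePosixPath(host_dir)
    mounts = {
        host / "middleware": "/usr/src/middleware",
        host / "config.yaml": "/usr/src/config.yaml",
        host / "binlog.info": "/usr/src/binlog.info",  # assumed location
    }
    args = ["docker", "run", "-d"]
    for src, dst in mounts.items():
        args += ["-v", f"{src}:{dst}"]
    args.append(image)
    return args


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(" ".join(build_docker_run_args("/opt/es-sync")))
```

Printing the command rather than executing it keeps the sketch free of side effects; the list can be passed to `subprocess.run` once the paths have been checked.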




tips: original project: [original project](https://github.com/zhongbiaodev/py-mysql-elasticsearch-sync)

# py-mysql-elasticsearch-sync
A tool for syncing data from MySQL to Elasticsearch, implemented in Python.

@@ -35,7 +42,7 @@ pip install py-mysql-elasticsearch-sync
```

## Configuration
-You can write your own config file by editing the [sample config file](https://github.com/zhongbiaodev/py-mysql-elasticsearch-sync/blob/master/es_sync/sample.yaml).
+You can write your own config file by editing the [sample config file](https://github.com/xunhanliu/py-mysql-elasticsearch-sync/blob/master/es_sync/sample.yaml).

## Running
Run the command
@@ -60,14 +67,25 @@ es-sync path/to/your/config.yaml --fromfile
to start importing from XML; once the XML import finishes, it starts syncing the binlog.

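## Service management
-We wrote an [upstart script](https://github.com/zhongbiaodev/py-mysql-elasticsearch-sync/blob/master/upstart.conf) to manage running this tool; you can also deploy and run it in your own way.
-
-## Multi-table support
-You can configure `tables` in the config file to support multiple tables; by default, the first table in `tables` is the master table and the rest are slave tables.
-
-The master and slave tables must share the same primary key, which is the `_id` field.
-
-When both `table` and `tables` are set, `table` takes priority.
-
-## TODO
-- [ ] Multi-index support
+We wrote an [upstart script](https://github.com/xunhanliu/py-mysql-elasticsearch-sync/blob/master/upstart.conf) to manage running this tool; you can also deploy and run it in your own way.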

## Notes on the new multi-index and middleware options in the config file
```yaml
# Mapping configuration
table_mappings:
  - mysql_table_name: keyword
    es_index: test
    es_type: _doc
    middlewares:
      - "middleware.keyword:process_id"
    mapping:
      _id: id
  - mysql_table_name: user
    es_index: user
    es_type: _doc
    mapping:
      _id: id
    middlewares:
```
`mysql_table_name: keyword` means that data is read from the `keyword` table.
Each row first goes through the `mapping` step, then through the `middleware.keyword:process_id` middleware, and is finally imported into Elasticsearch at `/test/_doc`.
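
The per-row flow described above (mapping first, then each middleware, then indexing) can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the middleware signature (a function that takes a `dict` and returns a `dict`), the resolution of `module:function` strings via `importlib`, and the helper names `load_middleware`, `apply_mapping`, and `process_row` are all assumptions. The mapping is read here as "Elasticsearch field name: MySQL column name", in line with `_id: id` in the config.

```python
import importlib


def load_middleware(spec):
    """Resolve a "package.module:function" string to a callable."""
    module_name, func_name = spec.split(":", 1)
    return getattr(importlib.import_module(module_name), func_name)


def apply_mapping(row, mapping):
    """Rename columns according to {es_field: mysql_column}.

    Assumption: unmapped columns are kept under their original names.
    """
    doc = dict(row)
    for es_field, column in mapping.items():
        if column in doc:
            doc[es_field] = doc.pop(column)
    return doc


def process_row(row, table_mapping, middlewares=None):
    """Turn one MySQL row into an (index, type, id, document) tuple."""
    doc = apply_mapping(row, table_mapping.get("mapping") or {})
    if middlewares is None:
        # An empty `middlewares:` key in the YAML parses as None.
        specs = table_mapping.get("middlewares") or []
        middlewares = [load_middleware(spec) for spec in specs]
    for middleware in middlewares:
        doc = middleware(doc)
    doc_id = doc.pop("_id", None)
    return table_mapping["es_index"], table_mapping["es_type"], doc_id, doc


def process_id(doc):
    """Hypothetical stand-in for middleware.keyword:process_id."""
    doc["_id"] = "keyword-{}".format(doc["_id"])
    return doc


if __name__ == "__main__":
    keyword_table = {
        "mysql_table_name": "keyword",
        "es_index": "test",
        "es_type": "_doc",
        "mapping": {"_id": "id"},
    }
    print(process_row({"id": 7, "word": "sync"}, keyword_table, [process_id]))
```

Because each table entry carries its own `es_index`, rows from `keyword` and `user` end up in different indexes without any per-table code, which is the multi-index behaviour the config is meant to express.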
2 changes: 2 additions & 0 deletions binlog.info
@@ -0,0 +1,2 @@
log_file:
log_pos:
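
The two empty keys presumably hold the binlog file name and position from which syncing resumes, so leaving them blank would mean that no position has been recorded yet. As a sketch under that assumption, the snippet below reads and writes a file in this `key: value` layout using only the standard library. The helper names `read_binlog_info` and `write_binlog_info` are hypothetical, and the tool itself may parse the file differently (for example with a YAML library).

```python
def read_binlog_info(path):
    """Parse `log_file:` / `log_pos:` lines; blank values become None."""
    info = {"log_file": None, "log_pos": None}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            key, sep, value = line.partition(":")
            key, value = key.strip(), value.strip()
            if sep and key in info and value:
                info[key] = int(value) if key == "log_pos" else value
    return info


def write_binlog_info(path, log_file, log_pos):
    """Overwrite the file in place with the latest position.

    The file is rewritten rather than replaced via rename, because a
    single file that is bind-mounted into a container cannot be swapped
    out by renaming over it.
    """
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("log_file: {}\nlog_pos: {}\n".format(log_file, log_pos))
```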