diff --git a/README.rst b/README.rst
diff --git a/docs/source/base-tables.rst b/docs/source/base-tables.rst
diff --git a/docs/source/df-basic.rst b/docs/source/df-basic.rst
@@ -1,7 +1,10 @@
 ODPS Python SDK and data analysis framework
 ===========================================
 
-|PyPI version| |Docs| |License| |Implementation|
+`PyPI version <https://pypi.python.org/pypi/pyodps>`__
+`Docs <http://pyodps.readthedocs.org/>`__
+`License <https://github.com/aliyun/aliyun-odps-python-sdk/blob/master/License>`__
+|Implementation|
 
 Elegant way to access ODPS API.
 `Documentation <http://pyodps.readthedocs.org/>`__
@@ -13,25 +16,25 @@ The quick way:
 
 ::
 
-    pip install 'pyodps[full]'
+   pip install 'pyodps[full]'
 
-If you don't need to use Jupyter, just type
+If you don’t need to use Jupyter, just type
 
 ::
 
-    pip install pyodps
+   pip install pyodps
 
 The dependencies will be installed automatically.
 
 Or from source code:
 
 .. code:: shell
 
-    $ virtualenv pyodps_env
-    $ source pyodps_env/bin/activate
-    $ git clone <git clone URL> pyodps
-    $ cd pyodps
-    $ python setup.py install
+   $ virtualenv pyodps_env
+   $ source pyodps_env/bin/activate
+   $ git clone <git clone URL> pyodps
+   $ cd pyodps
+   $ python setup.py install
 
 Dependencies
 ------------
@@ -52,116 +55,116 @@ Usage
 
 .. code:: python
 
-    >>> from odps import ODPS
-    >>> o = ODPS('**your-access-id**', '**your-secret-access-key**',
-    ...          project='**your-project**', endpoint='**your-end-point**')
-    >>> dual = o.get_table('dual')
-    >>> dual.name
-    'dual'
-    >>> dual.schema
-    odps.Schema {
-      c_int_a                 bigint
-      c_int_b                 bigint
-      c_double_a              double
-      c_double_b              double
-      c_string_a              string
-      c_string_b              string
-      c_bool_a                boolean
-      c_bool_b                boolean
-      c_datetime_a            datetime
-      c_datetime_b            datetime
-    }
-    >>> dual.creation_time
-    datetime.datetime(2014, 6, 6, 13, 28, 24)
-    >>> dual.is_virtual_view
-    False
-    >>> dual.size
-    448
-    >>> dual.schema.columns
-    [<column c_int_a, type bigint>,
-     <column c_int_b, type bigint>,
-     <column c_double_a, type double>,
-     <column c_double_b, type double>,
-     <column c_string_a, type string>,
-     <column c_string_b, type string>,
-     <column c_bool_a, type boolean>,
-     <column c_bool_b, type boolean>,
-     <column c_datetime_a, type datetime>,
-     <column c_datetime_b, type datetime>]
+   >>> from odps import ODPS
+   >>> o = ODPS('**your-access-id**', '**your-secret-access-key**',
+   ...          project='**your-project**', endpoint='**your-end-point**')
+   >>> dual = o.get_table('dual')
+   >>> dual.name
+   'dual'
+   >>> dual.schema
+   odps.Schema {
+     c_int_a                 bigint
+     c_int_b                 bigint
+     c_double_a              double
+     c_double_b              double
+     c_string_a              string
+     c_string_b              string
+     c_bool_a                boolean
+     c_bool_b                boolean
+     c_datetime_a            datetime
+     c_datetime_b            datetime
+   }
+   >>> dual.creation_time
+   datetime.datetime(2014, 6, 6, 13, 28, 24)
+   >>> dual.is_virtual_view
+   False
+   >>> dual.size
+   448
+   >>> dual.schema.columns
+   [<column c_int_a, type bigint>,
+    <column c_int_b, type bigint>,
+    <column c_double_a, type double>,
+    <column c_double_b, type double>,
+    <column c_string_a, type string>,
+    <column c_string_b, type string>,
+    <column c_bool_a, type boolean>,
+    <column c_bool_b, type boolean>,
+    <column c_datetime_a, type datetime>,
+    <column c_datetime_b, type datetime>]
 
 DataFrame API
 -------------
 
 .. code:: python
 
-    >>> from odps.df import DataFrame
-    >>> df = DataFrame(o.get_table('pyodps_iris'))
-    >>> df.dtypes
-    odps.Schema {
-      sepallength           float64
-      sepalwidth            float64
-      petallength           float64
-      petalwidth            float64
-      name                  string
-    }
-    >>> df.head(5)
-    |==========================================|   1 /  1  (100.00%)         0s
-       sepallength  sepalwidth  petallength  petalwidth         name
-    0          5.1         3.5          1.4         0.2  Iris-setosa
-    1          4.9         3.0          1.4         0.2  Iris-setosa
-    2          4.7         3.2          1.3         0.2  Iris-setosa
-    3          4.6         3.1          1.5         0.2  Iris-setosa
-    4          5.0         3.6          1.4         0.2  Iris-setosa
-    >>> df[df.sepalwidth > 3]['name', 'sepalwidth'].head(5)
-    |==========================================|   1 /  1  (100.00%)        12s
-              name  sepalwidth
-    0  Iris-setosa         3.5
-    1  Iris-setosa         3.2
-    2  Iris-setosa         3.1
-    3  Iris-setosa         3.6
-    4  Iris-setosa         3.9
+   >>> from odps.df import DataFrame
+   >>> df = DataFrame(o.get_table('pyodps_iris'))
+   >>> df.dtypes
+   odps.Schema {
+     sepallength           float64
+     sepalwidth            float64
+     petallength           float64
+     petalwidth            float64
+     name                  string
+   }
+   >>> df.head(5)
+   |==========================================|   1 /  1  (100.00%)         0s
+      sepallength  sepalwidth  petallength  petalwidth         name
+   0          5.1         3.5          1.4         0.2  Iris-setosa
+   1          4.9         3.0          1.4         0.2  Iris-setosa
+   2          4.7         3.2          1.3         0.2  Iris-setosa
+   3          4.6         3.1          1.5         0.2  Iris-setosa
+   4          5.0         3.6          1.4         0.2  Iris-setosa
+   >>> df[df.sepalwidth > 3]['name', 'sepalwidth'].head(5)
+   |==========================================|   1 /  1  (100.00%)        12s
+             name  sepalwidth
+   0  Iris-setosa         3.5
+   1  Iris-setosa         3.2
+   2  Iris-setosa         3.1
+   3  Iris-setosa         3.6
+   4  Iris-setosa         3.9
 
 Command-line and IPython enhancement
 ------------------------------------
 
 ::
 
-    In [1]: %load_ext odps
+   In [1]: %load_ext odps
 
-    In [2]: %enter
-    Out[2]: <odps.inter.Room at 0x10fe0e450>
+   In [2]: %enter
+   Out[2]: <odps.inter.Room at 0x10fe0e450>
 
-    In [3]: %sql select * from pyodps_iris limit 5
-    |==========================================|   1 /  1  (100.00%)         2s
-    Out[3]: 
-       sepallength  sepalwidth  petallength  petalwidth         name
-    0          5.1         3.5          1.4         0.2  Iris-setosa
-    1          4.9         3.0          1.4         0.2  Iris-setosa
-    2          4.7         3.2          1.3         0.2  Iris-setosa
-    3          4.6         3.1          1.5         0.2  Iris-setosa
-    4          5.0         3.6          1.4         0.2  Iris-setosa
+   In [3]: %sql select * from pyodps_iris limit 5
+   |==========================================|   1 /  1  (100.00%)         2s
+   Out[3]: 
+      sepallength  sepalwidth  petallength  petalwidth         name
+   0          5.1         3.5          1.4         0.2  Iris-setosa
+   1          4.9         3.0          1.4         0.2  Iris-setosa
+   2          4.7         3.2          1.3         0.2  Iris-setosa
+   3          4.6         3.1          1.5         0.2  Iris-setosa
+   4          5.0         3.6          1.4         0.2  Iris-setosa
 
 Python UDF Debugging Tool
 -------------------------
 
 .. code:: python
 
-    #file: plus.py
-    from odps.udf import annotate
+   #file: plus.py
+   from odps.udf import annotate
 
-    @annotate('bigint,bigint->bigint')
-    class Plus(object):
-        def evaluate(self, a, b):
-            return a + b
+   @annotate('bigint,bigint->bigint')
+   class Plus(object):
+       def evaluate(self, a, b):
+           return a + b
 
 ::
 
-    $ cat plus.input
-    1,1
-    3,2
-    $ pyou plus.Plus < plus.input
-    2
-    5
+   $ cat plus.input
+   1,1
+   3,2
+   $ pyou plus.Plus < plus.input
+   2
+   5
 
 Contributing
 ------------
@@ -171,29 +174,23 @@ source:
 
 ::
 
-    git clone https://github.com/aliyun/aliyun-odps-python-sdk
-    cd pyodps
-    pip install -r requirements.txt -e .
+   git clone https://github.com/aliyun/aliyun-odps-python-sdk
+   cd pyodps
+   pip install -r requirements.txt -e .
 
 If you need to modify the frontend code, you need to install
 `nodejs/npm <https://www.npmjs.com/>`__. To build and install your
 frontend code, use
 
 ::
 
-    python setup.py build_js
-    python setup.py install_js
+   python setup.py build_js
+   python setup.py install_js
 
 License
 -------
 
 Licensed under the `Apache License
 2.0 <https://www.apache.org/licenses/LICENSE-2.0.html>`__
 
-.. |PyPI version| image:: https://img.shields.io/pypi/v/pyodps.svg?style=flat-square
-   :target: https://pypi.python.org/pypi/pyodps
-.. |Docs| image:: https://img.shields.io/badge/docs-latest-brightgreen.svg?style=flat-square
-   :target: http://pyodps.readthedocs.org/
-.. |License| image:: https://img.shields.io/pypi/l/pyodps.svg?style=flat-square
-   :target: https://github.com/aliyun/aliyun-odps-python-sdk/blob/master/License
 .. |Implementation| image:: https://img.shields.io/pypi/implementation/pyodps.svg?style=flat-square
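@@ -279,6 +279,9 @@ A Record represents one row of a table; calling new_record on a Table object
     Too many files also slow down subsequent queries. We therefore recommend writing several groups of data
     in a single call when using the write_table method, or passing in a generator object.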
 
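+    write_table appends to the existing data when writing to a table. PyODPS provides no option to overwrite data;
+    to overwrite, clear the existing data manually. For a non-partitioned table, call table.truncate(); for a partitioned table, drop the partition and then create it again.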
+
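Reviewer note: the clear-then-append procedure added above can be sketched as a small helper. This is a minimal sketch, not part of PyODPS: ``overwrite_table`` is a hypothetical name, ``o`` is assumed to be an ``odps.ODPS`` entry object, and the ``delete_partition`` / ``create_partition`` calls with their ``if_exists`` / ``if_not_exists`` keywords reflect my understanding of the Table API and should be checked against the installed version. Only ``table.truncate()`` and the drop-then-recreate step are stated in the text itself.

```python
def overwrite_table(o, table_name, records, partition=None):
    """Emulate an overwrite by clearing existing data before appending.

    Hypothetical helper (not provided by PyODPS). `o` is assumed to be an
    odps.ODPS entry object; `partition` is a spec string such as 'pt=test'.
    """
    table = o.get_table(table_name)
    if partition is None:
        # Non-partitioned table: remove all existing rows first.
        table.truncate()
        o.write_table(table_name, records)
    else:
        # Partitioned table: drop the partition, then create it again.
        table.delete_partition(partition, if_exists=True)
        table.create_partition(partition, if_not_exists=True)
        # write_table appends, so the fresh partition holds only `records`.
        o.write_table(table_name, records, partition=partition)
```

Passing a generator as ``records`` keeps the single-call write pattern recommended in the surrounding text.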
 Delete tables
 -------------
 
 
@@ -810,7 +810,9 @@ ResultFrame can also be converted to a pandas DataFrame when pandas is installed
     3          5.0         2.0          3.5         1.0  Iris-versicolor
     4          6.0         2.2          4.0         1.0  Iris-versicolor
 
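-``persist`` accepts a partitions argument, which creates a table whose partitions are the fields specified by partitions.
+``persist`` accepts a partitions argument. When it is given, a partitioned table is created whose partition columns are the fields listed in partitions,
+and the values of those fields in the DataFrame determine the partition each row is written to. For example, when partitions is ['name'] and a row's name value is test,
+that row is written to partition ``name=test``. This is useful when the partition has to be obtained by computation.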
 
 .. code:: python
 
@@ -827,7 +829,7 @@ ResultFrame can also be converted to a pandas DataFrame when pandas is installed
         name                  : string
 
 
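-To write to a partition of an existing table, ``persist`` accepts a partition argument that names the target partition (e.g. ds=******).
+To write to a partition of an existing table, ``persist`` accepts a partition argument that names the target partition (e.g. ds=******).
 Note that in this case every field of the DataFrame must exist in the table with the same type. The drop_partition and create_partition arguments only take effect in this case,
 and indicate whether to drop the partition (if it exists) or create it (if it does not exist), respectively.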
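
Reviewer note: the routing rule described for the ``partitions`` argument (field values decide the target partition) can be illustrated in plain Python. This is only a sketch of the rule as worded in the docs, not of how PyODPS implements ``persist``; ``partition_spec`` and ``group_by_partition`` are hypothetical names, and the comma-joined form for multiple partition fields is an assumption.

```python
from collections import defaultdict


def partition_spec(row, partitions):
    """Build the partition spec (e.g. 'name=test') that a row maps to."""
    return ",".join("{}={}".format(col, row[col]) for col in partitions)


def group_by_partition(rows, partitions):
    """Group rows by the partition their field values select."""
    groups = defaultdict(list)
    for row in rows:
        groups[partition_spec(row, partitions)].append(row)
    return dict(groups)


rows = [
    {"name": "test", "value": 1},
    {"name": "prod", "value": 2},
    {"name": "test", "value": 3},
]
groups = group_by_partition(rows, ["name"])
print(sorted(groups))  # ['name=prod', 'name=test']
```

By contrast, the singular ``partition`` argument fixes one target partition for every row, which is why the DataFrame fields must then match the existing table.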