pandas-dev · jreback · Jul 5, 2018 · Jun 14, 2018 · Jun 15, 2018 · Jun 15, 2018
diff --git a/doc/source/whatsnew/v0.24.0.txt b/doc/source/whatsnew/v0.24.0.txt
@@ -40,6 +40,7 @@ Other Enhancements
   <https://pandas-gbq.readthedocs.io/en/latest/changelog.html#changelog-0-5-0>`__.
   (:issue:`21627`)
 - New method :meth:`HDFStore.walk` will recursively walk the group hierarchy of an HDF5 file (:issue:`10932`)
+- :func:`read_html` now copies cell data across ``colspan`` and ``rowspan`` spans, and treats all-``th`` table rows as headers if the ``header`` kwarg is not given and there is no ``thead`` (:issue:`17054`)
 - :meth:`Series.nlargest`, :meth:`Series.nsmallest`, :meth:`DataFrame.nlargest`, and :meth:`DataFrame.nsmallest` now accept the value ``"all"`` for the ``keep`` argument. This keeps all ties for the nth largest/smallest value (:issue:`16818`)
 -
 
@@ -167,6 +168,120 @@ Current Behavior:
     ...
     OverflowError: Trying to coerce negative values to unsigned integers
 
+read_html Incompatibilities
+---------------------------
+
+:func:`read_html` previously ignored ``colspan`` and ``rowspan`` attributes.
+It now honors them, expanding each spanning cell into a sequence of cells
+with the same value.
+
+Previous Behavior:
+
+.. code-block:: ipython
+
+    In [1]: pd.read_html("""
+          <table>
+            <thead>
+              <tr>
+                <th>A</th><th>B</th><th>C</th>
+              </tr>
+            </thead>
+            <tbody>
+              <tr>
+                <td colspan="2">1</td><td>2</td>
+              </tr>
+            </tbody>
+          </table>
+        """)
+    Out [1]:
+    [   A  B   C
+    0  1  2 NaN]
+
+Current Behavior:
+
+.. code-block:: ipython
+
+    In [1]: pd.read_html("""
+          <table>
+            <thead>
+              <tr>
+                <th>A</th><th>B</th><th>C</th>
+              </tr>
+            </thead>
+            <tbody>
+              <tr>
+                <td colspan="2">1</td><td>2</td>
+              </tr>
+            </tbody>
+          </table>
+        """)
+    Out [1]:
+    [   A  B  C
+    0  1  1  2]
+
+Calls that relied on the previous behavior will need to be changed.
+
+Also, :func:`read_html` previously ignored all-whitespace ``<tr>`` elements
+within ``<thead>`` when applying the ``header`` and ``skiprows`` arguments.
+(:issue:`21641`)
+
+Previous Behavior:
+
+.. code-block:: ipython
+
+    In [1]: pd.read_html("""
+          <table>
+            <thead>
+              <tr>
+                <!-- empty header row, was ignored -->
+                <th></th><th></th><th></th>
+              </tr>
+              <tr>
+                <th>A</th><th>B</th><th>C</th>
+              </tr>
+            </thead>
+            <tbody>
+              <tr>
+                <td>1</td><td>2</td><td>3</td>
+              </tr>
+            </tbody>
+          </table>
+        """, header=1)
+    Out [1]:
+    [Empty DataFrame
+    Columns: [1, 2, 3]
+    Index: []]
+
+Current Behavior:
+
+.. code-block:: ipython
+
+    In [1]: pd.read_html("""
+          <table>
+            <thead>
+              <tr>
+                <!-- empty header row, was ignored -->
+                <th></th><th></th><th></th>
+              </tr>
+              <tr>
+                <th>A</th><th>B</th><th>C</th>
+              </tr>
+            </thead>
+            <tbody>
+              <tr>
+                <td>1</td><td>2</td><td>3</td>
+              </tr>
+            </tbody>
+          </table>
+        """, header=1)
+    Out [1]:
+    [   A  B  C
+    0  1  2  3]
+
+Previously, the workaround was to pass ``header=0`` instead of ``header=1``
+for this example table. Code using that workaround must now pass the actual
+row index. Few users should be affected: most tables lack empty header rows.
+
 - :class:`DatetimeIndex` now accepts :class:`Int64Index` arguments as epoch timestamps (:issue:`20997`)
 -
 -
@@ -297,7 +412,7 @@ MultiIndex
 I/O
 ^^^
 
--
+- :func:`read_html` no longer ignores all-whitespace ``<tr>`` elements within ``<thead>`` when applying the ``skiprows`` and ``header`` arguments. Previously, users had to decrease their ``header`` and ``skiprows`` values on such tables to work around the issue. (:issue:`21641`)
 -
 -
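
For reviewers who want to see the ``colspan`` rule in isolation, here is a minimal standard-library sketch of "one spanning cell becomes a run of cells with the same value". It is not the parser code in this PR: `ColspanTableParser` and `expand_colspans` are hypothetical names made up for illustration, and `rowspan` handling is left out to keep it short.

```python
from html.parser import HTMLParser


class ColspanTableParser(HTMLParser):
    """Hypothetical helper: collect table rows, repeating a cell's text across its colspan."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row currently being built, or None outside a <tr>
        self._span = None  # colspan of the open cell, or None outside a cell
        self._text = []    # text fragments collected for the open cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            # A missing colspan attribute means the cell covers one column.
            self._span = int(dict(attrs).get("colspan") or 1)
            self._text = []

    def handle_data(self, data):
        # Only keep text that sits inside a cell.
        if self._span is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._span is not None:
            # Copy the cell value into every column the cell spans.
            self._row.extend(["".join(self._text).strip()] * self._span)
            self._span = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def expand_colspans(html):
    """Return the table rows of `html` with colspan cells expanded."""
    parser = ColspanTableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


HTML = """
<table>
  <thead><tr><th>A</th><th>B</th><th>C</th></tr></thead>
  <tbody><tr><td colspan="2">1</td><td>2</td></tr></tbody>
</table>
"""

print(expand_colspans(HTML))
```

With the table from the whatsnew example, the body row comes back as three cells, with the spanning value repeated under both `A` and `B`.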