+ problem 591

fruit-in · fruit-in · commit 58cfdf10d6e1 · 2025-04-19T10:47:21.000+08:00
diff --git a/Problemset/0591-Tag Validator/README.md b/Problemset/0591-Tag Validator/README.md
@@ -0,0 +1,99 @@
+# 591. Tag Validator
+Given a string representing a code snippet, implement a tag validator to parse the code and return whether it is valid.
+
+A code snippet is valid if all the following rules hold:
+1. The code must be wrapped in a **valid closed tag**. Otherwise, the code is invalid.
+2. A **closed tag** (not necessarily valid) has exactly the following format: `<TAG_NAME>TAG_CONTENT</TAG_NAME>`. Here, `<TAG_NAME>` is the start tag and `</TAG_NAME>` is the end tag. The TAG_NAME in the start and end tags must be the same. A closed tag is **valid** if and only if both the TAG_NAME and the TAG_CONTENT are valid.
+3. A **valid** `TAG_NAME` contains only **upper-case letters** and has a length in the range [1, 9]. Otherwise, the `TAG_NAME` is **invalid**.
+4. A **valid** `TAG_CONTENT` may contain other **valid closed tags**, **cdata** and any characters (see the constraints below) **EXCEPT** unmatched `<`, unmatched start and end tags, and unmatched or closed tags with an invalid TAG_NAME. Otherwise, the `TAG_CONTENT` is **invalid**.
+5. A start tag is unmatched if no end tag exists with the same TAG_NAME, and vice versa. However, you also need to account for unbalanced tags when tags are nested.
+6. A `<` is unmatched if you cannot find a subsequent `>`. When you find a `<` or `</`, all the subsequent characters until the next `>` should be parsed as TAG_NAME (not necessarily valid).
+7. The cdata has the following format: `<![CDATA[CDATA_CONTENT]]>`. The range of `CDATA_CONTENT` is defined as the characters between `<![CDATA[` and the **first subsequent** `]]>`.
+8. `CDATA_CONTENT` may contain **any characters**. The purpose of cdata is to prevent the validator from parsing `CDATA_CONTENT`, so even if it contains characters that could be parsed as a tag (valid or invalid), you should treat them as **regular characters**.
+
+#### Example 1:
+<pre>
+<strong>Input:</strong> code = "<DIV>This is the first line <![CDATA[<div>]]></DIV>"
+<strong>Output:</strong> true
+<strong>Explanation:</strong>
+The code is wrapped in a closed tag: <DIV> and </DIV>.
+The TAG_NAME is valid, the TAG_CONTENT consists of some characters and cdata.
+Although CDATA_CONTENT has an unmatched start tag with invalid TAG_NAME, it should be considered as plain text, not parsed as a tag.
+So TAG_CONTENT is valid, and then the code is valid. Thus return true.
+</pre>
+
+#### Example 2:
+<pre>
+<strong>Input:</strong> code = "<DIV>>>  ![cdata[]] <![CDATA[<div>]>]]>]]>>]</DIV>"
+<strong>Output:</strong> true
+<strong>Explanation:</strong>
+We first separate the code into: start_tag|tag_content|end_tag.
+start_tag -> "<DIV>"
+end_tag -> "</DIV>"
+tag_content can in turn be separated into: text1|cdata|text2.
+text1 -> ">>  ![cdata[]] "
+cdata -> "<![CDATA[<div>]>]]>", where the CDATA_CONTENT is "<div>]>"
+text2 -> "]]>>]"
+start_tag is NOT "<DIV>>>" because of rule 6.
+cdata is NOT "<![CDATA[<div>]>]]>]]>" because of rule 7.
+</pre>
+
+#### Example 3:
+<pre>
+<strong>Input:</strong> code = "<A>  <B> </A>   </B>"
+<strong>Output:</strong> false
+<strong>Explanation:</strong> Unbalanced. If "<A>" is closed, then "<B>" must be unmatched, and vice versa.
+</pre>
+
+#### Constraints:
+* `1 <= code.length <= 500`
+* `code` consists of English letters, digits, `'<'`, `'>'`, `'/'`, `'!'`, `'['`, `']'`, `'.'`, and `' '`.
+
+## Solutions (Python)
+
+### 1. Solution
+```Python
+class Solution:
+    def isValid(self, code: str) -> bool:
+        cdata = False
+        tagstack = []
+        i = 0
+
+        while i < len(code):
+            if cdata:
+                if code[i:i + 3] == "]]>":
+                    cdata = False
+                    i += 2
+            elif tagstack == [] and (code[i] != '<' or code[i:i + 2] in ("</", "<!")):
+                return False
+            elif code[i:i + 9] == "<![CDATA[":
+                cdata = True
+                i += 8
+            elif code[i:i + 2] == "</":
+                for j in range(i + 2, i + 13):
+                    if j >= len(code) or j == i + 12 or (j == i + 2 and code[j] == '>'):
+                        return False
+                    elif code[j] == '>':
+                        if tagstack.pop() != code[i + 2:j]:
+                            return False
+                        if tagstack == [] and j != len(code) - 1:
+                            return False
+                        i = j
+                        break
+                    elif not code[j].isupper():
+                        return False
+            elif code[i] == '<':
+                for j in range(i + 1, i + 12):
+                    if j >= len(code) or j == i + 11 or (j == i + 1 and code[j] == '>'):
+                        return False
+                    elif code[j] == '>':
+                        tagstack.append(code[i + 1:j])
+                        i = j
+                        break
+                    elif not code[j].isupper():
+                        return False
+
+            i += 1
+
+        return tagstack == []
+```
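Rule 7 (CDATA_CONTENT ends at the first subsequent `]]>`) can be illustrated in isolation. The following is a minimal sketch, not part of the committed solution; the helper name `read_cdata` and its return shape are assumptions made for illustration.

```python
from typing import Optional, Tuple

CDATA_OPEN = "<![CDATA["
CDATA_CLOSE = "]]>"


def read_cdata(code: str, start: int) -> Optional[Tuple[str, int]]:
    """Hypothetical helper: parse a cdata section beginning at `start`.

    Returns (CDATA_CONTENT, index just past the closing "]]>"), or None
    if `start` is not at "<![CDATA[" or the section is never closed.
    """
    if not code.startswith(CDATA_OPEN, start):
        return None
    body_start = start + len(CDATA_OPEN)
    # Rule 7: the content ends at the FIRST subsequent "]]>".
    end = code.find(CDATA_CLOSE, body_start)
    if end == -1:
        return None
    return code[body_start:end], end + len(CDATA_CLOSE)
```

Applied to Example 2, starting at the `<![CDATA[` marker, this yields the content `<div>]>` and leaves `]]>>]</DIV>` unparsed, which matches the text2 and end_tag split in the explanation.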
diff --git a/Problemset/0591-Tag Validator/README_CN.md b/Problemset/0591-Tag Validator/README_CN.md
@@ -0,0 +1,103 @@
+# 591. 标签验证器
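+Given a string representing a code snippet, implement a validator that parses the code and returns whether it is valid. A valid code snippet must obey all of the following rules:
+1. The code must be wrapped in a **valid closed tag**. Otherwise, the code is invalid.
+2. A **closed tag** (not necessarily valid) must strictly follow the format: `<TAG_NAME>TAG_CONTENT</TAG_NAME>`. Here, `<TAG_NAME>` is the start tag and `</TAG_NAME>` is the end tag. The TAG_NAME in the start and end tags must be the same. A closed tag is **valid** if and only if both the TAG_NAME and the TAG_CONTENT are valid.
+3. A **valid** `TAG_NAME` contains only **upper-case letters** and has a length in the range [1, 9]. Otherwise, the `TAG_NAME` is **invalid**.
+4. A **valid** `TAG_CONTENT` may contain other **valid closed tags**, **cdata** (see rule 7) and any characters (see note 1), except unmatched `<`, unmatched start and end tags, and unmatched or closed tags with an invalid TAG_NAME. Otherwise, the `TAG_CONTENT` is **invalid**.
+5. A start tag is invalid if no end tag with the same TAG_NAME matches it, and vice versa. However, you also need to consider tag nesting.
+6. A `<` is invalid if you cannot find a subsequent `>` to match it. When you find a `<` or `</`, all characters up to the next `>` should be parsed as TAG_NAME (not necessarily valid).
+7. The cdata has the following format: `<![CDATA[CDATA_CONTENT]]>`. The range of `CDATA_CONTENT` is defined as the characters between `<![CDATA[` and the **first subsequent** `]]>`.
+8. `CDATA_CONTENT` may contain **any characters**. The purpose of cdata is to prevent the validator from parsing `CDATA_CONTENT`, so even if it contains characters that could be parsed as a tag (valid or invalid), they should be treated as **regular characters**.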
+
+#### Examples of valid code:
+<pre>
+<strong>Input:</strong> "<DIV>This is the first line <![CDATA[<div>]]></DIV>"
+<strong>Output:</strong> True
+<strong>Explanation:</strong>
+The code is wrapped in a closed tag: <DIV> and </DIV>.
+The TAG_NAME is valid, and the TAG_CONTENT consists of some characters and cdata.
+Although CDATA_CONTENT contains an unmatched start tag with an invalid TAG_NAME, it should be treated as plain text, not as a tag.
+So the TAG_CONTENT is valid, and therefore the code is valid. Thus return True.
+
+<strong>Input:</strong> "<DIV>>>  ![cdata[]] <![CDATA[<div>]>]]>]]>>]</DIV>"
+<strong>Output:</strong> True
+<strong>Explanation:</strong>
+We first separate the code into: start_tag|tag_content|end_tag.
+start_tag -> "<DIV>"
+end_tag -> "</DIV>"
+tag_content can in turn be separated into: text1|cdata|text2.
+text1 -> ">>  ![cdata[]] "
+cdata -> "<![CDATA[<div>]>]]>", where the CDATA_CONTENT is "<div>]>"
+text2 -> "]]>>]"
+start_tag is NOT "<DIV>>>" because of rule 6.
+cdata is NOT "<![CDATA[<div>]>]]>]]>" because of rule 7.
+</pre>
+
+#### Examples of invalid code:
+<pre>
+<strong>Input:</strong> "<A>  <B> </A>   </B>"
+<strong>Output:</strong> False
+<strong>Explanation:</strong> Unbalanced. If "<A>" is closed, then "<B>" must be unmatched, and vice versa.
+<strong>Input:</strong> "<DIV>  div tag is not closed  <DIV>"
+<strong>Output:</strong> False
+<strong>Input:</strong> "<DIV>  unmatched <  </DIV>"
+<strong>Output:</strong> False
+<strong>Input:</strong> "<DIV> closed tags with invalid tag name  <b>123</b> </DIV>"
+<strong>Output:</strong> False
+<strong>Input:</strong> "<DIV> unmatched tags with invalid tag name  </1234567890> and <CDATA[[]]>  </DIV>"
+<strong>Output:</strong> False
+<strong>Input:</strong> "<DIV>  unmatched start tag <B>  and unmatched end tag </C>  </DIV>"
+<strong>Output:</strong> False
+</pre>
+
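+#### Note:
+1. For simplicity, you can assume the input code (including the **any characters** mentioned above) contains only `digits`, `letters`, `'<'`, `'>'`, `'/'`, `'!'`, `'['`, `']'` and `' '`.
+
+## Solutions (Python)
+
+### 1. Solution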
+```Python
+class Solution:
+    def isValid(self, code: str) -> bool:
+        cdata = False
+        tagstack = []
+        i = 0
+
+        while i < len(code):
+            if cdata:
+                if code[i:i + 3] == "]]>":
+                    cdata = False
+                    i += 2
+            elif tagstack == [] and (code[i] != '<' or code[i:i + 2] in ("</", "<!")):
+                return False
+            elif code[i:i + 9] == "<![CDATA[":
+                cdata = True
+                i += 8
+            elif code[i:i + 2] == "</":
+                for j in range(i + 2, i + 13):
+                    if j >= len(code) or j == i + 12 or (j == i + 2 and code[j] == '>'):
+                        return False
+                    elif code[j] == '>':
+                        if tagstack.pop() != code[i + 2:j]:
+                            return False
+                        if tagstack == [] and j != len(code) - 1:
+                            return False
+                        i = j
+                        break
+                    elif not code[j].isupper():
+                        return False
+            elif code[i] == '<':
+                for j in range(i + 1, i + 12):
+                    if j >= len(code) or j == i + 11 or (j == i + 1 and code[j] == '>'):
+                        return False
+                    elif code[j] == '>':
+                        tagstack.append(code[i + 1:j])
+                        i = j
+                        break
+                    elif not code[j].isupper():
+                        return False
+
+            i += 1
+
+        return tagstack == []
+```
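Rule 3 (a TAG_NAME is valid only if it consists of 1 to 9 upper-case letters) can also be checked on its own. The sketch below uses a regular expression instead of the per-character `isupper()` loop in the solution above; the helper name `is_valid_tag_name` is hypothetical.

```python
import re

# Rule 3: upper-case ASCII letters only, length between 1 and 9.
TAG_NAME_RE = re.compile(r"[A-Z]{1,9}")


def is_valid_tag_name(name: str) -> bool:
    """Hypothetical helper: True if `name` satisfies rule 3."""
    # fullmatch requires the whole string to match, so over-long names fail.
    return TAG_NAME_RE.fullmatch(name) is not None
```

Under this check `DIV` passes, while `div`, the empty name, and `1234567890` from the invalid examples are all rejected.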
diff --git a/Problemset/0591-Tag Validator/Solution.py b/Problemset/0591-Tag Validator/Solution.py
@@ -0,0 +1,43 @@
+class Solution:
+    def isValid(self, code: str) -> bool:
+        cdata = False
+        tagstack = []
+        i = 0
+
+        while i < len(code):
+            if cdata:
+                if code[i:i + 3] == "]]>":
+                    cdata = False
+                    i += 2
+            elif tagstack == [] and (code[i] != '<' or code[i:i + 2] in ("</", "<!")):
+                return False
+            elif code[i:i + 9] == "<![CDATA[":
+                cdata = True
+                i += 8
+            elif code[i:i + 2] == "</":
+                for j in range(i + 2, i + 13):
+                    if j >= len(code) or j == i + 12 or (j == i + 2 and code[j] == '>'):
+                        return False
+                    elif code[j] == '>':
+                        if tagstack.pop() != code[i + 2:j]:
+                            return False
+                        if tagstack == [] and j != len(code) - 1:
+                            return False
+                        i = j
+                        break
+                    elif not code[j].isupper():
+                        return False
+            elif code[i] == '<':
+                for j in range(i + 1, i + 12):
+                    if j >= len(code) or j == i + 11 or (j == i + 1 and code[j] == '>'):
+                        return False
+                    elif code[j] == '>':
+                        tagstack.append(code[i + 1:j])
+                        i = j
+                        break
+                    elif not code[j].isupper():
+                        return False
+
+            i += 1
+
+        return tagstack == []
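One way to cross-check the stack-based `Solution.isValid` above is a regex-reduction validator. This is a different technique from the committed code, offered only as a sketch: it first blanks out every cdata section (and every lower-case `t`, so the placeholder below cannot collide with input text), then repeatedly collapses innermost valid closed tags into the placeholder `t`. The function name `is_valid_by_reduction` is an assumption, and the sketch relies on the stated constraint that the input contains no newlines.

```python
import re

# A cdata section (lazy match up to the first "]]>") or a literal 't'.
CDATA_OR_T = re.compile(r"<!\[CDATA\[.*?\]\]>|t")
# An innermost closed tag: valid name, content without '<', same name closing.
INNERMOST_TAG = re.compile(r"<([A-Z]{1,9})>[^<]*</\1>")


def is_valid_by_reduction(code: str) -> bool:
    """Hypothetical cross-check: reduce the snippet until it stops changing."""
    # Neutralise cdata and stray 't' characters with a harmless '-'.
    code = CDATA_OR_T.sub("-", code)
    previous = None
    while code != previous:
        previous = code
        # Collapse every innermost valid closed tag into the placeholder 't'.
        code = INNERMOST_TAG.sub("t", code)
    # Valid code collapses to exactly one placeholder.
    return code == "t"


if __name__ == "__main__":
    print(is_valid_by_reduction("<DIV>This is the first line <![CDATA[<div>]]></DIV>"))
    print(is_valid_by_reduction("<A>  <B> </A>   </B>"))
```

Running both validators over the examples from the README and comparing their answers is a cheap sanity test for edge cases such as text outside the outermost tag.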
diff --git a/README.md b/README.md
@@ -412,6 +412,7 @@
 [587][587l]  |[Erect the Fence][587]                                                                |![rs]
 [589][589l]  |[N-ary Tree Preorder Traversal][589]                                                  |![py]
 [590][590l]  |[N-ary Tree Postorder Traversal][590]                                                 |![py]
+[591][591l]  |[Tag Validator][591]                                                                  |![py]
 [592][592l]  |[Fraction Addition and Subtraction][592]                                              |![rs]
 [593][593l]  |[Valid Square][593]                                                                   |![rs]
 [594][594l]  |[Longest Harmonious Subsequence][594]                                                 |![rs]
@@ -2099,6 +2100,7 @@
 [587]:Problemset/0587-Erect%20the%20Fence/README.md#587-erect-the-fence
 [589]:Problemset/0589-N-ary%20Tree%20Preorder%20Traversal/README.md#589-n-ary-tree-preorder-traversal
 [590]:Problemset/0590-N-ary%20Tree%20Postorder%20Traversal/README.md#590-n-ary-tree-postorder-traversal
+[591]:Problemset/0591-Tag%20Validator/README.md#591-tag-validator
 [592]:Problemset/0592-Fraction%20Addition%20and%20Subtraction/README.md#592-fraction-addition-and-subtraction
 [593]:Problemset/0593-Valid%20Square/README.md#593-valid-square
 [594]:Problemset/0594-Longest%20Harmonious%20Subsequence/README.md#594-longest-harmonious-subsequence
@@ -3780,6 +3782,7 @@
 [587l]:https://leetcode.com/problems/erect-the-fence/
 [589l]:https://leetcode.com/problems/n-ary-tree-preorder-traversal/
 [590l]:https://leetcode.com/problems/n-ary-tree-postorder-traversal/
+[591l]:https://leetcode.com/problems/tag-validator/
 [592l]:https://leetcode.com/problems/fraction-addition-and-subtraction/
 [593l]:https://leetcode.com/problems/valid-square/
 [594l]:https://leetcode.com/problems/longest-harmonious-subsequence/
diff --git a/README_CN.md b/README_CN.md
@@ -412,6 +412,7 @@
 [587][587l]  |[安装栅栏][587]                                           |![rs]
 [589][589l]  |[N叉树的前序遍历][589]                                    |![py]
 [590][590l]  |[N叉树的后序遍历][590]                                    |![py]
+[591][591l]  |[标签验证器][591]                                         |![py]
 [592][592l]  |[分数加减运算][592]                                       |![rs]
 [593][593l]  |[有效的正方形][593]                                       |![rs]
 [594][594l]  |[最长和谐子序列][594]                                     |![rs]
@@ -2099,6 +2100,7 @@
 [587]:Problemset/0587-Erect%20the%20Fence/README_CN.md#587-安装栅栏
 [589]:Problemset/0589-N-ary%20Tree%20Preorder%20Traversal/README_CN.md#589-n叉树的前序遍历
 [590]:Problemset/0590-N-ary%20Tree%20Postorder%20Traversal/README_CN.md#590-n叉树的后序遍历
+[591]:Problemset/0591-Tag%20Validator/README_CN.md#591-标签验证器
 [592]:Problemset/0592-Fraction%20Addition%20and%20Subtraction/README_CN.md#592-分数加减运算
 [593]:Problemset/0593-Valid%20Square/README_CN.md#593-有效的正方形
 [594]:Problemset/0594-Longest%20Harmonious%20Subsequence/README_CN.md#594-最长和谐子序列
@@ -3780,6 +3782,7 @@
 [587l]:https://leetcode.cn/problems/erect-the-fence/
 [589l]:https://leetcode.cn/problems/n-ary-tree-preorder-traversal/
 [590l]:https://leetcode.cn/problems/n-ary-tree-postorder-traversal/
+[591l]:https://leetcode.cn/problems/tag-validator/
 [592l]:https://leetcode.cn/problems/fraction-addition-and-subtraction/
 [593l]:https://leetcode.cn/problems/valid-square/
 [594l]:https://leetcode.cn/problems/longest-harmonious-subsequence/