Commit 3cdb98b

Merge pull request #2523 from sparklemotion/flavorjones-pattern-matching
feat: experimental implementation of pattern matching
2 parents 371c026 + d4dcb8b commit 3cdb98b

11 files changed: +611 -6 lines changed

.rubocop.yml

Lines changed: 1 addition & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -12,6 +12,7 @@ AllCops:
1212
- 'lib/nokogiri/css/parser.rb' # generated by racc
1313
- 'lib/nokogiri/css/tokenizer.rb' # generated by rex
1414
- 'lib/nokogiri/jruby/nokogiri_jars.rb' # generated by jar-dependencies
15+
- 'test/_test_pattern_matching.rb' # until TargetRubyVersion >= 3.0
1516
TargetRubyVersion: "2.6"
1617
Naming/MethodName:
1718
Enabled: false

CHANGELOG.md

Lines changed: 16 additions & 0 deletions
@@ -18,6 +18,22 @@ This version of Nokogiri ships official native gem support for the `aarch64-linu
1818
This version of Nokogiri ships experimental native gem support for the `arm-linux` platform. Please note that glibc >= 2.29 is required for arm-linux systems, see [Supported Platforms](https://nokogiri.org/#supported-platforms) for more information.
1919

2020

21+
#### Experimental pattern matching support
22+
23+
This version introduces an experimental pattern matching API for `XML::Attr`, `XML::Document`, `XML::DocumentFragment`, `XML::Namespace`, `XML::Node`, and `XML::NodeSet` (and their subclasses).
24+
25+
Some documentation on what can be matched:
26+
27+
- [`XML::Attr#deconstruct_keys`](https://nokogiri.org/rdoc/Nokogiri/XML/Attr.html?h=deconstruct#method-i-deconstruct_keys)
28+
- [`XML::Document#deconstruct_keys`](https://nokogiri.org/rdoc/Nokogiri/XML/Document.html?h=deconstruct#method-i-deconstruct_keys)
29+
- [`XML::Namespace#deconstruct_keys`](https://nokogiri.org/rdoc/Nokogiri/XML/Namespace.html?h=deconstruct+namespace#method-i-deconstruct_keys)
30+
- [`XML::Node#deconstruct_keys`](https://nokogiri.org/rdoc/Nokogiri/XML/Node.html?h=deconstruct#method-i-deconstruct_keys)
31+
- [`XML::DocumentFragment#deconstruct`](https://nokogiri.org/rdoc/Nokogiri/XML/DocumentFragment.html?h=deconstruct#method-i-deconstruct)
32+
- [`XML::NodeSet#deconstruct`](https://nokogiri.org/rdoc/Nokogiri/XML/NodeSet.html?h=deconstruct#method-i-deconstruct)
33+
34+
We welcome feedback on this API at [#2360](https://github.com/sparklemotion/nokogiri/issues/2360).
35+
36+
2137
#### Maven-managed JRuby dependencies
2238

2339
This version of Nokogiri uses [`jar-dependencies`](https://github.com/mkristian/jar-dependencies) to manage most of the vendored Java dependencies. `nokogiri -v` now outputs maven metadata for all Java dependencies, and `Nokogiri::VERSION_INFO` also contains this metadata. [[#2432](https://github.com/sparklemotion/nokogiri/issues/2432)]

ext/nokogiri/xml_namespace.c

Lines changed: 38 additions & 6 deletions
@@ -74,10 +74,26 @@ static const rb_data_type_t nokogiri_xml_namespace_type_without_dealloc = {
7474
};
7575

7676
/*
77-
* call-seq:
78-
* prefix
77+
* :call-seq:
78+
* prefix() → String or nil
7979
*
80-
* Get the prefix for this namespace. Returns +nil+ if there is no prefix.
80+
* Return the prefix for this Namespace, or +nil+ if there is no prefix (e.g., default namespace).
81+
*
82+
* *Example*
83+
*
84+
* doc = Nokogiri::XML.parse(<<~XML)
85+
* <?xml version="1.0"?>
86+
* <root xmlns="http://nokogiri.org/ns/default" xmlns:noko="http://nokogiri.org/ns/noko">
87+
* <child1 foo="abc" noko:bar="def"/>
88+
* <noko:child2 foo="qwe" noko:bar="rty"/>
89+
* </root>
90+
* XML
91+
*
92+
* doc.root.elements.first.namespace.prefix
93+
* # => nil
94+
*
95+
* doc.root.elements.last.namespace.prefix
96+
* # => "noko"
8197
*/
8298
static VALUE
8399
prefix(VALUE self)
@@ -91,10 +107,26 @@ prefix(VALUE self)
91107
}
92108

93109
/*
94-
* call-seq:
95-
* href
110+
* :call-seq:
111+
* href() → String
112+
*
113+
* Returns the URI reference for this Namespace.
114+
*
115+
* *Example*
116+
*
117+
* doc = Nokogiri::XML.parse(<<~XML)
118+
* <?xml version="1.0"?>
119+
* <root xmlns="http://nokogiri.org/ns/default" xmlns:noko="http://nokogiri.org/ns/noko">
120+
* <child1 foo="abc" noko:bar="def"/>
121+
* <noko:child2 foo="qwe" noko:bar="rty"/>
122+
* </root>
123+
* XML
124+
*
125+
* doc.root.elements.first.namespace.href
126+
* # => "http://nokogiri.org/ns/default"
96127
*
97-
* Get the href for this namespace
128+
* doc.root.elements.last.namespace.href
129+
* # => "http://nokogiri.org/ns/noko"
98130
*/
99131
static VALUE
100132
href(VALUE self)

lib/nokogiri/xml/attr.rb

Lines changed: 49 additions & 0 deletions
@@ -1,3 +1,4 @@
1+
# coding: utf-8
12
# frozen_string_literal: true
23

34
module Nokogiri
@@ -7,6 +8,54 @@ class Attr < Node
78
alias_method :to_s, :content
89
alias_method :content=, :value=
910

11+
#
12+
# :call-seq: deconstruct_keys(array_of_names) → Hash
13+
#
14+
# Returns a hash describing the Attr, to use in pattern matching.
15+
#
16+
# Valid keys and their values:
17+
# - +name+ → (String) The name of the attribute.
18+
# - +value+ → (String) The value of the attribute.
19+
# - +namespace+ → (Namespace, nil) The Namespace of the attribute, or +nil+ if there is no namespace.
20+
#
21+
# ⚡ This is an experimental feature, available since v1.14.0
22+
#
23+
# *Example*
24+
#
25+
# doc = Nokogiri::XML.parse(<<~XML)
26+
# <?xml version="1.0"?>
27+
# <root xmlns="http://nokogiri.org/ns/default" xmlns:noko="http://nokogiri.org/ns/noko">
28+
# <child1 foo="abc" noko:bar="def"/>
29+
# </root>
30+
# XML
31+
#
32+
# attributes = doc.root.elements.first.attribute_nodes
33+
# # => [#(Attr:0x35c { name = "foo", value = "abc" }),
34+
# # #(Attr:0x370 {
35+
# # name = "bar",
36+
# # namespace = #(Namespace:0x384 {
37+
# # prefix = "noko",
38+
# # href = "http://nokogiri.org/ns/noko"
39+
# # }),
40+
# # value = "def"
41+
# # })]
42+
#
43+
# attributes.first.deconstruct_keys([:name, :value, :namespace])
44+
# # => {:name=>"foo", :value=>"abc", :namespace=>nil}
45+
#
46+
# attributes.last.deconstruct_keys([:name, :value, :namespace])
47+
# # => {:name=>"bar",
48+
# # :value=>"def",
49+
# # :namespace=>
50+
# # #(Namespace:0x384 {
51+
# # prefix = "noko",
52+
# # href = "http://nokogiri.org/ns/noko"
53+
# # })}
54+
#
55+
def deconstruct_keys(keys)
56+
{ name: name, value: value, namespace: namespace }
57+
end
58+
1059
private
1160

1261
def inspect_attributes
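
A note on how the `Attr#deconstruct_keys` hash above is consumed: Ruby's `case`/`in` calls `deconstruct_keys` on the matched object whenever a hash pattern is used. The sketch below uses a stand-in class (`AttrSketch`, hypothetical, not part of Nokogiri) that returns the same three keys, with the namespace simplified to a plain string, so the matching behavior can be tried without the gem installed. It assumes Ruby 2.7 or later.

```ruby
# Stand-in for illustration only; this is NOT Nokogiri::XML::Attr.
# It returns the same three keys that the diff above documents.
class AttrSketch
  attr_reader :name, :value, :namespace

  def initialize(name, value, namespace = nil)
    @name = name
    @value = value
    @namespace = namespace
  end

  # Ruby calls this method for hash patterns such as `in { name: ... }`.
  def deconstruct_keys(_keys)
    { name: name, value: value, namespace: namespace }
  end
end

# Hypothetical helper: describe an attribute using hash patterns.
def describe_attr(attr)
  case attr
  in { name: String => name, namespace: nil }
    "plain attribute #{name}"
  in { name: String => name, namespace: }
    "namespaced attribute #{name} (#{namespace})"
  end
end
```

With a real `Nokogiri::XML::Attr`, the `namespace` value would be a `Namespace` object or `nil` rather than a string, as described in the documentation comment above.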

lib/nokogiri/xml/document.rb

Lines changed: 44 additions & 0 deletions
@@ -415,6 +415,50 @@ def xpath_doctype
415415
Nokogiri::CSS::XPathVisitor::DoctypeConfig::XML
416416
end
417417

418+
#
419+
# :call-seq: deconstruct_keys(array_of_names) → Hash
420+
#
421+
# Returns a hash describing the Document, to use in pattern matching.
422+
#
423+
# Valid keys and their values:
424+
# - +root+ → (Node, nil) The root node of the Document, or +nil+ if the document is empty.
425+
#
426+
# In the future, other keys may allow accessing things like doctype and processing
427+
# instructions. If you have a use case and would like this functionality, please let us know
428+
# by opening an issue or a discussion on the github project.
429+
#
430+
# ⚡ This is an experimental feature, available since v1.14.0
431+
#
432+
# *Example*
433+
#
434+
# doc = Nokogiri::XML.parse(<<~XML)
435+
# <?xml version="1.0"?>
436+
# <root>
437+
# <child>
438+
# </root>
439+
# XML
440+
#
441+
# doc.deconstruct_keys([:root])
442+
# # => {:root=>
443+
# # #(Element:0x35c {
444+
# # name = "root",
445+
# # children = [
446+
# # #(Text "\n" + " "),
447+
# # #(Element:0x370 { name = "child", children = [ #(Text "\n")] }),
448+
# # #(Text "\n")]
449+
# # })}
450+
#
451+
# *Example* of an empty document
452+
#
453+
# doc = Nokogiri::XML::Document.new
454+
#
455+
# doc.deconstruct_keys([:root])
456+
# # => {:root=>nil}
457+
#
458+
def deconstruct_keys(keys)
459+
{ root: root }
460+
end
461+
418462
private
419463

420464
IMPLIED_XPATH_CONTEXTS = ["//"].freeze # :nodoc:
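
Because `Document#deconstruct_keys` exposes `root`, and the root node in turn responds to `deconstruct_keys`, hash patterns can be nested. The following is a minimal sketch of that idea using two stand-in classes (`DocSketch` and `ElementSketch`, both hypothetical and unrelated to Nokogiri's real classes); it covers the empty-document case from the documentation comment above as well. It assumes Ruby 2.7 or later.

```ruby
# Stand-ins for illustration only; these are NOT Nokogiri classes.
class ElementSketch
  def initialize(name)
    @name = name
  end

  def deconstruct_keys(_keys)
    { name: @name }
  end
end

class DocSketch
  def initialize(root = nil)
    @root = root
  end

  # Mirrors the single :root key documented in the diff above.
  def deconstruct_keys(_keys)
    { root: @root }
  end
end

# Hypothetical helper: return the root element name, or :empty for an
# empty document. The inner `{ name: }` pattern is matched against the
# root object through its own deconstruct_keys.
def root_name(doc)
  case doc
  in { root: nil }
    :empty
  in { root: { name: } }
    name
  end
end
```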

lib/nokogiri/xml/document_fragment.rb

Lines changed: 47 additions & 0 deletions
@@ -1,3 +1,4 @@
1+
# coding: utf-8
12
# frozen_string_literal: true
23

34
module Nokogiri
@@ -144,6 +145,52 @@ def fragment(data)
144145
document.fragment(data)
145146
end
146147

148+
#
149+
# :call-seq: deconstruct() → Array
150+
#
151+
# Returns the root nodes of this document fragment as an array, to use in pattern matching.
152+
#
153+
# 💡 Note that text nodes are returned as well as elements. If you wish to operate only on
154+
# root elements, you should deconstruct the array returned by
155+
# <tt>DocumentFragment#elements</tt>.
156+
#
157+
# ⚡ This is an experimental feature, available since v1.14.0
158+
#
159+
# *Example*
160+
#
161+
# frag = Nokogiri::HTML5.fragment(<<~HTML)
162+
# <div>Start</div>
163+
# This is a <a href="#jump">shortcut</a> for you.
164+
# <div>End</div>
165+
# HTML
166+
#
167+
# frag.deconstruct
168+
# # => [#(Element:0x35c { name = "div", children = [ #(Text "Start")] }),
169+
# # #(Text "\n" + "This is a "),
170+
# # #(Element:0x370 {
171+
# # name = "a",
172+
# # attributes = [ #(Attr:0x384 { name = "href", value = "#jump" })],
173+
# # children = [ #(Text "shortcut")]
174+
# # }),
175+
# # #(Text " for you.\n"),
176+
# # #(Element:0x398 { name = "div", children = [ #(Text "End")] }),
177+
# # #(Text "\n")]
178+
#
179+
# *Example* only the elements, not the text nodes.
180+
#
181+
# frag.elements.deconstruct
182+
# # => [#(Element:0x35c { name = "div", children = [ #(Text "Start")] }),
183+
# # #(Element:0x370 {
184+
# # name = "a",
185+
# # attributes = [ #(Attr:0x384 { name = "href", value = "#jump" })],
186+
# # children = [ #(Text "shortcut")]
187+
# # }),
188+
# # #(Element:0x398 { name = "div", children = [ #(Text "End")] })]
189+
#
190+
def deconstruct
191+
children.to_a
192+
end
193+
147194
private
148195

149196
# fix for issue 770
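
`DocumentFragment#deconstruct` returns an array, which is what Ruby's array patterns consume. As a rough sketch of how such a method is used, the stand-in class below (`FragmentSketch`, hypothetical, holding plain strings in place of nodes) is matched with array patterns. It assumes Ruby 2.7 or later.

```ruby
# Stand-in for illustration only; this is NOT Nokogiri::XML::DocumentFragment.
class FragmentSketch
  def initialize(children)
    @children = children
  end

  # Ruby calls this method for array patterns such as `in [first, *rest]`.
  def deconstruct
    @children.to_a
  end
end

# Hypothetical helper: return the first and last child of a fragment,
# or nil when the fragment has no children.
def first_and_last(fragment)
  case fragment
  in []
    nil
  in [only]
    [only, only]
  in [first, *, last]
    [first, last]
  end
end
```

With a real fragment, whitespace-only text nodes count as children too, which is why the documentation comment above suggests deconstructing `DocumentFragment#elements` when only elements matter.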

lib/nokogiri/xml/namespace.rb

Lines changed: 42 additions & 0 deletions
@@ -1,3 +1,4 @@
1+
# coding: utf-8
12
# frozen_string_literal: true
23

34
module Nokogiri
@@ -6,6 +7,47 @@ class Namespace
67
include Nokogiri::XML::PP::Node
78
attr_reader :document
89

10+
#
11+
# :call-seq: deconstruct_keys(array_of_names) → Hash
12+
#
13+
# Returns a hash describing the Namespace, to use in pattern matching.
14+
#
15+
# Valid keys and their values:
16+
# - +prefix+ → (String, nil) The namespace's prefix, or +nil+ if there is no prefix (e.g., default namespace).
17+
# - +href+ → (String) The namespace's URI
18+
#
19+
# ⚡ This is an experimental feature, available since v1.14.0
20+
#
21+
# *Example*
22+
#
23+
# doc = Nokogiri::XML.parse(<<~XML)
24+
# <?xml version="1.0"?>
25+
# <root xmlns="http://nokogiri.org/ns/default" xmlns:noko="http://nokogiri.org/ns/noko">
26+
# <child1 foo="abc" noko:bar="def"/>
27+
# <noko:child2 foo="qwe" noko:bar="rty"/>
28+
# </root>
29+
# XML
30+
#
31+
# doc.root.elements.first.namespace
32+
# # => #(Namespace:0x35c { href = "http://nokogiri.org/ns/default" })
33+
#
34+
# doc.root.elements.first.namespace.deconstruct_keys([:prefix, :href])
35+
# # => {:prefix=>nil, :href=>"http://nokogiri.org/ns/default"}
36+
#
37+
# doc.root.elements.last.namespace
38+
# # => #(Namespace:0x370 {
39+
# # prefix = "noko",
40+
# # href = "http://nokogiri.org/ns/noko"
41+
# # })
42+
#
43+
# doc.root.elements.last.namespace.deconstruct_keys([:prefix, :href])
44+
# # => {:prefix=>"noko", :href=>"http://nokogiri.org/ns/noko"}
45+
#
46+
#
47+
def deconstruct_keys(keys)
48+
{ prefix: prefix, href: href }
49+
end
50+
951
private
1052

1153
def inspect_attributes

lib/nokogiri/xml/node.rb

Lines changed: 63 additions & 0 deletions
@@ -1403,6 +1403,69 @@ def canonicalize(mode = XML::XML_C14N_1_0, inclusive_namespaces = nil, with_comm
14031403
end
14041404
end
14051405

1406+
DECONSTRUCT_KEYS = [:name, :attributes, :children, :namespace, :content, :elements, :inner_html].freeze # :nodoc:
1407+
DECONSTRUCT_METHODS = { attributes: :attribute_nodes }.freeze # :nodoc:
1408+
1409+
#
1410+
# :call-seq: deconstruct_keys(array_of_names) → Hash
1411+
#
1412+
# Returns a hash describing the Node, to use in pattern matching.
1413+
#
1414+
# Valid keys and their values:
1415+
# - +name+ → (String) The name of this node, or "text" if it is a Text node.
1416+
# - +namespace+ → (Namespace, nil) The namespace of this node, or nil if there is no namespace.
1417+
# - +attributes+ → (Array<Attr>) The attributes of this node.
1418+
# - +children+ → (Array<Node>) The children of this node. 💡 Note this includes text nodes.
1419+
# - +elements+ → (Array<Node>) The child elements of this node. 💡 Note this does not include text nodes.
1420+
# - +content+ → (String) The contents of all the text nodes in this node's subtree. See #content.
1421+
# - +inner_html+ → (String) The inner markup for the children of this node. See #inner_html.
1422+
#
1423+
# ⚡ This is an experimental feature, available since v1.14.0
1424+
#
1425+
# *Example*
1426+
#
1427+
# doc = Nokogiri::XML.parse(<<~XML)
1428+
# <?xml version="1.0"?>
1429+
# <parent xmlns="http://nokogiri.org/ns/default" xmlns:noko="http://nokogiri.org/ns/noko">
1430+
# <child1 foo="abc" noko:bar="def">First</child1>
1431+
# <noko:child2 foo="qwe" noko:bar="rty">Second</noko:child2>
1432+
# </parent>
1433+
# XML
1434+
#
1435+
# doc.root.deconstruct_keys([:name, :namespace])
1436+
# # => {:name=>"parent",
1437+
# # :namespace=>
1438+
# # #(Namespace:0x35c { href = "http://nokogiri.org/ns/default" })}
1439+
#
1440+
# doc.root.deconstruct_keys([:inner_html, :content])
1441+
# # => {:content=>"\n" + " First\n" + " Second\n",
1442+
# # :inner_html=>
1443+
# # "\n" +
1444+
# # " <child1 foo=\"abc\" noko:bar=\"def\">First</child1>\n" +
1445+
# # " <noko:child2 foo=\"qwe\" noko:bar=\"rty\">Second</noko:child2>\n"}
1446+
#
1447+
# doc.root.elements.first.deconstruct_keys([:attributes])
1448+
# # => {:attributes=>
1449+
# # [#(Attr:0x370 { name = "foo", value = "abc" }),
1450+
# # #(Attr:0x384 {
1451+
# # name = "bar",
1452+
# # namespace = #(Namespace:0x398 {
1453+
# # prefix = "noko",
1454+
# # href = "http://nokogiri.org/ns/noko"
1455+
# # }),
1456+
# # value = "def"
1457+
# # })]}
1458+
#
1459+
def deconstruct_keys(keys)
1460+
requested_keys = DECONSTRUCT_KEYS & keys
1461+
{}.tap do |values|
1462+
requested_keys.each do |key|
1463+
method = DECONSTRUCT_METHODS[key] || key
1464+
values[key] = send(method)
1465+
end
1466+
end
1467+
end
1468+
14061469
# :section:
14071470

14081471
protected
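
Unlike the other classes in this commit, `Node#deconstruct_keys` only computes the keys the pattern asks for, which avoids evaluating costly values such as `inner_html` when they are not needed. The sketch below reproduces that filtering on a stand-in class (`NodeSketch`, hypothetical, not Nokogiri's `Node`) and records which accessors were called. One detail is an assumption of this sketch and not part of the commit: Ruby passes `nil` as `keys` when a pattern contains `**rest`, and `Array#&` does not accept `nil`, so the stand-in falls back to all keys in that case. It assumes Ruby 2.7 or later.

```ruby
# Stand-in for illustration only; this is NOT Nokogiri::XML::Node.
class NodeSketch
  KEYS = [:name, :content].freeze

  attr_reader :calls

  def initialize(name, content)
    @name = name
    @content = content
    @calls = [] # records which accessors were actually invoked
  end

  def name
    @calls << :name
    @name
  end

  def content
    @calls << :content
    @content
  end

  # Same filtering idea as the method in the diff above. The nil guard is
  # an assumption of this sketch and does not appear in the commit.
  def deconstruct_keys(keys)
    requested = keys.nil? ? KEYS : KEYS & keys
    requested.each_with_object({}) { |key, values| values[key] = public_send(key) }
  end
end

# Hypothetical helper: only :name is requested, so only #name is called.
def node_name(node)
  case node
  in { name: }
    name
  end
end

# Hypothetical helper: `**rest` makes Ruby pass nil as the keys argument.
def name_and_rest(node)
  case node
  in { name:, **rest }
    [name, rest]
  end
end
```

After `node_name(node)`, `node.calls` contains only `:name`, showing that `content` was never evaluated.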
