HTML

The XML DataSource can also be used to read HTML. In this mode, the HTML is parsed into well-formed XML that can be queried using XPath expressions. The HTML parser can handle pages that are not well-formed (for example, pages with missing close tags or unquoted attributes) and cleans the tree so that it is usable for data acquisition. Pages served over HTTPS are supported.
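
To illustrate the general idea of cleaning malformed HTML into a tree and then querying it, here is a minimal Python sketch using only the standard library. It is not the parser used by the XML DataSource. The names `LenientTreeBuilder` and `parse_html`, the `document` wrapper element, and the sample markup are all hypothetical, and the tag sets cover only a few common cases. Python's `ElementTree` also supports only a limited subset of XPath.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Elements that never take a closing tag in HTML (partial list, assumption).
VOID = {"br", "hr", "img", "input", "link", "meta"}
# Elements whose closing tag is often omitted; a new sibling closes the open one.
AUTO_CLOSE = {"li", "p", "tr", "td", "th", "option"}


class LenientTreeBuilder(HTMLParser):
    """Hypothetical helper that builds a well-formed tree from messy HTML."""

    def __init__(self):
        super().__init__()
        # Synthetic wrapper so the result always has exactly one root element.
        self.root = ET.Element("document")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        # Implicitly close an open sibling, e.g. an <li> left open before a new <li>.
        if tag in AUTO_CLOSE and self.stack[-1].tag == tag:
            self.stack.pop()
        # Valueless attributes (e.g. "disabled") get an empty string value.
        attrib = {name: (value or "") for name, value in attrs}
        element = ET.SubElement(self.stack[-1], tag, attrib)
        if tag not in VOID:
            self.stack.append(element)

    def handle_endtag(self, tag):
        # Close back to the nearest matching open element; ignore stray close tags.
        for index in range(len(self.stack) - 1, 0, -1):
            if self.stack[index].tag == tag:
                del self.stack[index:]
                break

    def handle_data(self, data):
        # Text goes into the current element, or after its last child if it has one.
        current = self.stack[-1]
        if len(current):
            last = current[-1]
            last.tail = (last.tail or "") + data
        else:
            current.text = (current.text or "") + data


def parse_html(text):
    """Parse an HTML string and return the root of the cleaned tree."""
    builder = LenientTreeBuilder()
    builder.feed(text)
    builder.close()
    return builder.root


# Sample input with unquoted attributes and missing </li> and </p> tags.
sample = """
<html><body>
<ul id=products>
  <li class=product>Widget<br>Blue
  <li class=product>Gadget
</ul>
<p>Unclosed paragraph
</body></html>
"""

root = parse_html(sample)
# Query the cleaned tree with an XPath-style expression.
for item in root.findall(".//li[@class='product']"):
    print(item.text.strip())
```

Once the tree is well-formed, the same kind of path expression works regardless of how untidy the original markup was, which is the property the HTML mode relies on.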

Note

HTML parsing is not available when Subtree Optimization is enabled.