Python supports a variety of modules to work with various forms of structured data markup. This includes modules to work with the Standard Generalized Markup Language (SGML) and the Hypertext Markup Language (HTML), and several interfaces for working with the Extensible Markup Language (XML).
Note that the modules in the xml package require at least one SAX-compliant XML parser to be available. Starting with Python 2.3, the Expat parser is included with Python, so the xml.parsers.expat module will always be available. You may still want to be aware of the PyXML add-on package, which provides an extended set of XML libraries for Python.
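As a minimal sketch of what the bundled Expat parser offers, the following snippet registers a start-element callback and collects tag names in document order. The helper name collect_element_names and the sample document are illustrative assumptions, not part of the library.

```python
import xml.parsers.expat


def collect_element_names(xml_text):
    """Return element names in document order (hypothetical helper)."""
    names = []
    parser = xml.parsers.expat.ParserCreate()
    # Expat calls this handler with the tag name and a dict of attributes.
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    # The second argument marks this chunk as the final piece of input.
    parser.Parse(xml_text, True)
    return names


# Sample document invented for illustration.
sample = "<catalog><book id='1'/><book id='2'/></catalog>"
element_names = collect_element_names(sample)
print(element_names)
```

Most programs would reach for a higher-level interface such as xml.sax or xml.etree.ElementTree; direct use of Expat is mainly of interest when callback-level control or speed matters.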
The documentation for the xml.dom and xml.sax packages is the definition of the Python bindings for the DOM and SAX interfaces.
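To make the difference between the two interfaces concrete, here is a hedged sketch that extracts the same data once through a DOM tree (xml.dom.minidom) and once through SAX callbacks (xml.sax). The sample document, the TitleHandler class, and the function names are assumptions made for illustration.

```python
import xml.dom.minidom
import xml.sax

# Sample document invented for illustration.
SAMPLE = b"<catalog><book>Dive In</book><book>Learning XML</book></catalog>"


def titles_with_dom(data):
    """DOM style: build the whole tree in memory, then query it."""
    doc = xml.dom.minidom.parseString(data)
    return [node.firstChild.data for node in doc.getElementsByTagName("book")]


class TitleHandler(xml.sax.ContentHandler):
    """SAX style: react to parse events as they arrive (hypothetical handler)."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.titles = []
        self._in_book = False

    def startElement(self, name, attrs):
        if name == "book":
            self._in_book = True
            self.titles.append("")

    def characters(self, content):
        # Text may be delivered in several chunks, so accumulate it.
        if self._in_book:
            self.titles[-1] += content

    def endElement(self, name):
        if name == "book":
            self._in_book = False


def titles_with_sax(data):
    handler = TitleHandler()
    xml.sax.parseString(data, handler)
    return handler.titles


print(titles_with_dom(SAMPLE))
print(titles_with_sax(SAMPLE))
```

The DOM version is shorter but holds the entire document in memory; the SAX version keeps only the state it needs, which tends to suit large inputs better.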
HTMLParser: Simple HTML and XHTML parser
sgmllib: Simple SGML parser
htmllib: A parser for HTML documents
htmlentitydefs: Definitions of HTML general entities
xml.etree.ElementTree: The ElementTree XML API
xml.dom: The Document Object Model (DOM) API
xml.dom.minidom: Minimal DOM implementation
xml.dom.pulldom: Support for building partial DOM trees
xml.sax: Support for SAX2 parsers
xml.sax.handler: Base classes for SAX handlers
xml.sax.saxutils: SAX utilities
xml.sax.xmlreader: Interface for XML parsers
xml.parsers.expat: Fast XML parsing using Expat
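As one illustration drawn from the modules listed above, xml.etree.ElementTree is often the most compact way to read a small document. The sketch below assumes an invented sample catalog and maps each book's id attribute to its text.

```python
import xml.etree.ElementTree as ET

# Sample document invented for illustration.
SAMPLE = (
    "<catalog>"
    "<book id='1'>Dive In</book>"
    "<book id='2'>Learning XML</book>"
    "</catalog>"
)

root = ET.fromstring(SAMPLE)

# findall("book") matches direct children of the root named "book".
books = dict((book.get("id"), book.text) for book in root.findall("book"))

for book_id in sorted(books):
    print(book_id, books[book_id])
```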