Author: | David Goodger |
---|---|
Contact: | docutils-develop@lists.sourceforge.net |
Date: | 2012-01-03 |
Revision: | 7302 |
Copyright: | This document has been placed in the public domain. |
The docutils.core.Publisher class is the core of Docutils, managing all the processing and relationships between components. See PEP 258 for an overview of Docutils components.
The docutils.core.publish_* convenience functions are the normal entry points for using Docutils as a library.
See Inside A Docutils Command-Line Front-End Tool for an overview of a typical Docutils front-end tool, including how the Publisher class is used.
Each of these functions sets up a docutils.core.Publisher object, then calls its publish method; docutils.core.Publisher.publish handles everything else. The docutils.core module provides the following convenience functions:
publish_cmdline: | for command-line front-end tools, like rst2html.py. There are several examples in the tools/ directory. A detailed analysis of one such tool is in Inside A Docutils Command-Line Front-End Tool. |
---|---|
publish_file: | for programmatic use with file-like I/O. In addition to writing the encoded output to a file, also returns the encoded output as a string. |
publish_string: | for programmatic use with string I/O. Returns the encoded output as a string. |
publish_parts: | for programmatic use with string input; returns a dictionary of document parts. Dictionary keys are the names of parts, and values are Unicode strings; encoding is up to the client. Useful when only portions of the processed document are desired. See publish_parts Details below. There are usage examples in the docutils/examples.py module. |
publish_doctree: | for programmatic use with string input; returns a Docutils document tree data structure (doctree). The doctree can be modified, pickled & unpickled, etc., and then reprocessed with publish_from_doctree. |
publish_from_doctree: | for programmatic use to render from an existing document tree data structure (doctree); returns the encoded output as a string. |
publish_programmatically: | for custom programmatic use. This function implements common code and is used by publish_file, publish_string, and publish_parts. It returns a 2-tuple: the encoded string output and the Publisher object. |
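As a rough illustration of the string-based entry points listed above, the following sketch renders a small reStructuredText snippet with publish_string and then round-trips the same text through publish_doctree and publish_from_doctree. The sample source text and the variable names are made up for this example, and "html" is just one possible writer; newer Docutils releases may prefer a `writer` argument over `writer_name`.

```python
from docutils.core import publish_doctree, publish_from_doctree, publish_string

# Hypothetical sample input; any reStructuredText string would do.
source = "Hello\n=====\n\nSome *emphasized* text.\n"

# String in, encoded output out (UTF-8 by default), decoded here for inspection.
html_bytes = publish_string(source, writer_name="html")
html_text = html_bytes.decode("utf-8")

# Parse once into a doctree, then render the stored tree in a separate step.
doctree = publish_doctree(source)
html_from_tree = publish_from_doctree(doctree, writer_name="html").decode("utf-8")

print("<em>emphasized</em>" in html_text)
print("<em>emphasized</em>" in html_from_tree)
```

Splitting parsing from rendering this way is mainly useful when the doctree is inspected or modified between the two steps.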
To pass application-specific setting defaults to the Publisher convenience functions, use the settings_overrides parameter. Pass a dictionary of setting names & values, like this:
    overrides = {'input_encoding': 'ascii', 'output_encoding': 'latin-1'}
    output = publish_string(..., settings_overrides=overrides)
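A complete, runnable variant of the same idea might look like the sketch below. The sample input, the choice of UTF-8 as the input encoding, and the "html" writer are assumptions made for illustration only.

```python
from docutils.core import publish_string

# Application-specific defaults: decode the input as UTF-8,
# encode the output as Latin-1.
overrides = {"input_encoding": "utf-8", "output_encoding": "latin-1"}

# Hypothetical sample input, passed as encoded bytes.
source = "Caf\u00e9 au lait.\n".encode("utf-8")

output = publish_string(source, writer_name="html", settings_overrides=overrides)

# The result is encoded with the overridden output encoding.
text = output.decode("latin-1")
print("Caf\u00e9 au lait." in text)
```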
Settings from command-line options override configuration file settings, which in turn override application defaults. For details, see Docutils Runtime Settings. See Docutils Configuration Files for details about individual settings.
The default output encoding of Docutils is UTF-8. If your input text contains any non-ASCII characters, you may have to do a bit more setup. Docutils may itself introduce some non-ASCII text if you use auto-symbol footnotes or the "contents" directive.
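One way to keep encoding decisions in client code is to ask for unencoded output. The sketch below assumes a Docutils version that accepts the special value "unicode" for output_encoding; the sample text and variable names are hypothetical.

```python
from docutils.core import publish_string

# Hypothetical non-ASCII input, already a Unicode string.
source = "Na\u00efve caf\u00e9.\n"

# With output_encoding set to "unicode", the result is a str, not bytes.
html_text = publish_string(
    source,
    writer_name="html",
    settings_overrides={"output_encoding": "unicode"},
)

# The client picks the final encoding explicitly when writing the result out.
encoded = html_text.encode("utf-8")
print(type(html_text).__name__, len(encoded))
```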
The docutils.core.publish_parts convenience function returns a dictionary of document parts. Dictionary keys are the names of parts, and values are Unicode strings.
Each Writer component may publish a different set of document parts, described below. Not all writers implement all parts.
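To see which parts a given writer actually provides, the returned dictionary can simply be inspected. The following is a minimal sketch using the "html" writer; the sample input and the variable name `snippet` are made up for this example.

```python
from docutils.core import publish_parts

# Hypothetical sample input with a document title and one paragraph.
source = "Title\n=====\n\nA paragraph.\n"

parts = publish_parts(source, writer_name="html")

# List the part names this writer published.
for name in sorted(parts):
    print(name)

# Combine only the pieces needed, e.g. the title block plus the content,
# for embedding in a page that supplies its own <head> and <body>.
snippet = parts["body_pre_docinfo"] + parts["body"]
print(snippet)
```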
parts['body_prefix'] contains:
</head> <body> <div class="document" ...>
and, if applicable:
<div class="header"> ... </div>
parts['body_pre_docinfo'] contains (as applicable):
<h1 class="title">...</h1> <h2 class="subtitle" id="...">...</h2>
parts['body_suffix'] contains:
</div>
(the end-tag for <div class="document">), the footer division if applicable:
<div class="footer"> ... </div>
and:
</body> </html>
parts['html_head'] contains the HTML <head> content, less the stylesheet link and the <head> and </head> tags themselves. Since publish_parts returns Unicode strings and does not know about the output encoding, the "Content-Type" meta tag's "charset" value is left unresolved, as "%s":
<meta http-equiv="Content-Type" content="text/html; charset=%s" />
The interpolation should be done by client code.
parts['html_prolog'] contains the XML declaration and the doctype declaration. The XML declaration's "encoding" attribute's value is left unresolved, as "%s":
<?xml version="1.0" encoding="%s" ?>
The interpolation should be done by client code.
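The client-side interpolation can be as small as the sketch below. The helper name `resolve_charset` is hypothetical, and the two template strings are copied from the fragments shown above rather than taken from a live publish_parts call.

```python
def resolve_charset(part: str, encoding: str) -> str:
    """Fill the single unresolved "%s" placeholder with the output encoding."""
    return part % encoding


# Fragments as they appear in parts['html_head'] and parts['html_prolog'].
html_head_meta = '<meta http-equiv="Content-Type" content="text/html; charset=%s" />'
html_prolog_decl = '<?xml version="1.0" encoding="%s" ?>'

resolved_meta = resolve_charset(html_head_meta, "utf-8")
resolved_decl = resolve_charset(html_prolog_decl, "utf-8")

print(resolved_meta)
print(resolved_decl)
```

Plain "%" formatting raises an error if the string contains other literal percent signs, so client code that cannot rule those out may prefer an explicit replacement of the placeholder.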
The PEP/HTML writer provides the same parts as the HTML writer, plus the following:
The S5/HTML writer provides the same parts as the HTML writer.
See the template files for examples of how these parts can be combined into a valid LaTeX document.
parts['body'] contains the document's content. In other words, it contains the entire document, except the document title, subtitle, and docinfo.
This part can be included in another LaTeX document body using the \input{} command.
parts['docinfo'] contains the document bibliographic data, the docinfo field list rendered as a table.
With --use-latex-docinfo, the 'author', 'organization', 'contact', 'address' and 'date' info is moved to titledata.
'dedication' and 'abstract' are always moved to separate parts.
parts['titledata'] contains the combined title data in \title, \author, and \date macros.
With --use-latex-docinfo, this includes the 'author', 'organization', 'contact', 'address' and 'date' docinfo items.
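The LaTeX parts can be retrieved the same way as the HTML ones. This sketch assumes the writer is available under the name "latex" and uses a made-up sample document; how the parts are combined afterwards is up to the client.

```python
from docutils.core import publish_parts

# Hypothetical sample input with a document title and one paragraph.
source = "Title\n=====\n\nA paragraph.\n"

parts = publish_parts(source, writer_name="latex")

# The body alone, suitable for saving to a file and pulling into
# another LaTeX document with \input{}.
latex_body = parts["body"]

# The combined title macros, kept separate from the body.
latex_titledata = parts["titledata"]

print(latex_titledata)
print(latex_body)
```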