2012-03-01 13:53:12 +00:00
|
|
|
|
<!doctype html>
|
|
|
|
|
<meta charset=utf-8>
|
|
|
|
|
<title>HTML Tidy for HTML5 (experimental)</title>
|
|
|
|
|
<style type="text/css">
|
|
|
|
|
html {
|
|
|
|
|
background: #DDE5D9 url() repeat 0 0;
|
|
|
|
|
font-family: "Lucida Sans Unicode", "Lucida Sans", verdana, arial, helvetica;
|
|
|
|
|
}
|
|
|
|
|
body {
|
|
|
|
|
border: solid 1px #CED4CA;
|
|
|
|
|
background-color: #FFF;
|
|
|
|
|
padding: 4px 40px 40px 40px;
|
|
|
|
|
margin: 20px 20px 20px 20px;
|
|
|
|
|
padding-right: 20%;
|
|
|
|
|
}
|
|
|
|
|
h1, h2 {
|
|
|
|
|
color: #0B5B9D;
|
|
|
|
|
}
|
|
|
|
|
h1 {
|
|
|
|
|
font-size: 39px;
|
|
|
|
|
font-weight: normal;
|
|
|
|
|
vertical-align: top;
|
|
|
|
|
margin-bottom: 0px;
|
|
|
|
|
}
|
|
|
|
|
a {
|
|
|
|
|
text-decoration: none;
|
|
|
|
|
color: #0B5B9D;
|
|
|
|
|
padding: 2px;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
a:hover {
|
|
|
|
|
text-decoration: none;
|
|
|
|
|
background-color: #0B5B9D;
|
|
|
|
|
color: white;
|
|
|
|
|
}
|
|
|
|
|
a:active {
|
|
|
|
|
text-decoration: none;
|
|
|
|
|
background-color: white;
|
|
|
|
|
color: black;
|
|
|
|
|
}
|
|
|
|
|
#toc {
|
|
|
|
|
position: fixed;
|
|
|
|
|
top: 10px;
|
|
|
|
|
right: 10px;
|
|
|
|
|
border: 2px solid #0B5B9D;
|
|
|
|
|
background: rgba(255,255,255,.9);
|
|
|
|
|
padding: 15px;
|
|
|
|
|
z-index: 999;
|
|
|
|
|
max-height: 400px;
|
|
|
|
|
overflow: auto;
|
|
|
|
|
font-size: 11px;
|
|
|
|
|
font-family: Verdana, sans-serif;
|
|
|
|
|
}
|
|
|
|
|
#toc-button {
|
2012-03-17 07:48:07 +00:00
|
|
|
|
position: fixed;
|
|
|
|
|
top: 10px;
|
|
|
|
|
right: 10px;
|
|
|
|
|
background: transparent;
|
|
|
|
|
padding: 15px;
|
|
|
|
|
z-index: 999;
|
|
|
|
|
max-height: 400px;
|
|
|
|
|
overflow: auto;
|
|
|
|
|
font-size: 11px;
|
|
|
|
|
font-family: Verdana, sans-serif;
|
2012-03-01 13:53:12 +00:00
|
|
|
|
}
|
|
|
|
|
#toc .button,
|
2012-03-17 07:48:07 +00:00
|
|
|
|
#toc-button .button,
|
|
|
|
|
#quickref-button .button {
|
2012-03-01 13:53:12 +00:00
|
|
|
|
float: right;
|
2012-03-17 07:48:07 +00:00
|
|
|
|
width: 59px;
|
|
|
|
|
text-align: center;
|
2012-03-01 13:53:12 +00:00
|
|
|
|
margin: 0 0 5px 5px;
|
|
|
|
|
padding: 5px;
|
|
|
|
|
border: 1px #008 solid;
|
|
|
|
|
color:#00f;
|
|
|
|
|
background-color:#ccf;
|
|
|
|
|
}
|
|
|
|
|
#toc ol {
|
|
|
|
|
margin: 0;
|
|
|
|
|
padding: 0;
|
|
|
|
|
font-size: 11px;
|
|
|
|
|
font-family: Verdana, sans-serif;
|
|
|
|
|
}
|
|
|
|
|
#toc li {
|
|
|
|
|
list-style: decimal outside;
|
|
|
|
|
margin-left: 20px;
|
|
|
|
|
font-size: 11px;
|
|
|
|
|
font-family: Verdana, sans-serif;
|
|
|
|
|
}
|
|
|
|
|
#toc li a {
|
|
|
|
|
font-size: 11px;
|
|
|
|
|
font-family: Verdana, sans-serif;
|
|
|
|
|
}
|
|
|
|
|
.hide {
|
|
|
|
|
display: none;
|
|
|
|
|
}
|
|
|
|
|
.show {
|
|
|
|
|
display: block;
|
|
|
|
|
}
|
2012-03-17 07:48:07 +00:00
|
|
|
|
#quickref-button {
|
|
|
|
|
position: fixed;
|
|
|
|
|
top: 40px;
|
|
|
|
|
right: 10px;
|
|
|
|
|
background: transparent;
|
|
|
|
|
padding: 15px;
|
|
|
|
|
z-index: 999;
|
|
|
|
|
max-height: 400px;
|
|
|
|
|
overflow: auto;
|
|
|
|
|
font-size: 11px;
|
|
|
|
|
font-family: Verdana, sans-serif;
|
|
|
|
|
}
|
2012-03-01 13:53:12 +00:00
|
|
|
|
code { color: green; font-weight: bold; }
|
|
|
|
|
pre { color: green; font-weight: bold; font-family: monospace}
|
|
|
|
|
em { font-style: italic; color: rgb(0, 0, 153) }
|
|
|
|
|
:link { color: rgb(0, 0, 153) }
|
|
|
|
|
:visited { color: rgb(153, 0, 153) }
|
|
|
|
|
</style>
|
|
|
|
|
|
|
|
|
|
<h1 id=intro>HTML Tidy for HTML5 (experimental)</h1>
|
|
|
|
|
<p>This page documents the experimental HTML5 fork of HTML Tidy available
|
|
|
|
|
at
|
|
|
|
|
<a href="https://github.com/w3c/tidy-html5">https://github.com/w3c/tidy-html5</a>.
|
|
|
|
|
|
|
|
|
|
<p>File bug reports and enhancement requests at
|
|
|
|
|
<a href="https://github.com/w3c/tidy-html5/issues">https://github.com/w3c/tidy-html5/issues</a>.</p>
|
|
|
|
|
|
|
|
|
|
<p>The W3C public mailing list for HTML Tidy discussion is
|
|
|
|
|
<b>html-tidy@w3.org</b> (<a href= "http://lists.w3.org/Archives/Public/html-tidy/">list archive</a>).
|
|
|
|
|
|
|
|
|
|
<p>For more information on HTML5:</p>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>
|
|
|
|
|
<a href="http://dev.w3.org/html5/spec-author-view">HTML: Edition for Web Authors</a> (the latest HTML specification)
|
|
|
|
|
<li>
|
|
|
|
|
<a href="http://dev.w3.org/html5/markup/">HTML: The Markup Language</a> (an HTML language reference)
|
|
|
|
|
</ul>
|
|
|
|
|
<p>
|
|
|
|
|
Validate your HTML documents using the
|
|
|
|
|
<a href="http://validator.w3.org/nu/">W3C Nu Markup Validator</a>.
|
|
|
|
|
|
|
|
|
|
<h2 id=what-tidy-does>What Tidy does</h2>
|
|
|
|
|
<p>Tidy corrects and cleans up HTML content by fixing markup errors.
|
|
|
|
|
Here are a few examples:
|
|
|
|
|
<ul>
|
|
|
|
|
<li><b>Mismatched end tags:</b>
|
|
|
|
|
<pre>
|
|
|
|
|
<h2>subheading</h3>
|
|
|
|
|
</pre>
|
|
|
|
|
<p>…is converted to:</p>
|
|
|
|
|
<pre>
|
|
|
|
|
<h2>subheading</h2>
|
|
|
|
|
</pre></li>
|
|
|
|
|
<li><b>Misnested tags:</b>
|
|
|
|
|
<pre>
|
|
|
|
|
<p>here is a para <b>bold <i>bold italic</b> bold?</i> normal?
|
|
|
|
|
</pre>
|
|
|
|
|
<p>…is converted to:</p>
|
|
|
|
|
<pre>
|
|
|
|
|
<p>here is a para <b>bold <i>bold italic</i> bold?</b> normal?
|
|
|
|
|
</pre></li>
|
|
|
|
|
<li><b>Missing end tags:</b>
|
|
|
|
|
<pre>
|
|
|
|
|
<h1>heading
|
|
|
|
|
<h2>subheading</h2>
|
|
|
|
|
</pre>
|
|
|
|
|
<p>…is converted to:</p>
|
|
|
|
|
<pre>
|
|
|
|
|
<h1>heading</h1>
|
|
|
|
|
<h2>subheading</h2>
|
|
|
|
|
</pre>
|
|
|
|
|
…and
|
|
|
|
|
<pre>
|
|
|
|
|
<h1><i>italic heading</h1>
|
|
|
|
|
</pre>
|
|
|
|
|
<p>…is converted to:</p>
|
|
|
|
|
<pre>
|
|
|
|
|
<h1><i>italic heading</i></h1>
|
|
|
|
|
</pre></li>
|
|
|
|
|
<li><b>Mixed-up tags</b>
|
|
|
|
|
<pre>
|
|
|
|
|
<i><h1>heading</h1></i>
|
|
|
|
|
<p>new paragraph <b>bold text
|
|
|
|
|
<p>some more bold text
|
|
|
|
|
</pre>
|
|
|
|
|
<p>…is converted to:</p>
|
|
|
|
|
<pre>
|
|
|
|
|
<h1><i>heading</i></h1>
|
|
|
|
|
<p>new paragraph <b>bold text</b>
|
|
|
|
|
<p><b>some more bold text</b>
|
|
|
|
|
</pre></li>
|
|
|
|
|
<li><b>Tag in the wrong place:</b>
|
|
|
|
|
<pre>
|
|
|
|
|
<h1><hr>heading</h1>
|
|
|
|
|
<h2>sub<hr>heading</h2>
|
|
|
|
|
</pre>
|
|
|
|
|
<p>…is converted to:</p>
|
|
|
|
|
<pre>
|
|
|
|
|
<hr>
|
|
|
|
|
<h1>heading</h1>
|
|
|
|
|
<h2>sub</h2>
|
|
|
|
|
<hr>
|
|
|
|
|
<h2>heading</h2>
|
|
|
|
|
</pre></li>
|
|
|
|
|
<li><b>Missing "/" in end tags:</b>
|
|
|
|
|
<pre>
|
|
|
|
|
<a href="#refs">References<a>
|
|
|
|
|
</pre>
|
|
|
|
|
<p>…is converted to:</p>
|
|
|
|
|
<pre>
|
|
|
|
|
<a href="#refs">References</a>
|
|
|
|
|
</pre></li>
|
|
|
|
|
<li><b>List markup with missing tags:</b>
|
|
|
|
|
<pre>
|
|
|
|
|
<body>
|
|
|
|
|
<li>1st list item
|
|
|
|
|
<li>2nd list item
|
|
|
|
|
</pre>
|
|
|
|
|
<p>…is converted to:</p>
|
|
|
|
|
<pre>
|
|
|
|
|
<body>
|
|
|
|
|
<ul>
|
|
|
|
|
<li>1st list item</li>
|
|
|
|
|
<li>2nd list item</li>
|
|
|
|
|
</ul>
|
|
|
|
|
</pre></li>
|
|
|
|
|
<li><b>Missing quotation marks around attribute values</b>
|
|
|
|
|
<p>Tidy inserts quotation marks around all attribute values for you. It
|
|
|
|
|
can also detect when you have forgotten the closing quotation mark,
|
|
|
|
|
although this is something you will have to fix yourself.</p>
|
|
|
|
|
</li>
|
|
|
|
|
<li><b>Unknown/proprietary attributes</b>
|
|
|
|
|
<p>Tidy has a comprehensive knowledge of the attributes defined in HTML5.
|
|
|
|
|
That often allows you to spot where you have mis-typed an attribute.
|
|
|
|
|
</li>
|
|
|
|
|
<li><b>Tags lacking a terminating ">"</b>
|
|
|
|
|
<p>This is something you then have to fix yourself as Tidy cannot
|
|
|
|
|
determine where the ">" was meant to be inserted.</p>
|
|
|
|
|
</li>
|
|
|
|
|
</ul>
|
|
|
|
|
|
|
|
|
|
<h2 id="help">How to run Tidy from the command line</h2>
|
|
|
|
|
<p>This is the syntax for invoking Tidy from the command line:
|
|
|
|
|
<pre>
|
|
|
|
|
<code>tidy <em>[[options] filename]*</em></code>
|
|
|
|
|
</pre>
|
|
|
|
|
<p>
|
|
|
|
|
Tidy defaults to reading from standard input, so if you run Tidy without
|
|
|
|
|
specifying the <code><em>filename</em></code> argument, it will just sit
|
|
|
|
|
there waiting for input to read.
|
|
|
|
|
And Tidy defaults to writing to standard output. So you can pipe output
|
|
|
|
|
from Tidy to other programs, as well as pipe output from other programs to
|
|
|
|
|
Tidy. You can page through the output from Tidy by piping it to a pager:</p>
|
|
|
|
|
<pre>
|
|
|
|
|
tidy file.html | less
|
|
|
|
|
</pre>
|
|
|
|
|
<p>
|
|
|
|
|
To have Tidy write its output to a file instead, either use the
|
|
|
|
|
<code>-o <em>filename</em></code> or <code>-output <em>filename</em></code>
|
|
|
|
|
option, or redirect standard output to the file; for example:
|
|
|
|
|
<pre>
|
|
|
|
|
tidy -o output.html index.html
|
|
|
|
|
tidy index.html > output.html
|
|
|
|
|
</pre>
|
|
|
|
|
<p>Both of those run tidy on the file <b>index.html</b> and write the
|
|
|
|
|
output to the file <b>output.html</b>, while writing any error messages to
|
|
|
|
|
standard error.
|
|
|
|
|
<p>
|
|
|
|
|
Tidy defaults to writing its error messages to standard error (that is, to
|
|
|
|
|
the console where you’re running Tidy). To page through the error messages,
|
|
|
|
|
along with the output, redirect standard error to standard output, and pipe
|
|
|
|
|
it to your pager:
|
|
|
|
|
<pre>
|
|
|
|
|
tidy index.html 2>&1 | less
|
|
|
|
|
</pre>
|
|
|
|
|
<p>
|
|
|
|
|
To have Tidy write the errors to a file instead, either use the
|
|
|
|
|
<code>-f <em>filename</em></code> or <code>-file <em>filename</em></code>
|
|
|
|
|
option, or redirect standard error to a file:</p>
|
|
|
|
|
<pre>
|
|
|
|
|
tidy -o output.html -f errs.txt index.html
|
|
|
|
|
tidy index.html > output.html 2> errs.txt
|
|
|
|
|
</pre>
|
|
|
|
|
<p>Both of those run tidy on the file <b>index.html</b> and write the
|
|
|
|
|
output to the file <b>output.html</b>, while writing any error messages to
|
|
|
|
|
the file <b>errs.txt</b>.
|
|
|
|
|
<p>
|
|
|
|
|
Writing the error messages to a file is especially useful if the file you
|
|
|
|
|
are checking has many errors; reading them from a file instead of the
|
|
|
|
|
console or pager can make it easier to review them.
|
|
|
|
|
<p>You can use the or <code>-m</code> or <code>-modify</code> option to
|
|
|
|
|
modify (in-place) the contents of the input file you are checking; that is,
|
|
|
|
|
to overwrite those contents with the output from Tidy. Example:
|
|
|
|
|
<pre>
|
|
|
|
|
tidy -f errs.txt -m index.html
|
|
|
|
|
</pre>
|
|
|
|
|
<p>That runs tidy on the file <b>index.html</b>, modifying it in place
|
|
|
|
|
and writing the error messages to the file <b>errs.txt</b>.
|
|
|
|
|
<p>
|
|
|
|
|
<b>Caution:</b> If you use the -m option, you should first save a copy of your file.
|
|
|
|
|
<h2 id=options>Options and configuration settings</h2>
|
|
|
|
|
<p>To get a list of available options, use:</p>
|
|
|
|
|
<pre>
|
|
|
|
|
tidy -help
|
|
|
|
|
</pre>
|
|
|
|
|
<p>To get a list of all configuration settings, use:</p>
|
|
|
|
|
<pre>
|
|
|
|
|
tidy -help-config
|
|
|
|
|
</pre>
|
|
|
|
|
<p>To read the help output a page at time, pipe it to a pager:
|
|
|
|
|
<pre>
|
|
|
|
|
tidy -help | less
|
|
|
|
|
tidy -help-config | less
|
|
|
|
|
</pre>
|
|
|
|
|
<p>Single-letter options other than -f may be combined; for example:
|
|
|
|
|
<pre>
|
|
|
|
|
tidy -f errs.txt -imu foo.html
|
|
|
|
|
</pre>
|
|
|
|
|
|
|
|
|
|
<h2 id="config">Using a config file</h2>
|
|
|
|
|
<p>The most convenient way to configure Tidy is by using separate
|
|
|
|
|
config file.
|
|
|
|
|
Assuming you have created a
|
|
|
|
|
Tidy config file named <b>config.txt</b> (the name doesn't matter), you can
|
|
|
|
|
instruct Tidy to use it via the command line option
|
|
|
|
|
<code>-config config.txt</code>; for example:
|
|
|
|
|
<pre>
|
|
|
|
|
tidy -config config.txt file1.html file2.html
|
|
|
|
|
</pre>
|
|
|
|
|
<p>Alternatively, you can name the default config file via the
|
|
|
|
|
environment variable named <b>HTML_TIDY</b>, the value of which is
|
|
|
|
|
the absolute path for the config file.
|
|
|
|
|
<p>You can also set config options on the command line by preceding
|
|
|
|
|
the name of the option immediately (no intervening space) with the string "<code>--</code>";
|
|
|
|
|
for example:</p>
|
|
|
|
|
<pre>
|
|
|
|
|
tidy --break-before-br true --show-warnings false
|
|
|
|
|
</pre>
|
|
|
|
|
<p>You can find documentation for full set of configuration options
|
|
|
|
|
on the
|
|
|
|
|
<a href= "quickref.html">Quick Reference</a>
|
|
|
|
|
page.
|
|
|
|
|
|
|
|
|
|
<h2 id=sample-config>Sample config file</h2>
|
|
|
|
|
<p>The following is an example of a Tidy config file.</p>
|
|
|
|
|
<pre>
|
|
|
|
|
// sample config file for HTML tidy
|
|
|
|
|
indent: auto
|
|
|
|
|
indent-spaces: 2
|
|
|
|
|
wrap: 72
|
|
|
|
|
markup: yes
|
|
|
|
|
output-xml: no
|
|
|
|
|
input-xml: no
|
|
|
|
|
show-warnings: yes
|
|
|
|
|
numeric-entities: yes
|
|
|
|
|
quote-marks: yes
|
|
|
|
|
quote-nbsp: yes
|
|
|
|
|
quote-ampersand: no
|
|
|
|
|
break-before-br: no
|
|
|
|
|
uppercase-tags: no
|
|
|
|
|
uppercase-attributes: no
|
|
|
|
|
char-encoding: latin1
|
|
|
|
|
new-inline-tags: cfif, cfelse, math, mroot,
|
|
|
|
|
mrow, mi, mn, mo, msqrt, mfrac, msubsup, munderover,
|
|
|
|
|
munder, mover, mmultiscripts, msup, msub, mtext,
|
|
|
|
|
mprescripts, mtable, mtr, mtd, mth
|
|
|
|
|
new-blocklevel-tags: cfoutput, cfquery
|
|
|
|
|
new-empty-tags: cfelse
|
|
|
|
|
</pre>
|
|
|
|
|
|
2012-03-17 08:24:01 +00:00
|
|
|
|
<h2 id="new-config">New configuration options</h2>
|
|
|
|
|
<p>The experimental HTML5-aware fork of Tidy adds the following new
|
|
|
|
|
configuration options:
|
|
|
|
|
<ul>
|
|
|
|
|
<li><a href="http://w3c.github.com/tidy-html5/quickref.html#coerce-endtags">coerce-endtags</a>
|
|
|
|
|
<li><a href="http://w3c.github.com/tidy-html5/quickref.html#drop-empty-elements">drop-empty-elements</a>
|
|
|
|
|
<li><a href="http://w3c.github.com/tidy-html5/quickref.html#merge-emphasis">merge-emphasis</a>
|
2012-03-24 10:18:32 +00:00
|
|
|
|
<li><a href="http://w3c.github.com/tidy-html5/quickref.html#omit-optional-tags">omit-optional-tags</a>
|
2012-03-17 08:24:01 +00:00
|
|
|
|
</ul>
|
|
|
|
|
<p>In addition, it also adds a new <code>html5</code> value for the
|
|
|
|
|
<a href="http://w3c.github.com/tidy-html5/quickref.html#doctype">doctype</a>
|
|
|
|
|
configuration option.
|
|
|
|
|
|
2012-03-01 13:53:12 +00:00
|
|
|
|
<h2 id=indenting>Indenting output for readability</h2>
|
|
|
|
|
<p>Indenting the source markup of an HTML document makes the markup easier
|
|
|
|
|
to read. Tidy can indent the markup for an HTML document while recognizing
|
|
|
|
|
elements whose contents should not be indented. In the example below, Tidy
|
|
|
|
|
indents the output while preserving the formatting of the <pre>
|
|
|
|
|
element:</p>
|
|
|
|
|
<p>Input:</p>
|
|
|
|
|
<pre>
|
|
|
|
|
<html>
|
|
|
|
|
<head>
|
|
|
|
|
<title>Test document</title>
|
|
|
|
|
</head>
|
|
|
|
|
<body>
|
|
|
|
|
<p>This example shows how Tidy can indent output while preserving
|
|
|
|
|
formatting of particular elements.</p>
|
|
|
|
|
|
|
|
|
|
<pre>This is
|
|
|
|
|
<em>genuine
|
|
|
|
|
preformatted</em>
|
|
|
|
|
text
|
|
|
|
|
</pre>
|
|
|
|
|
</body>
|
|
|
|
|
</html>
|
|
|
|
|
|
|
|
|
|
</pre>
|
|
|
|
|
<p>Output:</p>
|
|
|
|
|
<pre>
|
|
|
|
|
<html>
|
|
|
|
|
<head>
|
|
|
|
|
<title>Test document</title>
|
|
|
|
|
</head>
|
|
|
|
|
|
|
|
|
|
<body>
|
|
|
|
|
<p>This example shows how Tidy can indent output while preserving
|
|
|
|
|
formatting of particular elements.</p>
|
|
|
|
|
<pre>
|
|
|
|
|
This is
|
|
|
|
|
<em>genuine
|
|
|
|
|
preformatted</em>
|
|
|
|
|
text
|
|
|
|
|
</pre>
|
|
|
|
|
</body>
|
|
|
|
|
</html>
|
|
|
|
|
</pre>
|
|
|
|
|
<p>Tidy’s indenting behavior is not perfect and can sometimes cause your
|
|
|
|
|
output to be rendered by browsers in a different way than the input.
|
|
|
|
|
You can avoid unexpected indenting-related rendering problems by setting
|
|
|
|
|
<code>indent: no</code> or <code>indent: auto</code> in a config file.</p>
|
|
|
|
|
|
|
|
|
|
<h2 id=preserve-indenting>Preserving original indenting not possible</h2>
|
|
|
|
|
<p>Tidy is not capable of preserving the original indenting of the markup
|
|
|
|
|
from the input it receives. That’s because Tidy starts by building a clean
|
|
|
|
|
parse tree from the input, and that parse tree doesn’t contain any
|
|
|
|
|
information about the original indenting. Tidy then pretty-prints the parse
|
|
|
|
|
tree using the current config settings. Trying to preserve the original
|
|
|
|
|
indenting from the input would interact badly with the repair operations
|
|
|
|
|
needed to build a clean parse tree, and would considerably complicate the
|
|
|
|
|
code.</p>
|
|
|
|
|
|
|
|
|
|
<h2 id=encodings>Encodings and character references</h2>
|
|
|
|
|
<p>
|
|
|
|
|
Tidy defaults to assuming you want output to be encoded in UTF-8.
|
|
|
|
|
But Tidy offers you a choice of other character encodings: US ASCII, ISO
|
|
|
|
|
Latin-1, and the ISO 2022 family of 7 bit encodings.
|
|
|
|
|
<p>
|
|
|
|
|
Tidy doesn't yet recognize the use of the HTML <meta> element for
|
|
|
|
|
specifying the character encoding.</p>
|
|
|
|
|
<p>
|
|
|
|
|
The full set of HTML character references are defined. Cleaned-up output
|
|
|
|
|
uses named character references for characters when appropriate. Otherwise,
|
|
|
|
|
characters outside the normal range are output as numeric character
|
|
|
|
|
references.
|
|
|
|
|
|
|
|
|
|
<h2 id=accessibility>Accessibility</h2>
|
|
|
|
|
<p>Tidy offers advice on potential accessibility problems for people using
|
|
|
|
|
non-graphical browsers.
|
|
|
|
|
|
|
|
|
|
<h2 id=presentational-markup>Cleaning up presentational markup</h2>
|
|
|
|
|
<p>Some tools generate HTML with presentational elements such as <font>,
|
|
|
|
|
<nobr>, and <center>.
|
|
|
|
|
Tidy's <code>-clean</code> option will replace those elements with CSS style
|
|
|
|
|
properties.
|
|
|
|
|
<p>Some HTML documents rely on the presentational effects of <p> start
|
|
|
|
|
tags that are not followed by any content. Tidy deletes such <p> tags
|
|
|
|
|
(as well as any headings that don’t have content). So do not use <p>
|
|
|
|
|
tags simply for adding vertical whitespace; instead use CSS, or the
|
|
|
|
|
<br> element. However, note that Tidy won’t discard <p> tags that
|
|
|
|
|
are followed by any nonbreaking space (that is, the &nbsp; named
|
|
|
|
|
character reference).
|
|
|
|
|
|
|
|
|
|
<h2 id=new-tags>Teaching Tidy about new tags</h2>
|
|
|
|
|
<p>You can teach Tidy about new tags by declaring them in the
|
|
|
|
|
configuration file, the syntax is:</p>
|
|
|
|
|
<pre>
|
|
|
|
|
new-inline-tags: <em>tag1, tag2, tag3</em>
|
|
|
|
|
new-empty-tags: <em>tag1, tag2, tag3</em>
|
|
|
|
|
new-blocklevel-tags: <em>tag1, tag2, tag3</em>
|
|
|
|
|
new-pre-tags: <em>tag1, tag2, tag3</em>
|
|
|
|
|
</pre>
|
|
|
|
|
<p>The same tag can be defined as empty and as inline or as empty
|
|
|
|
|
and as block.</p>
|
|
|
|
|
<p>These declarations can be combined to define a new empty
|
|
|
|
|
inline or empty block element. But you are not advised to declare
|
|
|
|
|
tags as being both inline and block.</p>
|
|
|
|
|
<p>Note that the new tags can only appear where Tidy expects inline
|
|
|
|
|
or block-level tags respectively. That means you can’t place
|
|
|
|
|
new tags within the document head or other contexts with restricted
|
|
|
|
|
content models.
|
|
|
|
|
|
|
|
|
|
<h2 id=php-asp-jste>Ignoring PHP, ASP, and JSTE instructions</h2>
|
|
|
|
|
<p>Tidy will gracefully ignore many cases of PHP, ASP, and JSTE
|
|
|
|
|
instructions within element content and as replacements for attributes,
|
|
|
|
|
and preserve them as-is in output; for example:</p>
|
|
|
|
|
<pre>
|
|
|
|
|
<option <% if rsSchool.Fields("ID").Value
|
|
|
|
|
= session("sessSchoolID")
|
|
|
|
|
then Response.Write("selected") %>
|
|
|
|
|
value='<%=rsSchool.Fields("ID").Value%>'>
|
|
|
|
|
<%=rsSchool.Fields("Name").Value%>
|
|
|
|
|
(<%=rsSchool.Fields("ID").Value%>)
|
|
|
|
|
</option>
|
|
|
|
|
</pre>
|
|
|
|
|
<p>But note that Tidy may report missing attributes when those are “hidden”
|
|
|
|
|
within the PHP, ASP, or JSTE code. If you use PHP, ASP, or JSTE code to
|
|
|
|
|
create a start tag, but place the end tag explicitly in the HTML markup, Tidy
|
|
|
|
|
won’t be able to match them up, and will delete the end tag. So in that
|
|
|
|
|
case you are advised to make the start tag explicit and to use PHP, ASP, or
|
|
|
|
|
JSTE code for just the attributes; for example:</p>
|
|
|
|
|
<pre>
|
|
|
|
|
<a href="<%=random.site()%>">do you feel lucky?</a>
|
|
|
|
|
</pre>
|
|
|
|
|
<p>
|
|
|
|
|
Tidy can also get things wrong if the PHP, ASP, or JSTE code includes
|
|
|
|
|
quotation marks; for example:
|
|
|
|
|
</p>
|
|
|
|
|
<pre>
|
|
|
|
|
value="<%=rsSchool.Fields("ID").Value%>"
|
|
|
|
|
</pre>
|
|
|
|
|
<p>Tidy will see the quotation mark preceding <i>ID</i> as ending the
|
|
|
|
|
attribute value, and proceed to complain about what follows.
|
|
|
|
|
<p>Tidy allows you to control whether line wrapping on spaces within
|
|
|
|
|
PHP, ASP, and JSTE
|
|
|
|
|
instructions is enabled; see the <b>wrap-php</b>, <b>wrap-asp</b>,
|
|
|
|
|
and <b>wrap-jste</b> config options.</p>
|
|
|
|
|
|
|
|
|
|
<h2 id=xml>Correcting well-formedness errors in XML markup</h2>
|
|
|
|
|
<p>Tidy can help you to correct well-formedness errors in XML markup. Tidy
|
|
|
|
|
doesn't yet recognize all XML features, though; for example, it doesn't
|
|
|
|
|
understand CDATA sections or DTD subsets.</p>
|
|
|
|
|
|
|
|
|
|
<h2 id="scripts">Using Tidy from scripts</h2>
|
|
|
|
|
<p>If you want to run Tidy from a Perl or other scripting language
|
|
|
|
|
you may find it of value to inspect the result returned by Tidy
|
|
|
|
|
when it exits: 0 if everything is fine, 1 if there were warnings
|
|
|
|
|
and 2 if there were errors. This is an example using Perl:</p>
|
|
|
|
|
<pre>
|
|
|
|
|
if (close(TIDY) == 0) {
|
|
|
|
|
my $exitcode = $? >> 8;
|
|
|
|
|
if ($exitcode == 1) {
|
|
|
|
|
printf STDERR "tidy issued warning messages\n";
|
|
|
|
|
} elsif ($exitcode == 2) {
|
|
|
|
|
printf STDERR "tidy issued error messages\n";
|
|
|
|
|
} else {
|
|
|
|
|
die "tidy exited with code: $exitcode\n";
|
|
|
|
|
}
|
|
|
|
|
} else {
|
|
|
|
|
printf STDERR "tidy detected no errors\n";
|
|
|
|
|
}
|
|
|
|
|
</pre>
|
|
|
|
|
|
|
|
|
|
<h2 id="implementation">Source code</h2>
|
|
|
|
|
<p>The source code for the experimental HTML5 fork of Tidy can be found at
|
|
|
|
|
<a href="https://github.com/w3c/tidy-html5">https://github.com/w3c/tidy-html5</a>.
|
|
|
|
|
|
2012-03-27 16:16:42 +00:00
|
|
|
|
<h2 id=build-cli>Building the tidy command-line tool</h2>
|
2012-03-27 16:07:27 +00:00
|
|
|
|
<p>For Linux/BSD/OSX platforms, you can build and install the
|
|
|
|
|
<code>tidy</code> command-line tool from the source code using the
|
|
|
|
|
following steps.</p>
|
|
|
|
|
|
|
|
|
|
<ol>
|
|
|
|
|
<li><code>make -C build/gmake/</code></li>
|
|
|
|
|
<li><code>make install -C build/gmake/</code></li>
|
|
|
|
|
</ol>
|
|
|
|
|
|
|
|
|
|
<p>Note that you will either need to run <code>make install</code> as root,
|
|
|
|
|
or with <code>sudo make install</code>.</p>
|
|
|
|
|
|
2012-03-27 16:16:42 +00:00
|
|
|
|
<h2 id=build-library>Building the libtidy shared library</h2>
|
2012-03-27 16:07:27 +00:00
|
|
|
|
<p>For Linux/BSD/OSX platforms, you can build and install the
|
|
|
|
|
<code>tidylib</code> shared library (for use in building other
|
|
|
|
|
applications) from the source code using the following steps.</p>
|
|
|
|
|
|
|
|
|
|
<ol>
|
2012-03-27 16:16:42 +00:00
|
|
|
|
<li><code>sh build/gnuauto/setup.sh && ./configure && make</code></li>
|
|
|
|
|
<li><code>make install</code></li>
|
2012-03-27 16:07:27 +00:00
|
|
|
|
</ol>
|
|
|
|
|
|
|
|
|
|
<p>Note that you will either need to run <code>make install</code> as root,
|
|
|
|
|
or with <code>sudo make install</code>.</p>
|
|
|
|
|
|
2012-03-01 13:53:12 +00:00
|
|
|
|
<h2 id=acks>Acknowledgements</h2>
|
|
|
|
|
<p>Dave Raggett has a list of
|
|
|
|
|
<a href="http://www.w3.org/People/Raggett/tidy/#acks">Acknowledgements</a>
|
|
|
|
|
for people who made suggestions or reported bugs for the
|
|
|
|
|
original version of Tidy.
|
|
|
|
|
|
|
|
|
|
<div id=toc-button style="">
|
2012-03-27 16:16:42 +00:00
|
|
|
|
<a class=button href="#" onclick="
|
2012-03-01 13:53:12 +00:00
|
|
|
|
javascript:document.getElementById('toc').className = 'show';
|
2012-03-17 07:48:07 +00:00
|
|
|
|
document.getElementById('toc-button').className = 'hide';
|
|
|
|
|
document.getElementById('quickref-button').className = 'hide';"
|
|
|
|
|
>Show TOC</a>
|
2012-03-01 13:53:12 +00:00
|
|
|
|
</div>
|
|
|
|
|
<div id=toc class=hide>
|
2012-03-27 16:16:42 +00:00
|
|
|
|
<a class=button href="#" onclick="
|
2012-03-01 13:53:12 +00:00
|
|
|
|
javascript:document.getElementById('toc').className = 'hide';
|
2012-03-17 07:48:07 +00:00
|
|
|
|
document.getElementById('toc-button').className = 'show';
|
|
|
|
|
document.getElementById('quickref-button').className = 'show';"
|
|
|
|
|
>Close</a>
|
2012-03-01 13:53:12 +00:00
|
|
|
|
<ol>
|
|
|
|
|
<li><a href="#what-tidy-does">What Tidy does</a>
|
|
|
|
|
<li><a href="#help">How to run Tidy from the command line</a>
|
|
|
|
|
<li><a href="#options">Options and configuration settings</a>
|
|
|
|
|
<li><a href="#config">Using a config file</a>
|
|
|
|
|
<li><a href="#sample-config">Sample config file</a>
|
2012-03-17 08:24:01 +00:00
|
|
|
|
<li><a href="#new-config">New configuration options</a>
|
2012-03-01 13:53:12 +00:00
|
|
|
|
<li><a href="#indenting">Indenting output for readability</a>
|
|
|
|
|
<li><a href="#preserve-indenting">Preserving original indenting not possible</a>
|
|
|
|
|
<li><a href="#encodings">Encodings and character references</a>
|
|
|
|
|
<li><a href="#accessibility">Accessibility</a>
|
|
|
|
|
<li><a href="#presentational-markup">Cleaning up presentational markup</a>
|
|
|
|
|
<li><a href="#new-tags">Teaching Tidy about new tags</a>
|
|
|
|
|
<li><a href="#php-asp-jste">Ignoring PHP, ASP, and JSTE instructions</a>
|
|
|
|
|
<li><a href="#xml">Correcting well-formedness errors in XML markup</a>
|
|
|
|
|
<li><a href="#scripts">Using Tidy from scripts</a>
|
|
|
|
|
<li><a href="#implementation">Source code</a>
|
2012-03-27 16:07:27 +00:00
|
|
|
|
<li><a href="#build-cli">Building the tidy command-line tool</a>
|
|
|
|
|
<li><a href="#build-library">Building the tidylib shared library</a>
|
2012-03-01 13:53:12 +00:00
|
|
|
|
<li><a href="#acks">Acknowledgements</a>
|
|
|
|
|
</ol>
|
|
|
|
|
</div>
|
2012-03-17 07:48:07 +00:00
|
|
|
|
|
|
|
|
|
<div id=quickref-button style="">
|
|
|
|
|
<a class=button href="quickref.html">QuickRef</a>
|
|
|
|
|
</div>
|