This is particularly for the anchor tag, which in HTML5 mode is parsed in
ParseBlock. That is, retain a leading space in case it needs to be
moved in front of the block to preserve space rendering.
Revert TidyTag_A to HTML5 mode, but allow the table to be modified if the
DOCTYPE given is found to NOT be HTML5, through a service TY_(AdjustTags).
Care is taken to clear any previously hash-cached tags.
At present this only affects the anchor tag, but it could be applied to
others that need to change their parsing due to an identified DOCTYPE.
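The adjust-then-clear-cache idea can be sketched roughly as follows. This is a minimal illustration only: TagTable, CacheSlot, lookupParser and adjustTags are hypothetical stand-ins, not tidy's real types or the real TY_(AdjustTags), and the assumption that the non-HTML5 anchor falls back to an inline parser is mine.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not tidy's real types. */
typedef enum { PARSER_BLOCK, PARSER_INLINE } ParserKind;

typedef struct {
    const char *name;
    ParserKind  parser;
} TagEntry;

typedef struct {
    int      used;
    TagEntry entry;              /* cached copy of a table entry */
} CacheSlot;

#define TAG_COUNT  2
#define CACHE_SIZE 8

typedef struct {
    TagEntry  tags[TAG_COUNT];
    CacheSlot cache[CACHE_SIZE];
} TagTable;

/* Start in "HTML5 mode": the anchor is parsed as a block. */
static void initTags(TagTable *t)
{
    memset(t, 0, sizeof *t);
    t->tags[0].name = "a"; t->tags[0].parser = PARSER_BLOCK;
    t->tags[1].name = "p"; t->tags[1].parser = PARSER_BLOCK;
}

static unsigned hashName(const char *s)
{
    unsigned h = 0;
    while (*s)
        h = h * 31u + (unsigned char)*s++;
    return h % CACHE_SIZE;
}

/* Returns 1 and stores the parser kind in *out if the tag is known. */
static int lookupParser(TagTable *t, const char *name, ParserKind *out)
{
    CacheSlot *slot = &t->cache[hashName(name)];
    size_t i;

    if (slot->used && strcmp(slot->entry.name, name) == 0) {
        *out = slot->entry.parser;       /* cache hit */
        return 1;
    }
    for (i = 0; i < TAG_COUNT; i++) {
        if (strcmp(t->tags[i].name, name) == 0) {
            slot->used  = 1;             /* remember a copy for next time */
            slot->entry = t->tags[i];
            *out = t->tags[i].parser;
            return 1;
        }
    }
    return 0;
}

/* Assumed behaviour: a non-HTML5 DOCTYPE switches the anchor to inline
   parsing, then drops every cached copy so no stale entry survives. */
static void adjustTags(TagTable *t, int isHtml5)
{
    size_t i;

    if (isHtml5)
        return;
    for (i = 0; i < TAG_COUNT; i++)
        if (strcmp(t->tags[i].name, "a") == 0)
            t->tags[i].parser = PARSER_INLINE;
    memset(t->cache, 0, sizeof t->cache);
}
```

Without the final memset, a lookup of "a" made before the adjustment would keep returning the cached block parser, which is the stale-cache problem the message refers to.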
The fix for #111 added an end tag for all StartEnd tags, when outputting
HTML5, but there should be some exceptions to this.
Added a new service, isVoidElement(node) for the void elements. Perhaps
this service could be further optimised.
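As a rough sketch of the void-element exception, assuming the check is done on a lowercase element name rather than on a node (the real isVoidElement(node) takes a node; isVoidElementName and needsEndTag are hypothetical names), and using the void elements listed in the HTML standard:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name-based variant; the name is assumed to be lowercase. */
static int isVoidElementName(const char *name)
{
    static const char *const voidNames[] = {
        "area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"
    };
    size_t i;

    for (i = 0; i < sizeof voidNames / sizeof voidNames[0]; i++)
        if (strcmp(name, voidNames[i]) == 0)
            return 1;
    return 0;
}

/* An empty element gets an explicit end tag only if it is not void. */
static int needsEndTag(const char *name)
{
    return !isVoidElementName(name);
}
```

A linear scan over a short list is probably adequate here; a lookup keyed on the tag id would be one way to optimise it.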
This fix introduces two new services, FindNodeById and
FindNodeWithId. The former does a total tree search for a TidyTagId.
Maybe there is a way to optimise this search...
Also change the uint badForm from an on/off flag to a bit field, so it can be
extended to other document format errors.
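The two ideas above, a full tree search by tag id and a flag word used as a bit field, might look roughly like this. The Node layout, the integer tag ids and the BAD_FORM_* names are hypothetical and only illustrate the technique; they are not tidy's actual definitions.

```c
#include <stddef.h>

/* Hypothetical node layout: first child plus next sibling. */
typedef struct Node Node;
struct Node {
    int   tagId;      /* stand-in for a TidyTagId */
    Node *content;    /* first child, or NULL */
    Node *next;       /* next sibling, or NULL */
};

/* Depth-first search of node, its siblings and all their descendants. */
static Node *findNodeById(Node *node, int tagId)
{
    for (; node != NULL; node = node->next) {
        Node *hit;

        if (node->tagId == tagId)
            return node;
        hit = findNodeById(node->content, tagId);
        if (hit != NULL)
            return hit;
    }
    return NULL;
}

/* Hypothetical error bits; each document format error gets its own bit. */
enum {
    BAD_FORM_FIRST  = 1u << 0,
    BAD_FORM_SECOND = 1u << 1
};

static unsigned setBadForm(unsigned badForm, unsigned bit)
{
    return badForm | bit;
}

static int hasBadForm(unsigned badForm, unsigned bit)
{
    return (badForm & bit) != 0;
}
```

The search visits every node in the worst case, which is presumably why the message wonders about optimising it; with a bit field, a new error kind only needs a new bit and existing callers that test for non-zero keep working.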
This is a set of kludgy fixes for MathML attribute and entity support.
It is intended that a full HTML5 entity table be added at some time, but
at present ALL entities are accepted as written when within the math
element.
Likewise all attributes are accepted on MathML elements without any check
of their name or value, even if they match attributes outside MathML.
In the pprinter such entities are written as is from the lexer by a
new PPrintMathML service, using the new mode OtherNameSpace.
It is hoped all these fixes will NOT affect tidy outside the math element.
ALL fixes in the set are clearly marked '#130 - MathML attr and entity fix!'
for easy searching, and for improvement where possible.
As predicted, the previous fix had adverse consequences on, say, script text,
which then lost its indent, so it was reverted.
This introduces a new service, nodeIsTextLike, which naturally returns yes
if the node is text, but also if it is an AspTag.
Maybe other text-like nodes need to be added.
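A minimal sketch of such a predicate, assuming a hypothetical NodeKind enum and Node struct (tidy's real node types differ):

```c
#include <stddef.h>

/* Hypothetical node kinds, for illustration only. */
typedef enum {
    NODE_ELEMENT,
    NODE_TEXT,
    NODE_COMMENT,
    NODE_ASP
} NodeKind;

typedef struct {
    NodeKind kind;
} Node;

/* Yes for plain text and for ASP nodes; other text-like kinds
   could be added to this test later. */
static int nodeIsTextLike(const Node *node)
{
    return node != NULL &&
           (node->kind == NODE_TEXT || node->kind == NODE_ASP);
}
```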
An immense thanks to Ger Hobbelt who had already done this
in his github.com/GerHobbelt/htmltidy fork.
The two sources have diverged, so it was not a simple cut
and paste. But again, thanks Ger for this.
This is a simple but profound change in pprint.c.
Since leading space is preserved on script code, after tidy indents
the code once, a second run on that tidied file would add more indent
to already indented code. This fix should be carefully checked,
and removed if there are other bad consequences.
Bump the version point to 4 for this change.