Commit graph

93 commits

Author SHA1 Message Date
Geoff McLane c8751f60e7 Issue #286 - use AddByte for internal transfer 2015-10-20 15:04:18 +02:00
Geoff McLane d75c82275d Issue #285 - Add a ResetTags func to reset html5 mode before each document 2015-10-14 16:55:35 +02:00
Geoff McLane adbad0379e Issue #65 - if nonested then no endtag needed to decrement.
This is only if nonested is on, then a <script> tag has not incremented
the nested, so likewise no need to treat an escaped close tag <\/script>
as an end tag to decrement nested.
2015-10-08 17:06:03 +02:00
Geoff McLane 7e69ceb3d1 Issue #281 - only warn BAD_CDATA_CONTENT if inserting an escape. 2015-10-07 16:17:42 +02:00
Geoff McLane b63c1090c2 option to avoid incrementing nested containers.
This is in the GetCDATA function. If the container is script or style and
this option is on, avoid bumping nested.

This addresses issues #65 (1642186) and #280.

All attempts at parsing script data are now abandoned as a bad direction.
2015-10-07 15:11:25 +02:00
Geoff McLane b4efe7464a small enhancement of debug only code 2015-10-05 15:08:20 +02:00
Geoff McLane 6c1a2acea2 #273 - avoid xhtml doctype flip/flop 2015-09-27 17:36:57 +02:00
Christopher Brannon 94b0647c08 Issue #65, fix for ignoring cdata. 2015-09-24 18:13:57 -07:00
Geoff McLane 04ca419080 Issue #64 - Try hard to skip '<![CDATA[ ... ]]>' 2015-09-24 14:21:55 +02:00
Geoff McLane 96589c6f57 #65 Skip esc'd esc, and only for script containers 2015-09-21 12:33:53 +02:00
Geoff McLane eda37c5adb Issue #65 - avoid new quotes if in quotes 2015-09-19 14:58:42 +02:00
Geoff McLane d541405a2a Eventually complete a 2007 fix 2015-09-16 13:17:50 +02:00
Geoff McLane 9960f7c6dd Protect against a NULL node in the Debug only code 2015-09-12 13:06:14 +02:00
Geoff McLane 66e288a8e2 Issue #239 - no warn for apos entity in html5++ mode 2015-08-22 14:03:02 +02:00
Geoff McLane e79137de7f Issue #238 - only except the pre element 2015-08-22 14:00:18 +02:00
Geoff McLane 4246c2c462 Issue #230: Need to KEEP this newline char sometimes.
This is a case where the lexer, in GetTokenfromStream, does NOT eat any
trailing newline after a LEX_STARTTAG: case...

So far have identified pre, script, style as NEEDING this user newline
character for later pprint output. Any others?
2015-07-15 19:41:02 +02:00
Geoff McLane 3a524f1710 Issue #207 - deal with 2 cases of an unambiguous ampersand.
html5 allows a naked ampersand unquoted, and now tidy will not issue a
warning. This only deals with a & b, and P&<li>O</li>

More may need to be done for other cases.
2015-06-24 13:10:27 +02:00
Geoff McLane c18f27a587 Issue #217 - avoid len going negative, ever... 2015-06-03 20:26:03 +02:00
Geoff McLane c1a3100cb9 add convenient break point based on row and column 2015-05-12 17:13:23 +02:00
Geoff McLane f5eb2cf26a Issue #196 - expand comment and bump version.
Thanks to @willydee for this PR.
2015-04-11 15:25:07 +02:00
Geoff McLane fd7b4f8589 just some more DEBUG on text nodes 2015-03-06 19:28:52 +01:00
Geoff McLane c0cad3aeba Issue #167 - further fixes for HTML5 mode 2015-03-06 19:13:06 +01:00
Geoff McLane 0dc68d6cb1 Issue #167 & #169 - default to HTML5 mode.
Revert TidyTag_A to HTML5 mode, but allow the table to be modified if the
DOCTYPE given is found to NOT be HTML5, through a service TY_(AdjustTags).
Care is taken to clear any previous hash cached tags.

At present this only affects the anchor tag, but could be applied to
others that need to change their parsing due to an identified DOCTYPE.
2015-03-06 12:55:24 +01:00
Geoff McLane 0aa81eb256 Issue #130 - MathML attr and entity fix!
This is a set of kludgy fixes for MathML attribute and entities support.

It is intended that a full HTML5 entity table be added at some time, but
at present ALL entities are accepted as written when within the math
element.

Likewise all attributes are accepted on MathML elements without any check
of their name or value, even if they match attributes outside MathML.

And in the pprinter such entities are written as is from the lexer, using
a new PPrintMathML service added, using the new mode OtherNameSpace.

It is hoped all these fixes will NOT affect tidy outside the math element.

ALL fixes in the set are clearly marked '#130 - MathML attr and entity fix!'
for easy searching, and improving if possible.
2015-02-22 18:58:55 +01:00
Geoff McLane 2172a498f6 Issue #153 - fix for endif section not conforming to what tidy expects 2015-02-05 19:01:34 +01:00
Geoff McLane 66951a562a add row/col to DEBUG output 2015-02-05 18:24:02 +01:00
Geoff McLane 1be5ccbb63 Issue #130 - initial MathML support 2015-02-05 12:21:08 +01:00
Geoff McLane 59974e0bb0 When adding meta element for Tidy prefer library version over date 2015-01-28 12:15:27 +01:00
Geoff McLane ec4d4cd1f1 Issue #92 - OLD problem of ins and del
These are marked as CM_INLINE, but also CM_BLOCK,
so should not be stacked for insertion
2015-01-28 11:50:06 +01:00
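
The rule in the #92 message above can be sketched as a content-model flag test. This is only an illustration under stated assumptions: the flag values and the helper name should_stack_inline are hypothetical and are not taken from the Tidy sources.

```c
#include <stdbool.h>

/* Illustrative content-model bits. The names mirror the commit message;
 * the numeric values are an assumption made for this sketch. */
enum {
    CM_BLOCK  = 1u << 3,
    CM_INLINE = 1u << 4
};

/* Hypothetical helper: an element is a candidate for the inline stack
 * only if it is inline and NOT also block. Elements such as ins and del,
 * which carry both bits, are therefore not stacked for insertion. */
static bool should_stack_inline(unsigned int model)
{
    return (model & CM_INLINE) != 0 && (model & CM_BLOCK) == 0;
}
```

With these assumed values, a purely inline model passes the test, while a model carrying both CM_INLINE and CM_BLOCK does not.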
Jim Derry edb185a308 Use a hash table for anchors #64 2014-11-22 19:39:06 +08:00
Jim Derry 6aaf826476 Restart with geoffmcl's fork 2014-11-22 15:42:28 +08:00
Peter Kelly 7fc3255542 Applied hash table optimisation to RemoveAnchorByNode. This function now takes
the anchor name as a parameter, so it can look in the correct bin.

In the case of FreeAttrs, we have the name already (since we found a name or
id attribute). In the case of FixAnchors, the anchor name could come from
either the name or id attribute, so we call the function separately for each
case, passing the appropriate attribute value.
2012-08-20 10:06:30 +07:00
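
A minimal sketch of the idea in the message above: when the caller supplies the anchor name, removal only has to walk one hash bin instead of every anchor. The types, the hash function, the table size, and the names add_anchor and remove_anchor_by_node are all assumptions made for illustration; they are not the Tidy implementation.

```c
#include <stdlib.h>
#include <string.h>

#define ANCHOR_HASH_SIZE 1021u  /* assumed bin count for this sketch */

typedef struct Anchor {
    struct Anchor *next;   /* next anchor in the same bin */
    const void    *node;   /* document node that owns the anchor */
    char          *name;   /* value of the name or id attribute */
} Anchor;

typedef struct {
    Anchor *bins[ANCHOR_HASH_SIZE];
} AnchorTable;

/* Simple multiplicative string hash, reduced to a bin index. */
static unsigned int anchor_hash(const char *s)
{
    unsigned int h = 0;
    for (; *s != '\0'; ++s)
        h = h * 31u + (unsigned char)*s;
    return h % ANCHOR_HASH_SIZE;
}

/* Insert at the head of the bin chosen by the name. Returns 1 on success. */
static int add_anchor(AnchorTable *t, const char *name, const void *node)
{
    size_t len = strlen(name) + 1;
    Anchor *a = malloc(sizeof *a);
    if (a == NULL)
        return 0;
    a->name = malloc(len);
    if (a->name == NULL) {
        free(a);
        return 0;
    }
    memcpy(a->name, name, len);
    a->node = node;
    unsigned int bin = anchor_hash(name);
    a->next = t->bins[bin];
    t->bins[bin] = a;
    return 1;
}

/* Remove the anchor owned by node, searching only the bin for name.
 * Returns 1 if an entry was removed, 0 if none was found in that bin. */
static int remove_anchor_by_node(AnchorTable *t, const char *name,
                                 const void *node)
{
    Anchor **link = &t->bins[anchor_hash(name)];
    while (*link != NULL) {
        Anchor *a = *link;
        if (a->node == node) {
            *link = a->next;
            free(a->name);
            free(a);
            return 1;
        }
        link = &a->next;
    }
    return 0;
}
```

In the FixAnchors situation described above, where the anchor may have been registered under either attribute, a caller of this sketch would invoke remove_anchor_by_node once with the name value and once with the id value.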
Craig Barnes ce27a729dc Remove CVS info blocks 2012-08-08 17:27:29 +01:00
Ryan Kistner 8025154e30 Check for NULL pointer before calling strcasecmp in GetVersFromFPI 2012-04-23 11:51:58 +09:00
Michael[tm] Smith 3a9a794d8b Minor formatting edit. 2012-03-15 14:12:41 +09:00
Michael[tm] Smith 0c8b587067 Added --doctype=html5 option value. Fixes #17. 2012-03-15 14:11:01 +09:00
Michael[tm] Smith 879e6cf909 Put "experimental" in the meta@generator output too. 2012-03-14 20:05:12 +09:00
Michael[tm] Smith c331917c31 Make the doctype handling work the way it should. 2012-03-14 19:38:18 +09:00
Michael[tm] Smith 1c4d43ad2a Deal with version reporting better. 2012-03-01 17:22:03 +09:00
Michael[tm] Smith 73834b8412 Correct meta@name=generator output. 2012-02-10 15:40:33 +09:00
Michael[tm] Smith b26db41c86 Do not mess with <!doctype html>. Fixes #2. 2012-02-10 15:33:21 +09:00
Michael[tm] Smith 264c9bc043 HTML IDs can contain anything except whitespace.
Introduced TY_(IsHTMLSpace)(uint c), which checks to see if c is one of the
chars that the HTML spec (and browsers) treat as a space in attribute
values: 0x020 (space), 0x009 (tab), 0x00a (LF), 0x00c (FF), or 0x00d (CF).
Can't use ANSI C isspace(int c) here because like standard functions for
many other langs, it also treats 0x00b as a space.
2012-01-02 16:12:51 +09:00
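
The check described in the message above can be sketched as follows. The function names is_html_space and is_valid_html_id are hypothetical stand-ins rather than the actual TY_(IsHTMLSpace) source, and the requirement that an ID be non-empty is an added assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the check the commit describes: true only for
 * space, tab, line feed, form feed, and carriage return. Unlike isspace(),
 * it deliberately excludes 0x0b (vertical tab). */
static bool is_html_space(unsigned int c)
{
    return c == 0x20 || c == 0x09 || c == 0x0a || c == 0x0c || c == 0x0d;
}

/* Hypothetical caller: accept an ID that contains no HTML space character.
 * Rejecting NULL and the empty string is an assumption of this sketch. */
static bool is_valid_html_id(const char *id)
{
    if (id == NULL || *id == '\0')
        return false;
    for (; *id != '\0'; ++id) {
        if (is_html_space((unsigned char)*id))
            return false;
    }
    return true;
}
```

Under this sketch an ID such as "a:b.c" is accepted and "a b" is rejected, while a vertical tab inside an ID would not count as whitespace.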
Michael[tm] Smith b92d7aab88 new 2011-11-17 11:44:16 +09:00