530 commits

Author SHA1 Message Date
Geoff McLane 8dda04f1df Issue #379 - Care about 'ix' going negative.
How this lasted so long in the code is a mystery! But of course it will
only be a read out-of-bounds if testing the first character in the lexer,
and it is a spacey char.

A big thanks to @gaa-cifasis for running ASAN tests on Tidy.
2016-03-06 17:36:51 +01:00
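The kind of out-of-bounds read described in this commit can be illustrated with a minimal C sketch. This is not Tidy's lexer code: the function name `last_non_space` and its parameters are hypothetical, chosen only to show why the index must be tested before the buffer is read when scanning backwards over whitespace.

```c
#include <ctype.h>

/* Hypothetical helper, not part of Tidy: returns the index of the last
   non-space byte in buf[0..end-1], or -1 if there is none. */
static int last_non_space(const char *buf, int end)
{
    int ix = end - 1;

    /* Test ix >= 0 BEFORE reading buf[ix]. Without this guard, a buffer
       that starts with whitespace would lead to a read of buf[-1]. */
    while (ix >= 0 && isspace((unsigned char)buf[ix]))
        --ix;

    return ix;
}
```

With the guard in place, a buffer consisting only of whitespace yields -1 instead of reading before the start of the buffer.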
Geoff McLane 8eee85cb9e Issue #380 - Experimental patch in issue-380 branch 2016-03-05 17:39:14 +01:00
Geoff McLane 0e6ed639d6 Issue #380 - Add more MSVC debug 2016-03-04 19:28:49 +01:00
Geoff McLane d091027089 Issue #377 add debug only output of constrained versions 2016-03-03 20:21:35 +01:00
Geoff McLane 7bdc31af76 Issue #377 - Table summary attribute also applies to XHTML5 2016-02-29 19:58:55 +01:00
Geoff McLane 24c62cf0df Issue #314 - Avoid head warning if show-body-only 2016-02-29 18:49:15 +01:00
Geoff McLane 23e689d145 Issue #373 - Merge branch 'issue-373' of github.com:htacg/tidy-html5 into issue-373
Conflicts: version.txt - set version 5.1.41issue-373
2016-02-18 15:18:39 +01:00
Geoff McLane 8c13d270ed Merge branch 'master' of github.com:htacg/tidy-html5 2016-02-18 13:58:23 +01:00
Geoff McLane b91d52592b Fix to K&R C to compile with MSVC 2016-02-18 13:57:47 +01:00
Jim Derry 63c0327de1 Fixed typo in output strings. 2016-02-18 15:40:10 +08:00
Jim Derry e00f419f5d Discovered some missing strings from tidyErrorFilterKeysStruct. 2016-02-18 10:19:57 +08:00
Jim Derry da8205b2dc Regen'd POT, POs, and headers in order to capture documentation changes in all of them. 2016-02-17 20:07:00 +08:00
Jim Derry 7fbe76be0b Finished semantic html. 2016-02-17 20:02:38 +08:00
Jim Derry a78daccd3c Through TidyIndentSpaces. 2016-02-17 17:43:09 +08:00
Jim Derry a16e89c4f8 Updated translator comments. 2016-02-17 17:27:57 +08:00
Jim Derry d30c2d7747 XSL for man handles <var>. Updated comment and sample string. 2016-02-17 17:20:02 +08:00
Jim Derry cc59efb23d Add a xml-error-strings service to console app providing symbols developers can use with TidyErrorFilter3. 2016-02-17 12:35:20 +08:00
Jim Derry bc1e54d5b5 Externalize the TidyReportFilter3 error codes, and provide iterators to loop through them. 2016-02-17 12:27:11 +08:00
Jim Derry 720d5c25d2 Squelch compiler warning default type. 2016-02-17 10:56:21 +08:00
Jim Derry 97abad0c05 Bump to 5.1.39 for merging.
Merge branch 'master' into attrdict_phase2
2016-02-16 11:11:36 +08:00
Jim Derry 3431dd05a4 Merge branch 'master' into attrdict_phase1
Bump version to 5.1.38
2016-02-16 11:07:32 +08:00
Jim Derry 1e4f7dd0f1 Merge pull request #368 from htacg/issue-341
Issue #341
2016-02-16 10:18:26 +08:00
Geoff McLane 9cf97d536b Issue #373 - Avoid a null added to output.
This bug was first opened in 2009 by Christophe Chenon, as bug sf905 but
the patch provided then never made it into the source.

Now appears fixed, 7 years later!
2016-02-15 13:02:10 +01:00
Geoff McLane a4f425546f Improve MSVC DEBUG output.
Previously only output the first 8 characters, followed by an ellipsis if more
than 8. Now return first up to 19 chars. If more than 19, return first 8,
followed by an ellipsis, followed by the last 8 characters.

This is in the get_text_string service, which is only used if MSVC and not
NDEBUG.
2016-02-14 18:17:46 +01:00
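The abbreviation rule described in this commit can be sketched in C as follows. This is not the actual `get_text_string` service: the name `abbreviate_text`, its signature, and the use of three ASCII dots for the ellipsis are assumptions made for illustration. The sketch also counts bytes, not characters.

```c
#include <string.h>

/* Hypothetical helper, not Tidy's get_text_string: abbreviates text for
   debug output. Text of up to 19 bytes is copied whole; longer text
   becomes the first 8 bytes, "...", then the last 8 bytes (19 in total).
   The output buffer must hold at least 20 bytes. */
static void abbreviate_text(const char *text, char *out, size_t out_size)
{
    size_t len = strlen(text);

    if (out_size < 20) {            /* not enough room: return empty text */
        if (out_size > 0)
            out[0] = '\0';
        return;
    }
    if (len <= 19) {
        memcpy(out, text, len + 1); /* copy including the terminator */
    } else {
        memcpy(out, text, 8);                 /* first 8 bytes */
        memcpy(out + 8, "...", 3);            /* assumed ellipsis */
        memcpy(out + 11, text + len - 8, 8);  /* last 8 bytes */
        out[19] = '\0';
    }
}
```

Keeping both ends of a long text node makes it easier to tell similar nodes apart in debug output than showing only the first 8 characters.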
Jim Derry c62127b9bd Default to NO at this point. 2016-02-13 12:33:02 +08:00
Jim Derry 8b5771cf24 Word2000
Added messages that would otherwise be missed in post-processing, after cleanup.
2016-02-13 12:26:19 +08:00
Jim Derry 2cdedb4a63 Forgot one file... 2016-02-13 11:53:53 +08:00
Jim Derry 896b00238b Forgot one file... 2016-02-13 11:53:40 +08:00
Jim Derry 2ade3357a9 Phase 2
This is a MUCH SANER approach to what I was trying to do (now that I screwed up enough internals to understand some of them!)
At this point there are zero exit state reversions, and zero markup reversions! There are still 21 errout reversions; I'll
annotate and adjust as necessary.
2016-02-13 11:31:16 +08:00
Jim Derry e947d296e4 Handle some issues with misusing VERS_HTML5 in the doctype. 2016-02-12 20:49:14 +08:00
Jim Derry c81a151da5 Add VERS_STRICT to identify future strict document types. 2016-02-12 20:46:49 +08:00
Jim Derry 74604fd52b Hard-coded checks are redundant with updates to attrdict.c. 2016-02-12 20:44:03 +08:00
Jim Derry 429703dce4 Because the previous effort #350 grew too fast and there were a LOT of side effects to
my changes, I'm starting over with this. Comments in the PR thread.

This commit reduces the size of attrdict.c while causing only a single errout
regression that is justified.
2016-02-12 19:34:19 +08:00
Geoff McLane 03a643f781 Issue #341 - No token can be inserted if istacksize == 0! 2016-02-08 15:12:23 +01:00
Geoff McLane 7d0d8a853a Issue #345 - discard leading spaces in href 2016-02-01 20:07:55 +01:00
Geoff McLane 7f0d5c31e6 If no doctype, allow user doctype to reset table - Issue #342 2016-02-01 19:44:30 +01:00
Geoff McLane c1f94c066c Tidy up some debug only code.
After @sria91 added #360 merge, added a little more improvement...
2016-01-30 20:51:27 +01:00
Srikanth Anantharam 9a0af48a4e fixed a NULL node bug in debug build 2016-01-30 22:03:52 +05:30
Jim Derry 9ae15f45a7 Consistent tabs
Fixed tabs in template file, and regen'd all related files.
2016-01-30 15:51:54 +08:00
Jim Derry 53f2a2da2a msgunfmt works properly with escaped hex. 2016-01-30 15:51:53 +08:00
Martin von Gagern 17e50f2642 Encode UTF-8 strings to hex escapes in header files 2016-01-30 15:51:53 +08:00
Jim Derry bf70824cc2 - Add TidyReportFilter3, which removes translation strings completely from the equation. It would be a good idea to deprecate TidyReportFilter2, which is vulnerable to changing strings in Tidy source.
- Documentation reminders for future enum changes.
- Documentation updates.
2016-01-30 15:51:53 +08:00
Jim Derry d505869910 Localization Support added to HTML Tidy
- Languages can now be added to Tidy using standard toolchains.
- Tidy's help output is improved with new options and some reorganization.
2016-01-30 15:51:53 +08:00
Jim Derry 26e7d9d4b0 Fixes Mac OS X encoding issues and harmonizes output across platforms.
Previously Tidy produced different output based on the compilation target, NOT based on
the file encoding and specified options. Every platform was equal except Mac OS. Now unless
the encoding is specifically set to a Mac file type, all encoding assumptions are the same
across platforms.
2015-12-31 13:57:34 +08:00
Geoff McLane 78f2d52cdd Issue #308 - remove bad warn, bad assert, and free discarded 2015-12-05 15:03:41 +01:00
Geoff McLane 9caecb80cf Revert "Fix for head closing tag not reported (#327)"
This reverts commit 61cfcb1555.

This added an inconsistent warning about a missing optional close tag. In
general tidy does not report such optional close tags. See issue #327 for
some discussion on this.
2015-12-05 12:59:43 +01:00
Geoff McLane 3b13cd8076 Merge branch 'mingw-build' 2015-12-03 19:18:07 +01:00
Jim Derry 61cfcb1555 Fix for head closing tag not reported (#327) 2015-11-29 13:21:49 +08:00
Jim Derry 873794162a Callback added to XML printer, too; fixed off-by-one error. 2015-11-29 07:39:33 +08:00
Geoff McLane dc969f30d5 Issue #311 - small changes for MinGW32 build 2015-11-28 15:14:53 +01:00
Jim Derry 4adc07fd65 Removed the one callback per line filter. Library user can filter this himself. 2015-11-28 15:43:34 +08:00
Jim Derry dcd8f16f73 Tidying progress callback implemented. 2015-11-28 15:34:23 +08:00
Jim Derry 34d456aa80 Make pretty printer keep track of line numbers as it prints. 2015-11-28 14:16:17 +08:00
Jim Derry 9834cc17ad Style cleanup for previous commit. 2015-11-27 09:45:26 +08:00
Jim Derry 1c963acb58 Merge branch 'master' into fix_img_alt 2015-11-27 09:36:32 +08:00
Jim Derry 933fc3d236 - Addresses #320
- Different error output depending on whether or not the `alt-text` option was given a value.
2015-11-26 13:23:43 +08:00
Jim Derry 63234735d8 Allows null value css-prefix to be used in a config file without issuing a warning. 2015-11-26 11:21:48 +08:00
Ben Bullock 71d9638448 Don't push back non-A tokens. 2015-11-25 18:00:45 +09:00
Christopher Brannon 1ef5ba7968 Fix a tiny buffer overflow. 2015-11-23 12:28:00 -08:00
Geoff McLane b58aa1c26a Issue #307 - add a ref link in comments 2015-11-22 20:43:12 +01:00
Geoff McLane 2388fb0175 Issue #307, #167, #169 - regression of nested anchors 2015-11-22 18:46:00 +01:00
Geoff McLane bbc72a9297 Issue #306 - fix an old typo hidden by a cast!
Thanks to @benkasminbullock for spotting this fix.
2015-11-18 20:01:21 +01:00
Geoff McLane e2feed485c gcc warning - if 0 an unused static table 2015-11-18 17:06:13 +01:00
Geoff R. McLane b98061ff62 fix gcc warning parentheses in pprint.c 2015-11-18 16:47:58 +01:00
Geoff McLane 768ad46968 Issue #304 - remove duplicated TidyAttr_ARIA_ORIENTATION 2015-11-17 15:06:23 +01:00
Shane McCarron c0b769c5c7 Initial cut at RDFa support (again)
New branch that implements support for RDFa attributes.  Should be
cleaner than my first attempt in PR #299 - also references issue #209
2015-11-16 11:29:23 -06:00
Paul Howarth baad0b0064 Don't mangle the output filename
Attached patch works for me, and shouldn't affect any other option
processing.
2015-11-11 11:28:47 +01:00
Geoff McLane c68ad42482 Revert 22a1922c35 2015-11-07 14:50:10 +01:00
Shane McCarron c572e3e3c8 Initial cut at supporting RDFa attributes. 2015-11-06 12:19:05 -06:00
Geoff McLane 800b91e576 Issue #65 - effect name change to skip-nested, and default to on 2015-11-05 15:19:39 +01:00
Jim Derry 32ce272f75 Fix indent-with-tabs for library use. 2015-11-04 12:44:15 +08:00
Jim Derry dec6356a6f Deleted multiple equal id attributes. 2015-11-02 15:31:47 +08:00
Jim Derry d0ac990636 More description beautification. 2015-11-02 12:06:37 +08:00
Jim Derry 807fed4ff6 Documentation improvements. 2015-11-01 19:05:03 +08:00
Jim Derry 2613f02dc5 More documentation beautification. 2015-10-31 22:03:33 +08:00
Jim Derry 565d2ec232 Documentation beautification underway. 2015-10-31 18:30:02 +08:00
Jim Derry cf3c0293c0 Additional tests with our troublesome option. 2015-10-31 14:45:51 +08:00
Jim Derry 8c5fae8c09 - documentation/quickref.xsl
- Includes <p> support
  - Matches the description class name in quickref.include.xsl
  - Styles <br /> to enforce vertical spacing (in the reference table only).
- documentation/style.css
  - Styles <br /> to enforce vertical spacing (in the reference table only).
- documentation/tidy1.xsl.in
  - Includes <p> support.
  - Better manages line breaks with .sp1 instead of .br.
- src/localize.c
  - Legibility to the troublesome `drop-font-tags` description.
2015-10-30 23:58:43 +08:00
Jim Derry 709ac8cb4c Support HTML in descriptions. 2015-10-30 18:17:40 +08:00
Jim Derry 09b0698c56 Typo. 2015-10-30 12:58:11 +08:00
Jim Derry a3138cb142 URL cleanup. 2015-10-30 12:23:20 +08:00
Jim Derry 2d0f971747 Update documentation to address #288. 2015-10-30 10:19:47 +08:00
Geoff McLane c8751f60e7 Issue #286 - use AddByte for internal transfer 2015-10-20 15:04:18 +02:00
Geoff McLane d75c82275d Issue #285 - Add a ResetTags func to reset html5 mode before each document 2015-10-14 16:55:35 +02:00
Geoff McLane adbad0379e Issue #65 - if nonested then no endtag needed to decrement.
This is only if nonested is on, then a <script> tag has not incremented
the nested, so likewise no need to treat an escaped close tag <\/script>
as an end tag to decrement nested.
2015-10-08 17:06:03 +02:00
Geoff McLane 7e69ceb3d1 Issue #281 - only warn BAD_CDATA_CONTENT if inserting an escape. 2015-10-07 16:17:42 +02:00
Geoff McLane b63c1090c2 option to avoid incrementing nested containers.
This is in the GetCDATA function. If the container is script or style and
this option is on, avoid bumping nested.

This addresses issues #65 (1642186) and #280.

All attempts at parsing script data are now abandoned as a bad direction.
2015-10-07 15:11:25 +02:00
Geoff McLane b4efe7464a small enhancement of debug only code 2015-10-05 15:08:20 +02:00
Geoff McLane 6c1a2acea2 #273 - avoid xhtml doctype flip/flop 2015-09-27 17:36:57 +02:00
Christopher Brannon 94b0647c08 Issue #65, fix for ignoring cdata. 2015-09-24 18:13:57 -07:00
Geoff McLane 04ca419080 Issue #64 - Try hard to skip '<![CDATA[ ... ]]>' 2015-09-24 14:21:55 +02:00
Geoff McLane 96589c6f57 #65 Skip esc'd esc, and only for script containers 2015-09-21 12:33:53 +02:00
Geoff McLane eda37c5adb Issue #65 - avoid new quotes if in quotes 2015-09-19 14:58:42 +02:00
Geoff McLane d541405a2a Eventually complete a 2007 fix 2015-09-16 13:17:50 +02:00
Geoff McLane 9960f7c6dd Protect against a NULL node in the Debug only code 2015-09-12 13:06:14 +02:00
Srikanth Anantharam be9f1d4203 using _fileno(fout) instead of fout->_file makes it more portable across different MSVC versions 2015-09-11 00:27:17 +05:30
Geoff McLane c48680cc01 Issue #180 - fix indenting when -omit used 2015-09-10 15:01:48 +02:00
Geoff McLane 66e288a8e2 Issue #239 - no warn for apos entity in html5++ mode 2015-08-22 14:03:02 +02:00
Geoff McLane e79137de7f Issue #238 - only except the pre element 2015-08-22 14:00:18 +02:00
Geoff McLane 1d67dc940a Merge branch 'Andrew-Dunn-patch-1' into issue-228.
That is reordering windows includes per #234

In general the order of includes should be system <headers>,
then local "headers", except perhaps for the occasional local
"version" or "config" header...

Resolved conflicts in src/pprint.c by reverting to current master, and in
version.txt by increasing the version.
2015-08-10 18:49:13 +02:00
Andrew Dunn dfdffd0cb3 Reordered Windows Includes
Moved the <windows.h> include above the "streamio.h" include to fix compilation with the latest Windows SDK.

<winnt.h> now has the following struct. In particular the `CR` member of this struct conflicts with a define in streamio.h.

    typedef struct _IMAGE_ARM64_RUNTIME_FUNCTION_ENTRY {
        DWORD BeginAddress;
        union {
            DWORD UnwindData;
            struct {
                DWORD Flag : 2;
                DWORD FunctionLength : 11;
                DWORD RegF : 3;
                DWORD RegI : 4;
                DWORD H : 1;
                DWORD CR : 2; // This line causes a compile error because CR is redefined in streamio.h
                DWORD FrameSize : 9;
            } DUMMYSTRUCTNAME;
        } DUMMYUNIONNAME;
    } IMAGE_ARM64_RUNTIME_FUNCTION_ENTRY, * PIMAGE_ARM64_RUNTIME_FUNCTION_ENTRY;
2015-08-07 17:06:33 +10:00
Geoff McLane cbae924a40 Oops, missed setting 'type' for TidyVertSpace.
This was evidenced by an 'assert' failure, that the type was not an 'int'!

And also in the -xml-help output, thus affecting the tidy.1 manual page
for this new feature --vertical-space auto, which produces almost single
line html output.

This 'fix' began in the issue-228 branch - see Issue #231
2015-07-31 13:39:06 +02:00
Geoff McLane 38ef5bfe85 Issue #232 remove CM_HEAD from 'object' tag 2015-07-30 14:50:15 +02:00
Geoff McLane ae620a63a2 merge @camoy fix #158 to this branch 2015-07-17 19:00:16 +02:00
Geoff McLane d26cd72084 Add macros to get TidyVertSpace config, and implement 2015-07-15 20:58:00 +02:00
Geoff McLane 154a61543b Expand xml TidyVertSpace text to include tri-state 2015-07-15 20:56:22 +02:00
Geoff McLane 16580e0926 Revert TidyVertSpace to 'no', and make AutoBool option 2015-07-15 20:54:50 +02:00
Geoff McLane 4246c2c462 Issue #230: Need to KEEP this newline char sometimes.
This is a case where the lexer, in GetTokenfromStream, does NOT eat any
trailing newline after a LEX_STARTTAG: case...

So far have identified pre, script, style as NEEDING this user newline
character for later pprint output. Any others?
2015-07-15 19:41:02 +02:00
Cameron Moy d50391a984 Fix #158 - remove inserted newlines in pre 2015-07-13 16:31:52 -04:00
Geoff McLane cb2543efac Merge branch 'master' of https://github.com/stencila/tidy-html5 into issue-228 2015-07-13 19:11:30 +02:00
Nokome Bentley 991630e523 Changes default for vertical-space to yes
Makes this more similar (but not the same) as the previous default
behaviour.
2015-07-13 15:56:15 +12:00
Nokome Bentley b6bcf0408c Applies "smart" new lines to start of script like tags 2015-07-13 15:49:07 +12:00
Nokome Bentley f6979787d1 Adds "smart" line flushing functions.
See in-code comments for more details
2015-07-13 15:40:59 +12:00
Folkert van Heusden 784c7d7f79 Added methods for deleting nodes and/or attributes.
This is useful when e.g. writing an HTML cleaner.
2015-07-12 18:34:35 +00:00
Geoff McLane 1e70fc6f15 Rename two headers. Issues #224 #223 #221
But this seemed a good time to release 5.0.0.RC1...
2015-06-30 20:06:02 +02:00
Geoff McLane 3a524f1710 Issue #207 - deal with 2 cases of an unambiguous ampersand.
html5 allows a naked ampersand unquoted, and now tidy will not issue a
warning. This only deals with a & b, and P&<li>O</li>

More may need to be done for other cases.
2015-06-24 13:10:27 +02:00
Geoff McLane 3aa50740da Issue #215 - only issue warning if NOT HTML5 mode 2015-06-21 19:49:44 +02:00
Geoff McLane e71bda718f Add TIDY_CALL to tidyLibraryVersion func. 2015-06-09 20:04:49 +02:00
Geoff McLane 18880eab55 Issue #218 - Do NOT allocate a 1 byte null String buffer.
This is when setting a String config value through say tidyOptSetValue
using say tidyOptSetValue(tdoc,id,"").

If the length of the new string is zero then do not allocate a 1 byte
buffer, set it to 0, for the option. Any previous buffer has already been
released.

This means API functions like tidyOptSaveSink will not return erroneous
null String values!
2015-06-08 13:52:00 +02:00
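The behaviour described for Issue #218 can be sketched in C as below. This is a simplified stand-in, not Tidy's option-handling code: the `StringOption` type and `set_string_option` function are hypothetical. The sketch shows an empty value being stored as a null pointer instead of a 1-byte buffer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string-valued option slot. */
typedef struct {
    char *value;   /* NULL means "no value set" */
} StringOption;

/* Hypothetical setter: releases any previous buffer, then stores a copy
   of val. An empty or NULL val leaves the slot at NULL, so no 1-byte
   buffer holding only a terminator is ever allocated.
   Returns 1 on success, 0 on allocation failure. */
static int set_string_option(StringOption *opt, const char *val)
{
    size_t n;

    free(opt->value);          /* previous buffer is always released */
    opt->value = NULL;

    if (val == NULL || val[0] == '\0')
        return 1;              /* zero length: keep the slot at NULL */

    n = strlen(val) + 1;
    opt->value = malloc(n);
    if (opt->value == NULL)
        return 0;
    memcpy(opt->value, val, n);
    return 1;
}
```

Code that later saves or prints options can then treat NULL uniformly as "unset", with no need to special-case an allocated empty string.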
Geoff McLane 3f72b6e335 Issue #210 - Add new warning for summary attr in table if HTML5.
This new warning will only be seen if the document remains in HTML5 mode,
where the summary attribute is obsolete. The W3C validator flags this as
an error, and suggests 'Consider describing the structure of the table in
a caption element or in a figure element containing the table; or simplify
the structure of the table so that no description is needed'.

At the same time this patch also restored the old warning if the document
is HTML4--, if the table element lacks a summary attribute. This has been
a tidy warning since the beginning of time, although the W3C validator
does not presently flag this.
2015-06-06 11:20:35 +02:00
Geoff McLane 326f2414fd Issue #212 - Further fix to set MixedContent in some cases.
In certain circumstances a leading space has to be preserved to allow it
to be used to create a text space node to insert before this element to
preserve the view in a browser.

And added a note asking why is ParseTag called with a hardcoded
IgnoreWhitespace when some effort above has set the mode variable to
MixedContent in certain cases, but need to think about this 2nd change.

Also added some MSVC Debug output when this leading text is used to insert
such a created text node before the element just to be reminded of this
special event.
2015-06-04 13:12:05 +02:00
Geoff McLane a278b04a19 Add debug display of text modes.
Note this ONLY affects a MSVC Debug build!
2015-06-04 12:59:02 +02:00
Geoff McLane c18f27a587 Issue #217 - avoid len going negative, ever... 2015-06-03 20:26:03 +02:00
Geoff McLane 0fb7ccdfc6 Add some mem alloc and free debug to chase Issue #217
Such debug is OFF by default, and only added by defining DEBUG_MEMORY. And
is only available for the Debug configuration compiled with MSVC, but this
could be easily extended...
2015-06-03 20:24:41 +02:00
Geoff McLane 944b412fe6 Need extra include if UNICODE is defined 2015-06-02 20:44:00 +02:00
Geoff McLane b8bc88522c small fix for indent-with-tabs to have a default xml value 2015-05-25 16:48:39 +02:00
Denis Denisov 5a28d5f010 5.0.0
htacg/tidy-html5#190
2015-05-24 23:49:00 +03:00
Geoff McLane d923dd7b2d Issue #108 - first cut new option --indent-with-tabs yes. 2015-05-22 16:06:12 +02:00
Geoff McLane 5d5e689f1a For issue #212, retain mixed mode block parsing.
This is particularly for the anchor tag which in html5 mode is parsed in
ParseBlock. That is retain a leading space, in case it needs to be
moved to in front of the block to keep space rendering.
2015-05-13 12:35:06 +02:00
Geoff McLane 963caf0741 add counter for in ParseBlock 2015-05-12 17:14:09 +02:00
Geoff McLane c1a3100cb9 add convenient break point based on row and column 2015-05-12 17:13:23 +02:00
Geoff McLane b2b9f1d6f2 spelling error noted in exploration of #207 in localize.c 2015-04-26 19:19:55 +02:00
Dmitry Ivanov 9a3f85d44c Support build with MinGW 4.9.1 2015-04-26 13:18:46 +03:00
Geoff McLane 2f6b3d49b6 Merge pull request #202 from aerilon/master
Please pull fix for #198 and #199
2015-04-22 21:24:12 +02:00
Geoff McLane f5eb2cf26a Issue #196 - expand comment and bump version.
Thanks to @willydee for this PR.
2015-04-11 15:25:07 +02:00
willydee 253a7e54c3 Fix for #196: HTML5 allows block elements in <CAPTION> 2015-04-11 15:06:35 +02:00
Arnaud Lacombe c05661df11 Issue #199 - Add support for html5's template tag 2015-04-10 15:50:07 -07:00
Geoff McLane e78c0105d3 Indicated by #191, why show doctype warning if omitted in output 2015-04-08 18:45:31 +02:00
Geoff McLane 5cbd3ee95b From issue #191, saw need to revert to 'master' branch 2015-04-08 17:55:12 +02:00
Geoff McLane 3585d4c31a Issue #186 - Move FreeLexer() to near last 2015-03-19 19:14:27 +01:00
Geoff McLane 79ac8b2554 Issue #185 - Treat elements ids as case-sensitive if in HTML5 mode 2015-03-13 19:47:28 +01:00
Geoff McLane 66a597f5b7 related to issue #180 - remove additional line unless 'classic' 2015-03-10 12:27:29 +01:00
Geoff McLane 9caab688f1 debug - avoid duplicate output if to stdout 2015-03-09 16:12:59 +01:00
Geoff McLane fd7b4f8589 just some more DEBUG on text nodes 2015-03-06 19:28:52 +01:00
Geoff McLane c0cad3aeba Issue #167 - further fixes for HTML5 mode 2015-03-06 19:13:06 +01:00
Geoff McLane 389ce17814 add attr to dbg_show_node 2015-03-06 18:36:01 +01:00
Geoff McLane 0dc68d6cb1 Issue #167 & #169 - default to HTML5 mode.
Revert TidyTag_A to HTML5 mode, but allow the table to be modified if the
DOCTYPE given is found to NOT be HTML5, through a service TY_(AdjustTags).
Care is taken to clear any previous hash cached tags.

At present this only affects the anchor tag, but could be applied to
others that need to change their parsing due to an identified DOCTYPE.
2015-03-06 12:55:24 +01:00
Geoff McLane 606ffebd47 Issue #168 - Fix for access test 5.2.1.2 2015-03-04 19:38:59 +01:00
Geoff McLane 86f626cd67 Issue #167 - revert anchor tag to inline only 2015-02-28 20:30:56 +01:00
Geoff McLane 4b2943edb3 Issue #162 - fix for this while hopefully maintaining #111 fix.
The fix for #111 added an end tag for all StartEnd tags, when outputting
HTML5, but there should be some exceptions to this.

Added a new service, isVoidElement(node) for the void elements. Perhaps
this service could be further optimised.
2015-02-24 17:51:59 +01:00
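A void-element check of the kind this commit introduces might look like the C sketch below. It is not Tidy's `isVoidElement(node)`: that service takes a node, while this hypothetical `is_void_element` takes a lowercase tag name to stay self-contained. The list follows the void elements named in the HTML standard; the list Tidy actually uses may differ.

```c
#include <string.h>

/* Hypothetical name-based check, not Tidy's isVoidElement(node).
   Returns 1 if name (lowercase) is an HTML void element, which never
   takes an end tag, and 0 otherwise. */
static int is_void_element(const char *name)
{
    static const char *const void_names[] = {
        "area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"
    };
    size_t i;

    for (i = 0; i < sizeof void_names / sizeof void_names[0]; ++i)
        if (strcmp(name, void_names[i]) == 0)
            return 1;
    return 0;
}
```

A printer could consult such a check before emitting an end tag for a self-closing tag, so that `<br />` is not followed by `</br>` while a self-closed non-void element still gets its end tag.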
Geoff McLane cfffe7765f Issue #166 - repeated main element.
With this fix introduced two new services, FindNodeById and
FindNodeWithId. The former does a total tree search for a TidyTagId.

Maybe there is a way to optimise this search...

Also change the uint badForm from an on/off to a bit field, so could be
extended to other document format errors.
2015-02-24 15:04:19 +01:00
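The change of `badForm` from an on/off value to a bit field can be sketched in C as follows. The flag names and helper functions here are hypothetical and are not taken from the Tidy source; they only show how one unsigned integer can record several independent document format errors.

```c
/* Hypothetical flag values: each error kind gets its own bit. */
#define DOC_BAD_FORM 0x1u   /* e.g. a misplaced form element */
#define DOC_BAD_MAIN 0x2u   /* e.g. a repeated main element */

/* Record an error without disturbing flags that are already set. */
static void flag_error(unsigned int *badForm, unsigned int flag)
{
    *badForm |= flag;
}

/* Test for one specific error. */
static int has_error(unsigned int badForm, unsigned int flag)
{
    return (badForm & flag) != 0;
}
```

Existing tests of the form `if (badForm)` keep working, since any set bit still makes the value non-zero, and further error kinds only need a new bit.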
Geoff McLane a5629443e6 Just improve some debug output 2015-02-24 13:20:26 +01:00
Geoff McLane 70d7e58d8d Add macro nodeIsMAIN(n) 2015-02-22 20:53:14 +01:00
Geoff McLane 0aa81eb256 Issue #130 - MathML attr and entity fix!
This is a set of kludgy fixes for MathML attribute and entities support.

It is intended that a full HTML5 entity table be added at some time, but
at present ALL entities are accepted as written when within the math
element.

Likewise all attributes are accepted on MathML elements without any check
of their name or value, even if they match attributes outside MathML.

And in the pprinter such entities are written as is from the lexer, using
a new PPrintMathML service added, using the new mode OtherNameSpace.

It is hoped all these fixes will NOT affect tidy outside the math element.

ALL fixes in the set are clearly marked '#130 - MathML attr and entity fix!'
for easy searching, and improving if possible.
2015-02-22 18:58:55 +01:00
Frédéric Wang fe51244d4a Use HTML5 mapping for entities &lang; and &rang; (http://www.w3.org/TR/xml-entity-names/#diff-xhtml1). #130 2015-02-21 19:33:24 +01:00
Geoff McLane b144b834cd Add a show_all_nodes debug service 2015-02-19 19:14:40 +01:00
Geoff McLane 6e3b293985 Issue #130 - Add TidyAttr_DISPLAY for math tag 2015-02-13 18:37:07 +01:00
Jim Derry 04f68a032f Changed text to point to html-tidy.org 2015-02-13 19:17:25 +08:00
Geoff McLane cff3fdd308 Issue #133 - hopefully a better fix.
As predicted the previous fix had adverse consequences on say script text,
which then lost the indent, and was reverted.

This introduces a new service, nodeIsTextLike, which naturally returns yes
if it is text, but also is an AspTag.

Maybe other text like nodes need to be added.
2015-02-12 15:24:38 +01:00
Geoff McLane 5d2cbd10dc Revert "Issue #133 - ever increasing indent!"
This reverts commit 0f80c08355.

This commit had other BAD consequences
2015-02-12 14:56:51 +01:00
Geoff McLane ea50bd30e7 add comment only for potential fix of Issue #8 2015-02-10 15:32:05 +01:00
Geoff McLane 279c29bf8d Issue #20 - revert to 880221e fix this issue 2015-02-10 15:28:56 +01:00
Geoff McLane cbd9eb7903 Issue #155 - issue warnings unless --show-body-only yes 2015-02-07 13:56:13 +01:00
Geoff McLane b26291ec6b Issue #151 - Initial implementation of picture element.
TODO: check, verify the picture attribute list.
2015-02-07 13:42:22 +01:00
Geoff McLane d72e681d32 Issue #152 - add srcset and sizes to img tag 2015-02-06 19:24:04 +01:00
Geoff McLane 2172a498f6 Issue #153 - fix for endif section not conforming to what tidy expects 2015-02-05 19:01:34 +01:00
Geoff McLane 66951a562a add row/col to DEBUG output 2015-02-05 18:24:02 +01:00
Geoff McLane 1be5ccbb63 Issue #130 - initial MathML support 2015-02-05 12:21:08 +01:00
Geoff McLane 698396eaa0 Issue #149 - avoid crash on null attr value 2015-02-03 13:38:20 +01:00
Geoff McLane 885c7caab7 Issue #70 - Initial implementation of SVG support.
An immense thanks to Ger Hobbelt who had already done this
in his github.com/GerHobbelt/htmltidy fork.

The two sources have diverged so was not a simple cut
and paste. But again thanks Ger for this.
2015-02-02 17:36:27 +01:00
Geoff McLane 63c6671f59 Issue #126 - partial fix for indenting style 2015-02-01 18:35:28 +01:00
Geoff McLane 7a3afd41ca Issue #132 - avoid warning if configured to omit-optional-tags.
Difficult decision to avoid head warning.
2015-02-01 16:05:39 +01:00
Geoff McLane 3e0cfbb88d Merge branch 'develop-500' of github.com:htacg/tidy-html5 into develop-500 2015-02-01 14:40:30 +01:00
Geoff McLane 0f80c08355 Issue #133 - ever increasing indent!
This is a simple but profound change in pprint.c.
Since leading space is preserved on script code, after tidy indents
the code once, a second run on that tidied file would add more indent
to already indented code. This fix should be carefully checked,
and removed if there are other bad consequences.
Bump the version point to 4 for this change.
2015-02-01 14:34:22 +01:00
Jim Derry 3c721d8bf8 #72, avoid showing some warnings when show-body-only. 2015-02-01 21:16:17 +08:00
Jim Derry e6930feb02 Fixes #78 by treating canvas as a block level tag. 2015-02-01 19:28:38 +08:00
Jim Derry 6f9dc2327e Fix for #103 - don't drop empty dd tags. 2015-02-01 18:46:31 +08:00
Jim Derry 5897b66e1e Fixed #42. 2015-02-01 18:09:17 +08:00
Jim Derry de97628f8f Deprecated tidyReleaseDate(). Returns epoch time (tdb). Removed dates from help, manpage, output, cmake, etc. 2015-02-01 14:20:41 +08:00
Geoff McLane 0eb1407818 Issue #102 - fix this regression bug 2015-01-29 18:25:19 +01:00
Geoff McLane 85070acd8c fix windows DLL build 2015-01-28 17:15:44 +01:00
Geoff McLane 59974e0bb0 When adding meta element for Tidy prefer library version over date 2015-01-28 12:15:27 +01:00
Geoff McLane ec4d4cd1f1 Issue #92 - OLD problem of ins and del
These are marked as CM_INLINE, but also CM_BLOCK,
so should not be stacked for insertion
2015-01-28 11:50:06 +01:00
Geoff McLane f83c604ad0 add some more messy debug output 2015-01-27 18:07:56 +01:00
Geoff McLane 9fb90f55d4 Oops, that fix should be after a potential new line in head 2015-01-26 12:23:41 +01:00
Geoff McLane 813f1c8a13 Issue #56 - long outstanding bug on script tag
Added a PCondFlushLine before emitting the script tag
Certainly looks better, but need to check for any regression
2015-01-25 21:03:16 +01:00
Geoff McLane f1297f718c remove a msvc warning 2015-01-25 20:18:36 +01:00
Geoff McLane 60d8271ecf Issue #132 - no warning when inserting a BODY tag 2015-01-25 20:10:41 +01:00
Geoff McLane 27bc767325 Fix some more info messages 2015-01-24 16:39:57 +01:00
Geoff McLane 4a3f5ecf07 make this version 5.0.0 2015-01-22 13:40:50 +01:00
Geoff McLane ce7c70532c Remove what seems like a regression 2015-01-20 18:10:57 +01:00
Geoff McLane f884da577d Small fix for TidyAttr_ASYNC to CH_BOOL 2015-01-20 18:10:13 +01:00
Geoff McLane 3f46000197 add allowfullscreen attribute exception 2015-01-18 20:59:27 +01:00
Geoff McLane 82fc656863 exception attr tabindex can begin with '-' 2015-01-18 14:46:12 +01:00
Jim Derry b8608380a2 Add support for 'role' attribute. #115 2014-11-22 20:44:38 +08:00
Jim Derry 08d490f406 Print end tag for self-closing tag #113 2014-11-22 20:08:00 +08:00
Jim Derry edb185a308 Use a hash table for anchors #64 2014-11-22 19:39:06 +08:00
Jim Derry 9a0b05cb69 Added HTML Microdata (itemprop, etc.) support. 2014-11-22 19:32:30 +08:00
Jim Derry 7754802884 Updated Aria attributes to geoffmcl's added tags; added missing aria-orientation. 2014-11-22 17:39:17 +08:00
Jim Derry e4f7aa0748 Merged changes from andrewle, #96, support Aria attributes. 2014-11-22 17:38:22 +08:00
Jim Derry e279302eaf Added TidyReportFilter2 2014-11-22 15:43:11 +08:00
Jim Derry 6aaf826476 Restart with geoffmcl's fork 2014-11-22 15:42:28 +08:00
Jim Derry 3bcf49cfde Preserve for both filters. 2014-04-27 18:16:13 +08:00
Jim Derry f3cc89d234 Correctly use a copy of args so both filters can work. 2014-04-27 18:11:50 +08:00
Jim Derry 5786a5afb5 Implement TidyReportFilter2 using raw strings to support library localization. 2014-04-27 16:07:20 +08:00
Eberhard Beilharz 3d4c4021ae Print end tag for self-closing tag
When we output HTML5 and we encounter a self-closing tag we have to
output an end tag to be standards-conformant.

This fixes issue #111.
2014-01-09 17:38:41 +01:00
Andrew Le 27d8ca6a69 Add support for aria attributes
Reference: http://dev.w3.org/html5/markup/aria/aria.html#aria-attrs-all
2013-06-28 16:50:54 -07:00
Peter Kelly 7fc3255542 Applied hash table optimisation to RemoveAnchorByNode. This function now takes
the anchor name as a parameter, so it can look in the correct bin.

In the case of FreeAttrs, we have the name already (since we found a name or
id attribute). In the case of FixAnchors, the anchor name could come from
either the name or id attribute, so we call the function separately for each
case, passing the appropriate attribute value.
2012-08-20 10:06:30 +07:00
Peter Kelly 11a8648818 Use a hash table for anchors 2012-08-20 00:29:16 +07:00
Craig Barnes ce27a729dc Remove CVS info blocks 2012-08-08 17:27:29 +01:00
Michael[tm] Smith 1b4dcd0e43 Update version. 2012-07-21 14:23:38 +09:00
Michael[tm] Smith c63cc3924b Allow block content in address. Fixes #55.
Thanks Craig Barnes.
2012-07-21 14:17:13 +09:00
Michael[tm] Smith fad2769449 Minor cleanup. 2012-07-21 13:37:07 +09:00
John Weldon 46e8e9d254 Better fix than 0d41d42, the gdoc.(c|h) files weren't included in the msvc2010 project 2012-07-12 10:41:05 -07:00
John Weldon 379a54c49d Merge branch 'master' of https://github.com/w3c/tidy-html5 2012-07-12 10:18:55 -07:00
John Weldon 0d41d4247e Fix tidylib.obj : error LNK2019: unresolved external symbol _prvTidyCleanGoogleDocument referenced in function _tidyDocCleanAndRepair
Signed-off-by: Michael[tm] Smith <mike@w3.org>
2012-07-06 15:47:24 +09:00
Michael[tm] Smith 6ef8d92b0f Bring docs up to date with code. Fixes #41.
Thanks ralfjunker.
2012-07-04 16:35:34 +09:00
John Weldon 046a40da34 Fix tidylib.obj : error LNK2019: unresolved external symbol _prvTidyCleanGoogleDocument referenced in function _tidyDocCleanAndRepair 2012-07-03 12:10:53 -07:00
Michael[tm] Smith a61504c57a Merge pull request #40 from stevenle/master
Remove WbrToSpace from being cleaned
2012-06-26 20:01:02 -07:00
Steven Le d942983fb0 Remove WbrToSpace since <wbr> is HTML5 valid. 2012-06-22 10:35:43 -07:00
Michael[tm] Smith 5220d8fe2e Minor quickref formatting problem. 2012-06-20 17:08:44 +09:00
Michael[tm] Smith 68a9e741a1 Merge pull request #39 from stevenle/master
Remove <wbr> as a proprietary tag.
2012-06-20 01:05:42 -07:00
Michael[tm] Smith 664f95ba5b Update the generated API docs. 2012-06-20 16:58:34 +09:00
Michael[tm] Smith a772bbb17f Let's actually commit the -gdoc feature this time. 2012-06-20 16:55:42 +09:00
Michael[tm] Smith 45fce5e3c2 Merge branch 'master' of github.com:w3c/tidy-html5
Forgot to pull.
2012-06-20 16:51:43 +09:00
Michael[tm] Smith 09e310b50c -gdoc opt, clean Google Docs HTML; fr Dave Raggett 2012-06-20 16:48:12 +09:00
Steven Le 57a98b97b6 Changed W3CAttrsFor_VIDEO -> W3CAttrsFor_WBR 2012-06-19 15:42:48 -07:00
Steven Le e8013a74b8 Remove <wbr> as a proprietary tag. 2012-06-19 09:39:06 -07:00
Joaquin Cuenca Abela f91a020894 Accept HTML5 input types. 2012-06-07 16:54:04 +02:00
Michael[tm] Smith f212c3f380 Update version. 2012-04-26 00:31:18 +09:00
Ryan Kistner 8025154e30 Check for NULL pointer before calling strcasecmp in GetVersFromFPI 2012-04-23 11:51:58 +09:00
Michael[tm] Smith 869ab4a4e5 Fix C90 compatibility issue. 2012-04-10 14:57:15 +09:00
Michael[tm] Smith 666d5bb1f4 Another minor editorial change. 2012-04-07 16:49:19 +09:00
Michael[tm] Smith d193420729 Minor editorial update. 2012-04-07 16:41:06 +09:00
Michael[tm] Smith 4408516c32 Clarify doc for wrap-attributes option. 2012-04-07 16:33:10 +09:00
Michael[tm] Smith c66f165f00 Don't line-wrap title attr. Thx Oliver Prygotzki.
Fixes #28.
2012-04-07 15:58:28 +09:00
Michael[tm] Smith d194e8726e Added show-info option. Fixes #6. 2012-04-02 16:41:05 +09:00
Michael[tm] Smith 880221e26d Don't eat whitespace after CM_MIXED end tags.
Fixes #20.
2012-03-29 14:00:24 +09:00
Michael[tm] Smith 1f2162553a Back out fix for #20. 2012-03-29 12:17:13 +09:00
Michael[tm] Smith 895fbb13f0 Minor rewording. 2012-03-24 19:08:04 +09:00
Michael[tm] Smith 2cd21a6693 Added omit-optional-tags option. Fixes #22.
Thanks towolf.
2012-03-24 19:04:46 +09:00
Michael[tm] Smith f5c273910c Merge branch 'master' of github.com:w3c/tidy-html5 2012-03-23 23:52:01 +09:00
Michael[tm] Smith 87d8cb5281 Don't hoist style into head. Relates to #23. 2012-03-23 23:50:41 +09:00
Thies C. Arntzen a3d49a7143 fix <a>hallo</a> world WS destruction 2012-03-23 22:08:11 +09:00
Michael[tm] Smith 4ff3234431 List new options on the index page. 2012-03-17 17:24:01 +09:00
Michael[tm] Smith ddb5702a08 Point to http://w3c.github.com/tidy-html5/ where appropriate. 2012-03-17 17:07:48 +09:00
Michael[tm] Smith 1052c2b81e New merge-emphasis & coerce-endtags options added.
Fixes #19.
2012-03-17 16:26:41 +09:00
Michael[tm] Smith 3ed33a1823 Merge in TidyAttr_XML_LANG change. 2012-03-16 20:55:59 +09:00
Michael[tm] Smith 3a9a794d8b Minor formatting edit. 2012-03-15 14:12:41 +09:00
Michael[tm] Smith 0c8b587067 Added --doctype=html5 option value. Fixes #17. 2012-03-15 14:11:01 +09:00
Michael[tm] Smith a1bb2d24b1 Updated version and quickref. 2012-03-15 11:15:11 +09:00
Michael[tm] Smith bf1c2f67a9 Added drop-empty-elements options. Fixes #19. 2012-03-15 10:58:10 +09:00
Michael[tm] Smith 5b9d25dcf9 source in video. Fixes #8. Fixes #9. 2012-03-15 10:30:11 +09:00
Michael[tm] Smith 879e6cf909 Put "experimental" in the meta@generator output too. 2012-03-14 20:05:12 +09:00
Michael[tm] Smith c331917c31 Make the doctype handling work the way it should. 2012-03-14 19:38:18 +09:00
Michael[tm] Smith 40f486ce5a Force-remake of the README.md file. 2012-03-01 23:00:01 +09:00
Michael[tm] Smith 701a17400a Made minor build changes. 2012-03-01 18:17:51 +09:00
Michael[tm] Smith ccf2a6c7fe Added the API docs. 2012-03-01 17:54:20 +09:00
Michael[tm] Smith 1c4d43ad2a Deal with version reporting better. 2012-03-01 17:22:03 +09:00
Michael[tm] Smith 47ef78487d embed, keygen & wbr are not proprietary. 2012-02-29 01:19:37 +09:00
Michael[tm] Smith f4edfc693b Merge pull request #10 from stevenle/master
Empty <progress> tags no longer stripped. Fixes #10.
2012-02-28 08:18:25 -08:00
John Schember 8727af8a7c Fix format string warnings. 2012-02-26 11:53:53 -05:00
Steven Le 722ae0b360 Empty <progress> no longer stripped. 2012-02-24 16:26:09 -08:00
Steven Le b554dc12ef Empty <progress> no longer stripped. 2012-02-24 16:23:32 -08:00
Dominique Hazael-Massieux f6a3bbecdb fix for ISSUE #7: empty canvas no longer stripped 2012-02-24 13:31:23 +01:00
Michael[tm] Smith b0997b2c48 Allow the <a> element to contain block content.
Thanks Steven Le.
http://lists.w3.org/Archives/Public/html-tidy/2012JanMar/0017.html
2012-02-19 20:04:23 +09:00
Michael[tm] Smith 0dbac2535b Update the man page a bit. 2012-02-18 19:40:37 +09:00
Michael[tm] Smith 184f411544 Update general-info message tidy emits at end. 2012-02-18 18:17:43 +09:00
Michael[tm] Smith 6c9895de30 Make UTF-8 the default. 2012-02-16 12:07:03 +09:00
Michael[tm] Smith 4ad0d1f2f7 Don't emit errors for void elements. 2012-02-16 11:39:54 +09:00
Michael[tm] Smith 73834b8412 Correct meta@name=generator output. 2012-02-10 15:40:33 +09:00
Michael[tm] Smith b26db41c86 Do not mess with <!doctype html>. Fixes #2. 2012-02-10 15:33:21 +09:00
Michael[tm] Smith 33ba8038fd Allow noscript in head, & meta + link in head. 2012-02-10 14:44:18 +09:00
Michael[tm] Smith 264c9bc043 HTML IDs can contain anything except whitespace.
Introduced TY_(IsHTMLSpace)(uint c), which checks to see if c is one of the
chars that the HTML spec (and browsers) treat as a space in attribute
values: 0x020 (space), 0x009 (tab), 0x00a (LF), 0x00c (FF), or 0x00d (CR).
Can't use ANSI C isspace(int c) here because, like the standard functions of
many other languages, it also treats 0x00b (vertical tab) as a space.
2012-01-02 16:12:51 +09:00
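
The check described in the entry above can be sketched as below. The commit message gives only the name TY_(IsHTMLSpace) and its uint parameter; the function name is_html_space, the int return type, and the body here are assumptions made for illustration, and the actual Tidy source may be written differently. The self-check function contrasts the result with isspace() for 0x0b, assuming the default "C" locale.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical stand-in for TY_(IsHTMLSpace): nonzero only for the five
   characters the commit message lists as HTML spaces. */
static int is_html_space(unsigned int c)
{
    return c == 0x20    /* space */
        || c == 0x09    /* tab */
        || c == 0x0a    /* LF */
        || c == 0x0c    /* FF */
        || c == 0x0d;   /* CR */
}

/* Usage example: the five listed characters are accepted, while 0x0b
   (vertical tab) is rejected even though isspace() accepts it. */
static void html_space_self_check(void)
{
    assert(is_html_space(' '));
    assert(is_html_space('\t'));
    assert(is_html_space('\n'));
    assert(is_html_space('\f'));
    assert(is_html_space('\r'));
    assert(!is_html_space(0x0b));
    assert(isspace(0x0b));
    assert(!is_html_space('a'));
}
```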
Michael[tm] Smith c1be54071d Make action not required on form. 2011-12-10 12:21:10 +09:00
Michael[tm] Smith 4fdc30c120 summary attribute is not required on table in HTML5 2011-11-21 12:34:05 +09:00
Michael[tm] Smith 34305a13d1 report missing href & rel for link elements 2011-11-20 20:58:35 +09:00
Michael[tm] Smith 2144093509 script does not require a type attribute 2011-11-20 19:42:54 +09:00
Michael[tm] Smith 6c1695fb5a style doesn't need type; meta doesn't need content 2011-11-17 16:06:29 +09:00
Michael[tm] Smith b92d7aab88 new 2011-11-17 11:44:16 +09:00