Commit graph

113 commits

Author SHA1 Message Date
Geoff McLane 8843199370 Issue #456 - Merge branch 'meta-charset' of tidy-html5-marco.
This pulls the work done by @marcoscaceres WIP #458 into the issue-456
branch, to complete the new add-meta-charset option.
2017-05-13 16:02:26 +02:00
Jim Derry 29766afcfd Initial take on issue 365. This is based off of the simplification of the
parser and picklist system. Console application needs to be updated to fix
the description, as it shows autobool, and for some reason on the current
system I'm not getting assertion failures.
2017-05-11 18:12:56 -04:00
Jim Derry 7112fba553 Merge pull request #549 from htacg/issue_391
Address #391. Tested on macOS and Win10.
2017-05-11 15:24:44 -04:00
Jim Derry aeb9a24fab Refactor Picklists and Option Parsers
This PR refactors how picklists and option parsers are implemented in LibTidy,
making is vastly easier to implement new picklists in the future, as well as
modify some of the existing picklists such that they have more logical names.

Picklist arrays are now arrays of structures that include the possible strings
capable of setting a particular option value, and a new parser has been written
to work with these structures.

In addition, several of the existing parsers were removed, as they are now
redundant, and a couple of the remaining parsers were refactored to take
advantage of the new parser.

In effect, this means that:

- New parsers don't have to be written in the majority of cases where new
  options are added that exceed yes/no/auto.
- Some of the existing options can have more meaningful names than yes/no/auto,
  in a backward compatible way. For example, vertical-spacing "auto" currently
  in no way reflects "auto" when used.
2017-05-11 14:40:21 -04:00
Jim Derry acaab679c5 Merge pull request #547 from htacg/issue_352
Attempt to address issue #352.
2017-05-08 17:36:52 -04:00
Jim Derry ce105dcf09 Address #391. Tested on macOS and Win10.
- Add a check upon opening a file for validity of the file.
- Add a new message to indicate that the path is not a file.
2017-05-07 17:04:53 -04:00
Jim Derry fd77312175 Attempt to address issue #352. This patch correctly address the specific issues
in #352, but I'm worried that there's some over-reach here.

Currently only implemented as a warning, with no switch to turn it off, which
maintains current behavior other than the warning.

In general, we're treating any string as a complete URL, rather than breaking
URL's into component parts. Thus the `IsURLCodePoint()` check includes a few
other generic characters that strictly speaking aren't valid codepoints, but
are valid as escape characters and delimiters.

When addressing #338, I ran into a similar situation in not having a built-in
method to separate path components (although a simple generalized solution was
good enough in that case).

Thus without introducing a new structure and functions to deconstruct a URL
into scheme, authority, path, parameters, etc., some variation of this patch
will have to be used to address #352.
2017-05-06 18:54:42 -04:00
Jim Derry 09d1802298 Merge branch 'next' into deprecations 2017-05-06 14:34:48 -04:00
Jim Derry 49b833f63b WIP 2017-05-03 18:16:03 -04:00
lhchavez a19d271f47 Add a flag to warn on proprietary attributes
This change adds the TidyWarnPropAttrs flag (default=on) that emits a
warning every proprietary attribute it finds.
2017-04-15 03:17:16 +00:00
Jim Derry d1e0b22be7 Removed TidyDropFontTags. Note that POs and POT were _not_ updated. 2017-04-04 14:42:47 -04:00
Jim Derry 5f05add439 Continue the documentation effort!
- Many, many updates to the public header files.
- tidyenum.h was reorganized substantially in order to better generate
  documentation with Doxygen.
- This was also a good time to clean up all of the various enums for languages
  and strings. Everything is simple and in a single enum now, other than a
  couple of cases (TidyOptionId, for example, doesn't need to be redefined).
- A full and complete audit of the strings meant some opportunities to delete
  useless strings.
- Reorganized the order of the strings in language_en.h in order to better
  find things when programmers want to make changes. There are a lot fewer
  internal "sections" now, and everything has been painstakingly sorted within
  the remaining sections.
- Consequently rebased all of the PO's, POT, and other language files.
- Updated several of the READMEs with the newest information.
- Made the READMEs easier to copy into the Doxygen project by changing some of
  the code format for compatibility, mainly the use of tildes instead of
  backslashes for code blocks.
- Added tidyGetMessageCode() to message API. Despite the huge diff, this is the
  only externally-visible change, other than removing some enums (but not their
  values!).
- Passing `next` tests on Mac, Linux, Win10.
2017-03-22 16:05:13 -04:00
Jim Derry 929575afb4 Picklist enums should all start at zero for external LibTidy user compatibility.
Restore the new custom-tags enum to this state, and add separate string keys.
Updated PO's to reflect; no change to headers.
2017-03-20 12:20:51 -04:00
Jim Derry a4f752f274 Implement TODO:
- tidyDetectedHtmlVersion()
- tidyDetectedXhtml()
- added two new fields to W3C_Doctypes[] in order to simplify this.
- added TY_(HTMLVersionNumberFromCode)() to enable lookup.
- Implement tidyDetectedGenericXml()
- Added a warning message if an XML declaration exists but the document is not
  XHTML.
- Remove dead commented code.
- Updated POs and POT. Headers not affected, but translators should check
  their translations.
- Testing is clean on Mac OS X, Ubuntu 16.04, and Windows 10.
2017-03-19 15:41:51 -04:00
Jim Derry 89e4a0215a Fixed the enum, which has TidyCustomTags and TidyUseCustomTags documented backwards. 2017-03-18 11:20:56 -04:00
Jim Derry 0c5550b06f I think the messages are where I want them to be. Will generate test cases
for comparison. Also regen'd all pots and language headers.
2017-03-15 17:36:05 -04:00
Jim Derry 5606f32f13 WIP; messaging much more logical, except @todo noted. 2017-03-14 21:50:10 -04:00
Jim Derry 66de84bc2b - Add support for the is attribute.
- Add support for autonomous custom elements.
2017-03-13 13:45:32 -04:00
Jim Derry 11178d775b Massive Revamp of the Messaging System
This is a rather large refactoring of Tidy's messaging system. This was done
mostly to allow non-C libraries that cannot adequately take advantage of
arg_lists a chance to query report filter information for information related
to arguments used in constructing an error message.

Three main goals were in mind for this project:

- Don't change the contents of Tidy's existing output sinks. This will ensure
  that changes do no affect console Tidy users, or LibTidy users who use the
  output sinks directly. This was accomplished 100% other than some improved
  cosmetics in the output. See tidy-html5-tests repository, the `refactor` and
  `more_messages_changes` branches for these minor diffs.
- Provide an API that is simple and also extensible without having to write new
  error filters all the time. This was accomplished by adding the new message
  callback `TidyMessageCallback` that provides callback functions an opaque
  object representing the message, and an API to query the message for wanted
  details. With this, we should never have to add a new callback routine again,
  as additional API can simply be written against the opaque object.
- The API should work the same as the rest of LibTidy's API in that it's
  consistent and only uses simple types with wide interoperability with other
  languages. Thanks to @gagern who suggested the model for the API in #409.
  Although the API uses the "Tidy" way off accessing data via an iterator
  rather than an index, this can be easily abstracted in the target language.

There are two *major* API breaking changes:

- Removed TidyReportFilter2
  - This was only used by one application in the entire world, and was a hacky
    kludge that served its purpose. TidyReportCallback (né TidyReportFilter3)
    is much better. If, for some reason, this affects you, I recommend using
    TidyReportCallback instead. It's a minor change for your application.
- Renamed TidyReportFilter3 to TidyReportCallback
  - This name is much more semantic, and much more sensible in light of
    improved callback system. As the name implies, it remains capable of
    *only* receiving callbacks for Tidy "reports."

Introducing TidyMessageCallback, and a new message interrogation API.

- As its name implies, it is able to capture (and optionally suppress) *all*
  of Tidy's output, including the dialogue messages that never make it to
  the existing report filters.
- Provides an opaque `TidyMessage` and an API that can be used to query against
  it to find the juicy goodness inside.
  - For example, `tidyGetMessageOutput( tmessage )` will return the complete,
    localized message.
  - Another example, `tidyGetMessageLine( tmessage )` will return the line the
    message applies to.
- You can also get information about the individual arguments that make up a
  message. By using the `tidyGetMessageArguments( tmessage )` itorator and
  `tidyGetNextMessageArgument` you will obtain an opaque `TidyMessageArgument`
  which has its own interrogation API. For example:
    - tidyGetArgType( tmessage, &iterator );
    - tidyGetArgFormat( tmessage, &iterator );
    - tidyGetArgValueString( tmessage, &iterator );
    - …and so on.

Other major changes include refactoring `messages.c` to use the new message
"object" directly when emitting messages to the console or output sinks. This
allowed replacement of a lot of specialized functions with generalized ones.

Some of this generalizing involved modifications to the `language_xx.h` header
files, and these are all positive improvements even without the above changes.
2017-03-13 13:28:57 -04:00
Jim Derry 4dc8a2cf9a Bump version to 5.5.5 for this fiasco, and fix poor planning and unfortunate
merge.
  - Sort all of the existing options and re-indent per Tidy standards. This is
    simply for cosmetic effect.
  - Allow the iterator to return all options again, even "internal" options.
    Things are too embedded with N_TIDY_OPTIONS, etc., to try to hide them.
  - Instead, simply add documentation to LibTidy users that they shouldn't use
    internal options.
  - Also added `TidyInternalCategory` to `TidyConfigCategory` without adding a
    new field to the struct. API users should check for this category before
    use.
  - Defined a two character macro for `TidyInternalCategory` for use in
    `option_defs[]`.
  - Changed struct `option_defs[]` to reflect the new category for affected
    options.
  - Removed string indicating * refers to internal options, since it no longer
    applies.
  - Regen'd all strings for previous point.
  - `tidy.c` now checks for `TidyInternalCategory` everywhere in order to
    suppress output.
2017-03-10 09:13:21 -05:00
Jim Derry ac242e9ea4 hotfix 2017-03-09 19:56:16 -05:00
Jim Derry 005127c733 Address issue #472. 2017-03-08 15:37:01 -05:00
Jim Derry c54c10f857 - Removed deprecated options:
- TidySlideStyle
  - TidyBurstSlides

- Added documentation for TidyEmacsFile, since it's a valid option.

- Because TidyEmacsFile is a valid option, tweaked tidy.c so that it can
  be specified in a configuration file without being overwritten by the console
  app. Why a user might do this is dumb, but who are we to stop them.
2017-02-18 18:30:41 -05:00
Jim Derry edc548095c Removed language as tidy config option; it is only CLI option. 2017-02-18 17:16:35 -05:00
Jim Derry 165acc4f3e Several foundational changes preparing for release of 5.4 and future 5.5:
- Consolidated all output string definitions enums into `tidyenum.h`, which
    is where they belong, and where they have proper visibility.
  - Re-arranged `messages.c/h` with several comments useful to developers.
  - Properly added the key lookup functions and the language localization
    functions into tidy.h/tidylib.c with proper name-spacing.
  - Previous point restored a *lot* of sanity to the #include pollution that's
    been introduced in light of these.
  - Note that opaque types have been (properly) introduced. Look at the updated
    headers for `language.h`. In particular only an opaque structure is passed
    outside of LibTidy, and so use TidyLangWindowsName and TidyLangPosixName
    to poll these objects.
  - Console application updated as a result of this.
  - Removed dead code:
    - void TY_(UnknownOption)( TidyDocImpl* doc, char c );
    - void TY_(UnknownFile)( TidyDocImpl* doc, ctmbstr program, ctmbstr file );
  - Redundant strings were removed with the removal of this dead code.
  - Several enums were given fixed starting values. YOUR PROGRAMS SHOULD NEVER
    depend on enum values. `TidyReportLevel` is an example of such.
  - Some enums were removed as a result of this. `TidyReportLevel` now has
    matching strings, so the redundant `TidyReportLevelStrings` was removed.
  - All of the PO's and language header files were regenerated as a result of
    the string cleanup and header cleanup.
  - Made the interface to the library version and release date consistent.
  - CMakeLists.txt now supports SUPPORT_CONSOLE_APP. The intention is to
    be able to remove console-only code from LibTidy (for LibTidy users).
  - Updated README/MESSAGES.md, which is *vastly* more simple now.
2017-02-17 15:29:26 -05:00
Onni Hakala da27b5e339
Add optional xmlns:xlink attributes as valid to support inline svg
fixes #478
2017-01-09 01:38:16 +02:00
Marcos Caceres 169bd38adf Part 1 - Add basic infra for 'add-meta-charset' option 2016-10-04 17:56:29 +11:00
Geoff R. McLane 951a65bdcf Improve doxy comment formating - no code changes 2016-09-11 17:20:43 +02:00
Marcos Caceres e4ae9c064d Add support for link 'as' attribute (closes #449) 2016-08-23 18:46:04 +10:00
Benjamin Esham 54179386be Add support for the "integrity" attribute
This attribute may be used on "link" and "script" elements. See
http://www.w3.org/TR/2016/REC-SRI-20160623/#element-interface-extensions
2016-07-24 10:24:30 -04:00
Geoff McLane 8745f79177 Issue #418 - Ensure TidyAttrID enum exactly matches table.
The file tidyenum.h has an attribute ID enumeration that must exactly
match the attribute_defs[] table in attrs.c.

Originally some attempt was made to keep this enum in some sort of order
but that should now be totally abandonned. Any 'new' attribute
enumerations should be added just above the last N_TIDY_ATTRIBS, and
likewise in the table, to avoid this problem.
2016-07-01 15:52:02 +02:00
Benjamin Esham 9377f65f89 Add support for the HTML5 "crossorigin" attribute
This attribute can only be used on "link" elements.

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/link#Attributes
2016-06-07 22:20:10 -04:00
Geoff McLane 000c6925bd Issue #348 - Add option 'escape-script', def = yes 2016-03-20 01:01:46 +01:00
Jim Derry 429703dce4 Because the previous effort #350 grew too fast and there was a LOT of side effects to
my changes, I'm starting over with this. Comments in the PR thread.

This commit reduces the size of attrdict.c while causing only a single errout
regression that is justified.
2016-02-12 19:34:19 +08:00
Jim Derry 9ae15f45a7 Consistent tabs
Fixed tabs in template file, and regen'd all related files.
2016-01-30 15:51:54 +08:00
Jim Derry d505869910 Localization Support added to HTML Tidy
- Languages can now be added to Tidy using standard toolchains.
- Tidy's help output is improved with new options and some reorganization.
2016-01-30 15:51:53 +08:00
Shane McCarron c0b769c5c7 Initial cut at RDFa support (again)
New branch that implements support for RDFa attributes.  Should be
cleaner than my first attempt in PR #299 - also references issue #209
2015-11-16 11:29:23 -06:00
Geoff McLane c68ad42482 Revert 22a1922c35 2015-11-07 14:50:10 +01:00
Shane McCarron c572e3e3c8 Initial cut at supporting RDFa attributes. 2015-11-06 12:19:05 -06:00
Geoff McLane 800b91e576 Issue #65 - effect name change to skip-nested, and default to on 2015-11-05 15:19:39 +01:00
Geoff McLane d541405a2a Eventually complete a 2007 fix 2015-09-16 13:17:50 +02:00
Geoff McLane d923dd7b2d Issue #108 - first cut new option --indent-with-tabs yes. 2015-05-22 16:06:12 +02:00
Arnaud Lacombe c05661df11 Issue #199 - Add support for html5's template tag 2015-04-10 15:50:07 -07:00
Geoff McLane 6e3b293985 Issue #130 - Add TidyAttr_DISPLAY for math tag 2015-02-13 18:37:07 +01:00
Geoff McLane b26291ec6b Issue #151 - Initial implementation of picture element.
TODO: check, verify the picture attribute list.
2015-02-07 13:42:22 +01:00
Geoff McLane d72e681d32 Issue #152 - add srcset and sizes to img tag 2015-02-06 19:24:04 +01:00
Geoff McLane 1be5ccbb63 Issue #130 - initial MathML support 2015-02-05 12:21:08 +01:00
Geoff McLane 885c7caab7 Issue #70 - Initial implmentation of SVG support.
An immense thanks to Ger Hobbelt who had already done this
in his github.com/GerHobbelt/htmltidy fork.

The two sources have diverges so was not a simple cut
an paste. But again thanks Ger for this.
2015-02-02 17:36:27 +01:00
Jim Derry b8608380a2 Add support for 'role' attribute. #115 2014-11-22 20:44:38 +08:00
Jim Derry 9a0b05cb69 Added HTML Microdata (itemprop, etc.) support. 2014-11-22 19:32:30 +08:00
Jim Derry 7754802884 Updated Aria attributes to geoffmcl's added tags; added missing aria-orientation. 2014-11-22 17:39:17 +08:00
Jim Derry e4f7aa0748 Merged changes from andrewle, #96, support Aria attributes. 2014-11-22 17:38:22 +08:00
Jim Derry 6aaf826476 Restart with geoffmcl's fork 2014-11-22 15:42:28 +08:00
Andrew Le 27d8ca6a69 Add support for aria attributes
Reference: http://dev.w3.org/html5/markup/aria/aria.html#aria-attrs-all
2013-06-28 16:50:54 -07:00
Craig Barnes ce27a729dc Remove CVS info blocks 2012-08-08 17:27:29 +01:00
Michael[tm] Smith a772bbb17f Let's actually commit the -gdoc feature this time. 2012-06-20 16:55:42 +09:00
Michael[tm] Smith d194e8726e Added show-info option. Fixes #6. 2012-04-02 16:41:05 +09:00
Michael[tm] Smith 2cd21a6693 Added omit-optional-tags option. Fixes #22.
Thanks towolf.
2012-03-24 19:04:46 +09:00
Michael[tm] Smith 1052c2b81e New merge-emphasis & coerce-endtags options added.
Fixes #19.
2012-03-17 16:26:41 +09:00
Michael[tm] Smith 0c8b587067 Added --doctype=html5 option value. Fixes #17. 2012-03-15 14:11:01 +09:00
Michael[tm] Smith bf1c2f67a9 Added drop-empty-elements options. Fixes #19. 2012-03-15 10:58:10 +09:00
Michael[tm] Smith 34305a13d1 report missing href & rel for link elements 2011-11-20 20:58:35 +09:00
Michael[tm] Smith b92d7aab88 new 2011-11-17 11:44:16 +09:00