Commit graph

53 commits

Author SHA1 Message Date
Jim Derry 64fb5640cb MSVC snuck in some tab characters... 2017-09-22 19:27:47 -04:00
Jim Derry c579d5b62c Address #412
Add a TidyInfo message each time an unquoted attribute is found. However,
refer to #412 for discussion before merging this.
2017-09-22 19:01:31 -04:00
Jim Derry cf6f47ca1c Squelch some MSVC 2010 warnings, and reset indentation in tidy.c. No version bump. 2017-09-22 17:27:49 -04:00
Jim Derry 51e2e0f3bd Following the example of the recent changes in the "reports" aspect of Tidy's
output, classify and organize all of the dialogue type of messages. This paves
the way towards formalizing (and expanding!) the footnotes system with much
greater explanatory text, as well as providing much better fine-grained control
over which types of output that Tidy will produce.

Moved STRING_DOCTYPE_GIVEN, STRING_CONTENT_LOOKS, and STRING_NO_SYSID to the
Report paradigm from the Dialogue paradigm, as these are items that are
traditionally TidyInfo and included in the Report table, rather than any type
of dialogue.

At this point, we are exactly passing all tests.
2017-09-19 13:52:27 -04:00
Jim Derry 4509695445 Updated documentation in file.
Simplified the update counting.
2017-09-06 21:25:19 -04:00
Jim Derry d8220c061f Updated the remaining items, including all of the accessibility module items.
Note that there are several regressions in the accessibility test suit that
are not related to output messages. These are a result of previous work, and
these results should be updated in the test suite when this item is merged.
2017-09-04 17:35:57 -04:00
Jim Derry 832b4772ad A bit of organizational cleanup. 2017-09-04 16:49:49 -04:00
Jim Derry bc4388e317 Migrated surrogate errors; removed break after return. 2017-09-04 16:38:07 -04:00
Jim Derry 5b6edb5813 EncodingWarning and MissingAttr migrated. 2017-09-04 16:12:01 -04:00
Jim Derry f49c419908 Implement formatter for encoding reports. 2017-09-04 15:50:45 -04:00
Jim Derry 8cb4198724 Entity errors migrated. 2017-09-04 15:28:08 -04:00
Jim Derry 18754c701d Transitioned formatCustomTagDetected to the general formatter. 2017-09-04 11:44:54 -04:00
Jim Derry e3893eb8b3 Also merged reportBadArgument into standard formatter as above. 2017-09-04 11:40:34 -04:00
Jim Derry be22ad3d03 Move file errors into the standard formatter. Local context is preserved with
braces to not pollute stack for other cases.
2017-09-04 11:35:49 -04:00
Jim Derry 283f8974c3 Migrated reports using formatFileError and formatStandard to flexible messaging system. Migrated old reportNotice() to report(). 2017-09-04 11:24:48 -04:00
Jim Derry 66e4d1f8e6 Migrated reports using formatter formatCustomTagDetected. 2017-09-02 18:04:51 -04:00
Jim Derry 0c8f684a4b Migrated messages using formatter formatBadArgument to new message system. All tests passing. 2017-09-02 18:00:46 -04:00
Jim Derry 46aa9605ee All reports that can use formatAttributeReport are now using it. Moved the
badAccess flag to the point of detection.
2017-09-02 17:29:56 -04:00
Jim Derry 00178113c8 A *complete* inventory of every message has been completed, and the dispatchTable
reflects such. Some fleshed in report formatters are included with cases for
several of Tidy's reports, but nothing is yet enabled. All reporting is status
quo, and this is just a bunch of dead code at this point.
2017-09-02 16:47:14 -04:00
Jim Derry 83263466f2 Cleanup ReportNotice() a bit by introducing an HTMLVersion() function. 2017-09-02 12:54:02 -04:00
Jim Derry 951ed381a3 Restore message logic. No bump. 2017-08-31 13:45:01 -04:00
Jim Derry e5a05ae5a8 Address merge conflicts. 2017-08-31 13:15:28 -04:00
Jim Derry 2c82cfa23b Inventoried current error strings, and removed/commented out several:
- BAD_COMMENT_CHARS
  - BAD_XML_COMMENT
  - DTYPE_NOT_UPPER_CASE
  - ENCODING_IO_CONFLICT
  - INCONSISTENT_NAMESPACE
  - INCONSISTENT_VERSION
  - INDICATE_CHANGES_IN_LANGUAGE
  - UNESCAPED_ELEMENT
  - XML_ATTRIBUTE_VALUE
Re-sorted new tidy options.
All tests passing.
Bump version to reflect strings that are externally accessible to API.
2017-08-31 12:57:58 -04:00
Jim Derry 38814f9e3b Sort message labels for simpler inventorying. 2017-08-31 10:57:54 -04:00
Jim Derry e1cbafd647 Handle message outlook properly in messageOut(). 2017-08-31 10:44:16 -04:00
Jim Derry e5eb09198d Begin migration towards "one output function to rule them all." Consolidated
the basic reporting functions that share the same signature. This also resulted
in eliminating a string, and adding a new string to disambiguate between
errors and warnings.
2017-08-30 20:01:44 -04:00
Jim Derry 1562c42c2e Merge branch 'next' into issue-456
Manually fixed merge commits.
2017-08-28 15:17:10 -04:00
Geoff McLane 50859e8258 Issue #567 - add option, messages, and fix node iteration.
Add option TidyStyleTags, --fix-style-tags, Bool, to turn off
this action.

Add warning messages MOVED_STYLE_TO_HEAD, and FOUND_STYLE_IN_BODY.

Fully iterate ALL nodes in the body, in search of style tags...

Changes to be committed:
	modified:   include/tidyenum.h
	modified:   src/clean.c
	modified:   src/config.c
	modified:   src/language_en.h
	modified:   src/message.c
2017-06-28 20:41:46 +02:00
Geoff McLane 97292646f6 Issue #456 - Add 'Info:' message when charset replaced 2017-06-05 17:16:53 +02:00
Geoff McLane a4770daa2b Issue #456 - Add 'Info:' message, when meta added.
It also fixes the addition of the constant 'http-equiv="Content-Type"
attribute.
2017-06-04 20:44:02 +02:00
Geoff McLane eb127a5c5b Issue #550 - K&R/MSVC10 fix - message.c 2017-05-30 18:14:58 +02:00
Geoff McLane ec03beb361 Issue #552 - remove no 'case default:' warning in most gcc versions
Seems too small for a version bump. Closes #552
2017-05-19 18:38:01 +02:00
Jim Derry 7112fba553 Merge pull request #549 from htacg/issue_391
Address #391. Tested on macOS and Win10.
2017-05-11 15:24:44 -04:00
Jim Derry ce105dcf09 Address #391. Tested on macOS and Win10.
- Add a check upon opening a file for validity of the file.
- Add a new message to indicate that the path is not a file.
2017-05-07 17:04:53 -04:00
Jim Derry fd77312175 Attempt to address issue #352. This patch correctly address the specific issues
in #352, but I'm worried that there's some over-reach here.

Currently only implemented as a warning, with no switch to turn it off, which
maintains current behavior other than the warning.

In general, we're treating any string as a complete URL, rather than breaking
URL's into component parts. Thus the `IsURLCodePoint()` check includes a few
other generic characters that strictly speaking aren't valid codepoints, but
are valid as escape characters and delimiters.

When addressing #338, I ran into a similar situation in not having a built-in
method to separate path components (although a simple generalized solution was
good enough in that case).

Thus without introducing a new structure and functions to deconstruct a URL
into scheme, authority, path, parameters, etc., some variation of this patch
will have to be used to address #352.
2017-05-06 18:54:42 -04:00
Geoff McLane 219a5c797b Issue #524 - Remove obsolete message 2017-04-09 02:08:03 +02:00
Jim Derry 5f05add439 Continue the documentation effort!
- Many, many updates to the public header files.
- tidyenum.h was reorganized substantially in order to better generate
  documentation with Doxygen.
- This was also a good time to clean up all of the various enums for languages
  and strings. Everything is simple and in a single enum now, other than a
  couple of cases (TidyOptionId, for example, doesn't need to be redefined).
- A full and complete audit of the strings meant some opportunities to delete
  useless strings.
- Reorganized the order of the strings in language_en.h in order to better
  find things when programmers want to make changes. There are a lot fewer
  internal "sections" now, and everything has been painstakingly sorted within
  the remaining sections.
- Consequently rebased all of the PO's, POT, and other language files.
- Updated several of the READMEs with the newest information.
- Made the READMEs easier to copy into the Doxygen project by changing some of
  the code format for compatibility, mainly the use of tildes instead of
  backslashes for code blocks.
- Added tidyGetMessageCode() to message API. Despite the huge diff, this is the
  only externally-visible change, other than removing some enums (but not their
  values!).
- Passing `next` tests on Mac, Linux, Win10.
2017-03-22 16:05:13 -04:00
Jim Derry 929575afb4 Picklist enums should all start at zero for external LibTidy user compatibility.
Restore the new custom-tags enum to this state, and add separate string keys.
Updated PO's to reflect; no change to headers.
2017-03-20 12:20:51 -04:00
Jim Derry a4f752f274 Implement TODO:
- tidyDetectedHtmlVersion()
- tidyDetectedXhtml()
- added two new fields to W3C_Doctypes[] in order to simplify this.
- added TY_(HTMLVersionNumberFromCode)() to enable lookup.
- Implement tidyDetectedGenericXml()
- Added a warning message if an XML declaration exists but the document is not
  XHTML.
- Remove dead commented code.
- Updated POs and POT. Headers not affected, but translators should check
  their translations.
- Testing is clean on Mac OS X, Ubuntu 16.04, and Windows 10.
2017-03-19 15:41:51 -04:00
Jim Derry 13122e8862 Added tidyErrorCodeFromKey()
Added tidyGetMessageDoc()
Improved the Public API documentation.
2017-03-19 08:15:32 -04:00
Jim Derry 0c5550b06f I think the messages are where I want them to be. Will generate test cases
for comparison. Also regen'd all pots and language headers.
2017-03-15 17:36:05 -04:00
Jim Derry 5606f32f13 WIP; messaging much more logical, except @todo noted. 2017-03-14 21:50:10 -04:00
Jim Derry 66de84bc2b - Add support for the is attribute.
- Add support for autonomous custom elements.
2017-03-13 13:45:32 -04:00
Jim Derry 11178d775b Massive Revamp of the Messaging System
This is a rather large refactoring of Tidy's messaging system. This was done
mostly to allow non-C libraries that cannot adequately take advantage of
arg_lists a chance to query report filter information for information related
to arguments used in constructing an error message.

Three main goals were in mind for this project:

- Don't change the contents of Tidy's existing output sinks. This will ensure
  that changes do no affect console Tidy users, or LibTidy users who use the
  output sinks directly. This was accomplished 100% other than some improved
  cosmetics in the output. See tidy-html5-tests repository, the `refactor` and
  `more_messages_changes` branches for these minor diffs.
- Provide an API that is simple and also extensible without having to write new
  error filters all the time. This was accomplished by adding the new message
  callback `TidyMessageCallback` that provides callback functions an opaque
  object representing the message, and an API to query the message for wanted
  details. With this, we should never have to add a new callback routine again,
  as additional API can simply be written against the opaque object.
- The API should work the same as the rest of LibTidy's API in that it's
  consistent and only uses simple types with wide interoperability with other
  languages. Thanks to @gagern who suggested the model for the API in #409.
  Although the API uses the "Tidy" way off accessing data via an iterator
  rather than an index, this can be easily abstracted in the target language.

There are two *major* API breaking changes:

- Removed TidyReportFilter2
  - This was only used by one application in the entire world, and was a hacky
    kludge that served its purpose. TidyReportCallback (né TidyReportFilter3)
    is much better. If, for some reason, this affects you, I recommend using
    TidyReportCallback instead. It's a minor change for your application.
- Renamed TidyReportFilter3 to TidyReportCallback
  - This name is much more semantic, and much more sensible in light of
    improved callback system. As the name implies, it remains capable of
    *only* receiving callbacks for Tidy "reports."

Introducing TidyMessageCallback, and a new message interrogation API.

- As its name implies, it is able to capture (and optionally suppress) *all*
  of Tidy's output, including the dialogue messages that never make it to
  the existing report filters.
- Provides an opaque `TidyMessage` and an API that can be used to query against
  it to find the juicy goodness inside.
  - For example, `tidyGetMessageOutput( tmessage )` will return the complete,
    localized message.
  - Another example, `tidyGetMessageLine( tmessage )` will return the line the
    message applies to.
- You can also get information about the individual arguments that make up a
  message. By using the `tidyGetMessageArguments( tmessage )` itorator and
  `tidyGetNextMessageArgument` you will obtain an opaque `TidyMessageArgument`
  which has its own interrogation API. For example:
    - tidyGetArgType( tmessage, &iterator );
    - tidyGetArgFormat( tmessage, &iterator );
    - tidyGetArgValueString( tmessage, &iterator );
    - …and so on.

Other major changes include refactoring `messages.c` to use the new message
"object" directly when emitting messages to the console or output sinks. This
allowed replacement of a lot of specialized functions with generalized ones.

Some of this generalizing involved modifications to the `language_xx.h` header
files, and these are all positive improvements even without the above changes.
2017-03-13 13:28:57 -04:00
Jim Derry 165acc4f3e Several foundational changes preparing for release of 5.4 and future 5.5:
- Consolidated all output string definitions enums into `tidyenum.h`, which
    is where they belong, and where they have proper visibility.
  - Re-arranged `messages.c/h` with several comments useful to developers.
  - Properly added the key lookup functions and the language localization
    functions into tidy.h/tidylib.c with proper name-spacing.
  - Previous point restored a *lot* of sanity to the #include pollution that's
    been introduced in light of these.
  - Note that opaque types have been (properly) introduced. Look at the updated
    headers for `language.h`. In particular only an opaque structure is passed
    outside of LibTidy, and so use TidyLangWindowsName and TidyLangPosixName
    to poll these objects.
  - Console application updated as a result of this.
  - Removed dead code:
    - void TY_(UnknownOption)( TidyDocImpl* doc, char c );
    - void TY_(UnknownFile)( TidyDocImpl* doc, ctmbstr program, ctmbstr file );
  - Redundant strings were removed with the removal of this dead code.
  - Several enums were given fixed starting values. YOUR PROGRAMS SHOULD NEVER
    depend on enum values. `TidyReportLevel` is an example of such.
  - Some enums were removed as a result of this. `TidyReportLevel` now has
    matching strings, so the redundant `TidyReportLevelStrings` was removed.
  - All of the PO's and language header files were regenerated as a result of
    the string cleanup and header cleanup.
  - Made the interface to the library version and release date consistent.
  - CMakeLists.txt now supports SUPPORT_CONSOLE_APP. The intention is to
    be able to remove console-only code from LibTidy (for LibTidy users).
  - Updated README/MESSAGES.md, which is *vastly* more simple now.
2017-02-17 15:29:26 -05:00
Geoff McLane 7f73d4f429 Issue #483 - Add ReportSurrogateError() service and connect. 2017-02-11 18:33:45 +01:00
Martin von Gagern 04bc8d3195 Pair va_copy calls with va_end
According to the specs, each va_copy call should be matched by a va_end call
to ensure proper cleanup.  Furthermore, since message filters might iterate
over the list of arguments, we should hand a new copy to each filter.
2016-05-17 22:37:32 +02:00
Geoff McLane 5feca8cfd6 Issue #383 - correct another byte-by-byte output to message file.
As in the previous case these messages are already valid utf-8 text, and
thus, if output on a byte-by-byte basis, must not use WriteChar, except
for the EOL char.

Of course this output can be to either a user ouput file, if configured,
otherwise stderr.
2016-03-24 14:15:43 +01:00
Geoff McLane e6f1533d89 Issue #383 - Output message file text byte-by-byte 2016-03-18 18:47:00 +01:00
Jim Derry 2ade3357a9 Phase 2
This is a MUCH SANER approach to what I was trying to do (now that I screwed up enough internals to understand some of them!
At this point there are zero exit state reversions, and zero markup reversions! There are still 21 errout reversions; I'll
annotate and adjust as necessary.
2016-02-13 11:31:16 +08:00