tidy-html5

Author	SHA1	Message	Date
Jim Derry	d4a11b553e	Merge pull request #577 from htacg/issue-572 Issue 572	2017-08-28 10:01:48 -04:00
Jim Derry	f4c64966f0	Added TidyConfigCallback and deprecated TidyOptCallback for consistency with the remainder of the callbacks. TidyConfigCallback is now given a reference to the instance of the TidyDoc that caused the callback to occur. + TidyConfigCallback	2017-08-26 12:47:18 -04:00
Geoff McLane	f7658b2c89	Issue #582 - Remove extra new line in 'classic' mode	2017-08-04 14:23:14 +02:00
Geoff McLane	09f1806834	Issue #572 - discard an all space text node. An earlier patch now passes back an all space text node. Previously this would have been skipped. So add code in ParseList to detect, and discard such a node. Change committed: modified: src/parser.c	2017-07-08 19:45:42 +02:00
Geoff McLane	f26a068809	Issue #572 - More conditions for #396	2017-07-02 21:10:20 +02:00
Geoff McLane	50859e8258	Issue #567 - add option, messages, and fix node iteration. Add option TidyStyleTags, --fix-style-tags, Bool, to turn off this action. Add warning messages MOVED_STYLE_TO_HEAD, and FOUND_STYLE_IN_BODY. Fully iterate ALL nodes in the body, in search of style tags... Changes to be committed: modified: include/tidyenum.h modified: src/clean.c modified: src/config.c modified: src/language_en.h modified: src/message.c	2017-06-28 20:41:46 +02:00
Geoff McLane	d4ca02adfb	Issue #567 - Branch 'issue-567-2' to move all 'style' to 'head'	2017-06-18 20:06:24 +02:00
Geoff McLane	b32e14a8ea	Issue #456 - add new option `show-meta-change`	2017-06-09 03:11:39 +02:00
Geoff McLane	97292646f6	Issue #456 - Add 'Info:' message when charset replaced	2017-06-05 17:16:53 +02:00
Geoff McLane	a4770daa2b	Issue #456 - Add 'Info:' message, when meta added. It also fixes the addition of the constant 'http-equiv="Content-Type" attribute.	2017-06-04 20:44:02 +02:00
Geoff McLane	13b34c9d8b	Issue #456 - BAH! Fix a stupid logic reversal	2017-06-04 15:41:16 +02:00
Geoff McLane	e28ec72301	Merge branch 'next' into issue-456 Continue WIP #456	2017-06-04 14:59:18 +02:00
Geoff McLane	eb127a5c5b	Issue #550 - K&R/MSVC10 fix - message.c	2017-05-30 18:14:58 +02:00
Geoff McLane	722a841ce2	Merge branch 'next' into issue-456 This was to pick up the fix for #395, PR #564, and bumps the version to 5.5.30...	2017-05-29 14:36:14 +02:00
Geoff McLane	4136d85a9c	Issue #395 , #564 - Oops, restore orig char if not closing	2017-05-29 14:26:55 +02:00
Geoff McLane	40e1d64963	Issue #456 - A desperate commit to get this WIP right, but...	2017-05-27 20:13:51 +02:00
Geoff McLane	8a932f96eb	Issue #456 - Oops, incorrect merge conflict	2017-05-27 18:52:49 +02:00
Geoff McLane	049bc6c288	Merge branch 'next' into issue-456	2017-05-27 18:35:01 +02:00
Geoff McLane	c61b5b7b0c	Merge branch 'next' into issue-395	2017-05-27 18:20:28 +02:00
Geoff McLane	825ad59262	Merge branch 'next' into issue-392	2017-05-27 16:25:24 +02:00
Jim Derry	47c27ecf8e	Generated French header file; bumped to 5.5.26 for updated French language.	2017-05-21 14:29:13 -04:00
Jim Derry	996ddb813d	Merge pull request #554 from htacg/issue-365 Issue 365	2017-05-21 14:24:03 -04:00
Geoff McLane	c9c1d7ae55	Issue #395 - a potential fix	2017-05-21 01:47:36 +02:00
Geoff McLane	6f05041b5e	Issue #392 - a simple fix, but maybe incomplete	2017-05-21 00:18:43 +02:00
Geoff McLane	ec03beb361	Issue #552 - remove no 'case default:' warning in most gcc versions Seems too small for a version bump. Closes #552	2017-05-19 18:38:01 +02:00
Geoff McLane	21f008501a	Issue #456 - Oops, also out of 'lexer.h'	2017-05-15 16:51:34 +02:00
Geoff McLane	a7a4cd6a16	Issue #456 - avoid head work if showing body only	2017-05-15 16:42:49 +02:00
Geoff McLane	f310f1d5de	Issue #456 - Move new TidyMetaCharset to clean	2017-05-15 16:39:53 +02:00
Geoff McLane	6ebd12be67	Issue #456 - More work on this option	2017-05-14 19:08:29 +02:00
Jim Derry	9b2cd06711	Merge branch 'next' into issue-365	2017-05-13 22:27:14 -04:00
Jim Derry	66d0825e58	Merge pull request #557 from htacg/update_langs Update languages against current English.	2017-05-13 22:24:43 -04:00
Jim Derry	5791c55081	Update languages against current English.	2017-05-13 21:07:02 -04:00
Jim Derry	0f1e625324	Address #378 Addresses issue #378 by NOT emitting warnings if `fix-uri` is `no`, for HTML5 documents. This preserves existing behavior for legacy document types.	2017-05-13 20:46:48 -04:00
Jim Derry	d18b21b94c	Merge branch 'next' into issue-365	2017-05-13 19:55:19 -04:00
Jim Derry	b6bf48c24a	Merge pull request #553 from htacg/new_picklists New picklists and parsers	2017-05-13 19:50:20 -04:00
Jim Derry	a399725a1e	Fixed ParseAutoBool error.	2017-05-13 11:39:13 -04:00
Geoff McLane	8843199370	Issue #456 - Merge branch 'meta-charset' of tidy-html5-marco. This pulls the work done by @marcoscaceres WIP #458 into the issue-456 branch, to complete the new add-meta-charset option.	2017-05-13 16:02:26 +02:00
Jim Derry	982504eee0	Case insensitive compare is safe here, and prevents erroneous proprietary attribute errors.	2017-05-12 08:28:11 -04:00
Jim Derry	e7c28636b9	Fixed cause of assertions -- funny, these don't pop up in XCode.	2017-05-12 07:30:20 -04:00
Jim Derry	29766afcfd	Initial take on issue 365. This is based off of the simplification of the parser and picklist system. Console application needs to be updated to fix the description, as it shows autobool, and for some reason on the current system I'm not getting assertion failures.	2017-05-11 18:12:56 -04:00
Jim Derry	7112fba553	Merge pull request #549 from htacg/issue_391 Address #391. Tested on macOS and Win10.	2017-05-11 15:24:44 -04:00
Jim Derry	aeb9a24fab	Refactor Picklists and Option Parsers This PR refactors how picklists and option parsers are implemented in LibTidy, making it vastly easier to implement new picklists in the future, as well as modify some of the existing picklists such that they have more logical names. Picklist arrays are now arrays of structures that include the possible strings capable of setting a particular option value, and a new parser has been written to work with these structures. In addition, several of the existing parsers were removed, as they are now redundant, and a couple of the remaining parsers were refactored to take advantage of the new parser. In effect, this means that: - New parsers don't have to be written in the majority of cases where new options are added that exceed yes/no/auto. - Some of the existing options can have more meaningful names than yes/no/auto, in a backward compatible way. For example, vertical-spacing "auto" currently in no way reflects "auto" when used.	2017-05-11 14:40:21 -04:00
Geoff McLane	f7e7554c95	Close the file before the _WIN32 switch	2017-05-09 19:24:20 +02:00
Jim Derry	acaab679c5	Merge pull request #547 from htacg/issue_352 Attempt to address issue #352.	2017-05-08 17:36:52 -04:00
Geoff McLane	77420b94d0	Fix for 'isalnum' in Windows According to the MSN documentation 'isalnum(c)' is only valid when c equals EOF, or is in the range 0 to 255 inclusive. It states the behavior is undefined outside this range, and in Debug mode triggers an assert dialog.	2017-05-08 18:42:33 +02:00
Jim Derry	ce105dcf09	Address #391 . Tested on macOS and Win10. - Add a check upon opening a file for validity of the file. - Add a new message to indicate that the path is not a file.	2017-05-07 17:04:53 -04:00
Jim Derry	fd77312175	Attempt to address issue #352 . This patch correctly addresses the specific issues in #352, but I'm worried that there's some over-reach here. Currently only implemented as a warning, with no switch to turn it off, which maintains current behavior other than the warning. In general, we're treating any string as a complete URL, rather than breaking URL's into component parts. Thus the `IsURLCodePoint()` check includes a few other generic characters that strictly speaking aren't valid codepoints, but are valid as escape characters and delimiters. When addressing #338, I ran into a similar situation in not having a built-in method to separate path components (although a simple generalized solution was good enough in that case). Thus without introducing a new structure and functions to deconstruct a URL into scheme, authority, path, parameters, etc., some variation of this patch will have to be used to address #352.	2017-05-06 18:54:42 -04:00
Jim Derry	09d1802298	Merge branch 'next' into deprecations	2017-05-06 14:34:48 -04:00
Geoff McLane	fd2400d55b	Merge pull request #543 from htacg/issue-436 Small documentation change to close #436	2017-05-06 15:44:45 +02:00
Geoff McLane	d4978608e7	Merge pull request #537 from deathbaba/next Correctly process 'bookmarks' in html exported from Google Doc.	2017-05-06 15:35:57 +02:00
Geoff McLane	6839dfe601	Merge pull request #541 from htacg/issue_338 Issue #338 - fix 3 spurious access level 3 warnings...	2017-05-06 15:20:55 +02:00
Geoff McLane	6da0fff256	Merge pull request #532 from lhchavez/add-warn-prop-attrs Add a flag to warn on proprietary attributes	2017-05-06 14:48:36 +02:00
Jim Derry	846b3cde55	Address #436 just to close it.	2017-05-04 13:45:06 -04:00
Geoff McLane	d142527a8e	Issue #338 - Deal with two other spurious access warnings	2017-05-04 17:36:39 +02:00
Jim Derry	49b833f63b	WIP	2017-05-03 18:16:03 -04:00
Jim Derry	8b2f92f625	Issue #338 occurs because the existing routines assume that any URI with an extension is a file, and so links to TLD's ending with .pl, .au, etc., will cause accessibility warnings. This fix attempts to distinguish between URI's that are likely to be files versus links to domains.	2017-05-03 16:15:44 -04:00
Geoff McLane	b03598652f	Issue #461 - alternative patch for this issue	2017-05-02 19:39:16 +02:00
Alexander Zolotarev	87169d8953	Correctly process 'bookmarks' in html exported from Google Doc.	2017-04-19 14:47:27 -10:00
lhchavez	a19d271f47	Add a flag to warn on proprietary attributes This change adds the TidyWarnPropAttrs flag (default=on) that emits a warning every proprietary attribute it finds.	2017-04-15 03:17:16 +00:00
Geoff McLane	d8839485a4	Merge branch 'next' of github.com:htacg/tidy-html5 into next	2017-04-09 02:09:19 +02:00
Geoff McLane	219a5c797b	Issue #524 - Remove obsolete message	2017-04-09 02:08:03 +02:00
Jim Derry	d1e0b22be7	Removed TidyDropFontTags. Note that POs and POT were _not_ updated.	2017-04-04 14:42:47 -04:00
Jim Derry	24afc6a6fa	Fixed some casting issues that Ubuntu objects to. - Test on macOS, Win10, Ubuntu. - No version bump for this change.	2017-04-04 14:33:56 -04:00
Geoff McLane	22dcea067e	Issue #335 , maybe #333 , to output indent char, reduce if tab	2017-03-26 16:57:29 +02:00
Geoff McLane	5f88452487	Issue #333 - create exception for span/meta	2017-03-26 16:57:29 +02:00
Jim Derry	5f05add439	Continue the documentation effort! - Many, many updates to the public header files. - tidyenum.h was reorganized substantially in order to better generate documentation with Doxygen. - This was also a good time to clean up all of the various enums for languages and strings. Everything is simple and in a single enum now, other than a couple of cases (TidyOptionId, for example, doesn't need to be redefined). - A full and complete audit of the strings meant some opportunities to delete useless strings. - Reorganized the order of the strings in language_en.h in order to better find things when programmers want to make changes. There are a lot fewer internal "sections" now, and everything has been painstakingly sorted within the remaining sections. - Consequently rebased all of the PO's, POT, and other language files. - Updated several of the READMEs with the newest information. - Made the READMEs easier to copy into the Doxygen project by changing some of the code format for compatibility, mainly the use of tildes instead of backslashes for code blocks. - Added tidyGetMessageCode() to message API. Despite the huge diff, this is the only externally-visible change, other than removing some enums (but not their values!). - Passing `next` tests on Mac, Linux, Win10.	2017-03-22 16:05:13 -04:00
Jim Derry	929575afb4	Picklist enums should all start at zero for external LibTidy user compatibility. Restore the new custom-tags enum to this state, and add separate string keys. Updated PO's to reflect; no change to headers.	2017-03-20 12:20:51 -04:00
Jim Derry	a4f752f274	Implement TODO: - tidyDetectedHtmlVersion() - tidyDetectedXhtml() - added two new fields to W3C_Doctypes[] in order to simplify this. - added TY_(HTMLVersionNumberFromCode)() to enable lookup. - Implement tidyDetectedGenericXml() - Added a warning message if an XML declaration exists but the document is not XHTML. - Remove dead commented code. - Updated POs and POT. Headers not affected, but translators should check their translations. - Testing is clean on Mac OS X, Ubuntu 16.04, and Windows 10.	2017-03-19 15:41:51 -04:00
Jim Derry	13122e8862	Added tidyErrorCodeFromKey() Added tidyGetMessageDoc() Improved the Public API documentation.	2017-03-19 08:15:32 -04:00
Geoff McLane	c8f366b76e	Issue #119 - Remove 3 newline chars, that crept in...	2017-03-18 18:52:48 +01:00
Jim Derry	da55a6e4ac	Removed unused declaration.	2017-03-16 08:00:05 -04:00
Jim Derry	0c5550b06f	I think the messages are where I want them to be. Will generate test cases for comparison. Also regen'd all pots and language headers.	2017-03-15 17:36:05 -04:00
Jim Derry	5606f32f13	WIP; messaging much more logical, except @todo noted.	2017-03-14 21:50:10 -04:00
Jim Derry	66ade9def4	Still noisy, but adds HTML5 dependent output message upon detection.	2017-03-14 16:27:11 -04:00
Jim Derry	ed5a1d84ea	Add TY_(nodeIsAutonomousCustomTag), so we can use it elsewhere.	2017-03-14 15:44:46 -04:00
Jim Derry	8273491e16	Change allowed values for custom-tags, and make y equal to inline.	2017-03-14 15:16:11 -04:00
Jim Derry	66de84bc2b	- Add support for the `is` attribute. - Add support for autonomous custom elements.	2017-03-13 13:45:32 -04:00
Jim Derry	11178d775b	Massive Revamp of the Messaging System This is a rather large refactoring of Tidy's messaging system. This was done mostly to allow non-C libraries that cannot adequately take advantage of arg_lists a chance to query report filter information for information related to arguments used in constructing an error message. Three main goals were in mind for this project: - Don't change the contents of Tidy's existing output sinks. This will ensure that changes do not affect console Tidy users, or LibTidy users who use the output sinks directly. This was accomplished 100% other than some improved cosmetics in the output. See tidy-html5-tests repository, the `refactor` and `more_messages_changes` branches for these minor diffs. - Provide an API that is simple and also extensible without having to write new error filters all the time. This was accomplished by adding the new message callback `TidyMessageCallback` that provides callback functions an opaque object representing the message, and an API to query the message for wanted details. With this, we should never have to add a new callback routine again, as additional API can simply be written against the opaque object. - The API should work the same as the rest of LibTidy's API in that it's consistent and only uses simple types with wide interoperability with other languages. Thanks to @gagern who suggested the model for the API in #409. Although the API uses the "Tidy" way of accessing data via an iterator rather than an index, this can be easily abstracted in the target language. There are two major API breaking changes: - Removed TidyReportFilter2 - This was only used by one application in the entire world, and was a hacky kludge that served its purpose. TidyReportCallback (né TidyReportFilter3) is much better. If, for some reason, this affects you, I recommend using TidyReportCallback instead. It's a minor change for your application. 
- Renamed TidyReportFilter3 to TidyReportCallback - This name is much more semantic, and much more sensible in light of improved callback system. As the name implies, it remains capable of only receiving callbacks for Tidy "reports." Introducing TidyMessageCallback, and a new message interrogation API. - As its name implies, it is able to capture (and optionally suppress) all of Tidy's output, including the dialogue messages that never make it to the existing report filters. - Provides an opaque `TidyMessage` and an API that can be used to query against it to find the juicy goodness inside. - For example, `tidyGetMessageOutput( tmessage )` will return the complete, localized message. - Another example, `tidyGetMessageLine( tmessage )` will return the line the message applies to. - You can also get information about the individual arguments that make up a message. By using the `tidyGetMessageArguments( tmessage )` iterator and `tidyGetNextMessageArgument` you will obtain an opaque `TidyMessageArgument` which has its own interrogation API. For example: - tidyGetArgType( tmessage, &iterator ); - tidyGetArgFormat( tmessage, &iterator ); - tidyGetArgValueString( tmessage, &iterator ); - …and so on. Other major changes include refactoring `messages.c` to use the new message "object" directly when emitting messages to the console or output sinks. This allowed replacement of a lot of specialized functions with generalized ones. Some of this generalizing involved modifications to the `language_xx.h` header files, and these are all positive improvements even without the above changes.	2017-03-13 13:28:57 -04:00
Jim Derry	4dc8a2cf9a	Bump version to 5.5.5 for this fiasco, and fix poor planning and unfortunate merge. - Sort all of the existing options and re-indent per Tidy standards. This is simply for cosmetic effect. - Allow the iterator to return all options again, even "internal" options. Things are too embedded with N_TIDY_OPTIONS, etc., to try to hide them. - Instead, simply add documentation to LibTidy users that they shouldn't use internal options. - Also added `TidyInternalCategory` to `TidyConfigCategory` without adding a new field to the struct. API users should check for this category before use. - Defined a two character macro for `TidyInternalCategory` for use in `option_defs[]`. - Changed struct `option_defs[]` to reflect the new category for affected options. - Removed string indicating * refers to internal options, since it no longer applies. - Regen'd all strings for previous point. - `tidy.c` now checks for `TidyInternalCategory` everywhere in order to suppress output.	2017-03-10 09:13:21 -05:00
Jim Derry	ac242e9ea4	hotfix	2017-03-09 19:56:16 -05:00
Jim Derry	e27cc262fe	Bring the local vars into the context, which is allowed in C89.	2017-03-09 12:44:48 -05:00
Jim Derry	005127c733	Address issue #472 .	2017-03-08 15:37:01 -05:00
Jim Derry	978756a482	Restore the previous status of `gnu-emacs-file` - Updated strings files to match. - Inhibit internal options from being output via the iterator. Internals should never have the chance to be exposed if they shouldn't be used. - Added tidySetEmacsFile() and TidyGetEmacsFile() to the public API, and use it instead of secret API to set the filename in the console application. The end result is that `gnu-emacs-file` (and also `doctype-mode`) officially no longer exist to CLI users nor to API users, and tidy console behaves properly by using a published API to set the filename for emacs.	2017-03-07 20:11:31 -05:00
Jim Derry	03f0192f51	How did this get back in there???	2017-03-04 15:31:25 -05:00
Jim Derry	74a4fa4049	Merge branch 'next' into clean_deprecations	2017-03-02 11:40:14 -05:00
Jim Derry	3be515b1f9	Merge branch 'next' into messages_squashed	2017-03-02 09:34:58 -05:00
Jim Derry	92621d6f99	MSVC Compatibility - Changed location of pointer operator in declarations. - Updated `CODESTYLE.md` to reflect this. - Updated `API_AND_NAMESPACE.md` to reflect this.	2017-03-02 09:32:02 -05:00
Geoff McLane	a49890ee55	Issue #498 - parser.c - if a <table> in a <table> just close. The previous action was to discard the second, while it is the second table that browsers will render. This conforms to the principle that the html output by tidy should render in a browser like the original html.	2017-02-24 16:20:10 +01:00
Geoff McLane	c4b5904e1c	Issue #497 - lexer.c - Add comment for this PR @seaburg	2017-02-24 14:38:20 +01:00
Geoff McLane	e44f4d1469	Merge pull request #497 from seaburg/fix_value_trimming Fix leading white spaces trimming	2017-02-24 14:30:39 +01:00
Geoff McLane	27fe0548b9	Issue #468 - config.c - use `RAW` encoding for all cases	2017-02-23 16:28:19 +01:00
Geoff McLane	569ae4b435	Issue #329 - lexer.c - do not discard this newline here	2017-02-23 15:27:03 +01:00
Evgeniy Yurtaev	bb1d62d3bd	Fix leading white spaces trimming	2017-02-22 14:34:40 +03:00
Jim Derry	c54c10f857	- Removed deprecated options: - TidySlideStyle - TidyBurstSlides - Added documentation for TidyEmacsFile, since it's a valid option. - Because TidyEmacsFile is a valid option, tweaked tidy.c so that it can be specified in a configuration file without being overwritten by the console app. Why a user might do this is dumb, but who are we to stop them.	2017-02-18 18:30:41 -05:00
Jim Derry	edc548095c	Removed language as tidy config option; it is only CLI option.	2017-02-18 17:16:35 -05:00
Jim Derry	cbb8354f74	Combined leftover attribute API stuff into single, new file.	2017-02-18 16:57:11 -05:00
Jim Derry	f6ce4d130e	Removed deprecated tidyAttrGetSOMETHING from API.	2017-02-18 16:46:20 -05:00
Jim Derry	13c6387f47	Removed deprecated AttributeIsSOMETHING from API.	2017-02-18 16:43:47 -05:00
Jim Derry	a16f36ce53	Removed deprecated NodeIsElementName from API.	2017-02-18 16:33:21 -05:00
Jim Derry	165acc4f3e	Several foundational changes preparing for release of 5.4 and future 5.5: - Consolidated all output string definitions enums into `tidyenum.h`, which is where they belong, and where they have proper visibility. - Re-arranged `messages.c/h` with several comments useful to developers. - Properly added the key lookup functions and the language localization functions into tidy.h/tidylib.c with proper name-spacing. - Previous point restored a lot of sanity to the #include pollution that's been introduced in light of these. - Note that opaque types have been (properly) introduced. Look at the updated headers for `language.h`. In particular only an opaque structure is passed outside of LibTidy, and so use TidyLangWindowsName and TidyLangPosixName to poll these objects. - Console application updated as a result of this. - Removed dead code: - void TY_(UnknownOption)( TidyDocImpl* doc, char c ); - void TY_(UnknownFile)( TidyDocImpl* doc, ctmbstr program, ctmbstr file ); - Redundant strings were removed with the removal of this dead code. - Several enums were given fixed starting values. YOUR PROGRAMS SHOULD NEVER depend on enum values. `TidyReportLevel` is an example of such. - Some enums were removed as a result of this. `TidyReportLevel` now has matching strings, so the redundant `TidyReportLevelStrings` was removed. - All of the PO's and language header files were regenerated as a result of the string cleanup and header cleanup. - Made the interface to the library version and release date consistent. - CMakeLists.txt now supports SUPPORT_CONSOLE_APP. The intention is to be able to remove console-only code from LibTidy (for LibTidy users). - Updated README/MESSAGES.md, which is vastly more simple now.	2017-02-17 15:29:26 -05:00
Jim Derry	e1f066fe14	Merge branch 'empretty_script'	2017-02-13 08:49:13 -05:00
Jim Derry	b7c84b1b57	Merge branch 'surrogates'	2017-02-13 08:49:06 -05:00
Geoff McLane	ea49ca0b1d	Fix license for SPRTF modules. Also correct the coding style to conform to HTML Tidy standard.	2017-02-12 17:38:44 +01:00
Geoff McLane	7f73d4f429	Issue #483 - Add ReportSurrogateError() service and connect.	2017-02-11 18:33:45 +01:00
Geoff McLane	75bc1f06c7	More updates for Issue #483 - Start warning msgs - WIP	2017-02-09 20:55:23 +01:00
Jim Derry	1ac50fccb3	Pretty up output of empty script tags. - No longer break script tags up on two lines if there is no content. However output is still subject to the `--wrap` behavior. - Previous behavior intact if there is content. Todo. - Associate this with a new Tidy option.	2017-02-08 13:53:37 -05:00
Geoff McLane	9dc76c1e77	Issue #483 - Some fixes for error condition	2017-02-02 16:43:10 +01:00
Geoff McLane	259d330780	Issue #483 - First cut dealing with 'surrogate pairs'. Only deals with a successful case. TODO: Maybe add a warning/error if the trailing surrogate not found, and maybe consider substituting to avoid invalid utf-8 output.	2017-02-01 13:50:33 +01:00
Geoff McLane	deebc93f97	Merge pull request #480 from onnimonni/feature-fix-xmlns-xlink Add optional xmlns:xlink attributes as valid to support inline svg	2017-01-29 19:17:43 +01:00
Onni Hakala	da27b5e339	Add optional xmlns:xlink attributes as valid to support inline svg fixes #478	2017-01-09 01:38:16 +02:00
Marcos Caceres	91da8c6f74	style: ansi conforming comments	2016-12-20 16:51:09 +11:00
Geoff McLane	fd0ccb2bbf	Bad, repeated node iteration! closes #459	2016-10-30 23:37:31 +01:00
Marcos Caceres	aff76bec38	fix(lexer.c): fixes from initial review	2016-10-17 17:00:58 +11:00
Marcos Caceres	523d58b004	refactor: ask for charset and http_equiv attrs	2016-10-06 19:30:23 +11:00
Marcos Caceres	932cc104a6	feat(attrask.c): learn about charset attr	2016-10-06 19:29:56 +11:00
Marcos Caceres	53ee94ddba	fix: incorrect check for first element in head	2016-10-06 19:07:44 +11:00
Marcos Caceres	b1629c4a4f	fix(lexer): bad attribute reporting	2016-10-05 20:22:19 +11:00
Marcos Caceres	2d7ddfef94	Part 2.1 - Bug fixes and warning	2016-10-05 20:14:18 +11:00
Marcos Caceres	cfc22ac46e	Add garvankeeley's suggestions using calloc	2016-10-05 18:54:25 +11:00
Marcos Caceres	040c22c6dc	Part 2 - Implement lexer logic	2016-10-04 21:23:57 +11:00
Marcos Caceres	169bd38adf	Part 1 - Add basic infra for 'add-meta-charset' option	2016-10-04 17:56:29 +11:00
Geoff McLane	d81a9ad901	Merge branch 'issue-428' Conflicts: version.txt This closes #428	2016-09-11 16:57:07 +02:00
Marcos Caceres	e4ae9c064d	Add support for link 'as' attribute (closes #449 )	2016-08-23 18:46:04 +10:00
Geoff McLane	80e57b23bf	Merge branch 'master' into issue-428 Conflicts: version.txt	2016-08-09 00:46:40 +02:00
Geoff McLane	7631f25ed2	rebase issue-428	2016-08-02 18:10:19 +02:00
Adam Majer	50557a4f63	Fix static buffer overrun (issue #443 ) result[6] is a fixed array of size 6, but in the process of copying data into it, we clobber the last allocated byte. Simplify some of the code by not calling redundant functions.	2016-08-02 11:10:45 +02:00
Benjamin Esham	54179386be	Add support for the "integrity" attribute This attribute may be used on "link" and "script" elements. See http://www.w3.org/TR/2016/REC-SRI-20160623/#element-interface-extensions	2016-07-24 10:24:30 -04:00
Michal Čihař	10281040ca	Avoid crash in tidyCleanAndRepair if document was not loaded These services can only be used when there is a document loaded, ie a lexer created. But really should not be calling a Clean and Repair service with no doc!	2016-07-07 16:38:05 +02:00
Geoff McLane	685f7a6c5b	Issue #428 - Avoid adding form to input if html5	2016-07-02 20:13:01 +02:00
Geoff McLane	7bec2c2082	Merge pull request #422 from sesom42/master prevent buffer overflow in debug output	2016-06-30 18:32:55 +02:00
Geoff McLane	97700044ce	Merge pull request #410 from gagern/varargs Pair va_copy calls with va_end	2016-06-18 18:53:53 +02:00
Jens Tautenhahn	84fc451a78	prevent buffer overflow in debug output	2016-06-14 15:42:18 +02:00
Benjamin Esham	941b763a8d	Add support for "crossorigin" on audio too	2016-06-08 19:40:15 -04:00
Benjamin Esham	d9d8e92e52	Allow "crossorigin" on img, script, and video tags too	2016-06-07 22:29:57 -04:00
Benjamin Esham	9377f65f89	Add support for the HTML5 "crossorigin" attribute This attribute can only be used on "link" elements. https://developer.mozilla.org/en-US/docs/Web/HTML/Element/link#Attributes	2016-06-07 22:20:10 -04:00
Martin von Gagern	04bc8d3195	Pair va_copy calls with va_end According to the specs, each va_copy call should be matched by a va_end call to ensure proper cleanup. Furthermore, since message filters might iterate over the list of arguments, we should hand a new copy to each filter.	2016-05-17 22:37:32 +02:00
Raphael Ackermann	b704a4d0d4	allow zero LI in UL when html5. fix for #396	2016-04-08 23:08:56 +02:00
Geoff McLane	61a0a331fc	Issue #390 - fix indent with --hide-endtags yes. The problem was, with --hide-endtags yes, a conditional pprint buffer flush had nothing to flush, thus the indent was not adjusted. To track down this bug added a lot of MSVC Debug code, but is only existing if some additional items defined, so has no effect on the release code. This, what feels like a good fix, was first reported about 12 years ago by @OlafvdSpek in SF Bugs 563. Hopefully finally closed.	2016-04-04 18:13:08 +02:00
Geoff McLane	7598fdfff2	avoid DEBUG duplicate newline	2016-04-03 17:54:46 +02:00
Geoff McLane	7777a71913	Issue #369 - Remove Debug asserts	2016-03-31 14:50:03 +02:00
Geoff rpi McLane	086e4c948c	remove gcc comment warning	2016-03-30 15:02:19 +00:00
Geoff McLane	59d6fc7022	Issue #377 - If version XHTML5 available, return that.	2016-03-30 16:28:08 +02:00
Geoff McLane	1830fdb97c	Issue #384 - insert comments	2016-03-30 14:18:04 +02:00
Geoff McLane	4b135d9b47	Merge pull request #384 from seaburg/master Fix skipping parsing character	2016-03-30 14:08:40 +02:00
Geoff McLane	e87f26c247	Merge pull request #388 from htacg/fr.po Merge fr.po to master	2016-03-27 19:54:54 +02:00
Jim Derry	7d2ddee775	Add new `rebase` command to CLI. This is intended to make it very, very easy to update the POT and all of the POs when changes are made to `language_en.h`. Used without an sha-1 hash, untranslated strings (i.e., the "source" strings) are updated in the POT/PO's. However if you specify an --sha=HASH (or -c HASH) option, then the script will use git to examine the `language_en.h` file from that specified commit, determining the strings that have changed, and mark all of these strings as `fuzzy` in the POs. This will serve as a flag to translators that the original has changed. In addition, this `fuzzy` flag will appear in the headers as "(fuzzy) " in the item comments. If a translator edits the header directly, he should remove the "(fuzzy )" in the comment. Then when the PO is rebuilt, the fuzzy flag will be removed automatically. The reverse is also true; if a translator is working with the PO, he or she should clear the fuzzy flag and the comment will be adjusted accordingly in the generated header.	2016-03-25 09:21:21 +08:00
Geoff McLane	8671544beb	Issue #383 - Add a WIP language_fr.h to facilitate testing	2016-03-24 14:15:43 +01:00
Geoff McLane	5feca8cfd6	Issue #383 - correct another byte-by-byte output to message file. As in the previous case these messages are already valid utf-8 text, and thus, if output on a byte-by-byte basis, must not use WriteChar, except for the EOL char. Of course this output can be to either a user output file, if configured, otherwise stderr.	2016-03-24 14:15:43 +01:00
Jim Derry	ad7bdee3b9	Added translator comments to new TidyEscapeScripts option, and updated POT and POs to reflect this.	2016-03-24 11:00:47 +08:00
Jim Derry	71d6ca1392	Oops. Didn't commit es changes. This fixes that.	2016-03-23 15:10:07 +08:00
Jim Derry	d54785c933	language help enhancements: - Show the language Tidy is using. - Update the POT and POs with the modified string. - Regen language_es.h, which uses the string. Note that the new header uses the new commentless behavior that's still pending in another branch. In addition the proper c style hints have been added to all PO's, as their previous absence was a bug.	2016-03-23 14:56:36 +08:00
Jim Derry	2cf03f7fa9	Fix two character lang codes not working.	2016-03-23 14:38:17 +08:00
Geoff McLane	000c6925bd	Issue #348 - Add option 'escape-script', def = yes	2016-03-20 01:01:46 +01:00
Geoff McLane	e6f1533d89	Issue #383 - Output message file text byte-by-byte	2016-03-18 18:47:00 +01:00
Evgeniy Yurtaev	7d28b21e60	Fix skipping parsing character	2016-03-17 23:30:11 +03:00
Geoff McLane	8dda04f1df	Issue #379 - Care about 'ix' going negative. How this lasted so long in the code is a mystery! But of course it will only be a read out-of-bounds if testing the first character in the lexer, and it is a spacey char. A big thanks to @gaa-cifasis for running ASAN tests on Tidy.	2016-03-06 17:36:51 +01:00
Geoff McLane	8eee85cb9e	Issue #380 - Experimental patch in issue-380 branch	2016-03-05 17:39:14 +01:00
Geoff McLane	0e6ed639d6	Issue #380 - Add more MSVC debug	2016-03-04 19:28:49 +01:00
Geoff McLane	d091027089	Issue #377 add debug only output of constrained versions	2016-03-03 20:21:35 +01:00
Geoff McLane	7bdc31af76	Issue #377 - Table summary attribute also applies to XHTML5	2016-02-29 19:58:55 +01:00
Geoff McLane	24c62cf0df	Issue #314 - Avoid head warning if show-body-only	2016-02-29 18:49:15 +01:00
Geoff McLane	23e689d145	Issue #373 - Merge branch 'issue-373' of github.com:htacg/tidy-html5 into issue-373 Conflicts: version.txt - set version 5.1.41issue-373	2016-02-18 15:18:39 +01:00
Geoff McLane	8c13d270ed	Merge branch 'master' of github.com:htacg/tidy-html5	2016-02-18 13:58:23 +01:00
Geoff McLane	b91d52592b	Fix to K&R C to compile with MSVC	2016-02-18 13:57:47 +01:00
Jim Derry	63c0327de1	Fixed typo in output strings.	2016-02-18 15:40:10 +08:00
Jim Derry	e00f419f5d	Discovered some missing strings from tidyErrorFilterKeysStruct.	2016-02-18 10:19:57 +08:00
Jim Derry	da8205b2dc	Regen'd POT, POs, and headers in order to capture documentation changes in all of them.	2016-02-17 20:07:00 +08:00
Jim Derry	7fbe76be0b	Finished semantic html.	2016-02-17 20:02:38 +08:00
Jim Derry	a78daccd3c	Through TidyIndentSpaces.	2016-02-17 17:43:09 +08:00
Jim Derry	a16e89c4f8	Updated translator comments.	2016-02-17 17:27:57 +08:00
Jim Derry	d30c2d7747	XSL for man handles <var>. Updated comment and sample string.	2016-02-17 17:20:02 +08:00
Jim Derry	cc59efb23d	Add a `xml-error-strings` service to console app providing symbols developers can use with TidyErrorFilter3.	2016-02-17 12:35:20 +08:00
Jim Derry	bc1e54d5b5	Externalize the TidyReportFilter3 error codes, and provide iterators to loop through them.	2016-02-17 12:27:11 +08:00
Jim Derry	720d5c25d2	Squelch compiler warning default type.	2016-02-17 10:56:21 +08:00
Jim Derry	97abad0c05	Bump to 5.1.39 for merging. Merge branch 'master' into attrdict_phase2	2016-02-16 11:11:36 +08:00
Jim Derry	3431dd05a4	Merge branch 'master' into attrdict_phase1 Bump version to 5.1.38	2016-02-16 11:07:32 +08:00
Jim Derry	1e4f7dd0f1	Merge pull request #368 from htacg/issue-341 Issue #341	2016-02-16 10:18:26 +08:00
Geoff McLane	9cf97d536b	Issue #373 - Avoid a null added to output. This bug was first opened in 2009 by Christophe Chenon, as bug sf905 but the patch provided then never made it into the source. Now appears fixed, 7 years later!	2016-02-15 13:02:10 +01:00
Geoff McLane	a4f425546f	Improve MSVC DEBUG output. Previously only output the first 8 characters, followed by an ellipsis if more than 8. Now return up to the first 19 chars. If more than 19, return the first 8, followed by an ellipsis, followed by the last 8 characters. This is in the get_text_string service, which is only used if MSVC and not NDEBUG.	2016-02-14 18:17:46 +01:00
Jim Derry	c62127b9bd	Default to NO at this point.	2016-02-13 12:33:02 +08:00
Jim Derry	8b5771cf24	Word2000 Added messages that would otherwise be missed in post-processing, after cleanup.	2016-02-13 12:26:19 +08:00
Jim Derry	2cdedb4a63	Forgot one file...	2016-02-13 11:53:53 +08:00
Jim Derry	896b00238b	Forgot one file...	2016-02-13 11:53:40 +08:00
Jim Derry	2ade3357a9	Phase 2 This is a MUCH SANER approach to what I was trying to do (now that I screwed up enough internals to understand some of them!). At this point there are zero exit state reversions, and zero markup reversions! There are still 21 errout reversions; I'll annotate and adjust as necessary.	2016-02-13 11:31:16 +08:00
Jim Derry	e947d296e4	Handle some issues with misusing VERS_HTML5 in the doctype.	2016-02-12 20:49:14 +08:00
Jim Derry	c81a151da5	Add VERS_STRICT to identify future strict document types.	2016-02-12 20:46:49 +08:00
Jim Derry	74604fd52b	Hard-coded checks are redundant with updates to `attrdict.c`.	2016-02-12 20:44:03 +08:00
Jim Derry	429703dce4	Because the previous effort #350 grew too fast and there were a LOT of side effects to my changes, I'm starting over with this. Comments in the PR thread. This commit reduces the size of attrdict.c while causing only a single errout regression that is justified.	2016-02-12 19:34:19 +08:00
Geoff McLane	03a643f781	Issue #341 - No token can be inserted if istacksize == 0!	2016-02-08 15:12:23 +01:00
Geoff McLane	7d0d8a853a	Issue #345 - discard leading spaces in href	2016-02-01 20:07:55 +01:00
Geoff McLane	7f0d5c31e6	If no doctype, allow user doctype to reset table - Issue #342	2016-02-01 19:44:30 +01:00
Geoff McLane	c1f94c066c	Tidy up some debug only code. After @sria91 added #360 merge, added a little more improvement...	2016-01-30 20:51:27 +01:00
Srikanth Anantharam	9a0af48a4e	fixed a NULL node bug in debug build	2016-01-30 22:03:52 +05:30
Jim Derry	9ae15f45a7	Consistent tabs Fixed tabs in template file, and regen'd all related files.	2016-01-30 15:51:54 +08:00
Jim Derry	53f2a2da2a	msgunfmt works properly with escaped hex.	2016-01-30 15:51:53 +08:00
Martin von Gagern	17e50f2642	Encode UTF-8 strings to hex escapes in header files	2016-01-30 15:51:53 +08:00
Jim Derry	bf70824cc2	- Add TidyReportFilter3, which removes translation strings completely from the equation. It would be a good idea to deprecate TidyReportFilter2, which is vulnerable to changing strings in Tidy source. - Documentation reminders for future enum changes. - Documentation updates.	2016-01-30 15:51:53 +08:00
Jim Derry	d505869910	Localization Support added to HTML Tidy - Languages can now be added to Tidy using standard toolchains. - Tidy's help output is improved with new options and some reorganization.	2016-01-30 15:51:53 +08:00
Jim Derry	26e7d9d4b0	Fixes Mac OS X encoding issues and harmonizes output across platforms. Previously Tidy produced different output based on the compilation target, NOT based on the file encoding and specified options. Every platform was equal except Mac OS. Now unless the encoding is specifically set to a Mac file type, all encoding assumptions are the same across platforms.	2015-12-31 13:57:34 +08:00
Geoff McLane	78f2d52cdd	Issue #308 - remove bad warn, bad assert, and free discarded	2015-12-05 15:03:41 +01:00

585 commits