Commit graph

1253 commits

Author SHA1 Message Date
Geoff McLane a49890ee55 Issue #498 - parser.c - if a <table> in a <table> just close.
The previous action was to discard the second, while it is the second
table that browsers will render.

This conforms to the principle that the html output by tidy should render
in a browser like the original html.
2017-02-24 16:20:10 +01:00
Geoff McLane d07134140a Issue #497 - version.txt - Bump to 5.3.20 for this fix 2017-02-24 14:39:46 +01:00
Geoff McLane c4b5904e1c Issue #497 - lexer.c - Add comment for this PR @seaburg 2017-02-24 14:38:20 +01:00
Geoff McLane e44f4d1469 Merge pull request #497 from seaburg/fix_value_trimming
Fix leading white spaces trimming
2017-02-24 14:30:39 +01:00
Geoff McLane 13c92bce38 Issue #468 - version.txt - Bump to 5.3.19 for this fix 2017-02-23 16:29:44 +01:00
Geoff McLane 27fe0548b9 Issue #468 - config.c - use RAW encoding for all cases 2017-02-23 16:28:19 +01:00
Geoff McLane b97b2f0d45 Issue #329 - version.txt - bump to 5.3.18 for this fix 2017-02-23 15:28:40 +01:00
Geoff McLane 569ae4b435 Issue #329 - lexer.c - do not discard this newline here 2017-02-23 15:27:03 +01:00
Evgeniy Yurtaev bb1d62d3bd Fix leading white spaces trimming 2017-02-22 14:34:40 +03:00
Jim Derry c54c10f857 - Removed deprecated options:
  - TidySlideStyle
  - TidyBurstSlides

- Added documentation for TidyEmacsFile, since it's a valid option.

- Because TidyEmacsFile is a valid option, tweaked tidy.c so that it can
  be specified in a configuration file without being overwritten by the console
  app. Why a user might do this is dumb, but who are we to stop them.
2017-02-18 18:30:41 -05:00
Jim Derry edc548095c Removed language as tidy config option; it is only CLI option. 2017-02-18 17:16:35 -05:00
Jim Derry cbb8354f74 Combined leftover attribute API stuff into single, new file. 2017-02-18 16:57:11 -05:00
Jim Derry f6ce4d130e Removed deprecated tidyAttrGetSOMETHING from API. 2017-02-18 16:46:20 -05:00
Jim Derry 13c6387f47 Removed deprecated AttributeIsSOMETHING from API. 2017-02-18 16:43:47 -05:00
Jim Derry a16f36ce53 Removed deprecated NodeIsElementName from API. 2017-02-18 16:33:21 -05:00
Jim Derry 81d2d26883 Added a new README/API_AND_NAMESPACE.md, which might help and encourage
new developers to add new functionality to Tidy.
2017-02-17 19:23:26 -05:00
Jim Derry 165acc4f3e Several foundational changes preparing for release of 5.4 and future 5.5:
  - Consolidated all output string definition enums into `tidyenum.h`, which
    is where they belong, and where they have proper visibility.
  - Re-arranged `messages.c/h` with several comments useful to developers.
  - Properly added the key lookup functions and the language localization
    functions into tidy.h/tidylib.c with proper name-spacing.
  - Previous point restored a *lot* of sanity to the #include pollution that's
    been introduced in light of these.
  - Note that opaque types have been (properly) introduced. Look at the updated
    headers for `language.h`. In particular only an opaque structure is passed
    outside of LibTidy, and so use TidyLangWindowsName and TidyLangPosixName
    to poll these objects.
  - Console application updated as a result of this.
  - Removed dead code:
    - void TY_(UnknownOption)( TidyDocImpl* doc, char c );
    - void TY_(UnknownFile)( TidyDocImpl* doc, ctmbstr program, ctmbstr file );
  - Redundant strings were removed with the removal of this dead code.
  - Several enums were given fixed starting values. YOUR PROGRAMS SHOULD NEVER
    depend on enum values. `TidyReportLevel` is an example of such.
  - Some enums were removed as a result of this. `TidyReportLevel` now has
    matching strings, so the redundant `TidyReportLevelStrings` was removed.
  - All of the PO's and language header files were regenerated as a result of
    the string cleanup and header cleanup.
  - Made the interface to the library version and release date consistent.
  - CMakeLists.txt now supports SUPPORT_CONSOLE_APP. The intention is to
    be able to remove console-only code from LibTidy (for LibTidy users).
  - Updated README/MESSAGES.md, which is *vastly* more simple now.
2017-02-17 15:29:26 -05:00
Jim Derry 0bd6ba30b4 Merge branch 'tidy_version'
Note: this is a triple merge. Version bumped only once.
2017-02-13 08:51:04 -05:00
Jim Derry e1f066fe14 Merge branch 'empretty_script' 2017-02-13 08:49:13 -05:00
Jim Derry b7c84b1b57 Merge branch 'surrogates' 2017-02-13 08:49:06 -05:00
Geoff McLane 73bf561645 Bump version to 5.3.16 for SPRTF fixes 2017-02-12 17:40:48 +01:00
Geoff McLane ea49ca0b1d Fix license for SPRTF modules.
Also correct the coding style to conform to HTML Tidy standard.
2017-02-12 17:38:44 +01:00
Geoff McLane 23c4686b0f Merge branch 'surrogates' of github.com:htacg/tidy-html5 into surrogates 2017-02-11 18:34:38 +01:00
Geoff McLane 7f73d4f429 Issue #483 - Add ReportSurrogateError() service and connect. 2017-02-11 18:33:45 +01:00
Jim Derry 45a6062b4a VERSION.md cleanup. 2017-02-10 14:21:24 -05:00
Jim Derry c789ca8311 Cleanup of MESSAGES.md again, this time with correct information. 2017-02-10 10:24:11 -05:00
Jim Derry 91e27b14f3 Cleaned up MESSAGES.md just a bit per Geoff's request. 2017-02-09 16:46:18 -05:00
Geoff McLane 75bc1f06c7 More updates for Issue #483 - Start warning msgs - WIP 2017-02-09 20:55:23 +01:00
Geoff McLane 3ca117550a Initial start on a README/MESSAGES.md 2017-02-09 20:54:11 +01:00
Jim Derry 1ac50fccb3 Pretty up output of empty script tags.
  - No longer break script tags up on two lines if there is no content. However
    output is still subject to the `--wrap` behavior.
  - Previous behavior intact if there is content.

Todo.

  - Associate this with a new Tidy option.
2017-02-08 13:53:37 -05:00
Geoff McLane 6a83918d33 Add README for new 'attributes' and 'elements', 'tags' 2017-02-05 17:27:28 +01:00
Geoff McLane 9dc76c1e77 Issue #483 - Some fixes for error condition 2017-02-02 16:43:10 +01:00
Geoff McLane 259d330780 Issue #483 - First cut dealing with 'surrogate pairs'.
Only deals with a successful case.

TODO: Maybe add a warning/error if the trailing surrogate not found, and
maybe consider substituting to avoid invalid utf-8 output.
2017-02-01 13:50:33 +01:00
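
As an illustration of the surrogate-pair handling described in Issue #483 above, here is a minimal C sketch of how a leading and trailing UTF-16 surrogate can be combined into one code point. The names `combine_surrogates`, `is_high_surrogate`, `is_low_surrogate` and `REPLACEMENT_CHAR` are hypothetical and are not taken from Tidy's sources. Substituting U+FFFD for a broken pair is only one possible answer to the TODO in that commit, not necessarily what Tidy does.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not Tidy's actual lexer code. */
#define REPLACEMENT_CHAR 0xFFFDu

/* Leading (high) surrogates occupy U+D800..U+DBFF. */
static int is_high_surrogate(uint32_t c)
{
    return c >= 0xD800u && c <= 0xDBFFu;
}

/* Trailing (low) surrogates occupy U+DC00..U+DFFF. */
static int is_low_surrogate(uint32_t c)
{
    return c >= 0xDC00u && c <= 0xDFFFu;
}

/* Combine a surrogate pair into a single code point above U+FFFF.
 * Assumption: an invalid pair is replaced by U+FFFD so that no
 * lone surrogate reaches the UTF-8 encoder. */
static uint32_t combine_surrogates(uint32_t high, uint32_t low)
{
    if (!is_high_surrogate(high) || !is_low_surrogate(low))
        return REPLACEMENT_CHAR;
    return 0x10000u + ((high - 0xD800u) << 10) + (low - 0xDC00u);
}
```

For example, the pair 0xD83D, 0xDE00 combines to U+1F600, while a leading surrogate followed by an ordinary character yields the replacement character under this sketch's assumption.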
Geoff McLane 10fd44d101 Issue #478 PR #480 - Bump to 5.3.15 2017-01-29 19:21:46 +01:00
Geoff McLane deebc93f97 Merge pull request #480 from onnimonni/feature-fix-xmlns-xlink
Add optional xmlns:xlink attributes as valid to support inline svg
2017-01-29 19:17:43 +01:00
Geoff McLane 0cbbd55535 Issue #463, a step in #460, bump to v.5.3.14 for this merge 2017-01-09 17:07:13 +01:00
Geoff McLane cdf3f8846c Merge pull request #463 from marcoscaceres/ansi_compliance
style: ansi conforming comments
2017-01-09 16:59:43 +01:00
Onni Hakala da27b5e339 Add optional xmlns:xlink attributes as valid to support inline svg
fixes #478
2017-01-09 01:38:16 +02:00
Geoff McLane 2243510592 Issue #469 #473 Bump to 5.3.13 2017-01-08 18:24:17 +01:00
Eric Bréchemier 7593d7b58f Merge documentation of "command-line" and "configuration" options (Issue #469) (#473)
* Track tidy.1 before merging duplicate sections

I am adding the file to the git repository to track and review
the changes to this generated file. I will then update the XSLT
transformation which produces this file to remove duplicate sections.
As a first step, I will stop outputting duplicate sections; I will
then merge them into existing sections. I will commit the changes
to the generated file at each step.

Related issue: #469

* Also track changes in text rendering of the man page tidy.1

The rendering to text was done with following command:

  /usr/bin/groff -Tascii -mandoc -c tidy.1

This format should make the review of differences more readable.

Related issue: #469

* Remove duplicate sections: temporarily discard detailed options

Related issue: #469

* Generalize command line given in SYNOPSIS

The new SYNOPSIS expresses the fact that multiple files can
be provided as argument, and that options and files can be mixed
(options apply only to the files specified after, not the ones before).

It does not explain that there are actually two types of options; this
shall be detailed afterwards: simple options (aka standard options) start
with single dash while configuration options start with a double dash.
Only the latter can be defined in configuration files, using their name
without the double dash.

I have also reformatted the terms 'options' and 'file' to be underlined,
to follow conventions that I observed in other man pages (ls, grep, wget...)

Related issue: #469

* Regroup sentences related to options at the start of OPTIONS section

This is an intermediate step before adapting the text to its new
location. I will probably start the section with a paragraph to
introduce the two different kinds of options. Then describe the
"standard" options in more details. Then list the standard options.
Then describe the configuration options in more details. Then list
the configuration options, using a format similar to the one used
for standard options.

Related issue: #469

* Describe "standard" and "expanded" options part of OPTIONS section

The section now starts with a description of both types of options,
and explains that the first part of the section is concerned with the
"standard" options while the second part is concerned with
the "expanded" options.

More details are provided about "standard" options, which are then
listed individually.

More details are then provided about "expanded" options and their
usage on the command line and in configuration files. The configuration
options are not listed yet. In order to avoid repeating a lot of
information with every separate configuration option, I will first
describe common values and formats; I will then describe each option
more succinctly, like "standard" options.

Related issue: #469

* Remove redundant USAGE section

The fact that the input file defaults to standard input
and the output file to standard output is already indicated
in the DESCRIPTION section. This was the only information
left in this section at this point.

Related issue: #469

* Delete separation line

The line was used to separate "standard" usage from "extended" usage.
Both forms are now integrated in the common description of OPTIONS.

Related issue: #469

* Delete DETAILED CONFIGURATION OPTIONS section

The detailed configuration options are now described together
with standard options in a common OPTIONS section.

Related issue: #469

* Delete duplicate SYNOPSIS section

A single generalized SYNOPSIS now encompasses both kinds of options.

Related issue: #469

* Delete WARNING section, no longer relevant

The WARNING referred to a separate section for the description
of "standard" options. They are now described in the same OPTIONS
section as "extended" options.

Related issue: #469

* Copy details of configuration options and file format to OPTIONS

Just before listing all the configuration options, this is the
expected place to describe the "extended" options in more details.
The description was already worded as an introduction to the list
of configuration options. I will update this description after having
compacted entries which describe individual configuration options.

Related issue: #469

* Delete duplicate DESCRIPTION section

This section has been merged into the generalized OPTIONS section.

Related issue: #469

* List configuration options at the end of the OPTIONS section

This list is very long, with lots of duplicate information
repeated for entries of the same type. The description of
configuration options should be compacted to match as closely
as possible the description of "standard" options.

Related issue: #469

* Delete duplicate OPTIONS section

It contained the list of configuration options, which is now included
at the end of the generalized OPTIONS section.

Related issue: #469

* Delete config-section template

The template was now empty. Its contents have been merged
into the cmdline-section template.

Related issue: #469

* Remove redundant sentence

The sentence listed the five categories of configuration options.
This kind of made sense when the options were listed in the following
section. Now that they are listed just below, it has become redundant.

Related issue: #469

* Remove colon ':' at the end of configuration options categories

The categories of "standard" options do not end with a colon;
no title does actually.

Related issue: #469

* Remove extra lines before the list of configuration options

Related issue: #469

* Add double space after period '.  ' where missing

For consistency with usage, sentences within paragraphs shall be
separated by a double space rather than a single space. This was
done in most places in the document, with only a few places missing.

Related issue: #469

* Delete irrelevant comment

The comment refers to cmdline section at the start of the processing
of configuration options. The cmdline options are opposed to
config options in the context of this transformation. They are
provided through two separate XML input files.

Related issue: #469

* Delete extra blank line before sample configuration file

Related issue: #469

* Remove multiple empty lines after heading of each options category

Related issue: #469

* Remove duplicate empty line before 'See also:' lines

Related issue: #469

* Clarify the terms used for both kinds of options

I removed references to "standard" (or regular) command-line options
and "extended" (or detailed) options. I used the terms featured in
the description of the options which output XML files describing
each kind of options:

  -xml-help
        list the command line options in XML format

  -xml-config
        list all configuration options in XML format

The term for single-dash options is now (purely) command-line options
while double-dash options are referred to as configuration options.

Related issue: #469

* Update copyright year to 2016

* Clarify configuration options equivalent to command-line options

I added a paragraph to explain the equivalence of a command-line
option with a configuration option and value, and to make explicit the
format used to describe this equivalence in the description of
command-line parameters.

I moved the parentheses, which were on the last line, at the end
of the description, to the first line at the end of the list of
names for the command-line option.

Related issue: #469

* Use underlines (I) instead of bold (B) for option names in config example

This is for consistency with the format used for the option names in
the equivalent command-line example above, and in the other example
of configuration file.

Related issue: #469

* Update copyright year to 2017

* Add double dash before the name of configuration options

This is a first step for the harmonization of the descriptions
of command-line and configuration options.

Related issue: #469

* Reformat logically to separate formatting (bold) from text (option name)

Related issue: #469

* Move Type after name of configuration option

This puts it in the position expected on the command line.

Related issue: #469

* Move default value after config option name and Type

I tried different formats for the default value:

  --clean Boolean:no
  --clean Boolean[no]

and more formats after I realized that the 'default' value is
not applied when the value is omitted, but when the option is
not used at all:

  --clean Boolean (initially: no)
  --clean Boolean (unset: no)

I selected the least confusing format:

  --clean Boolean (no if unset)

which is self-explanatory.

Related issue: #469

* Clarify that a configuration option cannot be used without a value

For example, using --clean without a value is not equivalent to
using -clean option:

  curl -s https://www.google.com | tidy --clean 2>&1 1>/dev/null | head -n 1

results in:

  Config: missing or malformed argument for option: clean

Related issue: #469

* Add double dash before option names in 'See also' sections

This is consistent with the format used at the top of the
description of configuration options.

Related issue: #469

* Fix order of items in comment describing documentation of config options

The 'seealso' comes last actually, after the description.

* Break long lines to keep source code readable in a terminal (80 characters)

This makes no change on the text generated by

  /usr/bin/groff -Tascii -mandoc -c tidy.1 > tidy.1.txt

* Only output an empty line when Example section is present

Otherwise, the description starts with an empty line when
no Example section is present.

Related issue: #469

* Simplify matching of example elements with contents

Using a template match instead of a named template,
I will then add rules with higher priority to ignore
examples for certain types of values, which are very
redundant (identical for all options of the same type).

Related issue: #469

* Do not print redundant examples

Examples for Boolean and AutoBool are redundant because they are
described in the main text and identical for all options of that type.

Examples for Tag names are redundant because they merely repeat
the name of the Type, and are identical for all options of that type.

Examples for Integer are redundant because they are identical for
all options of that type but one, where the value 0 is followed with
a comment, but even in this case the examples are redundant because
the comment for the value 0 is also included in the description.

Related issue: #469

* Rename 'Examples' section to 'Supported values' to clarify

I also updated the description related to 'Examples' section
in the introduction paragraphs to the configuration options.

Related issue: #469

* Use italics consistently for the names of option types

Related issue: #469

* Use capitalization with no extra style consistently for Type

Previously, a mix of

  * Type set in bold font
  * Type set in regular font
  * "types" (quoted)
  * types (unquoted)

was found. I replaced all instances by Type in regular font.

Related issue: #469

* Consistently use bold format for option values

Both parameter names and values are now in bold,
while keys and values for configuration files are in italics.

Related issue: #469

* Use the same format as other subsections for 'See also'

The subsection is now flush left, in regular font, like
the 'Supported values' subsection.

The previous format was less adequate when the list wrapped
to the next line (--new-inline-tags): wrapping started on
the very first column, breaking the alignment of the rest of
the description.

Related issue: #469

* Consistently indent with 2 spaces, use a single line between templates

Parts of the file were indented with 2 spaces, others with 3 spaces.
Parts of the templates were separated with two empty lines, others
with a single one.

* Remove temporary files used for step by step comparisons of man page

Related issue: #469
2017-01-08 18:19:36 +01:00
Marcos Caceres 91da8c6f74 style: ansi conforming comments 2016-12-20 16:51:09 +11:00
Geoff McLane fd0ccb2bbf Bad, repeated node iteration! closes #459 2016-10-30 23:37:31 +01:00
Marcos Caceres aff76bec38 fix(lexer.c): fixes from initial review 2016-10-17 17:00:58 +11:00
Marcos Caceres 523d58b004 refactor: ask for charset and http_equiv attrs 2016-10-06 19:30:23 +11:00
Marcos Caceres 932cc104a6 feat(attrask.c): learn about charset attr 2016-10-06 19:29:56 +11:00
Marcos Caceres 53ee94ddba fix: incorrect check for first element in head 2016-10-06 19:07:44 +11:00
Marcos Caceres b1629c4a4f fix(lexer): bad attribute reporting 2016-10-05 20:22:19 +11:00
Marcos Caceres 2d7ddfef94 Part 2.1 - Bug fixes and warning 2016-10-05 20:14:18 +11:00
Marcos Caceres cfc22ac46e Add garvankeeley's suggestions using calloc 2016-10-05 18:54:25 +11:00
Marcos Caceres 040c22c6dc Part 2 - Implement lexer logic 2016-10-04 21:23:57 +11:00