Update README documents with information for the upcoming 5.4.0 release.

Jim Derry 2017-02-24 11:58:30 -05:00
parent d07134140a
commit bb2cb26372
15 changed files with 318 additions and 296 deletions


@ -1,9 +1,96 @@
# HTML Tidy with HTML5 support
# HTACG HTML Tidy
All READMEs and related materials can be found in [README/][1].
All other READMEs and related materials can be found in [README/][100]. Although all of our materials should be linked in this README, be sure to check this directory for documents we've not yet added to this document.
For build instructions please see [README/README.md][2].
## Building HTML Tidy
[1]: https://github.com/htacg/tidy-html5/tree/master/README
[2]: https://github.com/htacg/tidy-html5/blob/master/README/README.md
- For build instructions please see [README/BUILD.md][115].
## Branches and Versions
Learn about which branches are available, which branch you should use, and how HTML Tidy's versioning scheme works.
- Learn about version numbering in [README/VERSION.md][160].
- Learn about our repository branches in [README/BRANCHES.md][110].
## Contributing and Development Guides
We gladly accept PRs! Read about some of our contribution guidelines, and check out some of the additional explanatory documents that will aid your understanding of how to accomplish certain things in HTML Tidy.
### General Contribution Guidelines
These are some general guidelines that will help you help us when it comes to making your own contributions to HTML Tidy.
- Learn about our contributing guidelines in [README/CONTRIBUTING.md][125].
- Understand HTML Tidy's source code style in [README/CODESTYLE.md][120].
### Adding Features Guides
When you're ready to add a great new feature, these write-ups may be useful.
- Learn how to add new element attributes to HTML Tidy by reading [README/ATTRIBUTES.md][105].
- Discover how to add new tags to Tidy in [README/TAGS.md][130].
- If you want to add new messages to Tidy, read [README/MESSAGE.md][150].
- Configuration options can be added according to [README/OPTIONS.md][155].
### Language Localization Guides
Tidy supports localization, and welcomes translations into various languages. Please read up on how to localize HTML Tidy.
- The general README for localizing can be found in [/README/LOCALIZE.md][140].
- And [/localize/README.md][145] contains specific instructions for localizing.
## Other Important Links
- site: [http://www.html-tidy.org/][4]
- source: [https://github.com/htacg/tidy-html5][5]
- binaries: [http://binaries.html-tidy.org][6]
- bugs: [https://github.com/htacg/tidy-html5/issues][7]
- list: [https://lists.w3.org/Archives/Public/html-tidy/][8]
- api and quickref: [http://api.html-tidy.org/][9]
[4]: http://www.html-tidy.org/
[5]: https://github.com/htacg/tidy-html5
[6]: http://binaries.html-tidy.org
[7]: https://github.com/htacg/tidy-html5/issues
[8]: https://lists.w3.org/Archives/Public/html-tidy/
[9]: http://api.html-tidy.org/
## History
This repository should be considered canonical for HTML Tidy as of 2015-January-15.
- This repository originally transferred from [w3c.github.com/tidy-html5][20], now redirected to the current site.
- First moved to Github from [tidy.sourceforge.net][21]. Note, this site is kept only for historic reasons, and is not now well maintained.
**Tidy is the granddaddy of HTML tools, with support for modern standards.** Have fun...
[20]: http://w3c.github.com/tidy-html5/
[21]: http://tidy.sourceforge.net
## License
HTML Tidy and LibTidy are free and open source software with a permissive license.
- You can read the complete license in [README/LICENSE.md][135].
[100]: README/
[105]: README/ATTRIBUTES.md
[110]: README/BRANCHES.md
[115]: README/BUILD.md
[120]: README/CODESTYLE.md
[125]: README/CONTRIBUTING.md
[130]: README/TAGS.md
[135]: README/LICENSE.md
[140]: /README/LOCALIZE.md
[145]: /localize/README.md
[150]: README/MESSAGE.md
[155]: README/OPTIONS.md
[160]: README/VERSION.md


@ -1,21 +1,26 @@
# Tidy Element Attributes
This is about adding a **new** `attribute=value` for one or more html `element`, here called `tags`.
This is about adding a **new** HTML attribute to one or more HTML tags, i.e., a new attribute such as `attribute=value`.
Tidy supports a large number of `attributes`, first defined in `tidyenum.h`, to give it a value, then defined in `attrs.c` to give it a unique **string** name, and a `function` to verify the atrribute **value**. Then in `attrdict.c` the attribute is defined, giving what version(s) of html support this attribute. Finally, what tags support this attrinute, is done in `tags.c`, where each attribute is allowed on that tag, or not, in the `tag_defs[]` table.
Tidy's large number of attributes is supported via a number of files:
- `tidyenum.h` is where you first define a new attribute in order to give it an internal value.
- `attrs.c` is where you give a unique **string** name to the attribute, as well as a **function** to verify the **value**.
- `attrdict.c` further refines the definition of your attribute, specifying which version(s) of HTML support this attribute.
- `tags.c`, finally, determines which tags support the attribute, in the `tag_defs[]` table.
So, to add a new `attribute=value`, on one or more existing tags, consists of the following simple steps -
1. tidyenum.h - Give the attribute an internal name, like `TidyAttr_XXXX`, and thus a value. While there were some initial steps to keep this `TidyAttrId` enumeration alphabetic, now just add the new `TidyAttr_XXXX` just before the last entry 'N_TIDY_ATTRIBS'.
1. `tidyenum.h` - Give the attribute an internal name, like `TidyAttr_XXXX`, and thus a value. While there were some initial steps to keep this `TidyAttrId` enumeration alphabetic, now just add the new `TidyAttr_XXXX` just before the last entry `N_TIDY_ATTRIBS`.
2. attrs.c - Assign the string value of the attribute. Of course this must be unique. And then assign a `function` to verify the attribute value. There are already a considerable number of defined functions to verify specific attribute values, but maybe this new attribute requires a new function, so that should be written, and defined.
2. `attrs.c` - Assign the string value of the attribute. Of course this must be unique. And then assign a `function` to verify the attribute value. There are already a considerable number of defined functions to verify specific attribute values, but maybe this new attribute requires a new function, so that should be written, and defined.
3. attrdict.c - If this attribute only relates to specific `tags`, then it should be added to their list. There are some `general` attributes that are allowed on every, or most tags, so this new attribute and value should be added accordingly.
3. `attrdict.c` - If this attribute only relates to specific tags, then it should be added to their list. There are some general attributes that are allowed on every, or most tags, so this new attribute and value should be added accordingly.
4. tags.c - Now the new attribute will be verified for each tag it is associate with in the `tag_defs[]` table. Like for example the `<button ...>`, `{ TidyTag_BUTTON, ...` has `&TY_(W3CAttrsFor_BUTTON)[0]` assigned.
4. `tags.c` - Now the new attribute will be verified for each tag it is associated with in the `tag_defs[]` table. Like for example the `<button ...>`, `{ TidyTag_BUTTON, ...` has `&TY_(W3CAttrsFor_BUTTON)[0]` assigned.
So, normally, just changing 3 files, `tidyenum.h`, `attrs.c`, and `attrdict.c`, will already adjust `tags.c` to accept a new `attribute=value` for any tag, or all tags. Simple...
Now, one could argue that this is not the **best** way to verify every attribute and value, for every tag, but that is a mute point - that is how tidy does it!
Now, one could argue that this is not the **best** way to verify every attribute and value, for every tag, but that is a moot point - that is how Tidy does it!
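The first two steps above follow a common C pattern: an enumeration closed by a counting sentinel, plus a table that maps each value to a string name and a check function. The following is only a minimal sketch of that pattern, not Tidy's actual source. Every `Demo`/`demo` identifier, the two check functions, and the table contents are hypothetical stand-ins for what lives in `tidyenum.h` and `attrs.c`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily trimmed stand-in for the TidyAttrId enumeration. */
typedef enum
{
    DemoAttr_UNKNOWN,
    DemoAttr_HREF,
    DemoAttr_SRC,
    DemoAttr_XXXX,      /* the new attribute, added just before the sentinel */
    N_DEMO_ATTRIBS      /* must remain last; doubles as the entry count */
} DemoAttrId;

/* Hypothetical signature for a function that verifies an attribute value. */
typedef int (DemoAttrCheck)(const char *value);

/* Accepts any value that is present. */
static int demoCheckAny(const char *value)
{
    return value != NULL;
}

/* Accepts only values that are present and not empty. */
static int demoCheckNonEmpty(const char *value)
{
    return value != NULL && value[0] != '\0';
}

/* One row per attribute: internal id, unique string name, value check. */
typedef struct
{
    DemoAttrId     id;
    const char    *name;
    DemoAttrCheck *check;
} DemoAttrDef;

static const DemoAttrDef demo_attr_defs[] =
{
    { DemoAttr_UNKNOWN, "unknown!", demoCheckAny      },
    { DemoAttr_HREF,    "href",     demoCheckNonEmpty },
    { DemoAttr_SRC,     "src",      demoCheckNonEmpty },
    { DemoAttr_XXXX,    "xxxx",     demoCheckAny      }
};

/* Looks an attribute up by its string name, skipping the UNKNOWN row.
   Returns NULL when no attribute has that name. */
const DemoAttrDef *demoLookupAttr(const char *name)
{
    size_t i;
    size_t count = sizeof(demo_attr_defs) / sizeof(demo_attr_defs[0]);

    for (i = 1; i < count; ++i)
    {
        if (strcmp(demo_attr_defs[i].name, name) == 0)
            return &demo_attr_defs[i];
    }
    return NULL;
}
```

Because the new entry sits just before the sentinel, the existing enumeration values keep their numbers and the sentinel still equals the number of entries.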
; eof 20170205

28
README/BRANCHES.md Normal file

@ -0,0 +1,28 @@
# HTML Tidy Branches
## About Branches
Starting with **HTML Tidy** 5.4.0, HTACG will adopt a new branch management strategy utilizing **master** as the _release branch_, and **next** as the active development branch.
As described thoroughly in our [VERSION.md](VERSION.md) document, this means that **master** will always consist of an even-numbered minor version, and activity will remain relatively quiet unless we backport a critical bug fix from **next**.
The **next** branch, then, will host the majority of our development activity, and any contributions and PRs should be against this branch. This means that **next** will always consist of an odd minor version number.
## About Versioning
You can read the specifics about version numbers in our [VERSION.md](VERSION.md) document.
## FAQs
### Which version or branch should I choose?
As described above, the branch is very strongly correlated with the version. If you require a stable API and relatively stable output and don't require the features and enhancements of an odd-numbered **next** version, then you should stick to **master**, even-numbered versions.
On the other hand if you are primarily a console application user, then the API isn't likely as important to you, and you probably want the latest and greatest. If this describes you, you probably want to at least try out **next**.
If you are developing for Tidy, then you _definitely_ want to stick to **next**, even for bug fixes meant for **master**. If it's a critical enough bug fix, then one of our friendly team will back-port the fix to **master**.
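The relationship between version numbers and branches described above (even minor version on **master**, odd minor version on **next**) can be sketched in a few lines of C. This is an illustration only. The `DemoBranch` type and the `demoBranchForVersion` function are hypothetical names and are not part of LibTidy.

```c
#include <stdio.h>

/* Hypothetical result type for this sketch. */
typedef enum
{
    DemoBranch_UNKNOWN,   /* the string did not start with "major.minor" */
    DemoBranch_MASTER,    /* even minor version: the release branch */
    DemoBranch_NEXT       /* odd minor version: the development branch */
} DemoBranch;

/* Maps a version string such as "5.4.0" to the branch it belongs to,
   using only the parity of the minor version number. */
DemoBranch demoBranchForVersion(const char *version)
{
    unsigned int major = 0;
    unsigned int minor = 0;

    if (version == NULL || sscanf(version, "%u.%u", &major, &minor) != 2)
        return DemoBranch_UNKNOWN;

    return (minor % 2 == 0) ? DemoBranch_MASTER : DemoBranch_NEXT;
}
```

For instance, `demoBranchForVersion("5.4.0")` yields `DemoBranch_MASTER`, while a version with an odd minor number such as `"5.5.1"` yields `DemoBranch_NEXT`.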

66
README/BUILD.md Normal file

@ -0,0 +1,66 @@
# HTACG HTML Tidy
## Prerequisites
1. git - [http://git-scm.com/book/en/v2/Getting-Started-Installing-Git][1]
2. cmake - [http://www.cmake.org/download/][2]
3. appropriate build tools for the platform
4. the [xsltproc][3] tool is required to build and install the `tidy.1` man page on Unix-like platforms.
CMake comes in two forms - command line and GUI. Some installations only install one or the other, but sometimes both. The build commands below are only for command line use.
Also the actual build tools vary for each platform. But that is one of the great features of CMake, it can generate various 'native' build files. Running `cmake --help` should list the generators available on that platform. For sure one of the common ones is "Unix Makefiles", which needs `make` installed, but many other generators are supported.
In Windows CMake offers various versions for MSVC. Again below only the command line use of MSVC is shown, but the tidy solution (*.sln) file can be loaded into the MSVC IDE, and the building done in there.
## Build the tidy library and command line tool
1. `cd build/cmake`
2. `cmake ../.. -DCMAKE_BUILD_TYPE=Release [-DCMAKE_INSTALL_PREFIX=/path/for/install]`
3. Windows: `cmake --build . --config Release`
Unix/OS X: `make`
4. Install, if desired:
Windows: `cmake --build . --config Release --target INSTALL`
Unix/OS X: `[sudo] make install`
By default cmake sets the install path to `/usr/local/bin` in Unix. If you wanted the binary in say `/usr/bin` instead, then in 2. above use `-DCMAKE_INSTALL_PREFIX=/usr`.
Also, in Unix if you want to build the release library without any debug `assert` in the code then add `-DCMAKE_BUILD_TYPE=Release` in step 2. This adds a `-DNDEBUG` macro to the compile switches. This is normally added in windows build for the `Release` config.
In Windows the default install is to `C:\Program Files\tidy`, or `C:/Program Files (x86)/tidy`, which is not very useful. After the build the `tidy.exe` is in the `Release` directory, and can be copied to any directory in your `PATH` environment variable for global use.
If you do **not** need the tidy library built as a 'shared' (DLL) library, then in 2. add the command `-DBUILD_SHARED_LIB:BOOL=OFF`. This option is **ON** by default. The static library is always built and linked with the command line tool for convenience in Windows, and so the binary can be run as part of the man page build without the shared library being installed in unix.
See the `CMakeLists.txt` file for other CMake **options** offered.
## Build PHP with the tidy-html5 library
Due to API changes in the PHP source, `buffio.h` needs to be renamed to `tidybuffio.h` in the file `ext/tidy/tidy.c` in PHP's source.
That is - prior to configuring PHP run this in the PHP source directory:
```
sed -i 's/buffio.h/tidybuffio.h/' ext/tidy/*.c
```
And then continue with (just an example here, use your own PHP config options):
```
./configure --with-tidy=/usr/local
make
make test
make install
```
[1]: http://git-scm.com/book/en/v2/Getting-Started-Installing-Git
[2]: http://www.cmake.org/download/
[3]: http://xmlsoft.org/XSLT/xsltproc2.html
; eof


@ -1,18 +1,18 @@
# HTML Tidy Code Style
The source code of **libTidy**, and console app **tidy**, follow the preferences of the original maintainers. Perhaps some of these decisions were arbitrary and based on their sense of aesthetics at the time, but it is good to have all the code looking the same even if it is not exactly what everyone would prefer.
The source code of **libTidy** and console app **tidy** mostly follow the preferences of the original maintainers. Perhaps some of these decisions were arbitrary and based on their sense of aesthetics at the time, but it is good to have all the code looking the same even if it is not exactly what everyone would prefer.
Developers adding code to **Tidy!** are urged to try to follow the existing code style. Code that does not follow these conventions may be accepted, but may be modified as time goes by to best fit the `Tidy Style`.
Developers adding code to HTML Tidy are urged to try to follow the existing code style. Code that does not follow these conventions may be accepted, but may be modified as time goes by to best fit the “Tidy Style.”
There has been a suggestion of using available utilities to make the style consistent, like [Uncrusty](https://github/bengardener/uncrusty) - see [issue #245](https://github.com/htacg/tidy-html5/issues/245), and maybe others...
There has been a suggestion of using available utilities to make the style consistent, like [Uncrustify](https://github.com/uncrustify/uncrustify) - see [issue #245](https://github.com/htacg/tidy-html5/issues/245), and maybe others.
Others have suggested the [AStyle](http://astyle.sourceforge.net/) formatting program with say '-taOHUKk3 -M8' arguments, to conform, but there are a few bugs in AStyle.
Others have suggested the [AStyle](http://astyle.sourceforge.net/) formatting program with say `-taOHUKk3 -M8` arguments, to conform, but there are a few bugs in AStyle.
But again these, and other tools, may not produce code that everybody agrees with... and are presently not formally used in Tidy!
But again, these and other tools may not produce code that everybody agrees with, and are presently not formally used in Tidy!
#### Known Conventions
From reading of the Tidy source, some things are self evident... in no particular order...
From reading of the Tidy source, some things are self evident, in no particular order...
- Use of 4-space indenting, and no tabs.
- No C++ single line comments using `//`.


@ -4,17 +4,27 @@ So you want to contribute to Tidy? Fantastic! Here's a brief overview on how bes
### Support request
If you are having trouble running console `Tidy`, or using the `Tidy Library` API in your own project, then maybe the best places to get help is either via a comment in [Tidy Issues](https://github.com/htacg/tidy-html5/issues), or on the [Tidy Mail Archive](https://lists.w3.org/Archives/Public/html-tidy/) list.
If you are having trouble running console `Tidy`, or using the `LibTidy` API in your own project, then maybe the best place to get help is either via a comment in [Tidy Issues](https://github.com/htacg/tidy-html5/issues), or on the [Tidy Mail Archive](https://lists.w3.org/Archives/Public/html-tidy/) list.
In either place please start with a short subject to describe the issue. If it involves running tidy on a html file, or an API question, make sure to include the version: `$ tidy -v`; what was the configuration used; a small sample input; the output, and the output expected; some sample code, to make quick testing easy.
In either place please start with a short subject to describe the issue. If it involves running Tidy on an html file, or if it's an API question, make sure to include:
If you do add a sample html input, then it can also be very helpful if that sample **passes** the W3C [validation](https://validator.w3.org/#validate_by_upload)... tidy attempts to follow all current W3C standards...
- the version: `$ tidy -v`
- what was the configuration used
- a small sample input
- the output
- the _expected_ output
- some sample code (if an API question).
These data will make replication of your issue much simpler for us.
If you do add sample HTML input, then it can also be very helpful if that sample **passes** the W3C [validator](https://validator.w3.org/#validate_by_upload). Tidy attempts to follow all current W3C standards.
If you are able to build tidy from [source](https://github.com/htacg/tidy-html5) (requires [CMake](https://cmake.org/download/)), and you can find the problem in the source code, then read on about how you can create a Pull Request (“PR”) to share your code and ideas.
If you are able to build tidy from [source](https://github.com/htacg/tidy-html5), requires [CMake](https://cmake.org/download/), and can find the problem in the code, then read on about how you can create a `Pull Request`... share your code, ideas, ....
### What to change
Here are some examples of things you might want to make a pull request for:
Here are some examples of things you might want to make a PR for:
- New features
- Bug fixes
@ -24,36 +34,36 @@ Here are some examples of things you might want to make a pull request for:
If you have a more deeply-rooted problem with how the program is built or some of the stylistic decisions made in the code, it is best to [create an issue](https://github.com/htacg/tidy-html5/issues/new) before putting the effort into a pull request. The same goes for new features - it might be best to check the project's direction, existing pull requests, and currently open and closed issues first.
Concerning the 'Tidy Code Style', checkout [CODESTYLE.md](CODESTYLE.md), but looking at existing code is the best way to get a good feel for the patterns we use.
Concerning the “Tidy Code Style,” check out [CODESTYLE.md](CODESTYLE.md), but looking at existing code is the best way to get a good feel for the patterns we use.
### Using Git appropriately
1. Fork the repository to your GitHub account.
2. Optionally create a **topical branch** - a branch whose name is succint but explains what
you're doing, such as "feature/add-new-lines"...
2. Optionally create a **topical branch**, a branch whose name is succinct but explains what you're doing, such as "feature/add-new-lines".
3. Make your changes, committing at logical breaks.
4. Push your work to your personal account.
5. [Create a pull request](https://help.github.com/articles/using-pull-requests).
6. Watch for comments or acceptance.
Please note - if you want to change multiple things that don't depend on each
other, it is better to use `branches`, and make sure you check the master branch back out before making more changes - that way we can take in each change seperate. Else github has a tendancy to combine your requests into one.
other, it is better to use `branches`, and make sure you check the master branch back out before making more changes - that way we can take in each change separately; otherwise GitHub has a tendency to combine your requests into one.
If you are a continuing contributor then you will need to `rebase` your fork, to htacg `master`, **before** doing any more work, and likewise branches, otherwise we may not be able to cleanly merge your PR. This is a simple process -
If you are a continuing contributor then you will need to `rebase` your fork, to htacg `next`, **before** doing any more work, and likewise branches, otherwise we may not be able to cleanly merge your PR. This is a simple process:
```
$ git remote add upstream git@github.com:htacg/tidy-html5.git # once only
$ git checkout master
$ git checkout next
$ git status
$ git stash # if not clean
$ git fetch upstream
$ git rebase upstream/master
$ git rebase upstream/next
$ git stash pop # if required, and fix conflicts
$ git push # update the fork master
$ git push # update the fork next
```
This can be repeated for `branches`.
This can be repeated for other branches, too.
### Help Tidy Get Better
It goes without saying **all help is appreciated**. We need to work together to make Tidy! better...
It goes without saying **all help is appreciated**. We need to work together to make Tidy better!


@ -1,19 +0,0 @@
# Tidy HTML Elements
This is about adding a new html **element**, or **tag**.
Tidy tries to support all **elements** supported by the W3C. To add a new supported **element**, the defintion begins in `tidyenum.h`, to give it a value. Then it is added to the `tag_defs[]` table in `tags.c`, where it is given a unique string, supported html versions, attributes support, and a bit `type`.
Note, there are a group of configuration options to add **elements** not yet approved by the W3C. These are [new-blocklevel-tags](http://api.html-tidy.org/tidy/quickref_5.2.0.html#new-blocklevel-tags), [new-empty-tags](http://api.html-tidy.org/tidy/quickref_5.2.0.html#new-empty-tags), [new-inline-tags](http://api.html-tidy.org/tidy/quickref_5.2.0.html#new-inline-tags). and [new-pre-tags](http://api.html-tidy.org/tidy/quickref_5.2.0.html#new-pre-tags). This provides a way to extend the `tag_defs[]` table just for that tidy session.
So, to add a new html `element`, consists of the following simple steps -
1. tidyenum.h - Give the element an internal name, like `TidyTag_XXXX`, and thus a value. While there were some initial steps to keep this `TidyTagId` enumeration alphabetic, now just add the new `TidyTag_XXXX` just before the last entry 'N_TIDY_TAGS'.
2. tags.c - Add a line to the `tag_defs[]` table. This assigns the unique string value of the element. Then the html versions that support the element, a pointer to the attributes supported by that elelment, and a bit field of the elements characteristics, inline, block, etc.
So, just changing 2 files, `tidyenum.h` and `tags.c`, and libTidy will now support that element, tag, as W3C approved. Simple... And at times, there is some case for adding **element** that are still in the `Working Draft` stage, especially when there has bee wide spread support in the community, even before it reaches `REC` stage.
Now, one could argue that this is not the **best** way to verify every attribute and value, for every tag, but that is a mute point - that is how tidy does it!
; eof 20170205


@ -14,6 +14,6 @@ team you will certainly earn the admiration of fellow Tidy users worldwide.
All READMEs (including [instructions][2] on how to localize Tidy) and related
materials can be found in [localize][1].
[1]: https://github.com/htacg/tidy-html5/tree/master/localize
[2]:https://github.com/htacg/tidy-html5/blob/master/localize/README.md
[1]: ../localize
[2]: ../localize/README.md


@ -36,6 +36,5 @@ Where to add this seems a bit of a mess, but in general things are grouped by wh
Depending on which of the output routines you use (consult `message.c`) you may be able to use parameters such as `%u` and `%s` in your format strings. The available data is currently limited to the available message output routines, but perhaps generalizing this in order to make more data available will be a nice focus of Tidy 5.5. Please don't use `printf` for message output within **libTidy**.
In this case I want to add showing the code point(s) in hex, so I need to add that also. **(jim --??)**
eof;


@ -6,7 +6,7 @@ The options can also be listed in xml format. `-xml-help` will output each optio
These options can also be used by application linking with **`libtidy`**. For each option there is a `TidyOptionId` enumeration in the `tidyenum.h` file, and get/set functions for each option type.
This file indicates how to add a new option to tidy. Here adding an option `TidyEscapeScripts`. In essence it consists of 4 steps -
This file indicates how to add a new option to tidy, here adding an option `TidyEscapeScripts`. In essence it consists of 4 steps:
1. Add the option **`ID`** to `tidyenum.h`.
2. Add to the **`table`** `TidyOptionImpl option_defs[]` in `config.c`
@ -15,7 +15,7 @@ This file indicates how to add a new option to tidy. Here adding an option `Tidy
#### 1. Option ID
In `tidyenum.h` the `TidyOptionId` can be in any order, but normally a new option would be added just before the last `N_TIDY_OPTIONS`, which must remain the last. Choosing the id name can be any string, but by convention it will commence with `Tidy` followed by brief descriptive like text.
In `tidyenum.h` the `TidyOptionId` can be in any order, but normally a new option would be added just before the last `N_TIDY_OPTIONS`, which must remain the last. Choosing the id name can be any string, but by convention it will commence with `Tidy` followed by brief descriptive text.
Naturally it can not be the same as any existing option. That is, it must be unique. And it will be followed by a brief descriptive special doxygen formatted comment. So for this new option I have chosen -
@ -60,7 +60,7 @@ typedef enum
Care, each of these enumeration strings have been equated to 2 uppercase letters. If you feel there should be another `category` or group then this can be discussed, and added.
The **`name`** can be anything, but should try to be somewhat descriptive of the otpion. Again this string must be unique. It should be lowercase alphanumeric characters, and can contain a `-` separator. Remember this is the name places on the command line, or in a configuration file to set the option.
The **`name`** can be anything, but should try to be somewhat descriptive of the option. Again this string must be unique. It should be lowercase alphanumeric characters, and can contain a `-` separator. Remember this is the name placed on the command line, or in a configuration file to set the option.
The **`type`** is one of the following enumeration items -
@ -73,7 +73,7 @@ typedef enum
} TidyOptionType;
```
Care, each of these enumeration strings have been equated to 2 uppercase letters. If you feel there should be another `type` then this can be discussed, but would require other additional things. And also note the `TidyTriState` is the same as a `TidyInteger` except uses its own parser.
Care, each of these enumeration strings have been equated to two uppercase letters. If you feel there should be another `type` then this can be discussed, but would require other additional things. And also note the `TidyTriState` is the same as a `TidyInteger` except that it uses its own parser.
The next item is the **`default`** value for a boolean, tristate or integer. Note tidy sets `no=0` and `yes=1` as its own `Bool` enumeration.
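The naming rule given above for an option's **`name`** (lowercase alphanumeric characters, optionally with `-` separators) is easy to check mechanically. The function below is a small sketch of such a check. The name `demoIsValidOptionName` is hypothetical and nothing like it is claimed to exist in LibTidy. Rejecting an empty name is an extra assumption of this sketch, and uniqueness against the existing option table is not checked here.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 when `name` is non-empty and consists only of lowercase
   letters, digits, and '-' separators; returns 0 otherwise. */
int demoIsValidOptionName(const char *name)
{
    size_t i;

    if (name == NULL || name[0] == '\0')
        return 0;

    for (i = 0; name[i] != '\0'; ++i)
    {
        unsigned char c = (unsigned char)name[i];

        if (!(islower(c) || isdigit(c) || c == '-'))
            return 0;
    }
    return 1;
}
```

With this sketch a name such as `my-new-option` passes, while `MyNewOption` and `my_new_option` are rejected.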


@ -1,124 +1,57 @@
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>
HTML Tidy with HTML5 support
</title>
<style>
h1 {
background-color: #6495ed;
}
code {
background-color: #e0ffff;
}
div {
background-color: #b0c4de;
}
</style>
</head>
<body>
<h1>
HTML Tidy with HTML5 support
</h1>
<h2>
Prerequisites
</h2>
<ol>
<li>
<p>
git - http://git-scm.com/book/en/v2/Getting-Started-Installing-Git
</p>
</li>
<li>
<p>
cmake - http://www.cmake.org/download/
</p>
</li>
<li>
<p>
appropriate build tools for the platform
</p>
</li>
</ol>
<p>
CMake comes in two forms - command line and gui. Some installations only install one or the
other, but sometimes both. The build commands below are only for the command line use.
</p>
<p>
Also the actual build tools vary for each platform. But that is one of the great features of
cmake, it can generate variuous 'native' build files. Running cmake without any parameters will
list the generators available on that platform. For sure one of the common ones is "Unix
Makefiles", which needs autotools make installed, but many other generators are supported.
</p>
<p>
In windows cmake offers various versions of MSVC. Again below only the command line use of MSVC
is shown, but the tidy solution (*.sln) file can be loaded into the MSVC IDE, and the building
done in there.
</p>
<h2>
Build the tidy library and command line tool
</h2>
<ol>
<li>
<p>
<code>cd build/cmake</code>
</p>
</li>
<li>
<p>
<code>cmake ../.. [-DCMAKE_INSTALL_PREFIX=/path/for/install]</code>
</p>
</li>
<li>
<p>
Windows: <code>cmake --build . --config Release</code>
<br>
Unix/OS X: <code>make</code>
</p>
</li>
<li>
<p>
Install, if desired:
<br>
Windows: <code>cmake --build . --config Release --target INSTALL</code>
<br>
Unix/OS X: <code>[sudo] make install</code>
</p>
</li>
</ol>
<p>
By default cmake sets the install path to /usr/local in unix. If you wanted the binary in say
/usr/bin instead, then in 2. above use -DCMAKE<em>INSTALL</em>PREFIX=/usr
</p>
<p>
In windows the default install is to C:\Program Files\tidy5, or C:/Program Files (x86)/tidy5,
which is not very useful. After the build the tidy[n].exe is in the Release directory, and can
be copied to any directory in your PATH environment variable, for global use.
</p>
<p>
If you need the tidy library built as a 'shared' (DLL) library, then in 2. add the command
-DBUILD<em>SHARED</em>LIB:BOOL=ON. This option is OFF by default, so the static library is
built and linked with the command line tool for convenience.
</p>
<h2>
History
</h2>
<p>
This repository should be considered canonical for HTML Tidy as of 2015-January-15.
</p>
<ul>
<li>
<p>
This repository originally transferred from <a href=
"http://w3c.github.com/tidy-html5/">w3c.github.com/tidy-html5</a>.
</p>
</li>
<li>
<p>
First moved to Github from <a href="http://tidy.sourceforge.net">tidy.sourceforge.net</a>.
</p>
</li>
</ul>
</body>
</html>
<h1 id="htacghtmltidy">HTACG HTML Tidy</h1>
<h2 id="prerequisites">Prerequisites</h2>
<ol>
<li><p>git - <a href="http://git-scm.com/book/en/v2/Getting-Started-Installing-Git">http://git-scm.com/book/en/v2/Getting-Started-Installing-Git</a></p></li>
<li><p>cmake - <a href="http://www.cmake.org/download/">http://www.cmake.org/download/</a></p></li>
<li><p>appropriate build tools for the platform</p></li>
<li><p>the <a href="http://xmlsoft.org/XSLT/xsltproc2.html">xsltproc</a> tool is required to build and install the <code>tidy.1</code> man page on Unix-like platforms.</p></li>
</ol>
<p>CMake comes in two forms - command line and GUI. Some installations only install one or the other, but sometimes both. The build commands below are only for command line use.</p>
<p>Also the actual build tools vary for each platform. But that is one of the great features of CMake, it can generate various &#8216;native&#8217; build files. Running <code>cmake --help</code> should list the generators available on that platform. For sure one of the common ones is &#8220;Unix Makefiles&#8221;, which needs <code>make</code> installed, but many other generators are supported.</p>
<p>In Windows CMake offers various versions for MSVC. Again below only the command line use of MSVC is shown, but the tidy solution (*.sln) file can be loaded into the MSVC IDE, and the building done in there.</p>
<h2 id="buildthetidylibraryandcommandlinetool">Build the tidy library and command line tool</h2>
<ol>
<li><p><code>cd build/cmake</code></p></li>
<li><p><code>cmake ../.. -DCMAKE_BUILD_TYPE=Release [-DCMAKE_INSTALL_PREFIX=/path/for/install]</code></p></li>
<li><p>Windows: <code>cmake --build . --config Release</code><br/>
Unix/OS X: <code>make</code></p></li>
<li><p>Install, if desired:<br/>
Windows: <code>cmake --build . --config Release --target INSTALL</code><br/>
Unix/OS X: <code>[sudo] make install</code></p></li>
</ol>
<p>By default CMake uses the install prefix <code>/usr/local</code> on Unix, so the binary is installed to <code>/usr/local/bin</code>. If you want the binary in, say, <code>/usr/bin</code> instead, then in step 2 above use <code>-DCMAKE_INSTALL_PREFIX=/usr</code>.</p>
<p>Also, on Unix, if you want to build the release library without any debug <code>assert</code> in the code, add <code>-DCMAKE_BUILD_TYPE=Release</code> in step 2. This adds the <code>-DNDEBUG</code> macro to the compile switches. It is normally added in Windows builds for the <code>Release</code> config.</p>
<p>In Windows the default install is to <code>C:\Program Files\tidy</code>, or <code>C:\Program Files (x86)\tidy</code>, which is not very useful. After the build, <code>tidy.exe</code> is in the <code>Release</code> directory and can be copied to any directory in your <code>PATH</code> environment variable for global use.</p>
<p>If you do <strong>not</strong> need the tidy library built as a &#8216;shared&#8217; (DLL) library, then in step 2 add the option <code>-DBUILD_SHARED_LIB:BOOL=OFF</code>. This option is <strong>ON</strong> by default. The static library is always built and linked with the command line tool, both for convenience on Windows and so that the binary can be run as part of the man page build on Unix without the shared library being installed.</p>
<p>See the <code>CMakeLists.txt</code> file for other CMake <strong>options</strong> offered.</p>
<h2 id="buildphpwiththetidy-html5library">Build PHP with the tidy-html5 library</h2>
<p>Due to API changes in the PHP source, <code>buffio.h</code> needs to be renamed to <code>tidybuffio.h</code> in the file <code>ext/tidy/tidy.c</code> in PHP&#8217;s source.</p>
<p>That is - prior to configuring PHP run this in the PHP source directory:
<code>
sed -i 's/buffio.h/tidybuffio.h/' ext/tidy/*.c
</code></p>
<p>And then continue with (just an example here, use your own PHP config options):</p>
<pre><code>./configure --with-tidy=/usr/local
make
make test
make install
</code></pre>
<p>; eof</p>


@ -1,102 +0,0 @@
# HTML Tidy with HTML5 support
## Prerequisites
1. git - http://git-scm.com/book/en/v2/Getting-Started-Installing-Git
2. cmake - http://www.cmake.org/download/
3. appropriate build tools for the platform
4. the [xsltproc](http://xmlsoft.org/XSLT/xsltproc2.html) tool is required to build and install the tidy.1 man page.
CMake comes in two forms - command line and gui. Some installations only install one or the other, but sometimes both. The build commands below are only for the command line use.
Also the actual build tools vary for each platform. But that is one of the great features of cmake, it can generate variuous 'native' build files. Running `cmake --help` should list the generators available on that platform. For sure one of the common ones is "Unix Makefiles", which needs autotools make installed, but many other generators are supported.
In windows cmake offers various versions of MSVC. Again below only the command line use of MSVC is shown, but the tidy solution (*.sln) file can be loaded into the MSVC IDE, and the building done in there.
## Build the tidy library and command line tool
1. `cd build/cmake`
2. `cmake ../.. -DCMAKE_BUILD_TYPE=Release [-DCMAKE_INSTALL_PREFIX=/path/for/install]`
3. Windows: `cmake --build . --config Release`
Unix/OS X: `make`
4. Install, if desired:
Windows: `cmake --build . --config Release --target INSTALL`
Unix/OS X: `[sudo] make install`
By default cmake sets the install path to `/usr/local/bin` in unix. If you wanted the binary in say `/usr/bin` instead, then in 2. above use -DCMAKE_INSTALL_PREFIX=/usr
Also, in unix if you want to build the release library without any debug `assert` in the code then add `-DCMAKE_BUILD_TYPE=Release` in step 2. This adds a `-DNDEBUG` macro to the compile switches. This is normally added in windows build for the `Release` config.
In windows the default install is to `C:\Program Files\tidy`, or `C:/Program Files (x86)/tidy`, which is not very useful. After the build the `tidy.exe` is in the Release directory, and can be copied to any directory in your **PATH** environment variable, for global use.
If you do **not** need the tidy library built as a 'shared' (DLL) library, then in 2. add the command -DBUILD_SHARED_LIB:BOOL=OFF. This option is ON by default. The static library is always built and linked with the command line tool for convenience in windows, and so the binary can be run as part of the man page build without the shared library being installed in unix.
See the `CMakeLists.txt` file for other cmake **options** offered.
## Build PHP with the tidy-html5 library
Due to API changes in the PHP source, "buffio.h" needs to be changed to "tidybuffio.h" in the file ext/tidy/tidy.c.
That is - prior to configuring php run this in the php source directory:
```
sed -i 's/buffio.h/tidybuffio.h/' ext/tidy/*.c
```
And then continue with (just an example here, use your own php config options):
```
./configure --with-tidy=/usr/local
make
make test
make install
```
## Important Links
- site: http://www.html-tidy.org/
- source: https://github.com/htacg/tidy-html5
- binaries: http://binaries.html-tidy.org
- bugs: https://github.com/htacg/tidy-html5/issues
- list: https://lists.w3.org/Archives/Public/html-tidy/
- api and quickref: http://api.html-tidy.org/
## Development
The default branch of this repository is `master`. This is the development branch, hopefully always `stable` source.
It will identify as library version X.odd.X. Use it to help us on the forever `bug` quest, addition of new features, options, ..., etc.
However, if you seek **release** code, then do `git branch -r`, and choose one of the `release/X.even.0` branches for your build and install...
This will always be the latest release branch. Important `bug` fixes thought relevant to this release, pushed back, may bump the library version to X.even.1, ..., etc, but will be remain known as `X.even`...
Some more details of the `Tidy Version` can be found in [VERSION.md](VERSION.md).
Concerning the `Tidy Code Style`, some notes can be found in [CODESTYLE.md](CODESTYLE.md).
If you want to contribute to Tidy, then read [CONTRIBUTING.md](CONTRIBUTING.md).
If you want to add a new configuration **option** to tidy, see [OPTIONS.md](OPTIONS.md).
Tidy is moving towards `localization` of the message string. To help in this effort see [LOCALIZE.md](LOCALIZE.md).
Tidy API documents, and quick reference generation has been moved to its own repo [html-tidy.org.api](https://github.com/htacg/html-tidy.org.api). Likewise, release binary generation has been moved to [html-tidy.org.binaries](https://github.com/htacg/html-tidy.org.binaries). Consult the respective `readmes` there for further details.
## History
This repository should be considered canonical for HTML Tidy as of 2015-January-15.
- This repository originally transferred from [w3c.github.com/tidy-html5](http://w3c.github.com/tidy-html5/), now redirected to the current site.
- First moved to Github from [tidy.sourceforge.net](http://tidy.sourceforge.net). Note, this site is kept only for historic reasons, and is not now well maintained.
**Tidy is the granddaddy of HTML tools, with support for modern standards.** Have fun...
; eof

README/TAGS.md Normal file

@ -0,0 +1,19 @@
# Tidy HTML Elements
This is about adding a new HTML **tag**.
Tidy tries to support all **tags** supported by the W3C. To add a new supported **tag**, the definition begins in `tidyenum.h`, where it is given a value. Then it is added to the `tag_defs[]` table in `tags.c`, where it is given a unique string, supported HTML versions, attribute support, and a bit `type`.
Note, there is a group of configuration options to add **tags** not yet approved by the W3C. These are [new-blocklevel-tags](http://api.html-tidy.org/tidy/quickref_5.2.0.html#new-blocklevel-tags), [new-empty-tags](http://api.html-tidy.org/tidy/quickref_5.2.0.html#new-empty-tags), [new-inline-tags](http://api.html-tidy.org/tidy/quickref_5.2.0.html#new-inline-tags), and [new-pre-tags](http://api.html-tidy.org/tidy/quickref_5.2.0.html#new-pre-tags). They provide a way to extend the `tag_defs[]` table just for that tidy session.
So, adding a new HTML **tag** consists of the following simple steps:
1. `tidyenum.h` - Give the element an internal name, like `TidyTag_XXXX`, and thus a value. While there were some initial steps to keep this `TidyTagId` enumeration alphabetic, now simply add the new `TidyTag_XXXX` just before the last entry `N_TIDY_TAGS`.
2. `tags.c` - Add a line to the `tag_defs[]` table. This assigns the unique string value of the element, then the HTML versions that support the element, a pointer to the attributes supported by that element, and a bit field of the element's characteristics: inline, block, etc.
So, by changing just two files, `tidyenum.h` and `tags.c`, libTidy will support that element (tag) as W3C approved. Simple... At times there is also a case for adding **tags** that are still in the `Working Draft` stage, especially when there has been widespread support in the community, even before they reach the `REC` stage.
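The two steps can be sketched in miniature as follows. This is only an illustration of the enum-plus-table pattern: every `Demo*` and `demo_*` name, the struct layout, and the flag values are assumptions made up for this example, and they are not the real declarations in `tidyenum.h` or `tags.c`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of the two-file pattern described above.
 * All Demo* names are invented for illustration; they are NOT the
 * declarations found in Tidy's tidyenum.h or tags.c. */

/* Step 1 (the "tidyenum.h" side): give the element an id. */
typedef enum {
    DemoTag_UNKNOWN,
    DemoTag_A,
    DemoTag_P,
    DemoTag_MAIN,   /* new element, added just before the sentinel */
    N_DEMO_TAGS     /* always last: the number of ids */
} DemoTagId;

/* Bit flags for supported HTML versions and element characteristics. */
enum { DEMO_HTML4 = 1, DEMO_HTML5 = 2 };
enum { DEMO_INLINE = 1, DEMO_BLOCK = 2 };

typedef struct {
    DemoTagId          id;
    const char        *name;      /* unique string for the element */
    unsigned           versions;  /* HTML versions that support it */
    const char *const *attrs;     /* NULL-terminated attribute list */
    unsigned           model;     /* inline, block, ... */
} DemoTagDef;

static const char *const demo_link_attrs[]   = { "href", "id", "class", NULL };
static const char *const demo_common_attrs[] = { "id", "class", NULL };

/* Step 2 (the "tags.c" side): one table row per element. */
static const DemoTagDef demo_tag_defs[] = {
    { DemoTag_A,    "a",    DEMO_HTML4 | DEMO_HTML5, demo_link_attrs,   DEMO_INLINE },
    { DemoTag_P,    "p",    DEMO_HTML4 | DEMO_HTML5, demo_common_attrs, DEMO_BLOCK  },
    { DemoTag_MAIN, "main", DEMO_HTML5,              demo_common_attrs, DEMO_BLOCK  },
};

/* Linear search by name; returns NULL for an element with no row. */
const DemoTagDef *demo_lookup_tag(const char *name)
{
    size_t i;
    assert(name != NULL);
    for (i = 0; i < sizeof demo_tag_defs / sizeof demo_tag_defs[0]; ++i) {
        if (strcmp(demo_tag_defs[i].name, name) == 0)
            return &demo_tag_defs[i];
    }
    return NULL;
}
```

In this sketch, `demo_lookup_tag("main")` returns the newly added row, while a name with no row in the table yields `NULL`. Keeping the sentinel last means the enum count stays correct without renumbering existing entries.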
Now, one could argue that this is not the **best** way to verify every attribute and value, for every tag, but that is a moot point - that is how tidy does it!
; eof 20170205


@ -32,7 +32,7 @@ The minor version tells a lot more about the true version of Tidy that you have,
- **even numbered minor versions** indicate released versions of **HTML Tidy**. We provide binaries for releases, API documentation, and full support including cherry picking bug fixes back to them. In standard parlance, _released_ versions are _stable_ versions, meaning that the API is stable and you can generally expect Tidy's output to be the same (other than as a result of bug fixes).
- **odd numbered minor versions** are development versions, or as is considered in many contexts _bleeding edge_ versions. HTACG do not provide binaries, and API documentation is not usually up to date, but you do have access to the latest bug fixes, newest features, and knowledge of where Tidy is going. The downside, though, is that we make absolutely no guarantees that:
- **odd numbered minor versions** are development versions, or as is considered in many contexts _bleeding edge_ or _next_ versions. HTACG do not provide binaries, and API documentation is not usually up to date, but you do have access to the latest bug fixes, newest features, and knowledge of where Tidy is going. The downside, though, is that we make absolutely no guarantees that:
- Output remains the same as in previous release versions.
- Output remains the same as in earlier patch versions in the same development series.
@ -63,20 +63,20 @@ This file consists of two lines of dot (.) separated items. The first being the
2017.01.29
```
When **cmake** is run, this file is read and two MACROS added to the compile flags:
When **CMake** is run, this file is read and two macros are added to the compile flags:
```
add_definitions ( -DLIBTIDY_VERSION="${LIBTIDY_VERSION}" )
add_definitions ( -DRELEASE_DATE="${tidy_YEAR}/${tidy_MONTH}/${tidy_DAY}" )
```
And in `CMakeLists.txt` there is the posibility to define another MACRO, when and if required:
And in `CMakeLists.txt` there is the possibility to define another macro, when and if required:
```
# add_definitions ( -DRC_NUMBER="D231" )
```
These MACROS are put in `static const char` strings in **libTidys**s internal- only `src/version.h` file:
These macros are put in `static const char` strings in **libTidy**'s internal-only `src/version.h` file:
```
static const char TY_(release_date)[] = RELEASE_DATE;
@ -96,11 +96,7 @@ TIDY_EXPORT ctmbstr TIDY_CALL tidyReleaseDate(void);
### Git branches
When a `release` is done a release/5.0.0 **branch**, and a similar release/5.0.0 **tag** is created.
At that point the `version.txt` is set to the next, 5.1.0.
That is, the `master` branch will contain the ongoing development. Any subsequent good bug fixes found for some time after that will be carefully tested and push back (cherry picked I think is the correct term) into the release/5.0.0, making it 5.0.1...
Starting with HTML Tidy 5.4.0 release, our branching scheme aligns nicely with our version numbering scheme. Please consult [BRANCHES.md](BRANCHES.md).
Updated: 20170210


@ -259,7 +259,7 @@ localize strings that are actually _different_ from the base language!
### Positional Parameters
Please note that HTML Tidy does not current support positional parameters. Due
Please note that HTML Tidy does not currently support positional parameters. Due
to the nature of most of Tidy's output, it's not expected that they will be
required. In any case, please translate strings so that substitution values are
in the same order as the original string.