Many servers will return errors (e.g. 400/403) to requests that do not
set a User-Agent header. This results in issues in both the link_checker
and load_data components. With the link_checker these are false positive
dead links. In load_data, remote data fails to be fetched. To mitigate
this issue, this sets a default User-Agent of
$CARGO_PKG_NAME/$CARGO_PKG_VERSION
Note that the root cause of this regression from zola v0.9.0 is that
reqwest 0.10 changed their default behavior and no longer sets a
User-Agent by default:
https://github.com/seanmonstar/reqwest/pull/751Fixes#950.
For the site integration tests, we have a file of common code which is
used by multiple files in `tests/`. However, not all functions in
this file are used by all files in `tests/`.
As Cargo compiles each `tests/*.rs` file as a separate crate, this
means that some of these crates end up with unused code. Rust notices
this and prints a warning.
Let's tell Rust that we don't care about dead code in this file so
that the warning is not printed.
* Detect empty links on markdown rendering and issue an error
* Add a test for empty links stopping rendering with an error
* Assert error message is the expected one
When testing for empty links detection compare the error message
to make sure it's the correct error that stopped the process
and not some unrelated issue.
The issue with the check_site test hanging and timing out seems to
be related to a similar reqwest issue, which was ultimately due to
an upstream bug in tokio and may be fixed in tokio 0.2.7 onward.
* Restore #![feature(test)] and extern crate test; statements, which
were mistakenly removed as part of the Rust 2018 edition migration.
* Fix rendering benchmark's usage of RenderContext. 6 parameters were
provided when 5 were expected.
* Treat 304 (Not Modified) requests as valid.
* Add tests for 301-to-200 links, 301-to-404 links, and 500 links.
This helps to test redirections and the previously-added
response.status() checking for non-success status codes in check_url().
* Make names for HTTP mock paths unique, to avoid weird behavior. It
seems like mocks with the same path can potentially bleed between
tests, so you may end up with an unexpected response which causes the
test to sometimes pass and sometimes fail.
* Fix Clippy warnings about String::from(format!()).
Certain tests involving HTTP requests were sometimes hanging
indefinitely, so this uses Mockito for HTTP mocking. This seemingly
resolves the issue and makes these tests more reliable.
The existing can_fail_404_links test has been renamed to
can_fail_unresolved_links, to represent what actually occurs in the
test. The can_fail_404_links test now deals with a proper 404
response.
Just to be clear, the check_site test in the site component will
still create outgoing HTTP requests (due to the URLs used in the
test_site), so this commit only uses HTTP mocking where possible.
The can_fail_404_links() test doesn't encounter a 404 response in
actuality, since the google.comys domain doesn't resolve. When the
test is updated such that the response's status code is a 404, the
test fails because the check_url() function doesn't handle
non-success responses how the test's assertions expect. This commit
updates check_url() to handle non-success responses, treating them
much like errors.
* maybe_slugify() only does simple sanitation if config.slugify is false
* slugify is disabled by default, turn on for backwards-compatibility
* First docs changes for optional slugification
* Remove # from slugs but not &
* Add/fix tests for utf8 slugs
* Fix test sites for i18n slugs
* fix templates tests for i18n slugs
* Rename slugify setting to slugify_paths
* Default slugify_paths
* Update documentation for slugify_paths
* quasi_slugify removes ?, /, # and newlines
* Remove forbidden NTFS chars in quasi_slugify()
* Slugification forbidden chars can be configured
* Remove trailing dot/space in quasi_slugify
* Fix NTFS path sanitation
* Revert configurable slugification charset
* Remove \r for windows newlines and \t tabulations in quasi_slugify()
* Update docs for output paths
* Replace slugify with slugify_paths
* Fix test
* Default to not slugifying
* Move slugs utils to utils crate
* Use slugify_paths for anchors as well
* Add path to `TranslatedContent`
This makes it possible to retrieve the translated page through the `get_page` function.
* Use TranslatedContent::path field in test_site_i18n
Use it with the `get_page` function to get a reference to the page object.
* Compute canonical path before adjusting parent path
* Don't use adjusted `parent` to recalculate `canonical` in `find_language`
* Add regression tests
- Test for correct canonical field when calling `new_page`
- Test for correct canonical field after calling `find_language`
* feat(pagination): Add `total_pages` in paginator object
* feat(pagination): Added doc for `total_pages`
* feat(pagination): Added test for `total_pages`
"[…] `&` normally indicates the start of a character entity reference or
numeric character reference; writing it as `&` […] allows `&` to be
included in the content of an element or in the value of an attribute."
From: https://en.wikipedia.org/wiki/HTML#Character_and_entity_references
Links that start with a scheme (e.g., `tel:18008675309`) inadvertently
had a URL prefix prepended. Previously, only `mailto:` was handled, but
given the sheer number of [registered URI schemes][uri-schemes], a loose
pattern matcher is used to detect schemes instead.
External links, as identified by the renderer, are now limited to `http`
and `https` schemes.
Fixes#747 and fixes#816.
[uri-schemes]: https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml
* Add anchor existant checking to link_checker component
* Oops, forgot some changes
* Drop scraper dependency and rework tests
* Handle name attributes
The access to translations is not straightforward and requires checks if
language and key exists. It is better to forbit direct access to
attribute and provide method - `get_translation()` - that will handle
all details of key translations.
Remove unit tests that use direct access and test only public method.
* fix the issue of generating the search index for multiple language
* updat docs for generating the search index for multiple language
* fix failed tests
* add tests for the search index of multiple language
Add method get_translation(lang, key) into Config struct that retrieves
translated term from parsed configuration or error when either
desired language or key is missing.
Use the new method in Trans struct implementing global Tera function
trans().
Add unit test to cover both happy and error path for translation
retrieval in both config and templates crate.
These functions expect that file_path can have base_path stripped from
it, but during reloading they can be given relative paths. Maybe this
behaviour varies between the notify backends?
This fixes two zola serve panics on FreeBSD (poll backend).
Clippy is returning some warnings. Let's fix or explicitly ignore
them. In particular:
- In `components/imageproc/src/lib.rs`, we implement `Hash` explicitly
but derive `PartialEq`. We need to maintain the property that two
keys being equal implies the hashes of those two keys are equal.
Our `Hash` implementations preserve this, so we'll explicitly ignore
the warnings.
- In `components/site/src/lib.rs`, we were calling `.into()` on some
values that are already of the correct type.
- In `components/site/src/lib.rs`, we were using `.map(|x| *x)` in
iterator chains to remove a level of indirection; we can instead say
`.copied()` (introduced in Rust v1.36) or `.cloned()`. Using
`.copied` here is better from a type-checking point of view, but
we'll use `.cloned` for now as Rust v1.36 was only recently
released.
- In `components/templates/src/filters.rs` and
`components/utils/src/site.rs`, we were taking `HashMap`s as
function arguments but not generically accepting alternate `Hasher`
implementations.
- In `src/cmd/check.rs`, we use `env::current_dir()` as a default
value, but our use of `unwrap_or` meant that we would always
retrieve the current directory even when not needed.
- In `components/errors/src/lib.rs`, we can use `if let` rather than
`match`.
- In `components/library/src/content/page.rs`, we can collapse a
nested conditional into `else if let ...`.
- In `components/library/src/sorting.rs`, a function takes `&&Page`
arguments. Clippy warns about this for efficiency reasons, but
we're doing it here to match a particular sorting API, so we'll
explicitly ignore the warning.
* Add hard_link_static config option.
* Copy or hardlink file depending on an argument.
Modify the call sites for `copy_file` to account for the extra argument.
* Plug the config setting through to copy_file.
Don't apply the config option to theme's static directory.
* Update documentation.
* Backticks make no sense in this comment.
* Addressing PR comments.
* Be consistent with argument naming.
* chore: Update glob to 0.3
Signed-off-by: Igor Gnatenko <i.gnatenko.brain@gmail.com>
* chore: Update ws to 0.8
Signed-off-by: Igor Gnatenko <i.gnatenko.brain@gmail.com>
* Add check subcommand
* Add some brief documentation for the check subcommand
* Start working on parallel link checks
* Check all external links in Site
* Return *all* dead links in site
Justification for this feature is added in the docs.
Precedent for the precise syntax: Hugo.
Hugo puts this syntax behind a preference named headerIds, and automatic
header ID generation behind a preference named autoHeaderIds, with both
enabled by default. I have not implemented a switch to disable this.
My suggestion for a workaround for the improbable case of desiring a
literal “{#…}” at the end of a header is to replace `}` with `}`.
The algorithm I have used is not identical to [that
which Hugo uses][0], because Hugo’s looks to work at the source level,
whereas here we work at the pulldown-cmark event level, which is
generally more sane, but potentially limiting for extremely esoteric
IDs.
Practical differences in implementation from Hugo (based purely on
reading [blackfriday’s implementation][0], not actually trying it):
- I believe Hugo would treat `# Foo {#*bar*}` as a heading with text
“Foo” and ID `*bar*`, since it is working at the source level; whereas
this code turns it into a heading with HTML `Foo {#<em>bar</em>}`, as
it works at the pulldown-cmark event level and doesn’t go out of its
way to make that work (I’m not familiar with pulldown-cmark, but I get
the impression that you could make it work Hugo’s way on this point).
The difference should be negligible: only *very* esoteric hashes would
include magic Markdown characters.
- Hugo will automatically generate an ID for `{#}`, whereas what I’ve
coded here will yield a blank ID instead (which feels more correct to
me—`None` versus `Some("")`, and all that).
In practice the results should be identical.
Fixes#433.
[0]: a477dd1646/block.go (L218-L234)
* Change the behavior of the template rendering:
* Check if the template bare name is present
* Check if the template is part of a theme
* Fallback to defaults
* Change the behavior of the shortcode rendering:
* Call the template rendering function
* Prepend `__zola_builtins/` to most of the default elements in `ZOLA_TERA`
* Add a test to verify the presence and content of a `404.html` page
from a theme's template