Feature: allow spaces between dashes in filename for page (#1323)

* Allow optional whitespace around dash/underscore in filename

Allow file names that are as follows:

    2021-01-01 - test.md

To be parsed the same as if they were

    2021-01-01-test.md

The slug for both will now just be "test" instead of previously the
first example would have become "2021-01-01-test".

* Add documentation for optional whitespace in filename

* Test that updated regex does not take space after dash
This commit is contained in:
Bert JW Regeer 2021-03-22 12:00:25 -07:00 committed by GitHub
parent 958ec2a758
commit 5bf9ebc43a
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
2 changed files with 38 additions and 3 deletions

View file

@ -25,7 +25,7 @@ lazy_static! {
// Based on https://regex101.com/r/H2n38Z/1/tests
// A regex parsing RFC3339 date followed by {_,-}, some characters and ended by .md
static ref RFC3339_DATE: Regex = Regex::new(
r"^(?P<datetime>(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])(T([01][0-9]|2[0-3]):([0-5][0-9]):([0-5][0-9]|60)(\.[0-9]+)?(Z|(\+|-)([01][0-9]|2[0-3]):([0-5][0-9])))?)(_|-)(?P<slug>.+$)"
r"^(?P<datetime>(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])(T([01][0-9]|2[0-3]):([0-5][0-9]):([0-5][0-9]|60)(\.[0-9]+)?(Z|(\+|-)([01][0-9]|2[0-3]):([0-5][0-9])))?)\s?(_|-)(?P<slug>.+$)"
).unwrap();
}
@ -703,6 +703,41 @@ Hello world
assert_eq!(page.slug, " こんにちは");
}
#[test]
fn can_get_date_from_filename_with_spaces() {
let config = Config::default();
let content = r#"
+++
+++
Hello world
<!-- more -->"#
.to_string();
let res = Page::parse(Path::new("2018-10-08 - hello.md"), &content, &config, &PathBuf::new());
assert!(res.is_ok());
let page = res.unwrap();
assert_eq!(page.meta.date, Some("2018-10-08".to_string()));
assert_eq!(page.slug, "hello");
}
#[test]
fn can_get_date_from_filename_with_spaces_respects_slugification() {
let mut config = Config::default();
config.slugify.paths = SlugifyStrategy::Off;
let content = r#"
+++
+++
Hello world
<!-- more -->"#
.to_string();
let res = Page::parse(Path::new("2018-10-08 - hello.md"), &content, &config, &PathBuf::new());
assert!(res.is_ok());
let page = res.unwrap();
assert_eq!(page.meta.date, Some("2018-10-08".to_string()));
assert_eq!(page.slug, " hello");
}
#[test]
fn can_get_date_from_full_rfc3339_date_in_filename() {
let config = Config::default();

View file

@ -65,12 +65,12 @@ When the article's output path is not specified in the frontmatter, it is extrac
- if the filename is `index.md`, its parent folder name (`bar`) is used as output path
- otherwise, the output path is extracted from `thing` (the filename without the `.md` extension)
If the path found starts with a datetime string (`YYYY-mm-dd` or [a RFC3339 datetime](https://www.ietf.org/rfc/rfc3339.txt)) followed by an underscore (`_`) or a dash (`-`), this date is removed from the output path and will be used as the page date (unless already set in the front-matter). Note that the full RFC3339 datetime contains colons, which is not a valid character in a filename on Windows.
If the path found starts with a datetime string (`YYYY-mm-dd` or [a RFC3339 datetime](https://www.ietf.org/rfc/rfc3339.txt)) followed by optional whitespace and then an underscore (`_`) or a dash (`-`), this date is removed from the output path and will be used as the page date (unless already set in the front-matter). Note that the full RFC3339 datetime contains colons, which is not a valid character in a filename on Windows.
The output path extracted from the file path is then slugified or not, depending on the `slugify.paths` config, as explained previously.
**Example:**
The file `content/blog/2018-10-10-hello-world.md` will yield a page at `[base_url]/blog/hello-world`.
The file `content/blog/2018-10-10-hello-world.md` will yield a page at `[base_url]/blog/hello-world`. With optional whitespace, the file `content/blog/2021-01-23 -hello new world.md` will yield a page at `[base_url]/blog/hello-new-world`
## Front matter