zola/components/site/benches/gen.py
Chris Morgan e25915b231 Support and default to generating Atom feeds
This includes several breaking changes, but they’re easy to adjust for.

Atom 1.0 is superior to RSS 2.0 in a number of ways, both technical and
legal, though information from the last decade is hard to find.
http://www.intertwingly.net/wiki/pie/Rss20AndAtom10Compared
has some info which is probably still mostly correct.

How do RSS and Atom compare in terms of implementation support? The
impression I get is that proper Atom support in normal content websites
has been universal for over twelve years, but that support in podcasts
was not quite so good, but getting there, over twelve years ago. I have
no more recent facts or figures; no one talks about this stuff these
days. I remember investigating this stuff back in 2011–2013 and coming
to the same conclusion. At that time, I went with Atom on websites and
RSS in podcasts. Now I’d just go full Atom and hang any podcast tools
that don’t support Atom, because Atom’s semantics truly are much better.

In light of all this, I make the bold recommendation to default to Atom.

Nonetheless, for compatibility for existing users, and for those that
have Opinions, I’ve retained the RSS template, so that you can escape
the breaking change easily.

I personally prefer to give feeds a basename that doesn’t mention “Atom”
or “RSS”, e.g. “feed.xml”. I’ll be doing that myself, as I’ll be using
my own template with more Atom features anyway, like author information,
taxonomies and making the title field HTML.

Some notes about the Atom feed template:

- I went with atom.xml rather than something like feed.atom (the .atom
  file format being registered for this purpose by RFC4287) due to lack
  of confidence that it’ll be served with the right MIME type. .xml is a
  safer default.

- It might be nice to get Zola’s version number into the <generator>
  tag. Not for any particularly good reason, y’know. Just picture it:

    <generator uri="https://www.getzola.org/" version="0.10.0">
	Zola
    </generator>

- I’d like to get taxonomies into the feed, but this requires exposing a
  little more info than is currently exposed. I think it’d require
  `TaxonomyConfig` to preferably have a new member `permalink` added
  (which should be equivalent to something like `config.base_url ~ "/" ~
  taxonomy.slug ~ "/"`), and for the feed to get all the taxonomies
  passed into it (`taxonomies: HashMap<String, TaxonomyTerm>`).
  Then, the template could be like this, inside the entry:

    {% for taxonomy, terms in page.taxonomies %}
        {% for term in terms %}
            <category scheme="{{ taxonomies[taxonomy].permalink }}"
		term="{{ term.slug }}" label="{{ term.name }}" />
	{% endfor %}
    {% endfor %}

Other remarks:

- I have added a date field `extra.updated` to my posts and include that
  in the feed; I’ve observed others with a similar field. I believe this
  should be included as an official field. I’m inclined to add author to
  at least config.toml, too, for feeds.
- We need to have a link from the docs to the source of the built-in
  templates, to help people that wish to alter it.
2020-04-14 17:27:08 +05:30

177 lines
5 KiB
Python

"""
Generates test sites for use in benchmark.
Tested with python3 and probably does not work on Windows.
"""
import datetime
import os
import random
import shutil
TAGS = ["a", "b", "c", "d", "e", "f", "g"]
CATEGORIES = ["c1", "c2", "c3", "c4"]
PAGE = """
+++
title = "Hello"
date = REPLACE_DATE
[taxonomies]
tags = REPLACE_TAG
categories = ["REPLACE_CATEGORY"]
+++
# Modus cognitius profanam ne duae virtutis mundi
## Ut vita
Lorem markdownum litora, care ponto nomina, et ut aspicit gelidas sui et
purpureo genuit. Tamen colla venientis [delphina](http://nil-sol.com/ecquis)
Tusci et temptata citaeque curam isto ubi vult vulnere reppulit.
- Seque vidit flendoque de quodam
- Dabit minimos deiecto caputque noctis pluma
- Leti coniunx est Helicen
- Illius pulvereumque Icare inpositos
- Vivunt pereo pluvio tot ramos Olenios gelidis
- Quater teretes natura inde
### A subsection
Protinus dicunt, breve per, et vivacis genus Orphei munere. Me terram [dimittere
casside](http://corpus.org/) pervenit saxo primoque frequentat genuum sorori
praeferre causas Libys. Illud in serpit adsuetam utrimque nunc haberent,
**terrae si** veni! Hectoreis potes sumite [Mavortis retusa](http://tua.org/)
granum captantur potuisse Minervae, frugum.
> Clivo sub inprovisoque nostrum minus fama est, discordia patrem petebat precatur
absumitur, poena per sit. Foramina *tamen cupidine* memor supplex tollentes
dictum unam orbem, Anubis caecae. Viderat formosior tegebat satis, Aethiopasque
sit submisso coniuge tristis ubi!
## Praeceps Corinthus totidem quem crus vultum cape
```rs
#[derive(Debug)]
pub struct Site {
/// The base path of the zola site
pub base_path: PathBuf,
/// The parsed config for the site
pub config: Config,
pub pages: HashMap<PathBuf, Page>,
pub sections: HashMap<PathBuf, Section>,
pub tera: Tera,
live_reload: bool,
output_path: PathBuf,
static_path: PathBuf,
pub tags: Option<Taxonomy>,
pub categories: Option<Taxonomy>,
/// A map of all .md files (section and pages) and their permalink
/// We need that if there are relative links in the content that need to be resolved
pub permalinks: HashMap<String, String>,
}
```
## More stuff
And a shortcode:
{{ youtube(id="my_youtube_id") }}
### Another subsection
Gotta make the toc do a little bit of work
# A big title
- hello
- world
- !
```py
if __name__ == "__main__":
gen_site("basic-blog", [""], 250, paginate=True)
```
"""
def gen_skeleton(name, is_blog):
if os.path.exists(name):
shutil.rmtree(name)
os.makedirs(os.path.join(name, "content"))
os.makedirs(os.path.join(name, "static"))
with open(os.path.join(name, "config.toml"), "w") as f:
if is_blog:
f.write("""
title = "My site"
base_url = "https://replace-this-with-your-url.com"
theme = "sample"
taxonomies = [
{name = "tags", feed = true},
{name = "categories"}
]
[extra.author]
name = "Vincent Prouillet"
""")
else:
f.write("""
title = "My site"
base_url = "https://replace-this-with-your-url.com"
theme = "sample"
[extra.author]
name = "Vincent Prouillet"
""")
# Re-use the test templates
shutil.copytree("../../../test_site/templates", os.path.join(name, "templates"))
shutil.copytree("../../../test_site/themes", os.path.join(name, "themes"))
def gen_section(path, num_pages, is_blog):
with open(os.path.join(path, "_index.md"), "w") as f:
if is_blog:
f.write("""
+++
paginate_by = 5
sort_by = "date"
template = "section_paginated.html"
+++
""")
else:
f.write("+++\n+++\n")
day = datetime.date.today()
for (i, page) in enumerate(range(0, num_pages)):
with open(os.path.join(path, "page-{}.md".format(i)), "w") as f:
f.write(
PAGE
.replace("REPLACE_DATE", str(day + datetime.timedelta(days=1)))
.replace("REPLACE_CATEGORY", random.choice(CATEGORIES))
.replace("REPLACE_TAG", str([random.choice(TAGS), random.choice(TAGS)]))
)
def gen_site(name, sections, num_pages_per_section, is_blog=False):
gen_skeleton(name, is_blog)
for section in sections:
path = os.path.join(name, "content", section) if section else os.path.join(name, "content")
if section:
os.makedirs(path)
gen_section(path, num_pages_per_section, is_blog)
if __name__ == "__main__":
gen_site("small-blog", [""], 30, is_blog=True)
gen_site("medium-blog", [""], 250, is_blog=True)
gen_site("big-blog", [""], 1000, is_blog=True)
gen_site("huge-blog", [""], 10000, is_blog=True)
gen_site("extra-huge-blog", [""], 100000, is_blog=True)
gen_site("small-kb", ["help", "help1", "help2", "help3", "help4", "help5", "help6", "help7", "help8", "help9"], 10)
gen_site("medium-kb", ["help", "help1", "help2", "help3", "help4", "help5", "help6", "help7", "help8", "help9"], 100)
gen_site("huge-kb", ["help", "help1", "help2", "help3", "help4", "help5", "help6", "help7", "help8", "help9"], 1000)