Fix blog, new article "gptar-update.md"

Reynir Björnsson 2024-10-29 09:02:43 +01:00
parent a2722c0947
commit dfbb3577ed
2 changed files with 118 additions and 33 deletions

posts/gptar-update.md Normal file

@@ -0,0 +1,100 @@
---
title: GPTar (update)
date: 2024-10-28
description: libarchive vs hybrid GUID partition table and GNU tar volume header
---
In a [previous post][gptar-post] I described how I craft a hybrid GUID partition table (GPT) and tar archive.
The trick exploits that the areas of a 512-byte *block* that matter to a tar header and to the *protective* master boot record used in GPT are disjoint.
For context, I recommend reading that post first if you haven't already.
After writing the above post I read an excellent and fun *and totally normal* article by Emily on how [she created **executable** tar archives][tar-executable].
Therein I learned a clever hack:
GNU tar has a tar extension for *volume headers*.
These are essentially labels for your tape archives when you're forced to split an archive across multiple tapes.
They can (seemingly) hold any text as a label, including shell scripts.
What's more, GNU tar and bsdtar **do not** extract these as files!
This is excellent, because I don't actually want to extract or list the GPT header when using GNU tar or bsdtar.
This prompted me to [use a different link indicator](https://github.com/reynir/gptar/pull/1) for the `GPTAR` entry: that of a GNU volume header.
This worked pretty great.
Listing the archive using GNU tar I still get `GPTAR`, but with verbose listing it's displayed as a `--Volume Header--`:
```
$ tar -tvf disk.img
Vr-------- 0/0 16896 1970-01-01 01:00 GPTAR--Volume Header--
-rw-r--r-- 0/0 14 1970-01-01 01:00 test.txt
```
And more importantly the `GPTAR` entry is ignored when extracting:
```
$ mkdir tmp
$ cd tmp/
$ tar -xf ../disk.img
$ ls
test.txt
```
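
For reference, the link indicator is the single type byte at offset 156 of a 512-byte tar header block, and GNU tar uses the value `V` to mark a volume header. Below is a minimal C sketch of that check. The function and macro names are made up for illustration and are not taken from GNU tar, libarchive, or gptar.

```c
#include <stddef.h>

/* Offset of the one-byte type flag ("link indicator") in a tar header block. */
#define TAR_TYPEFLAG_OFFSET 156
/* GNU tar's type flag value for a volume header. */
#define GNU_TYPE_VOLHDR 'V'

/* Hypothetical helper: returns 1 if the 512-byte header block is marked as a
 * GNU volume header, 0 otherwise. It does not validate the checksum or any
 * other header field. */
int is_gnu_volume_header(const unsigned char block[512])
{
    return block[TAR_TYPEFLAG_OFFSET] == GNU_TYPE_VOLHDR;
}
```

A reader that sees this flag knows not to create a file on disk for the entry, which is why `GPTAR` is absent after extraction.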
## BSD tar / libarchive
Unfortunately, this broke bsdtar!
```
$ bsdtar -tf disk.img
bsdtar: Damaged tar archive
bsdtar: Error exit delayed from previous errors.
```
This is annoying because we run FreeBSD on the host for [opam.robur.coop](https://opam.robur.coop), our instance of [opam-mirror][opam-mirror].
This autumn we updated [opam-mirror][opam-mirror] to use the hybrid GPT+tar GPTar *tartition table*[^tartition] instead of disk offsets for the different partitions that were either hard-coded or passed as boot parameters - which was extremely brittle!
So we were no longer able to inspect the contents of the tar partition from the host!
Unacceptable!
So I started to dig into libarchive where bsdtar comes from.
To my surprise, after building bsdtar from the git clone of the source code it ran perfectly fine!
```
$ ./bsdtar -tf ../gptar/disk.img
test.txt
```
I eventually figured out that [this change][libarchive-pr] fixed it for me.
I got in touch with Emily to let her know that bsdtar recently fixed this (ab)use of GNU volume headers.
Her reply was basically "as of when I wrote the article, I was pretty sure bsdtar ignored it."
And indeed it did.
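Examining the diff further revealed that bsdtar already ignored the GNU volume header - just not "correctly" when the volume header is abused to carry file content as I did.
The old code skipped only the header block and went straight on to read the next header, whereas the new code also skips the data blocks indicated by the header's size field: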
```diff
/*
* Interpret 'V' GNU tar volume header.
*/
static int
header_volume(struct archive_read *a, struct tar *tar,
struct archive_entry *entry, const void *h, size_t *unconsumed)
{
- (void)h;
+ const struct archive_entry_header_ustar *header;
+ int64_t size, to_consume;
+
+ (void)a; /* UNUSED */
+ (void)tar; /* UNUSED */
+ (void)entry; /* UNUSED */
- /* Just skip this and read the next header. */
- return (tar_read_header(a, tar, entry, unconsumed));
+ header = (const struct archive_entry_header_ustar *)h;
+ size = tar_atol(header->size, sizeof(header->size));
+ to_consume = ((size + 511) & ~511);
+ *unconsumed += to_consume;
+ return (ARCHIVE_OK);
}
```
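
The expression `(size + 511) & ~511` in the diff above rounds the entry size up to the next multiple of 512, since tar stores data in whole 512-byte blocks. Here is a small standalone sketch of that arithmetic in C. `parse_octal_size` is a simplified, hypothetical stand-in for libarchive's `tar_atol` that only handles plain octal digits, and `round_up_to_block` is a name invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical stand-in for libarchive's tar_atol: parses an
 * octal number from a fixed-width header field, skipping leading spaces and
 * stopping at the first byte that is not an octal digit. */
int64_t parse_octal_size(const char *field, size_t len)
{
    int64_t value = 0;
    size_t i = 0;

    while (i < len && field[i] == ' ')
        i++;
    while (i < len && field[i] >= '0' && field[i] <= '7') {
        value = value * 8 + (field[i] - '0');
        i++;
    }
    return value;
}

/* Rounds size up to the next multiple of 512, the tar block size.
 * Adding 511 and clearing the low 9 bits is the same trick as in the diff. */
int64_t round_up_to_block(int64_t size)
{
    return (size + 511) & ~(int64_t)511;
}
```

With these helpers, the 14-byte `test.txt` from the listing occupies one full 512-byte block, while the 16896-byte `GPTAR` entry is already block-aligned (33 blocks), so `round_up_to_block` returns it unchanged.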
So thanks to the above change we can expect a release of libarchive supporting further flavors of abuse of GNU volume headers!
🥳

[gptar-post]: gptar.html
[tar-executable]: https://uni.horse/executable-tarballs.html
[opam-mirror]: https://git.robur.coop/robur/opam-mirror/
[libarchive-pr]: https://github.com/libarchive/libarchive/pull/2127
[^tartition]: Emily came up with the term "tartition table", which is much better than what I had come up with - "GPTar".


@@ -211,16 +211,18 @@ module Page = struct
method charset : string option
method description : string option
method tags : string list
method head_extra : string option
method with_host : string -> 'self
method get_host : string option
end
class page ?title ?description ?charset ?(tags = []) () =
class page ?title ?description ?charset ?(tags = []) ?head_extra () =
object (_ : #t)
method title = title
method charset = charset
method description = description
method tags = tags
method head_extra = head_extra
val host = None
method with_host v = {< host = Some v >}
method get_host = host
@@ -233,8 +235,9 @@ module Page = struct
let+ title = optional fields "title" string
and+ description = optional fields "description" string
and+ charset = optional fields "charset" string
and+ tags = optional_or fields ~default:[] "tags" (list_of string) in
new page ?title ?description ?charset ~tags ()
and+ tags = optional_or fields ~default:[] "tags" (list_of string)
and+ head_extra = optional fields "head_extra" string in
new page ?title ?description ?charset ~tags ?head_extra ()
let validate =
let open Data.Validation in
@@ -246,6 +249,7 @@ module Page = struct
("title", (option string) p#title);
("charset", (option string) p#charset);
("description", (option string) p#description);
("head_extra", option string p#head_extra);
]
end
@@ -373,15 +377,17 @@ module Articles = struct
method title : string option
method description : string option
method articles : (Path.t * Article.t) list
method head_extra : string option
method with_host : string -> 'self
method get_host : string option
end
class articles ?title ?description articles =
class articles ?title ?description ?head_extra articles =
object (_ : #t)
method title = title
method description = description
method articles = articles
method head_extra = head_extra
val host = None
method with_host v = {< host = Some v >}
method get_host = host
@@ -418,11 +424,11 @@ module Articles = struct
|>> second
(fetch (module P) ?increasing ~filter ~on ~where ~compute_link path)
>>> lift (fun (v, articles) ->
new articles ?title:v#title ?description:v#description articles)
new articles ?title:v#title ?description:v#description ?head_extra:v#head_extra articles)
let normalize (ident, article) =
let open Data in
record (("url", string @@ Path.to_string ident) :: Article.normalize article)
record (("location", string @@ Path.to_string ident) :: Article.normalize article)
let normalize obj =
let open Data in
@@ -431,32 +437,11 @@ module Articles = struct
; ("has_articles", bool @@ ((=) []) obj#articles)
; ("title", option string obj#title)
; ("description", option string obj#description)
; ("head_extra", option string obj#head_extra)
; ("host", option string obj#get_host)
]
end
module Page_with_article = struct
class type t = object ('self)
inherit Page.t
method articles : (Path.t * Article.t) list
end
let normalize_article (ident, article) =
let open Data in
record (("url", string @@ Path.to_string ident) :: Article.normalize article)
let normalize (p : t) =
let open Data in
[
("title", (option string) p#title);
("charset", (option string) p#charset);
("description", (option string) p#description);
("tags", (list_of string) p#tags);
("articles", list_of normalize_article p#articles);
]
end
let is_markdown_file path =
Path.has_extension "md" path ||
Path.has_extension "markdown" path
@@ -541,7 +526,7 @@ struct
Articles.compute_index
(module Yocaml_yaml)
~where:is_markdown_file
~compute_link:(Target.as_html @@ Path.abs [ "articles" ])
~compute_link:(Target.as_html @@ Path.abs [ "posts" ])
Source.articles
in
@@ -556,10 +541,10 @@ struct
Yocaml_jingoo.render ~strict:true
(List.map (fun (k, v) -> k, Yocaml_jingoo.from v) (Articles.normalize articles))
tpl))
>>> Yocaml_cmarkit.content_to_html ~strict:false ()
>>> Yocaml_jingoo.Pipeline.as_template ~strict:true
(module Articles)
(Source.template "layout.html")
>>> Yocaml_cmarkit.content_to_html ~strict:false ()
>>> drop_first ()
end
@@ -594,7 +579,7 @@ struct
~filter
(module Yocaml_yaml)
~where:is_markdown_file
~compute_link:(Target.as_html @@ Path.abs [ "articles" ])
~compute_link:(Target.as_html @@ Path.abs [ "posts" ])
Source.articles
in
@@ -609,10 +594,10 @@ struct
Yocaml_jingoo.render ~strict:true
(List.map (fun (k, v) -> k, Yocaml_jingoo.from v) (Articles.normalize articles))
tpl))
>>> Yocaml_cmarkit.content_to_html ~strict:false ()
>>> Yocaml_jingoo.Pipeline.as_template ~strict:true
(module Articles)
(Source.template "layout.html")
>>> Yocaml_cmarkit.content_to_html ~strict:false ()
>>> drop_first ()
end
@@ -626,7 +611,7 @@ struct
>>> Articles.fetch
(module Yocaml_yaml)
~where:(Path.has_extension "md")
~compute_link:(Target.as_html @@ Path.abs [ "articles" ])
~compute_link:(Target.as_html @@ Path.abs [ "posts" ])
Source.articles
let rss1 =