Add urlh(1) and urlmt(1)

author: Tom Ryder <tom@sanctum.geek.nz> 2016-08-14 14:25:10 +1200
committer: Tom Ryder <tom@sanctum.geek.nz> 2016-08-14 14:25:10 +1200
commit: f10421a52b3c11e2adbdbd36d87095cab676e656 (patch)
tree: f357d00241aff885f515d982c51756fb3754c9ee
parent: Fix typo (diff)
download: dotfiles-f10421a52b3c11e2adbdbd36d87095cab676e656.tar.gz dotfiles-f10421a52b3c11e2adbdbd36d87095cab676e656.zip
5 files changed, 67 insertions, 1 deletions
diff --git a/README.markdown b/README.markdown
index 6010dda6..60f68df4 100644
--- a/README.markdown
+++ b/README.markdown
@@ -298,7 +298,7 @@ Installed by the `install-bin` target:
         output.
     *   `sta(1)` runs a command on multiple hosts read from `sls(1)` and prints
         the hostname if the command returns zero.
-*   Three URL-related shortcut scripts:
+*   Five URL-related shortcut scripts:
     *   `hurl(1)` extracts values of `href` attributes of `<a>` tags, sorts
         them uniquely, and writes them to `stdout`; it requires
         [pup](https://github.com/ericchiang/pup).
@@ -307,6 +307,10 @@ Installed by the `install-bin` target:
     *   `urlc(1)` accepts a list of URLs on `stdin` and writes error messages
         to `stderr` if any of the URLs are broken, redirecting, or are insecure
         and have working secure versions; requires `curl(1)`.
+    *   `urlh(1)` prints the values for a given HTTP header from a HEAD
+        response.
+    *   `urlmt(1)` prints the MIME type from the `Content-Type` header as
+        retrieved by `urlh(1)`.
 *   Three RFC-related shortcut scripts:
     *   `rfcf(1)` fetches ASCII RFCs from the IETF website.
     *   `rfct(1)` formats ASCII RFCs.
diff --git a/bin/urlh b/bin/urlh
new file mode 100755
index 00000000..a9ea93fd
--- /dev/null
+++ b/bin/urlh
@@ -0,0 +1,27 @@
+#!/bin/sh
+# Get values for HTTP headers for the given URL
+
+# Check arguments
+if [ "$#" -ne 2 ] ; then
+    printf 'urlh: Need a URL and a header name\n' >&2
+    exit 2
+fi
+url=$1 header=$2
+
+# Run cURL header request
+curl -I -- "$url" |
+
+# Unfold the headers
+unf |
+
+# Use awk to find any values for the header case-insensitively
+awk -v header="$header" '
+BEGIN {
+    FS=": *"
+    header = tolower(header)
+}
+tolower($1) == header {
+    sub(/^[^ ]*: */, "")
+    print
+}
+'
diff --git a/bin/urlmt b/bin/urlmt
new file mode 100755
index 00000000..465ff588
--- /dev/null
+++ b/bin/urlmt
@@ -0,0 +1,3 @@
+#!/bin/sh
+# Get the MIME type for a given URL
+urlh "$1" Content-Type | sed 's/; .*//'
diff --git a/man/man1/urlh.1 b/man/man1/urlh.1
new file mode 100644
index 00000000..5066c7d0
--- /dev/null
+++ b/man/man1/urlh.1
@@ -0,0 +1,17 @@
+.TH URLH 1 "August 2016" "Manual page for urlh"
+.SH NAME
+.B urlh
+\- search for URL header values by name
+.SH SYNOPSIS
+.B urlh
+https://www.sanctum.geek.nz/
+Content-Type
+.SH DESCRIPTION
+.B urlh
+makes a cURL HEAD request for the given URL, and searches the headers for a key
+matching the given name, case-insensitively. It prints any matching values to
+stdout.
+.SH SEE ALSO
+curl(1), urlmt(1)
+.SH AUTHOR
+Tom Ryder <tom@sanctum.geek.nz>
diff --git a/man/man1/urlmt.1 b/man/man1/urlmt.1
new file mode 100644
index 00000000..843f7d81
--- /dev/null
+++ b/man/man1/urlmt.1
@@ -0,0 +1,15 @@
+.TH URLMT 1 "August 2016" "Manual page for urlmt"
+.SH NAME
+.B urlmt
+\- try to get the MIME type of the document at the given URL
+.SH SYNOPSIS
+.B urlmt
+https://www.sanctum.geek.nz/
+.SH DESCRIPTION
+.B urlmt
+uses urlh(1) to search for a Content-Type header for the given URL, and prints
+it with any trailing data (e.g. charset) trimmed off.
+.SH SEE ALSO
+curl(1), urlh(1)
+.SH AUTHOR
+Tom Ryder <tom@sanctum.geek.nz>
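
The header matching in bin/urlh and the trimming in bin/urlmt can be tried without network access. The following is a minimal sketch, not code from this commit: the function names header_values and mime_type are hypothetical, and the canned headers stand in for the output of `curl -I`. Two details are assumptions added here rather than behaviour of the committed scripts: the sketch strips the carriage return that ends each HTTP header line, and its sed pattern also accepts a Content-Type value with no space after the semicolon.

```sh
#!/bin/sh
# Hypothetical helper (not part of the commit): read raw HTTP response
# headers on stdin and print the values of the header named in $1,
# matching the name case-insensitively as bin/urlh does.
header_values() {
    awk -v header="$1" '
    BEGIN {
        FS = ": *"
        header = tolower(header)
    }
    {
        # Assumption: strip the trailing carriage return of a CRLF line;
        # changing $0 makes awk split the fields again
        sub(/\r$/, "")
    }
    tolower($1) == header {
        sub(/^[^ ]*: */, "")
        print
    }
    '
}

# Hypothetical helper: trim parameters such as charset from a
# Content-Type value, with or without a space after the semicolon
mime_type() {
    sed 's/ *;.*//'
}

# Canned response headers standing in for the output of `curl -I`
printf 'HTTP/1.1 200 OK\r\nContent-Type: text/html; charset=utf-8\r\nServer: nginx\r\n\r\n' |
    header_values content-type |
    mime_type
```

Run as written, the pipeline at the end prints `text/html`. Swapping the printf for `curl -sI -- "$url"` would exercise the same functions against a live server.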