A New SSG

Tags: code

Pandoc can concatenate multiple Markdown (.md) files into a single index.html. The Bash script below automates this: it regenerates index.html from a set of dated Markdown posts, using Pandoc for the conversion.

Step-by-step Explanation

Variable Setup
OUTPUT_FILE: Output HTML file (index.html)
BIB_FILE, CSL_FILE, TEMPLATE: Files for bibliography, citation style, and HTML template
Check for Updates
Gets the last modified time of index.html (or sets to 0 if it doesn’t exist).
Finds all Markdown files matching the pattern YYYY-MM-DD-*.md that are newer than index.html.
Early Exit
If no Markdown files are newer than index.html, the script exits early to avoid unnecessary work.
HTML Header
Writes a standard HTML header to index.html.
Markdown Conversion
Loops through all Markdown files (newest first, for consistency).
Uses Pandoc to convert each Markdown file to HTML, applying the template, bibliography, and citation style.
Appends the converted HTML to index.html.
HTML Footer
Adds the closing </body></html> tags to the output file.
Completion Message
Prints a message indicating that index.html was generated.
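
The update check and early exit described above can also be sketched without stat, using Bash's -nt file test. This is only an illustration under stated assumptions: needs_rebuild is a hypothetical helper name that does not appear in the script, and it is assumed to run in the directory that holds the posts.

```bash
#!/bin/bash

# Hypothetical helper (not part of the original script).
# Returns 0 if at least one dated post exists and the output file is
# missing or older than that post; returns 1 otherwise.
needs_rebuild() {
  local output="$1"
  local post
  for post in [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]-*.md; do
    # If the glob matched nothing, the literal pattern is left over; skip it.
    [[ -e "$post" ]] || continue
    # -nt is true when $post was modified more recently than $output.
    if [[ ! -e "$output" || "$post" -nt "$output" ]]; then
      return 0
    fi
  done
  return 1
}
```

It could be used as `needs_rebuild "index.html" || exit 0`, which mirrors the early exit: with no posts, or no post newer than the output, the helper reports that nothing needs doing.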

Even if only some files are updated, the script reprocesses all Markdown files so that the ordering stays consistent.
Post ordering relies on the file naming convention (YYYY-MM-DD-*.md).
Pandoc and the specified template, bibliography, and citation style files must exist.
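
Since the script depends on Pandoc and its supporting files, a guard along the following lines could fail early with a readable message. This is a minimal sketch, not something the script itself does: check_prerequisites is a hypothetical helper name, and the files to check are passed in as arguments.

```bash
#!/bin/bash

# Hypothetical helper (not part of the original script).
# Reports every missing prerequisite on stderr and returns 1 if any is missing.
check_prerequisites() {
  local missing=0
  local f
  # command -v succeeds only if pandoc can be found on PATH.
  if ! command -v pandoc >/dev/null 2>&1; then
    echo "pandoc not found on PATH" >&2
    missing=1
  fi
  # Every remaining argument is treated as a required regular file.
  for f in "$@"; do
    if [[ ! -f "$f" ]]; then
      echo "required file missing: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Near the top of the script it might be called as `check_prerequisites "$TEMPLATE" "$BIB_FILE" "$CSL_FILE" || exit 1`, so that a missing template or bibliography stops the run before index.html is overwritten.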

#!/bin/bash

OUTPUT_FILE="index.html"
BIB_FILE="refs.json"
CSL_FILE="apa.csl"
TEMPLATE="template.html"

# Get last modified time of OUTPUT_FILE, or 0 if it doesn't exist.
# Note: `stat -f %m` is the BSD/macOS form; GNU/Linux stat uses `stat -c %Y`.
if [[ -f "$OUTPUT_FILE" ]]; then
  OUTPUT_FILE_TIME=$(stat -f %m "$OUTPUT_FILE")
else
  OUTPUT_FILE_TIME=0
fi

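# Find all markdown files newer than OUTPUT_FILE
FILES_TO_PROCESS=()
# Iterate over the glob directly; parsing `ls` output breaks on unusual filenames
for file in [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]-*.md; do
  [[ -e "$file" ]] || continue  # the pattern matched nothing
  FILE_TIME=$(stat -f %m "$file")
  if (( FILE_TIME > OUTPUT_FILE_TIME )); then
    FILES_TO_PROCESS+=("$file")
  fi
done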

# Exit early if no new or modified files
if [[ ${#FILES_TO_PROCESS[@]} -eq 0 ]]; then
  echo "No new or updated posts. Skipping $OUTPUT_FILE generation."
  exit 0
fi

# Write the HTML header once
cat <<EOF > "$OUTPUT_FILE"
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
  <link rel="stylesheet" href="style.css" />
  <link rel="icon" type="image/png" href="favicon.png" />
  <title>Codex</title>
</head>
<body>
EOF

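# Process all files (even unmodified ones) in newest-first order for consistency.
# The glob expands in ascending name order, so walk the array backwards
# instead of parsing the output of `ls -r`.
POSTS=( [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]-*.md )
for (( i = ${#POSTS[@]} - 1; i >= 0; i-- )); do
  file="${POSTS[i]}"
  pandoc "$file" \
    --from markdown \
    --template="$TEMPLATE" \
    --to html \
    --citeproc \
    --bibliography="$BIB_FILE" \
    --csl="$CSL_FILE" \
    >> "$OUTPUT_FILE"
  echo "" >> "$OUTPUT_FILE"
done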

# Write the HTML footer
echo "</body></html>" >> "$OUTPUT_FILE"

echo "Generated $OUTPUT_FILE with updated posts."

Summary

This script rebuilds an HTML index page from Markdown blog posts only when a post is new or has been updated. When it does run, it regenerates every post in newest-first order, so the output formatting and ordering stay consistent.