Use Markdown to deliver HTML pages through Apache

If you want to keep things simple, Markdown is one way to deliver a website. Markdown is an easy-to-learn, human-readable way of writing text that a parser can also understand. Most parsers can produce HTML, and some can also produce other formats such as PDF.

Apache lets you define actions to perform on files before delivering them to the requesting client. To create HTML pages from Markdown you therefore don't need fancy redirect chains or a special configuration that parses everything "manually" through PHP, for example. All you need is two directives in your .htaccess file, either in a new file or added to an existing one. They are described in the section on Action and AddHandler.

The main benefit is that you can use Markdown files directly, like HTML or PHP files. You just access them and Apache takes care of the rest. You don't need to manually redirect all requests or do other fancy stuff. Just set up an action, assign it to .md files, and things like https://example.com/my_cool_text.md or using index.md instead of index.html become possible. (The latter works by defining DirectoryIndex index.md index.html index.php in your .htaccess file.)

Apache Action and AddHandler

The two directives below define an action called markdown and assign it, via a file handler, to all files with the suffix .md. The Action directive sends every request for an .md file to the given parser script. The script is given as a URL-path, so it starts with a slash.

Action markdown /path/to/parser/script.php
AddHandler markdown .md

In the script you have access to the translated path, that is, the absolute file name of the requested file, via $_SERVER['PATH_TRANSLATED']. All you need to do now is parse that file with one of the many available Markdown-to-HTML parsers.

Handler script

This example code outputs a fully valid HTML page. How to do this with the parser of your choice is entirely up to you.

<?php
    require_once 'my-cool-parser.php';
    $parser = new MyCoolParser();

    $parser->useFile($_SERVER['PATH_TRANSLATED']);
    $parser->setTarget('HTML');
    $parser->parse();
?>
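As a rough sketch of what such a handler could look like without committing to a specific parser, the following wraps already-converted HTML in a page skeleton and answers with 404 when the requested file is missing. The names render_page and handle_markdown_request are hypothetical, and the converter passed in at the bottom is only a stand-in that escapes the text; replace it with the call to your parser.

```php
<?php
// Hypothetical helper: wrap the parser's HTML output in a complete page.
function render_page(string $title, string $bodyHtml): string
{
    return '<!DOCTYPE html><html lang="en"><head><meta charset="utf-8">'
        . '<title>' . htmlspecialchars($title, ENT_QUOTES, 'UTF-8') . '</title>'
        . '</head><body>' . $bodyHtml . '</body></html>';
}

// Hypothetical helper: returns [HTTP status, HTML] for the requested file.
// $toHtml is any callable that converts Markdown text to an HTML fragment.
function handle_markdown_request(array $server, callable $toHtml): array
{
    $path = $server['PATH_TRANSLATED'] ?? '';
    if ($path === '' || !is_file($path) || !is_readable($path)) {
        return [404, render_page('Not found', '<p>Not found.</p>')];
    }
    $markdown = file_get_contents($path);
    // Use the file name (without .md) as a simple default title.
    return [200, render_page(basename($path, '.md'), $toHtml($markdown))];
}

// Only answer a request when running under a web server.
if (PHP_SAPI !== 'cli') {
    // Stand-in converter (assumption): swap in your Markdown parser here.
    [$status, $html] = handle_markdown_request($_SERVER, function (string $md): string {
        return '<pre>' . htmlspecialchars($md, ENT_QUOTES, 'UTF-8') . '</pre>';
    });
    http_response_code($status);
    echo $html;
}
```

Keeping the conversion behind a callable means the surrounding page logic stays the same when you switch parsers.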

You might want to create the HTML <title> from the first top heading (# My Cool Heading):

$file_content = file_get_contents($_SERVER['PATH_TRANSLATED']);
$title_line = preg_split('#\r?\n#', $file_content, 0)[0];
$title = preg_replace('/^# (.*)$/', '\1', $title_line);

This takes the first line of the file and strips the leading number sign, leaving only the heading text. It assumes that the heading is on the very first line. The result can be used when creating the page. But it's entirely up to you what to do in your parser script. Apache simply does not care and delivers whatever your script returns.
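If the heading is not guaranteed to be on the first line, a slightly more defensive variant could search for the first top-level heading anywhere in the file and fall back to a default. This is only a sketch: the function name extract_title and the fallback text are made up for illustration, and a "# ..." line inside a code block would be picked up as well.

```php
<?php
// Hypothetical helper: return the text of the first "# Heading" line,
// or $fallback when the document has no top-level heading.
function extract_title(string $markdown, string $fallback = 'Untitled'): string
{
    // "m" lets ^ and $ match at line boundaries; \r? tolerates Windows line endings.
    // Requiring whitespace right after the single "#" skips "## Subheadings".
    if (preg_match('/^#[ \t]+(.+?)[ \t]*\r?$/m', $markdown, $match) === 1) {
        return trim($match[1]);
    }
    return $fallback;
}

echo extract_title("# My Cool Heading\n\nSome text."), "\n"; // My Cool Heading
echo extract_title("No heading here."), "\n";                // Untitled
```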

Replace specific code with pre-defined HTML

Markdown does not support certain commonly used HTML features. By pre-processing the raw Markdown code (represented in the examples by the $markdown variable), those features can be added in a “Markdown-y” way, i.e. without parser-specific code but with already existing base Markdown syntax.
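One possible way to organize such pre-processing is a list of small steps that each take the raw Markdown and return the modified text, run in order before the result goes to the parser. The sketch below assumes this layout; preprocess_markdown is a hypothetical name, and the single step shown (normalizing line endings) is just an example of something that helps later patterns that rely on "\n".

```php
<?php
// Hypothetical helper: run each pre-processing step over the raw Markdown in order.
function preprocess_markdown(string $markdown, array $steps): string
{
    foreach ($steps as $step) {
        $markdown = $step($markdown);
    }
    return $markdown;
}

$steps = [
    // Normalize Windows line endings so later patterns can rely on "\n".
    function (string $md): string {
        return str_replace("\r\n", "\n", $md);
    },
    // Further steps, each a preg_replace_callback() over the text, would go here.
];

$markdown = preprocess_markdown("First line\r\nSecond line", $steps);
// $markdown is now ready to be handed to the Markdown parser.
```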

A fenced code block named gallery or figure is turned into an HTML <figure> element. You can add as many images as you want. Each image can have its own description, and the gallery can have a title and a description, too.

The title, the individual image’s information, and the gallery description allow basically everything that can be used in Markdown. But please don’t go too overboard with it :)

You can use either gallery or figure as the block name. Both behave exactly the same; whichever you use is added to the <figure> element as the data-type attribute. The images section gets a data-style attribute, which is multi-image when the block contains more than one image and single-image otherwise.

For code blocks in the gallery description, the alternate notation ~~~ has to be used. Individual image information can’t contain code blocks because that information is single-line only. It can contain other Markdown, though.

The block allows basically everything that can be used in Markdown documents, except a literal ```, which would end the block early. To get a literal ```, escape it like so: \`\`\`

Example

```gallery
[My Cool Gallery]

image_1.jpg This is the first image!
image_2.jpg
image_3.jpg
image_N.jpg As many as you like. *Each image in one line.*

> This is a gallery description.
>
> ~~~
> Here is a code block
> ~~~
>
> The description is shown as `<figcaption>`
```

Result

<figure data-images-count="4" data-type="gallery">
  <h1>My Cool Gallery</h1>
  <section class="images" data-style="multi-image">
    <ol>
      <li>
        <a href="image_1.jpg"><img src="image_1.jpg" /></a>
        <span class="image-info"><p>This is the first image!</p></span>
      </li>
      <li>
        <a href="image_2.jpg"><img src="image_2.jpg" /></a>
      </li>
      [and so on]
    </ol>
  </section>
  <figcaption>
    <p>This is a gallery description.</p>
    <pre><code>Here is a code block
    </code></pre>
    <p>The description is shown as <code>&lt;figcaption&gt;</code></p>
  </figcaption>
</figure>

Newlines in the result code are added for readability purposes in this documentation. The output does not contain newlines outside code blocks.

Code

$markdown = preg_replace_callback(
    '/^```(figure|gallery)\n(?:(?!```).)*```/sim',
    function ($match) {
        $fp = new MarkdownExtra;
        $images = array();
        $title = '';
        $caption = '';

        # Title extraction
        #
        # Using the parser here to prevent escape characters to show up, but
        # also strip HTML tags from the output because the string goes into an
        # h1 tag where paragraphs, etc. are not allowed.
        preg_match('/^\s*\[(.*)\]\s*$/m', $match[0], $title);
        $title = isset($title[1])
            ? trim(preg_replace('/<\/?p>/', '', $fp->transform($title[1])))
            : false;

        # Get all lines that are part of the description
        foreach (explode("\n", $match[0]) as $matchline) {
            preg_match('/^>\s*(.*)\s*$/m', $matchline, $capline);
            $caption .= isset($capline[1]) ? trim($capline[1])."\n" : '';
        }

        # Convert figure caption from Markdown to HTML
        $caption = $fp->transform($caption);
        $caption = strlen($caption) > 1 ? $caption : false;

        # Get all images and their infos
        foreach (explode("\n", $match[0]) as $matchline) {
            preg_match('/^\s*(?!>|\[|\`|$)\s*(.*)\s*$/m',$matchline,$imageline);
            # Skip lines that did not match or that contain only whitespace
            if (!isset($imageline[1]) || $imageline[1] === '') continue;
            $parts = explode(' ', $imageline[1], 2);
            array_push($images, array(
                'url' => $parts[0],
                'info' => isset($parts[1]) ? trim($parts[1]) : false
            ));
        }

        # Build figure “head”
        $figure = sprintf('<figure data-images-count="%s" data-type="%s">',
            count($images),
            $match[1]
        );

        $figure .= $title ? '<h1>'.$title.'</h1>' : '';

        $figure .= sprintf('<section class="images" data-style="%s"><ol>',
            count($images) > 1 ? 'multi-image' : 'single-image'
        );

        # Add all images and their infos to the figure
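        foreach ($images as $image) {
            # Escape the URL so characters like & or " cannot break the markup
            $url = htmlspecialchars($image['url'], ENT_QUOTES);
            $figure .= sprintf('<li><a href="%s"><img src="%s" /></a>',
                $url,
                $url
            );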

            $figure .= $image['info']
                ? sprintf('<span class="image-info">%s</span>',
                    $fp->transform($image['info'])
                ) : '';

            $figure .= '</li>';
        }

        # Build figure “foot”
        $figure .= '</ol></section>';
        $figure .= $caption !== false
            ? '<figcaption>'.$caption.'</figcaption>'
            : '';
        $figure .= '</figure>';

        return $figure;
    },
    $markdown
);
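To see how the callback above tells the different kinds of lines apart, the following sketch applies the same three patterns to single lines and reports how each one would be treated. The function classify_gallery_line is hypothetical and exists only for illustration; it is not part of the gallery code itself.

```php
<?php
// Hypothetical helper: report how the gallery callback would treat one line.
function classify_gallery_line(string $line): string
{
    // "[...]" on a line of its own is the gallery title.
    if (preg_match('/^\s*\[(.*)\]\s*$/', $line)) {
        return 'title';
    }
    // Lines starting with ">" belong to the description.
    if (preg_match('/^>\s*(.*)\s*$/', $line)) {
        return 'caption';
    }
    // Anything else that does not start with ">", "[" or "`" and is not empty
    // is read as "URL, optionally followed by a space and the image info".
    if (preg_match('/^\s*(?!>|\[|`|$)\s*(.*)\s*$/', $line)) {
        return 'image';
    }
    return 'ignored'; // fence lines and empty lines
}

$lines = [
    '```gallery',
    '[My Cool Gallery]',
    '',
    'image_1.jpg This is the first image!',
    '> This is a gallery description.',
    '```',
];
foreach ($lines as $line) {
    printf("%-8s %s\n", classify_gallery_line($line), $line);
}
```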

Automatically detect YouTube videos

A YouTube video is embedded automatically when its URL stands alone on a line and is surrounded by square brackets ([]). Only space-like characters may appear before or after the brackets on that line; if the URL is surrounded by anything else, it won’t be transformed.

Example

[https://www.youtube.com/watch?v=dQw4w9WgXcQ]

Result

<iframe
   width="100%"
   height=""
   src="https://www.youtube.com/embed/dQw4w9WgXcQ"
   title="YouTube video player"
   frameborder="0"
   class="youtube"
   allowfullscreen>
</iframe>

The newlines in the resulting iframe code are for design purposes in this documentation only. The actually generated code does not contain any newlines.

Code

$markdown = preg_replace_callback(
    '@^\s*\[https://(www\.)?youtube\.com/watch\?v=([a-zA-Z0-9_-]{11})\]\s*$@m',
    function ($match) {
        return '<iframe width="100%" height="" '.
            'src="https://www.youtube.com/embed/'.$match[2].'" '.
            'title="YouTube video player" frameborder="0" class="youtube" '.
            'allowfullscreen></iframe>';
    },
    $markdown
);
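The matching rules can be checked quickly by wrapping the replacement in a function and feeding it a few lines. The wrapper name embed_youtube is hypothetical; the pattern and the generated iframe are the same as above.

```php
<?php
// Same pattern and replacement as above, wrapped in a hypothetical function.
function embed_youtube(string $markdown): string
{
    return preg_replace_callback(
        '@^\s*\[https://(www\.)?youtube\.com/watch\?v=([a-zA-Z0-9_-]{11})\]\s*$@m',
        function (array $match): string {
            return '<iframe width="100%" height="" '
                . 'src="https://www.youtube.com/embed/' . $match[2] . '" '
                . 'title="YouTube video player" frameborder="0" class="youtube" '
                . 'allowfullscreen></iframe>';
        },
        $markdown
    );
}

// A bracketed URL on its own line is replaced by the iframe.
$standalone = embed_youtube('[https://www.youtube.com/watch?v=dQw4w9WgXcQ]');
// The same URL inside a sentence is left untouched.
$inline = embed_youtube('Watch [https://www.youtube.com/watch?v=dQw4w9WgXcQ] now');

var_dump(strpos($standalone, '<iframe') === 0);                                   // bool(true)
var_dump($inline === 'Watch [https://www.youtube.com/watch?v=dQw4w9WgXcQ] now'); // bool(true)
```

Note that the pattern expects a video ID of exactly 11 characters, so shorter or longer placeholders are not transformed.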