deanjlee.com

Coding Websites: Integrating Sitewide and Page-Specific Content

Creating a website with straight HTML is a bit out of fashion nowadays. This might have something to do with the fact that almost nobody really explains how to do it. What conventional tutorials on the internet, as well as textbooks, tend to explain is how to code a webpage. They read as though explaining hyperlinks were enough for the reader to extrapolate the lesson to a whole website. The problem is that a website, and the pages within it, combine local and global content. Local content is specific to the individual page: an article, for example. Global content appears across pages, probably on all of them, and includes the site’s title, navigation, etc. So what happens when you create your new folder for the project and start creating the files? You know that, in theory, you will create an HTML file for each article, the contact page, etc. But what about the global content? How are you going to develop the top of the screen independently and then integrate it with your local content? Do you include the <head> tag and its contents in the files for the pages or just the global one(s)? You haven’t even started the site, and you’re already stuffed.

This is the problem I had when I set out to make my site, and frustration mounted as I searched the literature for an authoritative and comprehensive discussion of it, only to find that none seemed to exist. As a result I decided that it was worth writing an article of my own, the article I would have liked someone to write for me, and here it is.

So how exactly do you get from a set of individual files you have produced to a complete website? The answer boils down to the fact that websites fall into two basic categories, static and dynamic. Dynamic websites are generally the norm. Individual components of a site such as blog posts or product profiles are stored in databases on the server. When your web browser requests a page, the various components of that page, both local and global, are ‘dynamically’ assembled into a temporary HTML page that your browser then reads. This file was not the one the author wrote. Indeed, it didn’t exist at all on the server. Dynamic webpages are produced on demand.

Dynamic websites have been a popular choice for bloggers, implemented through content management systems (CMS) like WordPress. However, for simple blogs and personal websites, as opposed to massive commercial ones like Amazon, dynamic architecture is not necessary. And in fact, it has problems. The most serious problem is security. Server-side scripting and database access open up a significant attack vector. If you upload a page full of PHP instructions and don’t know what you’re doing, you are asking to get hacked. A static page, in contrast, will probably only be hacked if the person running the server doesn’t know what they’re doing (or, say, if you chose a bad password). Another problem is the overhead involved in assembling the pages. This can make loading pages on smartphones a slow and frustrating process.

For reasons like this, static websites are coming back. Static sites are much more straightforward: I upload the page to the server, you, the viewer, download the page from the server. The page does not change in the process. So how does this work for an entire site? As other tutorials explain, the site consists of a multitude of individual pages. A given page contains both content that is unique, like an article, and content that is found on every other page as well, like a navigation system. The navigation system uses hyperlinks which, when clicked, direct the browser to specific pages. Simple case scenario: page A and page B both contain the same navigation code that contains links to page A and to page B.

The fact that all the files contain the navigation system and other sitewide features raises the basic problem: how do you write a sitewide component in just one file? That is, what sort of file is it and what is the HTML command for weaving it into the other files? What is the correct way, as opposed to a dodgy workaround way (iframes anyone?), to have HTML embed one file into another as necessary to produce a static website? Well, technically you can’t. A static website by definition means that the files sitting on the server are exactly what the browser reads. This means that any global material is already contained in all the page files when they are uploaded. The closest thing to an official solution, a traditional one if you like, is to cheat and insert a dynamic component into the site so that local and global content are glued together when necessary. This means that the site is no longer truly static. It is somewhat static but partly dynamic. It is pseudo-static.

How, then, does one create a true static website, without repeatedly copying and pasting global code like an amateur? The answer is this: instead of having the web pages assembled on the fly whenever they are downloaded, have them assembled in advance. What this means is that the files you write are fed into a special program that weaves local and global content together and produces a complete static site as an output, ready to upload. This marvelous tool is called a static site generator (SSG), and is the solution to creating static sites today. Static site generators can do much more than automatically copy your global code into all your page files. They can be used much like a CMS to spare you all the tedious coding. But they still allow you to preserve a strict coding approach to your heart’s content.

To sum up, if you want to code a static site, you have two reasonable options. You can forgo a strictly static approach and slip in some dynamic scripting code to piece it together. Or you can produce a true static website by relying on dedicated software. Each has its trade-offs, and which matters more depends on your mentality. If you are strongly committed to the idea of coding your site rather than relying on software, then an SSG is something of a compromise. That being said, it is a serious coder’s tool, and even serious coders deviate from narrow coding more than you might think. In modern software engineering, relying on such tools is essential. If you are committed to the idea of a static site, then the obvious compromise of the dynamic approach is that it’s not really static. You can’t have your cake and eat it too. Whichever approach you choose, the following tutorial covers it.

Partial Dynamic Approach

Scripting on the internet takes two fundamental forms: client-side and server-side. In client-side scripting, the script’s instructions are carried out by the computer accessing the web page, specifically by the web browser. The language for this is JavaScript. JavaScript has no built-in “include” command for HTML, but a short script can fetch your global content and insert it into your pages. You could use this, but that would obviously be stupid because your entire website would then depend on JavaScript. If JavaScript is switched off, your entire site will break. Load the index page, and all the links providing access to your other pages will be missing. So much for the client-side approach. In server-side scripting, the computation is performed by the server providing the webpage. This makes it the responsibility of the server to process the includes. The major server-side language for includes and much more is PHP, and that is what we’re going to use. (You may have also heard of Server Side Includes (SSI). This was the usual method before PHP took over.)
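
For illustration only, here is roughly what the client-side approach looks like. This is a sketch, not a recommendation: the file name ‘header.html’ and the id ‘site-header’ are hypothetical, and the script relies on the browser’s standard fetch function. With JavaScript switched off, the placeholder element simply stays empty, which is exactly the failure described above.

```html
<!-- Placeholder that the script fills in; it stays empty if JavaScript is off -->
<div id="site-header"></div>

<script>
  // Download the shared header (hypothetical file name) and insert it
  // into the placeholder element above.
  fetch('header.html')
    .then(function (response) { return response.text(); })
    .then(function (html) {
      document.getElementById('site-header').innerHTML = html;
    });
</script>
```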

Suppose we have a folder containing four text files:

  • header.php
  • footer.php
  • index.php
  • article.php

Note that the extension is ‘.php’ rather than ‘.html’ because, by default, the server only runs PHP code in files with that extension. ‘index.php’ and ‘article.php’ are the actual webpages, each containing the fundamental page code, from the ‘!DOCTYPE’ declaration (DTD) to the <head> and <body> elements. The <body> element in each file contains the local content, say, a home page and an article. The head’s <title> tag contains the name of the article and so on. ‘header.php’ and ‘footer.php’ are arbitrary files containing only the global content you want to insert above and below the local stuff. ‘header.php’ might consist of a single <div> element containing all the navigation, for example.
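
As a rough sketch, the two global files might contain nothing more than the following. The site title, link text and footer wording are made-up placeholders; only the file names come from the list above. Neither file has a DTD, <head> or <body> of its own, because each is only ever a fragment of a page.

A possible ‘header.php’:

```html
<!-- Global content shown above the local content of every page -->
<div>
  <p>My Site</p>
  <nav>
    <a href="index.php">Home</a>
    <a href="article.php">Article</a>
  </nav>
</div>
```

A possible ‘footer.php’:

```html
<!-- Global content shown below the local content of every page -->
<div>
  <p>Placeholder footer text</p>
</div>
```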

More specifically, suppose that all of the local content is contained in the <main> element, and all the global stuff above and below. In ‘index.php’ and ‘article.php’ we add the following line just above the opening <main> tag:

<?php include 'header.php'; ?>

and likewise add the following just below the closing </main> tag:

<?php include 'footer.php'; ?>

Obviously, we are telling the server to splice in the header and footer content above and below the <main> element. All you have to do now is upload the four files to the server and it will serve your complete website: two pages with a common set of navigation and other features. To see this in action before uploading you will have to set up a server yourself. By this I simply mean a server program on your own computer. Install Apache and PHP, copy your pages to the appropriate directory, start Apache, type http://localhost/ into your browser’s address bar, and there it is.
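
Putting the pieces together, a complete ‘index.php’ might look like the sketch below. The title and paragraph text are placeholders; the two include lines are the ones given above. When the server processes the file, each include line is replaced by the contents of the named file, so the browser receives ordinary HTML with no trace of PHP.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Home</title>
</head>
<body>
  <!-- The server replaces this line with the contents of header.php -->
  <?php include 'header.php'; ?>
  <main>
    <p>Welcome to my site.</p>
  </main>
  <!-- The server replaces this line with the contents of footer.php -->
  <?php include 'footer.php'; ?>
</body>
</html>
```

For a quick local test, PHP’s built-in development server (started with php -S localhost:8000 from the site folder) should also work in place of Apache, in which case the address to type is http://localhost:8000/.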

Pure Static Approach

Now for the static approach. One of the leading static site generators is Jekyll, which uses a templating language called Liquid to process web documents. Jekyll has an ‘includes’ feature that can turn the files in the previous section into true static webpages. However, it also offers a much smarter way of producing the same result: templates. Instead of isolating global contents and then adding them to the webpages (i.e. the files that actually contain full HTML structure), you can do the reverse and isolate local contents and add them to a webpage template. This allows you to write all the global markup in one file, right down to the DTD, removing markup repetition completely.

We start off with a slightly simpler set of files in our folder:

  • template.html
  • index.html
  • article.html

Note that the extension is the traditional .html because we’re not using PHP. In practice, I cannot recommend strongly enough that you do all actual content writing in Markdown rather than directly in HTML! These files would then be of the form “article.md”, like the file I’m writing right now. ‘index.html’ and ‘article.html’ contain only the local material. ‘template.html’ contains everything else: DTD, head element, navigation, etc. We tell Jekyll what to do with the local content files by placing metadata at the beginning of each file. For our example, we add this:

---
layout: template
title: "Home"
---

The ‘layout’ part is what links each file to our template. Obviously, the title is specific to the individual file. “Home” is the most likely title for ‘index.html’. All we need to add to ‘template.html’ is two commands:

{{ page.title }}

This goes in the <title> element in the document head, to make the title of each page unique, based on the metadata.

{{ content }}

This goes wherever in the webpage you want the local content to go, most likely somewhere inside a <main> element.
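
A minimal sketch of the two kinds of file, with the metadata and both commands in place, might look like this. The navigation links and paragraph text are placeholders; the layout name, page.title and content are as described above.

A possible ‘template.html’:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <!-- Filled in from the "title" metadata of each page -->
  <title>{{ page.title }}</title>
</head>
<body>
  <nav>
    <a href="index.html">Home</a>
    <a href="article.html">Article</a>
  </nav>
  <main>
    <!-- Filled in with everything below the metadata of each page -->
    {{ content }}
  </main>
</body>
</html>
```

A possible ‘index.html’, containing only metadata and local content:

```html
---
layout: template
title: "Home"
---
<p>Welcome to my site.</p>
```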

Now the files are ready for processing. For this, we need to create a special sub-directory in our website folder. The name of this directory is “_layouts”, and don’t forget the underscore. Move template.html into this folder. Now all we have to do is run Jekyll from our website folder. Jekyll is a command line tool, so once you have installed it you need to open a terminal and change directory to the website folder. Once you’re inside, run Jekyll with the command “jekyll serve”. If all goes well, this is what’s going to happen. Jekyll will automatically create a new directory in your folder called “_site”. It will compile your website and place the output into that directory. Open “_site” and you will see your two web pages ready for upload. At the same time, Jekyll will conveniently create a server environment to host these pages so that you can view them in your browser (by default at http://localhost:4000), just like running Apache, and the results should be identical to those of the PHP method. Be mindful that the “_site” folder is wiped and rewritten whenever Jekyll runs, so do not save any work in it.
