The Protocol I Missed

May 11, 2026

When I first got online in the early 1990s, the internet looked nothing like it does today. There was no Chrome, no HTTPS, no JavaScript. Mostly there was text. You logged in to things over dial-up, you read things in terminal emulators, and if you wanted to find something you used Archie or Veronica. The web existed but it was academic and sparse. What was actually in use -- what people were actively navigating -- was Gopher.

Gopher came out of the University of Minnesota in 1991. It is a document retrieval protocol. You connect to a server on port 70, send a short selector string, and get back either a menu of links or a text file. That is the entire protocol. It appeared at about the same time as the first version of HTTP, it is vastly simpler, and for a few years in the early 1990s it was the thing people actually used to navigate the internet.

Then the web happened. Within two or three years it was over. Gopher servers went quiet, the Minnesota team stopped development, and by the time I was really using the internet comfortably, gopher was already a ghost. I knew it had existed but I never really used it. Every time I read about it I felt like I had missed something.

I use the internet via terminal a lot. I read RSS feeds in terminal clients, I run a finger server, I have a minimal hand-written HTML blog on purpose. A gopher version of this blog seems obvious. So I made one.

What gopher actually is

The protocol is RFC 1436, published 1993. A client opens a TCP connection to port 70. It sends a line of text called a selector -- an empty line for the root, a path for anything else -- followed by CRLF. The server sends back the content and closes the connection.
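
The exchange is small enough to sketch from the client side. This is a minimal illustration, not code from this server: build_request and gopher_fetch are hypothetical names, and the sketch assumes a POSIX system with blocking sockets and no timeouts.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Build the request line: the selector followed by CRLF.
// An empty selector asks for the server's root menu.
std::string build_request(const std::string &selector) {
    return selector + "\r\n";
}

// Hypothetical helper: send one selector and return the raw reply.
std::string gopher_fetch(const std::string &host, const std::string &selector,
                         const std::string &port = "70") {
    addrinfo hints{};
    hints.ai_family   = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    addrinfo *res = nullptr;
    if (getaddrinfo(host.c_str(), port.c_str(), &hints, &res) != 0)
        throw std::runtime_error("cannot resolve " + host);

    // Try each resolved address until one accepts the connection.
    int fd = -1;
    for (addrinfo *ai = res; ai != nullptr; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0) continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0) break;
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    if (fd < 0) throw std::runtime_error("cannot connect to " + host);

    // Send the whole request, allowing for short writes.
    const std::string req = build_request(selector);
    size_t sent = 0;
    while (sent < req.size()) {
        ssize_t n = write(fd, req.data() + sent, req.size() - sent);
        if (n <= 0) { close(fd); throw std::runtime_error("write failed"); }
        sent += static_cast<size_t>(n);
    }

    // The server marks the end of the reply by closing the connection.
    std::string reply;
    char buf[4096];
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        reply.append(buf, static_cast<size_t>(n));
    close(fd);
    return reply;
}
```

Calling gopher_fetch("peteftw.com", "") should return the root menu as raw text, assuming the server is reachable from where you run it.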

Directory listings are menus, and the file a server keeps one in is conventionally called a gophermap. Each line is tab-delimited:

  <type><display text>TABselectorTABhostnameTABport

Type 1 is a submenu, type 0 is a text file, and type i is a display line with no link (a widely supported extension that is not in RFC 1436 itself). That is most of what you need. The whole specification fits in your head.
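
To make the line format concrete, here is a small sketch of how a client might split one menu line into its fields. MenuItem and parse_item are hypothetical names, example.org is a placeholder host, and the sketch assumes well-formed input with at least four tab-separated fields.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

struct MenuItem {
    char        type;      // '0' text file, '1' submenu, 'i' info line, ...
    std::string display;   // text shown to the reader
    std::string selector;  // what the client sends back to fetch the item
    std::string host;
    int         port;
};

// Parse one menu line. Returns nothing for the terminating "." line,
// for empty lines, and for lines with fewer than four fields.
std::optional<MenuItem> parse_item(std::string line) {
    while (!line.empty() && (line.back() == '\r' || line.back() == '\n'))
        line.pop_back();
    if (line.empty() || line == ".") return std::nullopt;

    // The first character is the type; the rest is split on tabs.
    std::vector<std::string> fields;
    std::string::size_type start = 1;
    for (;;) {
        const auto tab = line.find('\t', start);
        if (tab == std::string::npos) {
            fields.push_back(line.substr(start));
            break;
        }
        fields.push_back(line.substr(start, tab - start));
        start = tab + 1;
    }
    if (fields.size() < 4) return std::nullopt;

    int port = 0;
    try {
        port = std::stoi(fields[3]);
    } catch (const std::exception &) {
        return std::nullopt;  // port field was not a number
    }
    return MenuItem{line[0], fields[0], fields[1], fields[2], port};
}
```

Info lines still carry all four fields; by convention the selector is empty and the host and port are dummies, so the same parser handles them.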

The jail

Like everything else on bsd.peteftw.com, the gopher server runs in a Bastille thin jail. The host already had jails for inbound and outbound mail, the website, and a finger daemon. Adding one more is straightforward.

bastille create gopher 15.0-RELEASE 192.168.1.105
bastille pkg gopher install -y ca_root_nss
bastille cmd gopher mkdir -p /var/gopher

Port 70 forwarding goes through pf. Bastille manages the rdr rules per-jail -- one command writes the rule to the jail's config and loads it into the running pf instance:

bastille rdr gopher tcp 70 70

After a jail restart the rule is active:

rdr pass on re0 inet proto tcp from any to any port = 70 -> 192.168.1.105 port 70

The server

There is a perfectly good gopher daemon in the FreeBSD ports tree called gophernicus. I did not use it. Writing a gopher server from scratch in C++ takes less time than it sounds like it should, and the protocol is simple enough that there is very little to get wrong. I wanted to understand exactly what was running.

The whole thing is about 130 lines. The interesting parts are the connection handler and the main accept loop:

static void handle(int fd) {
    std::string selector = read_selector(fd);

    if (!selector.empty() && selector.front() == '/')
        selector = selector.substr(1);

    fs::path target = (fs::path(g_root) / selector).lexically_normal();

    // path traversal guard
    std::string root_prefix = g_root + "/";
    std::string target_str  = target.string();
    if (target_str != g_root &&
        target_str.substr(0, root_prefix.size()) != root_prefix) {
        syslog(LOG_WARNING, "traversal attempt: %s", selector.c_str());
        send_error(fd, "Access denied");
        return;
    }

    if (fs::is_directory(target)) {
        fs::path gophermap = target / "gophermap";
        if (fs::exists(gophermap))
            send_file(fd, gophermap);
        else
            send_error(fd, "No gophermap in /" + selector);
    } else if (fs::is_regular_file(target)) {
        send_file(fd, target);
    } else {
        syslog(LOG_INFO, "not found: %s", selector.c_str());
        send_error(fd, "Not found: /" + selector);
    }
}
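
The traversal guard is the one part worth testing in isolation. The sketch below pulls the same check out into a pure function so it can be exercised without a socket. inside_root is a hypothetical name, and the check is purely lexical: it does not resolve symlinks, so it assumes nothing under the root links outside it.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Returns true if the selector, resolved lexically against root,
// stays at or below root. root must not end with a slash.
bool inside_root(const std::string &root, std::string selector) {
    // A single leading slash is tolerated, as in the handler.
    if (!selector.empty() && selector.front() == '/')
        selector = selector.substr(1);

    // Collapse "." and ".." components without touching the filesystem.
    const std::string target =
        (fs::path(root) / selector).lexically_normal().string();

    // Compare against root plus a trailing slash so that a sibling such
    // as /var/gopher-secrets does not pass as a prefix match.
    const std::string prefix = root + "/";
    return target == root || target.compare(0, prefix.size(), prefix) == 0;
}
```

The trailing slash on the prefix matters: without it, a selector that climbs out of the root and into a sibling directory whose name merely starts with the root's name would be accepted.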

The main loop forks a child for each connection. SIGCHLD is set to SIG_IGN, which tells the kernel to reap children automatically -- no zombie accumulation, no wait loop needed:

signal(SIGCHLD, SIG_IGN);
signal(SIGPIPE, SIG_IGN);

for (;;) {
    sockaddr_in client{};
    socklen_t   len = sizeof(client);
    int fd = accept(srv, reinterpret_cast<sockaddr *>(&client), &len);
    if (fd < 0) {
        if (errno == EINTR) continue;
        syslog(LOG_ERR, "accept: %s", strerror(errno));
        continue;
    }
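    pid_t pid = fork();
    if (pid < 0) {
        // fork failed: log it and drop this connection
        syslog(LOG_ERR, "fork: %s", strerror(errno));
    } else if (pid == 0) {
        // child: serve one request, then exit
        close(srv);
        handle(fd);
        close(fd);
        _exit(0);
    }
    close(fd);  // parent: the child holds its own copy of the socket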
}
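
The loop assumes srv is already a listening socket. The setup is not shown in this post, so the following is only a sketch of what it could look like. open_listener and bound_port are hypothetical names, and the backlog of 16 is an arbitrary choice.

```cpp
#include <cassert>
#include <cstdint>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Create an IPv4 TCP socket listening on the given port.
// Returns the descriptor, or -1 on failure. Port 0 asks the kernel
// for any free port, which is convenient for testing.
int open_listener(uint16_t port, int backlog = 16) {
    int srv = socket(AF_INET, SOCK_STREAM, 0);
    if (srv < 0) return -1;

    // Allow a quick restart while old connections sit in TIME_WAIT.
    int yes = 1;
    setsockopt(srv, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);

    sockaddr_in addr{};
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port        = htons(port);

    if (bind(srv, reinterpret_cast<sockaddr *>(&addr), sizeof addr) < 0 ||
        listen(srv, backlog) < 0) {
        close(srv);
        return -1;
    }
    return srv;
}

// Report which port a listening socket actually ended up on (0 on error).
uint16_t bound_port(int srv) {
    sockaddr_in addr{};
    socklen_t   len = sizeof addr;
    if (getsockname(srv, reinterpret_cast<sockaddr *>(&addr), &len) < 0)
        return 0;
    return ntohs(addr.sin_port);
}
```

With this in place, open_listener(70) would produce the srv used by the accept loop. Binding to a port below 1024 normally requires root, so the server has to be started with enough privilege to do that.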

The binary is installed directly into the jail filesystem and managed with a standard FreeBSD rc script that uses daemon(8) to daemonize, write a pidfile, and restart the server if it exits:

gopherd_start() {
    echo "Starting ${name}."
    /usr/sbin/daemon -P "${pidfile}" -r -f "${command}" ${gopherd_flags}
}

The service is then enabled and started from the host:

bastille sysrc gopher gopherd_enable=YES
bastille service gopher gopherd start

Converting the blog

The posts on this site are plain HTML with almost no markup -- paragraphs, headings, code blocks, the occasional link. Converting them to gopher text means stripping the tags and reflowing the prose at 70 columns, which is the conventional gopher line width.

I wrote a Perl script that protects <pre> blocks from reflowing, handles <h2> sections by underlining them with dashes, and converts links to text [url] format:

sub emit_body {
    my ($chunk) = @_;

    # h2 -- mark with a sentinel so we can underline after stripping tags
    $chunk =~ s{<h2>(.*?)</h2>}{ "\x00H\x00" . strip($1) . "\x00" }gsei;

    # links: <a href="url">text</a> -> "text [url]" or just url
    $chunk =~ s{<a\s+href="([^"]*)"[^>]*>(.*?)</a>}{
        my ($url, $text) = ($1, strip($2));
        $text = entities($text);
        (lc $url eq lc $text || $text eq '') ? $url : "$text [$url]"
    }gsei;

    $chunk =~ s{<p\b[^>]*>}{\n\n}gi;
    $chunk = strip($chunk);
    $chunk = entities($chunk);

    for my $part (split /(\x00H\x00.*?\x00)/s, $chunk) {
        if ($part =~ /^\x00H\x00(.*?)\x00$/s) {
            my $h = $1; $h =~ s/^\s+|\s+$//g; $h =~ s/\s+/ /g;
            print "\n$h\n" . "-" x length($h) . "\n";
        } else {
            for my $para (split /\n{2,}/, $part) {
                $para =~ s/^\s+|\s+$//g;
                $para =~ s/\s+/ /g;
                next unless length $para;
                print wordwrap($para, 70), "\n\n";
            }
        }
    }
}
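
emit_body hands each paragraph to a wordwrap helper that is not shown here. A greedy wrap is enough for this job. The sketch below is in C++ to match the server rather than Perl, and it is an assumption about the approach, not the converter's actual code. It counts bytes, not characters, so multi-byte UTF-8 text will wrap slightly early.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Greedy word wrap: pack as many words onto each line as fit in
// `width` columns. A single word longer than `width` gets a line to
// itself rather than being split.
std::string wordwrap(const std::string &para, std::size_t width = 70) {
    std::istringstream words(para);
    std::string out, line, word;
    while (words >> word) {  // operator>> skips any run of whitespace
        if (!line.empty() && line.size() + 1 + word.size() > width) {
            out += line + "\n";
            line.clear();
        }
        if (!line.empty()) line += ' ';
        line += word;
    }
    return out + line;  // no trailing newline; the caller adds spacing
}
```

Because the words are read with stream extraction, leading, trailing, and repeated whitespace all collapse on their own, which matches the whitespace cleanup emit_body does before wrapping.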

A second Perl script walks the blog's directory tree, extracts the <h1> title from each post, and writes the gophermap files. A deploy script converts all posts, regenerates the maps, and rsyncs everything to /var/gopher/ in the jail.
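
The map generation step boils down to two operations: pull a title out of each post and emit one menu line per post. Here is a rough sketch of both, again in C++ rather than Perl. Post, extract_title, and build_gophermap are hypothetical names, example.org stands in for the real host, and the closing "." line follows RFC 1436 rather than anything confirmed about the files on this server.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

struct Post {
    std::string title;
    std::string selector;  // path of the converted text file
};

// Return the text of the first <h1> element, with nested tags removed
// and whitespace collapsed. Returns an empty string if there is none.
std::string extract_title(const std::string &html) {
    static const std::regex h1("<h1[^>]*>([\\s\\S]*?)</h1>", std::regex::icase);
    std::smatch m;
    if (!std::regex_search(html, m, h1)) return "";

    std::string text = std::regex_replace(m[1].str(), std::regex("<[^>]*>"), "");
    // Collapsing whitespace also removes tabs, which would otherwise
    // corrupt the tab-delimited menu line.
    text = std::regex_replace(text, std::regex("\\s+"), " ");

    const auto first = text.find_first_not_of(' ');
    if (first == std::string::npos) return "";
    const auto last = text.find_last_not_of(' ');
    return text.substr(first, last - first + 1);
}

// Emit one type-0 line per post, then the terminating "." line.
std::string build_gophermap(const std::vector<Post> &posts,
                            const std::string &host, int port) {
    std::string out;
    for (const Post &p : posts)
        out += "0" + p.title + "\t" + p.selector + "\t" + host + "\t" +
               std::to_string(port) + "\r\n";
    return out + ".\r\n";
}
```

A regular expression is a fragile way to read HTML in general, but for hand-written posts with a single plain heading it is likely to be sufficient.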

The result

The gopherhole is now live at gopher://peteftw.com. The same posts that are here on the web are available over the older protocol, generated from the same HTML source. If you have a gopher client -- ncgopher is a good one -- you can read this blog the way people read things in 1993. I like that.

There is something satisfying about a protocol this simple. The entire server is 130 lines of C++. The client sends a line of text, the server sends a file back, and everyone goes home. No TLS negotiation, no headers, no cookies, no JavaScript. Just text over a socket.

I got online just in time to miss gopher. Better late than never.

