Lacking Natural Simplicity

Random musings on books, code, and tabletop games.

Converting my PyBlosxom blog into a Nikola blog

Yesterday I decided to try blogging again. I started writing a post at blogger.com, but that was like wading through a rotting whale corpse. Instead I decided to host the blog on GitHub Pages and generate the content with the static blog/site generator Nikola, editing reStructuredText (ReST) files.

I wrote my first post and it was good! Using ReST again was much better than editing in a GUI like blogger.com, and having it hosted by GitHub Pages was more restful than running a machine hosting a website.

But then I thought of all the posts I had in my old blog, from before I stopped running a machine hosting a website. They were all written in ReST, so maybe I could put them up on my new blog?

I took a couple three hours and wrote a shell script to find the old PyBlosxom files and feed their names into a Python script that I also wrote. Along the way I made sure the files all had #published and #tags lines, in that order, immediately following the title line.
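
A check for that header layout could look something like the following. This is a minimal sketch, not the check actually used for the cleanup: the name `check_header` is hypothetical, and it assumes the title is on line 1, `#published` on line 2, and `#tags` on line 3, as described above.

```python
import io

PUBLISHED_PREFIX = '#published '
TAGS_PREFIX = '#tags '

def check_header(lines):
    """Return a list of problems found in the first three lines of a post.

    `lines` is any iterable of strings; an open file works.
    """
    it = iter(lines)
    title = next(it, '')
    published = next(it, '')
    tags = next(it, '')
    problems = []
    if not title.strip():
        problems.append('missing title line')
    if not published.startswith(PUBLISHED_PREFIX):
        problems.append('line 2 should start with #published')
    if not tags.startswith(TAGS_PREFIX):
        problems.append('line 3 should start with #tags')
    return problems

# Made-up sample posts: one well-formed, one with the two lines swapped.
good = io.StringIO('My Post\n#published 2019-11-05 20:32:24\n#tags Blogging, Nikola\n\nBody.\n')
bad = io.StringIO('My Post\n#tags Blogging\n#published 2019-11-05 20:32:24\n')
print(check_header(good))
print(check_header(bad))
```

An empty list means the file is ready for conversion; anything else names the line to fix by hand.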

Here's the shell script:

drive-pyblox-to-nikola (Source)

#! /usr/bin/env bash

(cd ~/myblog &&
     find notentries/ entries/ -type f -name \*.rst |
         ~/comp/tkbtools/Scripts/pyblox-to-nikola)

Here's the Python script:

pyblox-to-nikola (Source)

#! /usr/bin/env python3.7
import os
import os.path
import sys
from datetime import datetime
# Date formats used below, for reference:
#   datetime.strptime ('2019-11-05 20:32:24', '%Y-%m-%d %H:%M:%S')  parses #published
#   dt.strftime ('%Y-%m-%d %H:%M:%S UTC-05:00')  produces the Nikola date
entries_prefix = 'entries/'
notentries_prefix = 'notentries/'
published_prefix = '#published '
tags_prefix = '#tags '
files_read = 0
for filename in sys.stdin:
    filename = filename.rstrip ()
    basename = os.path.basename  (filename)
    dirname = os.path.dirname (filename)
    if dirname.startswith (entries_prefix):
        category = dirname[len(entries_prefix):]
    elif dirname.startswith (notentries_prefix):
        category = dirname[len(notentries_prefix):]
    else:
        category = ''
    (slug, _) = os.path.splitext (basename)
    print ('filename: %s\nbasename: %s\ndirname: %s\ncategory: %s\nslug: %s' %
           (filename, basename, dirname, category, slug))
    inf = open (filename, 'r')
    files_read = files_read + 1
    title = inf.readline ()
    title = title.rstrip ()
    published = inf.readline ()
    published = published.strip ()
    if published.startswith (published_prefix):
        published = published[len(published_prefix):]
    else:
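        # Raising a bare tuple is a TypeError in Python 3; raise a real exception.
        raise ValueError ('Line 2 of %s should be #published: %r' % (filename, published))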
    published_date = datetime.strptime (published, '%Y-%m-%d %H:%M:%S')
    nikola_date = published_date.strftime ('%Y-%m-%d %H:%M:%S UTC-05:00')
    datepath = published_date.strftime ('%Y/%m/%d')
    newdir = os.path.join ('/Users/tkb/nikola/newblog/posts', datepath)
    os.makedirs (newdir, exist_ok=True)
    tags = inf.readline ()
    tags = tags.rstrip ()
    if tags.startswith (tags_prefix):
        tags = tags[len(tags_prefix):]
    else:
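        # Raising a bare tuple is a TypeError in Python 3; raise a real exception.
        raise ValueError ('Line 3 of %s should be #tags: %r' % (filename, tags))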
    tags = tags.lower ()
    outfname = os.path.join (newdir, basename)
    print ('outfname: %s' % outfname)
    outf = open (outfname, 'w')
    outf.write ('.. title: %s\n' % title)
    outf.write ('.. slug: %s\n' % slug)
    outf.write ('.. date: %s\n' % nikola_date)
    outf.write ('.. tags: %s\n' % tags)
    outf.write ('.. category: %s\n' % category)
    outf.write ('.. link: \n')
    outf.write ('.. description: \n')
    outf.write ('.. type: text\n')
    outf.write ('\n')
    for line in inf:
        outf.write (line)
    inf.close ()
    outf.close ()
print ('\n\nFiles Read: %d' % files_read)
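
To see what the date handling in the script does to a single `#published` value, the conversion can be pulled out on its own. This is a sketch for illustration only: the function name `convert_published` is hypothetical, and the format strings and the fixed `UTC-05:00` suffix are copied from the script above.

```python
from datetime import datetime

def convert_published(published):
    """Turn a '#published' timestamp into (Nikola date string, date-based subdirectory)."""
    dt = datetime.strptime(published, '%Y-%m-%d %H:%M:%S')
    # The offset is a fixed literal, exactly as in the script above.
    nikola_date = dt.strftime('%Y-%m-%d %H:%M:%S UTC-05:00')
    datepath = dt.strftime('%Y/%m/%d')
    return nikola_date, datepath

print(convert_published('2019-11-05 20:32:24'))
# ('2019-11-05 20:32:24 UTC-05:00', '2019/11/05')
```

Because the offset is hard-coded, every post gets the same one. If the original timestamps were local times in a zone that observes daylight saving time, summer posts would be off by an hour, which may or may not matter for an archive.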

There were 810 reStructuredText files to process. Once that was done, I had to work through those files multiple times to find all the broken internal links, since many of them were absolute links to my old blog or to other pages on my old website. I ran grep-find in Emacs several times to find all the occurrences of my old website's hostname (which went through a couple of variations over time), then looked for site-relative links that started with /~tkb. It was a tedious but not too difficult process.
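
The same search could also be scripted instead of run through grep-find. The following is a rough sketch under stated assumptions: the hostnames in `OLD_HOSTS` are placeholders (the real ones are not given here), and `find_old_links` is a hypothetical helper, not something used for the actual cleanup.

```python
import re

# Placeholder hostnames for illustration only; substitute the real variations.
OLD_HOSTS = ['old.example.org', 'www.example.org']

def find_old_links(text, hosts=OLD_HOSTS):
    """Return (line_number, line) pairs that mention an old hostname or a /~tkb link."""
    alternatives = [re.escape(h) for h in hosts] + [r'/~tkb\b']
    pattern = re.compile('|'.join(alternatives))
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if pattern.search(line):
            hits.append((number, line))
    return hits

# Made-up ReST snippet with one absolute link, one site-relative link, one clean line.
sample = (
    'See `my page <http://old.example.org/~tkb/software.html>`_.\n'
    'A relative link: `notes </~tkb/notes.html>`_.\n'
    'Nothing to fix here.\n'
)
for number, line in find_old_links(sample):
    print('%d: %s' % (number, line))
```

This only reports candidate lines; deciding what each link should point to on the new blog is still manual work.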
