Pride, Prejudice, Dungeons, and Dragons


I’ve worked a bit with Markov chains in the past, and recently they’ve become really popular for creating Twitter bots. I wanted to try something a little different from the norm, though, so instead of using my own tweets as the corpus, like the many “ebooks” bots do, I decided to start with the Dungeons and Dragons 5th edition Monster Manual.

Getting the source was difficult, but I ended up running a PDF scan of the manual through some OCR software to get the basic text out. The core issue was that it generated a lot of junk, since it occasionally converted dividing lines, and on rare occasions images, into text. Luckily it was a decent start, and I was able to do a fair amount of cleanup (though I definitely plan on doing more).

What it generated was fairly hilarious on its own, with this being an example of one of the first sentences it generated:

I wasn’t convinced that it really took the idea far enough, though. I decided to try to find something else to remix the content with that could up the level of hilarity. Pride and Prejudice seemed like a natural choice, both because it’s freely available on Project Gutenberg and because the nature of the story lends itself well to the random injection of fantasy ridiculousness.

Once I added Pride and Prejudice in its entirety to the corpus and started generating sentences, I was immediately amused and laughing. I added in some calls to the Twitter API, and then my bot was released upon the world.
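For anyone who hasn’t played with Markov chains, here is a rough sketch of how mixing two texts into one corpus can work. This is not the bot’s actual code: the function names `build_chain` and `generate`, the two-word state size, and the two made-up toy sentences standing in for the real texts are all assumptions for illustration.

```python
import random
from collections import defaultdict


def build_chain(texts, order=2):
    """Map each run of `order` words to the words that follow it in any text."""
    chain = defaultdict(list)
    for text in texts:
        words = text.split()
        for i in range(len(words) - order):
            state = tuple(words[i:i + order])
            chain[state].append(words[i + order])
    return chain


def generate(chain, max_words=40, rng=random):
    """Walk the chain from a random state until a dead end or max_words."""
    state = rng.choice(list(chain.keys()))
    output = list(state)
    while len(output) < max_words:
        followers = chain.get(state)
        if not followers:
            break  # no known continuation for this pair of words
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)


# Made-up toy sentences standing in for the two cleaned-up source texts.
monster_manual = "the dragon guards a great fortune in its lair"
pride_and_prejudice = "the gentleman has a great fortune and no wife"

chain = build_chain([monster_manual, pride_and_prejudice])
print(generate(chain))
```

The fun happens wherever the two texts share a run of words. Here both contain “great fortune”, so once the walk reaches that pair it can continue into either source, which is how a sentence can start in a drawing room and end in a dragon’s lair.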

I’ve made a couple of major tweaks to the bot since I first put it on Twitter. First, I started writing out very large sentences to image files so that they don’t get cut off. Second, I changed the code so that, rather than picking randomly from the generated sentences, it picks the longest one.
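Those two tweaks boil down to a selection step and a length check. A minimal sketch, assuming the candidates are already generated as plain strings; the helper names `pick_longest` and `needs_image` and the default 140-character limit are hypothetical, and the actual image rendering is left out:

```python
def pick_longest(sentences):
    """Return the longest candidate instead of a random one."""
    return max(sentences, key=len)


def needs_image(sentence, limit=140):
    """True when the sentence is too long to post as plain text.

    The 140-character default is an assumption; adjust it to whatever
    limit applies when the bot runs.
    """
    return len(sentence) > limit


# Made-up candidates for illustration.
candidates = [
    "The owlbear bowed.",
    "Mr. Darcy, who had seldom been so bewitched, rolled for initiative.",
]
best = pick_longest(candidates)
print(best, "-> image" if needs_image(best) else "-> text")
```

Picking the longest candidate biases the bot toward sentences that wander through both texts, which is also why the image fallback becomes necessary.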

All the code is available below, but I don’t think it’s nearly as interesting as the generated text itself, which you can see on Twitter.
