[Steemit] Collecting Content From the Steem Blockchain and Mimicking It

in technology •  7 years ago 

(Better formatted text available on GitHub.)

Sometimes you just have to put together a lot of things that you didn't expect. In this case, I was inspired by a post by @makerhacks about generating a readability score for your posts.

It occurred to me that I could put that together with some of my SteemData mining code and a little bit of extra love from the markovify library to create a system which could both examine posts made by a user and then proceed to mimic them in style and substance.

Okay, "substance" might be a little ambitious. Just style, then.

Let's start out by actually being able to fetch data from the blockchain regarding posts which we can make use of.

Grab That Data

Things start out with a certain amount of familiarity as we initialize the SteemData interface, set up our query, and pull the information that we're interested in. You guys have seen this several times by now, and it's not going to look any different this time.

# Setting up the imports for our basic query tools

from steemdata import SteemData
import datetime
from datetime import datetime as dt

from pprint import pprint
# Init connection to database

db = SteemData()
# Grab posts I've written, just to be on the safe side.

query = {'author': 'lextenebris'}

# We only want the body field, honestly, so only gank that out.

proj = {'body': 1,
        '_id': 0}

# We'll take them in reverse chronological order so we don't 
#   need to bother sorting, just to be gentle.
# One day we need to talk about the crazily inconsistent
#   field names in the blockchain. And how bad they are.

sort = [('created', -1)]

# We'll just grab 50. Who knows, it might be interesting.
#   We can always grab more later.

result = db.Posts.find(query,
                       projection=proj,
                       sort=sort,
                       limit=50)

%time postsL = list(result)
Wall time: 759 ms

What do the first 500 characters of a random post taken from somewhere in the middle of that look like?

# Some kind of random, and by that I mean
#   utterly arbitrary.

p = postsL[31]['body']
print(p[:500])
*Where am I posting, and what are we doing in this handbasket?*

![](https://xzor.xyz/ipfs/QmNrsddWU7kNmscD5KJ9s9nnkanyf5x5fTs7iEbojnTymA)

As the number of bloggers increased from the mid 90s onwards, we saw a massive evolution in the market. At first, people stuck to a single vertical, a single platform onto which they put all of their writing output, and [RSS](http://www.whatisrss.com) existed to help readers pull all of that writing output into a single interface so that they could read it. 

Cool. We have fresh data all over our hands. What are we going to do with it?

First up, let's create a new list of all the content with the HTML filtered out. We don't need all of those links and such getting in the way of our lexical analysis to come.

Depending on what we can find floating around in the library, maybe we can do better than just stripping HTML.

# We can cheat a bit and use pypandoc to strip out all the Markdown
#   from posts, leaving them much cleaner.

import pypandoc

pypandoc.convert_text(p[:500], to='plain', format='md')
'_Where am I posting, and what are we doing in this handbasket?_\r\n\r\n[]\r\n\r\nAs the number of bloggers increased from the mid 90s onwards, we saw a\r\nmassive evolution in the market. At first, people stuck to a single\r\nvertical, a single platform onto which they put all of their writing\r\noutput, and RSS existed to help readers pull all of that writing output\r\ninto a single interface so that they could read it.\r\n'

That should really get rid of most of the HTML that I'm likely to ever have used, but just to be safe, and just for the sake of proper hygiene, let's finish the scrubbing.

import html2text
import requests

# An instance of the HTML parser will help out here
h = html2text.HTML2Text()

# Remove links, sure.
h.ignore_links = True

# Now, just hand in some text and we'll see what falls out.
def removeHtml(txt):
    return h.handle(txt)
pypandoc.convert_text(p, to='plain', format='md')[-500:]
"irety of my theory of _point of presence,_ the idea\r\nthat you don't really have one. You have tools which allow you to reach\r\nout to a community and be part of it, but there is no one central place\r\nthat represents you and what you do.\r\n\r\nPlatforms change. Systems crash. Data is lost.\r\n\r\nYour presence is wherever your attention is. That is its singular point.\r\nEverything else is about being part of a community.\r\n\r\nNow get out there and do your thing, Thing Ring!\r\n\r\nhttps://youtu.be/64Jv8zbUsf4\r\n"

To me, the interesting thing here is that the conversion to plain text didn't seem to strip much at all. In fact, it left a perfectly readable URL right there at the end.

My suspicion is that it's because things end with \n, which points to a larger problem – that we really need to sanitize this stuff for line endings.

Easy enough, just annoying.
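As a minimal sketch of that cleanup (the helper name `flatten_lines` is mine, not from any library): Python's built-in `str.splitlines()` handles `\r\n`, `\n`, and bare `\r` uniformly, so we don't have to guess which line-ending convention pandoc emitted.

```python
# Sketch (my helper name, not from any library): str.splitlines()
# handles \r\n, \n, and bare \r uniformly, so we don't have to guess
# which line-ending convention pandoc emitted.
def flatten_lines(txt):
    return ' '.join(txt.splitlines())

print(flatten_lines("Platforms change.\r\nSystems crash.\nData is lost."))
# Platforms change. Systems crash. Data is lost.
```

Same split-and-join dance, just agnostic about the line-ending flavor.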

removeHtml(' '.join(pypandoc.convert_text(p, to='plain', format='md').split('\r\n')))[-500:]
"at's\nit, the entirety of my theory of _point of presence,_ the idea that you don't\nreally have one. You have tools which allow you to reach out to a community\nand be part of it, but there is no one central place that represents you and\nwhat you do. Platforms change. Systems crash. Data is lost. Your presence is\nwherever your attention is. That is its singular point. Everything else is\nabout being part of a community. Now get out there and do your thing, Thing\nRing! https://youtu.be/64Jv8zbUsf4\n\n"

Nope. In fact, that module actually seems to insert line breaks in places where they have no reason to be. Not only that, the documentation doesn't even mention that it tinkers with the content of the text stream like that.

Have I mentioned that such things are really bad coding practice, and should probably end up with people punched right in the ding? Because they should.

Let's try reversing the order of application and see if we can't get something out which doesn't explode in our face.

' '.join(pypandoc.convert_text(removeHtml(p),
                               to='plain', 
                               format='md').split('\r\n'))[-500:]
"hat's it, the entirety of my theory of _point of presence,_ the idea that you don't really have one. You have tools which allow you to reach out to a community and be part of it, but there is no one central place that represents you and what you do. Platforms change. Systems crash. Data is lost. Your presence is wherever your attention is. That is its singular point. Everything else is about being part of a community. Now get out there and do your thing, Thing Ring! https://youtu.be/64Jv8zbUsf4 "

All right, I've come to the conclusion that the html2text module is pretty crap. It's either not parsing content properly for URLs, has a broken bit when a URL occurs at the very end of a string, or is somehow otherwise junk. The fact that it stores whether or not links are desired on the object rather than taking it as an argument passed to the function is just extra garbage, in my opinion.

I like OOP as much as the next guy, possibly more, but this is no way to do it. Worse, it's not even good OOP. For that you'd really want a setter call on the object.

Let's just stop wrestling with that.

def procPost(e):
    return ' '.join(pypandoc.convert_text(e, 
                                        to='plain', 
                                        format='md').split('\r\n'))
procPost(p)[-500:]
"s it, the entirety of my theory of _point of presence,_ the idea that you don't really have one. You have tools which allow you to reach out to a community and be part of it, but there is no one central place that represents you and what you do.  Platforms change. Systems crash. Data is lost.  Your presence is wherever your attention is. That is its singular point. Everything else is about being part of a community.  Now get out there and do your thing, Thing Ring!  https://youtu.be/64Jv8zbUsf4 "

We'll just go with this for now. There may be some residual HTML hanging out in the system, but that's fine. We'll just roll with it.
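If residual tags ever did become a problem, one cheap fallback is a regex pass that deletes anything that still looks like a tag. This is a sketch of my own (the `strip_tags` name is made up), not a real HTML parser, and it will happily mangle text containing stray `<` and `>` characters:

```python
import re

# Crude fallback (names are mine): delete anything that looks like a tag.
# Not a real HTML parser; stray < and > characters will confuse it.
TAG_RE = re.compile(r'<[^>]+>')

def strip_tags(txt):
    return TAG_RE.sub('', txt)

print(strip_tags('Hello <b>world</b>, see <a href="https://example.com">this</a>.'))
# Hello world, see this.
```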

Out of curiosity, let's find out what the reading level of the post we've been tinkering with actually is according to these ratings.

from textstat.textstat import textstat

print( "{} / 100\n".format(textstat.flesch_reading_ease(procPost(p))))
58.82 / 100

Seems legit. I never promised to write at a particularly introductory level.
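For the curious, the Flesch Reading Ease formula itself is simple enough to sketch by hand: 206.835 - 1.015 * (words per sentence) - 84.6 * (syllables per word). The version below uses a crude vowel-run syllable counter, so its numbers will drift from textstat's (which uses proper syllable rules); the helper names are mine.

```python
import re

def count_syllables(word):
    # Very rough: count runs of vowels, minimum one per word.
    return max(1, len(re.findall(r'[aeiouy]+', word.lower())))

def flesch_reading_ease(text):
    # Flesch formula: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)
    sentences = max(1, len(re.findall(r'[.!?]+', text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / sentences) \
                   - 84.6 * (syllables / len(words))

print(round(flesch_reading_ease("The cat sat on the mat. It was happy."), 1))
# 108.3
```

Short words and short sentences push the score up; polysyllabic rambling drags it down.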

Now that we can do all this, what do we want to do with it? That's where the fun comes in.

Heat Up the Emulator!

Let's recap the important code here so we don't have to keep scrolling to the top in order to reset our environment.

# Setting up the imports for our basic query tools

from steemdata import SteemData
import datetime
from datetime import datetime as dt

from pprint import pprint

# Init connection to database

db = SteemData()

# Grab posts I've written, just to be on the safe side.

query = {'author': 'lextenebris'}

# We only want the body field, honestly, so only gank that out.

proj = {'body': 1,
        '_id': 0}

# We'll take them in reverse chronological order so we don't 
#   need to bother sorting, just to be gentle.
# One day we need to talk about the crazily inconsistent
#   field names in the blockchain. And how bad they are.

sort = [('created', -1)]

# We'll just grab 50. Who knows, it might be interesting.
#   We can always grab more later.

result = db.Posts.find(query,
                       projection=proj,
                       sort=sort,
                       limit=50)

%time postsL = list(result)
Wall time: 748 ms
def procPost(e):
    return ' '.join(pypandoc.convert_text(e, 
                                        to='plain', 
                                        format='md').split('\r\n'))
# Some kind of random, and by that I mean
#   utterly arbitrary.

p = postsL[31]['body']

p[-500:]
"'s it, the entirety of my theory of *point of presence,* the idea that you don't really have one. You have tools which allow you to reach out to a community and be part of it, but there is no one central place that represents you and what you do.\n\nPlatforms change. Systems crash. Data is lost.\n\nYour presence is wherever your attention is. That is its singular point. Everything else is about being part of a community.\n\nNow get out there and do your thing, Thing Ring!\n\nhttps://youtu.be/64Jv8zbUsf4"

Now let's kill all the markdown in all of those fetched documents. Very easy.

%time postsN = [procPost(e['body']) for e in postsL]
Wall time: 5.67 s
postsN[31][-500:]
"s it, the entirety of my theory of _point of presence,_ the idea that you don't really have one. You have tools which allow you to reach out to a community and be part of it, but there is no one central place that represents you and what you do.  Platforms change. Systems crash. Data is lost.  Your presence is wherever your attention is. That is its singular point. Everything else is about being part of a community.  Now get out there and do your thing, Thing Ring!  https://youtu.be/64Jv8zbUsf4 "

Awesome!

Now let's bring in the big guns. Markovify is a library designed to build Markov chains: it takes a corpus of content, breaks it down statistically, finds the things that can start a sentence, selects one at random, and then repeatedly picks something plausible to follow what has come before.

I'm going to use that to generate potentially random possible posts that could have come from me. Or anybody else. Or a hybrid of the two, as you will soon see.
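For intuition about what's happening under the hood, the core trick fits in a few lines. This toy word-level version is my own sketch (markovify's real implementation tracks n-gram states and sentence boundaries, and is considerably smarter):

```python
import random
from collections import defaultdict

def build_chain(corpus):
    # Map each word to the list of words observed to follow it.
    chain = defaultdict(list)
    words = corpus.split()
    for a, b in zip(words, words[1:]):
        chain[a].append(b)
    return chain

def babble(chain, start, length=8):
    # Random-walk the chain: draw each next word from observed successors.
    out = [start]
    for _ in range(length - 1):
        followers = chain.get(out[-1])
        if not followers:
            break
        out.append(random.choice(followers))
    return ' '.join(out)

chain = build_chain("the cat sat on the mat and the dog sat on the cat")
print(babble(chain, "the"))
```

Because repeated pairs show up multiple times in the successor lists, common transitions are proportionally more likely to be chosen, which is the whole statistical trick.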

First, let's get things set up.

import markovify
import re

# We're going to use spaCy to do some basic part-of-speech 
#   analysis to make things even more believable. Maybe.

import spacy

nlp = spacy.load('en')

class POSifiedText(markovify.Text):
    def word_split(self, sentence):
        return ["::".join((word.orth_, word.pos_)) for word in nlp(sentence)]

    def word_join(self, words):
        sentence = " ".join(word.split("::")[0] for word in words)
        return sentence

(If you do this yourself on a Windows machine, make sure that you install spaCy from PIP with administrator access for the shell. Otherwise it just won't create the right directories for the language model.)

Let's make one giant string out of all of our documents to feed into the Markov generator! This should be a no-brainer.

%time model = markovify.Text(' '.join(postsN))
Wall time: 325 ms
# Print five randomly-generated sentences
for i in range(5):
    pprint(model.make_sentence())
    
print('\n')

# Print three randomly-generated sentences of no more than 140 characters
for i in range(3):
    pprint(model.make_short_sentence(140))
('Special Instructions - There may be better than having someone on the '
 "leftmost group of Hishen armor in a dungeon, so I'm going to have any excuse "
 'not to put in links, bolding, and italics?')
'Especially eaten with the figure breathe a little.'
("How many more accounts that don't really have the author, the voter, the "
 "date, the weight removed, we don't need to make an nice, thick token which "
 'can be on the figure.')
('So much chrome and glass that you should be put off until tomorrow, at least '
 'seems to be able to step down the account blockchain that this line is '
 'really small, is going to select _Attributes_, which are right there, '
 'waiting to be able to add a meta tag into the game, a _player_ is never in '
 'short order.')
'Though the Emperor knows how you get to vote on outcomes.'


'The only thing that you get a feel for how an individual texture.'
'Really the demons that we can build some edges.'
'The latter has more than enough to move in an earlier bit.'

Oh. Oh dear. That is great. I'm killing myself here. Insert much mad laughter.

But maybe only 50 posts is too small. We have the data. Why don't we go big or go home? How about my last 100 posts?

# 100 posts!

result = db.Posts.find(query,
                       projection=proj,
                       sort=sort,
                       limit=100)

# Just crunch all of the reification at once. Why wait?

%time postsN = [procPost(e['body']) for e in list(result)]
Wall time: 9.68 s
len(postsN)
79

Interesting. It looks like I only have 79 posts in total. It's clearly not picking up comments on other people's posts as well.

No matter, because this is going to suffice. If I want to emulate myself creating comments on other people's posts – that's easy enough with a slightly more complicated query. Overall, it just wouldn't change that much.
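For the record, that slightly more complicated query might look something like this. I'm assuming a Comments collection shaped like Posts (same 'author' and 'body' fields); check the actual SteemData schema before trusting it, and the `fetch_bodies` helper is my own invention:

```python
# Hypothetical sketch: assumes SteemData exposes a Comments collection
# shaped like Posts (same 'author'/'body' fields); verify before trusting.
query = {'author': 'lextenebris'}
proj = {'body': 1, '_id': 0}
sort = [('created', -1)]

def fetch_bodies(collection, limit=100):
    # Pull raw body text from any Posts-shaped collection.
    cursor = collection.find(query, projection=proj, sort=sort, limit=limit)
    return [e['body'] for e in cursor]

# posts = fetch_bodies(db.Posts)
# comments = fetch_bodies(db.Comments)   # hypothetical collection name
# corpus = [procPost(b) for b in posts + comments]
```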

Okay, let's throw all of this into the Markov generator!

%time model = markovify.Text(' '.join(postsN))
Wall time: 451 ms
# Print five randomly-generated sentences
for i in range(5):
    pprint(model.make_sentence())
    
print('\n')

# Print three randomly-generated sentences of no more than 140 characters
for i in range(3):
    pprint(model.make_short_sentence(140))
('You must justify the use of the world Challenges - Build the tech - Push an '
 'ICO - Promote the platform was implemented as a result of that was likely to '
 'go to those gears.')
('CENTRAL TRAIT First, you choose to transition into some sort of thing for '
 'final judgment.')
('That gives us a single source which could have been fascinated by the GM, '
 'just as their characters would.')
"If you've never been, you owe it to return as platoon leader."
("Just because you're a _combat accountant._ TRAITS Each character has and "
 'have them executed as a result.')


('People who trust me, because of the printed object just to get this on the '
 "assumption that what they're going to get greater exposure for.")
('This is one place than half the value of that vehicle remains to be able to '
 'run a railroad.')
("Things are looking a whole case for everything that he's selling Steem and a "
 'larger group than just a little bit.')

That is truly seven shades of awesome.

But is it enough? No! Never enough!

What would the Markov-generated joy from a very different writer look like?

I have a sick curiosity. What if we wanted to emulate @jerrybanfield? Surely we can point the system over at him?

query = {'author': 'jerrybanfield'}

result = db.Posts.find(query,
                       projection=proj,
                       sort=sort,
                       limit=100)

%time postsB = [procPost(e['body']) for e in list(result)]
Wall time: 13.6 s
postsB[0][:500]
"[Jerry Banfield]  How does owning our story of why we feel bad and what's going on with us, and sharing those feelings that we want to keep inside actually help us to feel happy more often?  Thank you for reading about day 185 of _Happier People Podcast_ and I hope you enjoy it!   New episodes of #happierpeople podcast are published first at https://dsound.audio/#!/@jerrybanfield  Listen to this on @dsound at https://dsound.audio/#/@jerrybanfield/how-we-feel-better-by-owning-our-shadow-without-p"

Yep, that's Jerry.

%time modelB = markovify.Text(' '.join(postsB))

# Print five randomly-generated sentences
for i in range(5):
    pprint(modelB.make_sentence())
    
print('\n')

# Print three randomly-generated sentences of no more than 140 characters
for i in range(3):
    pprint(modelB.make_short_sentence(140))
Wall time: 679 ms
('The public, especially in Orlando, but hey, we have somewhere to start '
 'before you are a huge sample pack I bought into Steem.')
('Then, that starts feeling like a miracle for musicians being able to give '
 'this a try for the people we love the most good with my divinity when my '
 'soul feels lost in fear and future.')
'We want that intro post and not even looking at being censored.'
'Given that these two keyboards.'
'What might be able to hire for this body to do it again sometime!'


'Every single human being on Steem.'
('If we switch the days and more time in the community and share automatically '
 "on to my account's voting bot.")
'At an equivalent of 60% APR, this is amazing on Steemit.'

Honestly, it's hard to tell them apart.

But you know what comes next. If you were to fuse Jerry and me into one being, what would the resulting writing look like?

hybridModel = markovify.combine([model,  modelB])
for i in range(5):
    pprint(hybridModel.make_sentence())
    
print('\n')

# Print three randomly-generated sentences of no more than 140 characters
for i in range(3):
    pprint(hybridModel.make_short_sentence(140))
("I'm saying this is something that was just feeling annoyed with my friend "
 'Tomas George with digital music masters at digitalmusicmasters.com helped me '
 "a way that I do, I've got this.")
('The funny thing is, when I started this whole row are set up our password on '
 'Steemit. '
 'https://steemitimages.com/DQmamXduj5Jwvir7m9qwB9bgx8tFoXKUqStpGodPKsCHX3P/S107-02.jpg '
 'We will see that some bid bots to help decide who gets to roll 5 on the '
 'Steem blockchain as a post on her right away when you get to know everything '
 "yet, but I am one with 500 views and I've moved it into the account set up "
 'the dialogue for creating it with both steemads and jerrybanfield to help '
 'Steem continue to be terribly useful – except that you keep putting content '
 'that you wonder how I reacted.')
'That comes to our much lighter vehicles.'
'I was looking at this time with one another.'
'I could be a big deal in countries like China where it rained acid all day?'


('But, yes, if you get to know that just makes me look good for one developer '
 'to code it for a good shot of securing it with you!')
('Let me open by saying that it gives me time to implement and demonstrate, '
 'which is primary.')
('He looked over at Block City have put in and that it will take place and a '
 '95% reduction in the C major, I mean the Internet.')

There is no end to it. No end at all. The mind burns at it.

Okay, one more ... @haejin. Let's do Haejin. My mind is already quailing at it.

query = {'author': 'haejin'}

result = db.Posts.find(query,
                       projection=proj,
                       sort=sort,
                       limit=100)

%time postsH = [procPost(e['body']) for e in list(result)]

postsH[0][:500]
Wall time: 12.8 s
"[] --  SUMMARY  This Inverted Bullish Head & Shoulders pattern is quite uncanny. What's exciting is that the pattern is complete and confirmed! Price has breached the Neck LIne! I've used the minimum price run to show a potential for $59.87  []  Beautiful textbook impulse wave showing a near completion of what could be the first of many leg ups. IF the abc red waves mark the correction completion, then we could expect a minor abc retracement.  []  That retracement usually goes to prior wave 4 an"

This might just be unfair, but we're going to do it anyway.

%time modelH = markovify.Text(' '.join(postsH))

# Print five randomly-generated sentences
for i in range(5):
    pprint(modelH.make_sentence())
    
print('\n')

# Print three randomly-generated sentences of no more than 140 characters
for i in range(3):
    pprint(modelH.make_short_sentence(140))
Wall time: 86 ms
('There are other alternate counts and they will be in progress towards the '
 'upper blue line as support.')
'The rise to C. Overall, bulish on the upcoming minor wave 2?'
('The information provided in this blog post and any accompanying material is '
 'for informational purposes only.')
('I believe hitting the $9,451 level would be ideal is a higher high is '
 'recognized.')
'It is imperative that price can be more proximal to the lower triangle line?'


('The information provided in this blog post and any accompanying material is '
 'for informational purposes only.')
('What would be probable IF the abc was carved out abc of the breakout is '
 'needed yet again but at a higher high.')
('The breakout currently looks very three wavish, the broader ABC could still '
 'be in progress.')

I – can't really tell the difference.

Maybe we've found the real source of this sort of thing?

Though only one thing remains to us: the three-headed Cerberus of Steem posts!

hybridModel = markovify.combine([model,  modelB, modelH])
for i in range(5):
    pprint(hybridModel.make_sentence())
    
print('\n')

# Print three randomly-generated sentences of no more than 140 characters
for i in range(3):
    pprint(hybridModel.make_short_sentence(140))
('It works in the hand; it needed just a few thousand, we actually could have '
 'first bought it.')
'If you speak multiple languages, both languages there.'
('Bittrex.com Cryptocurrency Exchange Trading Tutorial with Bitcoin I had, and '
 "I don't lose everything or almost everything.")
'Out of Key Again in Ableton Live 9 Suite, I used in modeling things for free.'
('Readers will have many USD/fiat to Steem according to your printer going 60 '
 'to 80 mm/s at the list of voting bots produces a harvest faster!')


"It's _that good._ Mechanically, it's pretty obvious that there is separation."
'What do you have to put it into TS and see that.'
'The price today is over 250,000 lines long.'

My mind. My everlovin' mind.

Epilogue

Markovify is amazing.

Combining it with the ability to pull content from the Steem blockchain and remix it into new content is even more fun. I'm really quite surprised and impressed at how lightweight all of this is in terms of CPU time.

Also it's been great to continue coding but get away from doing statistical graph analysis for a little bit. Good times, good times.

Who would you like to see turned into a living Markov chain?

While we are doing things, let's get extreme. 100 posts is cool – but what if we make a hybrid out of 1000 posts from two of our most prolific posters? Would we be able to tell the difference?

Ultimate Form!

db = SteemData()
query = {'author': 'jerrybanfield'}

result = db.Posts.find(query,
                       projection=proj,
                       sort=sort,
                       limit=1000)

%time postsB = [procPost(e['body']) for e in list(result)]
Wall time: 5min 55s
%time modelB = markovify.Text(' '.join(postsB))
Wall time: 3.11 s
# Print five randomly-generated sentences
for i in range(5):
    pprint(modelB.make_sentence())
    
print('\n')

# Print three randomly-generated sentences of no more than 140 characters
for i in range(3):
    pprint(modelB.make_short_sentence(140))
('I am planning in my office, those would be worth a million Steem Power A DAY '
 'in Steem power.')
'I learned how to buy in.'
'This RSA key to complete this next URL look?'
('Bitcoin may actually be able to see the value is anywhere from 5% to the '
 'manual upvotes coming later, the rewards are preferred because we feel in '
 "control of our openness in what he's talking about.")
("If I just couldn't wait to place witness votes through setting me as a "
 'witness at steemit.com/~witnesses because this prevents duplicate voting.')


('This is the first few thousand more I am on the craps table.I had my whole '
 'life and to feel better afterwards.')
("I'm not spending any more cryptocurrency stuff, but I've always had an "
 'application on Poloniex, just a few of mine.')
('Then, this worked on writing this with the other stuff I was in a trip to '
 'Magic Kingdom Theme Park in the afternoon before.')
query = {'author': 'haejin'}

result = db.Posts.find(query,
                       projection=proj,
                       sort=sort,
                       limit=1000)

%time postsH = [procPost(e['body']) for e in list(result)]
Wall time: 3min 52s
%time modelH = markovify.Text(' '.join(postsH))
Wall time: 915 ms
# Print five randomly-generated sentences
for i in range(5):
    pprint(modelH.make_sentence())
    
print('\n')

# Print three randomly-generated sentences of no more than 140 characters
for i in range(3):
    pprint(modelH.make_short_sentence(140))
("However, Cryptos have their own personalities as we've seen this with a "
 'financil or investment advice of any kind.')
'The handle would coincide with the Popcorn Sympohony of Altcoins!'
('Once wave 2 of the larger triangle is quite close to being done with the '
 "buyer's and seller's remorse events.")
'The Video has more details!'
'This pair too is expected to resume towards hitting $0.0285 or higher.'


'Since the handle formation is complete, Elliott Waves point to try again!'
('Last night, I shared the below shows STX/BTC with a financil or investment '
 'advice of any kind.')
"Let's see how the white impulse waves 1,2,3,4,5."
hybridModel = markovify.combine([modelB, modelH])
for i in range(5):
    pprint(hybridModel.make_sentence())
    
print('\n')

# Print three randomly-generated sentences of no more than 140 characters
for i in range(3):
    pprint(hybridModel.make_short_sentence(140))
('I am going to have a hard time trying to struggle, we want to tell a story '
 'about it, or you simply run a Steem Python library, “a high quality naming, '
 'just copy and paste hacking attempts from tutorials online combined with an '
 'aim towards mastery and MASSIVE profits can be used during the first person '
 'shooters, Duke Nukem and Quake, and playing with Ableton Live with C Major, '
 'velocity, and note how I ended up stopping doing as far as you can just '
 'customize this deeper and show how my main servers located in the '
 'whitepaper.')
('Then, I would be the most important votes we make our votes, we feel in '
 'control of our community!')
('With a scenario as shown in the 10 tips are helpful and should be able to '
 "track bot upvotes, the majority of the biggest problem I've struggled with "
 'alcohol and drugs were a starter.')
('They show things that could allow all of us just work all day, and I am very '
 'excited about how my life being a father!')
('The correction took the camera and on a bunch of work is more important that '
 'I want to do it.')


('I felt resentful, I had this latent love or when certain exchanges like '
 'Poloniex or Bit t-rex.')
('Please consider reviewing these Tutorials on: Elliott Wave counts of the '
 'sellers are dwindling and once d and e still remain incomplete.')
'I am noticing is the same things.'

Tools

  • Markovify

  • spaCy - Though if you install spaCy through pip, make sure to install the English language model as admin.
    python -m spacy download en

  • SteemData -- Thanks to @furion, as ever.

Comments

You seem to be enjoying this coding stuff... ;)

It's been a while. I should really do some other kinds of posts, but I keep having these strange ideas for coding.

Believe me, I want more than anybody to get away from coding and get back to writing about role-playing games and writing. This is just the stuff that gets lodged in my head sideways.

Markov-Chains for generating sentences are amazing.

I remember, back in the IRC days (okay … the late IRC days of 2008), a bot in a channel I frequented would post a random Markov sentence after every 50 messages other people wrote.
Often it was hilarious, occasionally speaking some actual truth, and sometimes just complete garbage, especially when new people actually talked to the bot, not knowing it was a bot.

Markovify strapped together with a part of speech system just turns out some amazingly hilarious stuff.

I recognize that my targets in this particular example were a bit on the easy side to parody – though to be fair, that includes myself.

I was just in time to upvote this: cool Lex! :-)

Well, I never promised to be any kind of good programmer. :)

So I don't code myself but I've been around it long enough to look at bits of it and understand what you're doing. I must be a sad geek coz I enjoyed the read. :-)

Markov chains in general just make me laugh. The whole architecture, the whole plan. That there is a convenient, high quality library to do that kind of analysis just pleases me beyond words, and I've been looking for an excuse to use it for a while.

I should look for more interesting writers to mash up; writers with very clear and distinct voices. Because that's maximal comedy value, right there.

Then, I would be the most important votes we make our votes, we feel in 'control' of our community!

These guys LOVE themselves :P

At a certain point, when even machines make fun of you, you have to stand back and assess the decisions that you've made in your life. You may still stand by those decisions, but you must assess them.

Nice post & also i support u ... thanks for sharing...