Update: Steem-Exif-Spider-bot New Voting and Comments Queues

in utopian-io •  7 years ago  (edited)

This update adds several improvements and features:

  1. Comments queue
  2. Voting queue
  3. Switch from filesystem streams to in-memory buffers

Comments and Voting queues


In other projects, I have noticed significant efficiency improvements from switching to queues that spread voting and commenting out over time instead of processing all votes and comments simultaneously. The queue implementations are fairly simple: items are pushed onto and shifted off an array. A scheduler manages each queue, shifting an item off the array at periodic intervals. Ideally, this would be driven by an event instead. That may come as a future change.
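The push/shift pattern with a periodic drain can be sketched as follows. This is a minimal illustration rather than the bot's code: `drainOne`, `startDraining`, and the sample items are hypothetical, and `setInterval` stands in for node-schedule so the sketch needs no dependencies.

```javascript
// Minimal sketch of a scheduled queue drain (all names here are hypothetical).
const queue = [];

// Process at most one item per tick so the work is spread out over time.
function drainOne(items, handler) {
    if (items.length < 1) {
        return false; // nothing to do on this tick
    }
    handler(items.shift()); // oldest item first (FIFO)
    return true;
}

// Stand-in for the scheduler: call drainOne at a fixed interval.
// Returns the timer so the caller can stop it with clearInterval().
function startDraining(items, handler, intervalMs) {
    return setInterval(() => drainOne(items, handler), intervalMs);
}

// Producers only push; nothing is broadcast at push time.
queue.push({ author: 'alice', permlink: 'first-post' });
queue.push({ author: 'bob', permlink: 'second-post' });

// One manual tick: only the oldest item is handled, the rest waits.
const handled = [];
drainOne(queue, (item) => handled.push(item.author));
```

Handling one item per tick is what spreads the broadcasts out; the trade-off is that a long queue takes one interval per item to empty.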

Switching from Filesystem Streams to in-memory Buffers

Prior to this change, JPG data was read via HTTP using streams. The data was streamed and written to a file, and that file was then read back in as a buffer. This change skips the intermediate file. Rather than blocking on filesystem I/O, the bot now loads the HTTP response body straight into an in-memory buffer without writing to disk first.
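For reference, here is a dependency-free sketch of collecting a stream into a single buffer. The diff below does not need this step, because it asks got for the body with `encoding: null` and reads `response.body`. The helper names `streamToBuffer` and `looksLikeJpeg`, and the fake download, are assumptions made for illustration.

```javascript
const { Readable } = require('stream');

// Collect every chunk of a readable stream into one in-memory Buffer.
function streamToBuffer(stream) {
    return new Promise((resolve, reject) => {
        const chunks = [];
        stream.on('data', (chunk) => chunks.push(Buffer.from(chunk)));
        stream.on('end', () => resolve(Buffer.concat(chunks)));
        stream.on('error', reject);
    });
}

// JPEG data starts with the bytes FF D8 (the start-of-image marker).
function looksLikeJpeg(buffer) {
    return buffer.length >= 2 && buffer[0] === 0xff && buffer[1] === 0xd8;
}

// A fake two-chunk "download" instead of a real HTTP response.
const fakeDownload = Readable.from([
    Buffer.from([0xff, 0xd8]),
    Buffer.from([0xff, 0xe1])
]);

// Resolves to a single 4-byte Buffer; no temporary file is involved.
const result = streamToBuffer(fakeDownload);
```

Checking the leading bytes is a little stricter than matching `.jpg` in the URL, though the bot itself only does the latter.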

Changes

diff --git a/steem-exif-spider-bot/helpers/bot/comment.js b/steem-exif-spider-bot/helpers/bot/comment.js
new file mode 100644
index 0000000..10c2093
--- /dev/null
+++ b/steem-exif-spider-bot/helpers/bot/comment.js
@@ -0,0 +1,61 @@
+const Promise = require('bluebird')
+const steem = require('steem')
+const { user, wif, weight } = require('../../config')
+const schedule = require('node-schedule')
+const Handlebars = require('handlebars')
+const fs = Promise.promisifyAll(require('fs'))
+const path = require('path')
+
+const MINUTE = new schedule.RecurrenceRule();
+MINUTE.second = 1
+
+function loadTemplate(template) {
+    return fs.readFileAsync(template, 'utf8')
+}
+
+
+function execute(comments) {
+
+    if (comments.length() < 1) {
+        return {};
+    }
+
+    const { author, permlink } = comments.shift();
+
+    var context = {
+    }
+
+    return loadTemplate(path.join(__dirname, '..', 'templates', "exif.hb"))
+        .then((template) => {
+            var templateSpec = Handlebars.compile(template)
+            return templateSpec(context)
+        })
+        .then((message) => {
+            var new_permlink = 're-' + author 
+                + '-' + permlink 
+                + '-' + new Date().toISOString().replace(/[^a-zA-Z0-9]+/g, '').toLowerCase();
+            console.log("Commenting on ", author, permlink)
+
+            return steem.broadcast.commentAsync(
+                wif,
+                author, // Parent author
+                permlink, // Parent permlink
+                user, // Author
+                new_permlink, // Permlink
+                new_permlink, // Title
+                message, // Body
+                { tags: [], app: "steemit-exif-spider-bot/0.1.0" }
+            ).then((results) => {
+                console.log(results)
+                return results
+            })
+            .catch((err) => {
+                console.log("Error ", err.message)
+            })
+        })
+}
+
+module.exports = {
+    execute
+}

This is the comment queue implementation. It pulls a comment off the queue and posts it to the Steem blockchain.
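The reply permlink built in the diff above can be isolated into a small function. This is a sketch, not part of the commit: the name `buildReplyPermlink` and the injectable `now` parameter are additions so the result is reproducible.

```javascript
// Build a unique reply permlink the same way the diff above does:
// 're-' + parent author + '-' + parent permlink + '-' + compact timestamp.
// The `now` parameter is an addition for this sketch.
function buildReplyPermlink(author, permlink, now = new Date()) {
    const stamp = now.toISOString()
        .replace(/[^a-zA-Z0-9]+/g, '') // drop '-', ':' and '.'
        .toLowerCase();                // 'T' and 'Z' become 't' and 'z'
    return 're-' + author + '-' + permlink + '-' + stamp;
}

// A fixed date: 2018-04-01T12:30:45.123Z (months are zero-based in Date.UTC).
const fixed = new Date(Date.UTC(2018, 3, 1, 12, 30, 45, 123));
const example = buildReplyPermlink('alice', 'my-post', fixed);
// example === 're-alice-my-post-20180401t123045123z'
```

The timestamp has millisecond resolution, so two replies to the same post only collide if they are generated within the same millisecond.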

diff --git a/steem-exif-spider-bot/helpers/bot/exif.js b/steem-exif-spider-bot/helpers/bot/exif.js
index bc49a21..738c48a 100644
--- a/steem-exif-spider-bot/helpers/bot/exif.js
+++ b/steem-exif-spider-bot/helpers/bot/exif.js
@@ -17,6 +17,9 @@ module.exports = {
     execute
 }
 
+let VOTING = {}
+let COMMENTS = {}
+
 function loadTemplate(template) {
     return fs.readFileAsync(template, 'utf8')
 }
@@ -35,60 +38,32 @@ function processComment(comment) {
             }
             return [];
         })
-        .each((image) => {
-            if (image.indexOf(".jpg") > -1|| image.indexOf(".JPG") > -1) {
-                const dest = tempfile('.jpg');
-                try {
-                    got.stream(image).pipe(fs.createWriteStream(dest))
-                        .on('close', () => {
-                            try {
-                                const input = ExifReader.load(fs.readFileSync(dest));
-                                const tags = []
-                                for (let key in input) {
-                                    const value = input[key];
-                                    if (key != "MakerNote"
-                                        && key.indexOf("undefined") < 0
-                                        && key.indexOf("omment") < 0
-                                        && key.indexOf("ersion") < 0) {
-                                        tags.push({ name: key, value: value.value, description: value.description })
-                                    }
-                                }
-
-                                reply(comment, tags)
-                            }
-                            catch(err) {
-                                if (err.message == "No Exif data") {
-
-                                }
-                            }
-                        })
-                }
-                catch (err) {
-                    console.log("Error ", err)
-                }
-                finally {
-                    fs.unlink(dest, (err) => {
-                        // file deleted
+        .map((image) => {
+            if (image.indexOf(".jpg") > -1 || image.indexOf(".JPG") > -1) {
+                const buffers = [];
+                return got(image, {encoding: null })
+                    .then((response) => {
+                        console.log("Loading ", image);
+                        return ExifReader.load(response.body);
+                    })
+                    .catch((error) => {
+                        console.log("Error ", error);
                     });
+            }
+        })
+        .filter((tags) => tags ? true : false)
+        .each(input => {
+            const tags = []
+            for (let key in input) {
+                const value = input[key];
+                if (key != "MakerNote"
+                    && key.indexOf("undefined") < 0
+                    && key.indexOf("omment") < 0
+                    && key.indexOf("ersion") < 0) {
+                    tags.push({ name: key, value: value.value, description: value.description })
                 }
             }
+            reply(comment, tags)
         })
         .catch((error) => {
             console.log("Error ", error)
@@ -101,51 +76,23 @@ function reply(comment, tags) {
         tags: tags
     }
 
-
-    return loadTemplate(path.join(__dirname, '..', 'templates', 'exif.hb'))
-    .then((template) => {
-        var templateSpec = Handlebars.compile(template)
-        return templateSpec(context)
-    })
-    .then((body) => {
-        console.log("Body ", body)
-        return body;
-    })
-    .then((body) => {
-        var permlink = 're-' + comment.author 
-            + '-' + comment.permlink 
-            + '-' + new Date().toISOString().replace(/[^a-zA-Z0-9]+/g, '').toLowerCase();
-
+    return new Promise((resolve, reject) => {
         console.log("Replying to ", {author: comment.author, permlink: comment.permlink})
-        return steem.broadcast.commentAsync(
-            wif,
-            comment.author, // Leave parent author empty
-            comment.permlink,
-            user, // Author
-            permlink, // Permlink
-            permlink, // Title
-            body, // Body
-            { "app": "steem-exif-spider-bot/0.1.0" }
-        )
-        .catch((err) => {
-            console.log("Unable to process comment. ", err)
-        })
+        COMMENTS.push({ author: comment.author, permlink: comment.permlink })
+
+        resolve([comment.author, comment.permlink])
     })
-    .then((response) => {
-        return steem.broadcast.voteAsync(wif, user, comment.author, comment.permlink, weight)
-            .then((results) =>  {
-                console.log(results)
-            })
-            .catch((err) => {
-                console.log("Vote failed: ", err)
-            })
+    .spread((author, permlink) => {
+        VOTING.push({ author: author, permlink: permlink, weight: weight });
     })
     .catch((err) => {
         console.log("Error loading template ", err)
     })
 }
 
-function execute() {
+function execute(voting, comments) {
+    VOTING = voting
+    COMMENTS = comments
 
     steem.api.streamOperations((err, results) => {
         return new Promise((resolve, reject) => {

This moves the commenting and voting logic out of exif.js so the comment and voting queues can handle it, and switches image loading from filesystem streams to in-memory buffers.
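The tag filtering in the diff above can be read as a pure function. The sketch below restates it under that assumption; `shouldKeep`, `toTagList`, and the sample object are hypothetical, and it uses `Object.keys` rather than `for...in`, so only own properties are visited.

```javascript
// Keys the bot skips: MakerNote, plus any key containing "undefined",
// "omment" (Comment/comment) or "ersion" (Version/version).
function shouldKeep(key) {
    return key !== 'MakerNote'
        && key.indexOf('undefined') < 0
        && key.indexOf('omment') < 0
        && key.indexOf('ersion') < 0;
}

// Turn the parsed EXIF object into the flat list the template consumes.
function toTagList(input) {
    return Object.keys(input)
        .filter(shouldKeep)
        .map((key) => ({
            name: key,
            value: input[key].value,
            description: input[key].description
        }));
}

// Hypothetical parsed result; real parser output has many more fields.
const sample = {
    Make: { value: ['Canon'], description: 'Canon' },
    MakerNote: { value: [1, 2, 3], description: '[Raw maker note data]' },
    UserComment: { value: [], description: '' },
    ExifVersion: { value: [48, 50, 51, 48], description: '0230' },
    FNumber: { value: 2.8, description: 'f/2.8' }
};

// Only Make and FNumber survive the filter.
const tags = toTagList(sample);
```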

diff --git a/steem-exif-spider-bot/helpers/bot/index.js b/steem-exif-spider-bot/helpers/bot/index.js
index 083546c..64b61ac 100644
--- a/steem-exif-spider-bot/helpers/bot/index.js
+++ b/steem-exif-spider-bot/helpers/bot/index.js
@@ -1,6 +1,27 @@
+const voting_queue = [];
+const comment_queue = [];
+
+const voting = {
+    length: () => { return voting_queue.length },
+    push: (obj) => { return voting_queue.push(obj) },
+    pop: () => { return voting_queue.pop() },
+    shift: () => { return voting_queue.shift() },
+    unshift: (obj) => { return voting_queue.unshift(obj) }
+}
+
+const comments = {
+    length: () => { return comment_queue.length },
+    push: (obj) => { return comment_queue.push(obj) },
+    pop: () => { return comment_queue.pop() },
+    shift: () => { return comment_queue.shift() },
+    unshift: (obj) => { return comment_queue.unshift(obj) }
+}
+
 
 function run() {
-    return require("./exif").execute();
+    require('./comment').execute(comments)
+    require('./vote').execute(voting)
+    require('./exif').execute(voting, comments)
 }

The queue model: thin wrappers around two arrays, shared by the EXIF scanner, the comment module, and the vote module.
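The two wrappers above are identical apart from the array they close over, so a factory could build both. This is only a sketch: `createQueue` is a hypothetical name, and the `'push'` event is an assumed way to start on the event-driven approach mentioned earlier, not something in the commit.

```javascript
const EventEmitter = require('events');

// One factory instead of two hand-written wrappers. The method names match
// the diff above; `on` and the 'push' event are hypothetical additions.
function createQueue() {
    const items = [];
    const events = new EventEmitter();
    return {
        length: () => items.length,
        push: (obj) => {
            const size = items.push(obj);
            events.emit('push', obj); // lets a consumer react without polling
            return size;
        },
        pop: () => items.pop(),
        shift: () => items.shift(),
        unshift: (obj) => items.unshift(obj),
        on: (name, listener) => events.on(name, listener)
    };
}

const voting = createQueue();
const comments = createQueue();

// A consumer can subscribe instead of waking up on a timer.
const seen = [];
comments.on('push', (item) => seen.push(item.permlink));
comments.push({ author: 'alice', permlink: 'first-post' });
```

Each call to `createQueue` gets its own array, so the voting and comment queues stay independent just as they do in the diff.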

diff --git a/steem-exif-spider-bot/helpers/bot/vote.js b/steem-exif-spider-bot/helpers/bot/vote.js
new file mode 100644
index 0000000..58714c8
--- /dev/null
+++ b/steem-exif-spider-bot/helpers/bot/vote.js
@@ -0,0 +1,84 @@
+const Promise = require('bluebird')
+const steem = require('steem')
+const { user, wif, weight } = require('../../config')
+const schedule = require('node-schedule')
+const moment = require('moment');
+
+const MINUTE = new schedule.RecurrenceRule();
+MINUTE.second = 1
+
+const SECONDS_PER_HOUR = 3600
+const PERCENT_PER_DAY = 20
+const HOURS_PER_DAY = 24
+const MAX_VOTING_POWER = 10000
+const DAYS_TO_100_PERCENT = 100 / PERCENT_PER_DAY
+const SECONDS_FOR_100_PERCENT = DAYS_TO_100_PERCENT * HOURS_PER_DAY * SECONDS_PER_HOUR
+const RECOVERY_RATE = MAX_VOTING_POWER / SECONDS_FOR_100_PERCENT
+const DEFAULT_THRESHOLD = 9500
+
+
+function current_voting_power(vp_last, last_vote) {
+    console.log("Comparing %s to %s ", moment().utc().add(7, 'hours').local().toISOString(), moment(last_vote).utc().local().toISOString())
+
+    var seconds_since_vote = moment().utc().add(7, 'hours').local().diff(moment(last_vote).utc().local(), 'seconds')
+    return (RECOVERY_RATE * seconds_since_vote) + vp_last
+}
+
+function time_needed_to_recover(voting_power, threshold) {
+    return (threshold - voting_power) / RECOVERY_RATE
+}
+
+function check_can_vote() {
+    return steem.api.getAccountsAsync([ user]).then((accounts) => {
+        if (accounts && accounts.length > 0) {
+            const account = accounts[0];
+            console.log("Voting threshold for %s: %s", user, DEFAULT_THRESHOLD)
+            console.log("Getting voting power for %d %s", account.voting_power, account.last_vote_time)
+            var voting_power = current_voting_power(account.voting_power, account.last_vote_time)
+            if (voting_power > DEFAULT_THRESHOLD) {
+                return true;
+            }
+        }
+        return false;
+    })
+}
+
+function vote(author, permlink, weight) {
+    return steem.broadcast.voteAsync(
+        wif, 
+        user, 
+        author,
+        permlink,
+        weight
+    )
+    .then((results) =>  {
+        console.log("Vote results: ", results)
+        return results;
+    },
+    (err) => {
+        console.log("Vote failed for %s: %s", user, err.message)
+    })
+}
+
+function execute(voting) {
+    schedule.scheduleJob(MINUTE, function() {
+        if (voting.length() < 1) {
+            return {};
+        }
+               
+        const { author, permlink, weight } = voting.shift();
+
+        return check_can_vote().then((can_vote) => {
+            if (can_vote) {
+                vote(author, permlink, weight)
+            }
+            else {
+                voting.push({ author, permlink, weight })
+            }
+        })
+    })
+}
+
+module.exports = {
+    execute
+}

This is the voting queue module. It periodically pulls a vote off the queue and broadcasts it if the account's voting power is above the threshold. Otherwise, the vote is pushed back onto the queue.
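The recovery arithmetic in vote.js works out as follows: at 20% per day a full recharge takes 5 days, or 432000 seconds, so the rate is 10000 / 432000, roughly 0.0231 units per second. The sketch below restates that with plain `Date` objects. The camelCase names, the cap at the maximum, and the handling of timestamps as UTC strings with a `Z` appended are assumptions of this sketch, not what the commit does.

```javascript
const MAX_VOTING_POWER = 10000;                // 100.00% in the units vote.js uses
const SECONDS_FOR_100_PERCENT = 5 * 24 * 3600; // 20% per day => 5 days => 432000 s

// Estimated voting power at `now`, given the stored value at the last vote.
// The cap at MAX_VOTING_POWER is an addition in this sketch.
function currentVotingPower(vpLast, lastVoteTime, now) {
    const seconds = Math.floor((now.getTime() - lastVoteTime.getTime()) / 1000);
    const recovered = (MAX_VOTING_POWER * seconds) / SECONDS_FOR_100_PERCENT;
    return Math.min(MAX_VOTING_POWER, vpLast + recovered);
}

// Seconds until voting power climbs from `votingPower` to `threshold`.
function secondsToRecover(votingPower, threshold) {
    const missing = threshold - votingPower;
    return Math.max(0, (missing * SECONDS_FOR_100_PERCENT) / MAX_VOTING_POWER);
}

// Assumption: timestamps arrive as UTC strings without a zone suffix.
const lastVote = new Date('2018-04-01T06:00:00' + 'Z');
const now = new Date('2018-04-01T12:00:00' + 'Z');

// Six hours recover 21600 * 10000 / 432000 = 500 units: 9000 -> 9500.
const power = currentVotingPower(9000, lastVote, now);

// Going from 9000 to the 9500 threshold takes 21600 seconds (6 hours).
const wait = secondsToRecover(9000, 9500);
```

Parsing both timestamps explicitly as UTC would avoid the need for a fixed hour offset when comparing them.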

diff --git a/steem-exif-spider-bot/package.json b/steem-exif-spider-bot/package.json
index 1abbdbe..8024fd2 100644
--- a/steem-exif-spider-bot/package.json
+++ b/steem-exif-spider-bot/package.json
@@ -31,6 +31,7 @@
     "got": "^8.3.0",
     "handlebars": "^4.0.11",
     "jdataview": "^2.5.0",
+    "node-schedule": "^1.3.0",
     "request": "^2.85.0",
     "steem": "^0.7.1",
     "tempfile": "^2.0.0",

Adds the node-schedule dependency, which the queues use for their periodic jobs.



Posted on Utopian.io - Rewarding Open Source Contributors


Thank you for the contribution.

Great effort, and I like seeing how often you contribute to these projects and utopian.io.

When you write these contributions I would like to think that you know there is at least another human being reading your contribution. I am not sure what function these long diffs achieve for other people, but they are not helpful at all to me.

What I would do, maybe, is at least leave the removed lines out of this, and consider making the remaining lines more about the domain logic than about the lower-level implementation.

You can contact us on Discord.
[utopian-moderator]

Thanks so much. Yeah, the diffs are really lengthy and you can probably just look at the PR, so I don't know what's so great about them. It's just that there was a time a moderator asked me to add them in.

BTW, all lines (I am pretty sure) are related to the features. I think I actually left out a few that I think aren't (package-lock.json). If you see any that you think are not, let me know. I'll be sure to keep a look out in the future.

Thanks for the feedback. I appreciate the chance to improve.

Hey @r351574nc3 I am @utopian-io. I have just upvoted you!

Achievements

  • You have less than 500 followers. Just gave you a gift to help you succeed!
  • Seems like you contribute quite often. AMAZING!

Community-Driven Witness!

I am the first and only Steem Community-Driven Witness. Participate on Discord. Let's GROW TOGETHER!


Up-vote this comment to grow my power and help Open Source contributions like this one. Want to chat? Join me on Discord https://discord.gg/Pc8HG9x


Pretty neat, but your @salty-mcgriddles account is being very spammy. Maybe it could be activated only by a specific trigger in posts or comments.

Not my account. It's my daughter's. She's running the bot to play with it. I don't think it'll stay running much longer. She'll probably turn it off when we get back from our soccer game.

Maybe it could be activated only by a specific trigger in posts or comments.

Yeah. That's a good idea.

I think it is also supposed to leave an upvote, which would make people less annoyed with the spam.

Spam is not the only issue here either. The other issue is privacy.

Posting the EXIF info without someone opting in to it can also compromise the identity of the account posting. Someone may accidentally post a photo without the EXIF scrubbed, but realize it and quietly make a change. With a bot running rampant posting EXIF data whether anyone likes it or not, someone's personal data may be made a whole lot more obvious.

Ok, to be clear, exif information is already public once you post the photo (see http://exif-viewer.com/). The point of this bot is convenience so readers and posters can get information that's already available. But yeah, spam is bad. The point is not to spam. It's a bot of convenience. If people don't find it useful, then there's no point to it.


Visible EXIF data is public if a poster chooses to publish it in their post - and technically they have published it by posting an image - so anyone with half a brain can view the data if they want to - it's not up to you to spam it in their comments. There is nothing convenient about any of this. Your script is horrendous - kill it and DO NOT EXECUTE IT AGAIN.

Your campaign against my daughter is despicable. You should be ashamed. Odds are, this will be refined. It will be run again. It'll probably be run on opt-in basis, so people that don't know about it or don't want to use it don't have to.

You're just one no in a sea of YES.

IT IS NOT USEFUL - YOU ARE PULLING META DATA FROM EVERY IMAGE YOU SCRAPE - WHETHER IT BE A SCREENSHOT, GRAPHIC ART, SLIDE SCANS ETC. AND YOU ARE POSTING THIS DATA IN THE COMMENTS OF POSTS THAT HAVE NOTHING TO DO WITH PHOTOGRAPHY - AT ALL

You say it's not useful and then you explain exactly why it is useful. Which is it? I'm not posting any data in comments. I can't be held responsible for what users do with this bot.

Also, it's checking the #photography tag. If it's picking up images that are not #photography, then that's tag abuse. It's not the bot's fault.

@r351574nc3 @salty-mcgriddles @stranded
Why don't you use APP Engine to create a parser and interface for users to generate their own metadata - posting it directly to comments looks like nothing more than comment spamming for upvotes.

I mean - adapt your scraper to do what it is doing now (collecting steemit EXIF and formatting it) - but move it to another platform. Then, let a user input their steemit name and present them with pre-formatted exif data which they can link to or include in their posts.

You mean like this one? https://steemit.com/utopian-io/@r351574nc3/new-project-steemit-exif-microservice

let a user input their steemit name

Check (not just a name, but directed content by permlink)

present them with pre-formatted exif data which they can link to or include in their posts.

Check (preformatted JSON, that they can then refine and prune for whatever data they want)

So yes, you could create a very useful and refined service which would be unique to steemit - I think a project like that would have a lot of merit.

Here's your demo https://steemit-exif-ms.herokuapp.com/soma909/le-fond-de-l-air-est-rouge-a-grin-without-a-cat-2017-fuji-x100-mk1-tele-conversion-lens

With some mediocre skills, you can extract exactly what fields you want and simultaneously convert it to markdown to include in a post.

@r351574nc3 @salty-mcgriddles @stranded
I don't know where you learned to code - but you should not be running alpha code in a live environment.

Hehe...it's not even alpha. This is called a/b testing. You sound like you know just enough to know where software comes from, but not at all how it's made. Also, pretty conceited like you think you can do better. You should. Competition is what drives quality, but you already knew that.

Also, private alpha and closed/open beta testing is done on ... live/in-production systems (not test environments).
