STEEM internals #3: Hot or Not. Net R-Shares vs Ups and Downs. Meet the HOT rank algorithm used for STEEM posts.

in steem •  8 years ago 

In this post I describe the algorithm currently used to decide whether a STEEM post is hot or not. I also explain how it relates to the algorithm used by Reddit. The hottest posts are listed on the 'hot' tab.


How the Reddit HOT value is calculated

The algorithm used to rank Reddit stories is open source, and its description can be found on a few websites. A good reference can be found directly in the STEEMIT source code, and you can read it here.

Reddit's original source code for the hot formula is:

cpdef double _hot(long ups, long downs, double date):
    """The hot formula. Should match the equivalent function in postgres."""
    s = score(ups, downs)
    order = log10(max(abs(s), 1))
    if s > 0:
        sign = 1
    elif s < 0:
        sign = -1
    else:
        sign = 0
    seconds = date - 1134028003
    return round(sign * order + seconds / 45000, 7)

Here is the same thing written as a step-by-step recipe:

  1. Count number of upvotes U.
  2. Count number of downvotes D.
  3. Calculate s = U - D.
  4. Calculate order = log10(max(|s|, 1)). Taking the maximum with 1 keeps the argument of log10 away from 0, where the logarithm is undefined.
  5. Assign sign = 0 if the number of upvotes equals the number of downvotes, sign = 1 if there are more upvotes than downvotes, and sign = -1 otherwise.
  6. Calculate t = the number of seconds from 8 December 2005 to the post creation date. See here why the start date is fixed.
  7. Use a simple formula to get the rank value:

rank = sign * order + t / 45000
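
The recipe can also be sketched in plain Python. This is a minimal illustration, not Reddit's code: the function name `reddit_hot` and the constant name `REDDIT_EPOCH` are made up for this example, and it assumes that `score(ups, downs)` in the original is simply upvotes minus downvotes, as in step 3.

```python
from math import log10

# Offset used in the original code (8 December 2005), in Unix seconds.
REDDIT_EPOCH = 1134028003


def reddit_hot(ups: int, downs: int, created: float) -> float:
    """Sketch of the recipe; `created` is a Unix timestamp in seconds."""
    s = ups - downs                     # step 3 (assumed meaning of score())
    order = log10(max(abs(s), 1))       # step 4
    sign = (s > 0) - (s < 0)            # step 5: gives 1, -1 or 0
    t = created - REDDIT_EPOCH          # step 6
    return round(sign * order + t / 45000, 7)  # step 7, rounded as in the original


if __name__ == "__main__":
    # A post with 10 net upvotes created exactly at the start date.
    print(reddit_hot(10, 0, REDDIT_EPOCH))
```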

Three main characteristics of the algorithm

  • A post's hotness rank does not decay over time. As long as its votes stay the same, your post from today will have the same hotness rank in 2017, 2018 and so on. Newer posts simply start from a higher value, so tomorrow's post with the same number of upvotes and downvotes as today's post will be hotter just because it is newer.
  • Because log10 is applied to the vote difference, each additional point of rank requires ten times more net votes: reaching a net score of 10 counts as much as going from 10 to 100.
  • Controversial posts that get similar numbers of upvotes and downvotes will get a low hotness rank. Hot posts must have many more upvotes than downvotes.
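
To make these characteristics concrete, here is a small sketch that splits the rank into its vote part and its time part. The helper names `reddit_vote_term` and `reddit_time_term` are invented for this illustration, and the vote counts are made-up examples.

```python
from math import log10

SECONDS_PER_POINT = 45000  # divisor from the Reddit formula (12.5 hours)


def reddit_vote_term(net_votes: int) -> float:
    """The sign * order part of the rank for a given upvotes - downvotes."""
    sign = (net_votes > 0) - (net_votes < 0)
    return sign * log10(max(abs(net_votes), 1))


def reddit_time_term(seconds_since_start: float) -> float:
    """The t / 45000 part of the rank."""
    return seconds_since_start / SECONDS_PER_POINT


if __name__ == "__main__":
    # Each tenfold increase in net votes adds about one point of rank...
    print(reddit_vote_term(10), reddit_vote_term(100), reddit_vote_term(1000))
    # ...and so does being posted 45000 seconds (12.5 hours) later.
    print(reddit_time_term(45000))
    # A controversial post (500 up, 495 down) gets a smaller vote term
    # than a small uncontested one (10 up, 0 down).
    print(reddit_vote_term(500 - 495), reddit_vote_term(10 - 0))
```

In other words, with this formula a post that is 12.5 hours newer needs ten times fewer net votes to reach the same rank.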

Why can we say that STEEM posts on STEEMIT are ranked using a similar algorithm?

If you look into the STEEMIT source code, you will notice the similarities, and the comments inside the source refer to the original Reddit solution. See the current implementation below:

      auto s = c.net_rshares.value / 10000000;
      double order = log10( std::max<int64_t>( std::abs(s), 1) );
      int sign = 0;
      if( s > 0 ) sign = 1;
      else if( s < 0 ) sign = -1;
      auto seconds = c.created.sec_since_epoch();
      return sign * order + double(seconds) / 10000.0;

If you can read this code, you will notice that it is very similar to Reddit's recipe. There are two main differences though:

  • The time difference starts at the Unix epoch instead of 8 December 2005
    epoch = datetime(1970, 1, 1). Therefore t is the number of seconds from 1 January 1970 to the post creation time.

  • Instead of upvotes minus downvotes, the net value of R-shares is used.
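
For readers more comfortable with Python than C++, the snippet above can be transcribed roughly as follows. This is a sketch only: the function name `steem_hot` and its parameters are invented here, with `net_rshares` standing for `c.net_rshares.value` and `created` for `c.created.sec_since_epoch()`. Note that the snippet also scales net R-shares down by 10000000 before taking the logarithm and divides the time by 10000.0 rather than 45000.

```python
from math import log10


def steem_hot(net_rshares: int, created: int) -> float:
    """Python transcription of the C++ snippet shown above.

    `created` is the number of seconds since 1 January 1970.
    """
    # C++ integer division truncates toward zero while Python's // floors,
    # so divide the magnitude and work out the sign separately.
    magnitude = abs(net_rshares) // 10_000_000
    if magnitude == 0:
        sign = 0            # s == 0 in the C++ code
    elif net_rshares > 0:
        sign = 1
    else:
        sign = -1
    order = log10(max(magnitude, 1))
    return sign * order + created / 10000.0


if __name__ == "__main__":
    # Hypothetical post with 10**8 net R-shares created at the epoch.
    print(steem_hot(10**8, 0))
```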

OK, but what are R-Shares?

This may sound surprising, but STEEM votes are measured in R-shares. A downvote is not a simple -1 and an upvote is not a simple +1. The R-shares value of a vote is determined by the account's STEEM Power times its remaining voting power. The net R-shares value of a post is the sum of the R-shares of all its votes, with downvotes counting as negative.

And this is the catch that makes a huge difference between STEEM and Reddit. A fresh STEEM post becomes hotter not just by getting more upvotes than downvotes, but by being upvoted by powerful accounts, that is, accounts with large amounts of STEEM Power and remaining voting power.
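
A small sketch can show how much the voter's weight matters. The R-share amounts below are purely hypothetical round numbers chosen for illustration, not values taken from the blockchain, and the helper name `steem_vote_term` is made up for this example. It reproduces only the sign * order part of the STEEM snippet.

```python
from math import log10


def steem_vote_term(net_rshares: int) -> float:
    """The sign * order part of the STEEM formula for a net R-shares total."""
    magnitude = abs(net_rshares) // 10_000_000
    if magnitude == 0:
        return 0.0
    sign = 1 if net_rshares > 0 else -1
    return sign * log10(magnitude)


# Hypothetical R-share amounts per vote (assumptions for illustration only).
SMALL_VOTE = 100_000_000          # a low-powered account
LARGE_VOTE = 1_000_000_000_000    # a high-powered account

many_small = steem_vote_term(100 * SMALL_VOTE)  # 100 small upvotes
one_large = steem_vote_term(1 * LARGE_VOTE)     # a single large upvote

if __name__ == "__main__":
    print(many_small, one_large)
```

Under these assumed numbers, the single large upvote is worth about two points more than the hundred small ones. With the 10000-second divisor from the snippet, two points correspond to being posted 20000 seconds (roughly five and a half hours) later.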

My last post was noticed by a few good people. Thank you for that. I hope you will continue to enjoy my journey into STEEM internals. Next time I would like to dive in further and try to describe the difference between trending and active posts.

DISCLAIMER: THE INFORMATION IS DELIVERED FREE OF CHARGE AND 'AS IS' WITHOUT WARRANTY OF ANY KIND. I HOPE IT IS ACCURATE AND FREE OF ERRORS AND YOU FIND IT USEFUL.


I am really interested in seeing more steem internals. Please keep posting.

Thank you for the feedback. I will try to do my best!

yes good to see and understand the cogs which make this Steem machine work !! upvoted !

Thanks!

Is there no squaring of the Steem power? I have heard that before but it doesn't seem to be in the code above.

Hi @dennygalindo, squaring is used for post reward calculation. I hope that I will be able to describe the process later.