Steem for script kiddies: paginated get_account_history calls

in steem •  3 years ago  (edited)

Today, @ety001 published the post "The official API node will decrease the get_account_history upper limit". The post said:


We will decrease the upper limit of get_account_history API from 10000 to 100.

Please add pagination function into your program if you are using this API.

It happens that I've been meaning to learn about implementing pagination of API calls in shell scripts, but I never took the time... until today. After reading account_history_api.get_account_history, I was able to determine that I need to control the two parameters: start and limit.

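Unsurprisingly, start tells it where to start, and limit tells it how many to grab (listed moving forward in time). It's clear from the documentation that you can start by setting start to some value, and then incrementing it page by page to get multiple pages. Because limit counts from 0 (more on that below), each page holds limit + 1 entries, so the step between pages is limit + 1 rather than limit.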

What wasn't immediately obvious to me from the documentation, however, is how to get the pages at the end of the transaction list. I hacked around at the shell prompt, and it turned out to be simple enough. Step 1 is to get the most recent transaction number, and step 2 is to calculate the starting numbers of the other pages by decrementing from there.

So, here's the little toy script that I scraped together to accomplish that (using @social as an example account):

#!/bin/bash
# Page backwards through the most recent entries of an account's history.

STEEM_API="https://api.steemitdev.com"
PG_LIM=1  # limit counts from 0, so 1 returns 2 results per page
STEEM_ACCT="social"
export STEEM_API PG_LIM STEEM_ACCT

### Get most recent transaction id (start = -1 asks for the latest entry)
START=$(curl -s --data '{
  "jsonrpc": "2.0",
  "method": "account_history_api.get_account_history",
  "params": {
    "account": "'"${STEEM_ACCT}"'",
    "start": -1,
    "limit": 0
  },
  "id": 1
}' "${STEEM_API}" | jq -S '.result[][][0]')

echo "Most recent transaction: ${START}"

### List the most recent transactions in pages
for LCV in {2..0}
do
   # Each page holds PG_LIM + 1 entries, so step back that many per page
   FROM=$(( START - LCV * (PG_LIM + 1) ))
   echo "New Page: (back ${LCV} pages)"
   # start and limit are sent as JSON numbers, matching the first call
   curl -s --data '{
     "jsonrpc": "2.0",
     "method": "account_history_api.get_account_history",
     "params": {
       "account": "'"${STEEM_ACCT}"'",
       "start": '"${FROM}"',
       "limit": '"${PG_LIM}"'
     },
     "id": 1
   }' "${STEEM_API}" | jq -S '.result[][][0]'
done



And here's some sample output, with 3 pages and page limit set to "1" (which gives back two results):


Most recent transaction: 47431
New Page: (back 2 pages)
47426
47427
New Page: (back 1 pages)
47428
47429
New Page: (back 0 pages)
47430
47431


Of course, this just prints out the index number of each transaction. Additional machinations with "jq" would be needed in order to extract other information from the transaction history. This is left as an exercise for the reader. ;-)
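One possible starting point for that exercise is sketched below. It is only a sketch: it assumes each history entry has the shape [index, {"timestamp": ..., "op": {"type": ...}}], which is my reading of the appbase response format and should be checked against real output from your node. The function name summarize_history and the sample JSON are made up for illustration.

```shell
#!/bin/bash
# Summarize each history entry as: index <TAB> timestamp <TAB> operation type.
# Reads a get_account_history response on stdin.
# ASSUMPTION: entries look like [index, {"timestamp": ..., "op": {"type": ...}}].
summarize_history() {
  jq -r '.result[][] | [.[0], .[1].timestamp, .[1].op.type] | @tsv'
}

# Hand-written sample response (not real chain data) to show the output shape.
SAMPLE='{"result":{"history":[[7,{"timestamp":"2022-01-01T00:00:00","op":{"type":"vote_operation","value":{}}}]]}}'
echo "${SAMPLE}" | summarize_history
```

In the script above, the same filter could replace the '.result[][][0]' expression in the final jq call.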

One point of note is that both start and limit start counting at 0. So if I tell it to start at transaction 1, it actually starts at the second transaction, and if I tell it to use 100 for the limit, it actually returns 101 results.
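To make the off-by-one concrete, here is a minimal sketch of the arithmetic. It assumes, based on the sample output above, that a request returns the entries from start - limit through start, inclusive. page_bounds is a hypothetical helper name, not part of any Steem tooling.

```shell
#!/bin/bash
# Print the first index, last index, and entry count of one page.
# ASSUMPTION: a request returns indices (start - limit) .. start inclusive,
# as the sample run above suggests.
page_bounds() {
  local start=$1 limit=$2
  echo "$(( start - limit )) ${start} $(( limit + 1 ))"
}

page_bounds 47427 1     # the "back 2 pages" request: prints 47426 47427 2
page_bounds 47431 100   # prints 47331 47431 101
```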

Incidentally, if I'm not misunderstanding something, some of the documentation for "get_account_history" seems to contain erroneous explanations for start and limit (i.e. here).

If I understand correctly, start actually has the following meanings:

| Start value | Meaning |
| --- | --- |
| -1 | The most recent transaction in the account's history |
| 0 | The first (earliest) transaction in the account's history |
| N | The (N+1)th transaction in the account's history (starting at 0 and moving forward in time) |

And - as noted above - limit starts counting at zero, so where the above document says: "1,000 results or 10,000 results", that should be "1,001" or "10,001". (And those will both be invalid after the coming change to 100 err... 101 that prompted this post. ;-)
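With the new cap in mind, the sketch below lists the start/limit pairs needed to walk a whole history from the newest entry back to index 0 without gaps or overlaps. It assumes each request returns the entries from start - limit through start, which is what the sample output above suggests. The function name page_requests is hypothetical, and the last page shrinks its limit so that it never reaches below index 0.

```shell
#!/bin/bash
# Print "start limit" pairs that cover indices 0 .. LATEST exactly once.
# ASSUMPTION: each request returns indices (start - limit) .. start inclusive.
page_requests() {
  local latest=$1 limit=$2
  local start=$latest
  local page_limit
  while [ "${start}" -ge 0 ]; do
    # Shrink the final page so it does not reach below index 0.
    page_limit=$(( start < limit ? start : limit ))
    echo "${start} ${page_limit}"
    # Step back by the number of entries just covered (page_limit + 1).
    start=$(( start - page_limit - 1 ))
  done
}

page_requests 250 100
```

For a history whose newest index is 250, this prints 250 100, then 149 100, then 48 48: three requests covering 101, 101, and 49 entries. Each pair can be fed into the start and limit parameters of the curl call in the script above.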

That's all for now. Happy scripting and thanks for reading!



Reminder


Visit the /promoted page and #burnsteem25 to support the inflation-fighters who are helping to enable decentralized regulation of Steem token supply growth.


Now I have tested the script and the API. As we have already discovered, the pagination is a little irritating. Because limit+1 records are returned instead of limit records, you have to be careful with the pagination... or set limit=(real)limit-1 in the request.

What I find very confusing is that, judging by the loop termination condition, the response should never contain more than limit records.

For rocksdb, maybe it's here? Which (I think) goes here?

The github issue said:

Due to the lack of performance tuning of RocksDB, the current ahnode api performance is seriously insufficient.

I still don't understand why it works the way it does, though. Maybe an off by one error in this loop?

And/Or maybe this should be > instead of >= ?

I'd need to step through it line by line to understand, but I'm not that curious. ;-)


that should be "1,001" or "10,001"

Just briefly my thoughts on this (I'm travelling at the moment):
Limit 100 means that you only get 100 records back. So the last one is record no. 99.
The start at index 0 has the decisive advantage that you can use the start and limit for all loops with a loop variable.
steps = 100 #constant
loops = 0 #increase after each loop
start = steps * loops
limit = steps

Unfortunately I can't test your script at the moment. ☀️


Interesting article. I haven't studied much about APIs, but I do plan to get into it... at least from the JavaScript perspective, since it is the primary language I'm familiar with.

I'm drinking coffee while reading your post, and I don't know if it's because of the coffee, but I'm not sure what you're saying. I understand that people here have advanced since its inception. I could look at other people's comments, but I prefer to be honest and wish you a happy weekend :)

Quick question: get_discussions_by_created is limited to 100 entries, unfortunately there is no start parameter for this function.

{"jsonrpc":"2.0", "method":"condenser_api.get_discussions_by_created", "params":[{"tag":"deutsch","limit":100, "truncate_body": 1}], "id":1}

Do you have any idea how I can get the next 100 entries?

Maybe there is another API function to read posts by_created. (I only use curl calls, no steem.js).

EDIT: Really hard, got the solution in this post which was written six years ago. Do you know the reason why these two parameters are not specified in the API documentation?
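For anyone else building these calls with curl, here is a minimal sketch of how the request body could be assembled. The two parameters in question are start_author and start_permlink, as named in the reply below. The helper name discussions_request is hypothetical, and the author and permlink in the second call are placeholders.

```shell
#!/bin/bash
# Build a get_discussions_by_created request body (hypothetical helper).
# Pass the author and permlink of the last post from the previous page to
# continue from there; leave them out for the first page.
discussions_request() {
  local tag=$1 limit=$2 start_author=$3 start_permlink=$4
  local cursor=""
  if [ -n "${start_author}" ] && [ -n "${start_permlink}" ]; then
    cursor=',"start_author":"'"${start_author}"'","start_permlink":"'"${start_permlink}"'"'
  fi
  echo '{"jsonrpc":"2.0", "method":"condenser_api.get_discussions_by_created", "params":[{"tag":"'"${tag}"'","limit":'"${limit}"',"truncate_body":1'"${cursor}"'}], "id":1}'
}

discussions_request deutsch 100
discussions_request deutsch 100 someauthor some-permlink   # placeholder cursor values
```

The output can be passed to curl with --data "$(discussions_request deutsch 100)". As far as I understand it, the post named by the cursor may come back again as the first entry of the next page, so it is worth checking for that duplicate.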

Wow. start_permlink and start_author - I had no idea about that, so thank you. I wonder if there is also a secret trick for account_notifications... Looks like, maybe, last_id.

curl -s --data '{"jsonrpc":"2.0", "method":"bridge.account_notifications", "params":{"account":"remlaps-lite","limit":1,"last_id":50126763}, "id":0}' https://api.steemit.com | jq -S .
{
  "id": 0,
  "jsonrpc": "2.0",
  "result": [
    {
      "date": "2022-08-05T01:01:06",
      "id": 50126762,
      "msg": "@penny4thoughts voted on your post ($0.10)",
      "score": 25,
      "type": "vote",
      "url": "@remlaps-lite/creativity-challenge-6-uber-s-canine-division"
    }
  ]
}
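If last_id really works as a cursor, paging could look something like the sketch below. This is a guess based on the single response above, where last_id 50126763 returned id 50126762: it assumes last_id is exclusive and that results come back newest first. Both helper names are hypothetical.

```shell
#!/bin/bash
# Fetch one page of notifications older than LAST_ID (hypothetical helper).
# ASSUMPTION: last_id is exclusive and results are ordered newest first.
fetch_notifications() {
  local account=$1 limit=$2 last_id=$3
  curl -s --data '{"jsonrpc":"2.0", "method":"bridge.account_notifications", "params":{"account":"'"${account}"'","limit":'"${limit}"',"last_id":'"${last_id}"'}, "id":0}' https://api.steemit.com
}

# Print the smallest id in a response read from stdin.
# That id would become last_id for the next, older page.
next_last_id() {
  jq '[.result[].id] | min'
}

# Offline check against a trimmed copy of the response shown above.
echo '{"id":0,"jsonrpc":"2.0","result":[{"id":50126762,"type":"vote"}]}' | next_last_id
```

A loop would call fetch_notifications, pipe the response through next_last_id, and stop once the result array comes back empty (in which case jq prints null).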

@cmp2020 and I were just looking for something like that yesterday, but until I read your comment and the linked post, it hadn't occurred to me to check in github.
