Using HiveSQL with Python on Mac


This is a short tutorial on getting started with HiveSQL in Python scripts on a Mac. HiveSQL is an awesome SQL database service created and maintained by @arcange. It makes getting data from the Hive blockchain fast and easy. Now that HiveSQL is funded by the Decentralized Hive Fund, the service is free.

To use HiveSQL you will need login credentials for the database. To get access, transfer 0.01 HBD to the @hivesql account. No memo is needed. @hivesql will automatically send back a transfer with an encrypted memo containing the information needed to access the database. Since the memo is encrypted, you will need to open the wallet with your memo key to read it. The memo will contain the server URL, the database name, a login user name, and a password.

Now we can install the necessary drivers and dependencies on our Mac. Follow these steps.

  1. Open the terminal and run the following command to install Homebrew:
    /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
  2. Next run the following command:
    brew tap microsoft/mssql-release https://github.com/Microsoft/homebrew-mssql-release
  3. Next run:
    brew update
  4. And run the following to install the driver:
    brew install msodbcsql17 mssql-tools
  5. Lastly, we need to install the pyodbc module with pip. You may need to use the pip, pip3, or pip3.8 command depending on the version of Python you are using. Run this:
    pip3 install pyodbc

Now we are ready to use HiveSQL in our Python scripts. The following sample script tests whether we can properly connect to HiveSQL and get data back.

import pyodbc
import pprint

# Credentials below come from the encrypted memo sent by @hivesql
connection = pyodbc.connect('Driver={ODBC Driver 17 for SQL Server};'
                            'Server=vip.hivesql.io;'
                            'Database=DBHive;'
                            'uid=Hive-geekgirl;'
                            'pwd=XXXXXXXXXXXXX')

cursor = connection.cursor()

SQLCommand = '''
SELECT delegator, delegatee, vesting_shares, timestamp
FROM TxDelegateVestingShares
WHERE delegator = 'geekgirl'
ORDER BY timestamp DESC
'''

# Run the query and fetch up to 100 rows
cursor.execute(SQLCommand)
result = cursor.fetchmany(100)
pprint.pprint(result)
connection.close()

We use ODBC Driver 17 for SQL Server as the driver. The server, database, uid, and pwd values are the ones sent by @hivesql. The SQLCommand variable contains our SQL query. Run the script to confirm that it connects to HiveSQL and returns the correct data. Once everything works properly, you can replace the contents of SQLCommand and run any SQL queries you need.
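pyodbc follows the standard Python DB-API, so the connect/cursor/fetch pattern above can be wrapped in a small reusable helper. The sketch below is mine, not part of pyodbc; it demonstrates the idea with the standard library's sqlite3 module (also DB-API compliant) so it runs without credentials. With HiveSQL you would pass the pyodbc connection instead, and both drivers accept `?` placeholders for parameters.

```python
import sqlite3

def run_query(connection, sql, params=(), limit=100):
    """Execute a query on any DB-API connection and return up to `limit` rows."""
    cursor = connection.cursor()
    cursor.execute(sql, params)  # `?` placeholders keep values out of the SQL string
    return cursor.fetchmany(limit)

# Demo with an in-memory SQLite database standing in for HiveSQL
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE delegations (delegator TEXT, delegatee TEXT)')
conn.execute("INSERT INTO delegations VALUES ('geekgirl', 'someaccount')")

rows = run_query(conn, 'SELECT * FROM delegations WHERE delegator = ?', ('geekgirl',))
print(rows)  # [('geekgirl', 'someaccount')]
conn.close()
```

Passing values as parameters rather than pasting them into the query string also avoids quoting problems when account names come from user input.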

Posted Using LeoFinance Beta




@geekgirl, I am not a Coder but language of "Coding Universe" feels like mystic one. Stay blessed.


For Hive this is the best, but unfortunately for second-layer tokens I don't think HiveSQL is the solution, since we can't depend on custom JSON. I used to use HiveSQL; now I am shifting to the hive-engine API.



Interesting. I haven’t explored all tables and columns yet. I would think all data should be available there. I will keep that in mind. Thanks.


Well yeah, the last table is the TxCustoms table, where you can see the 2nd layer custom JSON data.

There are a lot of duplicates over there.
For example:

I tried to retrieve the total LEO staked. The actual stake was around 7k (for around 10 days), but I got 8.3k. When I checked manually, a lot of the stake JSONs appeared twice.

All the Hive-related data is accurate though. Only the custom JSON part has irregularities.



Thank you for this information. I will keep that in mind when I explore the data in that table. I think those issues can be resolved though.

  1. If there is an error in how the custom_json data is stored on HiveSQL, we should probably let @arcange know.
  2. If the problem is due to how layer 2 data is stored on the blockchain, I would assume it could be sorted out with Python.

It is interesting. I will definitely keep that in mind when I get to learning more about custom_json.
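For the second case, a hedged sketch of what "sorted out with Python" might look like: if the same custom_json payload from the same account appears more than once, keep only the first occurrence. The record layout below is made up for illustration; real TxCustoms rows have more columns.

```python
import json

# Hypothetical rows: (account, json_payload) pairs, with one duplicate broadcast
rows = [
    ('alice', '{"symbol": "LEO", "quantity": "10"}'),
    ('alice', '{"symbol": "LEO", "quantity": "10"}'),  # duplicate
    ('bob',   '{"symbol": "LEO", "quantity": "5"}'),
]

seen = set()
unique_rows = []
for account, payload in rows:
    # Normalize the JSON so formatting differences don't hide duplicates
    key = (account, json.dumps(json.loads(payload), sort_keys=True))
    if key not in seen:
        seen.add(key)
        unique_rows.append((account, payload))

print(len(unique_rows))  # 2
```

Note this would also collapse legitimate repeats (the same account staking the same amount twice on purpose), so it is only a rough fix.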


Yeah, the 2nd point is actually what I used for a while, but when the tx's are huge it became increasingly difficult for me to sort out the irregularities.

The problem isn't with HiveSQL. Check this out:

image.png

I went to that particular user's history on hiveblocks and checked manually, and I saw that there were actually two custom_json operations for the stake. One was rejected and one was accepted, but that isn't shown in the HiveSQL table; the HiveSQL TxCustoms table shows all custom_json tx's irrespective of whether they are valid or not. The 'logs' in the Hive explorer show that one was accepted and the other rejected.



How are ‘rejected’ and ‘accepted’ stored on the blockchain? As separate custom_json?


That's what troubled me too. I asked the same on Discord a couple of weeks ago and got this answer:

There is a bug in Keychain where some transactions are broadcasted twice. I guess this is something related to that. Even though two custom jsons were processed at the same time, Hive Engine would have accepted only one.

Then I headed down to Hive Engine explorer and saw this

image.png
First image (accepted stake)

image.png
Second image (rejected stake)

image.png

So the payload for both custom_json operations was the same, but the 'logs' were different. So if you want reliable 2nd layer token JSON data, in my opinion the hive-engine API is where to look.

HiveSQL stores the 'payload' in its table, but what we need here is actually the 'logs'. So right now I am extracting all the data from December 1st and storing it in JSON format (although I prefer CSV, it would become difficult for me to clean later).

Did any of the above make sense?

!WINE



Yes, everything makes sense. I looked at some examples. You are right, there is no way to see which one is rejected, or accepted in HiveSQL.

The question is whether the 'logs' are actually stored on the blockchain, or in a separate hive-engine database.


Hm, that's the question I am trying to find the answer to as well. I would love to work with HiveSQL itself, because going through the API takes a lot of time.



I don’t know if this will help, but one idea is to collect, from the hive-engine API, all transaction_ids where the logs show ‘error’. Then use this list to exclude those transactions in the HiveSQL query.
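That idea can be sketched in plain Python: gather the failing transaction ids (however you obtain them from the hive-engine API) and build a parameterized NOT IN clause for the HiveSQL query. The ids below are placeholders, and the tx_id/json column names are assumptions about the TxCustoms table, so check them against the real schema.

```python
# Hypothetical list of tx ids whose hive-engine 'logs' contained an error
error_tx_ids = ['abc123', 'def456']

# One ? placeholder per id; the actual values are passed separately to execute()
placeholders = ', '.join('?' for _ in error_tx_ids)
sql = (
    "SELECT tx_id, json "
    "FROM TxCustoms "
    f"WHERE tx_id NOT IN ({placeholders})"
)

print(sql)
# cursor.execute(sql, error_tx_ids) would then skip the rejected transactions
```

For very long id lists a temporary table or batching would be kinder to the server than one huge NOT IN clause.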


For sure that can be done, but as far as I have explored so far, you can get the logs data only by going through all the blocks one by one.

Using a for loop, right now it's fetching 1000 blocks in 4 minutes, so yeah, it takes a long time again.


custom_json are non-consensus operations. This means that the blockchain will never reject any (except if authorities are missing).

Therefore, HiveSQL (like the blockchain) includes all broadcasted custom_json, even if "duplicate" or considered as invalid by a 2nd layer app.

This is also why there is no hive-engine "state" table in HiveSQL: the hive-engine code is closed source and HiveSQL doesn't know how to interpret those operations.


Got it :) Thanks for taking your time and clearing it.


Nice one .... what about export cv?


What is export cv?


Oh ... typo, sorry, I mean CSV ... to export the data in csv file :)


One way is to use the csv module. I just added the following lines of code at the bottom of the script above, and the query results were saved as a csv file on my desktop.

import csv
import os

fields = ['delegator', 'delegatee', 'vesting_shares', 'timestamp']
# Build the path from the home directory so it works regardless of where the script runs
filepath = os.path.join(os.path.expanduser('~'), 'Desktop', 'delegation.csv')

with open(filepath, 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(fields)
    writer.writerows(result)
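To sanity-check an export like this, the file can be read back with csv.DictReader. The sketch below writes and re-reads a small made-up sample in a temporary directory, standing in for the real delegation rows and Desktop path.

```python
import csv
import os
import tempfile

fields = ['delegator', 'delegatee', 'vesting_shares', 'timestamp']
# Hypothetical sample row in place of the HiveSQL query result
sample = [('geekgirl', 'someaccount', '1000.000000 VESTS', '2020-12-01')]

with tempfile.TemporaryDirectory() as tmp:
    filepath = os.path.join(tmp, 'delegation.csv')

    # Write the header and the rows, as in the snippet above
    with open(filepath, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(fields)
        writer.writerows(sample)

    # Read it back; DictReader maps each row onto the header names
    with open(filepath, newline='') as f:
        rows = list(csv.DictReader(f))

print(rows[0]['delegator'])  # geekgirl
```

The `newline=''` argument on open() is what the csv module documentation recommends to avoid blank lines on some platforms.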

Great one!
Thanks a lot!


This is awesome! I'm just learning how to code and work with Python. I just the other day figured out how to get PyCharm installed on my computer so I can start playing around writing small bits of code on my laptop, since it's an IDE.

It would be really awesome to use Python to run my SQL queries, I had gotten pretty good with SQL last year for work so I would love to mess around with some of it here!


Python is great. If you already know SQL you will love HiveSQL.


Fkkk you are indeed a geek girl... have you thought about coming to NZ? We are hiring HPC (High-Performance Computing) people, and you checkmark all the boxes until now. We already have a database person, but we need a second geek... I am on the storage and ALL troubleshooting scope, but I can always get some support for other experienced people. Reach me out if you are interested.


I actually know nothing. I just try to learn sometimes. Thanks for the amazing offer. I am not planning to change jobs anytime soon.


That is a good start.

Due to #HomeSchooling, a 9yr old boy (and his beery dad) just started to do Python as a first language to start programming.

#Motivation is high
