HiveSQL in Python: Which module to choose - pypyodbc, pyodbc, or pymssql?

avatar
(Edited)

hivesql.png

For those who still don't know what HiveSQL is, it is a centralized MSSQL database developed and maintained by Arcange. It is one of the oldest services on Hive and has been around for years. It is always updated and provides access to Hive blockchain data with simple or complex SQL queries.

HiveSQL is great for anybody who wants to get data from Hive blockchain fast. If you plan to make an App, or compile some specific stats, or just want to learn and practice your SQL skills with real world database, give HiveSQL a try. It is free for anybody with a Hive account.

When I first started using HiveSQL, I had absolutely zero knowledge of SQL. Using HiveSQL is how learned SQL.

I prefer to use HiveSQL within python scripts. Because before I was started learning SQL, I was trying to learn python. Knowing a little bit of python gives more options of how to get data from Hive. Using Beem module by Holger80, we can get data from Hive blockchain directly. Beem is great. However, getting data directly from the blockchain in some cases may be slow. Sometime even too slow. Perhaps, best way is to use both Beem and HiveSQL.

When I first started using HiveSQL, I used pypyodbc module. At first it was working good. But later I realized it wasn't delivering all the data I wanted and some data was getting truncated. Then, Crokkon suggested using pyodbc instead and it worked perfectly.

I have used pyodbc module for a while and didn't have any issues until I decided to deploy a simple app to heroku. The app wouldn't build properly due to some problems with pyodbc. I suspect it had to do with having proper driver. Normally on my local computer I would install the microsoft driver. In this case I couldn't do it on Heroku. I tries various solutions offered online. None of them worked.

Before giving up on deploying the app to heroku, I decided to try one more thing. It was to use a different module - pmssql. It uses FreeTDS, which is easy to install on a computer and seems to be available on heroku already.

The best thing is to connect to the an MSSQL database all three of these modules use the similar methods, so it is not difficult to adjust to code from one to another. With pypyodbc and pyodbc we need to pass the driver name as argument, and it is not needed with pymssql. Everything else, was almost same.

Once I changed to pymssql, the app deployed to heroku without any issue. For this reason I would recommend using pymssql. I haven't had any problems using it yet.

I use the following simple python function to connect to HiveSQL and the results back. Afterwards I used the results within python code to do whatever else needed to be come with the data received.

def hive_sql(SQLCommand, limit):
    db = os.environ['HIVESQL'].split()
    conn = pymssql.connect(server=db[0], user=db[1], password=db[2], database=db[3])
    cursor = conn.cursor()
    cursor.execute(SQLCommand)
    result = cursor.fetchmany(limit)
    conn.close()
    return result

It takes SQL query commands as an argument and returns the results from HiveSQL. Within the python code we can write our SQL query and assign it to SQLCommand variable in a text format like this.

SQLCommand = '''
SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'DynamicGlobalProperties'
'''

Hive blockchain and HiveSQL are great place to start for those who are interested in learning a little bit of python and SQL.

I would like to continue sharing my experiences using HiveSQL with more task focused code snippets and keep building into something more interesting. Feel free to share you thoughts and experiences with HiveSQL.

Posted Using LeoFinance Beta



0
0
0.000
17 comments
avatar

Great post :)
I used to use pyodbc from beginning but unfortunately it didn't work on heroku so I shifted to pymssql .

Perhaps, best way is to use both Beem and HiveSQL.

This is true. I sometimes use been API , HiveSQL as well as hiveengine API to make it effective.

Posted Using LeoFinance Beta

0
0
0.000
avatar

One of these days I will try hive-engine api too. Thanks for reminding of it.

Posted Using LeoFinance Beta

0
0
0.000
avatar

Sure. If you need any help please let me know. Since you are good at python, it will be easier for you.

Posted Using LeoFinance Beta

0
0
0.000
avatar

This is exactly what I want to learn and like you I have to start from zero. I am following you for more great content.

Posted Using LeoFinance Beta

0
0
0.000
avatar

Maybe worth noting that HiveSQL is free now since it is funded by a DHF proposal.
Thanks for the tip on pymssql, I haven't tried that yet.

0
0
0.000
avatar

I have been mentioning that in other HiveSQL posts. Forgot this time. I will add that now. Thank you.

Posted Using LeoFinance Beta

0
0
0.000
avatar

I thought you knew SQL for a long time but that's cool you learned from Hive too! I only had a teeny bit of experience with it before Hive and it works in a similar way to excel in terms of the logic in macros. For me, it's just knowing what all the different terms are like SELECT and UNION etc.

I can't remember if you shared with me a post that had the different terms used. I have saved the queries I use in the free SQL software I found but I had a lot of help in setting them up.

0
0
0.000
avatar

I started using HiveSQL when it used to be SteemSQL, so it would be about 3-4 years. I probably should have clarified that. I just generally like to refer to Hive as a rebranding of Steem and consider it as the original chain. :)

See if this cheatsheet can be helpful.

sql.png

Posted Using LeoFinance Beta

0
0
0.000
avatar

That's a great cheatsheet! Thank you, I'll be downloading that. I found you can also rename the columns after the SELECT command which is really helpful for uploading your csv files to tax/accountancy calculator websites like Koinly should you need to. They only accept certain tables with certain formats but it's been a lifesaver!

0
0
0.000
avatar

IMHO, your cheatsheet is not the most helpful to share about HiveSQL because it only allows users to issue SELECT statements.
Any command from the last two columns (except joins) will be rejected.

0
0
0.000
avatar

Yes, you are right. It is not necessarily a cheatsheet for HiveSQL, but SQL in general. I should have mentioned that.

It actually is good idea to create a cheatsheet specific to only HiveSQL that can help new users.

0
0
0.000
avatar

Wow great information. I'm new and interested in learning. Hope you will help me. I'm following you and also on Twitter now Thanks @geekgirl

0
0
0.000
avatar

I used to use hivesql and used to use with python. I used pypyodbc

0
0
0.000
avatar

I should have read this earlier. I found pyodbc and managed to install the driver to make it work, but I will bear pymssql in mind for future projects. I will have a look at your other posts around this topic as there is always something to learn.

I am impressed at the speed of HiveSQL queries. I used Beem for other stuff and it takes a while. I may see if SQL is a better option for that.

0
0
0.000
avatar

Yes HiveSQL is a lot faster. I haven't had any issues with pymssql yet. Implementation of both is similar too, so it wouldn't take much time to switch from pyodbc to pymssql if you ever will need to.

0
0
0.000