HiveSQL with Python: Tables and Columns

avatar

hivesql.png

HiveSQL is a free service that provides us with ability to retrieve Hive blockchain data in a fast and easy manner. I prefer using HiveSQL within my python scripts. Feel free to read my previous post on the topic - Using HiveSQL with Python on Mac to see how to get started with HiveSQL using python.

When using SQL commands within python code we don't have all of the information about table names and the column names within the database right away. In a gui programs we can find them under the schema information. It would make sense to first create a reference file to see what tables and columns are available in the database.

In this post I would like to share how to get that information with a python code and store it as a csv file for future reference.

There is a ver simply SQL command to get the list of all tables and columns:

SELECT TABLE_NAME, COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS

That's it. It will return us all the table and column names. Taking this data I use few lines of python code to properly format them, so that I can save it as csv file. Once that's done, using csv module we save the file to the desired location.

End result of the script is a file, hivesql_tables_columns.csv saved on the desktop or any other preferred location. Now, every time we use HiveSQL, we can use this reference file to see what tables we have and columns within those tables.

Following is the python code and it should be self explanatory. Feel free to ask if you have questions, or share ideas.

import pyodbc
import csv
import os

connection = pyodbc.connect('Driver={ODBC Driver 17 for SQL Server};'
                            'Server=vip.hivesql.io;'
                            'Database=DBHive;'
                            'uid=Hive-geekgirl;'
                            'pwd=XXXXXXXXXXXXXXXX')

cursor = connection.cursor()

SQLCommand = '''
SELECT TABLE_NAME, COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
'''
result = cursor.execute(SQLCommand)
result = result.fetchmany(2000)
connection.close()

def convert_to_sheet(result):
    sheet = []
    row = []
    for table, column in result:
        if table in row:
            row.append(column)
        else:
            if row != []:
                sheet.append(row)
                row = []
            row.append(table)
            row.append(column)
    sheet.append(row)

    max_len = 0
    for row in sheet:
        if len(row) > max_len:
            max_len = len(row)
    for row in sheet:
        for x in range(len(row),max_len):
            row.append('-')

    sheet2 = []
    col = []
    for count in range(max_len):
        for row in sheet:
            col.append(row[count])
        sheet2.append(col)
        col = []
    return sheet2

def save_as_csv(sheet):
    filepath = os.getcwd() + '/Desktop/' + 'hivesql_tables_columns.csv'
    with open(filepath, 'w') as f:
        write = csv.writer(f)
        write.writerows(sheet)

sheet = convert_to_sheet(result)
save_as_csv(sheet)

Posted Using LeoFinance Beta



0
0
0.000
12 comments
avatar

This is cool :)

I usually used this to get the table names -

cursor = conn.cursor()
for row in cursor.tables():
print(row.table_name)

Then getting the columns was easy for whichever table I needed.

0
0
0.000
avatar

You will get a better view of how tables are organized by pre-sorting data 😉

SELECT
    table_name, 
    column_name
FROM 
    INFORMATION_SCHEMA.COLUMNS
ORDER BY
    table_name, 
    ordinal_position

Another problem is programmatically limiting the number of rows you retrieve without including it in a loop could invalidate your script if the total number of rows returned becomes greater than your limit.

0
0
0.000
avatar

Hive SQL has been very handy to prepare my stats and reports. I have some common scripts written in Javascript and Have been using it for a long time. Cheers! !wine

Posted Using LeoFinance Beta

0
0
0.000