This post is about importing and using your own functions, and about the difference between modules and packages.
The function below converts an Excel file to csv format and can be run in any editor (for me it's VS Code). It is saved in a file named excel_to_csv.py.
import os.path

import pandas as pd


def excel_to_csv(from_excel_file, to_csv_file):
    """Convert an Excel xlsx file to a csv file.

    If the csv file already exists, the rows are appended to it;
    otherwise a new csv file is created.
    """
    try:
        if os.path.isfile(to_csv_file):
            print('write to existing csv file')
            from_excel = pd.read_excel(from_excel_file)
            from_excel.to_csv(to_csv_file, mode='a', index=False, header=False)
        else:
            print('csv file created...')
            from_excel = pd.read_excel(from_excel_file)
            # Remove duplicates. drop_duplicates returns a new DataFrame,
            # so the result must be assigned back to take effect.
            from_excel = from_excel.drop_duplicates(
                ['date', 'header', 'timeline'], keep='last')
            from_excel.to_csv(to_csv_file, index=False, header=True)
    except Exception as ex:
        print('========================================================')
        print('Excel to csv conversion failed...')
        print('Is Excel file and csv file available?')
        print(f'Error: {ex}')  # show the underlying error instead of hiding it
        print('========================================================')


if __name__ == "__main__":
    excel_to_csv(
        r'C:\Users\my_file\Documents\May2021\News\all_news.xlsx',
        r'C:\Users\my_file\Documents\May2021\News\to_sql_news.csv')
You can import your functions as modules and use them in another script. A module is simply a script saved with a .py extension (e.g. update_data.py); the module name is the file name without the extension.
from excel_to_csv import excel_to_csv

'''
This is just a script to demonstrate importing a self-written module.
The excel_to_csv module is in the same folder as this script, so
'from excel_to_csv import excel_to_csv' works directly.
The module name and function name happen to be the same.
'''

# Function call: takes 2 arguments, the Excel file path and the csv file path
excel_to_csv(
    r'C:\Users\my_file\Documents\May2021\News\all_news.xlsx',
    r'C:\Users\my_file\Documents\May2021\News\to_sql_news.csv')
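One thing worth noticing is the `if __name__ == "__main__":` block at the bottom of excel_to_csv.py: it runs only when the file is executed directly, not when it is imported. Here is a minimal sketch of that behaviour. The module `demo_convert` and its `convert` function are hypothetical stand-ins I made up for the demo (so it runs without pandas or any Excel file), and the module is written to a temporary folder just to keep the example self-contained.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for excel_to_csv.py: same shape, no pandas needed.
MODULE_SOURCE = '''
calls = []

def convert(src, dst):
    calls.append((src, dst))
    return f"{src} -> {dst}"

if __name__ == "__main__":
    # Runs only when the file is executed directly, not when imported.
    convert("direct.xlsx", "direct.csv")
'''


def load_demo_module():
    """Write the stand-in module to a temporary folder and import it."""
    folder = Path(tempfile.mkdtemp())
    (folder / "demo_convert.py").write_text(MODULE_SOURCE)
    sys.path.insert(0, str(folder))  # make the folder importable
    importlib.invalidate_caches()
    return importlib.import_module("demo_convert")


demo = load_demo_module()
print(demo.__name__)  # the module's own name, not "__main__"
print(demo.calls)     # still empty: the guarded block did not run on import
print(demo.convert("a.xlsx", "a.csv"))
```

This is why importing excel_to_csv in another script does not trigger the hard-coded conversion at the bottom of the module.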
A package is a folder of modules. Sometimes you need to group your modules into separate folders, either because there are so many of them or to organize them by area of application (modules for data transformation, modules for data extraction, etc.).
For this example I shall put the excel_to_csv.py file into another folder named 'Data_Transform'. Place an empty Python file named __init__.py inside the Data_Transform folder to tell Python that this folder is a package. (Since Python 3.3 a folder without __init__.py can still be imported as a namespace package, but including the file is the explicit, conventional choice.) After that you can import from the module using this convention: from Data_Transform.excel_to_csv import excel_to_csv. Note that the script doing the import must sit next to the Data_Transform folder, not inside it.
from Data_Transform.excel_to_csv import excel_to_csv

'''
This is just a script to demonstrate importing a self-written module in a package.
The module name and function name happen to be the same.
'''

# Function call: takes 2 arguments, the Excel file path and the csv file path
excel_to_csv(
    r'C:\Users\my_folder\Documents\May2021\News\all_news.xlsx',
    r'C:\Users\my_folder\Documents\May2021\News\to_sql_news.csv')
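To see the folder layout in action without touching your own files, here is a small sketch that builds a throwaway package in a temporary folder and imports from it. The names `demo_transform` and `rows_to_csv` are hypothetical and exist only for this demo; the real package in this post is Data_Transform.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build this layout in a temporary folder:
#   <root>/demo_transform/__init__.py      (empty, marks the package)
#   <root>/demo_transform/rows_to_csv.py   (a module inside the package)
root = Path(tempfile.mkdtemp())
package_dir = root / "demo_transform"  # hypothetical package name
package_dir.mkdir()
(package_dir / "__init__.py").write_text("")
(package_dir / "rows_to_csv.py").write_text(
    "def rows_to_csv(rows):\n"
    "    return '\\n'.join(','.join(map(str, row)) for row in rows)\n"
)

# The folder that CONTAINS the package must be on the import path.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

package = importlib.import_module("demo_transform")
module = importlib.import_module("demo_transform.rows_to_csv")
from demo_transform.rows_to_csv import rows_to_csv

print(hasattr(package, "__path__"))  # packages carry a __path__ attribute
print(hasattr(module, "__path__"))   # plain modules do not
print(rows_to_csv([("date", "header"), ("2021-05-01", "news")]))
```

The import line has the same shape as the one used for Data_Transform: package name, dot, module name, then the function.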