Python, Pandas, SQL , Merge and Group by

So you have

You have an expense report, for example, coming from aws CUR (https://aws.amazon.com/premiumsupport/knowledge-center/cost-usage-report/) but you want to group it per customer (not account and by month)

use the following python code it may help you:

from pyathena import connect

import pandas as pd

st='s3://c-res-stg/s1/'

conn = connect(aws_access_key_id=access,aws_secret_access_key=secret, s3_staging_dir=st,region_name='eu-west-1')

sql_q='''select linkedaccountid,d_month,sum(blendedcost) cost from c_cust_db.c_res_fin_d_p1 group by linkedaccountid,d_month'''

df=pd.read_sql_query(sql_q,conn)

cust_df=pd.read_excel('/Users/geva/Downloads/Customer_list.xlsx')

t=df.merge(cust_df,left_on='linkedaccountid',right_on='Account ID').groupby(['Customer Name','d_month']).sum()

Recent Posts