

CDC in Pandas - Or how do I get the last relevant row?
If you have a CDC (change data capture) case in pandas and you want the latest, most up-to-date row per entity, CDC means that...
bdata3
Nov 22, 2022 · 1 min read
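
The excerpt above is cut off before the post's own solution, so the following is only a minimal sketch of one common way to keep the latest row per entity: sort the change log by its timestamp and drop duplicates on the entity key, keeping the last one. The column names `entity_id`, `updated_at` and `status`, and the helper `latest_row_per_entity`, are assumptions made for illustration and do not come from the post.

```python
import pandas as pd


def latest_row_per_entity(df, key="entity_id", ts="updated_at"):
    """Return the most recent row for each entity in a CDC-style change log.

    `key` and `ts` are hypothetical column names chosen for this sketch.
    """
    return (
        df.sort_values(ts)                          # oldest change first
          .drop_duplicates(subset=key, keep="last")  # keep the newest row per key
          .reset_index(drop=True)
    )


# A tiny made-up change log: entities 1 and 2 each changed twice.
changes = pd.DataFrame({
    "entity_id": [1, 2, 1, 2, 3],
    "updated_at": pd.to_datetime([
        "2022-11-01", "2022-11-02", "2022-11-05", "2022-11-03", "2022-11-04",
    ]),
    "status": ["new", "new", "paid", "shipped", "new"],
})

latest = latest_row_per_entity(changes)
print(latest)
```

Sorting first makes "last" mean "most recent" regardless of the order in which the change rows arrived.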


Jupyter and Pandas tips and tricks
This is an ongoing post with tips and tricks for Jupyter and Pandas. I'm sure there are many more; send them to me and I'll add them...
bdata3
Jan 23, 2022 · 1 min read


Apache Airflow - install, change and basic settings
To install, use docker-compose (for how to install it on EC2, see the previous post): https://airflow.apache.org/docs/apache-airflow/stable/sta...
bdata3
Jan 4, 2022 · 1 min read


Install Mongo using Docker and access it from OUTSIDE
Well, I just spent a couple of hours figuring it out... follow this link, apply some common sense, and you'll get there...
bdata3
Dec 16, 2021 · 1 min read


Easy way to python in aws - use awswrangler
If you write Python code in an AWS environment, you should consider using awswrangler. I wrote a simple example - for finding a dataframe...
bdata3
Aug 3, 2021 · 1 min read


How to know your data
There are many ways to get to know your data before manipulating and processing it: if you have the data in SQL, you can use SQL to...
bdata3
Apr 22, 2021 · 1 min read


Running SQL in Jupyter - Nice one
You can run SQL and then visualise using pandas and plot... and it's sooo easy: !pip install ipython-sql import sqlalchemy %load_ext...
bdata3
Jan 31, 2021 · 1 min read


How to run mysql on docker and connect from remote
So I spent too much time (and hit too many errors along the way); what worked for me in the end is: mkdir mysql, create docker-compose.yml...
bdata3
Nov 15, 2020 · 1 min read


Converting single-quoted strings into JSON and running on a bunch of files
So you have a JSON-like file but with single quotes, and you want to load it into MongoDB or just have valid JSON... import ast,json...
bdata3
Oct 21, 2020 · 1 min read
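
The excerpt only shows that the post imports `ast` and `json`, so what follows is a hedged sketch of how those two modules are typically combined for this task, not the post's actual code. It assumes the file content is a valid Python literal (single-quoted strings, `True`/`False`/`None`). The helper names `single_quoted_to_json` and `convert_files`, and the `*.txt` pattern, are hypothetical.

```python
import ast
import json
from pathlib import Path


def single_quoted_to_json(text):
    """Parse a Python-literal string (single quotes) and re-emit it as valid JSON."""
    obj = ast.literal_eval(text)   # safe parsing of literals only, no code execution
    return json.dumps(obj)


def convert_files(folder, pattern="*.txt"):
    """Write a .json sibling for every matching file in `folder` (hypothetical layout)."""
    for path in Path(folder).glob(pattern):
        path.with_suffix(".json").write_text(single_quoted_to_json(path.read_text()))


raw = "{'name': 'Brian', 'tags': ['a', 'b'], 'active': True, 'parent': None}"
fixed = single_quoted_to_json(raw)
print(fixed)
```

`ast.literal_eval` is used rather than a plain quote replacement so that apostrophes inside values are not corrupted.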


PDF - extract several pages out of a large pdf using python
Sometimes you get a very long PDF and in the end, you want only a couple of pages out of it. Use the following code to grab these pages:...
bdata3
Jul 19, 2020 · 1 min read


Pandas add a column to groupby dataframe
Sometimes you want to add additional columns to a grouped-by dataframe, for example, if you group by a substring of the...
bdata3
Jul 6, 2020 · 1 min read
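
Since the excerpt stops before the example, here is a small sketch under stated assumptions: it groups by the first three characters of a string column and adds a per-group total back onto every row with `transform`. The columns `sku`, `amount`, `prefix` and `prefix_total` are invented for illustration, and the post may well use a different approach.

```python
import pandas as pd

# Made-up data: the first three characters of `sku` act as the group key.
df = pd.DataFrame({
    "sku": ["ABC-1", "ABC-2", "XYZ-1"],
    "amount": [10, 5, 7],
})

prefix = df["sku"].str[:3]          # substring used as the grouping key
df["prefix"] = prefix
# transform returns one value per original row, so it can be assigned as a column
df["prefix_total"] = df.groupby(prefix)["amount"].transform("sum")
print(df)
```

Unlike `agg`, `transform` keeps the original row count, which is what makes the result usable as an extra column.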


Save Pandas data frame to S3
Well, sometimes it just works with S3 as the destination, but the following worked even when the above didn't: from io import...
bdata3
Jul 5, 2020 · 1 min read


Clipboard and Python - Or cut and file
There may be a situation when you have a picture in your clipboard and you want to save it in a file. Or you are working with Pandas and...
bdata3
May 14, 2020 · 1 min read


Python smart-open package
If you have a very large file on S3 or on the web, use smart_open (or even if you have a local gzip file...). It is a Python 3 library for...
bdata3
May 10, 2020 · 1 min read


Python Pandas groupby - filter out/choose group
So you have a dataframe and you want to group by and then take only some of the groups, or only the groups other than certain ones. After group...
bdata3
May 10, 2020 · 1 min read
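
The excerpt ends before the post's own code, so this is only an illustrative sketch of two standard pandas ways to do what it describes: `groupby(...).filter(...)` to keep groups that satisfy a condition, and `isin` to leave out specific groups by name. The `team` and `score` columns are made up.

```python
import pandas as pd

df = pd.DataFrame({
    "team": ["a", "a", "b", "c", "c", "c"],
    "score": [1, 2, 3, 4, 5, 6],
})

# Keep only the groups that satisfy a condition (here: at least two rows).
big_teams = df.groupby("team").filter(lambda g: len(g) >= 2)

# Keep every group except the ones named in a list.
excluded = ["c"]
other_teams = df[~df["team"].isin(excluded)]

print(big_teams)
print(other_teams)
```

Both results are ordinary dataframes with the original rows, so they can be grouped again or processed further.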


Pandas and S3 in 1 liner
You have a gzip (or other) file on S3 and you want it in a dataframe: import pandas as pd import boto3 df = pd.read_csv('s3://bucket/file.c...
bdata3
Apr 23, 2020 · 1 min read


Useful links on AWS env and Docker
In this short post, I've collected useful links for EC2 improvements such as how to configure the local discs, how to add a UI if you need...
bdata3
Mar 22, 2020 · 1 min read


Cut a part of a movie - or Python for Monty Python
Well, you want to send a small snap of a movie - let's say in Whatsapp - for example, part of Life of Brian - because you want to prove a...
bdata3
Mar 17, 2020 · 1 min read


Python, Pandas, SQL, Merge and Group by
So you have an expense report, for example, coming from AWS CUR (https://aws.amazon.com/premiumsupport/knowledge-center/cost-us...
bdata3
Mar 17, 2020 · 1 min read


Access s3 boto3 client with profile - or other service with python
Sometimes you need to access your AWS services with a specific (non-default) profile. The easiest way to do so: import boto3 new_session...
bdata3
Feb 20, 2020 · 1 min read