Python smart-open package

If you have a very large file on S3 or on the web, use smart_open. It is also handy for local files, for example gzip-compressed ones.

It is a Python 3 library for efficient streaming of very large files from/to storages such as S3, GCS, HDFS, WebHDFS, HTTP, HTTPS, SFTP, or local filesystem. It supports transparent, on-the-fly (de-)compression for a variety of different formats.


To print the file header:


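from smart_open import open

# Stream the gzipped CSV from S3; compression is inferred from the .gz extension
with open('s3://bucket/file.csv.gz') as t:
    for n, i in enumerate(t):
        if n == 0:
            print(i)
            break  # stop after the header instead of streaming the whole file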
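The same idea can be wrapped in a small helper that reads only the first lines of a stream and then stops. This is a minimal sketch, not part of smart_open: `read_header`, `gzip_text_opener` and the sample file are hypothetical names made up for the illustration, and the demo uses the standard-library gzip module on a local temporary file so that it runs without S3 credentials.

```python
import gzip
import os
import tempfile
from itertools import islice


def read_header(uri, opener, n_lines=1):
    """Return the first n_lines lines of a text stream, without reading the rest."""
    with opener(uri) as stream:
        # islice stops pulling lines from the stream after n_lines
        return [line.rstrip("\r\n") for line in islice(stream, n_lines)]


def gzip_text_opener(path):
    """Hypothetical stand-in opener for the demo: a local gzip file in text mode."""
    return gzip.open(path, "rt", encoding="utf-8")


# Build a small gzipped CSV in a temporary directory (sample data, made up)
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "sample.csv.gz")
with gzip.open(sample_path, "wt", encoding="utf-8") as f:
    f.write("id,name\n1,alice\n2,bob\n")

header = read_header(sample_path, gzip_text_opener)
print(header)
```

With smart_open installed, passing `smart_open.open` as the `opener` and an `s3://...` URI as the first argument should behave the same way, since it also returns a file-like object that can be iterated line by line.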







©2019 by Big Data. Proudly created with Wix.com
