Download files from Amazon’s S3
February 1, 2010 | python | author: Karol Zielinski | comments: 1 | views: 1803
Tags: amazon, amazon web services, aws, download, python, s3, script
I had lots of files in my S3 bucket, so downloading them with Firefox plugins wasn’t the best idea. I had to write a script that would do it for me.
Ok, so let’s code something…
We need one external library in our environment: boto (a Python interface to Amazon Web Services).
Now create a new file (let’s call it ‘download_files_from_s3.py’) and add the following:
from boto.s3.connection import S3Connection
import getopt
import os
import sys


class Downloader(object):
    base_dir = None
    bucket_name = None
    aws_access_key_id = 'YOUR_AWS_ACCESS_KEY_ID'
    aws_secret_access_key = 'YOUR_SECRET_ACCESS_KEY'

    def start_downloading(self):
        # parse the command line: -b base_dir -u bucket_name
        try:
            opts, args = getopt.getopt(sys.argv[1:], "hb:u:", ["help"])
        except getopt.GetoptError as err:
            print("No way! We don't have an option like this one!")
            print(str(err))
            sys.exit(2)
        for o, b in opts:
            if o == "-b":
                self.base_dir = b
                print("I have base_dir")
                print("It's: " + str(self.base_dir))
            elif o == "-u":
                self.bucket_name = b
                print("I have bucket_name")
                print("It's: " + str(self.bucket_name))
            elif o in ("-h", "--help"):
                print("Usage: python download_files_from_s3.py -b base_dir -u bucket_name")
                sys.exit()
        if not self.base_dir:
            print("I need base_dir!")
            sys.exit(2)
        if not self.bucket_name:
            print("I need bucket_name!")
            sys.exit(2)

        # connect with Amazon S3
        s3_connection = S3Connection(aws_access_key_id=self.aws_access_key_id,
                                     aws_secret_access_key=self.aws_secret_access_key)
        bucket = s3_connection.get_bucket(self.bucket_name)

        # get_all_keys() returns one page of keys per call, so keep asking
        # for the page that follows the last key we already have
        print('Start fetching keys...')
        keys = []
        rs = bucket.get_all_keys()
        inc = 1
        while len(rs) > 0:
            print('Loop nr. ' + str(inc))
            print(str(len(rs)) + ' keys in current loop')
            for each in rs:
                keys.append(each)
            rs = bucket.get_all_keys(marker=keys[-1].name)
            inc += 1
        print('I have all keys')

        print('Start downloading...')
        print('We have ' + str(len(keys)) + ' files to download.')
        inc = 0
        for each in keys:
            if each.name.endswith('/'):
                continue  # "folder" placeholder key, nothing to download
            # key names may contain "/", so create missing local directories
            path = os.path.join(self.base_dir, each.name)
            directory = os.path.dirname(path)
            if directory and not os.path.isdir(directory):
                os.makedirs(directory)
            # write the key's contents straight to the local file (binary mode)
            each.get_contents_to_filename(path)
            inc += 1
            print('File ' + str(inc) + ' is on local disk')
        print('All finished.')


def main():
    downloader = Downloader()
    downloader.start_downloading()


if __name__ == '__main__':
    main()
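The key-listing loop in the script is marker-based pagination: each listing call returns one page of keys, and passing the last key name seen as the marker asks for the page that follows it. Below is a minimal sketch of that pattern with no network access, written for Python 3. The names `list_all`, `make_fake_bucket` and `fetch_page` are hypothetical stand-ins for illustration and are not part of boto.

```python
def list_all(fetch_page):
    """Collect every name by requesting pages until an empty page comes back.

    fetch_page(marker) is a hypothetical callable that returns the page of
    names sorted after `marker` (or the first page when marker is None).
    """
    names = []
    marker = None
    while True:
        page = fetch_page(marker)
        if not page:
            break
        names.extend(page)
        marker = page[-1]  # the next request starts after the last name seen
    return names


def make_fake_bucket(all_names, page_size):
    """Build an in-memory stand-in for a bucket listing (illustration only)."""
    ordered = sorted(all_names)

    def fetch_page(marker):
        remaining = [n for n in ordered if marker is None or n > marker]
        return remaining[:page_size]

    return fetch_page


fetch = make_fake_bucket(["c.jpg", "a.jpg", "e.jpg", "b.jpg", "d.jpg"], page_size=2)
all_names = list_all(fetch)
print(all_names)
```

Starting with no marker and stopping on the first empty page means an empty bucket simply yields an empty list instead of failing on a missing "last key".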
OK, the script is ready.
Run it with:
python download_files_from_s3.py -b our_base_dir -u our_bucket_name
where our_base_dir is the path where we want to put the downloaded files and our_bucket_name is the name of our bucket on AWS S3.
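Key names on S3 can contain slashes, so it may help to build each local path from the base directory and the key name with `os.path.join`, and to create any missing directories before writing. Here is a small sketch of that idea for Python 3. The helper `local_path_for` is hypothetical (it is not part of boto or of the script above), and the check against names that resolve outside the base directory is an extra precaution I am assuming you want.

```python
import os
import tempfile


def local_path_for(base_dir, key_name):
    """Return the local file path for a key name, creating parent directories.

    Hypothetical helper for illustration. It refuses key names that would
    resolve outside base_dir (for example ones containing '..').
    """
    base = os.path.abspath(base_dir)
    path = os.path.abspath(os.path.join(base, key_name))
    if not path.startswith(base + os.sep):
        raise ValueError("key name escapes base_dir: %r" % key_name)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    return path


# Example with a temporary directory standing in for our_base_dir
base_dir = tempfile.mkdtemp()
path = local_path_for(base_dir, "photos/2010/cat.jpg")
with open(path, "wb") as f:  # binary mode, so images are written unchanged
    f.write(b"example bytes")
print(os.path.relpath(path, base_dir))
```

With this approach the base directory works with or without a trailing slash.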
Hello, I'm Karol Zielinski, an internet evangelist, entrepreneur, project manager and web developer from Gdynia, Poland. I like creative design, good advertising, social media and all kinds of stuff around the web.