Download files from Amazon’s S3

February 1, 2010 | python
author: Karol Zielinski

I had a lot of files in my bucket on S3, so downloading them with Firefox plugins wasn’t a good idea. I had to write a script that would do it for me.

 

OK, so let’s code something…

We need one external library in our environment: boto (a Python interface to Amazon Web Services).

Now create a new file (let’s call it ‘download_files_from_s3.py’) and put this in it:

import getopt
import os
import sys

from boto.s3.connection import S3Connection


class Downloader(object):

    base_dir = None
    bucket_name = None
    aws_access_key_id = 'YOUR_AWS_ACCESS_KEY_ID'
    aws_secret_access_key = 'YOUR_SECRET_ACCESS_KEY'

    def start_downloading(self):

        # options: -b <base_dir>, -u <bucket_name>, -h / --help
        try:
            opts, args = getopt.getopt(sys.argv[1:], "b:u:h", ["help"])
        except getopt.GetoptError as err:
            print("No way! We don't have an option like this one!")
            print(str(err))
            sys.exit(2)

        for o, value in opts:
            if o == "-b":
                self.base_dir = value
                print("I have base_dir")
                print("It's: " + str(self.base_dir))
            elif o == "-u":
                self.bucket_name = value
                print("I have bucket_name")
                print("It's: " + str(self.bucket_name))
            elif o in ("-h", "--help"):
                print("Usage: python download_files_from_s3.py -b base_dir -u bucket_name")
                sys.exit()

        if not self.base_dir:
            print("I need base_dir!")
            sys.exit(2)

        if not self.bucket_name:
            print("I need bucket_name!")
            sys.exit(2)

        # connect with Amazon S3
        s3_connection = S3Connection(aws_access_key_id=self.aws_access_key_id,
                                     aws_secret_access_key=self.aws_secret_access_key)
        bucket = s3_connection.get_bucket(self.bucket_name)

        print('Start fetching keys...')

        # a single listing request returns at most 1000 keys;
        # bucket.list() keeps requesting pages until every key has been seen
        keys = list(bucket.list())

        print('I have all keys')
        print('Start downloading...')
        print('We have ' + str(len(keys)) + ' files to download.')

        inc = 0
        for each in keys:
            # skip "folder" placeholder keys, there is no file to write for them
            if each.name.endswith('/'):
                continue

            # key names may contain slashes, so create local directories first
            path = os.path.join(self.base_dir, each.name.lstrip('/'))
            directory = os.path.dirname(path)
            if directory and not os.path.isdir(directory):
                os.makedirs(directory)

            # stream the file straight to disk instead of holding it in memory
            each.get_contents_to_filename(path)

            inc += 1

            print('File ' + str(inc) + ' is on local disk')

        print('All finished.')


def main():
    downloader = Downloader()
    downloader.start_downloading()


if __name__ == '__main__':
    main()
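
A side note on fetching the keys: S3 hands back a bucket listing in pages of at most 1000 keys, so a large bucket has to be listed page by page, with each request starting after the last key of the previous one (the marker). The sketch below shows only that loop, without touching S3. Both `iter_all_keys` and `make_fake_bucket` are hypothetical names made up for this illustration, and `fetch_page` is an in-memory stand-in for a real listing call such as boto's `bucket.get_all_keys(marker=...)`.

```python
def iter_all_keys(fetch_page):
    """Yield every key name by requesting one page after another.

    fetch_page(marker) is a hypothetical callable: it returns the key
    names that sort after `marker`, at most one page per call.
    """
    marker = ''
    while True:
        page = fetch_page(marker)
        if not page:
            # an empty page means there is nothing left to list
            break
        for name in page:
            yield name
        # the next request starts right after the last key seen so far
        marker = page[-1]


def make_fake_bucket(names, page_size):
    """Build an in-memory stand-in for a bucket listing (illustration only)."""
    ordered = sorted(names)

    def fetch_page(marker):
        remaining = [name for name in ordered if name > marker]
        return remaining[:page_size]

    return fetch_page


fetch_page = make_fake_bucket(['a.jpg', 'b.jpg', 'c.jpg', 'd.jpg', 'e.jpg'],
                              page_size=2)
all_names = list(iter_all_keys(fetch_page))
print(all_names)
```

Because the loop stops on the first empty page, it also copes with an empty bucket, where there is no "last key" to use as a marker.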

OK, the script is ready.

Run it with:

python download_files_from_s3.py -b our_base_dir -u our_bucket_name

where our_base_dir is the path where we want to put the downloaded files and our_bucket_name is the name of our bucket on AWS S3.
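
Key names on S3 can contain slashes (for example ‘photos/2010/cat.jpg’), so it helps to be clear about where such a file should land under our_base_dir. Here is a minimal sketch of one possible mapping. The helper name `local_path_for` and the example key are assumptions made for illustration, not something taken from the script or from boto.

```python
import os


def local_path_for(base_dir, key_name):
    """Map an S3 key name to a path under base_dir (hypothetical helper)."""
    # S3 key names use '/' as a separator regardless of the local OS;
    # dropping empty parts keeps a leading slash from replacing base_dir
    parts = [part for part in key_name.split('/') if part]
    return os.path.join(base_dir, *parts)


print(local_path_for('our_base_dir', 'photos/2010/cat.jpg'))
```

The directory part of the result (`os.path.dirname(...)`) has to exist before the file is written, so a real download loop would create it first with `os.makedirs`.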
