Just finished building a drag and drop uploader feature for http://commando.io. I went back and forth between storing files in S3, on disk, or using database-backed storage. Unfortunately CORS support had not been released yet; if it had been, I would have gone with S3.
I went with MongoDB + GridFS. It has worked out well so far. I really like that I can store the metadata right alongside the binary data in MongoDB, and can query and filter directly on the metadata. Also, by using a MongoDB replica set, I get automatic redundancy of files.
I put together a FileAPI based uploader using signed PUTs. I haven't gotten around to releasing it yet, but here is a gist for those who might be interested: https://gist.github.com/3593744
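For readers unfamiliar with signed PUTs, here is a minimal server-side sketch of how such a URL could be produced. It is not the code from the gist above; it assumes the legacy S3 query-string authentication scheme (Signature Version 2, HMAC-SHA1), and the function name, bucket, and credentials are hypothetical placeholders. Newer AWS regions require Signature Version 4, which is signed differently.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def sign_put_url(access_key, secret_key, bucket, key, content_type, expires):
    """Build a pre-signed PUT URL using legacy S3 Signature Version 2.

    `expires` is a Unix timestamp after which S3 rejects the request.
    All argument values used with this function are illustrative.
    """
    resource = "/" + bucket + "/" + quote(key)
    # String to sign: verb, Content-MD5 (left empty), Content-Type,
    # expiry timestamp, then the canonicalized resource path.
    string_to_sign = "\n".join(["PUT", "", content_type, str(expires), resource])
    digest = hmac.new(
        secret_key.encode(), string_to_sign.encode(), hashlib.sha1
    ).digest()
    signature = base64.b64encode(digest).decode()
    return (
        "https://" + bucket + ".s3.amazonaws.com/" + quote(key)
        + "?AWSAccessKeyId=" + quote(access_key, safe="")
        + "&Expires=" + str(expires)
        + "&Signature=" + quote(signature, safe="")
    )
```

The browser then issues a PUT to the returned URL with the same Content-Type header; the secret key never leaves the server.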
With CORS, you can now easily build web applications that use JavaScript and HTML5 to interact with resources in Amazon S3, enabling you to implement HTML5 drag and drop uploads to Amazon S3, show upload progress, or update content
With CORS in place, is it a good idea to upload images (or for that matter HTML) directly to S3 instead of the file system, say if you are building a CMS?
The only issue with images is when you want the image to be processed (resized, for example) before being published, as S3 quite rightly doesn't carry out that sort of function. Using a 2000x2000 image as a thumbnail is not a great idea.
I've been thinking about this as I need to implement it soon. I think I'm going to send to S3, then just send a message to a worker to grab it and resize it in the background. My workers are on different VMs so I need some sort of shared storage. That said, I've also considered stuffing the image into Redis, but that seems gross even though my image volume is low.
The other benefit I see in going straight to S3 is that I can use the S3 URL to display the image to the user in context immediately after upload, even though it hasn't been resized. The worker can just update the image URL when it's done with its resizing.
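The hand-off described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the commenter's implementation: an in-memory queue and dict stand in for the message broker and the database, the function names are hypothetical, and the actual download, resize, and re-upload steps are stubbed out (only the target dimensions are computed).

```python
import queue


def fit_within(width, height, max_side):
    """Scale (width, height) down so the longer side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def handle_upload(records, jobs, image_id, s3_url, width, height):
    """Called once the browser reports a finished upload to S3."""
    # Show the original S3 URL to the user right away.
    records[image_id] = {"url": s3_url, "resized": False}
    # Ask a background worker to produce the resized version.
    jobs.put({"image_id": image_id, "source_url": s3_url,
              "width": width, "height": height})


def run_worker_once(records, jobs, max_side=200):
    """Process one queued job and point the record at the resized image."""
    job = jobs.get_nowait()
    new_size = fit_within(job["width"], job["height"], max_side)
    # Stub: a real worker would download source_url, resize the image,
    # and upload the result. Here the new URL is just a placeholder.
    resized_url = job["source_url"] + ".thumb"
    records[job["image_id"]].update(url=resized_url, resized=True, size=new_size)
```

Because the record always holds a usable URL, the page can render immediately and pick up the resized image whenever the worker finishes.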
I think this would be a good use case for CORS. You'd be able to have users upload large images, accessible through the CMS, without locking up any of your (as an example) Nginx processes during the transfer. Very attractive.
How safe is it against abuse (e.g. MITM, DoS)? Most probably I am missing something, but if credentials are applied in the browser, can a user get hold of them and upload a couple of petabytes?
On an upload-by-upload basis, you create a signed manifest. This dictates what actions can be taken on an S3 bucket, e.g., the key that can be written to, the size of the files that can be uploaded. This can be used to protect against a DDoS attack.
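As a rough sketch of what such a signed manifest might look like: S3's browser-based POST uploads use a JSON "policy document" whose conditions pin the bucket, the key, and the allowed size range. The example below assumes the legacy Signature Version 2 scheme (base64-encoded policy signed with HMAC-SHA1); the function name and all values are hypothetical, and current AWS setups would use Signature Version 4 instead.

```python
import base64
import hashlib
import hmac
import json


def build_signed_policy(secret_key, bucket, key, max_bytes, expiration):
    """Return (policy_b64, signature) for a single browser upload.

    `expiration` is an ISO 8601 UTC timestamp such as
    "2012-09-01T12:00:00.000Z". All values are illustrative.
    """
    policy = {
        "expiration": expiration,
        "conditions": [
            {"bucket": bucket},                      # only this bucket
            ["eq", "$key", key],                     # only this exact key
            ["content-length-range", 0, max_bytes],  # cap the upload size
        ],
    }
    policy_b64 = base64.b64encode(json.dumps(policy).encode()).decode()
    digest = hmac.new(
        secret_key.encode(), policy_b64.encode(), hashlib.sha1
    ).digest()
    signature = base64.b64encode(digest).decode()
    return policy_b64, signature
```

The server hands the encoded policy and signature to the browser, which includes them as form fields in the POST. An upload that violates any condition, for example one larger than `max_bytes`, is rejected by S3, and the secret key itself is never exposed to the client.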
I made a quick demo video of the upload feature showing drag & drop and the files interface. Check it out if you please: http://www.youtube.com/watch?v=ru7YZ2E65YU