
I am working on a desktop app that offers uploads to cloud storage. Storage providers make uploading easy: you get an accessKeyId and a secretAccessKey and you are ready to upload. I am trying to come up with the best way to upload files.

Option 1. Pack each app instance with access keys. This way files can be uploaded directly to the cloud without a middleman. Unfortunately, I cannot execute any logic before uploading to the cloud. For example, if each user has 5 GB of storage available, I cannot verify this constraint at the storage provider. I could send a request to my own server before the upload to verify it, but since the keys are hardcoded in the app, I am sure that check would be easy to bypass.

Option 2. Send each uploaded file to a server, where the constraint logic can be executed, and have the server forward the file to the final cloud storage. This approach suffers from a bottleneck at the server. For example, if 100 users start uploading (or downloading) a 1 GB file and the server has 1000 Mb/s of bandwidth, then each user uploads at only 10 Mb/s = 1.25 MB/s.
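To make the arithmetic above concrete, here is a small sketch. It assumes the link is shared evenly between users and that 1 GB = 1000 MB; the function names are made up for illustration.

```python
def per_user_throughput(link_mbps: float, users: int) -> tuple[float, float]:
    """Return (Mb/s, MB/s) per user, assuming the link is shared evenly."""
    per_user_mbps = link_mbps / users
    return per_user_mbps, per_user_mbps / 8  # 8 bits per byte


def transfer_seconds(file_gb: float, mbyte_per_s: float) -> float:
    """Seconds needed to move file_gb gigabytes (1 GB = 1000 MB) at the given rate."""
    return file_gb * 1000 / mbyte_per_s


mbps, mbyte_s = per_user_throughput(1000, 100)
print(mbps, mbyte_s)                 # 10.0 1.25
print(transfer_seconds(1, mbyte_s))  # 800.0
```

So under these assumptions each 1 GB upload would take roughly 13 minutes while all 100 users are active.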

Option 2 seems to be the way to go, because I get control over who can upload. I am looking for tips to minimise the bandwidth bottleneck. What approach is recommended for handling simultaneous uploads of large files to cloud storage? I am thinking of deploying many low-CPU, low-memory instances and streaming each file instead of buffering the whole file first and sending it afterwards.
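A minimal sketch of the streaming idea, assuming the incoming request body and the outgoing provider upload can both be treated as file-like objects. `relay_upload`, `QuotaExceeded`, and the chunk size are hypothetical names and values, not part of any provider's SDK.

```python
import io
from typing import BinaryIO

CHUNK_SIZE = 64 * 1024  # bytes held in memory per upload (assumed value)


class QuotaExceeded(Exception):
    """Raised when an upload would exceed the user's remaining quota."""


def relay_upload(source: BinaryIO, sink: BinaryIO, remaining_quota: int) -> int:
    """Copy source to sink chunk by chunk; stop once the quota would be exceeded."""
    sent = 0
    while True:
        chunk = source.read(CHUNK_SIZE)
        if not chunk:
            return sent  # end of stream: report how many bytes were forwarded
        if sent + len(chunk) > remaining_quota:
            raise QuotaExceeded(f"upload exceeds remaining quota of {remaining_quota} bytes")
        sink.write(chunk)
        sent += len(chunk)


# In-memory stand-ins for the client connection and the storage provider.
incoming = io.BytesIO(b"x" * 200_000)
outgoing = io.BytesIO()
print(relay_upload(incoming, outgoing, remaining_quota=1_000_000))  # 200000
```

Memory use per upload stays at one chunk regardless of file size, which is what makes small instances viable. Note that when the quota is hit, the chunks already forwarded have reached the sink, so a real relay would also need to abort or delete the partial upload at the provider.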

  • I think you need to provide more information about your application and its purpose. E.g.: Where do you get these storage access keys? From the storage provider? Do you provide them yourself or do the users sign up with the providers first? (Generally, modern apps allow users to set up their own storage accounts and then use those.) Scaling projections: How many users do you need to support for your initial launch? How often will they upload data, and at what size? Commented Apr 30, 2019 at 0:11
  • As an example, an app could give a user the option of Dropbox, Google Drive, or Microsoft OneDrive for cloud storage. Then the user would need to save the credentials for their provider into the app. From then on, the app would be able to save files to the online storage (and presumably also interrogate the cloud provider about the amount of storage space available, if its API allows that.) Commented Apr 30, 2019 at 0:15
  • Storage keys are from the storage providers. It's plug and play. You get the keys, you are ready to go. I provide the keys, because I provide the storage to make things as easy and cheap as possible for the user (Dropbox is super expensive). I really don't know much about the number of users and file sizes. The thing with storage providers like AWS and similar is that they offer storage and that's it. I haven't found a way to incorporate any logic directly at their end.
    – sanjihan
    Commented Apr 30, 2019 at 7:07

2 Answers


Option 1 vs Option 2

Any validation you do on your server is obviously completely pointless, if you then allow the user to upload the file directly to your cloud storage (Option 1).

Going through your server (Option 2) may be a good approach at the beginning, if you don't expect to have large numbers of concurrent users right from the start. But your question was about how to move files to the cloud directly...

Alternative Solution

You don't want to give your users the secretAccessKey - that's why it's called secret. Instead, you'll validate your users and provide them with a temporary, restricted access key to your cloud storage (e.g. AWS STS). The client then uses this key to upload the file.
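The flow described above can be sketched without any provider SDK. The code below is not the AWS STS API; it is a provider-agnostic illustration in which the server checks the quota and then signs a short-lived, narrowly scoped grant, and `verify_upload_grant` stands in for the check the storage provider would perform on real temporary credentials. All names, the secret, the 15-minute lifetime, and the `uploads/<user>/` prefix are assumptions.

```python
import hashlib
import hmac
import json
import time

SERVER_SECRET = b"replace-me"  # hypothetical; never shipped with the desktop app
QUOTA_BYTES = 5 * 1000**3      # the 5 GB per-user limit from the question


def issue_upload_grant(user_id, used_bytes, file_size, now=None, ttl=900):
    """Return a signed, expiring grant if the upload fits the quota, else None."""
    if used_bytes + file_size > QUOTA_BYTES:
        return None
    now = time.time() if now is None else now
    claims = {
        "user": user_id,
        "prefix": f"uploads/{user_id}/",  # grant is limited to the user's own folder
        "max_bytes": file_size,
        "expires": int(now) + ttl,
    }
    payload = json.dumps(claims, sort_keys=True)
    signature = hmac.new(SERVER_SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def verify_upload_grant(grant, now=None):
    """Return the claims if the signature matches and the grant is unexpired, else None."""
    now = time.time() if now is None else now
    expected = hmac.new(SERVER_SECRET, grant["payload"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, grant["signature"]):
        return None
    claims = json.loads(grant["payload"])
    return claims if claims["expires"] > now else None
```

The point of the design is that only small requests pass through your server: the client asks for a grant, your server applies its own rules, and the file itself goes straight to storage. With a real provider you would request its temporary credentials at the step where this sketch signs its own payload.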

It should be possible to set up basic restrictions on file size, etc. with your storage provider. For more complex verification (e.g. only cat pics are allowed), you'll likely have to run the validation after the upload has completed and then remove invalid files.

  • Thanks for the answer. I checked the cloud storage providers for this and indeed they offer temporary application keys (backblaze.com/b2/docs/application_keys.html). They don't offer any of the basic restrictions though. You are charged for downloads, yet they offer no way of knowing which app key is consuming the bandwidth. It is mind-boggling that they left that out. Of course going with Azure or Amazon is pointless, because paying $90 for 1 TB of download is crazy. Are you aware of any cloud storage providers that have pricing similar to Backblaze and offer application keys?
    – sanjihan
    Commented May 4, 2019 at 10:46
  • I don't. But perhaps you could provide read-access via temporary keys as well (depends on your requirements). Then you allow people to upload files, check them, and only after they're checked do you give out read-keys.
    – doubleYou
    Commented May 5, 2019 at 8:02

Keys to the Kingdom

Consider the cloud storage as if it were your house, and the access codes as the keys that open its front-door.

Do you:

  1. Hand the keys over to your customer to go and collect their package from your house?
  2. Hand the keys to a trusted employee who will, on the client's request, retrieve the package from your house?

If you answered option 1. Please stop. Stop all development and programming activities immediately. Go to your local police station and request a quick conversation about home security and appropriate precautions.

If you answered option 2. Congratulations. You understand that while people are generally good, and will tend to do the right thing, they will:

  1. Have a bad day, with accidents. People make mistakes, it happens, and they don't always own up, if they've even noticed that they made a mistake.
  2. Be well meaning and set things straight. Whether you wanted it or not.
  3. Be curious, and take an extra look around. Well that's interesting they ordered that from SomeBusiness did they?
  4. Be outright mean. Wouldn't this all look prettier in my place... oh, and this place needs redecorating...

Also the lovely bonus:

  1. Your trusted employee has every incentive to:
    • report accidents,
    • follow the process,
    • not sticky beak (too much, and keep relatively quiet about what they do see),
    • and not be a destructive individual (they like being paid, and they do not like being prosecuted in a court of law).

So always place your own keys behind an API that you trust, where access can be monitored. Then, when things go wrong, you have reasonable avenues of redress to help you recover from the current problem and/or avoid repeating it in the future.

Their Kingdom Keys

That being said. No one said you had to provide your own house for them to store packages in. How about if you give them a key chain onto which they can add the address and keys for their own warehouse?

  • This obviates the need for you to worry about usage controls, and restrictions. Unless these are features provided to help them manage their own warehouse.

  • It obviates the need for you to maintain lists of who can access the service. They have their key, away they go.

  • It also relieves you of any culpability for lost, stolen, or otherwise misused data. That is between them and their warehouse, assuming you aren't doing anything illegal/immoral. Your only concern is avoiding mistakes.
