Background:
I have been struggling to install RStudio Server on AWS EC2 for some time (since November). It has been like peeling back the layers of an onion: there is always another hurdle. For a Windows user, AWS's own documentation is terrible. I have managed to create RStudio instances, but I keep running into problems that I am sure have easy answers. I have had no luck working them out, although I have tried many things. All I am trying to do is upload a file to RStudio, read it, and save it somewhere. Starting with reading in a file, I have had the following problems.
Reading in a file:
The upload button will not accept large files. Small files (e.g. 40 KB) work fine, but anything large enough to need a server for analysis fails. I tried four different browsers to rule out a browser issue, and the result was the same in all of them.
Putting files in Dropbox and then syncing with RStudio did not work. The sync was fine for small files (e.g. 200 KB), but GB-sized files either did not appear or were corrupted.
I tried to connect via both WinSCP and FileZilla through PuTTY. This worked after running the commands
sudo chown -R ubuntu /home/rstudio
sudo chmod -R 755 /home/rstudio
and I could upload to RStudio. Unfortunately, after that point I could no longer access the instances. I could reach them via AWS, but could not get past the RStudio login screen. I tried this many times, with restarts and on many different instances. I also hired a freelancer to help me, and he ran some other commands (which I have kept a copy of) to access RStudio in the same way. Files could then be uploaded successfully (although very slowly), but logging in to the instance via the browser no longer worked. Effectively it took out my instances: I could start the instances, just not start RStudio.
I have also tried code in putty such as
rsync -avz myHugeFile.csv [email protected]:
It may be that I did not know how to specify the location of myHugeFile.csv on my computer (I tried a lot of things), but it did not work.
I have managed to upload to a tmp folder on the AWS EC2 root drive and can then use PuTTY to move the files across, but uploading a 10 GB file took 36 hours, which I do not think is normal. When the files arrive they are much smaller than the originals and are corrupted.
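For anyone trying to reproduce the size and corruption problem, one way to confirm it is to compare the byte count and checksum of the file on both ends. This is only a minimal sketch, assuming a POSIX shell on the instance with `wc` and `sha256sum` installed; the function name `file_fingerprint` is hypothetical and not part of any tool mentioned above.

```shell
#!/bin/sh
# Print "<size-in-bytes> <sha256>" for the file given as $1.
# Run it on the instance and compare the output with the same values
# computed on the machine the file was uploaded from.
file_fingerprint() {
    # wc -c counts bytes; tr strips padding that some wc builds add.
    size=$(wc -c < "$1" | tr -d ' ')
    # sha256sum prints "<hash>  <name>"; keep only the hash field.
    hash=$(sha256sum "$1" | cut -d ' ' -f 1)
    printf '%s %s\n' "$size" "$hash"
}

# Example (hypothetical path): file_fingerprint /tmp/myHugeFile.csv
```

On the Windows side, `certutil -hashfile myHugeFile.csv SHA256` should give a hash to compare against. If the size already differs, the transfer was cut short rather than altered in transit.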
I am using the AMI from Louis Aslett, which is the one you get if you search for rstudio under Community AMIs on the Amazon platform when setting up an instance.
It is not an instance size problem, as I have used large instances with 244 GB of RAM, and mostly the 120 GB ones.
If it is possible to upload large files to Dropbox and have them sync with RStudio Server, that would be great (at the moment only small files sync). Alternatively, being able to use the upload button would work, or pretty much any solution would be awesome. I have created an S3 bucket, as it may be easier this way. I suspect that Amazon might limit the ability to upload to EC2 via other routes, but that seems crazy to me.
Please do let me know if you have any thoughts on making any one of these steps work.