
Background:

I have been battling to install RStudio Server on AWS EC2 for some time (since November). It has been like peeling back the layers of an onion: there is always another hurdle. For a Windows user, AWS's own documentation is terrible. I have managed to create RStudio instances, but I keep running into problems that I am sure have easy answers; I have tried many things and had no luck working them out. I am just trying to upload a file to RStudio, read it, and save it somewhere. Starting with reading in a file, I have had the following problems.

Reading in a file:

The upload button will not accept large files. Small files (e.g. 40 KB) upload with no problem, but anything large enough to need a server for analysis fails via this button. I also tried four different browsers to see if it was a browser issue; it was not.

Putting files in Dropbox and then syncing with RStudio did not work. The sync was fine for small files (e.g. 200 KB), but GB-sized files either did not appear or arrived corrupted.

I tried connecting with both WinSCP and FileZilla, using PuTTY for the shell. This worked after running the commands

sudo chown -R ubuntu /home/rstudio

sudo chmod -R 755 /home/rstudio

and I could upload to RStudio. Unfortunately, after that point I could no longer access the instances. I could reach them via AWS, but could not get past the RStudio login screen. I tried this many times, with restarts and on many different instances. I also hired a freelancer to help me; he ran some other commands (which I have kept a copy of) to access RStudio in the same way. Files could then be uploaded successfully (though very slowly), but logging in to the instance via the browser was no longer possible, so it effectively took out my instances (I could start the instances, just not RStudio).

I have also tried commands in PuTTY such as

rsync -avz myHugeFile.csv [email protected]:

It may be that I did not know how to specify the location of myHugeFile.csv on my computer (I tried a lot of things), but it did not work.

I have managed to upload to a tmp folder on the AWS EC2 root drive and can then use PuTTY to move the files across, but uploading a 10 GB file took 36 hours, which I do not think is normal. When the files arrive they are much smaller than the originals and are corrupted.
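For reference, one way to confirm that an uploaded copy is truncated or corrupted is to compare its size and checksum with the original. This is only a sketch, assuming the server has `sha256sum` (part of GNU coreutils on Ubuntu); the function name `fingerprint` and the file path in the comment are placeholders of my own.

```shell
#!/bin/sh
# Print "<size in bytes> <SHA-256 digest>" for a file.
# `fingerprint` is an illustrative helper name, not a standard command.
fingerprint() {
    size=$(wc -c < "$1" | tr -d ' ')
    digest=$(sha256sum "$1" | cut -d ' ' -f 1)
    printf '%s %s\n' "$size" "$digest"
}

# Example (placeholder path): run on the server once the upload finishes,
# then compare with the same two values computed on the local machine.
# fingerprint /tmp/myHugeFile.csv
```

If either the size or the digest differs from the local file, the copy on the server is not the file that was sent. On Windows, `certutil -hashfile myHugeFile.csv SHA256` should give a digest to compare against.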

I am using the RStudio AMI from Louis Aslett, which you get if you search for rstudio under Community AMIs on the Amazon platform when setting up an instance.

It is not an instance size problem, as I have used large instances with 244 GB of RAM, and mostly the 120 GB ones.

If it is possible to upload large files to Dropbox and have them sync with RStudio Server, that would be great (at the moment only small files sync). Alternatively, being able to use the upload button would work, or pretty much any solution would be awesome. I have created an S3 bucket, as it may be easier this way; I suspect that Amazon might limit the ability to upload to EC2 via other routes, but that seems crazy to me.

Please do let me know if you have any thoughts on making any one of these steps work.

  • Is your EC2 instance running behind an ELB, or are you uploading directly to that instance?
    – Castaglia
    Commented Mar 11, 2016 at 18:57
  • Hi, there is no ELB to the best of my knowledge, but I appreciate the suggestion.
    – Joey
    Commented Jul 2, 2016 at 4:06

1 Answer


OK, so I realised what was going on here. The default home directory on AWS has less than 8-10 GB of space, regardless of the size of your instance. As I was trying to upload to home, there was not enough room. An experienced Linux user would not have fallen into this trap, but hopefully other Windows users new to this who come across the problem will see this. If you upload to a different drive on the instance, the problem is solved. As the Louis Aslett RStudio AMI lives in this 8-10 GB space, you will have to set your working directory outside the home directory. This is not intuitively apparent from the RStudio Server interface. Whilst this is an advanced forum and this is a rookie error, I am hoping no one deletes this question, as I spent months on this and I think someone else will too. If someone has a better way to get around this, please feel free to add it :)
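For anyone wanting to check this on their own instance, here is a minimal sketch (assuming a standard Linux shell with `df` and `awk`) of how to see how much room a drive has before uploading to it. The helper names `free_kb` and `fits` and the path `/mnt/data` are my own illustrations, not anything specific to the AMI.

```shell
#!/bin/sh
# Free space, in KB, on the filesystem that holds the given directory.
# -P forces one output line per filesystem; column 4 is the available space.
free_kb() {
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Succeeds (exit status 0) if the file would fit in the target directory.
fits() {
    need_kb=$(( ($(wc -c < "$1") + 1023) / 1024 ))
    [ "$need_kb" -le "$(free_kb "$2")" ]
}

# Human-readable overview of the filesystems behind home and /tmp.
df -h "${HOME:-/}" /tmp

# Example with a hypothetical mount point for a larger drive:
# fits /tmp/myHugeFile.csv /mnt/data && mv /tmp/myHugeFile.csv /mnt/data/
```

If the `Avail` column for the home directory is smaller than the file, the upload will fail or be cut short there, whatever the instance size. From R, `setwd()` pointed at a directory on the larger drive then keeps the working directory out of home.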
