I currently have a data frame of ~83000 rows (13 columns) that has data from years 2000-2012 of crimes, each row is a crime and the zip code is reported (so the zip code xxxxx can be found in year 2001, 2003, and 2007 as an example).
Here is an example of my data:
Year Quarter Zip MissingZip BusCode LossCode NumTheftsPQ DUL
2000 1 99502 1 3 5 2 9479
2009 2 99502 2 3 4 3 3220
2000 1 11111 1 3 5 2 3479
2004 2 11111 2 3 4 3 1020
Right now, I am able to assign global variables to all of my zip codes (I am using R studio and my list of data shown is very long and it has significantly slowed the program). Here is how I have assigned global variables to all of my zip codes:
for (n in all.data$Zip) {
x <- subset(all.data, n == all.data$Zip) #subsets the data
u <- x[1,3] #gets the zip code value
assign(paste0("Zip", u), x, envir = .GlobalEnv) #assigns it to a global environment
#need something here, MasterList <<- ?
}
I would like to contain all of these variables in a list. For example, if all my zip code variables were stored in list "MasterList":
MasterList["Zip11111"]
would yield the data frame:
Year Quarter Zip MissingZip BusCode LossCode NumTheftsPQ DUL
2000 1 11111 1 3 5 2 3479
2004 2 11111 2 3 4 3 1020
Is this possible? What would be an alternative/faster/better way to do such? I was hoping that storing these variables in a list would be more efficient.
Bonus points: I know in my for loop I am reassigning variables that already exist to the exact same thing, wasting processing time. Any quick line I could add to speed this up?
Thanks in advance for your help!
all.data[all.data$Zip == 11111,]
? I don't understand what you're trying to achieve with thefor
loop...