All Questions
Tagged with doparallel data.table
13 questions
1 vote, 0 answers, 78 views
R foreach do parallel %dopar% performance problems (and possibly affecting entire computer)
Note:
I recognize this is a slightly more amorphous/non-replicable problem than is ideal, but I feel it is worthwhile given the other instances we've seen on stackoverflow and potential general ...
0 votes, 2 answers, 142 views
Fast Sampling with Replacement in R without using a loop or Apply
I need a fast and efficient way of sampling with replacement for my Bootstrapping exercise.
I found a similar question here, but the solution doesn't offer enough of a speed-up.
Here is ...
0 votes, 1 answer, 278 views
Issues with seed setting in parallelised foreach loop R
I have a selection of tests I want to run in parallel. When I do this using foreach(), I get the expected output of 20 test-iteration pairs:
## Without seed
require(data.table)
require(foreach)
...
0 votes, 0 answers, 72 views
How to parallelize an explicit for-loop over a data.table (by reference)
Problem statement
I have this function daeqtl_mapping_() that takes in three data tables: snp_pairs, zygosity and ae.
I am iterating over each row of snp_pairs, and updating it with two new columns.
...
1 vote, 1 answer, 734 views
results from foreach loop in R
I have a function that I need to run on 2000 data frames. Each iteration takes a very long time (almost 40 minutes), so I'm using the 'foreach' package in R.
I have generated the data in ...
0 votes, 0 answers, 145 views
R: data.table inside GA optimization throws error
I am trying to run a GA in parallel with the packages GA and data.table.
In short, I am optimizing a high-dimensional (100k variables), non-linear objective function. For now, I am looking into a ...
2 votes, 1 answer, 324 views
What is foreach %dopar% actually doing when applied to a dataframe as in df[i,]
I think I've completely misunderstood how foreach parallel operations work.
In the following example, is foreach running 7 independent threads of foo(DF[i,]) for different values of i which leapfrog ...
0 votes, 1 answer, 138 views
Process optimisation of code within dopar
I am trying to optimize my code to run glms multiple times, and I would like to leverage parallelization, either with foreach or some other more efficient way.
As you can see, the for loop takes ...
0 votes, 1 answer, 377 views
Parallelize set in data.table; works with for loop; but foreach %dopar% doesn't; foreach %do% works
I am trying to parallelize my code below. It works perfectly fine with foreach %do%, but not with %dopar%. Could someone please help?
I did look at a few other posts and tried a few things but ...
2 votes, 1 answer, 138 views
Processing Large Data Sets in R
I have a data set of ~5 million rows of businesses with contact information (ID(int), Email(text), BusinessPhone(text), WorkPhone(text), CellPhone(text)) - over 3 million of these rows contain duplicate ...
6 votes, 0 answers, 1k views
Run several R functions in parallel
I have a dataset with a few numeric columns and over 100 million rows as a data.table object. I would like to do group operations on some of the columns based on other columns. For example, count ...
1 vote, 1 answer, 1k views
Parallelization with data.table
I have the following problem. I have a piece-wise linear function described by (xPoints, yPoints) and want to quickly compute (I have to do it over and over again) the implied y-value for a long list of ...
0 votes, 0 answers, 303 views
data.table operations with %dopar% are very slow
I run a loop over elements of list grouped_data_list using foreach and dopar.
The runtime is terribly slow, while workers are visibly busy.
If I make a vectorized routine with lapply, and without ...