
I want to generate a character vector with 20 elements, each of which is a random string.

So I generate a random string with the following code:

sample(x = c(letters, LETTERS), size = sample.int(100, 1), replace = T) %>% paste0(collapse = "")

and I generate the vector with this:

replicate(n = 20, expr = sample(x = c(letters, LETTERS), size = sample.int(100, 1), replace = T) %>% paste0(collapse = ""))

(I am basically passing the first code chunk to replicate as its expr argument.) The output is the following, which is exactly what I want:

 [1] "DyHpcnruLfKHOsvy"                                                                          
 [2] "lQqwOkKD"                                                                                  
 [3] "XddKDCtOJqZHxAgHqreDwWSBQkDCBdwFclHMFhuCzXXwb"                                             
 [4] "bmuiUUdsnHJsxIEyeLClvbLGBbfEgXFVsScrWxiTcZxPNTTwAJGZDVgJzDKUG"                             
 [5] "yBDZYKxPXXGFwmlPWNMQuUJsfRXsBoQhuVXnYfMNkHFpmAgSRafGBzkKu"                                 
 [6] "LLuNdUoayRwtLqRJKrnxERpMmlntghfkjUqPkxMMubUozsLbPOFESOqtAWKoojOrttVCQlIYkyGRglr"           
 [7] "KuydhJOVZNNDrrMLDeWda"                                                                     
 [8] "ItwNtPGIQDsqCRBoVUCkClgHzCUiYRAiHIQRqpGBpfzRXgmWFArRtmnWhtciPgLlqrVs"                      
 [9] "BqqEjCpUHLzOlsmqiAOchAKysbtUCzce"                                                          
[10] "JJzdyoFqFnZOeLAABK"                                                                        
[11] "bakCawEaOkMspowlFUsAMjAbMxNxguHAHLomiGtenMuENNuPElGwqdqNdVS"                               
[12] "OEtuDejCDVfDwGjKbjWSCsicrRmqGGpWyqMfaGGPNkJhJMbgUtkjbcwitLqVojCERLxTWaCNFRltxgiwdJAbUtoksW"
[13] "crVVVzIyWbAlfyFndgipAZZJLcMqtEtZtBpbisbyAUWsKTJLiwyNvyVPPuoxOkafEeLARYDEOqEoh"             
[14] "QgAZkacEMBbGebUCToXFTLSqqlYhqpbdsPYvIrwJhfpgDcPiJlfiATEEDrYahyXgxLEVXvsbQ"                 
[15] "jHSYxhskNMxYnnbGQLQgTJKsuRXEeDpiPlonDABrXxivwepNNvZGrugSfHoMi"                             
[16] "CdCDpUjlUyLwiujvcLcxpNZjtxUMTMVYxnjCEQqbJQOXZJeTLHXQRbHaIsOIDmKeyNainhphvwEAHscCAhOjUsqQe" 
[17] "XvoelRDEYrxMfffBjRzmFPrLRjayCLRFVpWxzjcIxkRZQiPutModt"                                     
[18] "FNjvlFdyrRTVDWvnXVWjckCDFUkxnbUfkqYDNIPZVMOfjUejEKiuhhTXdi"                                
[19] "qsQDQtaVyoNVHtNNltPqLEuNGDxiscsOsXZfhaUNdBCoSwcouhhpwFhfCcqYPPFXrnjKqnlEknuKsaWVizaIacMiT" 
[20] "ykGqCILONPhzEABAuNjtEjzXxeFLnybwZdVEbDdzDQoDKmIiLvZNhoEEEvYJS" 

But here is the problem. When I try to pipe the first code chunk into replicate using magrittr's pipe (%>%), I get 20 identical strings!

This is how I do it:

sample(x = c(letters, LETTERS), size = sample.int(100, 1), replace = T) %>% paste0(collapse = "") %>% replicate(n = 20, expr = ., simplify = "array")

I also tried wrapping the first chunk in parentheses:

(sample(x = c(letters, LETTERS), size = sample.int(100, 1), replace = T) %>% paste0(collapse = "")) %>% replicate(n = 20, expr = ., simplify = "array")

But, in both cases, this is the undesirable output:


 [1] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
 [2] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
 [3] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
 [4] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
 [5] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
 [6] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
 [7] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
 [8] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
 [9] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
[10] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
[11] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
[12] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
[13] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
[14] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
[15] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
[16] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
[17] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
[18] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
[19] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"
[20] "yNLJofKIlodQuFVzseeYOmjcxJRxFzgSXWCFoBhCFyJplcgVrrIEhoSSDJXHWhlkWkDHdtKexZFEvIHFokUNg"

What do you think the problem is here?

  • Oh wow! Base R's new pipe, |>, behaves differently: sample(x = c(letters, LETTERS), size = sample.int(100, 1), replace = T) |> replicate(n = 20, expr = _, simplify = "array"). At first, I agree with your results but base R's pipe now confuses me!
    – Parfait
    Commented Jun 30 at 3:53
  • @lotus, why do the pipes differ? I would have expected base R's pipe to do similarly. Very interesting!
    – Parfait
    Commented Jun 30 at 3:53
  • A good read: What are the differences between R's native pipe |> and the magrittr pipe %>%?. My guess is %>% evaluates first operation before second while |> is really a re-write of nested function calls.
    – Parfait
    Commented Jun 30 at 4:02
  • That's interesting! I'm so used to %>% that I feel betrayed right now :) Commented Jun 30 at 4:03
  • The replicate() function uses non-standard evaluation (NSE) on its expr argument. The magrittr pipe assumes standard evaluation, so it evaluates the first expression, and passes the result as expr. The base pipe just rearranges the expression, and that happens to give the desired result in this case, since expr is passed as an expression, not as the value of that expression. But the best advice is to avoid pipes when using NSE. Commented Jun 30 at 13:12

1 Answer

The two calls differ in how many times the sampling code is evaluated. Essentially, magrittr's %>% evaluates its left-hand side and passes the resulting value into the next call.

In the first call, there is a single operation: replicate receives the expression itself (the random sample followed by paste0) and evaluates it 20 times. In other words, replicate directly runs sample 20 times.

replicate(
  n = 20, 
  expr = sample(
    x = c(letters, LETTERS), size = sample.int(100, 1), replace = T
  ) %>% paste0(collapse = "")
)
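To make the "evaluates it 20 times" point concrete, here is a minimal sketch of a function that captures its argument as unevaluated code and re-runs it. The name repeat_expr is hypothetical and this is a simplified stand-in, not replicate's actual source; it only illustrates the general idea of non-standard evaluation.

```r
# Hypothetical helper (NOT replicate()'s real implementation): capture the
# unevaluated expression and evaluate it afresh n times in the caller's
# environment.
repeat_expr <- function(n, expr) {
  captured <- substitute(expr)  # the code itself, not its value
  caller <- parent.frame()      # where the expression should be evaluated
  lapply(seq_len(n), function(i) eval(captured, caller))
}

set.seed(42)
draws <- unlist(repeat_expr(5, runif(1)))
# Each element comes from a separate call to runif(), so the draws differ.
length(unique(draws))
```

Because the expression is re-evaluated on every iteration, each iteration makes its own random draw.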

In the second call, there are two operations: the random sample is run once, and its result is then passed into replicate, which simply repeats (i.e., replicates) that already-computed value 20 times. Specifically, replicate never directly runs sample.

sample(
  x = c(letters, LETTERS), 
  size = sample.int(100, 1), 
  replace = T
) %>% replicate(n = 20, expr = ., simplify = "array")
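The effect of the piped version can be imitated in base R without any pipe. The sketch below is not how magrittr is implemented internally; it only mirrors the observable result, assuming the left-hand side is evaluated once before replicate sees it. The variable names value and out are made up for illustration.

```r
set.seed(1)

# Step 1: the left-hand side is evaluated once, yielding a single string.
value <- paste0(
  sample(x = c(letters, LETTERS), size = sample.int(100, 1), replace = TRUE),
  collapse = ""
)

# Step 2: replicate() only sees the name `value`, so each of its 20
# evaluations returns the same stored string; sample() is never re-run.
out <- replicate(n = 20, expr = value)

length(unique(out))  # a single distinct string, repeated 20 times
```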

Interestingly, this behavior differs from that of base R's native pipe |> (introduced in 4.1.0) used with its RHS placeholder _ (added in 4.2.0):

sample(
  x = c(letters, LETTERS), 
  size = sample.int(100, 1), 
  replace = T
) |> replicate(n = 20, expr = _, simplify = "array")

Here the native pipe is just a handy rewrite into nested calls, which resembles your first call, as deparse(substitute(...)) (shown by @Dirk) reveals:

deparse(
  substitute(
    sample(x = c(letters, LETTERS), size = sample.int(100, 1), replace = T) |> 
      replicate(n = 20, expr = _, simplify = "array")
  )
)
[1] "replicate(n = 20, expr = sample(x = c(letters, LETTERS), size = sample.int(100, "
[2] "    1), replace = T), simplify = \"array\")"  
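If the goal is simply to get 20 different strings regardless of which pipe is in use, one option is to wrap the generator in a function and let replicate call that function. This is a sketch rather than the only fix; random_string and its max_len argument are hypothetical names, with max_len = 100 mirroring the sample.int(100, 1) in the question.

```r
# Hypothetical helper: builds one random string of 1 to max_len letters.
random_string <- function(max_len = 100) {
  chars <- sample(
    x = c(letters, LETTERS),
    size = sample.int(max_len, 1),
    replace = TRUE
  )
  paste0(chars, collapse = "")
}

set.seed(123)
# The call random_string() is what replicate() re-evaluates, so every
# element comes from a fresh round of sampling.
strings <- replicate(n = 20, expr = random_string())
```

Since the expression handed to replicate is a function call rather than a precomputed value, the sampling happens again on each of the 20 iterations.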