I want to apply a function to every row of a data frame. With apply, the result is no longer a data frame; it looks more like a list or a matrix. (I don't know enough R to tell from the output I get, only that it isn't a data frame.)

Which is the right function to use to apply a function to every row of a data frame, returning a new data frame?

The function I want to apply to each row:

map_uri <- function(request){
    ret <- request
    # look up the stem that belongs to this query string, if any
    uri_stem <- uri_map[uri_map[,1] == request["cs-uri-query"],2]
    if(length(uri_stem) > 0){
        ret["cs-uri-stem"] <- uri_stem
        ret["cs-uri-query"] <- "-"
    }
    if(request["cs-uri-stem"] == "/index.html"){
        ret["cs-uri-stem"] <- "/"
    }
    ret
}



What I am trying:

cleansed <- apply(requests, 1, map_uri)
cleansed[,c("cs-uri-query", "cs-uri-stem")]

which gives me the error

Error in cleansed[, c("cs-uri-stem", "cs-uri-query")] : subscript out of bounds

(translated from the German message "Indizierung außerhalb der Grenzen")

For some reason, the structure changes in a way that makes the above indexing invalid.


Data to make this a working example:

uri_map.tsv http://pastebin.com/XhUuTMqA

uri_map <- read.table("http://pastebin.com/raw/XhUuTMqA", sep="\t", header=FALSE)

And input data for the transformation function:


requests <- read.table("http://pastebin.com/raw/b7ja4rKn", sep=" ", header=TRUE)

  • apply transposes: apply(matrix(1:4, 2), 1, identity).
    – Roland
    Commented May 2, 2016 at 13:05
  • @Roland thank you very much! Is that just not documented, or did I not read carefully enough?
    – kutschkem
    Commented May 2, 2016 at 13:14
  • It's documented in help("apply"), but somewhat cryptic.
    – Roland
    Commented May 2, 2016 at 13:25
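
The transposition mentioned in the comments can be checked on a small example. The sketch below uses made-up data (the data frame `df` and its columns `a` and `b` are purely illustrative, not the question's data): apply stores each row's result as a column, so the result has to be transposed with t() before converting it back to a data frame.

```r
# apply() over rows stores each row's result as a COLUMN of the output
m <- matrix(1:4, 2)
by_row <- apply(m, 1, identity)   # same values as t(m)

# Illustrative all-character data frame (not the question's data)
df <- data.frame(a = c("x", "y", "z"),
                 b = c("p", "q", "r"),
                 stringsAsFactors = FALSE)

res <- apply(df, 1, function(row) {
  row["b"] <- toupper(row["b"])
  row
})
dim(res)   # 2 rows (a, b) by 3 columns: one column per original row

# Transpose first, then convert back to a data frame
fixed <- as.data.frame(t(res), stringsAsFactors = FALSE)
fixed[, c("a", "b")]
```

Keep in mind that apply converts the data frame to a matrix first, so with mixed column types every value ends up as character.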

You can use the apply family but, you're right, the result is either a matrix or a list. Not a big deal though to get back to a data.frame.

Your function needs to return something consistent across columns, and that's where a reproducible example would help. (The full iris instead of iris[, 1:4] would not work below: iris$Species is a factor with 3 levels, and summary returns a differently shaped result for it than the 6 numbers it returns for a numeric column.) Below, I used iris and summary:

  1. apply: as.data.frame(apply(iris[, 1:4], 2, summary))
  2. sapply: as.data.frame(sapply(iris[, 1:4], summary))
  3. lapply: as.data.frame(do.call(cbind, lapply(iris[, 1:4], summary)))
  • Is there a reason you used apply(*, 2, *) instead of 1? 1 is rows and 2 is columns, right?
    – kutschkem
    Commented May 2, 2016 at 12:58
  • exactly my captain
    Commented May 2, 2016 at 13:06
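
For the row-wise case in the question, one base R option is to pass each row as a one-row data frame and bind the results with rbind, which keeps the column types intact. This is only a sketch under assumptions: the `uri_map` and `requests` objects below are small hypothetical stand-ins for the pastebin data, and `map_uri_row` is an illustrative adaptation of the question's map_uri, not the original function.

```r
# Hypothetical stand-ins for the question's data (not the real pastebin contents)
uri_map <- data.frame(query = c("id=1", "id=2"),
                      stem  = c("/first", "/second"),
                      stringsAsFactors = FALSE)
requests <- data.frame("cs-uri-stem"  = c("/index.html", "/page", "/page"),
                       "cs-uri-query" = c("-", "id=1", "id=9"),
                       status = c(200L, 200L, 404L),
                       check.names = FALSE, stringsAsFactors = FALSE)

# Illustrative variant of map_uri that takes a one-row data frame
map_uri_row <- function(row) {
  original_stem <- row[["cs-uri-stem"]]
  stem <- uri_map[uri_map[, 1] == row[["cs-uri-query"]], 2]
  if (length(stem) > 0) {
    row[["cs-uri-stem"]]  <- stem[1]
    row[["cs-uri-query"]] <- "-"
  }
  if (original_stem == "/index.html") {
    row[["cs-uri-stem"]] <- "/"
  }
  row
}

# Apply to each row, then stack the one-row data frames again
rows <- lapply(seq_len(nrow(requests)),
               function(i) map_uri_row(requests[i, , drop = FALSE]))
cleansed <- do.call(rbind, rows)
cleansed[, c("cs-uri-query", "cs-uri-stem")]
```

Because the rows never pass through a matrix, the integer `status` column stays integer, and the final indexing by column name works. For large data frames this is slow; a vectorized lookup (for example with match) would scale better.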

I have just implemented this function, which applies FUN to each row (passed as a list) and combines the results into a tibble:


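library(magrittr)  # provides the %>% pipe used below

lapply_rows <- function(df, return_tibble = TRUE, FUN, ...) {
  df_rownames <- rownames(df)

  # Turn each row into a named list, apply FUN, then wrap every element
  # whose length is not 1 in list() so it can be stored in a list column
  res <- lapply(purrr::transpose(df), FUN = FUN, ...) %>%
    purrr::map_depth(2, function(x) {
      if (length(x) != 1) {
        list(x)
      } else {
        x
      }
    }) %>%
    dplyr::bind_rows()  # assumed final step: bind the per-row lists into a tibble

  if (!return_tibble) {
    res <- as.data.frame(res)
    rownames(res) <- df_rownames
  }

  res
}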

df is converted to a list of lists by purrr::transpose(df), where each sublist is one row of the original df. FUN must return a named list, which may also contain elements with a length other than one. Such elements are then wrapped in list() (a column of a data.frame-like object can itself be a list). If return_tibble is FALSE, the result is coerced to a data.frame and the original row names are restored.


df <- lapply_rows(mtcars, FUN = function(row_list) {
  row_list$cyl_2 <- row_list$cyl ^ 2
  row_list$colors <- c("red", "green", "blue")
  row_list$sublist <- mtcars[1:5, 1:5]
  row_list
})
head(df)

# A tibble: 6 x 14
    mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb cyl_2 colors    sublist         
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <list>    <list>          
1  21       6   160   110  3.9   2.62  16.5     0     1     4     4    36 <chr [3]> <df[,5] [5 × 5]>
2  21       6   160   110  3.9   2.88  17.0     0     1     4     4    36 <chr [3]> <df[,5] [5 × 5]>
3  22.8     4   108    93  3.85  2.32  18.6     1     1     4     1    16 <chr [3]> <df[,5] [5 × 5]>
4  21.4     6   258   110  3.08  3.22  19.4     1     0     3     1    36 <chr [3]> <df[,5] [5 × 5]>
5  18.7     8   360   175  3.15  3.44  17.0     0     0     3     2    64 <chr [3]> <df[,5] [5 × 5]>
6  18.1     6   225   105  2.76  3.46  20.2     1     0     3     1    36 <chr [3]> <df[,5] [5 × 5]>

Example returning a data.frame:

df2 <- lapply_rows(mtcars, return_tibble = FALSE, FUN = function(row_list) {
  row_list$cyl_2 <- row_list$cyl ^ 2
  row_list$colors <- c("red", "green", "blue")
  row_list$sublist <- mtcars[1:5, 1:5]
  row_list
})
head(df2)

                   mpg cyl disp  hp drat    wt  qsec vs am gear carb cyl_2           colors
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4    36 red, green, blue
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4    36 red, green, blue
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1    16 red, green, blue
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1    36 red, green, blue
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2    64 red, green, blue
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1    36 red, green, blue
Mazda RX4         21.00, 21.00, 22.80, 21.40, 18.70, 6.00, 6.00, 4.00, 6.00, 8.00, 160.00, 160.00, 108.00, 258.00, 360.00, 110.00, 110.00, 93.00, 110.00, 175.00, 3.90, 3.90, 3.85, 3.08, 3.15
Mazda RX4 Wag     21.00, 21.00, 22.80, 21.40, 18.70, 6.00, 6.00, 4.00, 6.00, 8.00, 160.00, 160.00, 108.00, 258.00, 360.00, 110.00, 110.00, 93.00, 110.00, 175.00, 3.90, 3.90, 3.85, 3.08, 3.15
Datsun 710        21.00, 21.00, 22.80, 21.40, 18.70, 6.00, 6.00, 4.00, 6.00, 8.00, 160.00, 160.00, 108.00, 258.00, 360.00, 110.00, 110.00, 93.00, 110.00, 175.00, 3.90, 3.90, 3.85, 3.08, 3.15
Hornet 4 Drive    21.00, 21.00, 22.80, 21.40, 18.70, 6.00, 6.00, 4.00, 6.00, 8.00, 160.00, 160.00, 108.00, 258.00, 360.00, 110.00, 110.00, 93.00, 110.00, 175.00, 3.90, 3.90, 3.85, 3.08, 3.15
Hornet Sportabout 21.00, 21.00, 22.80, 21.40, 18.70, 6.00, 6.00, 4.00, 6.00, 8.00, 160.00, 160.00, 108.00, 258.00, 360.00, 110.00, 110.00, 93.00, 110.00, 175.00, 3.90, 3.90, 3.85, 3.08, 3.15
Valiant           21.00, 21.00, 22.80, 21.40, 18.70, 6.00, 6.00, 4.00, 6.00, 8.00, 160.00, 160.00, 108.00, 258.00, 360.00, 110.00, 110.00, 93.00, 110.00, 175.00, 3.90, 3.90, 3.85, 3.08, 3.15

(You can see that a tibble handles the <list> columns much better.)


With dplyr 1.0+, you can apply a function on every row with rowwise():

library(dplyr)
df <- tibble(x = 1:6, y = 2:7, z = 3:8)
# Compute the mean of x, y, z in each row
df %>% rowwise() %>% mutate(m = mean(c(x, y, z)))

If your function is already vectorized (as + is in this example), you don't need rowwise():

df %>% mutate(s = x + y + z)

If your function returns multiple values, summarize() can unpack those values into separate columns. See the answers at "dplyr summarise() with multiple return values from a single function".
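
As a rough illustration of the multiple-values case, here is a minimal sketch assuming dplyr 1.0 or later. The helper `row_range` and the column names `lo` and `hi` are made up for this example. When the expression inside summarise() is an unnamed one-row tibble, its columns are unpacked into separate output columns.

```r
library(dplyr)

df <- tibble(x = 1:6, y = 2:7, z = 3:8)

# Hypothetical helper returning two values per row as a one-row tibble
row_range <- function(...) {
  v <- c(...)
  tibble(lo = min(v), hi = max(v))
}

# The unnamed tibble returned per row is unpacked into columns lo and hi
res <- df %>%
  rowwise() %>%
  summarise(row_range(x, y, z))
res
```

Note that summarise() keeps only the newly computed columns here; replacing it with mutate() should keep x, y, and z alongside lo and hi.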

