Filter factor levels in R using dplyr

Question

This is the glimpse() of my dataframe DF:

Observations: 221184
Variables:
$ Epsilon    (fctr) 96002.txt, 96002.txt, 96004.txt, 96004.txt, 96005.txt, 960...
$ Value   (int) 61914, 61887, 61680, 61649, 61776, 61800, 61753, 61725, 616...

I want to filter (remove) all the observations with the first two levels of Epsilon using dplyr.

I mean:

DF %>% filter(Epsilon != "96002.txt" & Epsilon != "96004.txt")

However, I don't want to use the string values (i.e., "96002.txt" and "96004.txt") but the level orders (i.e., 1 and 2), because it should be a general instruction independent of the level values.

Is filter(as.numeric(Epsilon)>2) what you are looking for? – nicola, Commented May 5, 2015 at 11:46
@nicola Great, it is! Please rewrite it as an answer (not a comment) and I will accept it. – Medical physicist, Commented May 5, 2015 at 11:49
As nicola commented, you can convert a factor to its underlying integer codes simply by applying as.numeric or as.integer to it (which often causes confusion when it is not intended). – talat, Commented May 5, 2015 at 11:50
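
The confusion mentioned in the last comment can be illustrated with a small sketch. The factor f below is hypothetical and not taken from the question; it only shows that converting a factor gives the level codes, not the labels read as numbers.

```r
# Hypothetical factor whose labels happen to look like numbers
f <- factor(c("10", "20", "20", "5"))

# Levels are sorted as strings, so "5" comes last
lev <- levels(f)

# as.integer() returns the position of each value in levels(f), not the label
codes <- as.integer(f)

# To recover the labels as numbers, go through character first
values <- as.numeric(as.character(f))

print(lev)
print(codes)
print(values)
```

In the question this behavior is exactly what is wanted, because the filter is meant to depend on level position rather than on level labels.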

nicola · Accepted Answer · 2015-05-05 11:52:55Z


You can easily convert a factor into an integer and then use conditions on it. Just replace your filter statement with:

 filter(as.integer(Epsilon)>2)

More generally, if you have a vector of the level indices you want to eliminate, you can try:

 #some random levels we don't want
 nonWantedLevels<-c(5,6,9,12,13)
 #just the filter part
 filter(!as.integer(Epsilon) %in% nonWantedLevels)
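
For a complete, runnable picture, here is a minimal sketch of the same idea. The data frame is a made-up stand-in for DF (only the column names and a few values come from the question), and the variable names kept, kept_by_label and kept_clean are purely illustrative.

```r
library(dplyr)

# Hypothetical toy data standing in for DF; rows are invented for illustration
DF <- data.frame(
  Epsilon = factor(c("96002.txt", "96002.txt", "96004.txt", "96005.txt", "96007.txt")),
  Value   = c(61914L, 61887L, 61680L, 61776L, 61753L)
)

# Drop every row whose Epsilon is one of the first two levels
kept <- DF %>% filter(as.integer(Epsilon) > 2)

# Alternative: look the labels up by position instead of comparing the codes
kept_by_label <- DF %>% filter(!Epsilon %in% levels(Epsilon)[1:2])

# filter() leaves the unused levels attached to the factor;
# droplevels() removes them if that matters downstream
kept_clean <- droplevels(kept)

print(kept)
print(levels(kept$Epsilon))
print(levels(kept_clean$Epsilon))
```

Both forms depend on the order of levels(Epsilon), which by default is the sorted order of the labels, so it may be worth checking levels(DF$Epsilon) before relying on the positions.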


Is as.integer() better/safer than as.numeric() here? – Rasmus Larsen, Commented Dec 7, 2019 at 11:02
Very slightly more efficient, since a factor is stored internally as an integer vector, whereas as.numeric coerces it to a floating-point (double) value. – nicola, Commented Dec 9, 2019 at 11:39
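
The storage types behind this remark can be checked directly. The following is a small sketch using a hypothetical factor f; it says nothing about how large the efficiency difference is in practice.

```r
# Hypothetical factor used only to inspect storage types
f <- factor(c("b", "a", "c", "a"))

# A factor is an integer vector with a levels attribute
type_factor  <- typeof(f)

# as.integer() keeps the integer storage; as.numeric() converts to double
type_integer <- typeof(as.integer(f))
type_numeric <- typeof(as.numeric(f))

print(c(type_factor, type_integer, type_numeric))
```

Either conversion yields the same level codes, so the comparison in filter() gives the same rows with both.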
