
From the language definition:

"char": a ‘scalar’ string object (internal only) ***
"character": a vector containing character values

Users cannot easily get hold of objects of types marked with a ‘***’.

y <- "My name is hasnain"
class(y)

The class function tells me that 'y' belongs to the character class. My understanding is that 'y' is an object of type char that belongs to the character class. Does a character object also belong to the character class?

  • There are no scalar object types in R. typeof(y) returns "character". I don't know where you got this quote from, but "internal" probably refers to underlying C code.
    – Roland
    Commented Sep 11, 2019 at 8:50
  • @Roland archive.linux.duke.edu/cran/doc/manuals/r-release/… page# 5
    – Hasnain
    Commented Sep 11, 2019 at 8:52
  • "'y' is an object which is of char type" That's not correct. It's of type "character".
    – Roland
    Commented Sep 11, 2019 at 9:03
  • On further reflection, I think you've actually asked a good, interesting question. I have absolutely no idea how one could get an object of type "char" in R. I might add a bounty asking for an example if nobody posts an answer.
    – Roland
    Commented Sep 11, 2019 at 9:26

2 Answers

The two R types char and character at the internal C side correspond to CHARSXP and STRSXP respectively. At the R level, one always deals with character objects; a single string, like:

y <- "My name is hasnain"

is actually a character object of length 1. Internally, each element of a character is a char, but R doesn't provide (AFAIK) a direct way to extract, create and/or use a char.
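
To make the "character object of length 1" point concrete, here is a minimal sketch in plain R. It only uses base functions; the comments show what I would expect a current R session to print.

```r
y <- "My name is hasnain"

typeof(y)   # "character": the type is character, not char
class(y)    # "character"
length(y)   # 1: one string in the vector
nchar(y)    # 18: number of characters inside that single string

# Indexing the first element returns another character vector of length 1,
# never a bare char object.
identical(y, y[1])   # TRUE
```

Note that length counts the strings in the vector, while nchar counts the characters within each string.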

Although you can't create a char/CHARSXP object with pure R, it's straightforward to get it through the R/C interface using the mkChar function, which takes a standard C string and turns it into a CHARSXP. For instance, one can create a char.c file:

#include <R.h>
#include <Rinternals.h>

/* Return a bare CHARSXP built from a C string; called from R via .Call(). */
SEXP returnCHAR(void) {
   /* mkChar() turns the C string into a CHARSXP. */
   SEXP ret = PROTECT(mkChar("Hello World!"));
   UNPROTECT(1);
   return ret;
}

After compiling it through R CMD SHLIB char.c, from the R side:

dyn.load("char.so")  #linux dll; extension varies across platforms
x<-.Call("returnCHAR")
x
# <CHARSXP: "Hello World!">
typeof(x)
#[1] "char"
length(x)
#[1] 12

Besides typeof and length, I didn't find many other R functions that act on char objects. Even as.character doesn't work! I could neither extract a char from a standard character vector nor insert this char into an existing character vector (assignment doesn't work).

The c function coerces to a list if an object is a char:

c(1,"a",x)
#[[1]]
#[1] 1
#
#[[2]]
#[1] "a"
#
#[[3]]
#<CHARSXP: "Hello World!">

We can use .Internal(inspect()) to get a glimpse of the internal structure of an object (warning: inspect is an internal, undocumented function, so it might change in future releases; don't rely on it). As far as I know, char/CHARSXP objects are shared between string vectors to save memory. For instance:

let<-letters[1:2]
.Internal(inspect(let))
#@1aff2a8 16 STRSXP g0c2 [NAM(1)] (len=2, tl=0)
#  @1368c60 09 CHARSXP g0c1 [MARK,gp=0x61] [ASCII] [cached] "a"
#  @16dc7c0 09 CHARSXP g0c1 [MARK,gp=0x60] [ASCII] [cached] "b"
mya<-"a"
.Internal(inspect(mya))
#@3068710 16 STRSXP g0c1 [NAM(3)] (len=1, tl=0)
#  @1368c60 09 CHARSXP g0c1 [MARK,gp=0x61] [ASCII] [cached] "a"

From the above output, we note two things:

  • STRSXP objects are vectors of CHARSXP objects, as we mentioned;
  • strings are stored in a "global pool": the "a" string is stored at the same address despite being created independently in two different objects.
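
The sharing can also be observed indirectly, without inspect. This is a minimal sketch, assuming object.size() counts a string that is repeated within one vector only once (as its help page describes). The sizes are estimates, and the variable names x_rep and x_uniq are made up for illustration.

```r
# 1000 references to one and the same 100-character string
x_rep <- rep(strrep("a", 100), 1000)

# 1000 distinct 100-character strings (96 "a"s plus a 4-digit suffix)
x_uniq <- paste0(strrep("a", 96), sprintf("%04d", 1:1000))

# Both vectors have the same length and the same number of characters per element
length(x_rep) == length(x_uniq)
all(nchar(x_rep) == nchar(x_uniq))

# The repeated vector is reported as much smaller, because its elements
# all point to a single shared string.
object.size(x_rep) < object.size(x_uniq)
```

The exact byte counts depend on the platform and R version, so only the comparison is meaningful here.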

  • Great answer! One thing I think might improve it is addressing OP's confusion as to "My name is hasnain" being a character vector of length one rather than a char object.
    – duckmayr
    Commented Sep 15, 2019 at 11:59

As per nicola's answer:

The two R types char and character at the internal C side correspond to CHARSXP and STRSXP respectively. At the R level, one always deals with character objects; a single string, like:

y <- "My name is hasnain"

In R, a piece of text is represented as a sequence of characters (letters, numbers, and symbols). The data type R provides for storing sequences of characters is character. Formally, the mode of an object that holds character strings in R is "character".
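
As a small illustration of the statement about mode, the following sketch compares what mode, typeof and class report for a string and for a number. The variable names s and n are only for illustration.

```r
s <- "3.14"   # a character vector of length 1
n <- 3.14     # a numeric (double) vector of length 1

# For a string, all three report "character"
c(mode = mode(s), typeof = typeof(s), class = class(s))

# For a number, mode and class are "numeric" while typeof is "double"
c(mode = mode(n), typeof = typeof(n), class = class(n))
```

None of these functions ever reports "char" for an object created in plain R.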

A character object is used to represent string values in R. We convert objects into character values with the as.character() function:

> x = as.character(3.14) 
> x              # print the character string 
[1] "3.14" 
> class(x)       # print the class name of x 
[1] "character"

Character/string: each element in the vector is a string of one or more characters. Built-in character vectors are letters and LETTERS, which provide the 26 lower-case and upper-case letters, respectively.

> y = c("a", "bc", "def")
> length(y)
[1] 3
> nchar(y)
[1] 1 2 3
> y == "a"
[1]  TRUE FALSE FALSE
> y == "b"
[1] FALSE FALSE FALSE

It is as simple as that: yes, you have perceived it correctly in the sense that the string held in y is stored internally as a char, and y itself belongs to the character class. Here the sequence of characters forms a single string, that is, a character vector of length 1.
