select elements from an R object. All the information in this chapter comes from the Subsetting chapter of Advanced R by Hadley Wickham.
We have two vectors, x and y.
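The definitions of the two vectors are not shown in this section. The sketch below reconstructs them from the values printed further down, so treat it as an assumption rather than the original code.

```r
# Assumed definitions, inferred from the printed values of x and y in this chapter
x <- c(1, 3, 2, 4, -10)
y <- c("c", "a", "d", "b")

length(x)  # x has five elements
length(y)  # y has four elements
```

Note that the two vectors have different lengths, which matters when one is used to index the other.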
Use [ to select any number of elements from a vector.
Positive integers return the elements at the specified positions:
x[c(3, 1)]
[1] 2 1
# order(x) returns the positions that would sort `x`; the result has the same length as `x`
order(x)
[1] 5 1 3 2 4
order(y)
[1] 2 4 1 3
x
[1] 1 3 2 4 -10
y
[1] "c" "a" "d" "b"
x[order(x)]
[1] -10 1 2 3 4
x[order(y)]
[1] 3 4 1 2
y[order(x)]
[1] NA "c" "d" "a" "b"
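A short sketch of why these results look the way they do. It assumes the same x and y as above (redefined here so the block runs on its own).

```r
x <- c(1, 3, 2, 4, -10)
y <- c("c", "a", "d", "b")

# Subsetting by order(x) gives the same result as sort(x)
identical(x[order(x)], sort(x))

# decreasing = TRUE reverses the ordering
x[order(x, decreasing = TRUE)]

# order(x) contains position 5, but y only has four elements,
# so that out-of-bounds position yields NA
is.na(y[5])
```

This is why y[order(x)] starts with NA: the first position returned by order(x) is 5, which does not exist in y.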
# Duplicate indices will duplicate values
y
[1] "c" "a" "d" "b"
y[c(1, 1)]
[1] "c" "c"
# Real numbers are silently truncated to integers
y
[1] "c" "a" "d" "b"
y[c(2.1, 2.9)]
[1] "a" "a"
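To make the truncation rule concrete, here is a small sketch comparing truncation with rounding. The name idx is only used for illustration.

```r
y <- c("c", "a", "d", "b")
idx <- c(2.1, 2.9)  # illustrative index vector

# Truncation drops the fractional part (it does not round), so 2.9 becomes 2
trunc(idx)
identical(y[idx], y[trunc(idx)])

# Rounding first selects different elements: positions 2 and 3
y[round(idx)]
```

If you compute positions with arithmetic, it is probably safer to round or truncate them explicitly so the selection is not a surprise.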
Exclude elements at the specified positions:
# Both elements are negative
x[-c(3, 1)]
[1] 3 4 -10
You can't mix positive and negative integers in a single subset:
x[c(-1, 2)]
Error in x[c(-1, 2)]: only 0's may be mixed with negative subscripts
But you can do the following
c(x[-1], x[2])
[1] 3 2 4 -10 3
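If you want a script to keep running when a subset like this fails, one option is to catch the error. The sketch below uses tryCatch() with a placeholder return value ("error") chosen only for illustration. It also shows that zeros, unlike positive integers, can be mixed with negative subscripts.

```r
x <- c(1, 3, 2, 4, -10)

# Capture the error instead of stopping the script
res <- tryCatch(x[c(-1, 2)], error = function(e) "error")
res

# Zeros are allowed alongside negative subscripts and are simply ignored
x[c(-1, 0)]
```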
Logical vectors are recycled. For subsetting, c(1, 0) is different from c(TRUE, FALSE): the numeric vector is read as positions, while the logical vector is read as a mask.
# For subsetting
y
[1] "c" "a" "d" "b"
y[c(FALSE, TRUE, TRUE, FALSE)]
[1] "a" "d"
y[c(0, 1, 1, 0)]
[1] "c" "c"
# Length zero
y[0]
character(0)
# Return original vector
y[]
[1] "c" "a" "d" "b"
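When an indicator arrives as 0/1 numbers but you mean it as a mask, converting it explicitly avoids the confusion. A minimal sketch, where flags is a made-up name:

```r
y <- c("c", "a", "d", "b")
flags <- c(0, 1, 1, 0)  # illustrative 0/1 indicator

# As numbers, these are positions: zeros are dropped and 1 selects the first element twice
y[flags]

# Converted to logical, the same values act as a mask
y[as.logical(flags)]

# which() goes the other way, from a logical mask to integer positions
which(as.logical(flags))
```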
# Logical evaluation
if (TRUE) {
print("works")
}
[1] "works"
if (1) {
print("works")
}
[1] "works"
if (FALSE) {
# Skipped: the condition is FALSE
print("does not work")
}
if (0) {
# Skipped: 0 is coerced to FALSE
print("does not work")
}
# Recycling of logical vectors for subsetting
x
[1] 1 3 2 4 -10
x[c(TRUE, FALSE)]
[1] 1 2 -10
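The recycling can be written out explicitly to see what R is doing. This sketch uses rep_len() to build the full-length mask, with mask as an illustrative name.

```r
x <- c(1, 3, 2, 4, -10)

# c(TRUE, FALSE) is repeated until it is as long as x
mask <- rep_len(c(TRUE, FALSE), length(x))
mask

# The recycled and the explicit mask select the same elements
identical(x[c(TRUE, FALSE)], x[mask])

# The same selection written with integer positions
x[seq(1, length(x), by = 2)]
```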
Sometimes the elements of a vector are named. Remember, this is different from the vector itself having a name. The variables in a data frame are named vectors, but you can also name the individual elements of a variable. A named vector is similar to, but not the same as, a factor. Factors are integer vectors with class factor and a levels attribute. Named elements in a vector are elements of any type with a corresponding name. Whether a factor behaves like a factor depends on the function that is reading it. Named vectors behave according to their own type, regardless of the names. Yes, class and names are both attributes, but of different kinds.
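You can check the difference by inspecting the attributes directly. The vectors v and f below are small made-up examples, not objects from this chapter.

```r
v <- c(a = 1.5, b = 2.5)               # named double vector (illustrative)
f <- factor(c("low", "high", "low"))   # factor (illustrative)

# A named vector keeps its own type and only carries a names attribute
typeof(v)
names(attributes(v))

# A factor is stored as integer codes plus levels and class attributes
typeof(f)
names(attributes(f))
class(f)
```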
# Using setNames()
nombres <- c("Serapio", "Trimegisto", "Amalasunta", "Metafrasto", "Brunilda")
xm <- setNames(object = x,
nm = nombres)
xm
Serapio Trimegisto Amalasunta Metafrasto Brunilda
1 3 2 4 -10
# using names()
x
[1] 1 3 2 4 -10
names(x) <- nombres
x
Serapio Trimegisto Amalasunta Metafrasto Brunilda
1 3 2 4 -10
# Print or use the names
names(x)
[1] "Serapio" "Trimegisto" "Amalasunta" "Metafrasto" "Brunilda"
nm <- names(x)
nm
[1] "Serapio" "Trimegisto" "Amalasunta" "Metafrasto" "Brunilda"
# Remove names
names(x) <- NULL
x
[1] 1 3 2 4 -10
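Assigning NULL to names() modifies the vector in place. If you would rather keep the named version and get a bare copy, unname() is an alternative. A minimal sketch, with plain as an illustrative name:

```r
x <- c(1, 3, 2, 4, -10)
nombres <- c("Serapio", "Trimegisto", "Amalasunta", "Metafrasto", "Brunilda")
xm <- setNames(x, nombres)

# unname() returns a copy without names and leaves xm untouched
plain <- unname(xm)
names(plain)
names(xm)
```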
# Getting one element
xm["Amalasunta"]
Amalasunta
2
# Getting an element from a vector without names
x["Amalasunta"]
[1] NA
# Some names exist and others do not
xm[c("Andres", "Amalasunta")]
<NA> Amalasunta
NA 2
# repeated names
xm[c("Trimegisto", "Trimegisto")]
Trimegisto Trimegisto
3 3
# Names are matched exactly.
xm[c("Tri", "Trimegisto")]
<NA> Trimegisto
NA 3
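Because missing or partial names silently return NA, it can help to be explicit about what you want. The sketch below shows two options under that assumption: keeping only the names that exist, and selecting by prefix with startsWith(). The name wanted is illustrative.

```r
xm <- c(Serapio = 1, Trimegisto = 3, Amalasunta = 2, Metafrasto = 4, Brunilda = -10)
wanted <- c("Andres", "Amalasunta")  # illustrative lookup vector

# Keep only the requested names that actually exist in xm
xm[intersect(wanted, names(xm))]

# Select by prefix explicitly, since [ does not do partial matching
xm[startsWith(names(xm), "Tri")]
```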
Subsetting with factors is just a bad idea: [ uses the underlying integer codes of the factor, not its levels.
fnames <- factor(nombres)
fnames
[1] Serapio Trimegisto Amalasunta Metafrasto Brunilda
Levels: Amalasunta Brunilda Metafrasto Serapio Trimegisto
as.numeric(fnames)
[1] 4 5 1 3 2
[1] <NA> <NA>
Levels: Amalasunta Brunilda Metafrasto Serapio Trimegisto
Amalasunta Brunilda
2 -10
xm[f3]
Serapio Trimegisto
1 3
as.numeric(f3)
[1] 1 2
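The effect is easier to see in a self-contained sketch. Here f is a hypothetical factor built for illustration; it is not necessarily how the factors in this chapter were created.

```r
xm <- c(Serapio = 1, Trimegisto = 3, Amalasunta = 2, Metafrasto = 4, Brunilda = -10)
f <- factor(c("Amalasunta", "Brunilda"))  # hypothetical factor

# The integer codes behind the factor
as.integer(f)

# Subsetting with the factor uses those codes as positions,
# so it returns the first and second elements, not the named ones
xm[f]

# Converting to character first subsets by name, as intended
xm[as.character(f)]
```

Converting the factor with as.character() before subsetting is the usual way to avoid this trap.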
For attribution, please cite this work as
Castaneda (2023, Jan. 24). R Training for GPID Team: Subsetting. Retrieved from https://povcalnet-team.github.io/Rtraining/posts/subsetting/
BibTeX citation
@misc{castaneda2023subsetting, author = {Castaneda, R.Andres}, title = {R Training for GPID Team: Subsetting}, url = {https://povcalnet-team.github.io/Rtraining/posts/subsetting/}, year = {2023} }