Mann-Kendall interpretation. Working in R, "object 'y' not found"

QMT

New Member
#1
I am trying to run a Mann-Kendall (MK) test on each of a group of files. The data are precipitation records for each area, spanning decades. I read the files into R as a list of data tables, "datalist", bind them into one table, and then split that table by the name in the first column ("V1"). This all works fine. Then I want to perform the MK test on each split "chunk". Previously, I used the same approach to compute the mean of each "chunk", and it worked perfectly:
Code:
Samp <- sapply(splitByHUCs, function(x) mean(x$RO_MM, na.rm = FALSE), USE.NAMES = TRUE)
I was hoping to just slightly tweak the code like this:
Code:
Samp <- sapply(splitByHUCs, function(x) Kendall(x$V2, y$V3))
Here x is meant to be the timeline and y the data.
But I am getting "Error in Kendall(x$V2, y$V3) : object 'y' not found". I am confused about the requirements for the MK test.
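The error itself looks like a scoping issue rather than a requirement of the test: the anonymous function receives each chunk as x, and no object named y is ever defined, so both columns would have to be taken from x. A minimal sketch with made-up toy data, using base R's cor(..., method = "kendall") as a stand-in so that it runs without the Kendall package (with the package loaded, the analogous call would presumably be Kendall(x$V2, x$V3)):

```r
# Toy stand-in for splitByHUCs: a named list of data frames (values are made up)
set.seed(1)
toy <- data.frame(
  V1 = rep(c("Region_A", "Region_B"), each = 10),
  V2 = rep(2001:2010, times = 2),
  V3 = round(runif(20, min = 50, max = 150), 1)
)
toySplit <- split(toy, f = toy$V1)

# Inside the function the chunk is called x, so both columns come from x
tauByRegion <- sapply(toySplit, function(x) cor(x$V2, x$V3, method = "kendall"))
print(tauByRegion)
```

The result is one Kendall's tau per region, named after the list elements.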

I have the manual and it states:
"Usage: Kendall(x, y)
Arguments:
x: first variable, a vector
y: second variable, a vector the same length as x
Details: In many applications x and y may be ranks or even ordered categorical variables. In our function x and y should be numeric vectors or factors. Any observations corresponding to NA in either x or y are removed. Kendall’s rank correlation measures the strength of monotonic association between the vectors x and y."

This is not clear to me. My data are a single vector of precip values over a time period. I am not comparing it to anything else; I only want to see if there is a trend in the data. Could someone please explain what I am missing? Should I be using a different test from the package? I find the manual very cryptic and confusing. I tried running MannKendall(x$V3) and got results, but I don't understand what is being correlated. There are no "pairs"; it is one series of data records. So I do not understand how to interpret the results.
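On the "no pairs" point: as far as I understand it, the Mann-Kendall trend test pairs each value with its position in time, so it amounts to Kendall's rank correlation between the series and the time index. I believe MannKendall(x) in the Kendall package is equivalent to Kendall(1:length(x), x), but treat that as an assumption. A minimal base-R sketch with made-up values, where mk_score is a hypothetical helper name:

```r
# Hypothetical helper: Mann-Kendall score S = sum over i < j of sign(v[j] - v[i])
mk_score <- function(v) {
  n <- length(v)
  s <- 0
  for (i in seq_len(n - 1)) {
    later <- v[(i + 1):n]
    s <- s + sum(sign(later - v[i]))
  }
  s
}

precip <- c(12.1, 15.3, 11.8, 17.2, 16.0, 19.4, 18.7, 21.5)  # made-up series, no ties
n <- length(precip)

S   <- mk_score(precip)
tau <- S / (n * (n - 1) / 2)  # with no ties, the denominator D is n(n-1)/2

# Same number as Kendall's correlation between the time index and the data
tau_cor <- cor(seq_along(precip), precip, method = "kendall")
print(c(S = S, tau = tau, tau_cor = tau_cor))
```

In this sketch the "second variable" is simply 1, 2, ..., n, which is why a single series is enough for the trend test.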

Attached please find a small sample of the data. All columns are the same length, and there are no blank rows or "NA" cells. dput(splitByHUCs) shows this: "row.names = c(NA, -84L), class = c("data.table", "data.frame")"

This is my full code:
Code:
library(data.table)  # fread(), rbindlist()
library(Kendall)     # Kendall(), MannKendall()
fnames <- dir("~/Desktop/test_files/", pattern = "*_45Fall_*")
read_data <- function(z){
   dat <- fread(z, skip = 56, select = 1:3)
}
datalist <- lapply(fnames, read_data)
bigdata <- rbindlist(datalist, use.names = T)
splitByHUCs <- split(bigdata, f = bigdata$V1, sep = "\n", lex.order = TRUE)
# So far, so good.
Samp <- sapply(splitByHUCs, function(x) Kendall(x$V2, y$V3))
write.csv(Samp, file = "~/Desktop/results/Fall.csv", row.names = FALSE)
Attached are 3 sample files. NOTE: The timelines are abbreviated in these samples, so the "skip = 56" in the code above should be ignored. Also, the column headers "HUC8", "YEAR", and "RO_MM" are lost in the process because I am skipping rows. The names "V1", "V2", and "V3" are created by the script, so I just ran with them.
Thanks for your attention.
 


QMT

New Member
#4
This is the result I got from running Samp <- sapply(splitByHUCs, function(x) MannKendall(x$V3)):
Code:
                                Region_1010001    Region_1010002    Region_1010003    Region_1010004    Region_1010005    Region_1020001
tau    Kendall’s tau statistic     0.065700762       0.157773957        0.06197992       0.020086085       0.036739387      0.135437593
sl     two-sided p-value            0.37843132       0.033935666       0.406213045       0.789810181       0.623690486      0.068828821
S      Kendall Score                       229               550               216                70               128              472
D      denominator, tau=S/D             3485.5              3486       3484.999756       3484.999756       3483.999268      3484.999756
varS   variance of S               67007.66406       67008.66406       67006.66406       67006.66406       67004.66406      67006.66406
Can someone please explain what this is telling me? I'm still not sure this is the correct test.
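For what it is worth, the rows of that table can be cross-checked against each other. A minimal sketch for the Region_1010002 column, assuming n = 84 observations (as the row.names in the dput output suggest), no ties, and the usual normal approximation with continuity correction for the p-value. These are assumptions about what the package does, not something taken from its source:

```r
n <- 84     # assumed series length
S <- 550    # Kendall score reported for Region_1010002

D    <- n * (n - 1) / 2                  # number of pairs when there are no ties
varS <- n * (n - 1) * (2 * n + 5) / 18   # variance of S under no trend, no ties
tau  <- S / D

# Normal approximation with continuity correction
z <- (S - sign(S)) / sqrt(varS)
p <- 2 * pnorm(-abs(z))

print(c(D = D, varS = varS, tau = tau, z = z, p = p))
```

These come out close to the reported D = 3486, varS = 67008.66, tau = 0.1578 and sl = 0.0339. Read that way, a tau near 0 means little monotonic trend, a positive S means later values tend to be larger than earlier ones, and sl is the two-sided p-value for the null hypothesis of no trend. At the 0.05 level, only Region_1010002 would count as significant in this table.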
 