# How to deal with missing values at the end of a time series?

##### New Member
Hi everyone!

I have some problems concerning missing values in time-series data.
I have 11 values for each subject, the values being hormone measurements at different time points (0, +40, +50, +55...). Some of these values are missing. I want to impute the missing data, and I need univariate time-series imputation methods that work with unequally spaced time points. I tried the "zoo" package, with the na.approx() and na.spline() functions.
na.approx() replaces NAs by linear interpolation, while na.spline() replaces them by cubic spline interpolation (a piecewise polynomial function). Theoretically, spline interpolation seems more appropriate because I hypothesize that the hormone concentration increases around the 6th sample and then decreases. However, na.spline() replaces some of my NAs with negative values, which is impossible for my type of data (a biological assay).
On the other hand, na.approx() replaces my NAs with realistic values, but fails to replace the NAs when they are the last values in my time series.

Here is an example of what I have done:
Code:
library(zoo)
cort.data2 <- c(2.34, 1.5, NA, NA, NA, 2.57, 3.53, 3.63, NA, NA, NA)
cort.time2 <- c(0, 43, 49, 54, 59, 69, 74, 81, 95, 110, 125)
sj02AM <- zoo(cort.data2, cort.time2)
na.spline(sj02AM, na.rm = FALSE)
na.approx(sj02AM, na.rm = FALSE)
Here, cort.time2 is the sampling time in minutes (baseline at 0 min), and cort.data2 is the concentration of my hormone of interest (which cannot be below 0).

Results are:
For na.spline(): 2.34 1.50 1.25 1.20 1.36 2.57 3.53 3.63 -5.14 -35.93 -99.09
For na.approx(): 2.34 1.50 1.75 1.95 2.16 2.57 3.53 3.63 NA NA NA
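
A side note on the trailing NAs: linear interpolation needs an observed value on both sides of a gap, so anything after the last observation stays NA by default. A minimal sketch with base R's approx(), using the same data, shows one possible workaround: `rule = 2` repeats the last observed value beyond the observed range. The name `filled` is only illustrative, and carrying the last value forward is an assumption about the data, not a recommendation.

```r
cort.data2 <- c(2.34, 1.5, NA, NA, NA, 2.57, 3.53, 3.63, NA, NA, NA)
cort.time2 <- c(0, 43, 49, 54, 59, 69, 74, 81, 95, 110, 125)

# Keep only the observed points as interpolation knots
ok <- !is.na(cort.data2)

# rule = 2: outside the observed time range, use the value at the
# nearest observed end point instead of returning NA
filled <- approx(cort.time2[ok], cort.data2[ok],
                 xout = cort.time2, rule = 2)$y

round(filled, 2)
```

The interior gaps get the same values as in the na.approx() result above, and the three trailing NAs become 3.63. As far as I know, zoo's na.approx() passes extra arguments on to approx(), so `na.approx(sj02AM, na.rm = FALSE, rule = 2)` should behave the same way, but check the zoo help page to be sure.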

I looked at some other packages, but they do not seem to be suitable for unequally spaced time series. Also, I do not have a background in statistics or mathematics, so I don't think I am able to write my own function in R...

Thank you for your help

#### hlsmith

##### Less is more. Stay pure. Stay poor.
Cool problem. What is the final purpose of your project? What will you do with these data?

How much missingness is there?

Also, with missingness people typically recommend multiple imputation to address the uncertainty in the imputed values. I am not sure how to put constraints on splines, but that may be the best route. Maybe there is a Bayesian version that can put weight on positive values.
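
One simple way to keep a spline from going negative, sketched here under assumptions, is to interpolate the log of the concentrations and transform back with exp(), which is always positive. This is not a constrained spline in the strict sense, just a change of scale. It uses base R's spline() with `method = "natural"`, which extrapolates linearly (on the log scale) past the last observation. The names `time`, `conc` and `filled` are made up for the example.

```r
conc <- c(2.34, 1.5, NA, NA, NA, 2.57, 3.53, 3.63, NA, NA, NA)
time <- c(0, 43, 49, 54, 59, 69, 74, 81, 95, 110, 125)

# Observed points only; log() requires strictly positive values
ok <- !is.na(conc)

# Fit a natural cubic spline to log(concentration) and evaluate it
# at every sampling time, including the trailing ones
log_fit <- spline(time[ok], log(conc[ok]),
                  xout = time, method = "natural")$y

# Back-transform: exp() can never return a negative value
filled <- exp(log_fit)

round(filled, 2)
```

Positive does not mean plausible: the values after the last observation are still extrapolations from a handful of points, so they deserve the same caution as any other guess about the end of the series.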