[Stata] Trouble with identifying time variable

I've got 5-minute intraday data on some exchange rates, with dates in the form DD/MM/YYYY HH:MM. When I try to set the date as a %tc variable, Stata says:

'string %fmt required for string variables'

I've tried everything and don't know what to do. Any help is much appreciated. Thanks.
Re: Trouble with identifying time variable in Stata

Did you specify a "mask" when converting your variable? Something like this:

gen double x = clock(date, "DMYhm")
It seems that your new variable only has missing values... I have no other idea about your problem.

My suggestion for now is to create your time variable without the clock() function. For instance, if you have exactly 5 minutes between two consecutive records, you can create a time index with "egen time = group(date)", provided the date strings sort in chronological order (group() numbers the distinct values in sorted order). If you don't, you can try something like "gen time = 60*real(substr(date, 12, 2)) + real(substr(date, 15, 2))" for zero-padded DD/MM/YYYY HH:MM strings, where characters 12-13 hold the hours and characters 15-16 hold the minutes, etc.
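
To make the substr() idea concrete, here is a minimal sketch. It assumes zero-padded strings in the DD/MM/YYYY HH:MM layout described in the question; the three sample values and the names hh, mm and minute_of_day are made up for illustration. It only recovers the minute within the day, so the day itself would still have to be handled separately.

```stata
* Sketch only: the three date strings below are made-up sample values.
clear
input str16 date
"03/01/2007 08:30"
"03/01/2007 08:35"
"03/01/2007 08:45"
end

* In "DD/MM/YYYY HH:MM", characters 12-13 are the hour and 15-16 the minutes
gen hh = real(substr(date, 12, 2))
gen mm = real(substr(date, 15, 2))

* Minutes elapsed since midnight (ignores the calendar day)
gen minute_of_day = 60*hh + mm

list date hh mm minute_of_day
```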

Hope this helps!
Hey. I'm not really sure what you mean.

Here is an example of what my data looks like:


Exchange Rate

Using this small sample, I'm giving this command to set time as the time variable:

gen time2 = clock(time, "DMYhm")

And I just get 5 missing values. If you could have a go with this sample and see how to do it, I'd be saved. Sorry, but I don't really get your other suggestion or how to code it.

What a silly mistake. I've now done it with MDYhm and I get a new variable with figures like '1.49e+12'. For example, I did it with the 5 observations I've posted in this thread and the new variable shows '1.49e+12' for every observation. Is this OK to use as my time variable?
Hey! I've managed to get my times right. I just right-clicked on the variable and chose format. Then I picked clock and the appropriate format, and it turned those strange '1.49e+12' numbers into the correct times. Does data like this have to be stored as double? I ask because I read somewhere that it does.
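
For reference, the point-and-click steps described above can also be done from the command line. This is only a sketch: the variable names time, rate and time2 follow the thread, but the five timestamps and rates are invented placeholders, and the MDYhm mask assumes month/day/year strings as in the post above.

```stata
* Sketch only: timestamps and rates are invented placeholder values.
clear
input str16 time double rate
"01/03/2007 08:30" 1.9701
"01/03/2007 08:35" 1.9703
"01/03/2007 08:40" 1.9698
"01/03/2007 08:45" 1.9705
"01/03/2007 08:50" 1.9702
end

* Convert the string to a Stata clock value, stored as a double
gen double time2 = clock(time, "MDYhm")

* Display the numeric value as a date and time instead of 1.49e+12
format time2 %tc

list time time2 rate
```

The format command only changes how the number is displayed; the stored value is unchanged.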
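It seems that the answer lies in the starting date that Stata uses for the clock() function: a %tc value counts milliseconds since 01jan1960 00:00:00, so recent dates are very large numbers. I copy-paste the relevant explanations: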

What happens if you store a %tc value as a float:
The largest integer that can be stored precisely in a float is
16,777,216, corresponding to 01jan1960 04:39:37.216. Times after that
will be subject to rounding; the rounding as of recent times can be as
much as 2 minutes, 11 seconds.

What happens if you store a %tc value as a double:
The largest integer that can be stored precisely in a double is
9,007,199,254,740,992, corresponding to a date in year 285,422,880.
Stata cuts off dates at year 9999, but for other reasons.
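
A quick way to see the rounding described in the quoted help text is to store the same clock value both ways. This is a sketch with an arbitrary example timestamp; the names t_double, t_float and diff_seconds are made up for illustration.

```stata
* Sketch only: the timestamp is an arbitrary example.
clear
set obs 1

* Same instant stored as a double and as a float
gen double t_double = clock("01/03/2007 08:35", "MDYhm")
gen float  t_float  = t_double

format t_double t_float %tc
list t_double t_float

* Rounding error introduced by float storage, in seconds
gen double diff_seconds = (t_float - t_double) / 1000
list diff_seconds
```

The float copy should display a time that is off from 08:35, by an amount within the range the help text mentions.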

Now it depends on what you want to do with your time variable. For instance if you want to regress your exchange rates against time, what matters is to have a time variable that is correctly coded. More precisely if the dates are:

"today 12:00"
"today 12:05"
"today 12:15"

then you want the gap in your time variable between observations 2 and 3 to be twice as large as the gap between observations 1 and 2.
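
As a check of that property, the sketch below builds the three example times on an arbitrary day and computes the gaps in minutes. The date and the names stamp, t and gap_min are assumptions for illustration only.

```stata
* Sketch only: the calendar day is arbitrary.
clear
input str16 stamp
"01/03/2007 12:00"
"01/03/2007 12:05"
"01/03/2007 12:15"
end

gen double t = clock(stamp, "MDYhm")
format t %tc

* Clock values are in milliseconds, so divide by 60000 to get minutes
gen double gap_min = (t - t[_n-1]) / 60000
list stamp t gap_min

* Declare the series with a 5-minute period; Stata will note the gap
tsset t, delta(5 minutes)
```

The second gap (10 minutes) comes out twice the first (5 minutes), which is what a correctly coded time variable should give.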
This is what I'm trying to do. I've got a set of news surprise data (announced value minus forecast) at the times of release. For example, I have 12 GDP announcements made at 8:30, quarterly, spanning three years. I'm regressing the exchange rate against the news release to quantify the effect the news release has on the rate. I'm trying to do this using these two models. The most basic is:

R_t = β_k S_t + ε_t

Where R_t is returns and S_t is the GDP news. Am I right in creating a GDP variable with its figures placed in the dataset at the times they were released? The effects wear off after 15 minutes, so they must be put in the right place. Then regressing returns against the variable should tell me how the news affects the exchange rate, right?

Then, the more complex model is:

R_t = β_0 + Σ_i β_i R_{t-i} + Σ_j β_{k,t} S_{k,t-j} + ε_t (x)

This is to include the persistence of the effects over time, so setting j=5 or so would test how the effects dampen out over the 25 minutes following the release. I am concerned I am not using Stata correctly to do this. I think I've got my time variables correctly coded so far. I ran the 'gen time2 = clock(time, "DMYhm")' command, then right-clicked the variable, set it to clock and chose the format I wanted. It then gave me a time variable with all the right times, which allowed me to tsset it.

My issue is what to do with the news variables, as they are not strictly time series data: they do not have temporally consecutive values. Do I just put them in where they are and regress, or is there something better? A paper by Almeida, Goodhart and Payne called “The Effects of Macroeconomic News on High Frequency Exchange Rate Behaviour” does what I am trying to do.
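
One possible way to set this up, offered as a sketch rather than as the method used in the paper: keep the news variable at zero in every 5-minute interval without a release, put the surprise in the interval of the release, and let Stata's lag operators build the lagged terms. Everything below is simulated; the names t, ret and surprise, the lag lengths and the release schedule are assumptions for illustration.

```stata
* Sketch only: simulated data, hypothetical names and lag lengths.
clear
set seed 12345
set obs 300

* Regular 5-minute grid (5 minutes = 5*60000 milliseconds)
gen double t = clock("01/03/2007 08:00", "MDYhm") + (_n - 1)*5*60000
format t %tc
tsset t, delta(5 minutes)

* Placeholder returns; replace with the real return series
gen ret = rnormal(0, 0.01)

* News surprise: zero everywhere except in (hypothetical) release intervals
gen surprise = 0
replace surprise = rnormal() if mod(_n, 50) == 7

* Basic model: R_t = b_k S_t + e_t (no constant, as written above)
regress ret surprise, noconstant

* Extended model: lagged returns plus current and lagged surprises
regress ret L(1/2).ret L(0/5).surprise
```

Because the data are tsset with a 5-minute delta, L(0/5).surprise covers the release interval and the 25 minutes after it.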

Any suggestions from anybody will be much appreciated. Thanks for your help so far Etienne.