A couple of guys on the team and I got up early to do the deployment, and as far as we could tell everything was OK: the services on the new boxes were processing requests and there was almost no downtime.
We were not aware of it at first, but around that time one of the DevOps engineers found out that the services on the new boxes were not writing anything to Splunk. After some investigation he discovered that the new Windows servers had their timezone set to BST, unlike the existing servers, which were set to UTC. So he quickly changed those servers and the Terraform template to UTC. Job done, time for breakfast...
Not quite. After a while one of the QAs noticed that the times on some of the websites were jumping back and forth by one hour. Thankfully we are still in daylight saving time; if this had happened during the winter we would not have noticed until spring, and by then the investigation would have been much more difficult because of the lack of context.
The guys couldn't understand what was going on, since all the servers were set to use UTC (we didn't know that the timezone had been changed a few minutes earlier). Fortunately it wasn't the first time I had seen something like that, so I logged on to one of the boxes and checked the Windows Event Viewer, and there it was: the timezone change. I restarted the services, and after forcing some updates in the affected entities everything was fine again.
On the way home I remembered reading about this in the great book CLR via C#: processes and threads hold on to their own locale, so a service that is already running keeps using the timezone it started with. So I wrote a small program that prints DateTime.Now in a loop, because I know that writing it down will help me remember it quickly next time.
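A program along these lines might look like the sketch below. It is only an illustration: the names ClockWatcher and Format are made up, and printing the UTC offset and TimeZoneInfo.Local.Id next to the time is an extra, added here so the timezone the process believes it is in shows up on every line.

```csharp
using System;
using System.Globalization;
using System.Threading;

// Hypothetical names; a sketch of a "print the local time in a loop" program.
public static class ClockWatcher
{
    // Formats one sample as "yyyy-MM-dd HH:mm:ss +hh:mm (zone id)".
    public static string Format(DateTimeOffset now, string zoneId)
    {
        string stamp = now.ToString("yyyy-MM-dd HH:mm:ss zzz", CultureInfo.InvariantCulture);
        return stamp + " (" + zoneId + ")";
    }

    public static void Main()
    {
        // Runs until the process is stopped (Ctrl+C).
        while (true)
        {
            // DateTimeOffset.Now and TimeZoneInfo.Local both reflect the
            // timezone as seen by this process, not necessarily the
            // current OS setting.
            Console.WriteLine(Format(DateTimeOffset.Now, TimeZoneInfo.Local.Id));
            Thread.Sleep(TimeSpan.FromSeconds(1));
        }
    }
}
```

To repeat the experiment, start one instance, change the timezone of the machine, then start a second instance and compare the two consoles.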
This is the output with the timezone set to BST.
And this is the output with the timezone set back to UTC. Note that the first instance still uses BST, while the new instance has picked up the update.
Now it is time to address this technical debt: we have to use DateTime.UtcNow everywhere in the backend, and also send dates with their timezone between services and the frontend, to avoid this kind of problem.
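As a rough sketch of what that could look like, the snippet below takes the current time in UTC and sends it as an ISO 8601 string that carries its offset, using the standard "o" round-trip format. The class and method names (Timestamps, ToWire, FromWire) are hypothetical, and the choice of DateTimeOffset over DateTime is an assumption, not something we have decided yet.

```csharp
using System;
using System.Globalization;

// Hypothetical helper; a sketch of sending dates with their offset.
public static class Timestamps
{
    // Serializes with the round-trip format, e.g. 2020-06-01T09:30:00.0000000+01:00.
    public static string ToWire(DateTimeOffset value)
    {
        return value.ToString("o", CultureInfo.InvariantCulture);
    }

    // Parses a string produced by ToWire, keeping the offset it was sent with.
    public static DateTimeOffset FromWire(string text)
    {
        return DateTimeOffset.ParseExact(
            text, "o", CultureInfo.InvariantCulture, DateTimeStyles.None);
    }

    public static void Main()
    {
        // UtcNow does not depend on the timezone of the server or the process.
        DateTimeOffset createdAt = DateTimeOffset.UtcNow;

        string wire = ToWire(createdAt);
        DateTimeOffset received = FromWire(wire);

        Console.WriteLine(wire);
        Console.WriteLine(received == createdAt); // same instant after the round trip
    }
}
```

Because the offset travels with the value, the receiver can convert to whatever local time it needs to display without guessing which timezone the sender was using.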