From Kolab Wiki
KEP 2: Modification of datetime: store local time, add 'tz' attribute
Latest Working Copy: KEP-0002.txt
This is a Kolab Enhancement Proposal (KEP) to update how the Kolab Format version 2.0 (XML version 1.0) stored date/time fields. All times were stored in UTC and did not allow for time zone information. This was too simplistic an approach to solve the complex issues caused by time zones and their DST rules, as explained in the background section below.
This KEP addresses issues of usability as well as function. The usability related issue is that users sometimes specifically set time zones for datetime fields and expect this explicit selection to be preserved across sessions and clients. Without storage of this information clients cannot meet that user expectation.
The functional issue is the more important of the two: For non-recurring events there can be errors in the display of an event's time if DST rules have changed since the event was made. For recurring events in parts of the world with DST regimes it was impossible to define a recurrence that takes place at the same local time all year and is correctly displayed by clients in time zones with different DST rules, in particular different times of switching. Both DST-related issues are accentuated by the fact that DST rules are subject to political decisions taken in the future, and consequently unknown today.
In order to achieve a recurring event that retains its local time across DST transitions, a client must know which time zone to use. The implicit assumption of older clients to always use the local time zone is problematic, as explained in subsection "Current client behaviour" below. So enabling time zone information for datetime fields is essential.
Some reference for background was provided on the Kolab format list, and there have been various proposals for resolving the issue, including adding time zone information in a separate XML tag, along with DST details. This document is based upon those discussions and their follow-up from October 2010 through May 2011.
Update to the Kolab Format
All Kolab object types hold datetime in the form of creation and modification times. Several other object types also hold other datetime fields. This KEP describes the canonical format for all datetime fields across all Kolab object types. This will ensure consistency and is part of a changeset applied against version 1.0 of all object types of the Kolab Format 2.0.
Change of type: datetime
Kolab ISO8601 Profile
Based on RFC 3339, the Kolab Groupware Solution specific profile of the ISO 8601 standard for representation of dates and times using the Gregorian calendar, expressed in Augmented Backus-Naur Form (ABNF), is as follows:
date-fullyear = 4DIGIT
date-month = 2DIGIT ; 01-12
date-mday = 2DIGIT ; 01-28, 01-29, 01-30, 01-31 based on
 ; month/year
time-hour = 2DIGIT ; 00-23
time-minute = 2DIGIT ; 00-59
time-second = 2DIGIT ; 00-58, 00-59, 00-60 based on leap second
 ; rules
time-secfrac = "." 1*DIGIT
time-in-utc = "Z"

partial-time = time-hour ":" time-minute ":" time-second [time-secfrac]
full-date = date-fullyear "-" date-month "-" date-mday
full-time = partial-time [time-in-utc]

date-time = full-date "T" full-time
The "Z" to specify a time in UTC MUST be used for times in UTC and MUST NOT be added to values in any time zone other than UTC.
Per ABNF, ISO8601 and RFC3339, the "T" and "Z" characters in this syntax are explicitly defined as the upper-case letters; use of the lower-case letters is explicitly forbidden.
Valid date-time fields according to the above definition are
2010-01-31T11:27:21Z
2005-12-19T02:55:23.437689098765
2001-06-19T11:01:23
2005-12-19T02:55:23.43Z
2011-05-01
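As an illustration only, the profile can be approximated with a regular expression. The following is a minimal sketch, not part of this KEP: the function name `matches_kolab_profile` is hypothetical, the check accepts a bare full-date as well as a date-time (as the examples above do), and it does not verify day ranges per month or leap second rules.

```python
import re

# Patterns mirroring the ABNF rules; month/year and leap second dependent
# range checks are intentionally left out of this sketch.
_DATE = r"[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])"
_TIME = r"([01][0-9]|2[0-3]):[0-5][0-9]:([0-5][0-9]|60)(\.[0-9]+)?Z?"
_PROFILE = re.compile(_DATE + "(T" + _TIME + ")?")


def matches_kolab_profile(value: str) -> bool:
    """Return True if value is a full-date or a date-time in the profile."""
    # The pattern is case-sensitive, so lower-case "t" and "z" are rejected,
    # and numeric UTC offsets such as "+01:00" do not match at all.
    return _PROFILE.fullmatch(value) is not None
```

A client could run such a check before attempting to interpret a value, and fall back to loose parsing or a warning when it fails.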
Kolab Format Datetime Type
A field of "datetime" type MUST be compliant with the Kolab ISO8601 profile.
It MAY have one additional "tz" attribute. The value of the "tz" attribute is a string, which MUST be either 'UTC' (all caps) -- OR -- a geographical time zone identifier in the uniform naming convention designed by Paul Eggert, specifying a time zone from the Olson database, a.k.a. tz database, a.k.a. zoneinfo database.
Where the definition of a field explicitly specifies storage in UTC only, the "tz" attribute MUST NOT be used.
Where the definition of a field explicitly allows storage in local time, the "tz" attribute MUST be used in all cases, including for storage of UTC.
Kolab Format Date Type
The Kolab Date Type is defined as full-date.
Where a field explicitly refers to a full-date ONLY, the "tz" attribute MUST be used in all cases, including for storage of UTC.
Valid date-time fields according to the above definition are
<field>2010-01-31T11:27:21Z</field>
<field tz="Europe/Berlin">2005-12-19T02:55:23.437689098765</field>
<field tz="America/Sao_Paulo">2001-06-19T11:01:23</field>
<field tz="UTC">2005-12-19T02:55:23.43Z</field>
<field tz="America/Los_Angeles">2010-05-01</field>
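To make the reading side concrete, here is a minimal sketch of how a client might turn such a field into a time value. It is an assumption-laden illustration rather than a reference parser: `parse_field` is a hypothetical helper, fractions of seconds are not handled, and a missing "tz" attribute is treated as UTC.

```python
import xml.etree.ElementTree as ET
from datetime import date, datetime, timezone
from zoneinfo import ZoneInfo


def parse_field(xml_text: str):
    """Return a date, or a time-zone-aware datetime, for one field element."""
    element = ET.fromstring(xml_text)
    text = (element.text or "").strip()
    tz_name = element.get("tz")

    if "T" not in text:
        # Date-only value (full-date): no time of day to attach a zone to.
        return date.fromisoformat(text)

    in_utc = text.endswith("Z")
    if in_utc and tz_name not in (None, "UTC"):
        raise ValueError("'Z' must not be combined with a non-UTC tz attribute")

    # Assumption of this sketch: no fractions of seconds in the value.
    naive = datetime.strptime(text.rstrip("Z"), "%Y-%m-%dT%H:%M:%S")
    if in_utc or tz_name in (None, "UTC"):
        # A missing tz attribute is treated as UTC in this sketch.
        return naive.replace(tzinfo=timezone.utc)
    return naive.replace(tzinfo=ZoneInfo(tz_name))
```

For example, `parse_field('<field tz="America/Sao_Paulo">2001-06-19T11:01:23</field>')` yields 11:01:23 as wall-clock time in America/Sao_Paulo, with the offset looked up in the local time zone database.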
Kolab Date and Datetime Usage
- Clients MUST store all date and/or datetime fields not based on user interaction/fields that are automatically generated -- AND -- carry values in the past or present in UTC only. This explicitly includes the following fields: 'creation-date', 'last-modification-date' of all Kolab object types.
- Clients MUST store all date and/or datetime fields based on direct user interaction -- OR -- fields that may carry values which are NOT limited to UTC only in local time using the "tz" attribute. This explicitly includes the following fields: 'start-date', 'end-date' of all Kolab object types.
- Clients MUST NOT use fractions of seconds (time-secfrac) for any datetime field in any Kolab object type unless the definition of that field and object specifically permits or requires time-secfrac; such a definition MUST always specify the maximum number of digits. Fractions of seconds MUST NOT be used for any field in object types: 'note', 'contact', 'distribution-list', 'journal', 'event', 'task'.
- Clients MUST be capable of reading date and datetime fields that comply with the writing rules of this KEP and subsequent definitions of the Kolab object types they process.
- Clients MUST preserve user preference and selection in the "tz" attribute to the maximum extent possible.
- Clients SHOULD check if a new update of the Olson database or the authoritative database used by the system is available and get that update at least once every three months, OR suggest update policies for their respective operating systems that ensure the time zone database gets updated regularly. As far as is currently known, all commonly used and supported GNU/Linux distributions do this already.
- Clients MAY support loose parsing according to the superset provided by ISO 8601.
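The writing rules above could be sketched as two small helpers, one for UTC-only fields and one for fields stored in local time. Both function names are hypothetical, and the sketch assumes the client holds its values as time-zone-aware Python datetimes using `zoneinfo`.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def format_utc_field(moment: datetime) -> str:
    """Text for UTC-only fields such as 'creation-date' (no tz attribute)."""
    if moment.tzinfo is None:
        raise ValueError("expected a time-zone-aware datetime")
    # Fractions of seconds are dropped, as required for the common object types.
    utc = moment.astimezone(timezone.utc).replace(microsecond=0)
    return utc.strftime("%Y-%m-%dT%H:%M:%SZ")


def format_local_field(moment: datetime) -> tuple[str, str]:
    """(text, tz attribute value) for fields such as 'start-date'."""
    zone = moment.tzinfo
    if not isinstance(zone, ZoneInfo):
        raise ValueError("expected a datetime carrying a ZoneInfo time zone")
    # The wall-clock time is written as entered; "Z" is only added for UTC.
    suffix = "Z" if zone.key == "UTC" else ""
    text = moment.replace(microsecond=0).strftime("%Y-%m-%dT%H:%M:%S") + suffix
    return text, zone.key
```

The second helper deliberately performs no conversion: the time entered by the user and the zone identifier are stored side by side.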
Canonical client behaviour
- When creating a new object with time zone sensitive fields, clients SHOULD default to the local time zone of the user, but SHOULD allow the user to select the time zone for storage and consequently recurrence calculation;
- When modifying existing objects, clients MUST use the value of the 'tz' attribute of the respective fields to set the default/preselected value for the editing of the fields, where applicable. For instance the 'start-date' and 'end-date' time zone defaults if presented to the user by the client MUST match those stored in the 'tz' attribute. The time zone stored in the 'tz' attribute SHOULD only be changed based upon user interaction;
- When calculating recurrences, a client MUST calculate in a way that keeps the event at the same local time in the time zone stored in the 'tz' attribute. Clients MUST then use the result as the time from which to calculate the time of the event at the client's time zone. For more details see Notes for client implementors below;
- When receiving iTip invitations, a client MUST treat the time zone id in the VTIMEZONE object as authoritative and, if it is not a valid Olson database time zone identifier, translate it using the tzid mapping table provided by the Kolab community. If the time zone id in the VTIMEZONE element does not exist in the tzid mapping table, clients MAY attempt to map the time zone based on its rules to a currently used time zone -- AND/OR -- allow the user to select an appropriate time zone for storing an event;
- For recurrence calculation: When tz is specified as 'UTC', a client MUST calculate recurrences strictly according to UTC;
- For recurrence calculation: Where tz is missing although the Kolab Format required it, a client SHOULD calculate recurrences strictly according to UTC;
- When clients encounter deviations from the schema, e.g. parsing datetime objects that do not match the writing conventions, or a missing 'tz' attribute for start-date or end-date in an event using a version of the Kolab Format based on this KEP, clients SHOULD inform the user of a potential issue, using the 'product-id' to help the user identify clients that might be broken. There will likely be an explicit KEP on this issue at a later point in time. This mechanism MAY also be used for the update strategy, see below.
Examples of valid fields using datetime structures according to the above definition are
<start-date tz="Europe/Rome">2011-05-01</start-date>
<start-date tz="UTC">2010-01-31T11:27:21Z</start-date>
<start-date tz="America/Los_Angeles">2005-12-19T02:55:23</start-date>
<start-date tz="Europe/Brussels">2001-06-19T11:01:23</start-date>
<last-modified>2011-04-01T01:02:33Z</last-modified>
<hypothetical-high-precision-timestamp tz="America/Sao_Paulo">2001-06-19T16:39:57.1229853</hypothetical-high-precision-timestamp>
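The recurrence rule in the canonical client behaviour (keep the local time in the stored zone, then translate for display) might look as follows for a simple weekly recurrence. This is a sketch under assumptions: `weekly_occurrences` is a hypothetical helper, only weekly intervals are covered, and ambiguous or non-existent local times around a DST switch are not treated specially.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo


def weekly_occurrences(start_local: datetime, tz: str, count: int, display_tz: str):
    """Yield (time in the event's zone, time in the display zone) pairs."""
    event_zone = ZoneInfo(tz)
    display_zone = ZoneInfo(display_tz)
    for week in range(count):
        # Step the naive wall-clock time first, so the local time in the
        # event's zone stays the same across DST transitions.
        wall = start_local + timedelta(weeks=week)
        in_event_zone = wall.replace(tzinfo=event_zone)
        # Only then translate the result into the viewing client's zone.
        yield in_event_zone, in_event_zone.astimezone(display_zone)
```

With a start of 2011-03-23T11:00:00 and tz="Europe/Berlin", the occurrences before and after the European DST switch both stay at 11:00 in Berlin, while a client displaying in UTC sees 10:00 and then 09:00.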
The previous date time type that was used by the Kolab Formats up to and including version 2.0 based on strict UTC Zulu notation continues to be the authoritative form of writing all the automatically/software generated fields which carry information that is in the past or present. So all old data can be parsed by new parsers, and old clients will continue to understand at least some of the fields written according to future Kolab Formats based upon this KEP.
There will NOT be backwards compatibility for all types of newer objects, however. While in principle all clients should already preserve tags and attributes they do not understand, not all older clients properly guarantee this at the current point in time. So the newly introduced 'tz' attribute would be in peril of being stripped out by older clients. Older clients are also likely to falsely interpret data written by newer clients, potentially corrupting it upon write.
And finally it is unavoidable that older clients will continue to behave as they did thus far, continuing to display some recurrence times incorrectly.
Smart Upgrade Option
When encountering data written to the Kolab Format prior to this KEP, clients MAY choose to
- continue to display the old data sets as the old clients did, leaving the old data unchanged;
- bring up a dialogue informing the user of a change of Kolab Format version and suggesting an update which should usually be carried out in the respective editing dialogue (especially for events) so users can provide the data that was absent in the old format;
- silently update the data to the newer Kolab Format, based on the assumptions that older clients made when interpreting this data, thus maximally preserving client behavior.
The first option is the recommended approach for non-recurring events, the second is the recommended approach for recurring events. The third option should only be used with extreme caution and ideally some explicit user interaction for the entire process, e.g. an "update wizard".
As all of these approaches include additional work for client implementers, none of these are required.
In any case it is NOT recommended to ever have older and newer clients coexist on a shared set of data, and client implementers should seek to implement advice to this extent for their users.
Notes for client implementors
For events and tasks, there are two existing use cases this KEP addresses:
- Store user preference
- A user typically has selected a time zone to enter a date/time, either implicitly or explicitly. When storing without timezone information, that information is lost. So while the user might realistically expect the event to preserve the time zone they entered initially when editing it again, Kolab was unable to provide this functionality thus far.
- Event time calculation
- Other than for the storage of user preference, recurrences are the most important but not the only use case for this Kolab Enhancement Proposal.
- Where the 'tz' attribute is explicitly set to 'UTC' or missing (in which case clients SHOULD also issue a warning due to an incorrect data format), clients MUST implement strict mapping to UTC. Otherwise the time zone stored in the 'tz' attribute MUST be considered authoritative, and the value of the time stored in the datetime field MUST be considered local to this time zone;
- Where the 'tz' attribute is explicitly set to something other than 'UTC', clients MUST use a time zone database (e.g. Olson, a.k.a. tzdata) to calculate the event;
- For recurring events, the same methodology MUST be used each time the event occurs;
- This resulting time can then be displayed (if the same as local time zone), be translated into the local time zone for display, or be translated to UTC for UTC based use cases.
- tzid mapping table
- There should be a mapping between time zone identifiers from and to the Olson database locations for systems not based on the Olson time zone identifiers, e.g. Microsoft Windows. In order to achieve that, Kolab Systems will work with the various client implementers to provide a canonical mapping table that all clients without Olson database can use for such mapping on their systems against the Olson database locations. This tzid mapping table will be provided publicly under a Free Software license.
- Invitation handling
- The tzid mapping table should also be used for handling iTip invitations, which carry an arbitrary number of VTIMEZONE objects. When the id of the VTIMEZONE object is unknown, clients can fall back on automatic detection against the local database and/or user choice when storing the event. Clients SHOULD offer users the possibility to send an email with the VTIMEZONE object to email@example.com so the ID can be included into the next revision of the tzid mapping table.
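A lookup against the tzid mapping table could be sketched as below. The table contents shown here are a hypothetical excerpt in the style of the CLDR Windows mapping, not the table this KEP refers to, and `resolve_tzid` is an illustrative name.

```python
from typing import Optional
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

# Hypothetical excerpt for illustration; the authoritative table is the one
# to be published by the Kolab community.
TZID_MAP = {
    "W. Europe Standard Time": "Europe/Berlin",
    "Pacific Standard Time": "America/Los_Angeles",
    "E. South America Standard Time": "America/Sao_Paulo",
}


def resolve_tzid(tzid: str) -> Optional[str]:
    """Return an Olson identifier for a VTIMEZONE id, or None if unknown."""
    try:
        ZoneInfo(tzid)  # already a valid Olson identifier on this system?
        return tzid
    except (ZoneInfoNotFoundError, ValueError, OSError):
        pass
    # Fall back to the mapping table; None signals that automatic detection
    # or a user choice is needed.
    return TZID_MAP.get(tzid)
```

A `None` result corresponds to the case in which the client falls back on automatic detection against the local database and/or asks the user.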
Description of the issue
Relationship between UTC and local times
The functions to convert between UTC and local times are more complex than one might naively suspect.
The reason for this lies in DST necessarily being implemented as a dynamic offset on UTC that is different from the offset on UTC that serves as standard time. This creates the well-known effect that in one direction the hour from 2am to 3am exists twice, while in the other direction it is missing. But the effects of this ambiguity are not limited to that one hour which is typically placed between Saturday and Sunday. As a result, the relationship between UTC, which computers use, and local time, which people experience, is ambiguous if the DST rules are not known.
The same mapping difficulty also exists in the other direction. An offset on UTC loosely correlates to a longitude, but on each longitude there are typically several countries with different DST regimes, switching at different times, in different directions (depending on the hemisphere), or not at all. So an offset to UTC does not correlate to a single time zone; it correlates to a group of time zones, with or without DST.
The function to translate between UTC and local time therefore has more input parameters than just the time in UTC or local time.
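The point that an offset identifies a group of time zones rather than one zone can be checked against a time zone database. The sketch below compares two zones chosen for illustration (Europe/Berlin and Africa/Lagos); the dates and the helper name `utc_offset_hours` are assumptions of this example.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def utc_offset_hours(zone: str, moment_utc: datetime) -> float:
    """UTC offset, in hours, that a zone applies at a given instant."""
    local = moment_utc.astimezone(ZoneInfo(zone))
    return local.utcoffset().total_seconds() / 3600


winter = datetime(2011, 1, 15, 12, 0, tzinfo=timezone.utc)
summer = datetime(2011, 7, 15, 12, 0, tzinfo=timezone.utc)

# Both zones share UTC+1 in January, but only one of them switches to DST.
for zone in ("Europe/Berlin", "Africa/Lagos"):
    print(zone, utc_offset_hours(zone, winter), utc_offset_hours(zone, summer))
```

A value stored merely as "UTC+1" in January would therefore not tell a client which of the two offsets to apply in July.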
It is furthermore subject to change due to a variety of effects:
- Time of switching between standard time / DST: The dates of when a region switches from standard time to DST and when it switches back are set by a political process, and occasionally even changed on short notice, e.g. Turkey in 2011.
- The amount of switching: Most regions switch by an hour between standard time and DST, but this is not a given. Currently only "Australia/Lord_Howe" is switching by only 30 minutes. There is no guarantee it will stay the only one.
- The existence of DST switching: Many regions routinely discuss getting rid of DST, some countries might in fact do this, as demonstrated in the next point.
- The UTC offset of standard time: The offset to UTC for standard time is also not guaranteed to remain stable, e.g. Russia has plans to abolish DST entirely, and switch standard time over to what was previously DST.
- What's stable? The geography of the planet and some of its geographical markers are significantly more stable over time than time zone rules. Examples are major cities, which may change their name, but less often their position.
Resolving the ambiguities between UTC and local time(s)
Resolving the ambiguity always requires knowledge about the time zone, for which geographical identifiers are the most reliable and secure approach of identifying them (also see below). But because DST rules are subject to change, a client can only have certainty about the translation between UTC and local time with some security for its current date and the past, and only if the client has been kept sufficiently up to date.
In consequence, the probability of a correct translation reaches certainty only for some time in the past, is very high for the present, and then decreases quickly for the future.
Avoiding implicit assumptions
This lack of certainty translates into necessary assumptions when trying to store datetime fields in UTC when local time is intended. The assumption typically made is that future DST rules are not going to change from what they are today, since the client could not know about such changes anyway. The client typically makes this assumption implicitly when it uses the system functions to convert the time entered by the user into UTC for storage; these functions rely on the system's database, which for GNU/Linux is based on the Olson database.
Because a client cannot know whether DST rules are going to change between the time an event is scheduled and the date for which it is scheduled, this conversion is based on an assumption, which can be proven wrong later by socio-political changes (see above).
Furthermore, when this event is displayed later, there is no way for the client to know whether that initial assumption held true.
Because the original selection of the user is otherwise lost in the conversion to UTC, or comes with a substantial amount of meta data, the easiest approach to restore the initial user's intent is storing it directly, as the local time entered by the user.
Current client behaviour
The following are two examples of existing client behaviour prior to this KEP which demonstrate the issue.
Single, Non-Recurring Events
The exception to the above is when the event has been placed sufficiently far into the future and lives within the weeks of DST regime flexibility.
Then the following can occur:
- Person A stores event for 11:00 Europe/Berlin three years into the future.
- The client looks up DST rules known at this time, correctly concludes this is still standard time, and stores 10:00 UTC.
- One year later, DST rules get changed, and propagated through the typical channels to all platforms.
- Two years later, the client looks up the event, knows that DST is in effect, and correctly translates 10:00 UTC to 12:00 local time in Europe/Berlin.
So in effect, the event which was set for 11:00 Europe/Berlin is now incorrectly displayed at 12:00 Europe/Berlin due to the time zone change. So correct behaviour on all sides can lead to incorrect results due to this ambiguity between UTC and local time.
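The sequence above can be replayed with fixed offsets standing in for the two rule sets. This is a simplified sketch: the date is hypothetical, and the two `timezone` objects merely simulate what a time zone database would report for Europe/Berlin before and after the rule change.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical offsets for Europe/Berlin on the day of the event:
RULES_WHEN_STORED = timezone(timedelta(hours=1))  # standard time expected
RULES_WHEN_SHOWN = timezone(timedelta(hours=2))   # rules changed, DST applies

intended_local = datetime(2014, 3, 20, 11, 0)  # hypothetical date, 11:00 wall clock

# UTC-only storage: the conversion bakes in the rules known at storage time.
stored_utc = intended_local.replace(tzinfo=RULES_WHEN_STORED).astimezone(timezone.utc)
shown_from_utc = stored_utc.astimezone(RULES_WHEN_SHOWN)

# Local time plus zone reference: the wall-clock time is kept as entered and
# only interpreted with the rules known at display time.
shown_from_local = intended_local.replace(tzinfo=RULES_WHEN_SHOWN)

print(stored_utc.strftime("%H:%M"), shown_from_utc.strftime("%H:%M"),
      shown_from_local.strftime("%H:%M"))  # prints "10:00 12:00 11:00"
```

Storing the local time together with the zone identifier keeps the event at 11:00, whichever rules turn out to apply.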
This problem also exists in recurring events, and affects them more often as their instances typically need to be calculated both within standard time and DST throughout the year, and often extend far enough to see changes in DST regimes.
Existing clients currently make the implicit assumption that the time was specified in and should be calculated against the local time zone of the client itself. This will lead to issues when a user is changing time zones, or when participants in multiple time zones are concerned. This behaviour could be confirmed with both Kontact and the Kolab Web Client Horde.
A weekly meeting is set for 11:00 every Wednesday in Zurich, Switzerland, starting on 23 June 2010. This gets translated and stored in UTC as 2010-02-17T09:00:00Z. On Wednesday 17 February 2010 Switzerland is using standard time; the local time zone is therefore UTC+1. If strictly interpreting the stored information, the meeting should now start at 10:00.
Versions of KDE Kontact <=4.6.3 however display the meeting as scheduled for 11:00. The same is true for the Kolab web client based on Horde for versions of Kolab Server <= 2.2.4. This however is equivalent to 10:00 UTC. When adding another user in Sao Paulo, Brazil to the equation, the event is shown as taking place at 06:00 local time, or 08:00 UTC, due to the Brazilian DST with an offset of UTC-3 that went into the assumption for the calculation of the recurrence. The result is that two users, while being presented with a data set that looks consistent, will miss each other.
Which other clients exhibit the same behavior is unclear, but it seems there is no reasonable assumption that current behavior correctly models any rational use case.
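The divergence between the two clients can be reproduced with a small simulation of the behaviour described above, in which each client repeats the wall-clock time of its own zone. The sketch assumes the first occurrence on 23 June 2010 was stored as 09:00 UTC on that date, and picks 16 February 2011 as a later winter Wednesday; both dates and the function name are assumptions of this example, and the result depends on historical DST data being present in the local time zone database.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

stored_utc = datetime(2010, 6, 23, 9, 0, tzinfo=timezone.utc)  # 11:00 in Zurich
later_wednesday = (2011, 2, 16)  # assumed later occurrence, for illustration


def naive_client_occurrence(client_zone: str) -> datetime:
    """Repeat the client's own wall-clock time, as the older clients did."""
    zone = ZoneInfo(client_zone)
    first_local = stored_utc.astimezone(zone)
    repeated = datetime(*later_wednesday, first_local.hour, first_local.minute,
                        tzinfo=zone)
    return repeated.astimezone(timezone.utc)


zurich = naive_client_occurrence("Europe/Zurich")
sao_paulo = naive_client_occurrence("America/Sao_Paulo")
print(zurich.strftime("%H:%M"), sao_paulo.strftime("%H:%M"))  # "10:00 08:00"
```

Both clients show a plausible local time (11:00 and 06:00 respectively), yet the instants they refer to are two hours apart.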
Background notes on design decisions and backwards compatibility
This section provides summaries of discussions around this KEP and the rationale that went into the design decisions above. It is primarily intended for the purpose of documenting at least part of the thought process and can be safely ignored by client implementers. As a general note it should be said this is a complex issue, with no silver bullet or one solution that is so clearly better than all the others that everyone just has to agree - the problem can be resolved in separate ways, each bringing their own advantages and disadvantages, and ultimately a call of judgment and preference that had to be made.
UTC vs local time
As demonstrated, storage in UTC without additional information is an oversimplification and does not solve the complex issues provided by documented existing use cases. For UTC storage to be able to address the situation, the bidirectional ambiguities between UTC and local time(s) need to be resolved. To achieve this, clients would have to store and interpret additional information besides date time and time zone to resolve this ambiguity and re-calculate the original user's intent.
To address all the known issues, this information would have to include the entire time zone information. To address the most common issues, it would at least have to include the assumption made regarding whether or not DST would be in effect. Checking that stored data against the known rules at the time of interpretation would have to be undertaken on many objects and virtually all calendar items for most operations before times could be relied upon and used for further calculation, such as free/busy listing and so on. So while possible, it would require additional logic, which due to the complexity of the issue is prone to errors.
Ultimately this approach would lead towards storing time zone data in the objects themselves in order to achieve consistency and security to preserve the intent of the user. This brings several pitfalls because time zone data is subject to change, in particular:
- Clients may not be allowed to update, e.g. for shared calendars with only read permission;
- Clients that "know" data to be outdated but cannot update the object may in fact prefer to present the user a correct event, rather than displaying a wrong event consistently. This behavior has been observed in iCalendar clients, which have storage of both TZ-ID & in-format TZ-data;
- No good indicators exist when in-format TZ-data should be updated. Without these a client with the old timezone data would feel the urge to update the object again, in effect rolling the update back;
- Substantial size would be added to the event; VTIMEZONE definitions can often be larger than the data for the event itself. If more than one timezone is used, each timezone has to be within the object. As many appointments will be in the same timezone, the data within a folder or a Kolab account and server will be highly redundant;
- Once a client notices the need for an update, many objects would be affected which would lead to the need of a lot of data transfer and backup space to update the redundant data.
The most substantial argument is the problem of finding a good algorithm that will determine when an update of the time zone data stored in the objects should be done. As there is currently no serial number of authoritative data attached to this information, such a mechanism would have to be invented and would further complicate implementation of this KEP.
Therefore the principle of storing local time (understanding UTC as one possible time zone), along the lines of "store what you mean", has come out as the preferred option for most participants in the discussions, and all participants could accept this approach.
Usage of the Olson database
Because of the disadvantages for in-format storage of time zone data, the preference of the majority was to store only a reference as geographical identifier, which was understood to be the most robust form of storing time zone references that can adapt to virtually all possible changes in time zones. When going down this route, agreeing on a limited set of time zone identifiers that is nonetheless complete makes it substantially easier on client implementers.
The Olson database is specifically mentioned in the VTIMEZONE section of RFC 2445 as a good reference base for globally canonical time zone IDs and to our knowledge comes closest to an actual global standard for such time zone IDs. The Olson time zone data is also reportedly the most widely used data source, in particular by:
- BSD-derived systems, including FreeBSD, NetBSD, OpenBSD, DragonFly BSD, and Mac OS X;
- the GNU C Library and systems that use it, including GNU, most Linux distributions, BeOS, Haiku, Nexenta OS, and Cygwin;
- the Java Runtime Environment since release 1.4 (2002);
- the Perl modules DateTime::TimeZone and DateTime::LeapSecond since 2003;
- Oracle releases since 10g (2004);
- PHP releases since 5.1.0 (2005);
- PostgreSQL since release 8.0 (2005);
- System V Release 4-derived systems, such as Solaris and UnixWare;
- AIX includes zoneinfo but does not use it itself, as it is for support of third-party applications like MySQL;
- several other Unix systems, including Tru64, and UNICOS/mp (also IRIX, still maintained but no longer shipped);
- Python via the pytz module.
Choosing the Olson time zone identifiers will simplify matters for all clients on the above platforms or using the above programming languages.
The issue of most concern is that Microsoft Windows has its own time zone identification system. A bridge is however provided by the Unicode Common Locale Data Repository (CLDR) as well as the International Components for Unicode (ICU). The CLDR also provides mapping for Microsoft Windows time zone IDs to the standard Olson names. So using references to the Olson timezone database is likely the best out of an imperfect set of choices.
In order to support clients on non-Olson platforms, as well as all clients in their iTIP handling, Kolab Systems shall work with all client implementors to maintain and continue to make freely and publicly available a tzid mapping table to match various time zone ids to Olson database locations.
iTip / VTIMEZONE completeness
By choosing the approach of reference to Olson time zone IDs, Kolab Clients will not be able to easily implement a rarely used aspect of iTIP invitation.
This is because iTip requires a client to parse multiple arbitrary time zone definitions and their VTIMEZONE data as part of an iTip invitation, and handle them correctly. If a Kolab client receives an invitation with an unknown time zone identifier that cannot be mapped to an Olson time zone ID, the client may not be capable of handling the invitation at all. We expect the first implementation of this KEP to correctly handle all Olson time zone IDs, which are the recommended source of IDs for VTIMEZONE (see above), and all common Windows time zone identifiers. Despite our research into the matter, we could not find a use case for fictional time zones made up as part of an iTip invitation.
We also discovered that other clients do not seem to implement iTip completely either, e.g. Microsoft Exchange/Outlook only supports one VTIMEZONE object.
So full iTip compliance seems rare, if it exists at all, and iCalendar & iTip both suffer from the weaknesses of in-format storage, as well as a lack of update rules and possibilities, which leads to inconsistencies in the way the data is handled between clients when the client knows the in-format data to be outdated. By strictly using time zone identifiers and references only, translating where necessary, we expect Kolab to provide a more consistent and robust user experience that comes closer to a user's expectation than other iTip implementations which stick to VTIMEZONE storage.
In any case the trade-off seemed an acceptable design decision to make.
The design decisions outlined above are also considered a good match for the future because we expect major progress from the RFC draft Timezone Service Protocol. Once it has matured further and sees implementation, it is likely to provide accurate data with mapping of aliases to DST data without local databases, thus resolving the primary weakness of the database approach. Given how widespread the Olson timezone database is today, the protocol will likely support its aliases.
- KEP:1
- Georg Greve, http://kolab.org/pipermail/kolab-format/2010-October/001004.html
- Henrik Helwich, http://kolab.org/pipermail/kolab-format/2010-October/000999.html
- RFC 3339: Date and Time on the Internet: Timestamps
- ISO 8601: https://secure.wikimedia.org/wikipedia/en/wiki/ISO_8601
- Wikipedia: Zoneinfo
- RFC 2445: Internet Calendaring and Scheduling Core Object Specification (iCalendar)
- Wikipedia: Zoneinfo
- RFC 2446: iCalendar Transport-Independent Interoperability Protocol (iTIP)
- "[MS-OXCICAL]: iCalendar to Appointment Object Conversion Protocol Specification" V 5.0 of 18 March 2011, http://msdn.microsoft.com/en-us/library/cc463911(EXCHG.80).aspx, Footnote 44 in Section 18.104.22.168.1.19 & Footnote 46 in Section 22.214.171.124.1.19
- Douglass, http://tools.ietf.org/html/draft-douglass-timezone-service-00
This document has been placed in the public domain.