PIM-Item Relations
From Kolab Wiki
Abstract
To enable modeling of relations between pim objects, as well as the parent structures a pim object belongs to, this document outlines a "Relation" property. It also outlines three relation types, each with a corresponding concept, which are believed to be more useful than simple plaintext tags. The "Relation" property is not restricted to these three concepts and can be extended with more concepts as required. Note that a relation is merely a suggestion of the structure or context in which an item should be displayed to the user.
The aim is to make it possible, for example, to:
- Embed a note within a todo hierarchy (so the system works across different object types)
- Embed a pim object within a hierarchy of plaintext tags, while retaining the possibility to rename any of those tags without breaking the structure.
Relation
A relation tree can be stored using child relations or parent relations. While child relations are more practical when storing a complete tree, parent relations are easier to handle when transporting the relation tree with each item (each item stores its parents). This would also allow for adding a "Relation-Kolab-Object", which transports a complete relation tree.
Child relation tree
child-node = element child-node {
  element object {UID} |
  element uid {UID},
  element name {String} ?,
  child-node *
}
child-relation = element relation {
  element type {String},
  element timestamp {UTC-DateTime},
  child-node *
}
A relation consists of a root relation element and a set of nodes. The relation element specifies the type of relation for the contained tree. The nodes build the relation tree within which the object concerned is located. The object can be located anywhere within the tree (also in multiple locations) and is identified by the object element containing the UID of the object concerned. The object element MUST only be used if the UID refers to an existing pim object. If the uid element is used, the contained UID MAY be matched against an existing object according to the rules of the relation type, or the node may be a virtual node defined only by its name and uid.
- relation: The root element of the relation
- type: The type of the relation-tree
- node: a single node in the tree
- uid: the uid of a node, this uid may be used to match other objects according to the rules of the relation-type. The uid SHALL be globally unique.
- name: A name of the node which SHALL be used as long as there is nothing else available.
- timestamp: The last modified date, which SHALL be used in case there are no other means for conflict resolution.
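The following is a minimal sketch of how a client might locate the object concerned inside a child relation tree. It assumes the `<node>` element name used in the examples of this document (the schema above names the element `child-node`), and the helper name `object_paths` is purely illustrative.

```python
import xml.etree.ElementTree as ET

# Sample relation, using the <node> element name from this document's examples.
RELATION_XML = """
<relation>
  <node>
    <uid>uid2</uid>
    <name>Linux</name>
    <node><object>uid4</object></node>
    <node>
      <uid>uid3</uid>
      <name>Development</name>
      <node><object>uid4</object></node>
    </node>
  </node>
  <type>Topic</type>
</relation>
"""


def object_paths(relation, object_uid):
    """Return every chain of (uid, name) ancestors that leads to object_uid."""
    paths = []

    def walk(node, ancestors):
        obj = node.find("object")
        if obj is not None and obj.text == object_uid:
            paths.append(list(ancestors))
        uid = node.findtext("uid")
        if uid is not None:
            # A uid node becomes an ancestor of everything nested below it.
            ancestors = ancestors + [(uid, node.findtext("name"))]
        for child in node.findall("node"):
            walk(child, ancestors)

    for top in relation.findall("node"):
        walk(top, [])
    return paths


relation = ET.fromstring(RELATION_XML)
paths = object_paths(relation, "uid4")
```

Because the object may appear in several places, the helper returns a list of paths rather than a single one.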
Parent relation tree
parent-node = element parent-node {
  element uid {UID},
  element name {String} ?,
  parent-node *
}
parent-relation = element relation {
  element type {String},
  element timestamp {UTC-DateTime},
  parent-node *
}
No matching of the object is required, as each object only ever specifies its own parents (that is, the relevant branch of the tree).
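As an illustration, a parent relation could be serialized along these lines. This is only a sketch: it assumes that each nested parent-node is the parent of the enclosing one, and that the timestamp is written as an ISO-style UTC string. The function name `build_parent_relation` and the sample date are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def build_parent_relation(rel_type, ancestors, modified):
    """Build a <relation> element from a chain of (uid, name) pairs.

    ancestors is ordered from the direct parent up to the root.
    Assumption: a nested parent-node is the parent of its enclosing node.
    """
    relation = ET.Element("relation")
    ET.SubElement(relation, "type").text = rel_type
    # Assumption: UTC-DateTime is rendered as e.g. 2012-04-01T12:00:00Z.
    ET.SubElement(relation, "timestamp").text = modified.strftime(
        "%Y-%m-%dT%H:%M:%SZ"
    )
    holder = relation
    for uid, name in ancestors:
        node = ET.SubElement(holder, "parent-node")
        ET.SubElement(node, "uid").text = uid
        if name is not None:
            ET.SubElement(node, "name").text = name
        holder = node
    return relation


rel = build_parent_relation(
    "Project",
    [("uid1", "Conquer the World"), ("uid3", "Conquer the Universe")],
    datetime(2012, 4, 1, 12, 0, 0, tzinfo=timezone.utc),
)
xml_text = ET.tostring(rel, encoding="unicode")
```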
Initially defined types
The types can be freely defined and may carry further rules (e.g. on node matching), as well as a corresponding concept of how they should be used. The following types are initially defined, following the concepts described in the Concepts section.
Topics
A simple text tag.
- Items MAY belong to several topics.
Project
Projects should be matched against Todos. If a todo has the same uid as the project, that means that the todo describes the project and the name of the todo should override the name of the node.
- virtual nodes MUST NOT be used for projects
- the project tree itself is built by the todo related-to mechanism; the relation property is only used to attach notes to todos
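The name override rule for projects can be sketched as follows. The mapping `todos_by_uid` (uid to todo summary) and the function name `display_name` are assumptions made for this illustration only.

```python
def display_name(node_uid, node_name, todos_by_uid):
    """Pick the name to show for a Project node.

    todos_by_uid is a hypothetical mapping of uid -> todo summary.
    """
    summary = todos_by_uid.get(node_uid)
    # The todo describes the project, so its name overrides the node name.
    # Until the todo is available, the name stored in the node is used.
    return summary if summary is not None else node_name


todos_by_uid = {"uid1": "Conquer the World (revised)"}
```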
Context
A simple text tag.
- Items MAY belong to several contexts.
Implementation
xCal/iCalendar based objects
- Projects:
A Project-Node SHALL be a todo object, and the Project hierarchy SHALL be modeled using the related-to property.
- Topic:
Topics are not applicable to calendaring objects.
- Context:
Contexts SHALL be modeled using the x-relation property.
xCard/vCard based objects
Projects, Topics and Contexts SHALL be modeled using the x-relation property.
Note
Projects, Topics and Contexts SHALL be modeled using the relation property.
Examples
Topic tree
A Topic tree for a personal knowledge base. Note that the object appears twice in the same tree:
<relation>
  <node>
    <uid>uid2</uid>
    <name>Linux</name>
    <node>
      <object>uid4</object>
    </node>
    <node>
      <uid>uid3</uid>
      <name>Development</name>
      <node>
        <object>uid4</object>
      </node>
    </node>
  </node>
  <type>Topic</type>
</relation>
- Linux
  - Linked object
  - Development
    - Linked object
Project tree of two items
The first item adds this tree:
<relation>
  <node>
    <uid>uid1</uid>
    <name>Conquer the World</name>
    <node>
      <object>uid2</object>
    </node>
  </node>
  <type>Project</type>
</relation>
- Conquer the World (uid1)
  - The object (uid2)
Note that "The object" could be a todo but just as well a note or any other pim item which can be identified by a uid. However the item with uid1 MUST be a todo object according to the rules of the "Project"-type.
Now the object with uid1 arrives.
<relation>
  <node>
    <uid>uid3</uid>
    <name>Conquer the Universe</name>
    <node>
      <object>uid1</object>
    </node>
  </node>
  <type>Project</type>
</relation>
- Conquer the Universe (uid3)
  - Conquer the World (uid1)
    - The object (uid2)
The new information of the second item has been used to reparent the tree.
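The reparenting step above can be sketched as a merge of parent links. This is a simplified illustration, not a prescribed algorithm: each item's relation is reduced to hypothetical (parent_uid, child_uid) pairs, every node has at most one parent, and the fragments are assumed not to contradict each other.

```python
def merge_fragments(fragments):
    """Merge (parent_uid, child_uid) edges from several items into one map."""
    parent_of = {}
    for edges in fragments:
        for parent, child in edges:
            parent_of[child] = parent
    return parent_of


def path_to_root(parent_of, uid):
    """Follow parent links from uid up to the root (assumes no loops)."""
    path = [uid]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path


fragment_from_uid2 = [("uid1", "uid2")]  # Conquer the World > The object
fragment_from_uid1 = [("uid3", "uid1")]  # Conquer the Universe > Conquer the World
parent_of = merge_fragments([fragment_from_uid2, fragment_from_uid1])
```

After merging both fragments, following the links from uid2 leads through uid1 to uid3, which matches the reparented tree.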
A next item arrives:
<relation>
  <node>
    <uid>uid3</uid>
    <name>Conquer the Universe</name>
    <node>
      <uid>uid1</uid>
      <name>Conquer the World</name>
      <node>
        <object>uid4</object>
      </node>
    </node>
  </node>
  <type>Project</type>
</relation>
- Conquer the Universe (uid3)
  - Conquer the World (uid1)
    - The object (uid2)
    - Another object (uid4)
As a possible conflict scenario, another item arrives:
<relation>
  <node>
    <uid>uid1</uid>
    <name>Conquer the World</name>
    <node>
      <uid>uid3</uid>
      <name>Conquer the Universe</name>
      <node>
        <object>uid4</object>
      </node>
    </node>
  </node>
  <type>Project</type>
</relation>
This directly contradicts the information we already have.
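One way to detect such a contradiction is to check whether a new parent link would close a loop in the tree built so far. The sketch below assumes the known tree is kept as a hypothetical child-to-parent map; the function name `creates_cycle` is illustrative.

```python
def creates_cycle(parent_of, parent, child):
    """Return True if making `parent` the parent of `child` would form a loop."""
    current = parent
    seen = set()
    while current is not None and current not in seen:
        if current == child:
            return True
        seen.add(current)
        current = parent_of.get(current)
    return False


# Tree known so far: Universe (uid3) > World (uid1) > objects uid2 and uid4.
known = {"uid1": "uid3", "uid2": "uid1", "uid4": "uid1"}
```

Here, placing uid3 below uid1 is flagged as a conflict, while attaching a previously unknown object is not.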
Identification
If a uid is set, the uid is used to identify another object. If the uid is left empty, the node may also be matched by name to other nodes of the same type. This allows the system to be used for building simple tag hierarchies. FIXME: that probably shouldn't be allowed, so remove.
Considerations
Existing mechanisms serving a similar purpose
iCalendar/Kolabv2/vCard Categories
Categories are an existing mechanism to establish relations between objects, but they are rather limited: a category is basically only a tag with no concept attached to it. To make them a bit more powerful, KDE started to create hierarchical categories, but since identification is done by name, the hierarchy breaks as soon as one wants to rename or move a category.
Generally speaking, everything that goes beyond plaintext tags using categories is outside the specification (at least the iCalendar and vCard specs) and is therefore highly fragile.
Since categories exist in the official specifications we're using, we still have to support them, but they should be treated as plaintext tags and no more.
iCalendar related-to
Since this is the commonly used mechanism to model a tree of todos, it shall be used instead of a project relation.
Attaching notes vs. existing properties
While the relation property allows attaching a note to basically anything, this mechanism is not meant to replace existing properties such as the vCard note property or the iCalendar description property. The relation property is a suggestion that the items may be related, whereas using the vCard/iCalendar properties expresses a "contained in" relation, which is much stronger.
Conflict resolution
The lastModifiedDate SHALL be used for conflict resolution (newer always wins). This means the lastModifiedDate MUST be stored with *every* node in the tree once extracted from the object. This allows the program building the tree to check, as a new item arrives, whether the provided data is more up to date or not. Note that the lastModifiedDate MUST only be updated when the tree changed; otherwise changes to the tree will be overwritten by older versions of the tree stored in an object that has been recently modified (while the tree itself has not).
If multiple persons modify the same tree in parallel, the most recent version will indeed overwrite everyone else's changes. Alternatively, a manual conflict resolution could be provided if the lastModifiedDates differ and the local tree has been modified (difficult to detect).
Example
There is an existing tree like this:
- Conquer the Universe (uid3)
  - Conquer the World (uid1)
    - The object (uid2)
    - Another object (uid4)
Now we get a new item with conflicting information:
- Conquer the World (uid1)
  - Conquer the Universe (uid3)
    - The new object (uid5)
If the new item has more up-to-date information (its last modified date is more recent), we restructure the whole tree to get this:
- Conquer the World (uid1)
  - The object (uid2)
  - Another object (uid4)
  - Conquer the Universe (uid3)
    - The new object (uid5)
If the information in the new item is not as up to date as what is already in the tree, we simply use only the direct parent information:
- Conquer the Universe (uid3)
  - The new object (uid5)
  - Conquer the World (uid1)
    - The object (uid2)
    - Another object (uid4)
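The two outcomes above can be reproduced with a small "newer wins" merge. This is a sketch under several assumptions: the tree is held as a child-to-parent map, a timestamp is stored per node, plain integers stand in for UTC date-times, and the function name `apply_item` is hypothetical.

```python
def apply_item(tree, stamps, chain, stamp):
    """Merge one item's parent chain into a child -> parent map.

    chain is [item, parent, grandparent, ...]; stamp is the item's
    relation timestamp (integers stand in for UTC date-times here).
    """
    tree, stamps = dict(tree), dict(stamps)
    # The item's own direct parent is always taken from the item itself.
    tree[chain[0]], stamps[chain[0]] = chain[1], stamp
    for child, parent in zip(chain[1:], chain[2:]):
        if stamps.get(child, stamp - 1) >= stamp:
            continue  # stored information is at least as new: keep it
        # Newer wins: drop the older link that would otherwise close a loop.
        node = parent
        while node in tree:
            if tree[node] == child:
                del tree[node]
                break
            node = tree[node]
        tree[child], stamps[child] = parent, stamp
    return tree, stamps


existing = {"uid1": "uid3", "uid2": "uid1", "uid4": "uid1"}
stamps = {uid: 1 for uid in ("uid1", "uid2", "uid3", "uid4")}
chain = ["uid5", "uid3", "uid1"]  # new object > Universe > World

newer_tree, _ = apply_item(existing, stamps, chain, stamp=2)
older_tree, _ = apply_item(existing, stamps, chain, stamp=0)
```

With the newer stamp, uid1 becomes the root and uid3 moves below it. With the older stamp, only the link from uid5 to its direct parent uid3 is added.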
Problems with the approach
Outdated reparenting operations
Reparenting a parent relies heavily on the conflict resolution, so if two separate systems work on the same dataset, a reparenting operation can easily become outdated because:
- System1 reparents a parent node and saves the information to item1
- System2 edits another item, item2, within the same tree before item1 has been synchronized, and saves the "old" tree information
- The tree information of item2 now reverts the changes made to the tree structure because of its more recent timestamp.
Reparenting a parent:
- ParentNode1
  - SubParentNode
    - object1
    - object2
- ParentNode2
Now object1 and object2 have the following parent-relation tree stored:
ParentNode1/SubParentNode
object1 and object2 are synchronized across two different systems.
Now system1 reparents SubParentNode to ParentNode2:
- ParentNode1
- ParentNode2
  - SubParentNode
    - object1
    - object2
But before that information has been synchronized to system2, system2 modifies object2, thus storing the old tree layout again.
So we have on System1:
- object1: ParentNode2/SubParentNode
- object2: ParentNode2/SubParentNode
And on System2:
- object1: ParentNode1/SubParentNode
- object2: ParentNode1/SubParentNode
object1 will merge just fine (because it hasn't been modified in the meantime), but for object2 there is a conflict.
SOLUTION: The conflict can be resolved easily if the lastModifiedDate of the relation property is only updated when the tree has been changed (meaning only explicit changes to the tree result in the tree being overwritten).
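A minimal sketch of this rule is shown below: saving an item keeps the relation timestamp unless the stored parent chain actually differs. The `Relation` class, the `save_item` function and the sample dates are assumptions made for this illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Relation:
    parents: tuple        # chain of parent uids, direct parent first
    timestamp: datetime   # last change of the tree, not of the item


def save_item(relation, new_parents, now):
    """Return the relation to store when the item is saved."""
    new_parents = tuple(new_parents)
    if new_parents == relation.parents:
        # The item changed but the tree did not: keep the old timestamp.
        return relation
    return Relation(new_parents, now)


t0 = datetime(2012, 4, 1, tzinfo=timezone.utc)
t1 = datetime(2012, 4, 2, tzinfo=timezone.utc)
stored = Relation(("SubParentNode", "ParentNode1"), t0)

unchanged = save_item(stored, ["SubParentNode", "ParentNode1"], t1)
moved = save_item(stored, ["SubParentNode", "ParentNode2"], t1)
```

In the scenario above, system2's edit of object2 would leave the relation timestamp at its old value, so system1's explicit reparenting remains the newer tree information.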
Unexpected behavior
If the tree is built over several shared directories, meaning each client may only have a subset of all the data available, this can lead to rather unexpected behavior where the complete tree structure changes only because another shared folder was included (containing updated tree structure information).
Solution
To avoid this problem, it is necessary to do matching only within certain boundaries:
- Only match within a single storage location unless explicitly stated otherwise
- Don't match between shared folders and own folders (that could become rather strange anyways if some of the folders have write access and some not)
- While Notes and Todos may be stored in different locations, it is required to match the Projects in the notes against the todos.
The root tree would be:
- /Resource1
- /Resource1/shared/
- /Resource2
- /Resource2/shared/
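The matching boundaries could be checked roughly as follows. This sketch assumes that storage locations are given as path strings like the roots listed above, that a "shared" path segment marks a shared folder, and that an `explicit` flag stands for "explicitly stated otherwise" (for instance when Projects in notes are matched against todos stored elsewhere). Both function names are hypothetical.

```python
def is_shared(location):
    """Treat any location with a 'shared' path segment as a shared folder."""
    return "shared" in location.strip("/").split("/")


def may_match(location_a, location_b, explicit=False):
    """Decide whether nodes from two storage locations may be matched."""
    if location_a == location_b:
        return True
    if is_shared(location_a) != is_shared(location_b):
        return False  # assumption: never mix shared folders with own folders
    return explicit   # other locations only when explicitly requested
```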
Parent relations vs. child relations
Parent relation: a child referencing its parent(s). Child relation: a parent referencing its children.
Child relations have a couple of advantages:
- They make it very easy to delete a parent node, whereas with a parent node embedded in every object (parent relation), each item must be modified to delete a relation completely.
- less data, better performance: the relation tree has only to be parsed once and applies to many objects
However, child-relations only work if it can be guaranteed that the parent object is always available when the child object is.
RDF Triple-Store
Instead of a custom format, we could also just store RDF triples, which would allow transporting all kinds of information together with the object. It would become a lot more difficult to parse and interpret the data, though.
Rationale
With technologies and possibilities evolving on the desktop, it becomes increasingly difficult to synchronize all the data properly. Zanshin (a todo management and note-taking application) aims at unifying the workflow with all kinds of PIM items, and Nepomuk (a Semantic Desktop implementation) allows for easy integration of all kinds of data available on the desktop. This results in relations modeled on the desktop that we currently have no means to synchronize. If we start to synchronize this data across different machines, we run into the problem that we do not necessarily have all data available on the other side and need to be able to synchronize only the relevant parts. The relation property allows us to synchronize arbitrary relations between objects directly within the object concerned. This is important in an environment where it cannot be guaranteed that the same data is also available at the destination.
Concepts
The idea is to have some workflow concepts attached to the grouping, so we can build a UI supporting that workflow, whereas e.g. an iCal category hardly has any workflow in mind and is therefore of little use (in my experience). Having a workflow in mind also allows for optimizing the UI towards that workflow. The following outlines three means to group and organize tasks and notes. Note that the concepts are completely orthogonal, and a PIM-Item can be in several Projects/Contexts/Topics at the same time.
Projects and Contexts are GTD (Getting Things Done) jargon and activity-related concepts, i.e. they express that you want to accomplish something (a project), or that you want to do something within a certain context. A Topic, on the other hand, is much closer to a simple tag and simply provides the means to build a knowledge base (like a personal wiki).
See also: http://www.intevation.de/pipermail/kolab-format/2012-April/001606.html
Projects
A "Project" is any todo which contains several actions (todos). It is a way to break up a larger task into smaller steps which can be accomplished within reasonable time. Projects are not necessarily large tasks though; they can also be very small, which is important as projects are replacing subtodos. Any todo with subtodos automatically becomes a project. This way todos can be structured into a tree. To allow linking further information to a project, a project can also contain notes, so that all required information is centrally available. Note that while a project can also contain notes, a note can never contain todos.
A todo can only belong to one project at a time; notes can be linked to several, though.
Useful for:
- Todos
- Notes
- (could be extended with events IMO)
Contexts
A "Context" is a tag that describes what context we're currently working in. It provides a way to filter tasks based on what situation you're in, and what you can or should do in that context. The idea is that you can have contexts such as:
- office: you can do stuff for which you need your office
- traveling: easy todos you can do e.g. in a plane
- John Doe: All todos/notes you need when you are on the phone with John
A pim-item can belong to several contexts.
Topics
"Topics" are only meant for notes. The idea here is that you can have traditional note-taking in a tree-like structure, e.g. for a personal wiki. What is important is that a note can belong to several topics at the same time, so a topic is not a container but more of a tag.
