Tideway Community Forum

SoftwareInstance Names and Types
Posted: 29 October 2008 02:28 AM
Jr. Member
Total Posts:  46
Joined  2008-10-17

This one really isn’t a “How Do I?” question per se, but I wasn’t sure what section it should go in.

I have noticed that most of the patterns I have looked at have something like ‘type: name version on host.name’ as part of the SI name. I’m curious to know the reasoning behind this. What is the benefit of having all of these different attributes crammed into the name field of the Software Instance when there are individual fields for storing these values?

Regarding Software Instance type, I noticed that the type is sometimes set to ‘type_prefix: type’ or ‘type: name’. I find this a bit confusing as far as the conventions being used. Can anyone give me some insight as to why patterns have been implemented this way?

I guess the How Do I part of this question could be: How do I use the name and type fields? Maybe I’m just missing something here, but if I don’t see a value in storing all of these attributes crammed into a single field, is there an easy way to configure the patterns with a simpler naming convention for type and name? Or would this involve manually editing every pattern with the new naming convention?

Thanks,
Wes

Posted: 29 October 2008 04:23 AM   [ # 1 ]
Newbie
Total Posts:  6
Joined  2008-02-08

I believe the ‘name’ attribute is made of multiple components to make it unique in the DB.

The SI ‘type’ identifies the name of the CI. If you want to search for all hosts that contain Oracle RDBMS, you should search on the ‘type’ attribute. Generally you shouldn’t be searching on the name.

Posted: 29 October 2008 09:26 AM   [ # 2 ]
Administrator
Total Posts:  132
Joined  2008-01-25

Actually the only field you should have wholly unique on an SI is the key field. The key field will be a hash of all the fields that give this SI a unique identity. Usually those will be something along the lines of Host.key+SI.type+SI.instance. Key generation is vital when pattern writing as it is this field that is used to decide if a new node is needed or an existing one updated. Given that the key field tends to be a hash it is not normally displayed.

The name field contains a number of pieces of information because this field is the one usually used in summaries, reports and page titles. From experience, most users expect this field to be unique and to reflect the identity of the SI, and when we trialled having partial information it caused a lot of confusion. People felt there were lots of duplicates, and it was hard to locate, say, the Apache Webserver you were interested in within a list of 200.

Clearly you could write reports that included all the “identity” fields, but this is not complete enough, as the Foundation model is flexible enough to cope with variable numbers of “identity” fields, unlike the fixed keys of an RDBMS. So if a particular product can run as two instances on the same server, differentiated by the port they listen on, then we can form a key from Host.key+SI.type+SI.port, whereas if the way to differentiate them is by the config file they load, we can form a key from Host.key+SI.type+SI.config_file.path. So it would be difficult to form a single report showing all the identity fields.

So the convention is to set the name field to the “readable” or “plain text” equivalent to the key field. For SIs this leads to names of the form “<SI.type> identified as <SI.instance> running on <Host.name>” when the key field is Host.key+SI.type+SI.instance and so on.

The name tends to be unique as it reflects the key. In some edge cases it may not make sense to include all the “identity” fields in the name, say if one of them is a 20 character license key, as it would make it largely unreadable.
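
To make the key and name conventions above concrete, here is a rough Python sketch. It is only an illustration under stated assumptions: the function names, the SHA-1 hash, the separator and the sample values are all hypothetical, and none of this is how the product itself computes keys.

```python
import hashlib


def make_key(*identity_fields: str) -> str:
    """Hash any number of identity fields into one opaque key string."""
    # A separator prevents ("ab", "c") and ("a", "bc") from colliding.
    joined = "\x00".join(identity_fields)
    # SHA-1 is an arbitrary choice for this sketch, not the product's algorithm.
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()


def make_name(si_type: str, instance: str, host_name: str) -> str:
    """Build the readable counterpart of the key, in the form quoted above."""
    return f"{si_type} identified as {instance} running on {host_name}"


host_key = "host-0001"  # hypothetical Host.key value

# Two instances of the same product on one host, told apart by port.
key_8080 = make_key(host_key, "Apache Webserver", "8080")
key_8443 = make_key(host_key, "Apache Webserver", "8443")

# Rediscovering the first instance yields the same key again,
# which is what lets an existing node be updated instead of duplicated.
key_again = make_key(host_key, "Apache Webserver", "8080")

name = make_name("Apache Webserver", "httpd-main", "web01")
```

Because make_key accepts a variable number of fields, the same helper covers the port-based and config-file-based cases; only the list of identity fields changes.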

These conventions apply to all the nodes in the inferred part of the model.

So the best thing to remember is that the key field is the true unique id and the one you should use if you need to write patterns. The name is the plain text equivalent of the key and is the one to use in everyday reporting.

Sorry for the long reply, but hopefully it provides a bit more insight into the model and the datastore.

Posted: 29 October 2008 09:58 AM   [ # 3 ]
Administrator
Total Posts:  132
Joined  2008-01-25

It occurs to me I didn’t talk about SI.type.

If you are used to data modelling then a useful way of thinking of type is that it logically subclasses the node kind. For this reason, as Paul said, if you are building searches and reports this is the field to search on to find the type of SIs you want (in fact, ideally you should use it as the first part of the where clause to gain maximum benefit from the indexes).

In many basic cases this leads to the type being the name of the product being modelled. In more complex cases where there may be several sub components, or a suite, there can be a few related types that tend to follow the style of <component name>. As the dominant use of the type field is for searching, classifying and selecting, the convention is to use the minimum number of types if the simple approach of using a single type reflecting the product name is not sufficient.

A similar use of type is recommended on BAI nodes.

If you have particular examples that you noticed then let me know.

Posted: 29 October 2008 02:22 PM   [ # 4 ]
Jr. Member
Total Posts:  46
Joined  2008-10-17

Thanks for the explanation. It helps out some. Personally I am just used to more normalized sets of data. From your description and from looking at the data I have collected so far, it seems that the Type field is used more in the way that I would consider the name and the Name field is used more for what I would consider a description. This is simply a matter of preference and doesn’t really affect the capabilities of the product.

I am curious, though, what the performance differences are for reporting. I don’t really see a need to combine multiple pieces of data into a single field. In normalized data sets and standard SQL, having a field with non-unique data allows queries to be optimized by selecting distinct records to create a concise list, or by combining with other fields and using group by to organize related data. How does Tideway handle select distinct or group by commands?

Posted: 29 October 2008 03:13 PM   [ # 5 ]
Administrator
Total Posts:  132
Joined  2008-01-25

It’s important to remember that the datastore is not an RDBMS. Generally, the tasks for which you would use group by and select distinct are handled instead by traversals, key expressions and where clauses.

For instance, in a simple case in an RDBMS, to find the software running on a host you would probably make a direct selection of the host of interest from the host table and join it with the software table on a foreign key relationship. In the Foundation Datastore you would make a direct selection of the Host node and then traverse the Host:HostedSoftware:RunningSoftware relationship, which returns a nodeset of all the SoftwareInstances.
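
As a loose mental model of "select a node, then traverse a relationship", here is a small Python sketch. The Node class, the traverse helper and the sample data are all hypothetical, made up purely for illustration; only the relationship name is taken from the text, and the real datastore is of course not implemented this way.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str
    attrs: dict
    # Related nodes, keyed by a relationship path string (an assumed layout).
    rels: dict = field(default_factory=dict)


def traverse(node: Node, path: str) -> list:
    """Return the nodes reachable from `node` over the named relationship."""
    return list(node.rels.get(path, []))


# Hypothetical sample data: one host running two software instances.
host = Node("Host", {"name": "web01"})
apache = Node("SoftwareInstance", {"type": "Apache Webserver"})
oracle = Node("SoftwareInstance", {"type": "Oracle RDBMS"})
host.rels["Host:HostedSoftware:RunningSoftware"] = [apache, oracle]

# Start from the selected Host node and follow the relationship:
# no join on a foreign key is involved.
running = traverse(host, "Host:HostedSoftware:RunningSoftware")
running_types = [si.attrs["type"] for si in running]
```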

You can process result sets. So to do the equivalent of

SELECT DISTINCT type FROM SoftwareInstance 
you would use
SEARCH SoftwareInstance SHOW type PROCESSWITH unique() 

or you can get a group by count the way you would with

SELECT type, count(type) FROM SoftwareInstance GROUP BY type
by using
SEARCH SoftwareInstance SHOW type PROCESSWITH countUnique(0)
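
For readers more at home in a general-purpose language, the two result-set operations above can be pictured roughly as follows in Python. This is a sketch going only by the SQL equivalents given here: the sample values are made up, and reading the 0 as "column 0 of the SHOW list" is an assumption.

```python
from collections import Counter

# Hypothetical 'type' values, one per SoftwareInstance node.
si_types = ["Apache Webserver", "Oracle RDBMS", "Apache Webserver"]

# Roughly what a "select distinct" over the type column yields.
distinct_types = sorted(set(si_types))

# Roughly what a grouped count over the type column yields.
type_counts = Counter(si_types)
```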

Posted: 30 October 2008 10:06 AM   [ # 6 ]
Newbie
Total Posts:  4
Joined  2008-02-01
Charles Oldham - 29 October 2008 09:58 AM
In many basic cases this leads to the type being the name of the product being modelled. In more complex cases where there may be several sub components, or a suite, there can be a few related types that tend to follow the style of <component name>.

The convention used in the patterns released as part of Tideway Knowledge Updates (TKU) is as follows.
The ‘type’ field of an SI is always set as either:
Publisher + Product Name (if the product contains 1 component of interest)
or
Publisher + Component Name (when the product contains multiple important components and particularly if these components can run on multiple hosts)
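
Expressed as a tiny Python helper, the rule reads as below. This is only an illustrative sketch: the function name, the space used as separator and the publisher and product names are all hypothetical, not taken from any TKU pattern.

```python
from typing import Optional


def si_type(publisher: str, product: str, component: Optional[str] = None) -> str:
    """Return Publisher + Component Name if a component is given,
    otherwise Publisher + Product Name."""
    # Joining with a single space is an assumption of this sketch.
    return f"{publisher} {component if component else product}"


# A product with one component of interest.
single = si_type("ExampleCorp", "Widget Server")

# A product whose components are modelled separately.
multi = si_type("ExampleCorp", "Widget Suite", component="Widget Scheduler")
```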

Posted: 30 October 2008 03:25 PM   [ # 7 ]
Administrator
Total Posts:  132
Joined  2008-01-25

Nik’s just reminded me that we do publish our conventions for Name, Types and Keys.

He also reminded me that our convention for keys is most restrictive to least, so I should have been using SI.instance SI.type Host.key in my examples.

Oops :)
