What are the 'rules' for flipping from category to text data type?

November 1, 2011 12:01PM

I'm sure this must be covered elsewhere on the forums, but 10 minutes of searching didn't yield a result, so here goes...

Most long-term Omniscope users will be familiar with the similarities and differences of Category and Text data types. Most, I'm sure, will have experienced the odd headache when defining a Category field, and then 'over-filling' the Category with "too many" entries so that Omniscope decides it should be held as Text. Some will have experimented with various technical/advanced settings to try to control this.

At present we have 3 separate (but related) situations where this *appears* to be the case (i.e. a Category field is forced to become a Text field), but we're struggling to identify the exact cause. Ideally, we would like *absolute control* (in a Field Manager block, in DataManager, preferably) over a field's data type - and we're happy to take the consequences (e.g. a flaky IOK file!). In the absence of such control, it would be *very* helpful to understand the 'rules' that Omniscope uses to make this decision - this way we may be able to protect against it.

So, the cases we have identified are:

1. Too many separate values in the Category
We're all familiar with the warning ("you have more than 500 entries...") which Omniscope displays when a user converts a Text field into a Category. We can override this (by accepting the risk), but we have experienced cases where refreshing a model with new data causes Omniscope to reassess this risk, and convert from Category to Text.

2. Too long a text value / too many long text values
We're not exactly sure of the 'rule' here, but we have documented cases where a Category field with only 10-20 entries is converted to a Text field, and this seems to be where individual field values are long i.e. LEN([FieldName])>100 (or 200?). It may even be some form of aggregate, where more than 5 entries of more than 100 characters force the switch to Text?

3. Dynamic content from a Formula
We have cases where the content of a Category field is switchable according to a Variable setting i.e. a formula like:
IF([Variable]="X",[CatFieldOne],[CatFieldTwo])
We can define the formula field, and the two source fields, as Category data type... and both the 'source' fields (CatFieldOne and CatFieldTwo) remain as such, but changing the Variable value - and hence altering the content of the formula field - causes Omniscope to 'reasses' and flip to a text data type.

I would be interested in:
* Anyone else's experiences of Category to Text conversion - are there other cases I haven't identified?
* Visokio's summary of the 'rules'
* Any workarounds that people have discovered
...
* and a nice, simple solution from Visokio :)

Thanks!

steve · November 1, 2011 2:19PM

Guy, the category vs. text distinction is one we hope to remove altogether at some point, perhaps during planned data engine upgrades in 2.8. Some areas of the application ignore it anyway, such as the bar view, and future planned versions of views such as pivot.

You can currently disable the (overly?) protective nature of Omniscope by unticking "Settings > Advanced file settings > Auto-convert text fields". This is a file-wide control. A recent fix in 2.7, due to be back-ported into next week's 2.6 build, fixes an issue where this setting was being ignored on refresh from source.

Any edits to the cell data in a column, including those as a result of formula recalc, will trigger the check.

Non-tokenised category fields will be auto-converted to text if they have:
- 200 or more unique values
- average value length of 100 characters or more

In light of the confusion this causes, we have changed the rules as follows, so they will not kick in anywhere near as often:
- 1000 or more unique values
- average value length of 1000 characters or more

This change has been made in 2.7, and will be back-ported to 2.6 shortly.

November 1, 2011 3:37PM

Steve

Thanks - really useful feedback. We already use the "Auto-convert" setting (although I can't be sure this is used in *every* case.. I will start checking!) - am I right in saying this applies in the IOK file, irrespective of which Omniscope instance opens it (and on which computer)?

It sounds like the issue to do with re-evaluation is what's affecting some of our 'template' files - I will ensure we re-test on the 2.6 update once it's in place.

The 200 values / 100 average length is easy for us to test against our data sources - we can verify that this is the source of the issues we have been experiencing. It sounds like the change (to 1000/1000) and the fix for the 'refresh from source' bug will sort it all out... at least until 2.8 :)

steve · November 1, 2011 3:56PM

Yes, it is IOK-specific, persistent in that IOK whatever PC it is opened on, and not installation-specific.

We'll post back here when the change has been ported back. It will be in 2.7 from b102, tonight.

Keesup · November 10, 2011 9:10AM

We have implemented Steve's work around but the published file still results in category fields converted to text. Enterprise build 110 , "publish from source"

November 10, 2011 9:30AM

@Keesup - thanks... looks like we need to do some more testing (it's scheduled but not started yet)...

steve · November 22, 2011 8:12AM

The above changes have been back-ported to 2.6 and will be available from next week's new build of 2.6.

Welcome!

Categories

Ideas Parade

Tagged