Hi - I need to sum a number of machines within dataset, The only problem is that we have a lot of duplicates, so i need to identify the unique ones. There is like a machine reference number but the problem I'm having is to how to fish out only one of them to have a value of 1 and ignore the others. Which formula I should use?
This is what i need to end up with, it s basic if formula but it is able to give me number of unique values, which then i can sum on charts or divide other calculations by
Wojciech - Use an Aggregation block to investigate relationships between unique value counts/value lists. Set the Aggregation block to split only by unique Manufacturer Serial Numbers, and change the aggregation function on the Count function from Sum to Unique Values to Unique Value Count and finally back to Sum. You will see that each unique value of Serial Number has only 2 values in Count; 1s and zeros, and that there is only a single instance of the 1, because the sum of Count for all unique values of Serial Number is always 1.
Because your data had only a single 1 and the rest zeros for each unique Serial Number, summing the Count field gives the correct total unique Serial Number count of 492.
However, to be able to reference/use this value (492) in other formulae, you need to add a column to the data set explicitly by adding a formula field that counts the unique Manufacturer Serial Number Values:
SUBSET_UNIQUECOUNT[Manufacturer Serial Number]
This formula also returns 492 for all rows, because it is calculated across the whole data set.