The append operation allows you to append the data from two or more data-sets to create a single data-set.

In this example we are appending the data from 3 data-sets together. Each data-set contains spending data from selected days in a single month. The first data-set contains data from January, the second contains data from February and the third contains data from March. The expense data is shown below.

We use an append operation to combine the data from each of these three data-sets. In this example we select all fields and change the name of the Source field to 'Date'. The DataManager configuration for this operation is shown below.
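As a rough illustration outside Omniscope, the effect of this append can be approximated in plain Python. This is only a sketch: the sample records and the `append_datasets` helper are hypothetical, and it assumes the Source field holds the name of the data-set each row came from.

```python
# Hypothetical sample data: one list of records per monthly data-set.
january = [{"Day": 3, "Amount": 12.50}, {"Day": 17, "Amount": 40.00}]
february = [{"Day": 8, "Amount": 7.25}]
march = [{"Day": 21, "Amount": 19.99}]


def append_datasets(named_datasets, source_field="Source"):
    """Stack the rows of every data-set and record where each row came from."""
    combined = []
    for name, rows in named_datasets.items():
        for row in rows:
            new_row = dict(row)           # copy so the inputs are left untouched
            new_row[source_field] = name  # tag the row with its data-set name
            combined.append(new_row)
    return combined


# The source field is named 'Date' here, mirroring the rename in the example.
expenses = append_datasets(
    {"January": january, "February": february, "March": march},
    source_field="Date",
)
```

Each output row keeps its original fields and gains a 'Date' value of January, February or March.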

The Merge/Join operation allows you to merge the data from two data-sets together to create a single data-set.
In most scenarios merging two data-sets requires that the two tables have at least one common field, and that at least one of these fields contains no duplicate records. If you intend to merge on a Date/Time field the storage/display formatting of both merge fields must be identical.
Join by matching records where...
This allows you to specify the join criteria by selecting the matching fields from both data-sets. You can define multiple join criteria; each criterion specifies a single match. To add a new join criterion click on the Add join criteria button. When you select a field to match on, the number of unique records in that data-set is shown. In most cases one or both of the match fields should contain no duplicate values. If both fields contain duplicate values, every duplicate on one side is paired with every duplicate on the other, so the merge may result in a huge number of records. Use the Accent sensitive and Case sensitive options to determine whether accent characters or case have a bearing on whether a value from the first data-set matches a value from the second.
This determines which records are included in the merged data based on the join criteria you specified. You can select any combination of these options.
This allows you to determine what action Omniscope should take if there are any fields outside of the join criteria with matching names.
Select this option to create an additional field that lists the name of the input data-set each row in the data originated from. For merged records both data-sets will be listed.
In this example we will be merging the data of two tables. The first table contains a list of customers. The second table contains a list of transactions made by the customers during January. The input data for the Merge operation is shown below.

In order to merge these two data-sets we need to identify the join criteria. In this case both tables have a common field: Customer ID. We therefore need to create a merge operation with a single Join criteria, matching on records where Customer ID from the customers table matches Customer ID from the transactions table.
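The join described above can be sketched in plain Python to show what a single join criterion on Customer ID does. The sample rows, the `merge_on` helper and its `keep_unmatched_left` flag are hypothetical names used only for illustration; the sketch assumes Customer ID is unique in the customers table.

```python
customers = [
    {"Customer ID": 1, "Name": "Ada"},
    {"Customer ID": 2, "Name": "Grace"},
    {"Customer ID": 3, "Name": "Edsger"},
]
transactions = [
    {"Customer ID": 1, "Amount": 20.0},
    {"Customer ID": 1, "Amount": 5.5},
    {"Customer ID": 2, "Amount": 12.0},
]


def merge_on(left, right, key, keep_unmatched_left=False):
    """Join two lists of records where left[key] equals right[key]."""
    # Index the left rows by key; duplicates on this side are rejected
    # because the sketch relies on one side having unique values.
    left_by_key = {}
    for row in left:
        if row[key] in left_by_key:
            raise ValueError(f"duplicate {key!r} value: {row[key]!r}")
        left_by_key[row[key]] = row

    merged, matched_keys = [], set()
    for row in right:
        match = left_by_key.get(row[key])
        if match is not None:
            merged.append({**match, **row})  # combine fields from both sides
            matched_keys.add(row[key])

    if keep_unmatched_left:
        # Optionally keep left rows that had no partner on the right.
        for value, row in left_by_key.items():
            if value not in matched_keys:
                merged.append(dict(row))
    return merged


merged = merge_on(customers, transactions, "Customer ID")
```

With `keep_unmatched_left=True` the customer with no January transactions is also kept, which corresponds to choosing to include unmatched records from the first data-set.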
The DataManager configuration for this operation is shown below.
The Field Organiser operation allows you to manage the fields inside a data-set. You can add, delete, re-order and edit the properties of the fields.
The Field Organiser lists all of the fields in the data-set in the order that they appear. Each row of the list represents a single field.
You can add a field by clicking on the Add field button at the bottom of the operation. New fields will be added at the bottom of the list.
You can rename all of the fields by clicking on the Rename all button.
You can delete all of the fields by clicking on the Delete all button.
For each field Omniscope provides two sets of options:
These options are available for all field types.
These options only appear in Decimal or Integer number fields.
These options only appear in Text fields.

These options only appear in Date fields.
Go Back to data-operations [1].
The summarise fields operation allows you to create a "Summary" field that combines the values from one or more fields into a single field.

The Summarise fields operation is useful when importing data-sets with a large number of fields. Trying to analyse such a data-set can be inefficient, particularly when the data-set also contains a large number of records. If these fields aren't required for analysis, but you still want to retain the data it may be beneficial to create a single summary field instead.
In this example we are importing a data-set that contains information about a set of employees in the company. A sample of this data is shown below.
This data contains some fields that are useful for analysis: Name, Date of birth and Position. The data also contains some fields that we don't need to analyse: Performance evaluation and Academic qualifications. We could simply remove these fields using a Field Organiser, but we still want to be able to view the data. In this scenario we could use the Summarise fields operation to create a single field containing the values from these fields. Doing so will improve the overall performance of Omniscope once we have loaded the data.
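A minimal Python sketch of the idea follows. It is not Omniscope's implementation: the `summarise_fields` helper, the sample record and the "field: value" text layout of the summary are all assumptions, as is the choice to drop the original fields once they have been summarised.

```python
employees = [
    {
        "Name": "Ann Lee",
        "Position": "Analyst",
        "Performance evaluation": "Exceeds expectations",
        "Academic qualifications": "BSc Mathematics",
    },
]


def summarise_fields(rows, fields, summary_field="Summary"):
    """Replace the given fields with one text field listing their values."""
    result = []
    for row in rows:
        # Keep every field that is not being summarised.
        kept = {name: value for name, value in row.items() if name not in fields}
        # Build one readable string from the summarised fields.
        kept[summary_field] = "; ".join(
            f"{name}: {row[name]}" for name in fields if name in row
        )
        result.append(kept)
    return result


summarised = summarise_fields(
    employees, ["Performance evaluation", "Academic qualifications"]
)
```

The result has three fields per record instead of four, while the evaluation and qualification text remains readable in the Summary field.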
The DataManager configuration for this operation is shown below.
The Record filter operation allows you to generate a subset of the rows in a table by applying one or more filter rules.
Filter rules
Each filter rule defines a single condition for selecting a set of records in the data. You can create multiple conditions by defining multiple filter rules.
To add a new filter rule click on the Add rule button in the bottom toolbar. There is no limit to the number of rules you can add in a single Record filter operation.
To remove a single rule click on the Remove button or click the Remove all button to remove all of the rules in the operation.
To view or edit the rule click on the expand button or click on the rule name. Each rule is comprised of 3 elements:
The match criteria options are shown at the top of the operation. They determine how the filter rules should be applied to the data. You can choose to either accept or reject the records that match all or any of the filter rules that are defined in the Record filter operation.
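The interaction between accept/reject and all/any can be sketched in Python. The `record_filter` helper, its `match` and `action` parameters and the sample employees are hypothetical; each rule is modelled simply as a function that returns True when a record matches.

```python
employees = [
    {"Name": "Alice", "City": "London", "Gender": "Female"},
    {"Name": "Bob", "City": "London", "Gender": "Male"},
    {"Name": "Carla", "City": "Paris", "Gender": "Female"},
    {"Name": "Dave", "City": "Paris", "Gender": "Male"},
]

# Two filter rules, each a single condition on one field.
rules = [
    lambda row: row["City"] == "London",
    lambda row: row["Gender"] == "Female",
]


def record_filter(rows, rules, match="all", action="accept"):
    """Accept or reject the rows that match all or any of the rules."""
    if match not in ("all", "any") or action not in ("accept", "reject"):
        raise ValueError("match must be 'all' or 'any'; action 'accept' or 'reject'")
    combine = all if match == "all" else any
    keep_matching = action == "accept"
    return [
        row for row in rows
        if combine(rule(row) for rule in rules) == keep_matching
    ]


london_women = record_filter(employees, rules, match="all", action="accept")
```

Accepting records that match all rules keeps only Alice, accepting records that match any rule keeps Alice, Bob and Carla, and rejecting records that match any rule leaves only Dave.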
In the following example we have a data-set containing a list of company employees. The data is shown below.

We will use the Record filter operation to retrieve all female employees based in London. To achieve this we need to create two filter rules. In the first filter rule we want to obtain all employees based in London. This rule is configured as follows:
The result of applying this rule is shown below.

Now we want to add another filter rule. In this rule we want to obtain all female employees. This rule is configured as follows:
The Random Sample operation generates a data-set containing a random sub-set of rows from the input data. This can be useful when you are working with very large data-sets, allowing you to work with a smaller sample of data while preparing and testing additional operations that need to be applied to the data.
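The underlying idea, drawing a fixed number of rows at random without replacement, can be sketched with Python's standard `random` module. The `random_sample` helper and its `seed` parameter are hypothetical; a seed is included only so that a test run can be repeated with the same sample.

```python
import random

# Hypothetical input: 10,000 simple records standing in for a large data-set.
rows = [{"id": i} for i in range(10_000)]


def random_sample(rows, sample_size, seed=None):
    """Return sample_size rows chosen at random without replacement."""
    if sample_size >= len(rows):
        return list(rows)          # nothing to cut down
    rng = random.Random(seed)      # private generator, optional fixed seed
    return rng.sample(rows, sample_size)


sample = random_sample(rows, 100, seed=42)
```

Operations can then be developed against `sample` and later rerun against `rows`.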

Options
Example
The Random sample operation can be useful when you are working with very large data-sets, because some operations can take a long time to execute on them. You can use the Random sample operation to generate a small sample of the data. By working with a smaller data-set you can create, configure and test additional operations that you want to apply to the data much more quickly.
In this example we are working with a fairly large data-set containing approximately 1,000,000 records. We want to use a combination of the Random sample operation and the Input switch operation to switch the data between a small sample of 1,000 records and the full data-set without having to reconnect our workflow. A configuration that allows us to do this is shown below.

The Input switch operation allows you to switch between two input data-sets.

The Input switch operation contains only a single option: the switch. Clicking on the switch allows you to select the data-set you want to use.
An example of using the Input switch operation can be found in the Random sample [2] operation documentation.
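The sample-versus-full-data arrangement from the Random sample example can be approximated in Python as a single flag that selects one of two inputs. The `input_switch` function and the `USE_SAMPLE` flag are hypothetical names, and the data is scaled down from the 1,000,000 records mentioned in that example.

```python
import random

full_data = [{"id": i} for i in range(10_000)]
sample_data = random.Random(0).sample(full_data, 100)


def input_switch(first, second, use_first):
    """Pass through exactly one of the two inputs, chosen by the switch."""
    return first if use_first else second


# Flip this one flag to move between the sample and the full data-set;
# everything downstream keeps reading from working_data.
USE_SAMPLE = True
working_data = input_switch(sample_data, full_data, USE_SAMPLE)
```

Because downstream steps only refer to `working_data`, switching inputs does not require any of them to be rewired.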
The De-duplicate operation allows you to remove or retain the duplicate records in a data-set.
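A hedged Python sketch of the two modes follows. It assumes that "remove" means keeping the first record for each distinct key and that "retain" means keeping only records whose key occurs more than once; the `de_duplicate` helper, its `keep_duplicates` flag and the sample contacts are hypothetical.

```python
from collections import Counter

contacts = [
    {"Email": "a@example.com", "Name": "A"},
    {"Email": "b@example.com", "Name": "B"},
    {"Email": "a@example.com", "Name": "A (again)"},
]


def de_duplicate(rows, fields, keep_duplicates=False):
    """Remove duplicate rows, or keep only the rows that are duplicated."""
    # Two rows are duplicates when they agree on every listed field.
    keys = [tuple(row[f] for f in fields) for row in rows]
    if keep_duplicates:
        counts = Counter(keys)
        return [row for row, key in zip(rows, keys) if counts[key] > 1]
    seen, unique = set(), []
    for row, key in zip(rows, keys):
        if key not in seen:        # first time this key has appeared
            seen.add(key)
            unique.append(row)
    return unique


unique_contacts = de_duplicate(contacts, ["Email"])
repeated_contacts = de_duplicate(contacts, ["Email"], keep_duplicates=True)
```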
The Aggregate operation allows you to define an aggregated transformation of the input data and to define the aggregation function applied across the values in each field.
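In general terms, aggregation groups rows that share the same values in chosen fields and reduces every other field with a function such as sum or maximum. The Python sketch below illustrates that idea only; the `aggregate` helper, its parameters and the sample sales rows are hypothetical.

```python
from collections import defaultdict

sales = [
    {"Region": "North", "Units": 3, "Price": 10},
    {"Region": "South", "Units": 5, "Price": 8},
    {"Region": "North", "Units": 2, "Price": 12},
]


def aggregate(rows, group_by, functions):
    """Group rows by the group_by fields and reduce the other fields."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[f] for f in group_by)].append(row)

    result = []
    for key, members in groups.items():
        out = dict(zip(group_by, key))  # one output row per group
        for field, func in functions.items():
            out[field] = func([m[field] for m in members])
        result.append(out)
    return result


# Sum the units and take the highest price within each region.
by_region = aggregate(sales, ["Region"], {"Units": sum, "Price": max})
```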



The Search/replace operation allows you to replace all occurrences of one value inside one or more fields with another.

Each Search/replace operation allows you to define one or more search actions. The actions appear as a list and are executed on the data in the order they appear in the list. You can change the order of the search actions by clicking and dragging the hand icon. You can change the name of the search by double-clicking on the name.
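Because the actions run in list order, a later action sees the output of an earlier one. The Python sketch below shows this effect. The `search_replace` helper and the layout of each action are hypothetical, and it assumes plain substring replacement in text values.

```python
orders = [{"Country": "UK", "Note": "Shipped to UK"}]

# Each action names what to search for, the replacement, and the fields it covers.
actions = [
    {"search": "UK", "replace": "United Kingdom", "fields": ["Country", "Note"]},
    {"search": "United Kingdom", "replace": "GB", "fields": ["Country"]},
]


def search_replace(rows, actions):
    """Apply every action, in order, to copies of the rows."""
    result = [dict(row) for row in rows]
    for action in actions:             # order matters
        for row in result:
            for field in action["fields"]:
                value = row.get(field)
                if isinstance(value, str):
                    row[field] = value.replace(action["search"], action["replace"])
    return result


in_order = search_replace(orders, actions)
reversed_order = search_replace(orders, list(reversed(actions)))
```

Run in the listed order, Country ends up as "GB"; with the two actions swapped it ends up as "United Kingdom", because the second action no longer finds anything to replace.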

Search options
You can view and edit individual search options by clicking on the name of the search action or by clicking on the expand button.
The Scramble operation allows you to scramble the text in one or more fields. This allows sensitive data to be removed without affecting the structure of your file.
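One simple way to picture scrambling is to shuffle the characters of each value so that the field, its type and its length survive but the original text does not. The Python sketch below does exactly that and nothing more. It is an assumption about what scrambling could look like, not a description of Omniscope's algorithm, and the `scramble` helper and sample row are hypothetical.

```python
import random

people = [{"Name": "Alice Smith", "Amount": 10}]


def scramble(rows, fields, seed=None):
    """Shuffle the characters of the given text fields in copies of the rows."""
    rng = random.Random(seed)
    result = []
    for row in rows:
        new_row = dict(row)            # other fields are copied unchanged
        for field in fields:
            chars = list(str(row[field]))
            rng.shuffle(chars)         # same characters, new order
            new_row[field] = "".join(chars)
        result.append(new_row)
    return result


scrambled = scramble(people, ["Name"], seed=1)
```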

The Expand values operation allows you to expand the values inside a single field into one or more new fields.
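As an illustration only, the Python sketch below treats expansion as splitting a delimited text value into numbered fields. The separator, the "Tags 1", "Tags 2" naming scheme, the `expand_values` helper and the sample rows are all assumptions rather than documented Omniscope behaviour.

```python
items = [
    {"ID": 1, "Tags": "red, large"},
    {"ID": 2, "Tags": "blue"},
]


def expand_values(rows, field, separator=","):
    """Split one delimited field into numbered fields, padding with None."""
    split_rows = [
        [part.strip() for part in row[field].split(separator)] for row in rows
    ]
    # Every row gets the same number of new fields as the longest value.
    width = max((len(parts) for parts in split_rows), default=0)
    result = []
    for row, parts in zip(rows, split_rows):
        new_row = dict(row)
        for i in range(width):
            new_row[f"{field} {i + 1}"] = parts[i] if i < len(parts) else None
        result.append(new_row)
    return result


expanded = expand_values(items, "Tags")
```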

The Collapse values operation allows you to combine all of the values in one or more fields into a single new field.

The Sort operation allows you to sort your data by one or more fields.
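Sorting by several fields means that later fields only break ties left by earlier ones. A minimal Python sketch, using a hypothetical `sort_rows` helper that takes (field, descending) pairs, relies on the fact that Python's sort is stable:

```python
payments = [
    {"City": "Paris", "Amount": 5},
    {"City": "London", "Amount": 3},
    {"City": "London", "Amount": 9},
]


def sort_rows(rows, sort_fields):
    """Sort by several (field, descending) pairs, most significant first."""
    result = list(rows)
    # Sort by the least significant field first; because the sort is stable,
    # each later pass preserves the order established for tied rows.
    for field, descending in reversed(sort_fields):
        result.sort(key=lambda row: row[field], reverse=descending)
    return result


# City ascending, then Amount descending within each city.
ordered = sort_rows(payments, [("City", False), ("Amount", True)])
```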

The Text mine operation is used to extract and derive useful information from text in your data. The Text mine operation currently provides three different mining algorithms: Top words, Word count and Sentiment.

Select which text fields you want to mine.
The Top Words algorithm extracts the most popular words from the selected fields. The popularity of a word is determined by the number of occurrences in a text cell. Short words such as "the", "and" and "or" are ignored.
The Word count algorithm counts the number of words and the number of unique words in a text cell.
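The Top words and Word count ideas can be sketched in a few lines of Python. The tokenising rule, the small `STOP_WORDS` set and the `word_count` and `top_words` helpers are assumptions made for illustration; the words Omniscope actually ignores are not listed here.

```python
import re
from collections import Counter

# Illustrative list of short words to ignore; not Omniscope's actual list.
STOP_WORDS = {"the", "and", "or", "a", "of", "to", "is"}


def tokenize(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def word_count(text):
    """Return (number of words, number of unique words) for one text cell."""
    words = tokenize(text)
    return len(words), len(set(words))


def top_words(text, n=3):
    """Return the n most frequent words, ignoring the stop words."""
    counts = Counter(w for w in tokenize(text) if w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(n)]


cell = "The cat and the dog chased the other cat"
total, unique = word_count(cell)
most_popular = top_words(cell, n=1)
```

For this cell the word count is 9 with 6 unique words, and the single top word is "cat", since "the" is ignored.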
The Sentiment algorithm attempts to determine the sentiment of the text. Sentiment is determined by counting the number of positive and negative words and calculating an overall score. A positive score indicates a positive sentiment whilst a negative score indicates a negative sentiment. The higher the score, the more positive the sentiment.
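A word-counting score of this kind can be sketched in Python as the number of positive words minus the number of negative words. The two tiny word lists and the `sentiment_score` helper are hypothetical, and the exact formula Omniscope uses for the overall score is not specified here.

```python
import re

# Illustrative word lists; a real lexicon would be far larger.
POSITIVE = {"good", "great", "excellent", "happy", "love"}
NEGATIVE = {"bad", "poor", "terrible", "sad", "hate"}


def sentiment_score(text):
    """Return positive-word count minus negative-word count for the text."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(1 for w in words if w in POSITIVE)
    negatives = sum(1 for w in words if w in NEGATIVE)
    return positives - negatives


praise = sentiment_score("Great service, I love it")
complaint = sentiment_score("Terrible and sad")
neutral = sentiment_score("It was fine")
```

Here `praise` is 2, `complaint` is -2 and `neutral` is 0, matching the reading that the sign of the score gives the direction of the sentiment.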
The Google translate operation allows you to translate the text in one or more fields of a data-set from one language to another. It uses the Google translate service to perform the translation. You must have access to the internet to use this operation.

Links:
[1] http://kb.visokio.com/data-operations
[2] http://kb.visokio.com/node/661