Idea: Splitting data sets in outputs? - Visokio Forums
  •     mburgess January 5, 2012 7:37AM
    If I have one list of products, is there a way to 'split' the full list by every unique combination of one or more fields (e.g. category & brand), producing as many subsets as there are combinations in the list?

    This would need to be dynamic, as the number of combinations may vary by report, by retailer, and by time period.

    I can identify each unique combination with a reference by collapsing the fields I want to use, but I cannot find a way to then split the list, either within Omniscope or as part of a publish/export command.

    Does this exist? If not, could it exist in the future?

    Thanks, Matt
  • 10 Comments
  •     michael January 5, 2012 8:52AM
    Hi Matt,

    Can you describe an example in detail (what is the input and what is the output) of what you want to achieve?
    Do you mean that if, for example, we are splitting an output by product in the DataManager workspace, we need a separate file for each product?
  •     mburgess January 5, 2012 8:59AM
    It could be done at the end of a DataManager data flow that has run ETL transformations/scrubbing, for example as an Output block within Omniscope DataManager that exports/publishes separate IOK or CSV output files.

    Example:
    unique products = 10,000
    unique categories (attribute of each product) = 10
    unique brands = 100

    therefore unique category & brand combos would be 10 x 100 = 1,000

    I would want Omniscope to take my 10,000 products and split them into separate blocks or files for each of the 1,000 combinations...does this help?

    Thanks
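A minimal sketch of the split described above, written in plain Python rather than Omniscope, may make the request concrete. The field names (`category`, `brand`), the sample rows, and the helper names `split_rows`, `safe_name` and `write_groups` are all hypothetical. Note that only combinations that actually occur in the data produce a group, so 10 x 100 = 1,000 is an upper bound rather than a guaranteed count.

```python
import csv
import re
from collections import defaultdict
from pathlib import Path


def split_rows(rows, key_fields):
    """Group rows (dicts) by their values in key_fields.

    Only combinations that are present in the data get a group.
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        groups[key].append(row)
    return dict(groups)


def safe_name(key):
    """Build a file-system-friendly name from a combination key.

    Distinct keys could collapse to the same name after cleaning;
    a real tool would need to detect that.
    """
    return "_".join(re.sub(r"[^A-Za-z0-9-]+", "-", str(part)) for part in key)


def write_groups(groups, out_dir):
    """Write one CSV file per group and return the paths created."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for key, rows in groups.items():
        path = out_dir / f"{safe_name(key)}.csv"
        with path.open("w", newline="", encoding="utf-8") as fh:
            writer = csv.DictWriter(fh, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
        paths.append(path)
    return paths


# Hypothetical sample data: 2 categories x 2 brands, but only 3 combinations occur.
products = [
    {"product": "P1", "category": "Snacks", "brand": "Acme"},
    {"product": "P2", "category": "Snacks", "brand": "Acme"},
    {"product": "P3", "category": "Drinks", "brand": "Acme"},
    {"product": "P4", "category": "Drinks", "brand": "Zest"},
]
groups = split_rows(products, ["category", "brand"])
```

Calling `write_groups(groups, "some_folder")` would then create one CSV per combination in that folder, which is the behaviour being asked for here.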
  •     chris January 5, 2012 9:14AM
    Hi Matt,

    Can you provide us with some actual sample data (a simple dataset containing the source data and a simple dataset containing the expected output)?

    Thanks

    Chris
  •     michael January 5, 2012 9:51AM
    In the case where:

    "unique category & brand combo would be 10 x 100 = 1,000"

    do you want to see the Data Manager output (let's say text, CSV, Excel, etc.) not in one file but in 1,000 separate files?
  •     mburgess January 5, 2012 9:54AM
    Michael - yes, that is what I want to do (I'd like that option at least)
  •     michael January 5, 2012 10:50AM
    Unfortunately at the moment we only output/export/publish into a single file.

    It would be possible for us to add this functionality in the future.

    The way we might implement this would be to create a new output block that splits a single input dataset into multiple files based on a field selection. You would configure the folder you wanted to publish to and the types of file you wanted to create (e.g. IOK, CSV).

    I've classified this as an idea. Please vote to increase the chance that it will be implemented.

  •     steve January 5, 2012 11:16AM
    Note that if you want to automate this today, you can, although the setup is complex and requires a Server license. You would use Aggregate to create the permutations, then a File Output block to dynamically create the batch output command file from them, which the Batch Output block then executes.
  • apoorvjain February 29, 2012 5:03PM
    Hi Steve, I was just wondering if you could clarify your methodology to automate the splitting of files. I have a large sales data set that I would like to split by store.

    Or if there has been an update that allows me to do this in an easier way I'd appreciate that too. Thanks.
  •     steve March 1, 2012 1:34AM
    This "Idea" is to have an inverse of "Batch Append": you feed a large data file into a new "Splitting Output" block and configure which fields are used to split the data and how the output files are named. This hasn't been implemented yet.

    This can be automated currently using Server edition in either of the following ways:


    1. Use DataManager to feed the large data file into:

    - a series of Aggregate and Field Org blocks, which produce a "command file"
    - a Batch Output block, which "executes" the command file.

    The command file contains a record for each desired output with text fields describing output settings and record filters. For example, a record might "command" Omniscope to "Output an IOK file filtered to include only category X" or "Email a PDF with tabs 2 and 3 showing data filtered to include only category Y".

    To get started, use the Batch Output block to generate an empty IOK command file which comes with full documentation in the file itself. You will need to make sure that the first flow of data mentioned above results in the correct values in the command file.
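To illustrate the shape of the data that the Aggregate and Field Org blocks need to produce, here is a rough Python sketch that builds one command-like record per distinct value of a split field. The keys `output_file`, `filter_field` and `filter_value`, and the function name `build_command_records`, are hypothetical stand-ins; the real field names are the ones documented inside the empty command file that Batch Output generates.

```python
def build_command_records(rows, split_field, out_dir, file_type="csv"):
    """Return one command-like record per distinct value of split_field.

    The dictionary keys are hypothetical placeholders, not Omniscope's
    actual command file fields.
    """
    # Equivalent in spirit to the Aggregate step: collect the distinct values.
    values = sorted({row[split_field] for row in rows})
    return [
        {
            "output_file": f"{out_dir}/{value}.{file_type}",
            "filter_field": split_field,
            "filter_value": value,
        }
        for value in values
    ]


# Hypothetical sales data to be split by store.
sales = [
    {"store": "North", "amount": 10},
    {"store": "South", "amount": 7},
    {"store": "North", "amount": 3},
]
commands = build_command_records(sales, "store", "out")
```

Each resulting record corresponds to one "Output a file filtered to include only store X" command; three input rows covering two stores yield two records.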


    2. Use the Scheduler to execute a series of XML actions that you have programmed/scripted yourself.

    - configure a Record Filter and a normal File Output
    - configure a parameter in the DM sidebar which configures the record filter
    - use "Settings > Server > Edit action descriptor" to create a starter XML action
    - the action will use a File action to open your IOK file containing the DM model
    - then it will set a DM Parameter value
    - then it will execute the DM output
    - You then write a program / script which writes out a series of these XML files with different parameter values. Drop these into the Scheduler watch folder to execute them.
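The last step above (writing out a series of XML files with different parameter values) could be scripted roughly as follows. This is only a sketch under assumptions: the XML action format is not reproduced here, so the script treats the starter XML exported from Omniscope as opaque text in which you have manually placed a token, `@@PARAM_VALUE@@`, where the parameter value belongs. That token, the file naming, and the function name `write_action_files` are all hypothetical.

```python
from pathlib import Path
from xml.sax.saxutils import escape

# Hypothetical token inserted by hand into the starter XML action file.
PLACEHOLDER = "@@PARAM_VALUE@@"


def write_action_files(template_text, values, watch_dir):
    """Write one XML file per parameter value into watch_dir.

    The template is treated as plain text; only the placeholder is replaced.
    escape() covers &, < and >, which suits element text. A value placed
    inside an attribute would also need its quotes escaped.
    """
    if PLACEHOLDER not in template_text:
        raise ValueError("template does not contain the placeholder token")
    watch_dir = Path(watch_dir)
    watch_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, value in enumerate(values, start=1):
        xml_text = template_text.replace(PLACEHOLDER, escape(str(value)))
        path = watch_dir / f"action_{i:04d}.xml"
        path.write_text(xml_text, encoding="utf-8")
        paths.append(path)
    return paths


# Stand-in template for demonstration only; this is NOT the real Omniscope
# action format. In practice, use the starter XML created via
# "Settings > Server > Edit action descriptor".
DEMO_TEMPLATE = "<action><parameter>@@PARAM_VALUE@@</parameter></action>"
```

For example, `write_action_files(starter_xml, ["North", "South"], watch_folder)` would drop two files into the watch folder, one per store. The list of values itself could come from the distinct values of the split field in your data.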


    The first approach is the simplest since it is entirely point-and-click.
