Batch Append mode for Excel data - Visokio Forums
  • Guy_Cuthbert June 15, 2010 10:36AM
    Hello

    In Data Manager, we have a File block and a Batch Append block. In the File block we can select to use Excel or not to read .XLS(X) files... but this option isn't present in the Batch Append block... which reader does Batch Append use? I ask because we have noticed that allowing the non-Excel reader to access the same file in concurrent operations (e.g. multiple File blocks reading from the same Excel file) can cause problems (looks like some kind of file locking?) - so we need to know whether we can read batches of Excel files reliably?

    Incidentally, when opening a number of models I find that some File blocks are automatically opened and refreshed, whereas others require me to go and press the Execute button within them (not change anything, just Execute). These non-loading blocks all say "configuration incomplete", but pressing "Execute" is the only 'configuration change' that I make. Is this expected behaviour?

    Thanks...
    Atheon Analytics Ltd
    w: www.atheonanalytics.com
    e: guy.cuthbert@atheon.co.uk
    t: +44 8444 145501
    m: +44 7973 550528
    s: guycuthbert
  • 4 Comments
  •     chris June 15, 2010 11:02AM
    Thanks for the questions.

    The 'Batch append' block will use the default Excel reader as long as you have Excel installed on your PC. You can use the alternative Excel reader by selecting 'Settings > Advanced > Misc > Use alternative Excel reader (POI)'. This option determines which reader is used for all Excel read operations (unless it's overridden, as in the File block).

    If you notice any problems with the non-Excel reader, please report them as bugs so we can investigate further.

    Some blocks inside DataManager have execute buttons, and others will execute automatically as soon as you change the configuration. This is deliberate and was done because we wanted to ensure that certain blocks only execute once the entire configuration process has completed. For example, the 'Join' operation has an execute button. If we attempted to execute a join as soon as the user changed one of the configuration settings this could lead to Omniscope running out of memory if a join was defined with a large number of duplicates on both input data-sets. The Execute button ensures that blocks that could potentially cause problems if executed with an invalid or incomplete configuration must be triggered by the user.
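
    To illustrate why a join on a field with many duplicates on both sides can exhaust memory, here is a minimal Python sketch of the arithmetic. It is not Omniscope code; the function name `inner_join_row_count` and the sample data are made up for illustration. Each key value contributes (rows on the left) x (rows on the right) output rows, so duplicates multiply rather than add.

```python
from collections import Counter


def inner_join_row_count(left_keys, right_keys):
    """Return how many rows an inner join on a single key field would produce.

    Hypothetical helper for illustration only; it counts rows without
    building the joined data set.
    """
    left_counts = Counter(left_keys)
    right_counts = Counter(right_keys)
    # Each key contributes (occurrences on the left) x (occurrences on the right).
    # Counter returns 0 for keys missing on the right, so unmatched keys add nothing.
    return sum(n * right_counts[key] for key, n in left_counts.items())


# Unique keys on both sides: the output is no larger than the inputs.
unique_rows = inner_join_row_count(range(1000), range(1000))

# A badly chosen join field: only 10 distinct values across 1,000 rows per side.
duplicated_rows = inner_join_row_count(
    [i % 10 for i in range(1000)],
    [i % 10 for i in range(1000)],
)

print(unique_rows)      # 1000
print(duplicated_rows)  # 100000
```

    With the same two 1,000-row inputs, picking the wrong join field turns a 1,000-row result into a 100,000-row one, which is the kind of half-configured join an explicit Execute button guards against.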
  •     steve June 15, 2010 11:32AM
    Guy, in what situation do you get the file locking problem? Have you reported this? And does Chris' answer make sense, or are there indeed some blocks you're finding with an Execute button which is genuinely not needed?
  • Guy_Cuthbert June 15, 2010 7:16PM
    The file locking was observed in *very* early versions of the non-Excel reader - we had cases where we were attempting to read each of 10 different worksheets from an XLS workbook (each worksheet required a different File block, because the content in each is different e.g. different numbers of non-data lines to ignore). When we let the process run we would find 2-4 of the file reads would fail - apparently because multiple File blocks were trying to open and read the same content at the same time. I haven't done any significant testing on this, but can do a more thorough test if required?

    Chris' answer makes sense, but it's not quite what I was asking/observing. The "Execute" button observation relates to opening a previously defined (and run) DM model; if I open some of our more complex models (e.g. those with 10+ Batch Append blocks) then it is common to find that several of the initial Batch Append blocks have not run - hence the majority of the model reports that it is incomplete and waiting for data. If I turn to each of the 'stalled' Batch Append blocks in turn then all I need to do to 'activate' them is to press the Execute button... they then run quite happily and, in due course, the DM process executes correctly.

    I can walk you through one such case if it helps; this is a DM process which builds a composite of ~200k records from a suite of multi-worksheet workbooks and then merges in data from a range of master files. The process involves some reasonably complex data checks, and aggregations (to build multi-level overlapping data sets), and can take 20-30 minutes to run (hitting ~7GB RAM in the process!).

    Finally (and, I think, just about related)... am I right to understand that if I want to publish - automatically - one or more files (IOK, XLS etc.) from a DM process then I need an Enterprise licence? I can't seem to get this final step to complete with my laptop's DM licence.
  •     steve June 16, 2010 9:39AM
    Thanks for the detailed response, Guy - we'll look into these issues. It sounds like the locking problem might still be present.

    The Execute issue seems to be a bug, too. If you create some Batch Append blocks and click Execute, then save the file, it should open without needing to Execute. We'll look into this too, and will get back if we can't reproduce it.

    Yes, automated publishing is an enterprise feature. If you have an enterprise license, you get an "Auto-publish" checkbox in each output block. When ticked, as soon as upstream data updates, the publishing re-runs.
This discussion has been closed.