Text Mining: Wildcard operation + list of functions - Visokio Forums
  • tobasko May 2, 2014 7:34AM
    Hi everyone,

    I was testing the Text Mine operation and think the extraction of custom entities is a very nice feature. However, is it possible to extract regular expression (RegEx) or 'wildcard' patterns, for example "Host * has failed"? This would extract all matching phrases/tuples, such as "Host A has failed", "Host B has failed", and so on.

    I haven't been able to get this to work yet unfortunately. If it is not possible, is there any other way to achieve something like this?

    Many thanks in advance.

    Cheers,
    Tobi
  • 5 Comments
  •     paola May 2, 2014 1:43PM
    While wildcards are not yet supported in the text-mining formulas, you can use them in the Search/Replace block.
    Option 1)
    You can create a duplicate field in the Field Organiser (in DataManager) and search/replace "Host*has failed" with something like "match", then use that result for filtering or formula criteria for your next steps.
    Option 2)
    In the Field Organiser block you can use a formula to quickly identify records that contain both "Host" and "has failed" and return the value in those cells, leaving the others blank (you can replace null with another value).

    IF(
    (AND(CONTAINS([Value], "Host"),CONTAINS([Value], "has failed")))=true,
    [Value],
    null)
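
    To illustrate how the two approaches differ, here is a minimal Python sketch (outside Omniscope, so nothing here is Omniscope API). The helper names wildcard_to_regex, extract_matches and contains_both are hypothetical, and treating "*" as "any run of characters, as short as possible" is an assumption about what the wildcard should mean. Case handling is not modelled in contains_both.

```python
import re

def wildcard_to_regex(pattern):
    """Turn a '*' wildcard pattern into a compiled regex (hypothetical helper).

    Assumption: '*' means any run of characters on one line, matched lazily.
    """
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*?".join(parts))

def extract_matches(text, pattern):
    """Return every phrase in text that matches the wildcard pattern."""
    return [m.group(0) for m in wildcard_to_regex(pattern).finditer(text)]

def contains_both(text):
    """Rough equivalent of the AND(CONTAINS, CONTAINS) check in the formula above."""
    return "Host" in text and "has failed" in text

log = "Host A has failed. Host B has failed; Host C is fine."
print(extract_matches(log, "Host * has failed"))

# The two-CONTAINS check ignores word order, the wildcard does not:
reordered = "Backup has failed on Host A"
print(contains_both(reordered))
print(extract_matches(reordered, "Host * has failed"))
```

    So the formula in Option 2 flags any record containing both fragments regardless of order, whereas a true wildcard would require "Host" to come first.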
  • tobasko May 2, 2014 3:51PM
    Hi Paola,

    Thanks a lot, that was a surprisingly quick reply. I will check this out ASAP and get back with feedback.

    Cheers,
    Tobi
  •     paola May 6, 2014 11:37AM
    Two other useful text-mining functions are:
    ENDSWITH(text, sub_text) - Returns true if [sub_text] occurs at the end of [text] (case insensitive).
    STARTSWITH(text, sub_text) - Returns true if [sub_text] occurs at the beginning of [text] (case insensitive).
    You could use them in the above formula instead of CONTAINS where appropriate.
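
    As a rough illustration of that substitution, here is a Python sketch of the same logic. The functions starts_with, ends_with and host_failed are hypothetical stand-ins, not Omniscope functions; the only behaviour taken from the descriptions above is that both checks ignore case.

```python
def starts_with(text, sub_text):
    """Case-insensitive prefix check, mirroring the STARTSWITH description."""
    return text.lower().startswith(sub_text.lower())

def ends_with(text, sub_text):
    """Case-insensitive suffix check, mirroring the ENDSWITH description."""
    return text.lower().endswith(sub_text.lower())

def host_failed(value):
    """Return value if it begins with 'Host' and ends with 'has failed', else None."""
    if starts_with(value, "Host") and ends_with(value, "has failed"):
        return value
    return None

print(host_failed("host B HAS FAILED"))
print(host_failed("Backup has failed on Host A"))
```

    Unlike the two-CONTAINS version, this only keeps cells where "Host" is at the very start and "has failed" at the very end, which is closer to the "Host * has failed" pattern.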
  •     steve May 7, 2014 1:49AM
    Also useful:

    FINDBETWEEN(all, before, after) - Returns the first shortest matching text surrounded by [before] and [after], or null if not found.
    For example, FINDBETWEEN("apple apple orange plum pear apple banana pear", "apple", "pear") would return " orange plum "

    FINDLASTBETWEEN(all, before, after) - Returns the last shortest matching text surrounded by [before] and [after], or null if not found.
    For example, FINDLASTBETWEEN("apple apple orange plum pear apple banana pear", "apple", "pear") would return " banana "
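
    To make "first shortest" and "last shortest" concrete, here is a Python sketch that reproduces the two examples above. It is one reading of the descriptions, not Omniscope's implementation: find_between and find_last_between are hypothetical names, matching is case sensitive here, and behaviour on edge cases (empty or overlapping delimiters) is an assumption.

```python
def find_between(text, before, after):
    """First shortest text enclosed by before and after, or None."""
    first_before = text.find(before)
    if first_before == -1:
        return None
    # First 'after' that has a 'before' somewhere in front of it.
    end = text.find(after, first_before + len(before))
    if end == -1:
        return None
    # Shortest match: use the closest 'before' preceding that 'after'.
    start = text.rfind(before, 0, end)
    return text[start + len(before):end]

def find_last_between(text, before, after):
    """Last shortest text enclosed by before and after, or None."""
    last_after = text.rfind(after)
    if last_after == -1:
        return None
    # Last 'before' that still has an 'after' behind it.
    start = text.rfind(before, 0, last_after)
    if start == -1:
        return None
    content_start = start + len(before)
    # Shortest match: stop at the first 'after' following that 'before'.
    end = text.find(after, content_start)
    return text[content_start:end]

fruit = "apple apple orange plum pear apple banana pear"
print(repr(find_between(fruit, "apple", "pear")))       # ' orange plum '
print(repr(find_last_between(fruit, "apple", "pear")))  # ' banana '
print(find_between("Host A has failed", "Host ", " has failed"))  # A
```

    The last line hints at how this could serve the original question: with "Host " and " has failed" as delimiters, the text in between is the host name.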
  •     paola May 7, 2014 5:39AM
    For all those text-miners out there... here is a list of functions you can use in Omniscope.
