Sources: Capture text data from images? - Visokio Forums
  • nitiwan March 5, 2013 4:11AM
    Hi,

    Are there any functions on Omniscope that we can use to extract data (text) from scanned image files (pdf, jpg)?

    Thank you.
  • 5 Comments
  •     paola March 5, 2013 6:30AM
    In DataManager, add a File Metadata source block and point it at the folder containing the images. Choose the Images file type and tick the "in file content" tags. Omniscope will then populate the table with basic data such as File name, Extension, Category, Size (MB), Full path and Pixel width/height, plus further fields depending on the metadata present in the files.
  • nitiwan March 5, 2013 7:06AM
    Paola, thank you for your advice.

    I still have a further question: can Omniscope recognize and extract characters (letters, numbers) inside the image itself?
  •     paola March 5, 2013 7:19AM
    If you scanned a card with the number 5 on it and saved it as a jpg image, Omniscope would not be able to 'read' the character and place it in a table field. You need specialist OCR (optical character recognition) software to extract words from images; you can then connect its output to Omniscope.
  • nitiwan March 5, 2013 7:26AM
    Ok I see, thank you very much.
  •     tjbate March 5, 2013 9:21AM
    Nitiwan - Optical character recognition (OCR) programs are designed to solve exactly this problem: converting images into more structured but still 'raw' data sets that can be further processed.

    Search the web for "OCR PDF JPG" and you will see many options to pre-process your scanned images (PDF or JPG) and convert them into a data file format that can be text mined or otherwise enriched. There are some free trial services available to start with.
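
As a rough illustration of the "OCR first, then connect the output" workflow described in this thread, here is a minimal Python sketch. It is not an Omniscope feature and not something the posters recommended specifically: it assumes the open-source Tesseract engine with the `pytesseract` and `Pillow` packages installed, and the names `tesseract_ocr` and `ocr_folder_to_csv` are made up for this example. It handles image files only; scanned PDFs would first have to be rendered to images with a separate tool, which is not shown here.

```python
import csv
from pathlib import Path
from typing import Callable

# Image types this sketch will attempt to OCR (an assumption; adjust as needed).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def tesseract_ocr(image_path: Path) -> str:
    """OCR a single image with Tesseract.

    Assumes pytesseract, Pillow and the tesseract binary are installed.
    Imports are local so the rest of the module works without them.
    """
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img)


def ocr_folder_to_csv(
    folder: str,
    out_csv: str,
    ocr: Callable[[Path], str] = tesseract_ocr,
) -> int:
    """Write one CSV row per image in `folder`: file name, full path, recognized text.

    Returns the number of images processed. Any OCR function with the same
    signature can be passed in place of the Tesseract default.
    """
    processed = 0
    with open(out_csv, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["file_name", "full_path", "text"])
        for path in sorted(Path(folder).iterdir()):
            if path.suffix.lower() not in IMAGE_SUFFIXES:
                continue  # skip non-image files
            writer.writerow([path.name, str(path.resolve()), ocr(path).strip()])
            processed += 1
    return processed
```

Calling `ocr_folder_to_csv("scans", "scans_text.csv")` on a folder of scanned JPGs would produce a delimited text file with one row per image, which can then be loaded as an ordinary data file. OCR output on real scans is usually noisy, so expect to clean the `text` column before analysing it.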
