Services
Infosiphon provides a number of out-of-the-box services that enable an organization to jump-start the information analysis process.

Fact Extraction
Locating patterns with regular expressions and dictionary terms narrows the search to a more focused area. To make a match meaningful, however, its full context must also be captured. This is made possible by extracting the complete facts surrounding the located terms.

Text Extraction
A located term of interest is far more useful when the complete context in which it appears is readily available. NLP techniques make it possible to determine the paragraph and sentence boundaries around the region of interest. Each sentence extracted this way is referred to as a context-based fact. The fact and its location are recorded and made available to the user on demand.
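The idea can be illustrated with a minimal Python sketch. It is not Infosiphon's implementation: the `Fact` record, the `extract_facts` function, and the naive sentence-boundary rule (a sentence ends at `.`, `!`, or `?` followed by whitespace) are all assumptions made for illustration, and a real NLP sentence splitter would handle abbreviations and other edge cases far better.

```python
import re
from dataclasses import dataclass


@dataclass
class Fact:
    """Hypothetical record for a context-based fact."""
    term: str      # the matched text
    sentence: str  # the full sentence containing the match
    start: int     # character offset where the sentence begins
    end: int       # character offset just past the sentence


# Naive boundary rule (an assumption): a sentence runs from a non-space
# character to the next ., ! or ? followed by whitespace, or to end of text.
_SENTENCE = re.compile(r"\S.*?(?:[.!?](?=\s)|\Z)", re.DOTALL)


def extract_facts(text, pattern):
    """Return the enclosing sentence and its location for each pattern match."""
    sentences = [(m.start(), m.end()) for m in _SENTENCE.finditer(text)]
    facts = []
    for hit in re.finditer(pattern, text):
        for start, end in sentences:
            if start <= hit.start() < end:
                facts.append(Fact(hit.group(), text[start:end], start, end))
                break
    return facts


text = "The invoice was paid. Contact ACME Corp at 555-1234 today! Nothing else."
for fact in extract_facts(text, r"\d{3}-\d{4}"):
    print(fact.term, "|", fact.sentence, "|", fact.start, fact.end)
```

Storing the offsets alongside the sentence is what allows a fact to be shown in its original position when the user asks for it.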

Table Extraction
Sometimes a pattern occurs within a single column of a table, such as one in an Excel file or a PDF. A table gives information a degree of structure by default, provided the column contents can be aligned with the column headers. This alignment is easy to achieve for some file formats but extremely difficult for others, especially PDFs. Infosiphon tools perform this alignment for most formats to provide table-based fact extraction.
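For a format where the alignment is easy, the approach might look like the following sketch. It assumes the table is already available as CSV text with the headers in the first row; the function name `extract_table_facts` and the shape of the returned records are hypothetical and do not reflect Infosiphon's actual tools. Formats such as PDF would need a table-detection step first, which is not shown.

```python
import csv
import io
import re


def extract_table_facts(csv_text, pattern):
    """Return one header-aligned record for each cell that matches pattern."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []
    headers, body = rows[0], rows[1:]
    facts = []
    for row_number, row in enumerate(body, start=1):
        for header, cell in zip(headers, row):
            if re.search(pattern, cell):
                facts.append({
                    "row": row_number,                  # 1-based data row
                    "column": header,                   # header of the matching cell
                    "record": dict(zip(headers, row)),  # the whole row, keyed by header
                })
    return facts


table = "Name,Phone,City\nAlice,555-1234,Boston\nBob,n/a,Denver\n"
for fact in extract_table_facts(table, r"\d{3}-\d{4}"):
    print(fact["column"], fact["record"])
```

Returning the whole row keyed by header is what turns a bare match into a fact: the matched value is reported together with the other columns that describe it.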

Image Analysis
A further level of difficulty arises when the original content is stored not as text but as images of scanned text documents. Such documents are first processed with image enhancement filters and OCR-based text extraction. This enables the text-based fact extraction described under Text Extraction. The positions of the extracted terms are also located on the original images and highlighted for easy viewing.
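The final highlighting step can be sketched as follows, assuming the OCR stage has already produced one bounding box per recognized word. The `WordBox` structure, the `highlight_regions` function, and the `padding` parameter are hypothetical names chosen for illustration; the image enhancement and OCR steps themselves are outside the scope of this sketch.

```python
import re
from typing import NamedTuple


class WordBox(NamedTuple):
    """Hypothetical OCR output: a recognized word and its pixel box."""
    text: str
    left: int
    top: int
    width: int
    height: int


def highlight_regions(words, pattern, padding=2):
    """Return (left, top, right, bottom) rectangles for words matching pattern."""
    regions = []
    for word in words:
        if re.search(pattern, word.text):
            regions.append((
                max(word.left - padding, 0),  # clamp to the image edge
                max(word.top - padding, 0),
                word.left + word.width + padding,
                word.top + word.height + padding,
            ))
    return regions


ocr_words = [WordBox("Call", 10, 20, 40, 12), WordBox("555-1234", 55, 20, 80, 12)]
print(highlight_regions(ocr_words, r"\d{3}-\d{4}"))
```

The resulting rectangles are in the coordinate space of the scanned image, so they can be drawn over the original page to highlight each located term.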
