Search text in PDF files

Searching for text is a common operation on PDF files, and the XFINIUM.PDF library fully supports it.

When a PDF document is searched for a string of characters, each page must be searched separately because the content is stored at page level. Each search operation returns a collection of search results; each result specifies the text that was searched for and the collection of text fragments that make up the match.
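To make the shape of these results concrete, here is a minimal conceptual sketch in Python. It is not the XFINIUM.PDF API: the names `TextFragment`, `SearchResult`, `search_page` and `search_document` are hypothetical, and a page is assumed to be a plain list of text runs. It shows why one result can consist of several fragments: a match may span more than one run of text on the page.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class TextFragment:
    run_index: int  # which text run on the page this piece comes from
    text: str       # the part of the match that lies inside that run


@dataclass
class SearchResult:
    text: str                      # the text that was searched for
    fragments: List[TextFragment]  # the pieces that make up this match


def search_page(runs: List[str], needle: str) -> List[SearchResult]:
    """Case-insensitive search over one page given as a list of text runs."""
    if not needle:
        return []
    joined = "".join(runs)
    # Start offset of every run inside the joined page text.
    starts, pos = [], 0
    for run in runs:
        starts.append(pos)
        pos += len(run)
    results = []
    for match in re.finditer(re.escape(needle), joined, re.IGNORECASE):
        begin, end = match.span()
        fragments = []
        for index, (start, run) in enumerate(zip(starts, runs)):
            lo, hi = max(begin, start), min(end, start + len(run))
            if lo < hi:  # this run overlaps the match
                fragments.append(TextFragment(index, joined[lo:hi]))
        results.append(SearchResult(needle, fragments))
    return results


def search_document(pages: List[List[str]], needle: str) -> List[List[SearchResult]]:
    """Search page by page; entry i holds the results for page i."""
    return [search_page(runs, needle) for runs in pages]
```

With `pages = [["Hello Wor", "ld, hello"], ["nothing"]]`, searching for `"world"` gives a single result on the first page made of two fragments, `"Wor"` and `"ld"`, and no results on the second page.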

XFINIUM.PDF provides several options for searching text in a PDF file. By default, the search is case insensitive. Search options let you request a case-sensitive search or a whole-word search, and these two options can be combined. Another option is regular expression search; if it is combined with other options, those other options are ignored.
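The way these options interact can be illustrated with a small Python sketch. Again, this is not the XFINIUM.PDF API: `SearchOptions` and `find_matches` are hypothetical names, and the text is a plain string. One assumption goes beyond what is stated above: in regular expression mode the pattern is applied exactly as written, with no case folding.

```python
import re
from enum import Flag, auto


class SearchOptions(Flag):
    NONE = 0                 # default: case-insensitive substring search
    CASE_SENSITIVE = auto()
    WHOLE_WORD = auto()
    REGEX = auto()


def find_matches(text, query, options=SearchOptions.NONE):
    """Return the (start, end) span of every match of query in text."""
    if SearchOptions.REGEX in options:
        # Regular expression search: the other options are ignored.
        # Assumption: the pattern is used as written, with no flags.
        pattern, flags = query, 0
    else:
        pattern = re.escape(query)  # treat the query as literal text
        if SearchOptions.WHOLE_WORD in options:
            pattern = r"\b" + pattern + r"\b"
        flags = 0 if SearchOptions.CASE_SENSITIVE in options else re.IGNORECASE
    return [m.span() for m in re.finditer(pattern, text, flags)]
```

For the text `"Cat catalog CAT cat"` and the query `"cat"`, the default search finds four matches, case-sensitive search finds two, whole-word search finds three, and combining both finds only the final `"cat"`.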

The code below shows how to search text in a PDF page using various options. The search results are highlighted on the page by drawing a rectangle around the text.

Download XFINIUM.PDF library and give it a try.

6 thoughts on “Search text in PDF files”

  1. This works fine; I have my results as a PdfTextFragmentCollection. Is there any way to change/modify the text and save/write it back to the same document?

    searchResults[0].TextFragments[0].Text is read only.

    1. At this moment text cannot be replaced directly. It could be implemented using a page transform that inspects each text fragment in the page content, but it would work only for basic situations.

    1. Text replacement is not available at this moment. What you could do is search for the text, perform a redaction at its location, and then draw the new text at the same location. The problem is that if the new text is longer than the old text, it might overwrite the content that follows.
