Version 4.3 of XFINIUM.PDF brings support for enhanced PDF redaction of text and images and support for redaction annotations.
Redaction is the process of removing sensitive information from a document.
The XFINIUM.PDF library can redact both text and images from a PDF file. You define a region on the page, and the text and images that fall inside the region are removed. If an image falls only partially inside a redaction region, only the part of the image inside the region is redacted.
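To illustrate the partial-image case, here is a minimal sketch of how the redacted portion of an image could be computed as the overlap of two rectangles. The `Rect` type and the `RedactionGeometry.Intersect` helper are hypothetical names made up for this illustration; they are not part of the XFINIUM.PDF API, and the library's internal logic may differ.

```csharp
using System;

// Hypothetical rectangle in page coordinates (not an XFINIUM.PDF type).
public readonly struct Rect
{
    public Rect(double x, double y, double width, double height)
    {
        X = x; Y = y; Width = width; Height = height;
    }

    public double X { get; }
    public double Y { get; }
    public double Width { get; }
    public double Height { get; }
}

public static class RedactionGeometry
{
    // Returns the part of the image bounds that lies inside the redaction
    // region, or null when the two rectangles do not overlap.
    public static Rect? Intersect(Rect image, Rect region)
    {
        double left = Math.Max(image.X, region.X);
        double top = Math.Max(image.Y, region.Y);
        double right = Math.Min(image.X + image.Width, region.X + region.Width);
        double bottom = Math.Min(image.Y + image.Height, region.Y + region.Height);

        if (right <= left || bottom <= top)
            return null; // no overlap: the image is left untouched

        return new Rect(left, top, right - left, bottom - top);
    }
}
```

For an image occupying (100, 100, 200, 200) and a redaction region of (50, 50, 200, 100), only the 150*50 strip starting at (100, 100) would be affected; the rest of the image stays intact.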
Note: XFINIUM.PDF 4.3 cannot redact JPEG2000 images or images used as part of a brush that fills/strokes a path.
Redacting PDF content with the XFINIUM.PDF library is straightforward. You create a content redactor object for the page you want to redact and then redact each region of the page.
PdfFixedDocument document = new PdfFixedDocument("sample.pdf");

// Redact the first page of the document
PdfContentRedactor cr = new PdfContentRedactor(document.Pages[0]);

// Redact a rectangular area of 200*100 points and leave the redacted area uncovered.
// Redaction is applied immediately
cr.RedactArea(new PdfVisualRectangle(50, 50, 200, 100));

// Redact a rectangular area of 200*100 points and mark the redacted area with red.
// Redaction is applied immediately
cr.RedactArea(new PdfVisualRectangle(50, 350, 200, 100), PdfRgbColor.Red);

document.Save("sample_redacted.pdf");
Dim document As New PdfFixedDocument("sample.pdf")

' Redact the first page of the document
Dim cr As New PdfContentRedactor(document.Pages(0))

' Redact a rectangular area of 200*100 points and leave the redacted area uncovered.
' Redaction is applied immediately
cr.RedactArea(New PdfVisualRectangle(50, 50, 200, 100))

' Redact a rectangular area of 200*100 points and mark the redacted area with red.
' Redaction is applied immediately
cr.RedactArea(New PdfVisualRectangle(50, 350, 200, 100), PdfRgbColor.Red)

document.Save("sample_redacted.pdf")
If you want to redact multiple regions on a page, batch redaction is recommended because it performs better: the page content is parsed only once and all the redactions are applied in a single step.
The code above causes the page content to be parsed and processed every time the RedactArea method is called. This can cause a performance problem if the page content is very large and complex and you want to redact multiple regions on the page.
The code below shows how to use batch redaction:
PdfFixedDocument document = new PdfFixedDocument("sample.pdf");

// Redact the first page of the document
PdfContentRedactor cr = new PdfContentRedactor(document.Pages[0]);

// Initialize the batch redaction.
cr.BeginRedaction();

// Prepare for redaction a rectangular area of 500*100 points and leave the redacted area uncovered.
// The area is not actually redacted yet.
cr.RedactArea(new PdfVisualRectangle(50, 50, 500, 100));

// Prepare for redaction a rectangular area of 500*100 points and mark the redacted area with red.
// The area is not actually redacted yet.
cr.RedactArea(new PdfVisualRectangle(50, 350, 500, 100), PdfRgbColor.Red);

// Finish the redaction, all the areas set up above will be redacted in one step.
cr.ApplyRedaction();

document.Save("sample_redacted.pdf");
Dim document As New PdfFixedDocument("sample.pdf")

' Redact the first page of the document
Dim cr As New PdfContentRedactor(document.Pages(0))

' Initialize the batch redaction.
cr.BeginRedaction()

' Prepare for redaction a rectangular area of 500*100 points and leave the redacted area uncovered.
' The area is not actually redacted yet.
cr.RedactArea(New PdfVisualRectangle(50, 50, 500, 100))

' Prepare for redaction a rectangular area of 500*100 points and mark the redacted area with red.
' The area is not actually redacted yet.
cr.RedactArea(New PdfVisualRectangle(50, 350, 500, 100), PdfRgbColor.Red)

' Finish the redaction, all the areas set up above will be redacted in one step.
cr.ApplyRedaction()

document.Save("sample_redacted.pdf")
When redacting an area on the page you have the option to fill that area with a color or leave it blank. For text content this is not a problem because you can see the background that remains after removing the text. With images the situation is different. Images are redacted by setting the corresponding bits to 0. Depending on the colorspace used by the image you can get a solid black or a different color.
So if you want to make sure that all the redactions look the same then use a specific color to fill the redacted area.
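The dependence on colorspace can be seen with a small sketch that converts a zeroed sample to a display color. This is an illustration only: `ColorSpaceKind` and `ZeroedSampleColor` are hypothetical names, the CMYK conversion is the simple uncalibrated formula, and the code assumes 8-bit samples with the default decode mapping. It is not the library's implementation.

```csharp
using System;

// Hypothetical enum for this sketch (not an XFINIUM.PDF type).
public enum ColorSpaceKind { DeviceGray, DeviceRgb, DeviceCmyk }

public static class ZeroedSampleColor
{
    // Converts one pixel's 8-bit samples to an (R, G, B) display color,
    // assuming the default decode mapping for each colorspace.
    public static (int R, int G, int B) ToRgb(ColorSpaceKind colorSpace, byte[] samples)
    {
        switch (colorSpace)
        {
            case ColorSpaceKind.DeviceGray:
                // A single gray sample: 0 is black.
                return (samples[0], samples[0], samples[0]);

            case ColorSpaceKind.DeviceRgb:
                // Three additive samples: all 0 is black.
                return (samples[0], samples[1], samples[2]);

            case ColorSpaceKind.DeviceCmyk:
            {
                // Four subtractive samples: all 0 means no ink, which is white.
                double keep = 1.0 - samples[3] / 255.0;
                return (Scale(samples[0], keep), Scale(samples[1], keep), Scale(samples[2], keep));
            }

            default:
                throw new ArgumentOutOfRangeException(nameof(colorSpace));
        }
    }

    // Naive CMYK-to-RGB channel conversion (no color management).
    private static int Scale(byte ink, double keep)
    {
        return (int)Math.Round(255.0 * (1.0 - ink / 255.0) * keep);
    }
}
```

Under these assumptions, `ToRgb(ColorSpaceKind.DeviceGray, new byte[1])` and `ToRgb(ColorSpaceKind.DeviceRgb, new byte[3])` both give black, while `ToRgb(ColorSpaceKind.DeviceCmyk, new byte[4])` gives white. For an indexed image, a zero sample selects the first palette entry, which can be any color.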
Redaction annotations are rectangular annotations that define areas on the page that will be later redacted. You can create redaction annotations with XFINIUM.PDF and then have them redacted with Adobe Acrobat or vice-versa.
A redaction annotation can be created like this:
PdfFixedDocument document = new PdfFixedDocument("sample.pdf");

// Create a visual appearance for the redacted area, this graphic will be used after the content is redacted.
PdfFormXObject redactionAppearance = new PdfFormXObject(250, 150);
redactionAppearance.Graphics.DrawRectangle(new PdfBrush(PdfRgbColor.LightGreen),
    0, 0, redactionAppearance.Width, redactionAppearance.Height);

PdfStringAppearanceOptions sao = new PdfStringAppearanceOptions();
sao.Brush = new PdfBrush(PdfRgbColor.DarkRed);
sao.Font = new PdfStandardFont(PdfStandardFontFace.HelveticaBold, 32);

PdfStringLayoutOptions slo = new PdfStringLayoutOptions();
slo.Width = redactionAppearance.Width;
slo.Height = redactionAppearance.Height;
slo.X = 0;
slo.Y = 0;
slo.HorizontalAlign = PdfStringHorizontalAlign.Center;
slo.VerticalAlign = PdfStringVerticalAlign.Middle;
redactionAppearance.Graphics.DrawString("This content has been redacted", sao, slo);

// Create the redaction annotation and add it to first page
PdfRedactionAnnotation redactionAnnotation = new PdfRedactionAnnotation();
document.Pages[0].Annotations.Add(redactionAnnotation);
redactionAnnotation.Author = "XFINIUM.PDF";
redactionAnnotation.BorderColor = new PdfRgbColor(192, 0, 0);
redactionAnnotation.BorderWidth = 1;
redactionAnnotation.OverlayAppearance = redactionAppearance;
redactionAnnotation.VisualRectangle = new PdfVisualRectangle(50, 100, 250, 150);

document.Save("sample_redacted.pdf");
Dim document As New PdfFixedDocument("sample.pdf")

' Create a visual appearance for the redacted area, this graphic will be used after the content is redacted.
Dim redactionAppearance As New PdfFormXObject(250, 150)
redactionAppearance.Graphics.DrawRectangle(New PdfBrush(PdfRgbColor.LightGreen), _
    0, 0, redactionAppearance.Width, redactionAppearance.Height)

Dim sao As New PdfStringAppearanceOptions()
sao.Brush = New PdfBrush(PdfRgbColor.DarkRed)
sao.Font = New PdfStandardFont(PdfStandardFontFace.HelveticaBold, 32)

Dim slo As New PdfStringLayoutOptions()
slo.Width = redactionAppearance.Width
slo.Height = redactionAppearance.Height
slo.X = 0
slo.Y = 0
slo.HorizontalAlign = PdfStringHorizontalAlign.Center
slo.VerticalAlign = PdfStringVerticalAlign.Middle
redactionAppearance.Graphics.DrawString("This content has been redacted", sao, slo)

' Create the redaction annotation and add it to first page
Dim redactionAnnotation As New PdfRedactionAnnotation()
document.Pages(0).Annotations.Add(redactionAnnotation)
redactionAnnotation.Author = "XFINIUM.PDF"
redactionAnnotation.BorderColor = New PdfRgbColor(192, 0, 0)
redactionAnnotation.BorderWidth = 1
redactionAnnotation.OverlayAppearance = redactionAppearance
redactionAnnotation.VisualRectangle = New PdfVisualRectangle(50, 100, 250, 150)

document.Save("sample_redacted.pdf")
Many people believe that drawing an opaque rectangle over text will cause that text to be redacted. This is not true: the text is still there, it is just covered. Many PDF editors let you remove graphic objects from the PDF file, so this overlay rectangle can be removed. Encrypting the file does not help very much. Also, because the text is still in the document, any PDF viewer will let you select and copy it.
XFINIUM.PDF does not just mask the content with an opaque rectangle; it truly removes the content from the PDF file.
Hi, I'm trying to replace a snippet of text in a PDF. My idea is:
– Search the text to replace.
– Redact searched text area. Works fine and the text is removed.
– DrawString with new text at the redacted area. In this step something goes wrong and all the page content disappears.
My code looks like this:
PdfContentRedactor crText = new PdfContentRedactor(page);
for (int i = 0; i < searchResults.Count; i++)
{
    // Compute the bounding box of all text fragments in this search result.
    double minX = double.MaxValue;
    double minY = double.MaxValue;
    double maxX = double.MinValue;
    double maxY = double.MinValue;
    PdfTextFragmentCollection tfc = searchResults[i].TextFragments;
    for (int j = 0; j < tfc.Count; j++)
    {
        // The loop bound was lost in the posted snippet; this assumes FragmentCorners is an array.
        for (int frag = 0; frag < tfc[j].FragmentCorners.Length; frag++)
        {
            if (minX > tfc[j].FragmentCorners[frag].X)
                minX = tfc[j].FragmentCorners[frag].X;
            if (maxX < tfc[j].FragmentCorners[frag].X)
                maxX = tfc[j].FragmentCorners[frag].X;
            if (minY > tfc[j].FragmentCorners[frag].Y)
                minY = tfc[j].FragmentCorners[frag].Y;
            if (maxY < tfc[j].FragmentCorners[frag].Y)
                maxY = tfc[j].FragmentCorners[frag].Y;
        }
    }

    // Remove the found text.
    crText.RedactArea(new PdfVisualRectangle(minX, minY, maxX - minX, maxY - minY));

    // Draw the replacement text in the redacted area.
    PdfStringAppearanceOptions sao = new PdfStringAppearanceOptions();
    sao.Brush = brush;
    sao.Font = font;
    PdfStringLayoutOptions slo = new PdfStringLayoutOptions();
    slo.HorizontalAlign = PdfStringHorizontalAlign.Center;
    slo.VerticalAlign = PdfStringVerticalAlign.Bottom;
    slo.X = minX;
    slo.Y = maxY;
    slo.Width = maxX - minX;
    page.Graphics.DrawString("New text", sao, slo);
}
Any ideas?
Thanks in advance.
Can you send us (support@xfiniumpdf.com) the source PDF file and the text you want to replace? It will help us investigate this problem.
Thanks for your reply.
Yesterday I finally found the problem. I was generating the PDF and doing the redaction on the same PDF file. That fails.
If I open a previously generated PDF file, redact it and then DrawString the new text, it works fine.
However, is there a better solution for this? That is, open a PDF template with some fields that I have to replace or fill, like name, phone number, etc. I tried form fields, but that isn't an option because the PDF template is generated with Microsoft Word and I have no idea how to include these form fields.
Thank you very much!
The best solution for this scenario is to use form fields. Design the template in Microsoft Word, save it as PDF, then load it with Adobe Acrobat and add the fields. If Adobe Acrobat is not available you can use your code above but instead of drawing text over the redacted zone you create a textbox form field at that location. You run the code only once to create the initial template. Then at application runtime you use the template and fill the form fields with actual data.