chenglong

How to Use PDF.js to Highlight Text Programmatically | PSPDFKit

In this tutorial, you’ll learn how to work with highlight annotations in a PDF document programmatically using PDF.js. Highlight annotations are useful for emphasizing important text or passages within a PDF file. In the first part, we’ll walk you through the steps to integrate PDF.js into your project and show how to display the highlight annotations of a loaded PDF. In the second part, we’ll show you how to add highlight annotations programmatically using PSPDFKit for Web.
PDF.js is primarily designed as a PDF viewer, and its manipulation capabilities may be limited compared to those of dedicated PDF editing software.

Downloading the PDF.js Library

To get started with PDF.js, download the library as a ZIP file, or clone the repository using Git.
Extract the ZIP file and copy the pdf.js and pdf.worker.js files from the build/ folder to your project directory.
In your HTML file, add the following script tag to load the PDF.js library:
<!DOCTYPE html>
<html>
  <head>
    <script src="./pdf.js"></script>
  </head>
  <!-- Rest of your HTML code -->
</html>

Loading and Rendering PDF Documents

To get started, you need to load the PDF document and render it in a specified container. For this tutorial, you’ll use a canvas element to display the PDF page.
Make sure you replace annotation.pdf with the actual URL or path to your PDF document. You can use this demo document as an example:
<body>
  <!-- Canvas to place the PDF -->
  <canvas id="canvas"></canvas>
  <script>
    // Get the canvas element.
    const canvas = document.getElementById('canvas');

    // Get the PDF file URL.
    const pdfUrl = 'annotation.pdf'; // Replace with the URL of your PDF document

    // Configure the PDF.js worker source.
    pdfjsLib.GlobalWorkerOptions.workerSrc = './pdf.worker.js';

    // Load the PDF file using PDF.js.
    pdfjsLib.getDocument(pdfUrl).promise.then(function (pdfDoc) {
      // Get the first page of the PDF file.
      pdfDoc.getPage(1).then(function (page) {
        const viewport = page.getViewport({ scale: 1 });

        // Set the canvas dimensions to match the PDF page size.
        canvas.width = viewport.width;
        canvas.height = viewport.height;

        // Set the canvas rendering context.
        const ctx = canvas.getContext('2d');
        const renderContext = {
          canvasContext: ctx,
          viewport: viewport,
        };

        // Render the PDF page to the canvas.
        page.render(renderContext).promise.then(function () {
          console.log('Rendering complete');

          // Call the function to render highlight annotations after the PDF page is rendered.
          renderHighlightAnnotations(page);
        });
      });
    });
  </script>
</body>
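On high-density displays, a canvas sized exactly to the viewport can look blurry. A common remedy is to give the canvas more backing pixels than CSS pixels and pass a matching transform to page.render(). The following is a minimal sketch of that calculation, not part of the original tutorial code. The helper name computeCanvasLayout and the shape of its return value are assumptions made for illustration; it only needs an object with width and height, such as the viewport returned by page.getViewport().

```javascript
// Hypothetical helper: work out canvas backing-store size, CSS size, and the
// render transform for a given output scale (for example, devicePixelRatio).
function computeCanvasLayout(viewport, outputScale = 1) {
  return {
    // Backing-store size in device pixels.
    width: Math.floor(viewport.width * outputScale),
    height: Math.floor(viewport.height * outputScale),
    // Displayed size in CSS pixels, unchanged by the output scale.
    cssWidth: Math.floor(viewport.width) + 'px',
    cssHeight: Math.floor(viewport.height) + 'px',
    // Transform to pass to page.render(); null when no scaling is needed.
    transform: outputScale !== 1 ? [outputScale, 0, 0, outputScale, 0, 0] : null,
  };
}

// Example with a US Letter page (612 x 792 points) on a 2x display.
const layout = computeCanvasLayout({ width: 612, height: 792 }, 2);
console.log(layout);
```

In the browser, you would typically call this with window.devicePixelRatio || 1, assign width and height to the canvas, assign cssWidth and cssHeight to canvas.style, and add transform to the render context alongside canvasContext and viewport.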

Adding Highlight Annotations Programmatically

Use the getAnnotations() method to retrieve existing annotations from the PDF page. For this tutorial, you’ll focus on highlight annotations specifically:
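function renderHighlightAnnotations(page) {
  // Use the same scale that was used to render the page.
  const viewport = page.getViewport({ scale: 1 });
  const canvas = document.getElementById('canvas');

  page.getAnnotations().then(function (annotations) {
    annotations.forEach(function (annotation) {
      if (annotation.subtype === 'Highlight') {
        // Convert from PDF coordinates (origin at the bottom left) to
        // viewport coordinates (origin at the top left).
        const [x1, y1, x2, y2] = viewport.convertToViewportRectangle(annotation.rect);

        const highlight = document.createElement('div');
        highlight.style.position = 'absolute';
        // Offset by the canvas position so the overlay lines up with the page.
        highlight.style.left = canvas.offsetLeft + Math.min(x1, x2) + 'px';
        highlight.style.top = canvas.offsetTop + Math.min(y1, y2) + 'px';
        highlight.style.width = Math.abs(x2 - x1) + 'px';
        highlight.style.height = Math.abs(y2 - y1) + 'px';
        highlight.style.backgroundColor = 'yellow';
        highlight.style.opacity = '0.5';
        document.body.appendChild(highlight);
      }
    });
  });
}
The renderHighlightAnnotations(page) function retrieves all annotations from the specified page. For each annotation whose subtype is 'Highlight', it converts the annotation’s rectangle from PDF coordinates, whose origin is the bottom-left corner of the page, to viewport coordinates, whose origin is the top-left corner. It then creates a yellow div element positioned and sized to match, offset by the canvas position so the overlay sits on top of the rendered page. The highlight is then added to the document body.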
After executing these steps, the highlight annotations present in the PDF document will be displayed as yellow rectangles with 50 percent opacity on top of the PDF pages.
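Annotation rectangles in a PDF are expressed in PDF user space, where the origin is the bottom-left corner of the page and the y-axis points up, whereas CSS positions elements from the top-left corner with the y-axis pointing down. The sketch below shows the arithmetic behind that conversion. The function name pdfRectToCssBox is hypothetical, and the code assumes an unrotated page whose visible area starts at (0, 0); for rotated or cropped pages, the viewport’s own conversion methods are the safer choice.

```javascript
// Hypothetical helper: convert a PDF rectangle [x1, y1, x2, y2] (origin at the
// bottom-left of the page, y pointing up) into a CSS box (origin at the
// top-left, y pointing down). Assumes an unrotated page with no crop offset.
function pdfRectToCssBox(rect, pageHeight, scale = 1) {
  const [x1, y1, x2, y2] = rect;
  return {
    left: Math.min(x1, x2) * scale,
    // The top edge in CSS is measured down from the top of the page.
    top: (pageHeight - Math.max(y1, y2)) * scale,
    width: Math.abs(x2 - x1) * scale,
    height: Math.abs(y2 - y1) * scale,
  };
}

// Example: a 200 x 12 point highlight near the top of a 792-point-tall page.
const box = pdfRectToCssBox([72, 700, 272, 712], 792);
console.log(box);
```

Here, pageHeight is the page height in PDF points, which for an unrotated page matches the height of the viewport at scale 1.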

PSPDFKit for Web

We at PSPDFKit work on the next generation of PDF viewers for the web. We offer a commercial JavaScript PDF viewer library that can easily be integrated into your web application. PSPDFKit for Web offers 30+ features, enabling users to view, annotate, edit, and sign PDFs directly within the browser.

Requirements

  • Node.js installed on your computer.
  • A code editor of your choice.
  • A package manager compatible with npm.

Adding PSPDFKit to Your Project

  1. Install the pspdfkit package from npm. If you prefer, you can also download PSPDFKit for Web manually:
npm install pspdfkit
  2. Next, copy the directory containing all the required library files (artifacts) to your project’s assets folder using the following command:
cp -R ./node_modules/pspdfkit/dist/ ./assets/
Ensure your assets directory contains the pspdfkit.js file and a pspdfkit-lib directory with the library assets.

Integrating into Your Project

Once you’ve added PSPDFKit to your project, you can integrate it into your HTML and JavaScript code.
  1. Add the PDF document you want to display to your project’s directory. You can use our demo document as an example.
  2. Create an empty <div> element with a defined height where PSPDFKit will be mounted:
<div id="pspdfkit" style="height: 100vh;"></div>
  3. Include pspdfkit.js in your HTML page:
<script src="assets/pspdfkit.js"></script>
  4. Initialize PSPDFKit for Web in JavaScript by calling PSPDFKit.load():
<script>
  PSPDFKit.load({
    container: "#pspdfkit",
    document: "document.pdf" // Add the path to your document here.
  })
    .then(function(instance) {
      console.log("PSPDFKit loaded", instance);
    })
    .catch(function(error) {
      console.error(error.message);
    });
</script>

Adding Highlight Annotations Programmatically

To add highlight annotations programmatically, you’ll use the PSPDFKit.Annotations.HighlightAnnotation constructor function provided by PSPDFKit. This function enables you to create a new highlight annotation object with specific properties, such as the page index and the rectangles to highlight:
<script>
  PSPDFKit.load({
    container: "#pspdfkit",
    document: "document.pdf" // Add the path to your PDF document here.
  })
    .then(async function(instance) {
      try {
        console.log("PSPDFKit loaded", instance);

        // The rectangles to highlight, in page coordinates.
        const rects = PSPDFKit.Immutable.List([
          new PSPDFKit.Geometry.Rect({ left: 10, top: 120, width: 200, height: 10 }),
          new PSPDFKit.Geometry.Rect({ left: 10, top: 150, width: 200, height: 10 })
        ]);

        const annotation = new PSPDFKit.Annotations.HighlightAnnotation({
          pageIndex: 0,
          rects: rects,
          boundingBox: PSPDFKit.Geometry.Rect.union(rects)
        });

        await instance.create(annotation);
        console.log("Highlight annotation added successfully.");
      } catch (error) {
        console.error("Error adding highlight annotation:", error.message);
      }
    })
    .catch(function(error) {
      console.error("PSPDFKit failed to load:", error.message);
    });
</script>
The rects list defines the areas to be highlighted, the boundingBox property is the smallest rectangle that encloses all of them, and the pageIndex property specifies the page where the highlight annotation will be added. Page indexes are zero-based, so 0 refers to the first page.
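To make the role of the bounding box concrete, here’s a plain JavaScript sketch of how a union of rectangles can be computed. It isn’t PSPDFKit’s implementation: the unionRects function is a hypothetical stand-in that works on plain objects with left, top, width, and height properties, mirroring the values passed to PSPDFKit.Geometry.Rect in the example.

```javascript
// Hypothetical helper: compute the smallest rectangle enclosing all input
// rectangles. Each rectangle is a plain { left, top, width, height } object.
function unionRects(rects) {
  if (rects.length === 0) {
    throw new Error('unionRects needs at least one rectangle');
  }
  const left = Math.min(...rects.map((r) => r.left));
  const top = Math.min(...rects.map((r) => r.top));
  const right = Math.max(...rects.map((r) => r.left + r.width));
  const bottom = Math.max(...rects.map((r) => r.top + r.height));
  return { left, top, width: right - left, height: bottom - top };
}

// The same two rectangles used in the highlight annotation example.
const boundingBox = unionRects([
  { left: 10, top: 120, width: 200, height: 10 },
  { left: 10, top: 150, width: 200, height: 10 },
]);
console.log(boundingBox);
```

For the two rectangles in the example, the enclosing box starts at the first rectangle’s top edge and ends at the second rectangle’s bottom edge, so it also covers the unhighlighted gap between the two lines.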

Conclusion

In this tutorial, you explored two methods to programmatically add highlight annotations to a PDF document. First, you learned how to use PDF.js to retrieve existing highlight annotations from a PDF, offering a lightweight solution for accessing annotations in a web application. Next, you learned about PSPDFKit for Web, a commercial JavaScript PDF viewer library that provides advanced PDF editing capabilities, and you saw firsthand how to programmatically add highlight annotations using this powerful tool.
To see a list of all web frameworks, you can contact our Sales team. Or, launch our demo to see our viewer in action.

Copyright © 2024 chenglong
