In this tutorial, we will display a PDF file inside an HTML canvas element and make the text on the page selectable.
1. Create a new folder called testpdfjsselection (the “root folder” from here on).
2. Inside this folder, run git clone https://github.com/mozilla/pdf.js.git to download the latest version of the PDF.js library.
3. Put any PDF file you want inside the root folder (the example below uses “oasis.pdf”).
4. In the root folder create a file called index.html with the following code:
<!DOCTYPE html>
<meta charset="utf-8">
<link rel="stylesheet" href="pdf.js/web/text_layer_builder.css" />
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.4/jquery.min.js"></script>
<script src="pdf.js/web/ui_utils.js"></script>
<script src="pdf.js/web/text_layer_builder.js"></script>
<script src="https://mozilla.github.io/pdf.js/build/pdf.js"></script>
<body>
<div>
    <canvas id="the-canvas" style="border:1px solid black;"></canvas>
    <div id="text-layer" class="textLayer"></div>
</div>
<script>
PDFJS.getDocument("oasis.pdf").then(function(pdf){
    var page_num = 1;
    pdf.getPage(page_num).then(function(page){
        var scale = 1.5;
        var viewport = page.getViewport(scale);
        var canvas = $('#the-canvas')[0];
        var context = canvas.getContext('2d');
        canvas.height = viewport.height;
        canvas.width = viewport.width;

        var canvasOffset = $(canvas).offset();
        var $textLayerDiv = $('#text-layer').css({
            height : viewport.height+'px',
            width : viewport.width+'px',
            top : canvasOffset.top,
            left : canvasOffset.left
        });

        page.render({
            canvasContext : context,
            viewport : viewport
        });

        page.getTextContent().then(function(textContent){
            console.log( textContent );
            var textLayer = new TextLayerBuilder({
                textLayerDiv : $textLayerDiv.get(0),
                pageIndex : page_num - 1,
                viewport : viewport
            });

            textLayer.setTextContent(textContent);
            textLayer.render();
        });
    });
});
</script>
</body>
</html>
5. Inside the root folder start a local php server via php -S localhost:8080.
6. Open http://localhost:8080 in your browser.
You should see the following output:
You should see the first page of your PDF document displayed in the canvas element. The text is also selectable, because absolutely positioned div tags containing plain text are laid over each text string. From here you could add an annotation feature using the fabric.js library, or create a neat flipbook using turn.js. You could then convert all these HTML pieces into a single PDF file using jsPDF.
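To make the idea of absolutely positioned text divs more concrete, here is a minimal sketch of how the position of one such div could be derived from a text item. It is not the code PDF.js itself runs: textDivStyle is a hypothetical helper, and it assumes an unrotated page and unrotated text, with pageHeight given in PDF units. The str and transform fields are the ones you can see on each entry of textContent.items in the console output.

```javascript
// Simplified sketch: where an absolutely positioned text div would go for
// one text item. Assumes an unrotated page with its origin at the
// bottom-left corner. textDivStyle is a hypothetical helper, not PDF.js API.
function textDivStyle(item, pageHeight, scale) {
    // item.transform is [a, b, c, d, e, f]; e and f locate the start of the
    // text baseline in PDF units, measured from the bottom-left corner.
    var t = item.transform;
    var fontHeight = Math.sqrt(t[2] * t[2] + t[3] * t[3]) * scale;
    return {
        position : 'absolute',
        left : t[4] * scale + 'px',
        // CSS measures from the top, so flip the y axis, then move up by the
        // font height so the div sits above the baseline. This treats the
        // ascent as the full font height, which is only an approximation.
        top : (pageHeight - t[5]) * scale - fontHeight + 'px',
        fontSize : fontHeight + 'px'
    };
}

var style = textDivStyle({ str : 'Hello', transform : [12, 0, 0, 12, 72, 700] }, 792, 1.5);
console.log(style);
// { position: 'absolute', left: '108px', top: '120px', fontSize: '18px' }
```

Because each div carries the same plain text as the glyphs painted on the canvas underneath it, the browser's normal text selection works on the divs.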
Thank you, it works. Finally, a solution for text selection in PDF.js.
Hello, Vladimir. It does not work for me. I don’t understand why. Please help me.
Hi – Everything renders EXCEPT the selectable text layer. I’m also getting two console errors: “TypeError: pdfjsLib is undefined” from ui_utils.js (line 38, col 5) and “ReferenceError: TextLayerBuilder is not defined” from index.html (line 43). Any idea?
Thomas, seems like you didn’t attach these two files:
https://github.com/mozilla/pdf.js/blob/master/web/ui_utils.js
https://github.com/mozilla/pdf.js/blob/master/web/text_layer_builder.js
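A quick way to tell whether those scripts actually defined what index.html expects is to check for the globals before rendering. The sketch below uses a hypothetical helper, findMissingGlobals, which is not part of PDF.js. It assumes the scripts are supposed to expose PDFJS and TextLayerBuilder as globals, which is what the tutorial code relies on.

```javascript
// Returns the names from `required` that are not defined on `scope`.
// findMissingGlobals is a hypothetical helper, not part of PDF.js.
function findMissingGlobals(scope, required) {
    return required.filter(function (name) {
        return typeof scope[name] === 'undefined';
    });
}

// In the page this would be called with `window` as the scope. A plain
// object stands in for it here so the sketch runs outside a browser.
var fakeWindow = { PDFJS : {}, jQuery : function () {} };
var missing = findMissingGlobals(fakeWindow, ['PDFJS', 'TextLayerBuilder', 'jQuery']);
console.log(missing);
// [ 'TextLayerBuilder' ]
```

If a name shows up in the result even though its script tag is present, the file was loaded but did not define that global, which points to a version mismatch rather than a missing file.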
Hi Vladimir – Thank you for your response. I actually had those files attached. It’s still not working for me (printing the above errors into the console). And I’ve been really hard pressed to find a working example online. Would you mind telling me what version of PDF.js you are using? The latest stable version is 1.4.20 – that’s what I’m using. Maybe I should try the beta version?
I figured it out! The pdf.js library located here () is the very latest, which causes this example to break. I’ve searched high and low for a working example using the latest build and couldn’t find one (is it me, or is the documentation for PDF.js somewhat chaotic?). Anyway, I had to go and grab an earlier version of the plugin. I included it and although the text layer is a bit unaligned with the actual PDF, it works a charm.
One last question: Is there a way to only write certain words to the text layer while keeping their position/screen coordinates? I know I can do this with additional jQuery, but was hoping for a baked solution.
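One possible approach, sketched here without jQuery, is to filter the text content before it reaches the text layer. keepOnlyWords is a hypothetical helper, not a PDF.js feature. It assumes textContent has the items and styles fields seen in the console output and that each item carries its own transform, so the kept items should stay where they were. Note that it keeps whole text items that contain a word, not the isolated word.

```javascript
// Keeps only the text items whose string contains one of `words`
// (case-insensitive). keepOnlyWords is a hypothetical helper, not PDF.js API.
function keepOnlyWords(textContent, words) {
    var wanted = words.map(function (w) { return w.toLowerCase(); });
    var items = textContent.items.filter(function (item) {
        var text = item.str.toLowerCase();
        return wanted.some(function (w) { return text.indexOf(w) !== -1; });
    });
    // The kept items are passed through untouched, including their
    // transform, and the original textContent object is not modified.
    return { items : items, styles : textContent.styles };
}

// Made-up sample data in the shape of textContent.
var sample = {
    items : [
        { str : 'Welcome to the Oasis', transform : [12, 0, 0, 12, 72, 700] },
        { str : 'Chapter 1', transform : [12, 0, 0, 12, 72, 680] },
        { str : 'oasis pool', transform : [12, 0, 0, 12, 72, 660] }
    ],
    styles : {}
};
var filtered = keepOnlyWords(sample, ['oasis']);
console.log(filtered.items.length);
// 2
```

In the tutorial code, the result would be passed along as textLayer.setTextContent(keepOnlyWords(textContent, ['oasis'])) in place of the unfiltered textContent.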
@Thomas, can you provide a working example?
I’ve tried almost all versions of PDF.js from 1.1.125 to 1.4.11, as well as older versions, but keep getting the same error you had in the first place.
Unfortunately, your example is not working for me. I have the same problem as Thomas – everything renders except the selectable text layer. I’m also getting two console errors: “pdfjsLib is undefined” and “TextLayerBuilder is not defined”.
Can you please put your working example on JSFiddle? I have been looking for a working PDF.js text selection example for more than a year now.
Thanks for your time.
Daman
Damjan, are you using the source pdf.js or the compiled one?
Hi, thanks for your help, bro. Is there any way to get an HTML script for viewing a PDF file? Thank you.
Unable to select.
Doesn’t work for me. I see only a black rectangle; it doesn’t become a PDF.
No errors except [404]: GET /favicon.ico – No such file or directory.
pdf.js itself is working normally (I did a build and I can use it), but not your script.