Node.js: How to Convert a Word Document to PDF | JavaScript


In this post, you will learn how to convert a docx file to a PDF document in JavaScript and Node.js.

Docx and doc are document file formats from Microsoft that contain images, text, tables, and styles. PDF is a separate format from Adobe for representing images, text, and styles.

There are a lot of online tools that convert doc to PDF. Sometimes, as a programmer, you need to convert between formats inside a JavaScript/Node.js application.

JavaScript/Node.js offers multiple ways to do the conversion using npm packages:

  • docx-pdf
  • libreoffice-convert

How to convert a Word document to PDF in a Node.js application

First, create a Node.js application from scratch by running the npm init -y command in a new folder.

B:\blog\jswork\nodework\doctopdf>npm init -y
Wrote to B:\blog\jswork\nodework\doctopdf\package.json:

{
  "name": "doctopdf",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "license": "ISC"
}

This creates a package.json file with the contents shown in the command output above.

Using the docx-pdf approach

docx-pdf is a simple library for converting a docx file to a PDF document.

First, install the docx-pdf npm library:

npm install docx-pdf --save

This adds a dependency to package.json as follows:

{
 "dependencies": {
    "docx-pdf": "0.0.1"
  }
}

In JavaScript, import docx-pdf using require (the CommonJS module syntax):

var converter = require('docx-pdf');

The converter function accepts three arguments:

  • the input file, which is the Word document
  • the output file, which is the name of the PDF document
  • a callback that receives err, holding the error when conversion fails, and result for a successful conversion

result is an object whose filename attribute holds the name and path of the PDF document.

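var converter = require('docx-pdf');

// Convert test.docx to output.pdf; the callback reports failure or success
converter('test.docx', 'output.pdf', function(err, result) {
    if (err) {
        console.log("Converting Doc to PDF failed", err);
        return; // do not fall through to the success message
    }
    console.log("Converting Doc to PDF successful", result);
});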
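The output file name is often just the input name with a different extension. As a minimal sketch, assuming a hypothetical helper named toPdfPath (not part of docx-pdf), the name can be derived with Node's built-in path module:

```javascript
const path = require('path');

// Hypothetical helper: keep the folder and base name, swap the extension for .pdf
function toPdfPath(inputFile) {
    const { dir, name } = path.parse(inputFile);
    return path.join(dir, name + '.pdf');
}

console.log(toPdfPath('test.docx')); // test.pdf
```

The result can then be passed as the second argument to the converter.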

The same code can also be written with the async and await keywords for asynchronous processing.

Async and await

This is useful for larger files.

The function below accepts the input and output file names and returns a Promise that is rejected when the conversion fails and resolved when it succeeds. The callback-based docxConverter call is wrapped inside the Promise so that the caller can use await.

    const path = require('path');
    const docxConverter = require('docx-pdf');

    // Wrap the callback-based converter in a Promise so callers can await it
    function ConvertDocToPdf(inputfile, outputfile) {
        // Resolve both file names relative to this script's folder
        const inputPath = path.join(__dirname, inputfile);
        const outputPath = path.join(__dirname, outputfile);
        return new Promise((resolve, reject) => {
            docxConverter(inputPath, outputPath, (err, result) => {
                return err ? reject(err) : resolve(result);
            });
        });
    }

Call the function with the await keyword from inside an async function, because CommonJS modules do not support top-level await:

    await ConvertDocToPdf("test.docx", "test.pdf")
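One way to arrange that call is an async wrapper function with try/catch. The sketch below is illustrative only: fakeConverter is a hypothetical stand-in with the same (input, output, callback) shape as the docx-pdf converter, so the example runs without the package installed, and convertDocToPdf and main are names chosen for this example.

```javascript
// Hypothetical stand-in for the docx-pdf converter (same callback shape),
// used here only so the sketch runs without installing the package.
function fakeConverter(input, output, callback) {
    setImmediate(() => {
        if (!input.endsWith('.docx')) {
            return callback(new Error('Unsupported input: ' + input));
        }
        callback(null, { filename: output });
    });
}

// Promise wrapper around any converter with that callback shape
function convertDocToPdf(input, output, converter = fakeConverter) {
    return new Promise((resolve, reject) => {
        converter(input, output, (err, result) => (err ? reject(err) : resolve(result)));
    });
}

// The await call lives inside an async function; errors land in catch
async function main() {
    try {
        const result = await convertDocToPdf('test.docx', 'test.pdf');
        console.log('Converted:', result.filename);
        return result;
    } catch (err) {
        console.error('Conversion failed:', err.message);
        return null;
    }
}

main();
```

In a real project, the third argument would be the function returned by require('docx-pdf') instead of the stand-in.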

It is a simple library; its main disadvantage is that it is not able to convert formatting styles.

libreoffice-convert npm package

LibreOffice is an open-source office suite for managing office documents.

libreoffice-convert is an npm package for Node.js that converts office documents, including Word documents, to other formats. It relies on LibreOffice being installed on the machine.

First, install the libreoffice-convert npm package using the npm install command:

npm install libreoffice-convert --save

Example code to convert docx to pdf using libreoffice-convert package:

const libre = require('libreoffice-convert');
const path = require('path');
const fs = require('fs');

async function ConvertDocToPdf() {
    try {
        const inputPath = path.join(__dirname, "test.docx");
        const outputPath = path.join(__dirname, "test.pdf");
        // fs.promises.readFile returns a Promise, so it can be awaited
        const docData = await fs.promises.readFile(inputPath);
        return new Promise((resolve, reject) => {
            libre.convert(docData, '.pdf', undefined, (err, done) => {
                if (err) {
                    // Stop here so no output file is written on failure
                    return reject(err);
                }
                fs.writeFileSync(outputPath, done);
                resolve("Conversion successful");
            });
        });
    } catch (err) {
        console.log("Error in input reading", err);
    }
}

Sequence of steps for the above code:

  • Import the libreoffice-convert, fs, and path modules
  • Define the function with the async keyword for asynchronous processing
  • Read the input file using the readFile method of the fs module
  • libre.convert converts the docx data to PDF
  • The conversion code is wrapped in a Promise
  • If the conversion fails, the Promise is rejected
  • If the conversion succeeds, the output PDF file is written using the writeFileSync method
  • Finally, the Promise is resolved
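The same read, convert, and write steps can also be sketched with util.promisify in place of a hand-written Promise. This is a sketch under assumptions, not the package's own recommended code: convertFile is a hypothetical name, and fakeConvert is a stand-in with the same (buffer, extension, filter, callback) shape as libre.convert, so the example runs without LibreOffice installed.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { promisify } = require('util');

// Hypothetical stand-in for libre.convert: it only prefixes the input bytes
// and does not produce a real PDF.
function fakeConvert(buffer, extension, filter, callback) {
    setImmediate(() => {
        callback(null, Buffer.concat([Buffer.from('converted:'), buffer]));
    });
}

// `convert` is any callback-style function shaped like
// convert(buffer, extension, filter, callback)
async function convertFile(convert, inputPath, outputPath) {
    const convertAsync = promisify(convert);
    const input = await fs.promises.readFile(inputPath);          // read
    const output = await convertAsync(input, '.pdf', undefined);  // convert
    await fs.promises.writeFile(outputPath, output);              // write
    return outputPath;
}

// With the real package the call would look like:
// convertFile(require('libreoffice-convert').convert, 'test.docx', 'test.pdf');
```

Passing the converter in as a parameter keeps the file handling testable without LibreOffice, and any failure in reading, converting, or writing rejects the returned Promise.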