Monday, March 24, 2014

PDF Generation in Rails

The ability to download data in PDF format is a common requirement when building web applications, and there are several ways to achieve it in Rails. We are going to look at the two major approaches to generating PDF documents: building them in Ruby with a DSL for defining and styling the document, or using a library that converts your HTML to PDF.
There are three popular gems we’ll focus on today:
  • Prawn (which uses the DSL method)
  • PDFKit (which converts HTML to PDF)
  • Wicked PDF (which also converts HTML to PDF).

HTML to PDF or Ruby Generation?

The answer usually depends on preference and project requirements. HTML to PDF can be faster to implement, especially if you already have a view that displays the content you want in your PDF; in that case, you won’t have to write much more code to generate the file. However, this method makes it harder to control the layout of the document, especially for multi-page documents, where content tends to be cut off and split between pages. With some CSS styling you can gain some control over the page breaks, but for more complicated PDF documents that span several pages and contain variable-length content, headers and footers, it is difficult to control how each page is rendered. In these cases, it might make more sense to use Prawn.
Using a library like Prawn, you have to do all the content styling and positioning on your own using Prawn’s DSL. The advantage here is more control over how things are displayed and where pages break.
We are going to generate a PDF file for the webpage shown below which contains some static text, an image and a table of some database records.
[Image: pdf_webpage_image]

Prawn

To use Prawn, include the gem in your Gemfile and run bundle install.
gem 'prawn'
Register the PDF mime type in the config/initializers/mime_types.rb file.
Mime::Type.register "application/pdf", :pdf
Now we need to set up the controller action to respond to requests for PDF format.
For my Products controller, I have an index action which I’m going to modify as shown.
class ProductsController < ApplicationController
  def index
    @products = Product.all
    respond_to do |format|
      format.html
      format.pdf do
        pdf = Prawn::Document.new
        send_data pdf.render, filename: 'report.pdf', type: 'application/pdf'
      end
    end
  end
end
The above will generate a PDF file with no content when .pdf is appended to the URL; in my case, http://localhost:3000/products.pdf.
To separate out the pdf generation code from the controller, I created an app/pdfs directory and added a new class in the file app/pdfs/report_pdf.rb.
I changed the controller code to use the new class.
class ProductsController < ApplicationController
  def index
    @products = Product.all
    respond_to do |format|
      format.html
      format.pdf do
        pdf = ReportPdf.new(@products)
        send_data pdf.render, filename: 'report.pdf', type: 'application/pdf'
      end
    end
  end
end
The code below shows how to generate a PDF of the webpage shown above. I have commented it to show what I’m doing.
class ReportPdf < Prawn::Document
  def initialize(products)
    super()
    @products = products
    header
    text_content
    table_content
  end
  def header
    # This inserts an image in the PDF file and sets the size of the image
    image "#{Rails.root}/app/assets/images/header.png", width: 530, height: 150
  end
  def text_content
    # The cursor for inserting content starts on the top left of the page. Here we move it down a little to create more space between the text and the image inserted above
    y_position = cursor - 50
    # The bounding_box takes the x and y coordinates for positioning its content and some options to style it
    bounding_box([0, y_position], :width => 270, :height => 300) do
      text "Lorem ipsum", size: 15, style: :bold
      text "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse interdum semper placerat. Aenean mattis fringilla risus ut fermentum. Fusce posuere dictum venenatis. Aliquam id tincidunt ante, eu pretium eros. Sed eget risus a nisl aliquet scelerisque sit amet id nisi. Praesent porta molestie ipsum, ac commodo erat hendrerit nec. Nullam interdum ipsum a quam euismod, at consequat libero bibendum. Nam at nulla fermentum, congue lectus ut, pulvinar nisl. Curabitur consectetur quis libero id laoreet. Fusce dictum metus et orci pretium, vel imperdiet est viverra. Morbi vitae libero in tortor mattis commodo. Ut sodales libero erat, at gravida enim rhoncus ut."
    end
    bounding_box([300, y_position], :width => 270, :height => 300) do
      text "Duis vel", size: 15, style: :bold
      text "Duis vel tortor elementum, ultrices tortor vel, accumsan dui. Nullam in dolor rutrum, gravida turpis eu, vestibulum lectus. Pellentesque aliquet dignissim justo ut fringilla. Interdum et malesuada fames ac ante ipsum primis in faucibus. Ut venenatis massa non eros venenatis aliquet. Suspendisse potenti. Mauris sed tincidunt mauris, et vulputate risus. Aliquam eget nibh at erat dignissim aliquam non et risus. Fusce mattis neque id diam pulvinar, fermentum luctus enim porttitor. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos."
    end
  end
  def table_content
    # This calls product_rows to get an array of arrays that populates the rows and columns of the table.
    # The block then styles the table: the first row is bold and is marked as a header (so it repeats on every page),
    # the row background colors alternate between grey and white, and the column widths are set explicitly.
    # Note: newer Prawn releases ship table support in the separate prawn-table gem.
    table product_rows do
      row(0).font_style = :bold
      self.header = true
      self.row_colors = ['DDDDDD', 'FFFFFF']
      self.column_widths = [40, 300, 200]
    end
  end
  def product_rows
    [['#', 'Name', 'Price']] +
      @products.map do |product|
        [product.id, product.name, product.price]
      end
  end
end
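The product_rows method above hands Prawn an array of arrays: a header row followed by one row per record. As a minimal sketch of that data shape, the snippet below builds the same structure outside Rails. The Product struct is a hypothetical stand-in for the ActiveRecord model, and formatting the price to two decimal places is an assumption added for illustration; the original method passes the raw price through.

```ruby
# Hypothetical stand-in for the ActiveRecord Product model, so this runs without Rails.
Product = Struct.new(:id, :name, :price)

# Builds the array of arrays used to fill the table:
# the header row first, then one row per product.
def product_rows(products)
  header = ['#', 'Name', 'Price']
  body = products.map do |product|
    # Assumption: show prices with two decimal places.
    [product.id, product.name, format('%.2f', product.price)]
  end
  [header] + body
end

products = [Product.new(1, 'Widget', 9.5), Product.new(2, 'Gadget', 20)]
rows = product_rows(products)
rows.each { |row| puts row.join(' | ') }
```

Keeping the row-building logic in a plain method like this makes it easy to check the table data without rendering a PDF.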
For more about the PDF formatting rules available, check out the Prawn manual.

PDFKit

For PDFKit, first include the gem in your Gemfile
gem 'pdfkit'
and run bundle install.
You can generate PDF documents by pointing to an HTML file or website as shown below.
# PDFKit.new takes the HTML and any options for wkhtmltopdf
# run `wkhtmltopdf --extended-help` for a full list of options
kit = PDFKit.new(html, :page_size => 'Letter')
kit.stylesheets << '/path/to/css/file'
# Get an inline PDF
pdf = kit.to_pdf
# Save the PDF to a file
file = kit.to_file('/path/to/save/pdf')
# PDFKit.new can optionally accept a URL or a File.
# Stylesheets cannot be added when the source is provided as a URL or File.
kit = PDFKit.new('http://google.com')
kit = PDFKit.new(File.new('/path/to/html'))
# Add any kind of option through meta tags
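PDFKit.new('<html><head><meta name="pdfkit-page_size" content="Letter"')
PDFKit.new('<html><head><meta name="pdfkit-cookie cookie_name1" content="cookie_val1"')
PDFKit.new('<html><head><meta name="pdfkit-cookie cookie_name2" content="cookie_val2"')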
You could also use a middleware solution that allows users to generate PDFs of any page on the website by appending .pdf to the end of the URL. That is what we are going to use here.
To add the middleware, include the following in the config/application.rb file (this is for Rails version 3 and above).
module RailsPdf
  class Application < Rails::Application
    config.middleware.use PDFKit::Middleware
    # ... rest of the application configuration ...
  end
end
Restart the server, navigate to a page, and add .pdf to the end of the URL. You will get a PDF version of the webpage.
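To make the idea of a ".pdf suffix" middleware more concrete, here is a toy Rack-style sketch. It is not PDFKit's implementation: the class name PdfSuffixMiddleware, the fake converter, and the lambda app are all hypothetical stand-ins, chosen so the example runs without Rails or wkhtmltopdf. It only illustrates the general shape: intercept requests ending in .pdf, ask the wrapped app for the HTML page, and return the converted result with a PDF content type.

```ruby
# Toy Rack-style middleware (illustrative only, not PDFKit's code).
class PdfSuffixMiddleware
  def initialize(app, converter)
    @app = app
    @converter = converter # hypothetical callable: HTML string -> PDF bytes
  end

  def call(env)
    path = env['PATH_INFO'].to_s
    # Anything that does not end in .pdf passes straight through.
    return @app.call(env) unless path.end_with?('.pdf')

    # Ask the wrapped app for the HTML version of the same page.
    html_env = env.merge('PATH_INFO' => path.sub(/\.pdf\z/, ''))
    status, headers, body = @app.call(html_env)
    return [status, headers, body] unless status == 200

    html = +''
    body.each { |chunk| html << chunk }
    pdf = @converter.call(html)
    [200, { 'content-type' => 'application/pdf', 'content-length' => pdf.bytesize.to_s }, [pdf]]
  end
end

# Stand-ins so the sketch runs on its own.
app = ->(env) { [200, { 'content-type' => 'text/html' }, ["<h1>#{env['PATH_INFO']}</h1>"]] }
fake_converter = ->(html) { "%PDF-fake #{html}" }
stack = PdfSuffixMiddleware.new(app, fake_converter)

status, headers, body = stack.call('PATH_INFO' => '/products.pdf')
puts status
puts headers['content-type']
puts body.first
```

In a real application the converter role is played by wkhtmltopdf, which is why the binary must be installed for the middleware to work.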
It’s also possible to use a link to download the page as a PDF file. In haml:
= link_to 'Download Report', products_path(format: 'pdf')
To exclude the link from the PDF file, add an id or class name to the tag and set its display property to none in a print stylesheet.
@media print {
  .pdf_exclude {
    display: none;
  }
}

Some Things To Note

I

If wkhtmltopdf is not installed on your system, you will get an error similar to the one shown below.
PDFKit::NoExecutableError in ProductsController#index
No wkhtmltopdf executable found at >> Please install wkhtmltopdf - https://github.com/pdfkit/PDFKit/wiki/Installing-WKHTMLTOPDF
For instructions on how to install wkhtmltopdf check out this wiki page.
Another option for installing the wkhtmltopdf binaries is the wkhtmltopdf-binary gem. Add it to your Gemfile and run bundle install.
gem 'wkhtmltopdf-binary'
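Before wiring anything up, it can help to check whether a wkhtmltopdf executable is reachable at all. The following is a small sketch using only the Ruby standard library; the find_executable helper is a hypothetical name of my own and is not part of PDFKit.

```ruby
# Returns the full path of the first executable called `name` found in the
# given PATH-style string, or nil when there is none.
def find_executable(name, path_env = ENV.fetch('PATH', ''))
  path_env.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

path = find_executable('wkhtmltopdf')
if path
  puts "wkhtmltopdf found at #{path}"
else
  puts 'wkhtmltopdf not found; install it or add the wkhtmltopdf-binary gem'
end
```

If the binary lives somewhere outside your PATH, PDFKit's README describes a PDFKit.configure block with a config.wkhtmltopdf setting for pointing at it explicitly.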

II

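If content is getting cut off where you don’t want it to be, for example a table being split across two pages, you can force a page break before the table is rendered so that it starts on its own page. Add a class such as page-break to the element that should start the new page and style it as follows.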
@media print {
  .page-break {
    display: block;
    page-break-before: always;
  }
}

III

wkhtmltopdf doesn’t play well with relative URLs for external files (images, stylesheets, JavaScript) that you might be using. If you use the regular tags like stylesheet_link_tag and try to generate a document, wkhtmltopdf will hang when loading the assets. Using absolute paths (either file paths or URLs including the domain) for assets solves this problem.
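As a rough illustration of what "absolute paths for assets" means in practice, the sketch below rewrites root-relative src and href attributes so they include the scheme and host. The absolutize_asset_urls helper and the localhost base URL are assumptions made for this example; it is a simple regex-based pass, not a full HTML parser, and not something either gem exposes under this name.

```ruby
require 'uri'

# Rewrites root-relative src/href attributes (e.g. "/assets/app.css") so they
# include the scheme and host. Absolute and protocol-relative URLs are left alone.
def absolutize_asset_urls(html, base_url)
  html.gsub(%r{\b(src|href)=(["'])(/(?!/)[^"']*)\2}) do
    attr_name, quote, path = Regexp.last_match.captures
    "#{attr_name}=#{quote}#{URI.join(base_url, path)}#{quote}"
  end
end

html = '<link href="/assets/application.css" rel="stylesheet"><img src="/assets/header.png">'
puts absolutize_asset_urls(html, 'http://localhost:3000')
```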
Another possible solution is to inline the styles. Instead of calling stylesheet_link_tag, embed the stylesheet’s contents directly in a style tag in the document head.
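A minimal sketch of the inline-styles idea, assuming the compiled CSS is available as a file on disk: read the file and wrap its contents in a style tag. The inline_stylesheet helper is a hypothetical name, and the temporary file only exists to keep the example self-contained.

```ruby
require 'tempfile'

# Reads a CSS file and wraps its contents in a <style> tag.
def inline_stylesheet(css_path)
  css = File.read(css_path)
  "<style type=\"text/css\">\n#{css}\n</style>"
end

# A temporary file stands in for the application's real stylesheet.
Tempfile.create(['report', '.css']) do |file|
  file.write('h3 { color: #333; }')
  file.flush
  puts inline_stylesheet(file.path)
end
```

In a Rails view the returned string would also need to be marked as HTML-safe (for example with raw), otherwise it is escaped.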

IV

For any content that appears on the webpage but that you don’t want displayed in the PDF document (e.g. the ‘Download Report’ link in the above example), just mark it up with a class and hide it with CSS.
@media print {
    .hide_in_pdf {
        display:none;
    }
}

Wicked PDF

To use Wicked PDF, first install wkhtmltopdf. Alternatively, you could use the wkhtmltopdf-binary gem by including it in your Gemfile.
Add the wicked_pdf gem to your Gemfile and run bundle install.
gem 'wicked_pdf'
Register the PDF mime type in config/initializers/mime_types.rb
Mime::Type.register "application/pdf", :pdf
Like PDFKit, Wicked PDF comes with a middleware that allows users to get a PDF view of any page on your site by appending .pdf to the URL. This is achieved by adding config.middleware.use WickedPdf::Middleware to the config/application.rb file. I won’t be using the middleware here. Instead, I will create a template and a layout file for the PDF, and modify my controller action to handle PDF-format requests.
I modified the ProductsController as shown. Here, I am specifying the name of the PDF file and the layout used to generate it. For more information on the available options you can use, look at the README file.
class ProductsController < ApplicationController
  def index
    @products = Product.all
    respond_to do |format|
      format.html
      format.pdf do
        render :pdf => "report", :layout => 'pdf.html.haml'
      end
    end
  end
end
Here is the layout file app/views/layouts/pdf.html.haml.
!!!
%html
  %head
    %title RailsWickedPdf
    = wicked_pdf_stylesheet_link_tag    "application", :media => "all"
    = wicked_pdf_javascript_include_tag "application"
    = csrf_meta_tags
  %body
    = yield
And the template file app/views/products/index.pdf.haml.
.container
  .row
    = wicked_pdf_image_tag('header.png')
  .row
    .col-xs-6
      %h3
        Lorem ipsum
      %p
        Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse interdum semper placerat. Aenean mattis fringilla risus ut fermentum. Fusce posuere dictum venenatis. Aliquam id tincidunt ante, eu pretium eros. Sed eget risus a nisl aliquet scelerisque sit amet id nisi. Praesent porta molestie ipsum, ac commodo erat hendrerit nec. Nullam interdum ipsum a quam euismod, at consequat libero bibendum. Nam at nulla fermentum, congue lectus ut, pulvinar nisl. Curabitur consectetur quis libero id laoreet. Fusce dictum metus et orci pretium, vel imperdiet est viverra. Morbi vitae libero in tortor mattis commodo. Ut sodales libero erat, at gravida enim rhoncus ut.
    .col-xs-6
      %h3
        Duis vel
      %p
        Duis vel tortor elementum, ultrices tortor vel, accumsan dui. Nullam in dolor rutrum, gravida turpis eu, vestibulum lectus. Pellentesque aliquet dignissim justo ut fringilla. Interdum et malesuada fames ac ante ipsum primis in faucibus. Ut venenatis massa non eros venenatis aliquet. Suspendisse potenti. Mauris sed tincidunt mauris, et vulputate risus. Aliquam eget nibh at erat dignissim aliquam non et risus. Fusce mattis neque id diam pulvinar, fermentum luctus enim porttitor. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos.
  .row
    %table.table.table-striped
      %thead
        %tr
          %th #
          %th Product
          %th Price
      %tbody
        - @products.each do |product|
          %tr
            %td
              = product.id
            %td
              = product.name
            %td
              = product.price
You’ll notice that in the above two files I’m using the following Wicked PDF helpers.
wicked_pdf_stylesheet_link_tag
wicked_pdf_javascript_include_tag
wicked_pdf_image_tag
These are necessary if you are using external files. wkhtmltopdf is run outside of the Rails application, therefore any file you link to must be included with an absolute address.
Using the regular stylesheet_link_tag and javascript_include_tag tags will cause the application to hang when a PDF is requested. Also, the image will not be rendered if the regular img tag is used.

Conclusion

We have looked at three different approaches to PDF generation. Deciding which to use depends on a variety of factors, including your preferred language (HTML or DSL), the ease of accomplishing a task with one library over another, and the complexity of the PDF document’s format, among other considerations. I hope this article helps you make that decision.
