Hugoware

The product of a web developer with a little too much caffeine

iTextSharp – Simplify Your HTML to PDF Creation

Update: A second version that works in .NET 2.0 has been included at the end of this post.

If you’ve ever had to generate PDFs on the fly then you may have run into iTextSharp, which is a .NET port of the Java library iText. It isn’t as straightforward as some people might like, but it is certainly a powerful tool once you figure it out.

Normally, if you wanted to take some HTML and turn it into a PDF you would write a lot of extra code to recreate the document in PDF format, but lucky for all of us, iTextSharp also supports HTML. Again, it isn’t exactly straightforward, so I wrote a quick wrapper today to help with the process.

HtmlToPdfBuilder allows you to build a PDF using HTML and hides the complexities of working with iTextSharp. You still need iTextSharp to get this project to run, so make sure to include it.

To start, create a new HtmlToPdfBuilder object. As part of the constructor you’ll need to set the document size. The size is actually a Rectangle, but iTextSharp already has predefined sizes available as constants in the PageSize class.

//Page sizes are found in iTextSharp.text.PageSize
HtmlToPdfBuilder builder = new HtmlToPdfBuilder(PageSize.LETTER);

After you have the builder you can add as many pages as you would like using the .AddPage() method. You can also access each of the pages by their index on the builder you create.

HtmlPdfPage first = builder.AddPage();
//also found at builder[0]

HtmlPdfPage second = builder.AddPage();

Once you’ve added your pages you can start adding your HTML with .AppendHtml().

first.AppendHtml("<h1>Hello World</h1>");

//you can also use params for formatting
second.AppendHtml("<h1>{0}</h1><span>{1}</span>", "Hello Second Page", "Another Param");

Next, you’re going to want to apply some styles to your PDF document, and there are a couple of methods for that. First, .AddStyle() lets you add a single style to your document. The first parameter is the selector, such as "H1" or ".totals", and the second parameter is a single line of CSS such as "color:#F00;font-weight:bold;".

There is also a method called .ImportStylesheet() that accepts an absolute path (not relative) to a stylesheet and adds all of the styles it finds. I was pretty pleased with the method because I was able to do the entire thing with a single Regular Expression.

//add individual styles
builder.AddStyle("H1", "color:#F00");
builder.AddStyle("p", "font-weight:bold;text-decoration:underline;");

//import an entire sheet
builder.ImportStylesheet("c:\\stylesheets\\pdf.css");
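
For anyone curious how a stylesheet can be split into selector and style pairs with one Regular Expression, here is a minimal sketch. It is not the expression used inside HtmlToPdfBuilder; the CssRuleSketch class and its Parse method are made-up names for illustration, and the pattern assumes a simple stylesheet with no comments and no nested blocks such as @media.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of HtmlToPdfBuilder.
public static class CssRuleSketch
{
    // One expression captures "selector { declarations }" pairs.
    // Assumption: no comments and no nested blocks in the stylesheet.
    private static readonly Regex Rule = new Regex(
        @"(?<selector>[^{}]+)\{(?<styles>[^{}]*)\}");

    public static List<KeyValuePair<string, string>> Parse(string css)
    {
        List<KeyValuePair<string, string>> rules =
            new List<KeyValuePair<string, string>>();

        foreach (Match match in Rule.Matches(css))
        {
            // Trim the whitespace left around the selector and its declarations
            string selector = match.Groups["selector"].Value.Trim();
            string styles = match.Groups["styles"].Value.Trim();
            rules.Add(new KeyValuePair<string, string>(selector, styles));
        }

        return rules;
    }
}
```

Each pair could then be handed to .AddStyle(), with the Key as the selector and the Value as the line of CSS.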

It’s worth mentioning that all of my efforts to set heights, widths, paddings and margins didn’t go so well. I’m not sure what rules iTextSharp follows for those properties, so be warned.

Finally, you’re ready to save your document. Use the .RenderPdf() method to get the bytes of your PDF.

byte[] file = builder.RenderPdf();
File.WriteAllBytes("c:\\output\\final.pdf", file);

If you’ve worked with iTextSharp before, or you want a little more control over the rendering process, the builder has two events named BeforeRender and AfterRender that give you access to the iTextSharp classes PdfWriter and Document.
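
As a rough sketch of how one of those events might be used, the example below sets some document metadata before rendering. The handler signature (a PdfWriter followed by a Document) is an assumption based on the description above, so check the delegate declared in HtmlToPdfBuilder.cs before relying on it. The RenderEventSketch class name is made up, and you may need a using directive for whatever namespace HtmlToPdfBuilder lives in.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

// Hypothetical console program for illustration only.
public static class RenderEventSketch
{
    public static void Main()
    {
        HtmlToPdfBuilder builder = new HtmlToPdfBuilder(PageSize.LETTER);
        builder.AddPage().AppendHtml("<h1>Hello World</h1>");

        // Assumption: the event hands the handler the PdfWriter and the Document.
        builder.BeforeRender += delegate(PdfWriter writer, Document document)
        {
            // AddTitle and AddAuthor are standard iTextSharp Document methods
            document.AddTitle("Hello World");
            document.AddAuthor("hugoware");
        };

        byte[] file = builder.RenderPdf();
        File.WriteAllBytes("c:\\output\\final.pdf", file);
    }
}
```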

Hopefully, this helps simplify working with HTML and PDFs with iTextSharp. Share your thoughts or suggestions!

Source Code

Download HtmlToPdfBuilder.cs

Below is a test version for .NET 2.0. Please let me know if you discover any bugs. Since I am using Visual Studio 2008, I’m not always aware when code won’t work in previous versions.

Download HtmlToPdfBuilder.cs (for .NET 2.0)

Don’t forget, you still need to download iTextSharp and add it as a reference in your project.

Written by hugoware

May 8, 2009 at 3:04 am