Dompdf weird characters

If weird characters ever appear in your PDF documents, the cause is most likely a character encoding issue.

This often occurs when someone copies and pastes text from Microsoft Word or a similar program. Word converts things like straight quotes into typographic ("curly") characters, which are not displayed properly unless the character encoding is kept consistent all the way through.
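As a rough illustration of keeping the encoding consistent, pasted text can be normalised to UTF-8 before it is handed to dompdf. This is only a sketch: the helper name to_utf8() is hypothetical, and the Windows-1252 fallback is an assumption about where the text came from (it is typical for Word on Windows, but your source may differ).

```php
<?php
// Hypothetical helper: make sure a string is UTF-8 before it goes into the PDF.
// The fallback encoding is an assumption, not something dompdf requires.
function to_utf8(string $text, string $fallback = 'Windows-1252'): string
{
    // Already valid UTF-8: return it unchanged.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }

    // Otherwise treat the bytes as the fallback encoding and convert them.
    return mb_convert_encoding($text, 'UTF-8', $fallback);
}

// Curly quotes as single Windows-1252 bytes (0x93 and 0x94).
$pasted = "\x93Hello\x94";
$pdf_content = to_utf8($pasted);
```

Text that is already valid UTF-8 passes through untouched, so the helper can be applied to every input without converting anything twice.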

The UTF-8 charset is most often missed in two places: the <meta> tag in the HTML <head>, and the encoding argument of the load_html() call. Often the whole <html> wrapper is left out and only the PDF content is passed. The wrapper, with its charset <meta> tag, is required so that dompdf knows the encoding of the document it is rendering.

A quick example:


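// The content to render; it must already be UTF-8 encoded.
$pdf_content = "some pdf content here";

// Wrap the content in a full HTML document that declares its charset.
$html = '<html><head><meta http-equiv="Content-Type"
content="text/html; charset=utf-8"/></head><body>' .
$pdf_content . '</body></html>';

require 'dompdf/dompdf_config.inc.php';

$dompdf = new DOMPDF();

// Pass the encoding explicitly as the second argument.
$dompdf->load_html($html, 'UTF-8');
$dompdf->render();

$pdfFile = $dompdf->output();

// Send the rendered PDF to the browser.
header('Content-Type: application/pdf');
header('Content-Transfer-Encoding: binary');
header('Expires: 0');
header('Pragma: public');
// '8bit' makes mb_strlen() count bytes rather than characters.
header('Content-Length: ' . mb_strlen($pdfFile, '8bit'));

echo $pdfFile;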