If you look at the flow of the `eachWord` function, you'll notice that we add the word to the buffer if the width of the word `w` is less than the space available.
There's an edge case where `linebreaker` says that a line break is required on a word (because of a manual break: `theword\n`) and the width of the word being considered is greater than `spaceLeft`. In the current pdfkit, if this is the case:
* we do not append the word to the buffer
* we emit the line
* we break and move on to the next line, starting with a clean line
This omits the word entirely.
This commit makes it so that we only break the line and start with a clean line if there was already room to write the word to the buffer. Otherwise, we emit the current line, append the word to the buffer, and then go to a new line.
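The before and after behavior can be illustrated with a simplified model. This is only a sketch, not pdfkit's `eachWord`: the `wrap` function, the `{ text, required }` word shape, the `fixed` flag, and the one-unit-per-character width are all assumptions made for illustration.

```javascript
// Simplified model of the wrapping loop; NOT pdfkit's actual code.
// Each word is { text, required }, where `required` marks a manual break
// after the word. Width is assumed to be one unit per character.
function wrap(words, lineWidth, fixed) {
  const lines = [];
  let buffer = '';
  let spaceLeft = lineWidth;

  // Push the buffered line and reset to a clean line.
  const emit = () => {
    lines.push(buffer.trimEnd());
    buffer = '';
    spaceLeft = lineWidth;
  };

  for (const { text, required } of words) {
    const w = text.length;
    const fits = w <= spaceLeft;

    if (fits) {
      buffer += text;
      spaceLeft -= w;
    }

    if (required || !fits) {
      if (!fits && required && !fixed) {
        // Old behavior: emit and start clean, so the word is lost.
        emit();
        continue;
      }
      if (!fits) {
        // Word did not fit: emit the current line, then carry the word over.
        emit();
        buffer = text;
        spaceLeft -= w;
      }
      if (required) emit(); // Manual break: end the line after the word.
    }
  }
  if (buffer) emit();
  return lines;
}

const words = [
  { text: 'hello ', required: false },
  { text: 'world', required: true }, // "world\n" and too wide for the line
  { text: 'next', required: false },
];
console.log(wrap(words, 8, false)); // old behavior drops "world"
console.log(wrap(words, 8, true));  // fixed behavior keeps it
```

With `fixed` set to `false` the model reproduces the dropped word; with `true` the word is carried to its own line before the manual break is applied.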
No longer starts a new page once a user-specified height is reached, but still does so if no height is given (the page edge is the default limit). This works with multiple columns too: text fills the columns within the horizontal and vertical space provided and then stops, rather than going to a new page.
Also adds the `ellipsis` option which can be used to append an ellipsis character like `…` to the end of the cut off text. If you set `ellipsis: true`, it will use the default ellipsis character, but you can also set the option to any string you want to use.
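How the option value might be resolved and applied to the last line can be sketched as follows. The helper names `resolveEllipsis` and `truncateLastLine` are hypothetical, and widths are assumed to be one unit per character; real code would measure widths with the current font.

```javascript
// Sketch only: widths are assumed to be one unit per character.
function resolveEllipsis(option) {
  // `true` selects the default character; any string is used as given.
  if (option === true) return '…';
  return typeof option === 'string' ? option : '';
}

function truncateLastLine(line, lineWidth, option) {
  const ellipsis = resolveEllipsis(option);
  let text = line.trimEnd();
  // Drop trailing characters until the line plus the ellipsis fits.
  while (text.length > 0 && text.length + ellipsis.length > lineWidth) {
    text = text.slice(0, -1).trimEnd();
  }
  return text + ellipsis;
}

console.log(truncateLastLine('The quick brown fox', 10, true));
console.log(truncateLastLine('The quick brown fox', 10, '...'));
```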
Encodes each word separately instead of relying on the `Tw` (word spacing) operator. `Tw` only applies to the single-byte character code 32, which embedded fonts do not use, so it has no effect on text set in them.
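A rough sketch of the idea: encode each word on its own and place the spacing between words as a numeric adjustment in a `TJ` array. The `encodeWord` and `buildTJ` helpers are hypothetical, and the encoder simply writes each UTF-16 code unit as four hex digits, which is a stand-in for a real embedded font's glyph encoding (and only handles characters in the Basic Multilingual Plane).

```javascript
// Sketch only: `encodeWord` stands in for an embedded font's encoder and
// writes each UTF-16 code unit as four hex digits (an assumption).
function encodeWord(word) {
  return Array.from(word, (ch) =>
    ch.charCodeAt(0).toString(16).padStart(4, '0')
  ).join('');
}

// Build a TJ operation. Numbers in a TJ array are in thousandths of a
// text-space unit and are subtracted from the position, so a negative
// value moves the next word to the right.
// `gap` is the desired distance between words, in points.
function buildTJ(text, gap, fontSize) {
  const adjust = -(gap * 1000) / fontSize;
  const parts = text.trim().split(/\s+/).map((w) => `<${encodeWord(w)}>`);
  return `[${parts.join(` ${adjust} `)}] TJ`;
}

console.log(buildTJ('Hi there', 6, 12));
```

Because the gap is expressed as an explicit adjustment between words, it does not depend on which character code the font assigns to the space.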
Still to do: proper Unicode word breaking, with support for languages that do not separate words with spaces.
The following example adds text with three sections (part1, part2, part3) in different fonts and colors.
    doc.font('Helvetica', 13)
      .text(part1, 100, 300, { continued: true, indent: 50 })
      .fillColor('red')
      .text(part2, { indent: 50, continued: true })
      .font('Times-Roman', 13)
      .fillColor('green')
      .text(part3)
Should solve a huge number of issues. The regular-expression-based word matching used before was not good: it was overzealous and caused more bugs than anything else in the project. It also didn't work at all for languages like Chinese, which don't have spaces between words.
The new wrapping algorithm supports all of this. It is a separate module at http://github.com/devongovett/linebreak.
When using a PDF standard font, the encoding is set to WinAnsiEncoding.
In addition, an encoding function is used to translate from UTF-8 to
WinAnsiEncoding.
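
A minimal sketch of such a translation, working from the Unicode code points of a JavaScript string. The `toWinAnsi` name, the partial lookup table, and the `?` fallback for unmapped characters are assumptions for illustration, not the function added by this commit.

```javascript
// Partial table: Unicode code points whose WinAnsiEncoding byte differs
// from the code point itself. A full table has more entries.
const WIN_ANSI_SPECIAL = new Map([
  [0x20ac, 0x80], // euro sign
  [0x2026, 0x85], // horizontal ellipsis
  [0x2018, 0x91], [0x2019, 0x92], // single quotation marks
  [0x201c, 0x93], [0x201d, 0x94], // double quotation marks
  [0x2022, 0x95], // bullet
  [0x2013, 0x96], [0x2014, 0x97], // en dash, em dash
]);

function toWinAnsi(text) {
  return Array.from(text, (ch) => {
    const cp = ch.codePointAt(0);
    if (WIN_ANSI_SPECIAL.has(cp)) return WIN_ANSI_SPECIAL.get(cp);
    // Printable ASCII and the Latin-1 range are passed through unchanged
    // in this sketch.
    if ((cp >= 0x20 && cp <= 0x7e) || (cp >= 0xa0 && cp <= 0xff)) return cp;
    return 0x3f; // '?' as a placeholder for unmapped characters (assumption)
  });
}

console.log(toWinAnsi('€5…'));
```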
`@startY + options.height` is actually the full remainder of the page. Subtracting a line of text from it just made the text wrap one line too early, because the last line of text could never fit.
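
The off-by-one effect can be seen with simple arithmetic. This is a sketch under an assumed fit rule (a line is allowed when its bottom edge does not pass the limit); `linesThatFit` and its parameters are hypothetical names.

```javascript
// Sketch only: counts the lines that fit when a line is allowed whenever
// its bottom edge stays at or above `maxY` (an assumed fit rule).
function linesThatFit(startY, height, lineHeight, subtractLine) {
  let maxY = startY + height;
  if (subtractLine) maxY -= lineHeight; // the old adjustment
  let count = 0;
  for (let y = startY; y + lineHeight <= maxY; y += lineHeight) count++;
  return count;
}

console.log(linesThatFit(100, 60, 20, true));  // with the old subtraction
console.log(linesThatFit(100, 60, 20, false)); // using the full height
```

With a height of 60 and a line height of 20, the subtraction leaves room for only two lines even though three fit in the space provided.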