Paged.js generates duplicate content (an example with a single text element)
First of all, thank you for an awesome library.
Now to the point: if the source HTML contains repeated text, the generated page contains duplicated content.
See attached HTML:
- Source HTML contains a single <p> element with the text "Coxswain Jack Tar hearties" repeated 10 times.
- Generated page contains the text "Coxswain Jack Tar hearties" 41 times.
After some debugging, it seems to be an issue with the chunking of the text element: https://gitlab.pagedmedia.org/tools/pagedjs/blob/master/src/chunker/layout.js#L393
I was able to work around the issue by replacing:
offset += node.textContent.indexOf(container.textContent);
with:
offset += node.textContent.length - container.textContent.length;
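To illustrate why the two expressions can differ, here is a minimal standalone sketch. It does not use Paged.js itself: `nodeText` and `remainingText` are hypothetical stand-ins for `node.textContent` and `container.textContent`, and it assumes that the container text is the trailing part of the node text, which is my reading of the code and may not hold in every case.

```javascript
// Hypothetical stand-ins for node.textContent and container.textContent.
const phrase = "Coxswain Jack Tar hearties ";
const nodeText = phrase.repeat(10);

// Assumption: the container holds the tail of the node text
// (here, the last 3 of the 10 repetitions).
const remainingText = phrase.repeat(3);

// indexOf returns the FIRST match. Because the text repeats,
// the tail also matches at the very start of the node text.
const offsetByIndexOf = nodeText.indexOf(remainingText);

// The length difference gives the position where the tail actually starts.
const offsetByLength = nodeText.length - remainingText.length;

console.log(offsetByIndexOf); // 0
console.log(offsetByLength === 7 * phrase.length); // true
```

With non-repeating text both expressions give the same offset, which may be why the problem only shows up when the same phrase occurs several times in one text node.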
Unfortunately, I do not know your codebase well enough to submit a pull request, but I hope this is a good lead for fixing the bug.
HTML for reproducing the issue: pagedjs-bug-duplicated-text.html