PDF import: column alignment differs from the original document

My use case is extracting data from PDFs. When I open a PDF file in ReportMiner, the imported data does not appear in the same position as in the original PDF document.

For example, the first rows of the table are positioned correctly, the middle rows are shifted too far to the right, and the last rows line up with the first rows again. This misalignment ultimately affects data extraction.

Similarly, some fields are shifted to different positions compared with the original PDF document.

Do you have a solution to ensure that Astera ReportMiner imports data from PDFs in the original layout?

Yes, there is a feature called ‘Scaling Factor’ that you can use to fix the alignment.

Finding the right value is a matter of trial and error; there is no way to determine in advance which value will work for a given PDF. The scaling factor ranges from 0 to 9. The best practice is to start with the middle value and then try the values on either side of it.
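
The order in which to try values can be sketched in a few lines of Python. This is only an illustration of the start-in-the-middle approach and does not call ReportMiner in any way; the scaling factor itself is still set in the product. The function name `trial_order` is hypothetical, and the sketch assumes whole-number values from 0 to 9.

```python
def trial_order(low=0, high=9):
    """Return candidate values ordered by distance from the midpoint.

    Assumption: the scaling factor is tried in whole-number steps.
    """
    midpoint = (low + high) / 2
    # sorted() is stable, so equally distant values keep ascending order
    # (4 before 5, 3 before 6, and so on).
    return sorted(range(low, high + 1), key=lambda value: abs(value - midpoint))


print(trial_order())  # [4, 5, 3, 6, 2, 7, 1, 8, 0, 9]
```

Work through the list from left to right, checking the alignment of the imported PDF after each change, and stop at the first value that reproduces the original layout.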

Refer to this documentation on PDF Scaling Factor for more details on the feature: How To Work With PDF Scaling Factor in ReportMiner — ReportMiner 9 documentation

Changing the scaling factor changes the positioning of the pattern, so you may need to redefine the pattern after setting the scaling factor to an appropriate value.