Rules not working #141

Open
keendule opened this issue Sep 6, 2023 · 3 comments

@keendule

keendule commented Sep 6, 2023

import { PdfReader } from 'pdfreader';
import { Rule } from 'pdfreader/Rule';

const processItem = Rule.makeItemProcessor([
  Rule.on(/а/)
    .extractRegexpValues()
    .then((item) => console.log(item)),
]);

new PdfReader({}).parseFileItems('./pdf/1.pdf', (err, item) => {
  if (err) console.error('error:', err);
  else if (!item) console.warn('end of file');
  else processItem(item);
});

What do I need to pass to then()?

@adrienjoly
Owner

Here's what I think you want to do:

import { PdfReader, Rule } from "./index.js";

const processItem = Rule.makeItemProcessor([
  // detect the first text entry that contains "hello"
  Rule.on(/(hello.*)/i)
    .extractRegexpValues()
    .then((item) => {
      console.log("found match:", item); // => found match: [ 'Hello "world"' ]
    }),
  // trigger the rule above, as soon as "hello" is found in a text entry
  Rule.on(/hello/i).accumulateAfterHeading(),
]);

new PdfReader({}).parseFileItems("./test/sample.pdf", (err, item) => {
  if (err) console.error("error:", err);
  else if (!item) console.warn("end of file");
  else processItem(item);
});
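
The capture group in `/(hello.*)/i` matters here: the output in the comment above suggests that `extractRegexpValues()` hands the captured groups to `then()`. The sketch below uses only plain JavaScript regular expressions to show what those groups look like. `captureGroups` is a hypothetical helper written for illustration, and the idea that pdfreader extracts values the same way is an assumption, not something verified against its source.

```javascript
// Hypothetical helper, not part of pdfreader: returns the capture groups
// of `regexp` for a given text entry, or null when the entry does not match.
function captureGroups(regexp, text) {
  const match = regexp.exec(text);
  return match ? match.slice(1) : null;
}

// One capture group around the whole greeting.
console.log(captureGroups(/(hello.*)/i, 'Hello "world"')); // [ 'Hello "world"' ]

// The pattern matches, but it has no capture group, so nothing is extracted.
console.log(captureGroups(/hello/i, 'Hello "world"')); // []

// The pattern does not match at all.
console.log(captureGroups(/(hello.*)/i, "Goodbye")); // null
```

If that assumption holds, a pattern without parentheses would match the text entry but leave nothing to extract.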

@sebakthapa

sebakthapa commented Aug 20, 2024

This is not working for me either.

I used the following import:

import { PdfReader, Rule } from 'pdfreader';

Importing Rule gives me the TypeScript error: Module '"pdfreader"' has no exported member 'Rule'. This may be because of missing type declarations, so I used ts-ignore to suppress the error for testing.

Then, running the example from the docs does not print anything to the console:

const displayValue = (val: unknown) => {
  console.log(val);
};

const displayTable = (val: unknown) => {
  console.log(val);
};

const processItem = Rule.makeItemProcessor([
  Rule.on(/^Hello "(.*)"$/)
    .extractRegexpValues()
    .then(displayValue),
  Rule.on(/^Value:/)
    .parseNextItemValue()
    .then(displayValue),
  Rule.on(/^c1$/).parseTable(3).then(displayTable),
  Rule.on(/^Values:/)
    .accumulateAfterHeading()
    .then(displayValue)
]);

new PdfReader().parseFileItems(pdfpath, (err, item) => {
  if (err) console.error(err);
  else {
    processItem(item);
  }
});

What is actually the issue here?
The pdfpath is correct, and parsing the same file without Rule works well.

@adrienjoly
Owner

I was able to get the following output:

[image: screenshot of the output]

... from a basic nodejs project containing these files:

  • package.json
{
  "name": "npm-pdfreader-test",
  "version": "1.0.0",
  "main": "index.js",
  "type": "module",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "license": "ISC",
  "description": "",
  "dependencies": {
    "pdfreader": "^3.0.5"
  }
}
  • test-2.js
import { PdfReader, Rule } from "pdfreader";

const displayValue = (val) => {
  console.log(val);
};

const displayTable = (val) => {
  console.log(val);
};

const processItem = Rule.makeItemProcessor([
  Rule.on(/^Hello "(.*)"$/)
    .extractRegexpValues()
    .then(displayValue),
  Rule.on(/^Value:/)
    .parseNextItemValue()
    .then(displayValue),
  Rule.on(/^c1$/).parseTable(3).then(displayTable),
  Rule.on(/^Values:/)
    .accumulateAfterHeading()
    .then(displayValue),
]);

new PdfReader().parseFileItems("../npm-pdfreader/test/sample.pdf", (err, item) => {
  if (err) console.error(err);
  else {
    processItem(item);
  }
});

=> If you want help troubleshooting, please give us access to a project (e.g. a git repo) that we can check out and run easily on our end.
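
When the rules print nothing, one quick check before sharing a project is to test each regular expression against the text entries that the parser actually emits. The sketch below is a minimal, standalone illustration of that idea. `reportRuleMatches` and the sample entries are hypothetical, and the code assumes that each text entry is an object with a string `text` property; real entries would be collected from the `parseFileItems` callback.

```javascript
// Hypothetical troubleshooting helper, not part of pdfreader.
// Assumes each text entry is an object with a string `text` property.
// Returns, for every named pattern, the list of texts that it matches.
function reportRuleMatches(items, patterns) {
  const report = {};
  for (const [name, regexp] of Object.entries(patterns)) {
    report[name] = items
      .filter((item) => item && typeof item.text === "string")
      .map((item) => item.text)
      .filter((text) => regexp.test(text));
  }
  return report;
}

// Sample entries made up for illustration only.
const sampleItems = [
  { page: 1 }, // not a text entry, ignored
  { text: 'Hello "world"' },
  { text: "Value:" },
  { text: "4" },
  { text: " c1" }, // leading space, so /^c1$/ does not match
];

const report = reportRuleMatches(sampleItems, {
  hello: /^Hello "(.*)"$/,
  value: /^Value:/,
  table: /^c1$/,
});

console.log(report);
// { hello: [ 'Hello "world"' ], value: [ 'Value:' ], table: [] }
```

An empty list for a pattern means that the corresponding rule would never fire on those entries, which is often caused by `^` and `$` anchors meeting unexpected whitespace or text that is split across several entries.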
