Java API: invalid file path crashes batch processing (CLI handles gracefully) #430

@bundolee

Problem

When using OpenDataLoaderPDF.processFile in a loop, an invalid file path throws an exception and stops the entire batch. The CLI handles this gracefully, but the Java API does not.

Originally reported in #375 by @pavanpai769.

Reproduction

for (String pdf : new String[]{"valid.pdf", "invalid.pdf"}) {
    OpenDataLoaderPDF.processFile(pdf, config);
}

If invalid.pdf doesn't exist, the exception terminates the loop, and any files listed after it are never processed.

Expected behavior

The API should throw a clear exception (e.g. IllegalArgumentException for invalid paths) so callers can catch and skip, rather than silently swallowing errors. The throws IOException contract must be preserved.

Design direction

  • Add input validation (null/blank/non-existent/non-PDF) that throws IllegalArgumentException
  • Keep throws IOException on the signature — callers decide error handling
  • Document the batch pattern with try-catch in Javadoc
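The direction above could look roughly like the following sketch. It is not the library's actual implementation: `BatchSketch`, `validateInput`, `processAll`, and the `Processor` interface are hypothetical names introduced here for illustration, and `Processor` stands in for a call such as `OpenDataLoaderPDF.processFile(pdf, config)`. The "non-PDF" check is assumed to be a simple file-extension test.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class BatchSketch {

    /** Hypothetical stand-in for the real processing call; keeps throws IOException. */
    public interface Processor {
        void process(String path) throws IOException;
    }

    /**
     * Hypothetical validation helper: rejects null, blank, non-existent,
     * and non-PDF paths with IllegalArgumentException.
     */
    public static Path validateInput(String inputPath) {
        if (inputPath == null || inputPath.trim().isEmpty()) {
            throw new IllegalArgumentException("Input path must not be null or blank");
        }
        // Paths.get may throw InvalidPathException, itself an IllegalArgumentException.
        Path path = Paths.get(inputPath);
        if (!Files.isRegularFile(path)) {
            throw new IllegalArgumentException("Input file does not exist: " + inputPath);
        }
        // Assumption: "non-PDF" is decided by extension only.
        if (!inputPath.toLowerCase(Locale.ROOT).endsWith(".pdf")) {
            throw new IllegalArgumentException("Input file is not a PDF: " + inputPath);
        }
        return path;
    }

    /**
     * Batch pattern: catch per file so one bad path does not stop the loop.
     * Returns the paths that failed so the caller can report them.
     */
    public static List<String> processAll(List<String> paths, Processor processor) {
        List<String> failed = new ArrayList<>();
        for (String pdf : paths) {
            try {
                validateInput(pdf);
                processor.process(pdf);
            } catch (IllegalArgumentException | IOException e) {
                // The error is recorded rather than swallowed; the loop continues.
                failed.add(pdf);
            }
        }
        return failed;
    }
}
```

A caller would pass something like `pdf -> OpenDataLoaderPDF.processFile(pdf, config)` as the `Processor` and decide what to do with the returned list of failed paths.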

See review discussion on #375 for details.
