fix: scan inline XML/PDF MIME parts, not only attachment disposition #18

Closed
heiko wants to merge 77 commits from fix/11-inline-mime-parts into master
Owner

Closes #11

Relaxes the Walk disposition guard to also process parts with inline or missing disposition when they have an invoice-relevant content-type or filename. Also extracts filename from Content-Type name parameter.

Closes #11 Relaxes the `Walk` disposition guard to also process parts with inline or missing disposition when they have an invoice-relevant content-type or filename. Also extracts filename from Content-Type `name` parameter.
Author
Owner

Review findings (ai:codex)\n\nLow: internal/scanner/attachment.go:62\nThe new inline-part predicate treats any MIME part with a filename as invoice-like, even when it is not an attachment and has a non-invoice content type such as image/png; name="logo.png" or Content-Disposition: inline; filename="logo.png". That broadens Walk from invoice-relevant inline XML/PDF parts to common inline assets, which can add noisy verbose/debug attachment output and read unrelated inline bodies into memory. A negative test for an inline non-XML/PDF named part would catch this; the non-attachment case should likely require XML/PDF content type or a .xml/.pdf filename.\n\nVerification: go test ./internal/scanner ./internal/xrechnung passed on the PR branch.

Review findings (ai:codex)\n\nLow: internal/scanner/attachment.go:62\nThe new inline-part predicate treats any MIME part with a filename as invoice-like, even when it is not an attachment and has a non-invoice content type such as image/png; name="logo.png" or Content-Disposition: inline; filename="logo.png". That broadens Walk from invoice-relevant inline XML/PDF parts to common inline assets, which can add noisy verbose/debug attachment output and read unrelated inline bodies into memory. A negative test for an inline non-XML/PDF named part would catch this; the non-attachment case should likely require XML/PDF content type or a .xml/.pdf filename.\n\nVerification: go test ./internal/scanner ./internal/xrechnung passed on the PR branch.
heiko left a comment

Review: fix/11-inline-mime-parts

The change is correct and well-scoped.

Observations

Redundant filter: The isInvoiceLike condition appears twice — as the early-continue guard and again before calling fn. When disposition == "attachment", a non-invoice-like part passes the first guard, its body is read into memory, then discarded at the bottom. Consider collapsing both checks into one gate before io.ReadAll.

Filename-bearing text/plain parts now processed: A text/plain; name="notes.txt" inline part will reach fn() because filename != "" makes isInvoiceLike true. Downstream detection rejects non-XML/PDF, so harmless in practice, but a slight behavioural broadening.

multipart/alternative safety: HTML-body parts appear with text/html, no filename, no attachment disposition — correctly skipped.

Test coverage suggestions

  1. Negative case: text/plain body (no filename, no attachment) must NOT be processed
  2. PDF inline case: application/pdf with Content-Disposition: inline
  3. Filename only from Content-Type name param (no Content-Disposition at all)

Logic is sound, no regressions expected.

— reviewed by ai:claude

## Review: fix/11-inline-mime-parts The change is correct and well-scoped. ### Observations **Redundant filter:** The `isInvoiceLike` condition appears twice — as the early-continue guard and again before calling `fn`. When `disposition == "attachment"`, a non-invoice-like part passes the first guard, its body is read into memory, then discarded at the bottom. Consider collapsing both checks into one gate before `io.ReadAll`. **Filename-bearing text/plain parts now processed:** A `text/plain; name="notes.txt"` inline part will reach `fn()` because `filename != ""` makes `isInvoiceLike` true. Downstream detection rejects non-XML/PDF, so harmless in practice, but a slight behavioural broadening. **multipart/alternative safety:** HTML-body parts appear with `text/html`, no filename, no attachment disposition — correctly skipped. ### Test coverage suggestions 1. Negative case: `text/plain` body (no filename, no attachment) must NOT be processed 2. PDF inline case: `application/pdf` with `Content-Disposition: inline` 3. Filename only from `Content-Type name` param (no `Content-Disposition` at all) Logic is sound, no regressions expected. — reviewed by ai:claude
heiko closed this pull request 2026-05-11 22:25:43 +02:00

Pull request closed

Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
IUS/xr-invoiced!18
No description provided.