The Automated AP Invoices Scanner extracts detailed information from a variety of invoice and receipt files. It uses optical character recognition (OCR) to read text automatically from images and scanned PDFs. The tool also supports Word, Excel, and text-based PDF files, so it covers most of the formats in which invoices typically arrive.
First, create a folder named “Automated Expense Receipt Scanner” on your desktop, with “Inputs” and “Outputs” sub-folders, and place the relevant invoices in the “Inputs” sub-folder:
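If you prefer not to create the folders by hand, a short script can do it. The following is a minimal sketch, assuming a Python environment and that the desktop lives at `~/Desktop`. The function name `create_scanner_folders` is hypothetical and not part of the tool itself.

```python
from pathlib import Path


def create_scanner_folders(base_dir: Path) -> tuple[Path, Path]:
    """Create the scanner folder with its Inputs and Outputs sub-folders.

    `base_dir` is the directory that should contain the scanner folder
    (assumed here to be the user's desktop).
    """
    root = base_dir / "Automated Expense Receipt Scanner"
    inputs = root / "Inputs"
    outputs = root / "Outputs"
    # exist_ok=True makes the call safe to repeat if the folders already exist.
    inputs.mkdir(parents=True, exist_ok=True)
    outputs.mkdir(parents=True, exist_ok=True)
    return inputs, outputs


# Example call (assumes the desktop is at ~/Desktop):
# create_scanner_folders(Path.home() / "Desktop")
```

After running it, copy the invoices into the returned Inputs folder.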

Second, you need to update the code to use your OpenAI API key:
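How the key is stored in the downloaded code is not shown here. As one possible alternative to pasting the key directly into the source, the sketch below reads it from the `OPENAI_API_KEY` environment variable, which is the variable the official OpenAI Python library looks for by default. The helper name `load_api_key` is hypothetical, and it is an assumption that the tool can be adapted this way.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Raises RuntimeError with a readable message if the variable is
    unset or empty, so a missing key is caught before any scanning starts.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"No API key found: set the {env_var} environment variable first."
        )
    return key
```

Keeping the key out of the source file means the code can be shared without exposing the key.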

Once the invoice files are in the “Inputs” folder, the scanner processes each one and identifies key fields such as invoice number, supplier name, total amount (including VAT), invoice dates, and any relevant accounting period. It then compiles these details into an Excel report in which blank or missing entries are highlighted for easy review. This approach cuts down on time-consuming manual data entry and helps users spot discrepancies.
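The check for blank or missing entries can be illustrated with a small sketch. It assumes each invoice has already been reduced to a dictionary of fields. The field names in `REQUIRED_FIELDS` and the function `find_blank_cells` are hypothetical and chosen only to mirror the fields listed above; the tool's real column names may differ.

```python
# Hypothetical field names mirroring the fields described in the text.
REQUIRED_FIELDS = (
    "invoice_number",
    "supplier_name",
    "total_incl_vat",
    "invoice_date",
    "accounting_period",
)


def find_blank_cells(records, fields=REQUIRED_FIELDS):
    """Return (row_index, field) pairs for every blank or missing entry.

    A value counts as blank if the key is absent, the value is None,
    or it contains only whitespace. Numeric zero is not treated as blank.
    """
    blanks = []
    for row_index, record in enumerate(records):
        for field in fields:
            value = record.get(field)
            if value is None or str(value).strip() == "":
                blanks.append((row_index, field))
    return blanks
```

The returned coordinates could then be handed to whichever Excel library writes the report, for example to apply a fill colour to those cells.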
For scanned or image-based PDFs, the software leverages Poppler to convert pages into images and Tesseract OCR to recognize text. It then applies a natural-language model to classify and structure the extracted information. The final output includes a separate “Warnings & Info” section for ambiguities.
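The Poppler-then-Tesseract step can be sketched roughly as follows. This is not the tool's actual code: it assumes the commonly used Python wrappers `pdf2image` (which calls Poppler) and `pytesseract` (which calls Tesseract), and the function names are hypothetical. Both wrappers need the underlying programs installed separately. The language-model classification step is left out.

```python
def render_with_poppler(pdf_path):
    """Convert each PDF page to an image (assumes pdf2image and Poppler)."""
    from pdf2image import convert_from_path  # imported lazily; third-party
    return convert_from_path(pdf_path, dpi=300)


def recognize_with_tesseract(image):
    """Run OCR on one page image (assumes pytesseract and Tesseract)."""
    import pytesseract  # imported lazily; third-party
    return pytesseract.image_to_string(image)


def ocr_scanned_pdf(pdf_path, render=render_with_poppler,
                    recognize=recognize_with_tesseract):
    """Return the recognized text of a scanned PDF, one block per page.

    `render` turns the PDF into page images and `recognize` turns one
    image into text. Both can be swapped out, for example in tests.
    """
    pages = render(pdf_path)
    texts = [recognize(page).strip() for page in pages]
    # Skip pages where nothing was recognized and separate the rest
    # with blank lines.
    return "\n\n".join(text for text in texts if text)
```

Passing the two steps in as parameters keeps the sketch usable without the OCR software installed, since stand-in functions can replace them.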
The output looks as follows:

Code Download:
