Introduction
This tool identifies anomalous entries in a general ledger data set. It may be useful for small enterprises that do not have a sophisticated ERP system. You load the general ledger in CSV format; the code then analyzes the past journal entries to learn common patterns and flags entries that deviate from those norms.
How Does It Work?
It uses an algorithm that learns what typical journal entries look like from the selected features, such as dates, account names, debit amounts, credit amounts, or transaction descriptions. It then flags journal entries that appear unusual.
The mock input file looks as follows:

The sample general ledger CSV input file can be found here:

Based on the input file, the code flags the following transaction as unusual:
- Date: 2023-02-08
- Account: Cash
- Debit: 0
- Credit: 10500
- Description: Refund Processed
This entry differs from the normal pattern, not only in the ‘Debit’ and ‘Credit’ amounts but also in the ‘Description’. The file is only an example and does not contain matching debit/credit entry pairs. You can load your actual general ledger instead.
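The text does not name the algorithm, so the following is only a minimal sketch of how such flagging could work, assuming an Isolation Forest from scikit-learn and pandas for loading the CSV. The function name `flag_unusual_entries`, the column names, and every ledger row except the 2023-02-08 refund are hypothetical and chosen for illustration. Dates are left out of the features here to keep the sketch short.

```python
import io

import pandas as pd
from sklearn.ensemble import IsolationForest

# Hypothetical ledger rows; only the 2023-02-08 refund comes from the example above.
CSV_TEXT = """Date,Account,Debit,Credit,Description
2023-01-05,Cash,1200,0,Customer Payment
2023-01-05,Sales Revenue,0,1200,Customer Payment
2023-01-10,Rent Expense,800,0,Monthly Rent
2023-01-10,Cash,0,800,Monthly Rent
2023-01-12,Cash,1150,0,Customer Payment
2023-01-12,Sales Revenue,0,1150,Customer Payment
2023-01-19,Cash,1250,0,Customer Payment
2023-01-19,Sales Revenue,0,1250,Customer Payment
2023-01-26,Cash,1180,0,Customer Payment
2023-01-26,Sales Revenue,0,1180,Customer Payment
2023-02-08,Cash,0,10500,Refund Processed
2023-02-10,Rent Expense,800,0,Monthly Rent
2023-02-10,Cash,0,800,Monthly Rent
2023-03-10,Rent Expense,800,0,Monthly Rent
2023-03-10,Cash,0,800,Monthly Rent
2023-04-10,Rent Expense,800,0,Monthly Rent
2023-04-10,Cash,0,800,Monthly Rent
"""


def flag_unusual_entries(ledger, contamination=0.01, random_state=0):
    """Return a copy of `ledger` with a boolean `is_anomaly` column."""
    # Numeric features are used as-is.
    numeric = ledger[["Debit", "Credit"]].astype(float)
    # Text features are one-hot encoded so the model can split on them.
    categorical = pd.get_dummies(ledger[["Account", "Description"]]).astype(float)
    features = pd.concat([numeric, categorical], axis=1)

    # `contamination` is the expected share of anomalous rows.
    model = IsolationForest(contamination=contamination, random_state=random_state)
    labels = model.fit_predict(features)  # -1 = anomaly, 1 = normal

    result = ledger.copy()
    result["is_anomaly"] = labels == -1
    return result


ledger = pd.read_csv(io.StringIO(CSV_TEXT))
flagged = flag_unusual_entries(ledger, contamination=0.05)
print(flagged[flagged["is_anomaly"]])
```

An Isolation Forest scores a row by how few random splits are needed to separate it from the rest, so a row with a rare description and an unusually large amount tends to be isolated quickly. To analyze a real ledger, `io.StringIO(CSV_TEXT)` would be replaced with the path to the CSV file.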
Anomaly Parameter
The anomaly parameter is currently set to 1%, which may not be optimal for all datasets and could result in false alarms. The “Expected anomaly proportion” slider tells the model roughly what percentage of rows you expect to be suspicious, so it flags about that many as anomalies. Adjust it as follows:
- If you see too many obviously normal entries marked as anomalies, slide the percentage down.
- If almost nothing is flagged, or you still spot suspicious rows that were not caught, slide it up until the results look about right.
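The effect of the slider can be illustrated with a small sketch in plain Python. It assumes each row already has an anomaly score where lower means more unusual; the helper `flag_by_proportion` and the scores are hypothetical, not the tool's own code. If the tool is built on scikit-learn's `IsolationForest` (an assumption), the slider would most likely correspond to its `contamination` argument, which accepts values between 0 and 0.5.

```python
def flag_by_proportion(scores, proportion):
    """Flag roughly `proportion` of the rows, picking the lowest scores first.

    Assumes lower scores mean more unusual rows.
    """
    if not 0 < proportion <= 0.5:
        raise ValueError("proportion must be in (0, 0.5]")
    n_flag = round(len(scores) * proportion)
    # Row indices ordered from most unusual to least unusual.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    flagged = set(order[:n_flag])
    return [i in flagged for i in range(len(scores))]


# Hypothetical scores for ten rows; rows 1 and 4 are clearly unusual.
scores = [0.10, -0.30, 0.05, 0.12, -0.25, 0.08, 0.02, 0.15, 0.11, 0.13]

for proportion in (0.1, 0.2, 0.3):
    flags = flag_by_proportion(scores, proportion)
    rows = [i for i, is_flagged in enumerate(flags) if is_flagged]
    print(f"{proportion:.0%} -> flagged rows {rows}")
```

Raising the proportion only moves the cut-off further down the same ranking. At 30% the sketch also flags row 6, whose score is close to the normal rows, which is the kind of false alarm that sliding the percentage down removes.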