Optimizing the document verification process: innovative approaches required
Customer documents evidencing creditworthiness or the collateral value (e.g. proof of income, valuations, register excerpts), invoices, or photographs are of immense importance in loan application processes or when settling claims. The validity of these documents is a decisive factor in determining whether disbursement criteria are met and whether funds are subsequently released by the financial services provider. Checking documents for tampering, i.e. detecting document forgery to avoid financial or reputational damage, is therefore vitally important.
Manual document verification is still a challenge for banks and insurance companies, despite optimized processes and the considerable expertise of the employees involved. The reasons vary:
- Non-standardized supporting documents such as invoices or photographs are submitted.
- Manual verification of documents ties up resources and leads to high personnel costs for banks as well as long waiting times for customers.
- Susceptibility to errors correlates with the proportion of manual activities, because "to err is human".
- New fraud patterns are only identified after a time lag, usually after the damage has already been done.
To further optimize the (manual) document verification process, new (technological) approaches to fraud detection are required.
AI as technological support in the document verification process
One approach to further optimize the document verification process is the use of optical character recognition (OCR), a method from the field of artificial intelligence (AI). It allows both native PDF files and paper documents in the form of scanned copies or photographs to be read out, processed, and made usable for automated fraud detection.
In addition to an automated check for completeness of all required documents and their contents (e.g. address, name, invoice date, tax number, amount), the use of OCR makes it possible to detect the manipulation of characters in documents. To do so, various features of the characters are recorded (including height, width, rotation and the distance between them) and used as a basis for determining the probability of fraud.
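As an illustration, the character features mentioned above could be derived from OCR bounding boxes roughly as follows. This is a minimal sketch, not the implementation of any particular product: the CharBox structure, the char_features function and the choice to measure height and width as deviations from the line median are assumptions made for this example.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class CharBox:
    """One recognized character and its geometry (hypothetical structure)."""
    char: str
    x: float         # left edge in pixels
    width: float     # box width in pixels
    height: float    # box height in pixels
    rotation: float  # rotation in degrees


def char_features(boxes):
    """Describe each character relative to the other characters on its line."""
    med_height = median(b.height for b in boxes)
    med_width = median(b.width for b in boxes)
    features = []
    for i, box in enumerate(boxes):
        if i + 1 < len(boxes):
            # Horizontal distance to the next character on the line.
            gap = boxes[i + 1].x - (box.x + box.width)
        else:
            gap = 0.0  # the last character has no right neighbor
        features.append({
            "char": box.char,
            "height_dev": box.height - med_height,
            "width_dev": box.width - med_width,
            "rotation": box.rotation,
            "gap_to_next": gap,
        })
    return features


# Made-up example line: the last digit is taller, rotated and set further apart.
line = [
    CharBox("1", x=0.0, width=10.0, height=20.0, rotation=0.0),
    CharBox("2", x=12.0, width=10.0, height=20.0, rotation=0.0),
    CharBox("9", x=30.0, width=10.0, height=26.0, rotation=3.0),
]
features = char_features(line)
```

In this made-up line, the third character stands out on height, rotation and spacing at once, which is the kind of pattern a downstream model could pick up.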
Use of existing documents as data basis
The system uses existing documents, including already identified fraud cases, as a data basis. If not enough fraud cases are available, various statistical methods can help to create or improve a data basis.
Based on this (generated) data set, a series of decision trees, a so-called random forest, is trained for fraud detection using OCR. In this process, each of these decision trees learns to focus on a subset of a character's features and assess them for fraud according to self-learned rules.
In operation, the decision trees are fed with the features of the characters to be assessed. Each decision tree then produces a fraud score that is (potentially) different from that of the other decision trees. All scores are finally consolidated into a probability of fraud.
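The consolidation step can be sketched as follows. This example only illustrates how individual tree scores might be combined. The three "trees" are hand-written stand-ins with invented thresholds, whereas in a real random forest they would be learned from the training data, and averaging is just one plausible way to consolidate the scores. All names are hypothetical.

```python
from statistics import mean


# Each stand-in "tree" looks at a subset of the character's features
# and returns a fraud score of 0.0 or 1.0. Thresholds are invented.
def tree_height(f):
    return 1.0 if abs(f["height_dev"]) > 3.0 else 0.0


def tree_rotation(f):
    return 1.0 if abs(f["rotation"]) > 2.0 else 0.0


def tree_gap(f):
    return 1.0 if f["gap_to_next"] > 6.0 else 0.0


TREES = [tree_height, tree_rotation, tree_gap]


def fraud_probability(char_features, trees=TREES):
    """Consolidate the individual tree scores by averaging them."""
    return mean(tree(char_features) for tree in trees)


suspicious = {"height_dev": 6.0, "rotation": 3.0, "gap_to_next": 2.0}
normal = {"height_dev": 0.0, "rotation": 0.0, "gap_to_next": 2.0}

p_suspicious = fraud_probability(suspicious)  # two of three trees fire
p_normal = fraud_probability(normal)          # no tree fires
```

Because every tree sees only some of the features, a single unusual value does not automatically push the consolidated probability to its maximum.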
OCR-based fraud detection: fields of application
The technical solution can be used in the document verification process at various levels of autonomy:
- Supporting: The system determines a probability of fraud and passes it on to the case handlers. They decide which cases the team wants to focus on.
- Filtering: The system assesses all documents for probability of fraud. Based on upper and lower limits, each document above the upper limit is classified as "fraud" and each document below the lower limit as "no fraud". Only documents between the two limits are checked by case handlers ("gray cases").
- Decision-making: The system decides on the categorization of documents based on a threshold value. The case handlers only check selectively.
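The filtering level, for instance, comes down to a simple rule with two limits. The following sketch assumes illustrative limits of 0.2 and 0.8. The function name and the limit values are made up for this example and would have to be calibrated for each institution.

```python
def triage(fraud_probability, lower=0.2, upper=0.8):
    """Classify a document by its fraud probability (illustrative limits)."""
    if not 0.0 <= lower <= upper <= 1.0:
        raise ValueError("limits must satisfy 0 <= lower <= upper <= 1")
    if fraud_probability > upper:
        return "fraud"
    if fraud_probability < lower:
        return "no fraud"
    # Everything between the two limits goes to a case handler.
    return "gray case"


decisions = [triage(p) for p in (0.05, 0.5, 0.93)]
```

Setting both limits to the same value turns this into the decision-making level with a single threshold, while the supporting level would skip the classification and simply pass the probability on.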
Regardless of the degree of process integration, technical implementation is possible both as on-premises software and as a service in the cloud. Ideally, in addition to its use in fraud detection, the solution is also integrated simultaneously into the credit decision and claims settlement processes. In this way, information read out by OCR could also be automatically transferred to the data processing system of the bank or insurance company.
Conclusion: OCR-based fraud detection in the document verification process
Overall, OCR-based fraud detection is a way to improve the current manual document verification process:
- Resource and time savings by focusing manual verification on a fraction of the processing cases
- A higher error detection rate
- Continuous learning and thus optimization of the process regarding new fraud patterns, even before the damage has occurred
In addition, OCR-based fraud detection can be introduced gradually or in parallel as a benchmark test to the current process. In any case, existing knowledge of the employees about document verification is retained and continues to influence the process.