Duplicate payments are always a hot topic. Due to poor organization, invoices being paid twice is something that happens over and over again. Even in well-defined organizations, duplicate payments occur again and again when the volume of transactions is high. In this blog post, I will explain some advanced methods for detecting duplicate payments in SAP. This is where the Data Scientist really comes into his/her own. It also explains for how long you can still claim duplicate payments back from vendors. That by contrast is something for people who have a liking for the more legal side of things.
The story thus far
We have already discussed the subject of duplicate payments in several blog posts in the past:
However, after some recent projects involving work on duplicate payments, zapliance has come a long way in the interim and we have since been able to improve our algorithm for the recognition of duplicate payments. I would like to introduce you to these improvements in this blog post.
Refinement of a dual payment analysis
In fact, duplicate payment is not really the right term. Strictly speaking, what you are looking for are double-posted invoices that were paid in advance. In order to search for duplicate invoices, the system first compares certain characteristics of two invoices. If there is a sufficient number of matches (or invoices which differ, as the case may be), the bill pair is proposed as a promising candidate. In the 3 steps to uncover duplicate payments in SAP, we identified the following characteristics as useful:
- Same amount on different documents and
- Same posting date and
- Same document date and
- Same document type and
- Same accounting account and
- Same vendor and
- Same external invoice number
More recently, we have also started using the following characteristics:
- Same bank account to which the transfer was made.
- Identical vendor name if the vendor number is not the same (so that any master data duplicates can also be included).
- Frequency of the invoice amount; less frequent amounts tend to indicate a double invoice.
- Comparison of the entry date of both invoices: if they are close to each other, this speaks more strongly in favor of a double posted invoice.
- The invoices were recorded by different users (= by mistake – they didn't know anything about each other).
But there are also negative features that speak against a double posted bill:
- There is a credit memo for the invoice.
- There is an "offsetting item" with the same amount without it being referenced as a credit memo.
- This is a periodically recurring invoice / payment with a constant equal amount, as far as the same vendor is concerned.
Be liberal in the right places: fuzzy matches
For some characteristics, it does not make sense to check for an exact match. It is possible that texts such as vendor names may be written differently where master data duplicates are involved. For the purposes of such comparisons, one should not compare things exactly, but in a "fuzzy" way. This can be done using phonetic algorithms. One well-known phonetic algorithm is SOUNDEX. SOUNDEX calculates a (four-digit) code from a word that is suitable for comparing the pronunciation of two words. For example, the SOUNDEX codes for:
- “Maier” and' “Maier” and “Maier” and “Meyer” all result in “M600” and are therefore phonetically the same
- “Neier” gives the code “N600”.
- “Raier” gives the code “R600”'.
- “Maiser” gives the code “M260”.
Instead of comparing the original text, the trick is to compare the exact SOUNDEX code.
There is another case where exact comparisons are not always useful. For example, external invoice numbers / references may be entered differently, e. g. "2345-56" may be entered once as "2345 56" but on other occasions as "234556" or "2345_56". Such "subtle" differences should be overlooked. However, it is not possible to do this with an exact comparison. The solution here is to use regular expressions. Regular expressions are often confusing, but they are very powerful. For example, you can manipulate a character string according to the following rules:
- Delete all characters from the string that are not numbers or letters.
- Delete all characters from the string that are letters.
- Replace all characters from the string that are not numbers with spaces.
If you want to try your hand at regular expressions yourself, or need to create one, you can find a good resource here.
Instead of comparing the original text exactly, the texts manipulated by a suitable regular expression should be compared exactly. This way you get a "liberal" exact match of two characteristics.
False positives again and again: annoying recurring invoices
With regard to false positives when searching for duplicate invoices, the following phenomenon has turned out to be particularly annoying: down payments or prepayments which are always for the same constant amount. This is often the case with electricity bills, water bills, waste charges, etc. etc. Precisely because they concern same vendor, these cases are regularly put forward as possible candidates in an analysis of duplicate invoices, but then of course turn out to be false positives (=false alarm). We have found that such false alarms can be ruled out with a few statistical considerations. If you find n invoices with the same amount for a vendor, then there are always n-1 intervals between these n invoices with regard to the document date, for example. So with three bills, there are two intervals. If all these n-1 intervals in days are(approximately) the same, then these are periodic bills and should be suppressed as hits. That's the basic idea. We have even perfected this a little bit further, but we can’t tell you everything at this point.
Fuzzy hits for amounts
Up to now, we have only ever compared two invoices (items) if they have exactly the same amount. This is in general a good procedure, but you never get cases of duplicate invoices where the invoice amount entered was (accidentally) unequal. To take into account the eventuality of unequal invoice amounts, we have considered criteria for "fuzzy hits". At first, however, you run into a problem: If you compare invoices with the same amounts, you have a large number of invoice pairs that have to be compared. However, if you also compare invoices with unequal amounts, there are many more comparison pairs. This brings even the fastest databases to their knees! That is why we have defined certain criteria for unequal amounts. If these are not met, the invoice pair is not used for comparison. These criteria are:
- Typing errors: Two amounts are unequal, but contain a typing error in the numbers, e.g. 4578.00 compared to 4587.00, or in relation to the decimal point: 3457.78 compared to 3457.87
- Powers of ten: Two amounts are unequal, but differ only by powers of ten, e.g. 1567 compared to 15670
- Small discrepancies: The amounts differ only in the "cent" range after the decimal point
With "Fuzzy hits", maybe you will catch a few more that may otherwise have slipped through the net!
The longer, the better
One question that often comes up is how many transactions one should examine with regard to duplicate invoices. You can find information how far back you can go with reclaiming duplicate payments in the more detailed discussion of the legal issues below. Otherwise, the following generally applies: Examine a period of time that is as long as possible, not individual fiscal years separately. The reason for this is that when you examine separate fiscal years, you cannot find duplicate invoices that were posted in two different fiscal years. But the examination of a single financial year is still better than nothing.
Using different criteria
Now this is where things start to get a little more complicated. Depending on the payouts you are considering, you may need to use different comparison criteria. The usual – and without a doubt also the most fruitful – object of investigation is the "normal" credit-side invoice. That's where you usually find the most hits. All in all, it is possible to distinguish the following causes of payouts:
- Vendor invoices
- One-time vendor invoices (Conto pro Diverse (CpD) accounts, one-time-invoices)
- Credits which have been paid out to customers
So what’s next?
We've tried out a lot of things in terms trying to detect duplicate payments / duplicate bills, but we can always come up with something better! Our criteria are already very sophisticated. What we want to do next is to use feedback from our users about the proposed candidates for duplicate invoices to teach our algorithm. What we aim to achieve by doing that is to be able to forecast the probability of a duplicate invoice based on user feedback concerning the proposed duplicate invoices. By putting methods such as multivariate statistics, machine learning and artificial intelligence to use in the process!
I want my money back: detecting duplicate payments is only half the battle
Discovering duplicate payments is hard enough. But if you actually want to get your money back, you have to go and claim it back from the debtor/vendor.
How long will this take if the duplicate payment has happened in the past?
If your vendor is prepared to cooperate, you can of course reclaim and receive any old duplicate payments. This is often a question of the balance of power between you and your vendor. However, in Germany, there is a legal framework that sets limits in this respect (and there may well be in other countries too – we will discuss the case in Germany in what follows). At some point in time, all old claims that have passed the limitation period are excluded by the statute of limitations, so that "stability of the law" is maintained. But, strictly speaking, if you overpay a vendor, it is a case of Unjust enrichment. You can then claim the amount back on the basis of Section 812 of the German Civil Code:
Claim for restitution
(1) A person who obtains something as a result of the performance of another person or otherwise at his expense without legal grounds for doing so is under a duty to make restitution to him. This duty also exists if the legal grounds later lapse or if the result intended to be achieved by those efforts in accordance with the contents of the legal transaction does not occur.
OK, so that's the basis for a claim. But how far back into the past can we go? You can find details of that much earlier on in the German Civil Code:
Standard limitation period
The standard limitation period is three years."
That’s fine, nice and brief and succinct, just how you wish laws always would be. In practice, this means: For three years, you can claim back a duplicate payment. So now what we need to know is when this period begins:
Commencement of the standard limitation period and maximum limitation periods
(1) Unless another commencement of limitation of is determined, the standard limitation period commences at the end of the year in which:
- the claim arose and
- the obligee obtains knowledge of the circumstances giving rise to the claim and of the identity of the obligor, or would have obtained such knowledge if he had not shown gross negligence.
The period begins at the end of the year in which the claim arises. This is important. Not immediately, therefore, when the duplicate payment was made, but only at the end of the year in which the duplicate payment was made. The second prerequisite is a bit more difficult, but should be interpreted as follows: If you should have known about the duplicate payment. Since you knew your vendor and his service, for which you accidentally made the duplicate payment, at the same time, this prerequisite is then also fulfilled at the time of the duplicate payment.
Example: You settle the same liability for the second time on January 23,2017. The limitation period then begins on December 31,2017 and comes to an end three years later on December 31,2020, so you have to have successfully enforced your claim by December 31,2020.
What are the next steps you should take?
When all is said and done, it is always a good idea to start with the simple things first when analyzing duplicate invoices. If you then improve the algorithms, all possible rules and criteria overlap and you can quickly get a clear overview. In zap Audit, you will always have access to our sophisticated algorithms in their most up-to-date form. If you find yourself saying: "I want my money back" after reading this article, then please feel free to get in touch with us: