For this research, 300 receipts were initially collected from 3 supermarkets, 100 from each, giving 3 different receipt types. The receipt text was typed manually so that OCR errors would not affect the experiments. To enlarge the corpus, receipts from another 3 supermarkets were then added. In total, text from 600 receipts was collected, 100 from each of the 6 different supermarkets.
Associated publication: -
1350 receipt images captured in 3 environments, receipt text for 1750 images, and annotated receipt chunks were to be published under the Creative Commons Attribution 4.0 International license.