Canopy Anonymizes Sensitive Information in PDF Files Using Aspose.PDF for .NET

About Canopy

Canopy is a Singapore-based FinTech startup that offers a next-generation account aggregation and portfolio visualization platform for high-net-worth individuals and wealth professionals. We are the Singapore Winner and Global Finalist of the UBS Future of Finance Challenge (2015), and “Singapore’s #1 Hottest Startup” according to Singapore Business Review (2016).

The key to providing kick-ass data visualization is good, clean data. Unfortunately, most banks are not willing or able to provide electronic data feeds of their clients’ account information. We are referring to investment portfolio information here, not just credit card expenses.

The PDF file is, therefore, the go-to solution. Almost all banks readily provide PDF files to their clients for reporting purposes. The formats are machine-generated and therefore very stable. Machine-reading a client’s PDF file requires no integration with the originating institution.

Problem

As a data company, we help clients visualize their own data. This data comprises personal financial information and is hence very sensitive. We pride ourselves on the high data security standards we maintain; nevertheless, we give clients additional confidence by keeping them absolutely anonymous.

Absolute anonymity means that we have no idea about our clients’ identity: not their name, not their address, and definitely not their mother’s maiden name. However, most bank statements are addressed to the account holder. We were therefore looking for a tool we could implement that would let our clients easily anonymize their own PDF statements. The tool had to be easy to use and had to have a very high success rate in processing the amendments. The latter was the key point, because we discovered that not all PDF files are created equal. A tool that works for one file might not work for another, because the files may have been generated with different tools.

Solution

We built our own site and made it available to our users so that they can anonymize their data. The functionality is deliberately very simple: upload the file and enter the words to be removed. Done.

Aspose.PDF for .NET showed the highest success rate among the products we tried.

Screenshot of the landing page of our anonymizer (file upload preview).

Screenshot of the output of the function (after process completion).

Experience

Finding a solution

We looked at several products. The quality of the output was the most important factor for us, because we receive PDF files from many different sources and the way they are generated varies a lot. We also used the free trial to verify that the output was compatible with our system.

Implementation

We had already implemented the tool with another PDF editing product. However, the switch to Aspose.PDF for .NET was relatively easy and took our developers a couple of days. Because the product is easy to use, we have not needed any help from the support team so far.

Outcome

We have yet to collect user feedback. Our initial tests point to very good output quality. We measure two factors for quality:

Aspose.PDF for .NET has provided the best outcome for our pre-defined metrics.

Next Steps

For now, this has solved our key pain points. Before we take any additional measures, we want to see our users’ feedback.

Summary

Overall, we are very happy with the Aspose experience. The implementation was seamless and the quality of the output meets our expectations. We highly recommend Aspose.PDF for .NET.

Martin Pickrodt
Chief Data Officer