How can we find the right documents with AI?

by | Aug 6, 2020

Reading time: 6 minutes

How can we use AI to get documents on the table faster?
Decision-making relies on a variety of internal documents spread across several programs, including Microsoft Teams and the Document Management System (DMS). Because of the sheer volume, these documents cannot always be found. This is the case not only at the Social Insurance Bank (SVB), but also at other government organizations.

Increase findability

That is why we want to make internal documents easier to find. Internal documents are those not primarily used for or with customers. Better findability does not mean that everyone can open every document: you can open a document only if you have the rights to it. The ultimate goal is to speed up decision-making by making data available, so that information from previous meetings can be found faster, or found at all.

What have we done?

We started by investigating which file types (.doc, .xls, etc.) and which kinds of documents (decisions, reports) are available. Metadata is then added to these documents. Our investigation showed that, of the 13 million files in one storage location within the SVB, a large share consists of minutes in Word format (.doc). These files are particularly suitable for concept extraction, which means that the core of a document is expressed in keywords. These keywords are then added to the file's metadata. Metadata can be seen as a Post-it note on a document, carrying concise information about it. Teams or the DMS, for example, can then search the metadata of those files to retrieve the matching document.
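To make the idea concrete, here is a minimal sketch of extracting keywords from a text and writing them into a metadata record. It uses plain word-frequency counting as a stand-in, which is an assumption for illustration and not Textgain's algorithm. The function names `extract_keywords` and `add_keywords_to_metadata`, the stop-word list, the sample text, and the metadata layout are all hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real list would be much longer).
STOP_WORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "on",
    "for", "was", "be", "will", "by", "before",
}


def extract_keywords(text, top_n=3):
    """Return the top_n most frequent non-stop-words: a crude stand-in for concept extraction."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    # Sort by frequency (descending), then alphabetically, so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:top_n]]


def add_keywords_to_metadata(metadata, text, top_n=3):
    """Return a copy of the metadata record with a 'keywords' field added (the 'Post-it')."""
    enriched = dict(metadata)
    enriched["keywords"] = extract_keywords(text, top_n)
    return enriched


# Made-up sample text standing in for meeting minutes.
minutes = (
    "The budget was discussed. The budget deadline is Friday. "
    "The pension report will be reviewed before the deadline."
)
record = add_keywords_to_metadata({"filename": "minutes_example.doc"}, minutes)
print(record)
```

The original record is copied rather than modified, so the enrichment step can be rerun on the same file without side effects.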

How did we do it?

We first investigated how to do this best and fastest, and ended up with the startup Textgain. Its own technology turned out to be very well suited to the task, so we set to work together. The first phase of our AI process is adding metadata to these documents. To train AI, we need data from real meetings, and therefore real documents. Here we ran into privacy limits. We then built a PoC (Proof of Concept) to test whether the approach is technically feasible. Finally, we created an app with Mendix that uses Textgain's concept extraction algorithm, which allowed us to apply concept extraction to a text.

Using the power of startups

By making the most of the startup's strengths, we were able to gain speed in this process. After all, we did not have to build the text analysis ourselves, but could quickly use Textgain's via an API.

Who did we do this for?

We initially set up this experiment for managers and staff, but in time it should be available to all SVB colleagues.

View the result for yourself in the demo

You can view and try the demo yourself. Enter your own piece of text, and the keywords extracted from it appear below. With the search function, you can then find the text you entered by searching on one of its keywords. This is how searching in metadata works. The demo may not work properly under heavy use; if that happens, please try again the next day.
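Searching in metadata, as the demo does, can be sketched as a lookup from keyword to document. The example below is a minimal illustration under assumptions and not the demo's actual implementation: the record layout, the filenames, and the functions `build_keyword_index` and `search` are hypothetical.

```python
from collections import defaultdict


def build_keyword_index(records):
    """Map each keyword (lower-cased) to the set of filenames whose metadata contains it."""
    index = defaultdict(set)
    for record in records:
        for keyword in record.get("keywords", []):
            index[keyword.lower()].add(record["filename"])
    return index


def search(index, keyword):
    """Return the sorted filenames tagged with the given keyword, or an empty list."""
    return sorted(index.get(keyword.lower(), set()))


# Made-up metadata records for two documents.
records = [
    {"filename": "minutes_a.doc", "keywords": ["budget", "deadline"]},
    {"filename": "minutes_b.doc", "keywords": ["pension", "deadline"]},
]
index = build_keyword_index(records)
print(search(index, "Deadline"))  # matches both documents, case-insensitively
print(search(index, "budget"))    # matches only minutes_a.doc
```

Because only the keywords are indexed, the search never has to open the documents themselves, which is the point of putting the keywords in the metadata.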

What about privacy?

Protecting personal data while training on real data is a challenge. That is why we chose to use documents without direct customer data. In the documents we did work with, we have at least removed the names.
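The article does not say how the names were removed. One simple possibility, shown here purely as an assumption, is to replace a known list of names with a placeholder. The function `redact_names`, the placeholder, and the fictional names in the example are hypothetical, and a list-based approach like this only catches names that are already on the list.

```python
import re


def redact_names(text, names, placeholder="[NAME]"):
    """Replace every whole-word occurrence of the given names with a placeholder."""
    if not names:
        return text
    # Longest names first, so a full name is matched before a first name it contains.
    ordered = sorted(names, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(name) for name in ordered) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(placeholder, text)


# Fictional names, used only for illustration.
text = "Jan de Vries will send the report to Anna before Friday."
print(redact_names(text, ["Jan de Vries", "Anna"]))
```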

What's next?

We will now proceed to build an MVP (Minimum Viable Product). This is the program in its most basic form: a first version that can be used immediately. By collecting feedback on how this MVP works, we can keep building and improving it. Our ultimate goal is to reduce bureaucracy through the use of AI.
The first step is to make the existing information accessible, both for humans and for AI. A second step is to connect pieces of information to each other, so that the AI can take a proactive role. For example, minutes may record a particular decision together with its deadlines, the colleagues involved, and so on. With AI, this data can be linked both to the person who needs it and to the person who drafted the decision. Another option, yet to be designed, is a notification: you receive a message (and later a reminder) about the actions you still need to take to meet the deadline.
Finally, a possible third step in this AI project could be that the AI itself takes the actions needed to meet the deadline. This reduces the workload of colleagues and ultimately helps citizens more quickly.
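The notification idea from the second step has not been designed yet, so the following is only a sketch of what such a reminder could look like. The data layout (owner, task, deadline, done), the three-day window, and the function `due_reminders` are all assumptions made for illustration.

```python
from datetime import date, timedelta


def due_reminders(actions, today, within_days=3):
    """Return reminder messages for open actions whose deadline falls within the window."""
    horizon = today + timedelta(days=within_days)
    messages = []
    for action in actions:
        if action["done"]:
            continue  # Completed actions need no reminder.
        if today <= action["deadline"] <= horizon:
            days_left = (action["deadline"] - today).days
            messages.append(
                f"{action['owner']}: '{action['task']}' is due in {days_left} day(s)"
            )
    return messages


# Made-up action items, as they might be extracted from minutes.
actions = [
    {"owner": "Colleague A", "task": "send report", "deadline": date(2020, 8, 10), "done": False},
    {"owner": "Colleague B", "task": "review decision", "deadline": date(2020, 8, 20), "done": False},
    {"owner": "Colleague C", "task": "book room", "deadline": date(2020, 8, 9), "done": True},
]
for message in due_reminders(actions, today=date(2020, 8, 8)):
    print(message)
```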

An example of how a meeting runs today and how participants get their information: a meeting is planned, and shortly before it starts you receive the agenda without the underlying documents and decisions. Often you do not know where to find those documents, and you have very little time to look. As a result, you start the meeting insufficiently prepared, and sometimes a new meeting has to be planned to reach a sound decision.


If, after reading, you would like to share an insight that we may have missed, please let us know directly. We share these insights to keep you involved in the rest of our AI experiment, with the aim of ultimately reaching decisions more effectively, now and in the future!

Work with us on this challenge
