Reporting issues in public space
Tags: Reports, Complaints, Natural language processing
When someone encounters rubbish or a maintenance issue on the street or in a park, they can report this to the municipality via an online reporting system. A dangerous traffic situation or a disturbance caused by people or cafés can also be reported.
This system used to be a collection of drop-down menus, from which the user would pick the category that best suited their report. The department responsible for that category would then take care of the report. However, as the municipality is a complex organisation, there are countless categories. The wrong category was often chosen, resulting in delays. Now, an algorithm recognizes certain keywords in the report, for example ‘waste’ and ‘sidewalk’. From these keywords, it determines which category the report belongs to and, ultimately, which department within the municipality should examine the case.
As a result, there are fewer administrative steps for the person reporting the issue. The report can also be processed much faster, because it arrives at the right department more quickly.
- Research, Information & Statistics (OIS)
Contact person for inquiries
- Adviser R&D (Adviseur Onderzoek en ontwikkeling)
- Developed in-house
More detailed information on the system
Here you can get acquainted with the information used by the system, the operating logic, and its governance in the areas that interest you.
- Datasets
Key data sources utilised in the development and use of the system, their content and utilisation methods. The different data sources are separated by subheadings.
The dataset consists of reports that have been made previously. The initial training set consisted of 300,000 reports. The model is periodically retrained with new reports and with corrections to earlier reports. If the Action Service Center or another department receives an incorrectly categorised report (see Human Oversight), they can manually correct the mistake in the system. We are investigating whether additional training of the algorithm can be automated in the future.
The dataset cannot be published automatically via this register. Because the data is collected in a free text field, it is possible that the reporter enters personal data, even though this is explicitly not requested.
Contact information for follow-up questions
The person reporting can choose to provide their phone number and/or email address if they want to be notified of any updates or for follow-up questions. The information is kept no longer than is necessary for this specific purpose. This information is not used in the algorithm.
- Data processing
The operational logic of the automatic data processing and reasoning performed by the system and the models used.
The text of the report is broken down into single words. The model weights each word using TF-IDF (term frequency-inverse document frequency). This representation gives each word a weight that reflects how distinctive it is for the specific citizen report compared with the overall collection of reports. A word such as ‘the’ gets a low weight, and a word such as ‘garbage’ gets a higher weight. This makes the approach well suited to categories that are described by very specific words. It also keeps common words and phrases that occur in nearly all reports (unigrams and bigrams such as “please” and “thank you”) from affecting the classification too much.
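The weighting described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the plain textbook TF-IDF formula and whitespace tokenisation; the example reports and the function name `tfidf` are made up for illustration, and the production system may use a smoothed or normalised variant.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {word: weight} dict per document (textbook TF-IDF)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many reports does each word occur?
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            # term frequency * inverse document frequency
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return weights


# Hypothetical example reports, not taken from the real dataset.
reports = [
    "the garbage bag on the sidewalk",
    "the street light is broken",
    "the noise from the cafe",
]
first = tfidf(reports)[0]
print(first["the"], first["garbage"])
```

In this toy collection ‘the’ occurs in every report, so its inverse document frequency is zero and it carries no weight, while ‘garbage’ occurs in only one report and receives a positive weight.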
A logistic regression (a machine-learning technique) of that combination of words is then used to determine which category is most likely to fit, and therefore which department within the municipality needs to act on the report.
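As an illustration of how a logistic regression turns word weights into category probabilities, the sketch below scores a report against each category and normalises the scores with a softmax. It assumes a multinomial formulation; the category names, coefficients and feature values are hypothetical and are not the trained model's parameters.

```python
import math

# Hypothetical learned coefficients per category (word -> coefficient).
WEIGHTS = {
    "waste": {"garbage": 2.0, "sidewalk": 0.5, "bag": 1.0},
    "street lighting": {"light": 2.0, "broken": 0.8, "street": 0.5},
    "nuisance": {"noise": 2.0, "cafe": 1.0},
}
# Hypothetical intercepts, all zero to keep the example simple.
BIAS = {category: 0.0 for category in WEIGHTS}


def predict_proba(features):
    """Softmax over per-category linear scores."""
    scores = {
        category: BIAS[category]
        + sum(coefs.get(word, 0.0) * value for word, value in features.items())
        for category, coefs in WEIGHTS.items()
    }
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {category: math.exp(score - top) for category, score in scores.items()}
    total = sum(exps.values())
    return {category: value / total for category, value in exps.items()}


# Hypothetical TF-IDF weights for one report.
features = {"garbage": 0.6, "sidewalk": 0.4, "bag": 0.3}
probabilities = predict_proba(features)
best = max(probabilities, key=probabilities.get)
print(best, round(probabilities[best], 2))
```

The category with the highest probability is the one the report would be assigned to, and that probability can serve as the certainty of the assignment.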
The algorithm determines which category a combination of words belongs to with a macro-weighted F1 score of 0.88. Other methods were also implemented (W2V, CNN + LSTM, BERT) but were found to perform less well. More information: https://medium.com/maarten-sukel/how-to-use-machine-learning-for-the-classification-of-citizen-service-requests-b71159a85f36
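For readers unfamiliar with the metric, the sketch below computes a macro-averaged F1 score, that is, the unweighted mean of the per-category F1 scores. This is an assumption about what ‘macro-weighted’ means here; it could also refer to an average weighted by category size. The labels are made up and do not reproduce the 0.88 figure.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-category F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the category never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical true and predicted categories for four reports.
y_true = ["waste", "waste", "lighting", "nuisance"]
y_pred = ["waste", "lighting", "lighting", "nuisance"]
score = macro_f1(y_true, y_pred)
print(round(score, 3))
```

Because every category counts equally in a macro average, a small category that is classified poorly lowers the score as much as a large one.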
- Non-discrimination
Promotion and realisation of equality in the use of the service.
The algorithm is language-based. If someone does not speak Dutch or uses ‘unusual’ words, the algorithm may not recognize those words. In that case, the Action Service Center will assess the report, and the algorithm will be retrained if needed.
- Human oversight
Human oversight during the use of the service.
All reports that the system assigns to a specific category with less than 40% certainty are forwarded to the Action Service Center. An employee of the Action Service Center then assesses which category is best suited for the report. In addition, if a report is assigned to the wrong category, the department responsible will manually recategorise it.
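The routing rule above can be sketched as follows. The 40% threshold comes from the text; the function name `route`, the return format and the example probabilities are assumptions made for illustration.

```python
CONFIDENCE_THRESHOLD = 0.40  # from the text: below 40% certainty, a human decides


def route(probabilities, threshold=CONFIDENCE_THRESHOLD):
    """Return (destination, category) for a dict of category probabilities."""
    category = max(probabilities, key=probabilities.get)
    if probabilities[category] < threshold:
        # Too uncertain: an employee picks the category manually.
        return ("Action Service Center", None)
    return ("department", category)


# Hypothetical model outputs for two reports.
confident = route({"waste": 0.85, "street lighting": 0.10, "nuisance": 0.05})
uncertain = route({"waste": 0.35, "street lighting": 0.33, "nuisance": 0.32})
print(confident)
print(uncertain)
```

The first report goes straight to the department for its category, while the second falls below the threshold and is handed to the Action Service Center.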
- Risks
Risks related to the system and its use and their management methods.
This is a low-risk algorithm. The algorithm is meant to speed up the process of allocating reports to the relevant department. That means that if the algorithm is incorrect in allocating a report, it will take a bit more time for the report to reach the right destination, but no longer than it would have before the use of the algorithm.
Providing personal information is optional if the person reporting wants to get updates. This information is not used in the algorithm. Encryption is used to store this information securely.