City management
Overview
Reporting issues in public space
Tags
Reports, Complaints, Natural language processing
When someone encounters rubbish or a maintenance issue on the street or in a park, they can report this to the municipality via an online reporting system. A dangerous traffic situation or a disturbance caused by people or cafés can also be reported.
This system used to be a collection of drop-down menus, from which the user would pick the category that best suited their report. The department responsible for a certain category would then take care of the report. However, as the municipality is a complex organisation, there are countless categories. Many times the wrong category would be chosen, resulting in delays. Now, an algorithm recognizes certain keywords, for example, ‘waste’ and ‘sidewalk’. From these keywords, it determines which category it belongs to, and ultimately, which department within the municipality should examine the case.
As a result, there are fewer administrative steps for the person reporting on the issue. Also, the report can be processed much faster, because it arrives at the right department more quickly.
Link to service
Contact information
Department
- Research, Information & Statistics (OIS)
Contact team for inquiries
- Adviser R&D (Adviseur Onderzoek en ontwikkeling)
External suppliers
- Developed in-house
Contact email
- CIO-office@amsterdam.nl
Contact phone
- +31 20 624 1111
More detailed information on the system
Here you can get acquainted with the information used by the system, the operating logic, and its governance in the areas that interest you.
- Datasets
Key data sources utilised in the development and use of the system, their content and utilisation methods. The different data sources are separated by subheadings.
Name
Reports
Dataset description
The dataset consists of reports that have been made previously. The initial training set consisted of 300,000 reports. The model is periodically trained with new reports and corrections on prior reports. If the Action Service Center or other departments receive an incorrect categorisation (see Human Oversight), they can manually correct the mistake in the system. We are investigating whether additional training of the algorithm can be automated in the future.
The dataset cannot be published automatically via this register. Because the data is collected in a free text field, it is possible that the reporter enters personal data, even though this is explicitly not requested.
Personal data
No personal data
Name
Contact information for follow-up questions
Dataset description
The person reporting can choose to provide their phone number and/or email address if they want to be notified of any updates or for follow-up questions. The information is kept no longer than is necessary for this specific purpose. This information is not used in the algorithm.
For our privacy policy, see: https://www.amsterdam.nl/privacy/specifieke/privacyverklaringen-wonen/meldingen-overlast-privacy/
Personal data
Identified
- Human oversight
Human oversight during the use of the service.
All reports that are assigned by the system to a specific category with less than 40% certainty are forwarded to the Action Service Center. An employee of the Action Service Center will assess which category is best suited for the report. Also, if reports are tagged to the wrong category, the department responsible will manually recategorise them.
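The routing rule described above can be sketched as follows. This is a minimal illustration, not the municipality's actual code: the function name `route_report`, the category names and the probability values are all hypothetical. Only the 40% threshold and the fallback to the Action Service Center come from the text.

```python
from typing import Dict, Tuple

# From the text: reports categorised with less than 40% certainty go to a human.
CONFIDENCE_THRESHOLD = 0.40
MANUAL_REVIEW = "Action Service Center"


def route_report(category_probabilities: Dict[str, float]) -> Tuple[str, float]:
    """Return the destination for a report and the model's certainty.

    `category_probabilities` maps each category to the probability the
    model assigned to it (hypothetical input format).
    """
    category, certainty = max(category_probabilities.items(), key=lambda kv: kv[1])
    if certainty < CONFIDENCE_THRESHOLD:
        # Too uncertain: an employee picks the category instead.
        return MANUAL_REVIEW, certainty
    return category, certainty


# Hypothetical model outputs for two reports.
confident = route_report({"waste": 0.82, "roads": 0.10, "nuisance": 0.08})
uncertain = route_report({"waste": 0.35, "roads": 0.33, "nuisance": 0.32})
print(confident)
print(uncertain)
```

The first report goes straight to the category with the highest probability. The second stays below the threshold for every category, so it is handed to an employee.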
- Data processing
The operational logic of the automatic data processing and reasoning performed by the system and the models used.
System architecture description
The text of the report is broken down into single words. Each word is then given a weight using TF-IDF (term frequency-inverse document frequency). This weight shows how distinctive a word is for the specific citizen report compared to the overall collection of reports. A word such as ‘the’ gets a low weight, and a word such as ‘garbage’ gets a higher weight. This works well for categories that are described by very specific words. It also ensures that unigrams or bigrams that occur throughout the collection (such as ‘please’ or ‘thank you’) do not affect the classification too much.
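To make the weighting concrete, here is a small sketch of the textbook TF-IDF formula in plain Python. The example reports are invented, and the exact variant used in production (smoothing, normalisation, tokenisation) is not stated in this register, so treat the formula below as an assumption.

```python
import math
from collections import Counter

# Invented example reports; real reports are free text written by citizens.
reports = [
    "the garbage bag is on the sidewalk",
    "the street light is broken",
    "the garbage container is full",
    "the noise from the cafe is loud",
]


def tokenize(text):
    """Break a report into single lowercase words."""
    return text.lower().split()


def tf_idf(documents):
    """Return one {word: weight} dict per document (unsmoothed textbook TF-IDF)."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Number of documents in which each word appears at least once.
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            # term frequency * inverse document frequency
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return weights


first = tf_idf(reports)[0]
print(first["the"], first["garbage"], first["sidewalk"])
```

Because ‘the’ appears in every example report, its weight is zero. ‘sidewalk’ appears in only one report and therefore gets a higher weight than ‘garbage’, which appears in two.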
A logistic regression (a machine-learning technique) of that combination of words is then used to determine which category is most likely to fit, and therefore which department within the municipality needs to act on the report.
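The classification step can be illustrated with a multinomial (softmax) logistic regression over the word weights. This is a sketch under assumptions: the register only says "logistic regression", so the softmax form, the categories and every coefficient in `COEFFICIENTS` are hypothetical and chosen purely for illustration. A real model learns its coefficients from the training reports.

```python
import math

# Hypothetical learned coefficients: one weight per (category, word).
COEFFICIENTS = {
    "waste": {"garbage": 2.0, "container": 1.5, "sidewalk": 0.5},
    "street lighting": {"light": 2.5, "broken": 1.0},
    "nuisance": {"noise": 2.0, "cafe": 1.5, "loud": 1.0},
}


def predict_proba(features):
    """Turn word weights (for example TF-IDF values) into category probabilities."""
    # Linear score per category: sum of coefficient * feature value.
    scores = {
        category: sum(coefs.get(word, 0.0) * value for word, value in features.items())
        for category, coefs in COEFFICIENTS.items()
    }
    # Softmax, shifted by the highest score for numerical stability.
    highest = max(scores.values())
    exps = {category: math.exp(score - highest) for category, score in scores.items()}
    total = sum(exps.values())
    return {category: value / total for category, value in exps.items()}


# Hypothetical word weights for a report about garbage on the sidewalk.
probabilities = predict_proba({"garbage": 0.7, "sidewalk": 0.9})
best_category = max(probabilities, key=probabilities.get)
print(best_category, round(probabilities[best_category], 2))
```

The category with the highest probability determines which department receives the report.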
System architecture image
reporting_issues_in_public_space.png
Output data repository
Performance
This algorithm can detect very accurately which category a combination of words belongs to: it achieves a score of 0.88 (macro-weighted F1 score). Other methods were also tested (W2V, CNN + LSTM, BERT) but were found to perform worse. More information: https://medium.com/maarten-sukel/how-to-use-machine-learning-for-the-classification-of-citizen-service-requests-b71159a85f36
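For readers unfamiliar with the metric, the sketch below computes a macro-averaged F1 score in plain Python. The register's wording ("macro-weighted") does not say exactly which averaging was used, so the unweighted macro average shown here is an assumption, and the labels are invented.

```python
def f1_for_label(y_true, y_pred, label):
    """F1 score for one category: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(t == label and p == label for t, p in pairs)
    false_pos = sum(t != label and p == label for t, p in pairs)
    false_neg = sum(t == label and p != label for t, p in pairs)
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-category F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, label) for label in labels) / len(labels)


# Invented example: correct categories versus the model's predictions.
y_true = ["waste", "waste", "roads", "nuisance"]
y_pred = ["waste", "roads", "roads", "nuisance"]
score = macro_f1(y_true, y_pred)
print(round(score, 3))
```

Because every category counts equally in a macro average, a small category that is often misclassified lowers the score as much as a large one would.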
Source code repository
- Non-discrimination
Promotion and realisation of equality in the use of the service.
The algorithm is language-based. If someone does not speak Dutch or uses ‘unusual’ words, the algorithm may not recognize those words. In that case, the Action Service Center will assess the report, and the algorithm will be retrained if needed.
Language
1
- References
Legal basis description
Live service address
Privacy policy address
- Risk management
Risks related to the system and its use and their management methods.
This is a low-risk algorithm. The algorithm is meant to speed up the process of allocating reports to the relevant department. That means that if the algorithm is incorrect in allocating a report, it will take a bit more time for the report to reach the right destination, but no longer than it would have before the use of the algorithm.
Providing personal information is optional if the person reporting wants to get updates. This information is not used in the algorithm. Encryption is used to store this information securely.
Risk name
Personal data (by accident)
Risk description
The person reporting does so in a free text field. There is a possibility that they enter personal data.
Frequency
Low
completed
Risk mitigation description
There is an explicit warning above the text field not to enter any personal data.
Probability
Low
Scale
Low
Severity
High
Risk type
Privacy loss
Risk name
Personal data (voluntary)
Risk description
Personal data (email address and phone number) are registered if the person reporting chooses to provide them in a specific field.
Frequency
Low
completed
Risk mitigation description
The data is stored securely in accordance with the Baseline Information Security Government (BIO) standard.
Probability
Low
Scale
Low
Severity
High
Risk type
Privacy loss