Reporting issues in public space

City management

Overview

Reporting

Tags

Reports Complaints Natural language processing

When someone encounters rubbish or a maintenance issue on the street or in a park, they can report it to the municipality via an online reporting system. A dangerous traffic situation or a disturbance caused by people or cafés can also be reported.

This system used to be a collection of drop-down menus, from which the user would pick the category that best suited their report. The department responsible for that category would then take care of the report. However, as the municipality is a complex organisation, there are countless categories. The wrong category was often chosen, resulting in delays. Now, an algorithm recognizes certain keywords in the report, for example ‘waste’ and ‘sidewalk’. From these keywords, it determines which category the report belongs to and, ultimately, which department within the municipality should examine the case.

As a result, there are fewer administrative steps for the person reporting the issue. The report can also be processed much faster, because it arrives at the right department more quickly.


Contact information


  • Department: Research, Information & Statistics (OIS)
  • Contact team for inquiries: Adviser R&D (Adviseur Onderzoek en ontwikkeling)
  • External suppliers: Developed in-house
  • Contact email: CIO-office@amsterdam.nl
  • Contact phone: +31 20 624 1111


More detailed information on the system

Here you can get acquainted with the information used by the system, the operating logic, and its governance in the areas that interest you.


Datasets

Key data sources utilised in the development and use of the system, their content and utilisation methods. The different data sources are separated by subheadings.

Name

Reports

Dataset description

The dataset consists of reports that have been made previously. The initial training set consisted of 300,000 reports. The model is periodically retrained with new reports and with corrections to earlier reports. If the Action Service Center or another department receives an incorrectly categorised report (see Human oversight), they can manually correct the mistake in the system. We are investigating whether additional training of the algorithm can be automated in the future.

The dataset cannot be published automatically via this register. Because the data is collected in a free text field, it is possible that the reporter enters personal data, even though this is explicitly not requested.

Personal data

No personal data

Name

Contact information for follow-up questions

Dataset description

The person reporting can choose to provide their phone number and/or email address if they want to be notified of any updates or for follow-up questions. The information is kept no longer than is necessary for this specific purpose. This information is not used in the algorithm.

For our privacy policy, see: https://www.amsterdam.nl/privacy/specifieke/privacyverklaringen-wonen/meldingen-overlast-privacy/

Personal data

Identified

Human oversight

Human oversight during the use of the service.

All reports that are assigned by the system to a specific category with less than 40% certainty are forwarded to the Action Service Center. An employee of the Action Service Center will assess which category is best suited for the report. Also, if reports are tagged to the wrong category, the department responsible will manually recategorise them.
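The routing rule above can be sketched in a few lines. This is only an illustration under stated assumptions: the 40% threshold comes from the text, but the function name `route_report`, the category names, and the shape of the probability dictionary are hypothetical and not taken from the municipality's system.

```python
from typing import Dict, Tuple

# From the text: predictions with less than 40% certainty go to a human.
CONFIDENCE_THRESHOLD = 0.40


def route_report(probabilities: Dict[str, float],
                 threshold: float = CONFIDENCE_THRESHOLD) -> Tuple[str, float, bool]:
    """Return (best_category, confidence, needs_human_review).

    `probabilities` maps each category name to the model's probability
    for that category (hypothetical input format).
    """
    category, confidence = max(probabilities.items(), key=lambda kv: kv[1])
    needs_review = confidence < threshold
    return category, confidence, needs_review


# Hypothetical categories, for illustration only.
confident = route_report({"waste": 0.70, "roads": 0.20, "noise": 0.10})
uncertain = route_report({"waste": 0.35, "roads": 0.33, "noise": 0.32})
print(confident)  # the report goes straight to the predicted category
print(uncertain)  # the report is flagged for the Action Service Center
```

In this sketch a flagged report still carries the model's best guess, so the employee reviewing it could use that guess as a starting point.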

Data processing

The operational logic of the automatic data processing and reasoning performed by the system and the models used.

System architecture description

The text of the report is broken down into single words. The model weighs each word using TF-IDF (term frequency-inverse document frequency). This representation gives each word a weight that reflects how distinctive it is for the specific citizen report compared to the overall collection of reports. A word such as ‘the’ gets a low weight, and a word such as ‘garbage’ gets a higher weight. This makes the approach well suited to categories that are described by very specific words. It also ensures that unigrams and bigrams that occur across all reports (such as “please” or “thank you”) do not affect the classification too much.
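To make the weighting concrete, here is a minimal sketch of TF-IDF on three made-up English reports. It uses the textbook definition (term frequency multiplied by the logarithm of the inverse document frequency); the exact variant, tokenizer, and vocabulary used in the production system are not stated in the source, so the function names and example sentences below are assumptions.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into single words (ASCII letters only)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf(documents: List[str]) -> List[Dict[str, float]]:
    """Return one {word: weight} dictionary per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each word occur?
    doc_freq: Counter = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return weights


# Made-up example reports.
reports = [
    "the garbage bag on the sidewalk",
    "the street light is broken",
    "the noise from the cafe",
]
weights = tfidf(reports)
print(weights[0]["the"])      # occurs in every report, so its weight is 0.0
print(weights[0]["garbage"])  # occurs in one report only, so its weight is positive
```

Libraries often use smoothed variants of this formula, in which a word that occurs everywhere gets a small weight rather than exactly zero.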

A logistic regression (a machine-learning technique) of that combination of words is then used to determine which category is most likely to fit, and therefore which department within the municipality needs to act on the report.
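The classification step can be illustrated with a small multi-class logistic regression written from scratch. This is a sketch, not the municipality's implementation: the softmax (multinomial) formulation, the training loop, the hyperparameters, the word features, and the two categories are all assumptions chosen to keep the example short and self-contained.

```python
import math
from typing import Dict, List, Tuple

Model = Tuple[List[str], Dict[str, Dict[str, float]], Dict[str, float]]


def softmax(scores: List[float]) -> List[float]:
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train(features: List[Dict[str, float]], labels: List[str],
          epochs: int = 200, lr: float = 0.5) -> Model:
    """Fit one weight per (category, word) pair with stochastic gradient descent."""
    classes = sorted(set(labels))
    vocab = sorted({word for feats in features for word in feats})
    weights = {c: {w: 0.0 for w in vocab} for c in classes}
    bias = {c: 0.0 for c in classes}
    for _ in range(epochs):
        for feats, label in zip(features, labels):
            scores = [bias[c] + sum(weights[c][w] * v for w, v in feats.items())
                      for c in classes]
            probs = softmax(scores)
            for c, p in zip(classes, probs):
                error = p - (1.0 if c == label else 0.0)  # gradient of the log loss
                bias[c] -= lr * error
                for w, v in feats.items():
                    weights[c][w] -= lr * error * v
    return classes, weights, bias


def predict_proba(model: Model, feats: Dict[str, float]) -> Dict[str, float]:
    """Return a probability per category; words unseen in training are ignored."""
    classes, weights, bias = model
    scores = [bias[c] + sum(weights[c].get(w, 0.0) * v for w, v in feats.items())
              for c in classes]
    return dict(zip(classes, softmax(scores)))


# Hypothetical word weights (for example TF-IDF values) and categories.
train_features = [
    {"garbage": 1.0, "sidewalk": 1.0},
    {"waste": 1.0, "container": 1.0},
    {"street": 1.0, "light": 1.0, "broken": 1.0},
    {"lamp": 1.0, "broken": 1.0},
]
train_labels = ["waste", "waste", "lighting", "lighting"]

model = train(train_features, train_labels)
probs = predict_proba(model, {"garbage": 1.0, "container": 1.0})
best = max(probs, key=probs.get)
print(best, probs)
```

The highest probability in `probs` plays the role of the certainty that the Human oversight section compares against the 40% threshold.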

System architecture image

reporting_issues_in_public_space.png

Output data repository

Performance

This algorithm detects which category a combination of words belongs to with high accuracy: it achieves a score of 0.88 (macro-weighted F1 score). Other methods have also been implemented (W2V, CNN + LSTM, BERT) but were found to perform worse. More information: https://medium.com/maarten-sukel/how-to-use-machine-learning-for-the-classification-of-citizen-service-requests-b71159a85f36
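For readers unfamiliar with the metric, the sketch below computes a macro-averaged F1 score: the F1 score is calculated per category and then averaged, so small categories count as much as large ones. The source's term "macro-weighted" does not say whether the per-category scores are averaged equally or weighted by category size; the unweighted macro average shown here is an assumption, and the labels are made up.

```python
from typing import List


def macro_f1(y_true: List[str], y_pred: List[str]) -> float:
    """Unweighted mean of the per-category F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Made-up labels: one 'waste' report is misclassified as 'lighting'.
y_true = ["waste", "waste", "lighting", "noise"]
y_pred = ["waste", "lighting", "lighting", "noise"]
score = macro_f1(y_true, y_pred)
print(round(score, 4))
```

Here the per-category scores are 2/3 for 'waste', 2/3 for 'lighting', and 1 for 'noise', which averages to 7/9.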

Source code repository

Non-discrimination

Promotion and realisation of equality in the use of the service.

The algorithm is language-based. If someone does not speak Dutch or uses ‘unusual’ words, the algorithm may not recognize those words. In that case, the Action Service Center will assess the report, and the algorithm will be retrained if needed.

Language

1
References

Legal basis description

Live service address

Privacy policy address

Risk management

Risks related to the system and its use and their management methods.

This is a low-risk algorithm. The algorithm is meant to speed up the process of allocating reports to the relevant department. That means that if the algorithm is incorrect in allocating a report, it will take a bit more time for the report to reach the right destination, but no longer than it would have before the use of the algorithm.

Providing personal information is optional; it is only needed if the person reporting wants to receive updates. This information is not used in the algorithm. Encryption is used to store this information securely.

Risk name

Personal data (by accident)

Risk description

The person reporting does so in a free text field. There is a possibility that they enter personal data.

Frequency

Low

Risk mitigation description

There is an explicit warning above the text field not to enter any personal data.

Probability

Low

Scale

Low

Severity

High

Risk type

Privacy loss

Risk name

Personal data (voluntary)

Risk description

Personal data (email address and phone number) are registered if the person reporting chooses to provide them in a specific field.

Frequency

Low

Risk mitigation description

The data is stored securely in accordance with the Baseline Information Security Government (BIO) standard.

Probability

Low

Scale

Low

Severity

High

Risk type

Privacy loss
