“Red zones” in Sicily: a story of civic hacking

Note: this is a (more or less literal) translation of an original piece published by Salvatore Fiandaca on the 10th of April 2021. Thanks to Paola Chiara Masuzzo for translating it into English.


If you live in Sicily and want to know whether your city is currently red, i.e. under specific COVID-19 restrictions (Italy uses a colour-coded scheme), all you need to do is read 197 pages of PDF files published by the regional government.

The official website of Sicily (Regione Siciliana) does not publish any list of the cities that are currently red (the highest level of restriction): this information is trapped in a webpage that collects the so-called ‘ordinanze’ (we will refer to these documents as ‘decrees’ in the rest of this piece), a set of PDF files reporting the latest decisions made by the regional government.

This list is not kept up to date: at noon on the 10th of April 2021, the files of the 9th of April (numbers 37 and 38) were still not available.

An alternative way to consult these data is the map built by the ‘Dipartimento della Protezione Civile Regionale’, which, however, has several limitations:

  • at 11:30 am on the 10th of April 2021, the map lists several cities that will only become red on the 11th of April, the following day
  • the map does not provide any information on the timeline of the restrictions, i.e. since when and until when a city is under red-zone restrictions; this information is absolutely key, and given the point above, we can’t even be sure that the map gives an accurate snapshot of the current situation
  • from the map, it’s impossible to trace back the files that define the restrictions, their timing, and their scope. It’s crucial to be able to link this type of data back to the raw, official source of information (especially given that this information is provided and published by the local authorities)
  • the geographic representation of the map alone is not enough: a structured and up-to-date file containing the data shown on the map is nowhere to be found

So we took care of it ourselves: https://tinyurl.com/zonerossesicilia


Note: after the publication of the original post in Italian, the “Dipartimento della Protezione Civile Regionale” updated its map, corrected it and added the necessary basic information. We appreciate that: it is a nice result of this community’s action.


The documents that list the governmental decisions for the red zones in Sicily (often not even real PDFs, but simply scans of paper documents) are collected on the official website under the section “SERVIZI INFORMATIVI | ORDINANZE COVID-19”; when this article was written (10th of April 2021), there were 36 of these files.

PDF files are not the easiest to read, especially when they are images (in which case you can’t even search for text): not for machines, nor for humans, who would be lost in the face of so much bureaucratic jargon. So we thought of collecting all the cities in Sicily with red-zone restrictions in a public Google sheet.

The PDF files are on average 7 pages long; the section that contains the useful data is usually in the last two pages.

To remedy these ugly PDF files (sigh) and make it easier to both create and read maps and structured lists, we created a Google sheet where we annotated (after having read every single PDF published online in 2021) the cities in red code, together with the time interval, including any extension of the restrictions, and a link to the PDF file of the decree.

The main idea is to create a list like the one shown in the screenshot below, allowing these data to become public goods, or, in Italian, ‘bene comune’ (Open Data Sicilia is a big promoter of the running campaign).

What we would like to have is a dataset with at least the following five attributes:

  1. comune, the name of the city as reported in the PDF file
  2. pro_com, the unique, numeric ISTAT code
  3. dataInizio, the beginning date of the red code for the city
  4. dataFine, the end date of the red code for the city
  5. link, the URL of the PDF file.
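
As a minimal sketch of how a machine could consume such a table, the Python snippet below parses rows with these five attributes and answers the question "is this city red on a given day?". The ISO date format (YYYY-MM-DD), the sample row, its code and its URL are assumptions made up for illustration; they are not taken from the actual sheet.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RedZone:
    comune: str
    pro_com: str       # ISTAT code, kept as text so any leading zeros survive
    dataInizio: date
    dataFine: date
    link: str

    def is_red_on(self, day: date) -> bool:
        """True if the restriction covers `day` (both ends inclusive)."""
        return self.dataInizio <= day <= self.dataFine


def load_red_zones(csv_text: str) -> list[RedZone]:
    """Parse CSV text with the five columns; dates are assumed to be YYYY-MM-DD."""
    zones = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        start = date.fromisoformat(row["dataInizio"])
        end = date.fromisoformat(row["dataFine"])
        if end < start:
            raise ValueError(f"{row['comune']}: dataFine precedes dataInizio")
        zones.append(RedZone(row["comune"], row["pro_com"], start, end, row["link"]))
    return zones


# Hypothetical sample row, made up for illustration.
SAMPLE = """comune,pro_com,dataInizio,dataFine,link
Esempio,000000,2021-04-11,2021-04-22,https://example.org/ordinanza.pdf
"""

zones = load_red_zones(SAMPLE)
# Cities under restrictions on the 10th of April 2021.
red_today = [z.comune for z in zones if z.is_red_on(date(2021, 4, 10))]
```

With explicit start and end dates, a city whose restrictions only begin the following day is not reported as red today, which is exactly the ambiguity a map without a timeline cannot resolve.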

We read all the PDFs of 2021 (28 at the time of writing) and it took us about one hour, certainly a reasonable amount of time.

After we were done with the sheet, we started the second phase, which consisted of rendering these tabular data on a map easily accessible to everyone:

It is important to note that both the entire dataset and the PDF for each listed city can be downloaded directly from the map.

Final notes

Creating and publishing an up-to-date list, a CSV file – really, any kind of format of structured text that machines and humans can read – isn’t that much work, but is, first and foremost, a matter of providing a public service to the citizens, a service that is very much needed and due. 

Scanned PDFs prevent text searches and make the documents unusable for blind people. In our opinion, these two points alone suffice to say that Sicilian authorities need to do better, but there are also official regulations that address these matters, like the Digital Administration Code (Codice dell’Amministrazione Digitale, CAD).

These documents must be accessible and usable regardless of personal abilities, according to the accessibility criteria defined by the technical requirements referred to in Article 11 of Law no. 4 of 9 January 2004.

For 2021, 16 out of 28 decrees are scanned.

We therefore ask the Sicilian Region and the ‘Dipartimento della Protezione Civile Regionale’ to create a list of the red areas in the Sicilian territory, readable by people and personal computers, complete with the essential information that we have listed, in which it is also possible to read the temporal changes of the assignments of the restrictions.

Finally, we also urge a speedy solution to the accessibility problems described above.

References

The cover image of this piece refers to the situation until the 10th of April 2021, right before the decree number 38 of the 9th of April 2021.

What do we know about the air quality in Palermo?

Some weeks ago we came across an EU publication about urban life in 79 European cities. Among the surveyed cities was Palermo. The following graphic on page 50 attracted our interest. It shows the proportion of people satisfied with public transport services and with the quality of air in their city. For both indicators Palermo shows very low levels.

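[Figure: proportion of people satisfied with public transport services and with air quality in their city (page 50 of the EU publication)]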

We became interested in what kind of air quality data are currently available in Palermo. The central agency which collects and publishes information about air quality is RAP Palermo. According to their website, the monitoring network in Palermo is built on ten stations. RAP Palermo collects data on five key pollutants (sulfur dioxide (SO2), carbon dioxide (CO2), nitrogen dioxide (NO2), ozone (O3) and particulate matter (PM10)) 24 hours a day.

The results are published on their website as daily and monthly reports, and are available from 2013 onwards. This is the daily report for 16 September.

[Figure: RAP Palermo daily report for 16 September]

The page shows that not all stations collect data for every pollutant. Grey fields indicate station/pollutant combinations which are not measured. We call them “missing type 1” data. White fields, on the other hand, are combinations that are monitored but for which no value is available on this particular day (“nd”). These are “missing type 2” data. Valid measurements are the fields containing a value, shown in green, yellow or red. Only the stations Boccadifalco and Castelnuovo collect data for all five pollutants.
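
The three kinds of cell described above can be told apart with a few lines of Python. This is only a sketch: how a grey field is represented after extraction (here a boolean `is_monitored` flag), the treatment of an empty cell as "nd" and the handling of a decimal comma are all assumptions, not features of the actual reports.

```python
from enum import Enum
from typing import Optional


class CellStatus(Enum):
    NOT_MEASURED = "missing type 1"   # grey field: pollutant not measured at the station
    NOT_AVAILABLE = "missing type 2"  # white field: monitored, but "nd" on that day
    VALID = "valid"


def classify_cell(is_monitored: bool, raw: Optional[str]) -> CellStatus:
    """Classify one station/pollutant cell of a daily report."""
    if not is_monitored:
        return CellStatus.NOT_MEASURED
    text = (raw or "").strip().lower()
    if text in ("", "nd"):
        return CellStatus.NOT_AVAILABLE
    # Assumed numeric format; raises ValueError on unexpected content.
    float(text.replace(",", "."))
    return CellStatus.VALID
```

Keeping the two kinds of missing data separate matters for any later statistic: a station that never measures a pollutant should not count as a station with gaps.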

The most interesting data for analysis are the daily ones, but the RAP website puts up two information barriers:

  • there is one file for every day, so they have to be downloaded one by one: a full year means 365 downloads;
  • all files are in PDF format, so it’s impossible to use them in analysis tools such as a spreadsheet or a statistical framework.
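
The first barrier is mostly a matter of enumeration. The sketch below lists the file names one would expect for a whole year, assuming the daily reports follow the `bollettino_YYYYMMDD.pdf` naming seen in the file names quoted in this post. The function name is hypothetical, no base URL is assumed, and this is not the R or Python procedure mentioned in the post.

```python
from datetime import date, timedelta


def bulletin_names(year: int) -> list[str]:
    """Return the expected daily file names for `year`, one per calendar day."""
    day = date(year, 1, 1)
    names = []
    while day.year == year:
        names.append(f"bollettino_{day:%Y%m%d}.pdf")
        day += timedelta(days=1)
    return names


names_2016 = bulletin_names(2016)
```

Comparing such a list with the files actually found on the website is also a simple way to spot days for which no report was published.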

So we had to deal with this. As an example, we downloaded all available daily reports for the year 2016 and extracted the data: here are the 2016 RAP daily data in a single CSV file.

We have built two different procedures to do it: one in R (Patrick’s) and one in Python (Andy’s), and we will publish them in the coming weeks.

Why it’s better this way

Having a single file, in a format that you can use for calculations, visualizations and analysis, is a little treasure.

Some examples:

  • it’s possible to compare different time periods (think of the “ZTL”, the restricted traffic area which came into force in Palermo on 10 October 2016);
  • it’s simple to build automation tools (alert me every time PM10 exceeds the legal limit at this station); the data will always be those of the day before, but we think that’s much better than nothing, and much better than looking at a PDF file in a browser every day;
  • it’s possible to combine these data with other kinds of data (rain and wind data, for example) to do some more advanced environmental analysis;
  • it’s possible to discover that:
    • there are no records for the Bellolampo station in the PDFs. Why?
    • there are too many days with no data, even for fundamental measures like PM10. Why?
      • over one year, it is forbidden to have more than 35 days with PM10 > 50 µg/m3. At the Di Blasi station (outside the ZTL) this has already happened 33 times; moreover, this station had 21 days without any measurements.
    • some files, like bollettino_20161126.pdf, bollettino_20160508.pdf and bollettino_20160408.pdf, are not published.
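
Counts like the ones above are easy to reproduce once the data are in a structured form. The following Python sketch counts exceedance days and days without a measurement for one station. The 50 µg/m3 threshold is an assumption (the EU daily limit value for PM10), the function name is hypothetical, and the sample values are made up for illustration, not real RAP data.

```python
from typing import Optional

PM10_DAILY_LIMIT = 50.0   # µg/m3, assumed EU daily limit value
MAX_EXCEEDANCE_DAYS = 35  # days per year, as stated in the text


def summarize_pm10(daily_values: list[Optional[float]]) -> dict:
    """Count exceedance days and days without a measurement (None)."""
    exceed = sum(1 for v in daily_values if v is not None and v > PM10_DAILY_LIMIT)
    missing = sum(1 for v in daily_values if v is None)
    return {
        "exceedance_days": exceed,
        "missing_days": missing,
        "over_yearly_allowance": exceed > MAX_EXCEEDANCE_DAYS,
    }


# Made-up values, for illustration only.
sample = [32.0, 51.5, None, 50.0, 78.2, None, 12.4]
summary = summarize_pm10(sample)
```

The same function, run every morning on the previous day's data, would be enough for a basic alert when a station gets close to the yearly allowance. Days without data are reported separately because each of them could hide a further exceedance.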

This post is a first step, a kind of introduction. In the coming weeks we will launch a website dedicated to these data, on which we will publish more data, the code we use to download them, some maps, some charts, some interactive visualizations and some posts about these data.

Stay tuned!