Back on November 2nd, the City of Toronto launched its toronto.ca/open service – a project intended to be the “official data set catalogue” of the city. It is part of the OpenTO initiative, which aims to make data the city has collected available to developers in formats that are easy to manipulate, and toronto.ca/open is the first step towards making the city and its services more “open and accessible”.
The hope is that if the city makes this information available in a readily usable form, developers will take the time to create services around it, helping citizens enjoy and take advantage of what Toronto has to offer. For example, data on garbage collection, public transit or upcoming city events could be used to create a service that alerts users by email, text message or other means. But at present, the amount and types of data available are fairly limited.
Announced back in April at the Mesh Conference in Toronto by Mayor David Miller, the openness initiative had a lot to live up to in the half-year leading up to its release earlier this month at the Toronto Innovation Showcase.
Several cities have already implemented similar projects, leaving Toronto to play catch-up. Vancouver, for example, launched their own Open Data initiative earlier this year, and Toronto’s data catalogue appears to be modeled after it. Edmonton, a much smaller and less dense city than Toronto, has also recently begun an initiative to open their city’s data to the public in a similar manner, with promising results.
But the data is better late than never, so I applaud the City’s effort to democratize its data and help keep citizens better informed.
The data itself
Looking at what’s offered, you can see that most of the data sets or services have a geographical component to them. This only makes sense, as hyperlocal information is always of great interest to most people.
The data is presented in a variety of formats, but the most popular seems to be the ESRI Shapefile, a file format used by a popular suite of GIS software. With much of the data having a geospatial aspect to it, this makes sense, as the data likely didn’t require much conversion before being released by the City. (I.e., they probably use ESRI’s applications for city planning, etc.)
Thankfully, the format is well documented, both by its developer and on Wikipedia. All of these ESRI Shapefiles are intended to be downloaded for offline use, as they are typically made available inside a zip file. Thus, the process of keeping them up to date may prove tedious, though I’m sure an enterprising developer could automate it nicely.
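To give a rough idea of what that automation might look like, here is a minimal Python sketch. It assumes the archive is fetched from some URL (DATASET_URL below is a placeholder, not a real catalogue address) and simply compares a hash of the new download against the last one seen, so the files are only re-extracted when something has actually changed.

```python
import hashlib
import io
import urllib.request
import zipfile
from pathlib import Path


def fetch(url: str) -> bytes:
    """Download the archive at `url` and return its raw bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def refresh_dataset(archive: bytes, target_dir: Path) -> bool:
    """Extract `archive` into `target_dir` if it differs from the last copy seen.

    Returns True when the files were (re)extracted, False when nothing changed.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    digest_file = target_dir / ".sha256"
    digest = hashlib.sha256(archive).hexdigest()

    # Skip the extraction when the stored digest matches the new download.
    if digest_file.exists() and digest_file.read_text() == digest:
        return False

    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        zf.extractall(target_dir)
    digest_file.write_text(digest)
    return True
```

Run from a scheduled job, something like refresh_dataset(fetch(DATASET_URL), Path("data/wards")) would keep a local copy current. The directory name is made up for illustration, and a politer version would ask the server whether the file has changed before downloading it at all.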
Other data has been made available in various XML formats. Some are intended to be accessed as online services, with a link to the XML document and the corresponding XSD file, while at least one is presumably meant to be used offline since it’s zipped up. Some of the XML “Feeds” are also in a format without a defined schema and are not RSS or Atom feeds. For example, the feed of City-sponsored events would be better served by an RSS or Atom feed, which would not only allow developers to integrate it into an application, but also regular users to sign up to the feed using any newsreader.
There are also some web services explicitly offered. The first is a geocoder service that can be used to validate addresses and other place names within the City. It is a WSDL-described service, and a PDF documents the input parameters and expected output in full.
The second is a series of services that provide access to “live geospatial data from the City of Toronto”. These are all Web Map Services that conform to a protocol specification developed by the Open Geospatial Consortium. It would have been nice if they had linked to the WMS Specification (PDF document) that explains how to query the services, instead of forcing you to find the information yourself. (See page 14 and onwards of that document.)
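For anyone who doesn’t want to wade through the specification, a WMS query is just an HTTP GET with a handful of standard parameters. The Python sketch below builds two such URLs, GetCapabilities and GetMap. The endpoint and layer name are placeholders, and I’m assuming the server speaks WMS 1.1.1; the GetCapabilities response would tell you the actual layers, projections and formats on offer.

```python
from urllib.parse import urlencode


def wms_url(endpoint: str, request: str, **params: str) -> str:
    """Build a WMS request URL from an endpoint and query parameters."""
    query = {"SERVICE": "WMS", "VERSION": "1.1.1", "REQUEST": request}
    query.update(params)
    return endpoint + "?" + urlencode(query)


# Placeholder endpoint; substitute the address listed in the catalogue.
ENDPOINT = "https://example.org/wms"

# Ask the server which layers, projections and image formats it supports.
capabilities_url = wms_url(ENDPOINT, "GetCapabilities")

# Request a rendered map image of one layer over a bounding box.
map_url = wms_url(
    ENDPOINT,
    "GetMap",
    LAYERS="example_layer",            # placeholder layer name
    STYLES="",                         # empty string selects the default style
    SRS="EPSG:4326",                   # plain longitude/latitude
    BBOX="-79.64,43.58,-79.12,43.86",  # minx,miny,maxx,maxy, roughly Toronto
    WIDTH="800",
    HEIGHT="600",
    FORMAT="image/png",
)
```

Fetching map_url should return an image of the requested layer, while capabilities_url returns an XML document describing the service.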
Lastly, some of the data is offered in plain old text files. In particular, the TTC Routes and Schedules are only available as a series of zipped text files that are periodically updated, presumably manually. This may be a bit backwards, but thankfully the formatting is well-defined.
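Well-defined text files are at least easy to work with. As a sketch, and assuming the files are comma-delimited with a header row (an assumption on my part, so check the accompanying documentation), Python can read one straight out of the zip without unpacking it first. The file and column names in the usage note are invented for illustration.

```python
import csv
import io
import zipfile


def read_table(archive, member):
    """Return the rows of one delimited text file inside a zip archive.

    `archive` may be a path or a file-like object; each row comes back as a
    dict keyed by the column names found in the file's header line.
    """
    with zipfile.ZipFile(archive) as zf:
        with zf.open(member) as raw:
            # utf-8-sig also strips a byte-order mark if the file starts with one.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            return list(csv.DictReader(text))
```

Something like read_table("ttc_schedules.zip", "routes.txt") would then give a list of dictionaries, one per route, ready to be loaded into whatever database or application you have in mind.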
Drawbacks
Though it may not be fair to criticize at this point, there are some clear limitations to the data. Firstly, much of the data is intended to be downloaded and then used in an “offline” capacity, that is, without further communication between the application and the City’s servers that host the data. Though this was probably done due to technical limitations (e.g. reducing bandwidth usage), it limits the ability of an application to stay up to date with the latest data. For things like the TTC Routes and Schedules and the list of Apartment Bylaw Infractions, this severely reduces their effectiveness, as the data will likely need to be updated manually from time to time.
Furthermore, the current data set is somewhat limited, which may impede the usefulness of any applications or services that can be developed. For example, making real-time TTC updates and other data available would be an immense benefit, yet the only TTC data currently available is in zipped text files that are only periodically updated. Thankfully, the City has set up a site to allow users to request more data, and there are already many requests for more TTC data.
It would also be nice if crime statistics for Toronto could be made available in an easy-to-use format. In particular, the underlying data used by the TPS to produce their crime maps and statistics could be beneficial in developing all sorts of interesting maps. In fact, this is what the Toronto Star has been doing for some time, but they had to get the data through an explicit request under access-to-information legislation. Since then, they’ve produced some very interesting maps, and it would be beneficial if everyone had access to the same data so that the maps could be improved upon and kept up to date.
Producing such crime-based maps can provide people with one more factor to help them determine what part of the City they’d like to live in, or just allow them to have access to accurate statistics on crime. Not all of this is negative – any study of crime in Toronto will show that it’s been following a downward trend for quite some time, so such information could help ward off any negative spin that the media may put on crime. However, I do realize the privacy implications that come with releasing such data, so time should be taken to ensure that it is released in a manner where anonymity is preserved as well as possible.
Lastly, it’s important to note that the OpenTO initiative is still at a very early stage. Even at the pace things move at City Hall, it will eventually improve. It’s also very important to give your feedback, either at DataTO or directly by email to opendata@toronto.ca. Additionally, if you’re a developer, the corresponding Google Group is a good place to get started. Even with the current limitations, I am sure we’ll see some great applications developed based on the data. One need only look at MyTTC to see what can be done with some effort and ingenuity – and this was done well before the City officially released any data!