Democratizing Spatial Data

Michael Jones, chief technologist of Google Earth, recently shared his views on the benefits of data standards and data sharing at the Map World Forum in Hyderabad, India. Directions Magazine has a summary article of his talk. He is correct: data sharing unlocks the true potential of spatial data, allowing virtually anybody to use it. However, if Google truly supports these efforts, then why is the default format for Google Earth its own KML (Keyhole Markup Language) instead of the Open Geospatial Consortium's (OGC) GML (Geography Markup Language)?

KML and GML are very similar. Both are XML-based formats for describing spatial data features. However, GML is a standard defined by the OGC, an organization Google supposedly supports, while Google continues to use its own format, KML. Nobody disputes that Google dominates spatial data visualization, dwarfing even its nearest competitor. So if Google is touting the merits of standards, should it not support those standards by using them?

Jones makes a good point: which standard is used matters less than having a standard for interoperability. This is true of both the data format and the data organization, or schema. Even if everybody on the planet adheres to the same data storage standard, datasets without a standard schema are still not very useful when combined. This is why the OGC's efforts to raise awareness and bring standards together are so important.

Consider transportation, roads specifically. There are myriad ways to categorize roads. One dataset might record only whether a road is paved or unpaved, with no other designation. Another might categorize roads as Interstates, US Highways, State Highways, County Roads, and City Streets. If so, where does a limited-access US Highway fit? Standards should apply not only to spatial data storage formats, but to spatial data schemas as well.
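To make the schema problem concrete, here is a minimal Python sketch of what combining two road datasets might involve. Everything in it is an assumption made for illustration: the two source classifications, the shared category names, and the sample roads are hypothetical and do not come from any real agency or standard.

```python
# Hypothetical shared schema: the five categories named in the text.
COMMON_SCHEMA = ("interstate", "us_highway", "state_highway",
                 "county_road", "city_street")

# Hypothetical source A uses short codes for road classes.
SOURCE_A_TO_COMMON = {
    "I": "interstate",
    "US": "us_highway",
    "SR": "state_highway",
    "CR": "county_road",
    "ST": "city_street",
}

# Hypothetical source B spells the class names out.
SOURCE_B_TO_COMMON = {
    "Interstate": "interstate",
    "US Highway": "us_highway",
    "State Highway": "state_highway",
    "County Road": "county_road",
    "City Street": "city_street",
}


def to_common(record, mapping):
    """Translate one record's class into the shared schema.

    Classes the mapping does not know about become None.
    """
    return {"name": record["name"],
            "road_class": mapping.get(record["class"])}


def merge(datasets):
    """Merge (records, mapping) pairs into one list in the shared schema."""
    merged = []
    for records, mapping in datasets:
        merged.extend(to_common(r, mapping) for r in records)
    return merged


source_a = [{"name": "CR 12", "class": "CR"},
            {"name": "Main St", "class": "ST"}]
# The last record only says "paved", so it has no functional class to map.
source_b = [{"name": "US 50", "class": "US Highway"},
            {"name": "Old Mill Rd", "class": "paved"}]

combined = merge([(source_a, SOURCE_A_TO_COMMON),
                  (source_b, SOURCE_B_TO_COMMON)])
unmapped = [r["name"] for r in combined if r["road_class"] is None]
print(unmapped)
```

The road recorded only as "paved" cannot be placed in the shared schema at all, which is the gap that agreed-upon schemas would close. Without one, every pair of datasets needs its own hand-written mapping table like the two above.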

The widespread use of spatial data depends directly on its ease of use, as Jones points out. Google is the perfect example. GIS has existed in some form since the 1970s. However, it was not until 2005, with the release of Google Earth, that consumption of GIS data became widespread. People who know nothing of geography are "flying" around the earth looking at their houses, schools, cities and anything else they find interesting. With Google Earth, Google brought the consumption of GIS services through visualization on a home computer to the masses.

The next step is variety of data. Currently the data served through Google Earth is rather static and determined mostly by Google. Google Earth serves DOQQ (aerial imagery), hydrology (water), and transportation (railroads and roads) layers for viewing. However, what about all the other spatial data? It could be made available through viewers like Google Earth, provided the viewers support standardized formats for data storage and schema.

Jones is also correct that the world of spatial data and GIS is changing very quickly. A year from now, the way people interact with, consume and use spatial data services will be very different. Google Earth has become a household utility for consumption of GIS services. The next step is mobile consumption of GIS services. GPS-enabled phones are becoming more and more common. Services taking advantage of this capability, such as the free navigational service amAze, are showing up already. Navigational capabilities are only the beginning. With a GPS-enabled phone, a hiker can mark a good campsite and send the location to a friend as a place to meet. The friend's phone will then navigate the friend to the campsite. In a year, who knows how GIS services will be used and consumed?

For all of this to work, standardization of both data storage and schema is imperative. That way, no matter where the data comes from, it can work seamlessly with existing data, regardless of how it is being consumed.