
Hugh Saalmans: “No amount of machine learning could solve a 999999 error!”

Hugh Saalmans
Hugh Saalmans (@minus34) is a geogeek and IT professional who heads the Location Engineering team at IAG, Australia & New Zealand’s largest general insurer. He’s also one of the founders of GeoRabble -- an inclusive, zero-sales-pitch pub meetup for geogeeks to share their stories. His passion is hackfests & open data, and he’s a big fan of open source and open standards.

Q: How did you end up in geospatial?

A: A love of maths and geography is the short answer. The long answer is I did a surveying degree that covered everything spatial from engineering to geodesy.

My first experience with GIS was ArcGIS on Solaris (circa 1990) in a Uni lab with a severely underpowered server. Out of the 12 workstations, only 10 of us could log in at any one time, and then just 6 of us could actually get ArcGIS to run. Just as well, considering most of the students who could get it to work, including myself, ballsed up our first lab assignment by turning some property boundaries into chopped liver.

Besides GIS, my least favourite subjects at Uni were GPS and geodesy. So naturally I chose a career in geospatial information.

Q: You work for IAG. What does the company do?

A: Being a general insurer, we cover about $2 trillion worth of homes, motor vehicles, farms, and businesses against bad things happening.

Geospatial is a big part of what we do. Knowing where those $2tn of assets are allows us to do fundamental things like providing individualised address level pricing — something common in Australia, but not so common in the US due to insurance pricing regulations. Knowing where assets are also allows us to help customers when something bad does happen. That goes to the core of what we do in insurance. That’s when we need to fulfill the promise we made to our customers when they took out a policy.

Q: What on Earth is Location Engineering?

A: We’re part of a movement that’s happening across a lot of domains that use geo-information: changing from traditional data-heavy, point & click delivery to scripting, automation, cloud, & APIs. We’re a team of geospatial analysts becoming a team of DevOps engineers that deliver geo-information services. So we needed a name to reflect that.

From a skills point of view, we’re moving from desktop analysis & publishing with a bit of SQL & Python to a lot of Bash, SQL, Python & JavaScript with Git, JIRA, Bamboo, Docker and a few other tools & platforms that aren’t that well known in geo circles. We’re migrating from Windows to Linux, desktop to cloud, and licensed to open source. It’s both exciting and daunting to be doing it for an $11bn company!

Q: You’ve been working in the GIS industry for twenty years, how has that been?

A: It’s been great to be a part of 20+ years of geospatial evolutions and revolutions, witnessing geospatial going from specialist workstations to being a part of everyday life, accessible on any device. It’s also been exciting watching open source go from niche to mainstream, government data go from locked down to open, and watching proprietary standards being replaced with open ones.

It’s also been frustrating at times being part of an industry that has a broad definition, no defined start or end (“GIS is everywhere!”), and limited external recognition. In Australia we further muddy the waters by having university degrees and industry bodies that fuse land surveying and spatial sciences into a curious marriage of similar but sometimes opposing needs. Between the limited recognition of surveying as a profession and of geospatial being a separate stream within the IT industry, it’s no real surprise that our work remains a niche that needs to be constantly explained, even though what we do is fundamental to society. In the last 5 years we’ve tried to improve that through GeoRabble, creating a casual forum for anyone to share their story about location, regardless of their background or experience. We’ve made some good progress: almost 60 pub meetups in 8 cities across 3 countries (AU, NZ & SA), with 350 presentations and 4,500 attendees.

Q: How do you work in one industry for twenty years and keep innovating? Any tips on avoiding cynicism and keeping up with the trends?

A: It’s a cliche, but innovation is a mindset. Keep asking yourself and those around you two questions: Why? and Why Not? Asking why? will help you improve things by questioning the status quo or understanding a problem better, and getting focussed on how to fix or improve it. Saying why not? either gives you a reality check or lets you go exploring, researching and finding better ways of doing things to create new solutions.

Similarly, I try to beat cynicism by being curious, accepting that learning has no destination, and knowing there is information out there somewhere that can help fix the problem. Go back 15-20 years — it was easy to be cynical. If your chosen tool didn’t work the way you wanted it to, you either had to park the problem or come up with a preposterous workaround. Nowadays, you’ve got no real excuse if you put in the time to explore. There’s open source, GitHub and StackExchange to help you plough through the problem. Here’s one of our case studies as an example: desktop brand X takes 45 mins to tag several million points with a boundary id. Unsatisfied, we make the effort to learn Python, PostGIS and parallel processing through blogs, posts and online documentation. Now you’re cooking with gas in 45 seconds, not 45 minutes.
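To make the point-tagging case study concrete, here is a minimal sketch of the idea in plain Python. It is not the team's actual code: it swaps PostGIS for a hand-written ray-casting test so it runs with the standard library alone, and every name in it (`point_in_polygon`, `tag_chunk`, `tag_points`, `chunk_size`) is hypothetical. Boundaries are assumed to be simple rings with no holes. In PostGIS the same job would more likely be a spatial join using something like `ST_Contains`, split into batches that run on separate connections.

```python
from concurrent.futures import ProcessPoolExecutor
from functools import partial


def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) falls inside the ring of vertices."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Flip the flag each time a horizontal ray from the point crosses an edge.
        if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside


def tag_chunk(points, boundaries):
    """Return the first matching boundary id (or None) for each point."""
    tags = []
    for x, y in points:
        tag = None
        for boundary_id, ring in boundaries.items():
            if point_in_polygon(x, y, ring):
                tag = boundary_id
                break
        tags.append(tag)
    return tags


def tag_points(points, boundaries, workers=1, chunk_size=10_000):
    """Tag points with boundary ids, optionally spread across worker processes."""
    # Split the points into independent chunks so each can be tagged separately.
    chunks = [points[i:i + chunk_size] for i in range(0, len(points), chunk_size)]
    worker = partial(tag_chunk, boundaries=boundaries)
    if workers <= 1:
        results = list(map(worker, chunks))
    else:
        with ProcessPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(worker, chunks))
    # Flatten back into one list of tags, in the same order as the input points.
    return [tag for chunk in results for tag in chunk]
```

Calling `tag_points(points, boundaries, workers=4)` from a script guarded by `if __name__ == "__main__":` spreads the chunks over four processes. The gain comes from each chunk being independent of the others, which is the same property that lets a database run the batches in parallel.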

Another way to beat cynicism is to accept that things will change, and they will change faster than you want them to. They will leave you with yesterday’s architecture or process, and you will face a choice: take the easy road and build up design debt in your systems (which will cost you at some point), or take the hard road and learn as you go to future-proof the things you’re responsible for.

Q: What are some disruptive technologies that are on your watch list?

A: Autonomous vehicles are the big disruptor in insurance. KPMG estimate the motor insurance market will shrink by 60% in the next 25 years due to a reduction in crashes. How do we offset this loss of profitable income? By getting better at analysing our customers and their other assets, especially homes. Enter geospatial to start answering complicated questions like “how much damage will the neighbour’s house do to our insured’s house during a storm?”

The Internet of Things is also going to shake things up in insurance. Your doorbell can now photograph would-be burglars or detect hail. Your home weather sensor can alert you to damaging winds. Now imagine hundreds of thousands of these sensors in each city: imagine tracking burglars from house to house, or watching a storm hit a city, one neighbourhood at a time. Real-time, location-based sensor nets are going to change the way we protect our homes and how insurers respond in a time of crisis. Not to mention 100,000+ weather sensors could radically improve our ability to predict weather-related disasters. It’s not surprising IBM bought The Weather Channel’s online and B2B services arm last year, as they have one of the best crowdsourced weather services.

UAVs are also going to shake things up. We first used them last Christmas after a severe bushfire (wildfire) hit the Victorian coast. Due to asbestos contamination, the burnt out area was sealed off. Using UAVs to capture the damage was the only way at the time to give customers who had lost everything some certainty about their future. Jumping to the near future again — Intel brought their 100-drone lightshow to Sydney in early June. Whilst marvelling at a new artform, watching the drones glide and dance in beautiful formations, it dawned on me what autonomous UAVs will be capable of in the next few years — swarms of them capturing entire damaged neighbourhoods just a few hours after a weather event or bushfire has passed.

Q: What is the dirtiest dataset you’ve had to ingest, and what about the cleanest?

A: The thing about working for a large corporation with a 150-year history is your organisation knows how to put the L into legacy systems. We have systems that write 20-30 records for single customer transactions in a non-sequential manner; so you almost need a PhD to determine the current record. There are other systems that write proprietary BLOBs into our databases (seriously, in 2016!). Fortunately, we have a simplification program to clear up a lot of these types of issues.

As far as open data goes, that’d be the historical disaster data we used at GovHack in 2014. Who knew one small CSV file could cause so much pain? Date fields with a combination of standard and American dates, inconsistent and incoherent disaster classifications, lat/longs with variable precisions.

I don’t know if there is such a thing as a clean dataset. All data requires some wrangling to make it productive, and all large datasets have quirks. G-NAF (Australia’s Geocoded National Address File) is pretty good on the quirk front, but at 31 tables and 39 foreign keys, it’s not exactly ready to roll in its raw form.

Q: You were very quick to release some tools to help people to work with the G-NAF dataset when it was released. What are some other datasets that you’d like to see made open?

A: It can’t be overstated how good it was to see G-NAF being made open data. We’re one of the lucky few countries with an open, authoritative, geocoded national address file, thanks to 3 years of continual effort from the federal and state governments.

That said, we have the most piecemeal approach to natural peril data in Australia. Getting a national view of, say, flood risk isn’t possible due to the way the data is created and collected at the local and state government level. I’m obviously biased being in the insurance industry about wanting access to peril data, but having no holistic view of risk, nor having any data to share doesn’t help the federal government serve the community. It’s a far cry from the availability of FEMA’s data in the US.

Q: Uber drivers have robot cars, McDonald’s workers have robot cooks, what are geohipsters going to be replaced with?  

A: Who says we’re going to be replaced? No amount of machine learning could solve a 999999 error!

But if we are going to be replaced — on the data capture front it’ll probably be due to autonomous UAVs and machine learning. Consider aerial camera systems that can capture data at better than 5 cm resolution, but mounted on a winged, autonomous UAV that could fly 10,000s of sq km a day. Bung the data into an omnipotent machine learning feature extractor (like the ones Google et al have kind of got working), and entire 3D models of cities could be built regularly with only a few humans involved.

There’ll still be humans required to produce PDFs… oh sorry, you said what are geohipsters going to be replaced with. There’ll still be humans required to produce Leaflet+D3 web maps for a while before they work out how to automate it. Speaking of automation — one of the benefits of becoming a team of developers is the career future-proofing. If you’re worried about losing your job to automation, become the one writing the automation code!

Q: What are some startups (geo or non-geo) that you follow?

A: Mapbox and CartoDB are two of the most interesting geospatial companies to follow right now. Like Google before them, they’ve built a market right under the noses of the incumbent GIS vendors by focussing on the user and developer experience, not by trying to wedge as many tools or layers as they can into a single map.

In the geocoding and addressing space it’s hard to go past What3Words for ingenuity and for the traction they’ve got in changing how people around the world communicate their location.

In the insurance space, there’s a monumental amount of hot air surrounding Insurtech, but a few startups are starting to get their business models off the ground. Peer to peer and micro insurance are probably the most interesting spaces to watch. Companies like Friendsurance and Trov are starting to make headway here.

Q: And finally, what do you do in your free time that makes you a geohipster?

A: The other day I took my son to football (soccer) training. I sat on the sideline tinkering with a Leaflet+Python+PostGIS spatio-temporal predictive analytical map that a colleague and I put together the weekend prior for an emergency services hackathon. Apart from being a bad parent for not watching my son, I felt I’d achieved geohipster certification with that effort.

How a geohipster watches football (soccer) practice

In all seriousness, being a geohipster is about adapting geospatial technology & trying something new to create something useful, something useless, something different. It’s what I love doing in my spare time. It’s my few hours a night to be as creative as I can be.