- According to the California Department of Justice, the clearance rate for burglaries in 2009 was just 12.9%. For violent crime it was just 43.7%.
- Despite a cost of 16 billion dollars, California police still failed to clear 100,000 violent crimes and over 500,000 burglaries in a single year.
- These numbers only hint at the true cost of crime to society - and the scale of problems like terrorism and rioting is growing.
- Put another way, a 1% improvement represents 160 million dollars that can be used elsewhere, every year. A step change could save billions of dollars for America alone.
My idea was inspired by the London riots. Let's use cell tower records from crime locations to figure out who was present.
This already happens - so often that some telcos struggle to respond within legislated time frames. We know that offenders tend to reoffend. So if we could analyse the records from many crime locations, the cellphone numbers of repeat offenders would appear more often than those of innocent bystanders. Unfortunately, one million cellphones can generate 7 billion tower connections per year. For a large city this translates to terabytes of data, and at the country level even petabytes. Traditional technology cannot process such volumes quickly enough to respond while the evidence trail is still warm.
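The counting idea can be sketched in a few lines of Python. This is only an illustration on a tiny in-memory sample, not the HANA implementation: the function name `rank_repeat_presence`, the crime labels and the phone numbers are all made up for the example.

```python
from collections import Counter


def rank_repeat_presence(phones_by_crime):
    """Return (phone, crime_count) pairs, most frequently present first.

    phones_by_crime maps a crime id to the phone numbers seen on nearby
    towers around the time of that crime (hypothetical input format).
    """
    counts = Counter()
    for phones in phones_by_crime.values():
        # set() so a phone is counted once per crime, however many
        # tower connections it made while it was there
        counts.update(set(phones))
    return counts.most_common()


# Made-up sample: one phone shows up at all three crime locations
sample = {
    "crime-1": ["0400-000-001", "0400-000-002", "0400-000-003"],
    "crime-2": ["0400-000-001", "0400-000-004"],
    "crime-3": ["0400-000-001", "0400-000-002", "0400-000-001"],
}

ranking = rank_repeat_presence(sample)
for phone, crime_count in ranking:
    print(phone, crime_count)
```

A phone that was merely passing by tends to show up once, while a phone that keeps appearing at unrelated crime locations rises to the top of the ranking. At real volumes the same count has to run over billions of connection records, which is where the processing speed matters.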
What we need is an in-memory analytic appliance that can process millions of records in a few milliseconds. To demonstrate how SAP HANA fits the bill, I created three tables (CRIMES, SITES and CONNECTIONS) and built subsets of sample data from my home town of Sydney, Australia, where we know a bit about convicts.
To get cell tower location data I used the communications authority website, but it only shows 50 sites per page, so I wrote a PHP script on ScraperWiki and ran it to extract 16,800 tower sites. For crime reports I downloaded 15 years of NSW crime statistics. These are summarised by month, so I had to write a macro to explode the data into individual crime records - 6.8 million of them. The tower connections were generated randomly, and I used the HANA data provider in Excel to look up valid tower IDs. For this demo I created 6 million records.
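The "explode" step can be illustrated in Python. This is a sketch of the general idea rather than the macro used for the demo: the field names are hypothetical, and assigning each record a random day within its month is an assumption made here, since the monthly summary does not say when in the month each crime happened.

```python
import calendar
import random
from datetime import date


def explode_monthly_count(offence, area, year, month, count, rng):
    """Turn one monthly summary row into `count` individual crime records.

    Assumption: each record gets a random day within the month.
    """
    days_in_month = calendar.monthrange(year, month)[1]
    records = []
    for _ in range(count):
        day = rng.randint(1, days_in_month)  # inclusive on both ends
        records.append(
            {"offence": offence, "area": area, "date": date(year, month, day)}
        )
    return records


# Made-up summary row: 5 thefts reported in Sydney in February 2010
rng = random.Random(0)  # seeded so the sample is repeatable
records = explode_monthly_count("Theft", "Sydney", 2010, 2, 5, rng)
for record in records:
    print(record)
```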
Data loaded incredibly fast: nearly 7 million records in 1.5 minutes. To count the number of crimes per subscriber I needed a calculation view. First I joined subscribers to the cell towers using connection data. Then I joined the towers to crime locations. A filter in the projection removes records where the connection time and the crime time do not overlap.
In the explorer view we can see the calculation results: individual subscribers and the number of crimes each one was near, computed in seconds. In the video example, a subscriber was present at 10 reported thefts, all on the same day, in locations around Sydney. 10 crimes solved in 3.6 seconds - not even time for an ad break.
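The logic of that view can be approximated with ordinary SQL. The sketch below uses Python's built-in sqlite3 module as a stand-in for HANA, so it says nothing about performance. Only the three table names come from the demo; every column name is an assumption, as is the simplification that each crime record already carries the ID of its nearest tower site.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Column names are hypothetical; only the table names are from the demo
    CREATE TABLE SITES (site_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE CRIMES (crime_id INTEGER PRIMARY KEY, site_id INTEGER,
                         offence TEXT, start_ts TEXT, end_ts TEXT);
    CREATE TABLE CONNECTIONS (subscriber TEXT, site_id INTEGER,
                              connected_ts TEXT);
    """
)
conn.executemany(
    "INSERT INTO SITES VALUES (?, ?)",
    [(1, "Tower A"), (2, "Tower B"), (3, "Tower C")],
)
conn.executemany(
    "INSERT INTO CRIMES VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "Theft", "2011-01-01T10:00", "2011-01-01T11:00"),
        (2, 2, "Theft", "2011-01-01T13:00", "2011-01-01T14:00"),
        (3, 3, "Theft", "2011-01-01T16:00", "2011-01-01T17:00"),
    ],
)
conn.executemany(
    "INSERT INTO CONNECTIONS VALUES (?, ?, ?)",
    [
        ("sub-A", 1, "2011-01-01T10:30"),
        ("sub-A", 2, "2011-01-01T13:15"),
        ("sub-A", 3, "2011-01-01T16:45"),
        ("sub-B", 1, "2011-01-01T10:05"),
        ("sub-B", 2, "2011-01-01T09:00"),  # at the tower, but hours before the crime
        ("sub-C", 3, "2011-01-01T20:00"),  # at the tower, but hours after the crime
    ],
)

# Join subscribers to towers via connections, towers to crimes, then keep
# only connections that fall inside the crime's time window.
# ISO-8601 text timestamps compare correctly as plain strings.
rows = conn.execute(
    """
    SELECT c.subscriber, COUNT(DISTINCT k.crime_id) AS crimes_nearby
    FROM CONNECTIONS c
    JOIN SITES s ON s.site_id = c.site_id
    JOIN CRIMES k ON k.site_id = s.site_id
    WHERE c.connected_ts BETWEEN k.start_ts AND k.end_ts
    GROUP BY c.subscriber
    ORDER BY crimes_nearby DESC, c.subscriber
    """
).fetchall()

for subscriber, crimes_nearby in rows:
    print(subscriber, crimes_nearby)
```

In this made-up sample, sub-A is near all three thefts, sub-B near one, and sub-C drops out entirely because its only connection falls outside the crime's time window.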
Of course, all data in this demo was publicly available or randomly generated.
Where can we go from here? For example, the power of Detective HANA could be integrated into mobile devices. Police could be alerted when any nearby cellphone is linked to a suspiciously high number of crimes, and then triangulate directly to the suspect without even knowing subscriber details.
Detective HANA is not an end in itself but a shift in what is now feasible, the technological step change required to keep up with the challenges facing police today.
A new generation of innovation can emerge around real time analytics for intelligence on this scale.
- Solves a problem that could not be solved at traditional processing speeds.
- Takes a complex problem and distils it down to simple components ideal for high-performance processing.
- Tested with realistic data to prove that it works.
- Financial benefits could reach billions of dollars.
- Even greater social benefits, perhaps even saving lives.
Blog: SAP HANA InnoJam
Presentation Slides about Detective HANA
Reference: 2011 England riots on Wikipedia