Saturday, 17 August 2013

To Geo or not to Geo, that is the question...

... at least in oil&gas. I always wondered why petroleum was only 5% of Esri's market (an unofficial figure from my tenure as petroleum manager there; as a private company they publish no figures). A current rationalization project at an oil major hinted why. I've 'pushed Geo' for 25 years, so I had seen this during my previous tenure at Halliburton, but it only crystallized later: whilst a large percentage of oil&gas data has a spatial component, only a small part of it is stored in spatial databases. GIS is generally used for surface data like geology, plants and pipelines, rather than for subsurface exploration and production. Surface data can be seen and measured directly on or near the ground, whereas subsurface data are interpolated from drilling and seismic deep underground. Indeed the challenges in oil exploration in the news of late revolve around this frontier.

[For an oil&gas primer you can start here.]

Did the Bard ever strike such a pose?

No, this doesn't mean most data is not spatial. It simply means that very large amounts of geo-processing on very large stores of data started very early on. And I mean in the pre-digital era, when the post-WWII drilling and seismic boom couldn't wait for the digital revolution to come. And did you know that petrodata is largely signal processing? [Not to discount text, unstructured data and metadata, but they take orders of magnitude less space and pose less of a challenge.] Analog at first and now digital (DSP), seismic and well-log data are sound and electromagnetic waves stored in various ways. And while location is always important, geo components could be stored and retrieved separately and in local coordinates. This is partly due to early data storage constraints on originally expensive hardware - viz. Esri's early Arc/Info, which stored spatial and tabular data in separate and compact datasets - and likewise seismic navigation data and well header data are stored in flat files or databases, separately from the monstrous seismic and log data. Their DSP creates huge blobs (binary large objects), which grew as drilling tools improved (dozens and now hundreds of logs each, over tens of thousands and now millions of wells) and as seismic data retrieval and processing ballooned (from 2D explosion to 3D vibration signals, on land and at sea). And did anyone tell you that new tools allow all of this to be monitored in real time, heralding the advent of petabytes of data capture, transmission and processing?

[OilVoice for free and Oil&Gas Journal for fee are good places to start if you're interested in more, as is BP Statistical Review for free energy data.]

Cambridge University Darwin high performance cluster

Neither does this mean that anyone kept their data house in disorder - that is a whole topic unto itself! - it rather reflects the pre-eminence of relational database technology (Oracle, in petroleum). On one hand, data models were crafted by various industry segments over the past decades. On the other hand, various Big Data schemes to address ballooning data amount to 'big iron' responses that scale existing hardware and database technology to meet demand. To name but two, Oracle Spatial scales on hardware and SAP HANA on memory. If both hardware and memory have become more affordable recently, software licensing unfortunately hasn't. And while SaaS has been broached by a number of industries, storing private data on the web, never mind processing it there, is so far a 'no-go zone' for petroleum, mostly for reasons of security and bandwidth.

[Again there are many resources, but PNEC is one for fee (and if there are any for free then please let me know), and various data standards efforts help keep said houses in order (Energistics and PPDM for wells, PODS and APDM for pipeline, SEG-Y and EPSG for seismic and geomatics).]

Laser camera takes photos around corners

Part of my day job is 'to look around the corner' at new ways of geo-processing petro-data, especially sub-surface. I was struck by Amazon's chaotic storage in the brick&mortar realm, where not trying to sort items actually speeds up shipping via a non-linear process. Could this be applied in the petrodata realm? I've been 'into' data standards almost as long as GIS, but as someone who frequently helps data managers I realised how often schemas need to be modified or extended to meet operational requirements. Indeed I helped PPDM Lite offer such flexibility, though it never gained traction. Enter Hadoop and Apache technologies, well written up elsewhere. Esri's geoportal guru Marten Hogeweg helped open Esri to open source - Andrew Turner accepted a challenge responded to recently -

風向轉變時, 有人築牆, 有人造風車
“When the wind of change blows, some build walls, while others build windmills” (Chinese proverb)

If linear referencing moved pipeline and utilities services so far, could we move geodata processing also? Sean Gorman, who came to Esri with the GeoIQ acquisition, recently left (ht @atanas), stating that geo-centric processing was no longer his 'cup of tea'. One can hardly compare twitter feeds to 4D seismic, but one of the geodata challenges still remains to have data needs met by the tech, not the other way around. I commented there "how accuracy in geological mapping and interpretation, need no longer be a compromise of computer system speed and storage capacity". Watch this space if giving GIS a decisive advantage in subsurface data remains elusive. In the words of my fave US TV station:

Don't touch that dial

[21 Feb 2014 update, from a geo-friend: The whole reason people started putting data into geospatial databases is they wanted to use the ability to localize data and use relational query syntax to speed up fetching data. Geospatial databases can physically store data that is spatially near each other in the same locations on disk, commonly called clustering. Spatial indexes can be created that make it easier for the database to locate information. Combine this with a query that only requests a subset of the data and suddenly you can manipulate large datasets with ease.]
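
[To make my geo-friend's point concrete, here is a minimal Python sketch of the idea behind a spatial index: bucket points into grid cells so that a bounding-box query only touches the cells it overlaps. This is a toy under stated assumptions, not how any particular geospatial database implements it. The GridIndex class, the 1000-unit cell size and the well names are all hypothetical, made up purely for illustration.]

```python
from collections import defaultdict


class GridIndex:
    """Toy spatial index: buckets points into square cells of a fixed size.

    Hypothetical illustration only; real databases use R-trees, quadtrees
    or similar structures, plus on-disk clustering.
    """

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # (col, row) -> [(x, y, payload), ...]

    def _cell(self, x, y):
        # Map a coordinate to the grid cell that contains it.
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, x, y, payload):
        self.cells[self._cell(x, y)].append((x, y, payload))

    def query(self, xmin, ymin, xmax, ymax):
        """Return payloads of points inside the box, visiting only overlapping cells."""
        c0, r0 = self._cell(xmin, ymin)
        c1, r1 = self._cell(xmax, ymax)
        hits = []
        for col in range(c0, c1 + 1):
            for row in range(r0, r1 + 1):
                for x, y, payload in self.cells.get((col, row), ()):
                    # Cells only narrow the search; the exact test happens here.
                    if xmin <= x <= xmax and ymin <= y <= ymax:
                        hits.append(payload)
        return hits


# Made-up well locations in local coordinates.
index = GridIndex(cell_size=1000.0)
index.insert(1500.0, 2500.0, "well-A")
index.insert(1800.0, 2900.0, "well-B")
index.insert(9500.0, 9500.0, "well-C")

# Only the four cells overlapping the box are scanned; well-C is never looked at.
nearby = index.query(1000.0, 2000.0, 2000.0, 3000.0)
print(nearby)
```

[The same pattern fits the petrodata split described earlier: the small header or navigation records sit in the index, and only the blobs belonging to the hits need to be fetched.]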
