
Wednesday, 22 January, 2020

Arming commodity experts with Vortexa’s Python SDK

Data Scientists care about data science, not engineering. Analysts need clear information at their fingertips, without hassle. Good engineers care deeply about the longevity of the systems they build, and waste no time with unnecessary low-level details.

Who should read this?

  • Analysts starting to code
  • Data Scientists who would prefer to focus on science
  • 10x Engineers who expect effective abstractions



Modern data science requires both surgical precision and broad-stroke data collection. Algorithms often need vast amounts of data to function effectively, while algorithm builders need pinpoint control to inspect the finer details.

It’s no small feat to navigate through millions of global waterborne oil movements. It requires a keen eye and a fair degree of patience to explore live ship-to-ship transfers, terminal-level cargo imports, and historic national export figures.

As discussed earlier, there are multiple ways to access our data. Here we will focus on Vortexa’s Python SDK.


We built the Vortexa Python Software Development Kit (SDK) to provide fast, interactive, programmatic exploration of our data.

The Python SDK empowers users to efficiently explore the world’s waterborne oil movements, and to build custom models & reports with minimum setup cost.

For example, to inspect cargo movements in a pandas DataFrame, we can run the following Python snippet:

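>>> from datetime import datetime
>>> from vortexasdk import CargoMovements
>>> df = CargoMovements().search(
...     filter_activity="loading_state",  # assumed value; this argument is missing from the original snippet
...     filter_time_min=datetime(2019, 1, 1),
...     filter_time_max=datetime(2019, 12, 31)).to_df()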

Returns the following:

[Screenshot: the resulting cargo movements DataFrame]

Let’s look deeper into this dataset. A quick monthly aggregate shows how Singapore’s fuel oil imports are falling ahead of IMO 2020.
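A monthly aggregate of this kind could be sketched roughly as follows. The sketch runs on a small synthetic DataFrame rather than live SDK output, and the column names `unloading_timestamp`, `unloading_country`, `product_group` and `quantity_barrels` are hypothetical stand-ins for whatever columns your cargo movements DataFrame actually contains.

```python
import pandas as pd

# Synthetic stand-in for the SDK's DataFrame. The values are illustrative only,
# and the column names are assumptions, not the SDK's actual schema.
df = pd.DataFrame({
    "unloading_timestamp": pd.to_datetime([
        "2019-01-05", "2019-01-20", "2019-02-11", "2019-02-25", "2019-03-03",
    ]),
    "unloading_country": ["Singapore", "Singapore", "Singapore", "China", "Singapore"],
    "product_group": ["Fuel Oil", "Fuel Oil", "Fuel Oil", "Fuel Oil", "Crude"],
    "quantity_barrels": [500_000, 300_000, 600_000, 400_000, 900_000],
})

# Keep only fuel oil cargoes unloaded in Singapore.
mask = (df["unloading_country"] == "Singapore") & (df["product_group"] == "Fuel Oil")
singapore_fuel_oil = df[mask]

# Sum the imported quantity per calendar month.
monthly_imports = (
    singapore_fuel_oil
    .groupby(singapore_fuel_oil["unloading_timestamp"].dt.to_period("M"))["quantity_barrels"]
    .sum()
)
print(monthly_imports)
```

Grouping on a monthly period rather than on a formatted string keeps the index sortable and makes it easy to plot or resample later.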


What should you expect from the Python SDK?

  • Analysts can examine the world’s waterborne oil movements with minimal coding knowledge. Clear examples lead users along a gradual learning curve.
  • Data Scientists can use an interactive Python toolkit, deeply integrated with pandas. They can use the SDK to efficiently combine multiple data sources, and methodically extract features relevant to both production & prototype models.
  • Software & Data Engineers can rely on a modular, clean, test-driven, open-source SDK, built & maintained by a committed team of world-class engineers. We welcome contributions. Please check out our contributing guide!
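As a rough illustration of combining data sources and extracting a feature with pandas, the sketch below joins a synthetic table of monthly import volumes with a synthetic price table and derives one simple feature. Both tables, their column names and their values are hypothetical. Neither comes from the SDK.

```python
import pandas as pd

# Both tables are synthetic; column names and values are hypothetical.
imports = pd.DataFrame({
    "month": ["2019-01", "2019-02", "2019-03"],
    "import_barrels": [800_000, 600_000, 500_000],
})
prices = pd.DataFrame({
    "month": ["2019-01", "2019-02", "2019-03"],
    "price_usd_per_barrel": [60.0, 65.0, 70.0],
})

# Join the two sources on their shared key.
features = imports.merge(prices, on="month", how="inner")

# Derive a simple feature: month-on-month percentage change in import volume.
features["import_change_pct"] = features["import_barrels"].pct_change() * 100
print(features)
```

Because the SDK hands back ordinary pandas DataFrames, joins and derived columns like these need no extra tooling.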


As a Data Scientist myself, I find it an interesting time to be in the industry. Data Scientists are becoming increasingly redundant, yet increasingly powerful at the same time. Advances in AutoML and tools like Ludwig allow you to build production-grade machine learning models in only a few lines of code.

Such powerful tooling lets us focus on understanding the data itself, doing away with details concerning hyper-parameter optimisation or type detection.

The SDK also aligns with this tool-driven mindset. At Vortexa, we’ve found it invaluable to build powerful tooling that simplifies the process of handling data. We hope you find the SDK similarly valuable. 


Author: Kit Burgess, Data Scientist at Vortexa


For more practical tips on Data Science and Tech - follow the VorTECHsa team on Medium


