Learn how to use MariaDB ColumnStore to quickly perform bulk imports and data analytics without having to do index management for query performance tuning.
MariaDB ColumnStore is a GPLv2 open-source columnar database built on MariaDB Server. It is a fork and evolution of the former InfiniDB. It can be deployed in the cloud (optimized for Amazon Web Services) or on a local cluster of Linux servers using either local or networked storage. In this blog post, I will show some examples of analysis of financial time series data using MariaDB ColumnStore.
First of all, we will download Forex historical data (GBPUSD M1 Full 2016 Year Data) , which is freely available on HistData.com. HistData.com forex historical data look like:
The first column is a timestamp of currency rate of every minute needed to convert the format in order to fit with ColumnStore DATETIME data type. I modified the timestamp format using the following simple Ruby script:
For example, you can convert the historical data as follows: ruby convert.rb DAT_ASCII_GBPUSD_M1_2016.csv > gbpusd2016.csv. The converted forex historical data will be:
This demo was performed on the following setup:
Next, we create a database and a table with MariaDB monitor:
Now it’s time to import CSV data whose timestamps are corrected. For bulk import, we use the cpimport command with ColumnStore. With -s (separator) option, we specify the delimiter character (in this example, comma) .
We could import 372,280 rows of forex data in 2 seconds. LOAD DATA LOCAL INFILE can be used as well with ColumnStore.
The reason why GBPUSD M1 data 2016 was chosen, is to track very quick moves of GBPUSD before and after the Brexit vote, which was held on June 23,2016.
Now we look at maximum, minimum and drop off rate during June 23-24,2016, using Windows Functions of ColumnStore.
GBPUSD fell from 1.50 to 1.32 (-13%) in 2 days (actually in half a day) .
Next graph shows scatter plot of GBPUSD vs. USDJPY during June 23-24,2016. (USDJPY M1 2016 were imported same as GBPUSD) .
Now, we calculate Pearson correlation coefficient using statistical aggregation functions. Following SELECT statement was used:
Query result:
GBPUSD and USDJPY were highly correlated during the UK EU membership referendum.
In Forex trading, a moving average is often used to smooth out price fluctuations. The following query allows moving average of sliding 13 rows window.
In this blog post, the following features of MariaDB ColumnStore were explained:
MariaDB ColumnStore can be downloaded here, detailed instructions to install and test MariaDB ColumnStore on Windows using Hyper-V can be found here .