Quick Start Guide
The 1st snippet
A code snippet is always worth a thousand words.
    import btalib
    import pandas as pd

    # Read a csv file into a pandas dataframe
    df = pd.read_csv('2006-day-001.txt', parse_dates=True, index_col='Date')
    sma = btalib.sma(df)
2006-day-001.txt is a sample data file available with the
sources of bta-lib. See bta-lib -
Download it if you want to follow the quickstart snippets.
Let's summarize what was just done:
- Load a csv file
- Compute an sma (Simple Moving Average)
Even if this seems simple, there are some initial questions which can be asked. Let's go for them.
The 1st questions
1. Over what was the sma computed?
For starters a sample of the first two lines in the data file, which has a format very common for a stock market asset.
The first question can now be answered: the sma was calculated using the values in the Close column.
This is so because the standard convention for technical analysis uses the
close price as a default. You may of course use any of the other columns.
For example, using the High column:
sma = btalib.sma(df.High)
There are additional ways to configure/automate which fields are used for the calculations. See the section "Indicator Input".
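To make the idea of input selection concrete, here is a minimal plain-Python sketch. It does not use bta-lib and does not show how the library selects its inputs internally; the helper `select_input` and the three data rows are hypothetical and made up for illustration (they are not taken from 2006-day-001.txt). It only mirrors the convention described above: Close is picked unless another column is named.

```python
import csv
import io

# Made-up rows for illustration only; NOT taken from 2006-day-001.txt.
SAMPLE_CSV = """Date,High,Close
2006-01-02,10.5,10.0
2006-01-03,11.5,11.0
2006-01-04,12.5,12.0
"""


def select_input(rows, column="Close"):
    """Hypothetical helper: return one column as floats.

    The 'Close' default mirrors the convention described in the text;
    it is an assumption of this sketch, not bta-lib code.
    """
    return [float(row[column]) for row in rows]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

close_values = select_input(rows)          # default: the Close column
high_values = select_input(rows, "High")   # explicit: the High column

print(close_values)
print(high_values)
```

Any calculation that follows works on whichever list of numbers was selected, which is why the results differ between Close and High.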
2. How and in which format do I get the calculated values?
The default value returned from
sma = btalib.sma(df) is an internal object from
the library itself. The reason is that it may be used for additional
calculations with other indicators. But getting something market ready is easy:
    sma = btalib.sma(df)  # default period is 30
    print(sma.df)
Et voilà! Getting a
DataFrame with the results is very easy. Even a one-liner
would have actually sufficed, as in:
    sma = btalib.sma(df).df  # default period is 30
But two-liners are on many occasions a lot more readable and convey the
information a lot better. Let's show the result of print(sma.df):
                        sma
    Date
    2006-01-02          NaN
    2006-01-03          NaN
    2006-01-04          NaN
    2006-01-05          NaN
    2006-01-06          NaN
    ...                 ...
    2006-12-21  2029.800333
    2006-12-22  2029.961333
    2006-12-27  2030.773333
    2006-12-28  2031.545667
    2006-12-29  2031.730667

    [255 rows x 1 columns]
Digging deeper: Parameters
The code contains a comment that says: "default period is 30". And seeing the
result with a bunch of initial NaN values ("Not a Number"), the remark made
in the comment should be clear: the sma needs 30 data points to start
producing values.
Hence the initial NaN values, which indicate that no sensible value can be
calculated and delivered. For the sake of it, here is an excerpt of
print(sma.df.to_string()) showing when the sma calculation starts:
    ...
    2006-02-08          NaN
    2006-02-09          NaN
    2006-02-10  1822.221333
    2006-02-13  1824.273667
    ...
Unless a trading calendar for the asset from 2006 is at hand, it is not easy to
tell that 2006-02-10 is actually the 30th day and hence the first one for
which a value can be provided (Yes, it is!). To have a visual confirmation,
let's advance in the usage of the library by modifying the calculation period:
    sma = btalib.sma(df, period=4)  # default period is 30, changed to 4
    print(sma.df)
By simply passing period=4 as a named argument to sma we change the
calculation window. And the result now is:
                      sma
    Date
    2006-01-02        NaN
    2006-01-03        NaN
    2006-01-04        NaN
    2006-01-05  1815.1700
    2006-01-06  1823.0050
    ...               ...
    2006-12-21  2057.6475
    ...
Blistering barnacles! Values start to be produced on the 4th day ... as expected.
The parameter period for the sma is documented. Each indicator contains
complete documentation on the supported parameters and the default
values. See the "Indicator Reference" section.
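To see why the first values are NaN and why the first real value shows up exactly on the day matching the period, here is a minimal plain-Python sketch of a simple moving average. It is not bta-lib's implementation; the function names `simple_moving_average` and `first_valid_position` and the input data are hypothetical and made up for illustration. Only the default of 30 is taken from the text above.

```python
import math


def simple_moving_average(values, period=30):
    """Plain-Python SMA sketch: NaN until `period` data points are available."""
    result = []
    for i in range(len(values)):
        if i + 1 < period:
            # Not enough data points yet to fill the window
            result.append(math.nan)
        else:
            window = values[i + 1 - period:i + 1]
            result.append(sum(window) / period)
    return result


def first_valid_position(series):
    """Return the 1-based position of the first non-NaN value, or None."""
    for position, value in enumerate(series, start=1):
        if not math.isnan(value):
            return position
    return None


# Made-up data for illustration: 1.0, 2.0, ..., 40.0
prices = [float(n) for n in range(1, 41)]

print(first_valid_position(simple_moving_average(prices)))            # 30
print(first_valid_position(simple_moving_average(prices, period=4)))  # 4
```

Counting the position of the first non-NaN value this way is an alternative to checking a trading calendar: with a period of 30 the first value appears at the 30th data point, and with a period of 4 at the 4th.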
Specific input selection
For the sake of it and because it was mentioned above, let's repeat the feat with a specific input:
    sma = btalib.sma(df.High, period=4)  # default period is 30, changed to 4
    print(sma.df)
It should not be a surprise that the results change, given the usage of a
field different from the default Close (which is automatically chosen, being
the industry de-facto standard):
                      sma
    Date
    2006-01-02        NaN
    2006-01-03        NaN
    2006-01-04        NaN
    2006-01-05  1819.8100
    2006-01-06  1827.4400
    ...               ...
    2006-12-21  2064.8175
    2006-12-22  2060.8675
    2006-12-27  2062.6000
    2006-12-28  2064.0075
    2006-12-29  2066.0975
Enough quickstarting! The following has been shown:
- Using a DataFrame as input, where the Close column was automatically selected by the indicator as the input (following the de-facto standard)
- Using a specific column of the DataFrame, such as df.High, as input to run the calculations on it
- Fetching the calculation and getting a DataFrame, by using the df attribute of the result with for example sma.df
- Changing the default calculation parameters of the indicator by providing a named argument, with for example period=4
The following section will cover additional topics and expand on things like how the automatic selection of the inputs is done (and when being specific about it is important), using indicators with multiple inputs and outputs, reusing results with other indicators, plotting and more.