Quick Start Guide
The 1st snippet
A code snippet is always worth a thousand words.
import btalib
import pandas as pd

# Read a csv file into a pandas dataframe
df = pd.read_csv('2006-day-001.txt', parse_dates=True, index_col='Date')
sma = btalib.sma(df)
Note
2006-day-001.txt is a sample data file available with the sources of bta-lib. See bta-lib - GitHub. Download it if you want to follow the quickstart snippets.
Let's summarize what was just done:
- Load a csv file
- Compute an sma (Simple Moving Average)
Even if this seems simple, there are some initial questions which can be asked. Let's go for them.
The 1st questions
1. Over what was the sma computed?
For starters, here are the first two lines of the data file, which has a format very common for stock market assets.
Date,Open,High,Low,Close,Volume,OpenInterest
2006-01-02,1789.36,1802.98,1789.36,1802.16,0.00,0.00
The first question can now be answered:
- The sma was calculated using the values in the Close column.
This is so because the standard convention for technical analysis uses the close price as a default. You may of course use any of the other columns.
For example, using the High column:
sma = btalib.sma(df.High)
Hint
There are additional ways to configure/automate which fields are used for the calculations. See the section "Indicator Input"
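To make the column layout concrete, the following sketch parses the two sample lines quoted earlier using only the Python standard library. It is an illustration of the file format, not of how bta-lib selects its inputs internally; the variable names (sample, close_price, high_price) are chosen here for the example.

```python
import csv
import io

# The two sample lines from the data file: header plus first data row.
sample = (
    "Date,Open,High,Low,Close,Volume,OpenInterest\n"
    "2006-01-02,1789.36,1802.98,1789.36,1802.16,0.00,0.00\n"
)

reader = csv.DictReader(io.StringIO(sample))
rows = list(reader)
columns = reader.fieldnames  # column names taken from the header line

first = rows[0]
close_price = float(first["Close"])  # the column used by default
high_price = float(first["High"])    # an alternative input column

print(columns)
print(close_price, high_price)
```

Each column name in the header corresponds to an attribute of the DataFrame, which is why df.High can be passed directly as the input.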
2. How and in which format do I get the calculated values?
The default value returned from sma = btalib.sma(df) is an internal object from the library itself. The reason is that it may be used for additional calculations with other indicators. But getting something market ready is straightforward:
sma = btalib.sma(df)  # default period is 30
print(sma.df)
Et voilà! Getting a DataFrame with the results is very easy. Even a one-liner would have actually sufficed, as in:
sma = btalib.sma(df).df # default period is 30
But two-liners are on many occasions a lot more readable and convey the information a lot better. Let's show the result of print(sma.df):
                    sma
Date
2006-01-02          NaN
2006-01-03          NaN
2006-01-04          NaN
2006-01-05          NaN
2006-01-06          NaN
...                 ...
2006-12-21  2029.800333
2006-12-22  2029.961333
2006-12-27  2030.773333
2006-12-28  2031.545667
2006-12-29  2031.730667

[255 rows x 1 columns]
Digging deeper: Parameters
The code contains a comment that says: "default period is 30". And seeing the result with a bunch of initial NaN values ("Not a Number"), the remark made in the comment should be clear:
- The sma needs 30 data points to start producing values.
Hence the initial NaN values, which indicate that no sensible value can be calculated and delivered. For the sake of it, here is an excerpt of print(sma.df.to_string()) showing when the sma calculation starts delivering values:
...
2006-02-08          NaN
2006-02-09          NaN
2006-02-10  1822.221333
2006-02-13  1824.273667
...
Unless a trading calendar for the asset from 2006 is at hand, it is not easy to see if 2006-02-10 is actually the 30th day and hence the first one for which a value can be provided (Yes, it is!). To have a visual confirmation, let's advance in the usage of the library by modifying the calculation period of the sma.
sma = btalib.sma(df, period=4)  # default period is 30, changed to 4
print(sma.df)
By simply passing period=4 as a named argument to sma we change the calculation window. And the result now is:
                  sma
Date
2006-01-02        NaN
2006-01-03        NaN
2006-01-04        NaN
2006-01-05  1815.1700
2006-01-06  1823.0050
...               ...
2006-12-21  2057.6475
...
Blistering barnacles! Values start to be produced on the 4th day ... as expected.
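The warm-up behaviour can be reproduced in a few lines of plain Python. This is a minimal sketch of the textbook definition of a simple moving average, not bta-lib's implementation; the function name simple_moving_average and the sample prices are made up for illustration.

```python
import math


def simple_moving_average(values, period):
    """Return the SMA of values; the first period - 1 entries are NaN."""
    result = []
    for i in range(len(values)):
        if i + 1 < period:
            # Fewer than `period` data points seen so far: no value yet
            result.append(math.nan)
        else:
            # Average of the last `period` values, current one included
            window = values[i + 1 - period:i + 1]
            result.append(sum(window) / period)
    return result


prices = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]  # hypothetical prices
sma4 = simple_moving_average(prices, period=4)
print(sma4)  # [nan, nan, nan, 11.5, 12.5, 13.5]
```

With period=4 the first three entries are NaN and the fourth is the average of the first four prices, which mirrors the output above, where the first value appears on the 4th day.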
Note
The parameter period for the sma is documented. Each indicator contains complete documentation on the supported parameters and the default values. See the "Indicator Reference" section.
Specific input selection
For the sake of it and because it was mentioned above, let's repeat the feat using the High price.
sma = btalib.sma(df.High, period=4)  # default period is 30, changed to 4
print(sma.df)
It should not be a surprise that the results change, given the usage of a different field than the default Close (which is automatically chosen, being the industry de-facto standard):
                  sma
Date
2006-01-02        NaN
2006-01-03        NaN
2006-01-04        NaN
2006-01-05  1819.8100
2006-01-06  1827.4400
...               ...
2006-12-21  2064.8175
2006-12-22  2060.8675
2006-12-27  2062.6000
2006-12-28  2064.0075
2006-12-29  2066.0975
Conclusion
Enough quickstarting! The following has been shown:
- Using a DataFrame as input, where the Close column was automatically selected by the indicator as the input (following the de-facto standard).
- Using a specific column of the DataFrame, such as High with df.High, as input to run the calculations on it.
- Fetching the calculation and getting a DataFrame by using the df attribute of the result (for example sma.df).
- Changing the default calculation parameters of the indicator by providing a named argument, as with period=4.
The following section will cover additional topics and expand on things like how the automatic selection of the inputs is done (and when being specific about it is important), using indicators with multiple inputs and outputs, reusing results with other indicators, plotting and more.