Data Input
A sample input
The sample data provided with the library has this format:
Date,Open,High,Low,Close,Volume,OpenInterest
2006-01-02,1789.36,1802.98,1789.36,1802.16,0.00,0.00
It has the usual OHLC fields (Open-High-Low-Close), preceded by a timestamp and followed by the Volume and OpenInterest components. Rather standard.
It can easily be transformed into a time-series-based DataFrame with a one-liner (once pandas has been imported as pd):
import pandas as pd
df = pd.read_csv('2006-day-001.txt', parse_dates=True, index_col='Date')
The DataFrame now contains the following columns:
['Open', 'High', 'Low', 'Close', 'Volume', 'OpenInterest']
The Date column is not missing; it has simply been transformed into the index. A usual printout of the DataFrame looks like this (skipping some lines for brevity):
               Open     High      Low    Close  Volume  OpenInterest
Date
2006-01-02  1789.36  1802.98  1789.36  1802.16     0.0           0.0
2006-01-03  1802.04  1819.21  1800.92  1807.17     0.0           0.0
...
Most time-series dataframes will have this format or a very similar one. Remapping column names (in pandas or directly in the library) should be easy enough to get the input into the library.
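As a minimal sketch of the pandas side of that remapping: the short header names below (Timestamp, O, H, L, C, Vol, OI) are made up for illustration and stand in for whatever a different data feed might deliver. The two data rows are the ones from the sample shown above.

```python
import io

import pandas as pd

# Hypothetical feed whose headers differ from the names the library looks for
raw = io.StringIO(
    "Timestamp,O,H,L,C,Vol,OI\n"
    "2006-01-02,1789.36,1802.98,1789.36,1802.16,0.00,0.00\n"
    "2006-01-03,1802.04,1819.21,1800.92,1807.17,0.00,0.00\n"
)
df = pd.read_csv(raw, parse_dates=True, index_col="Timestamp")

# Remap the short headers to the standard OHLCVOi names
df = df.rename(columns={
    "O": "Open", "H": "High", "L": "Low", "C": "Close",
    "Vol": "Volume", "OI": "OpenInterest",
})
```

After the rename, the DataFrame has the same column layout as the sample and can be handed over as-is.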
Default Settings
The library provides the sample to show what the default expectations are, which can be summarized as follows:

Column names are converted to lowercase before any comparison is made, i.e.: comparisons are always case-insensitive. As such, the Close column may just as well be named close or cLoSe

Date timestamp in the index. This is actually a pure expectation, because the library does not touch the index and does not look into its contents for anything. The index could simply be a sequence of integers.
Note
The name of the column for the index is irrelevant.

OHLCVOi ordering, i.e.: Open-High-Low-Close-Volume-OpenInterest
If the names of the columns do not match the expectation, the corresponding numeric index of the columns will be used, i.e.: Open = 0, High = 1, Low = 2, Close = 3, Volume = 4, OpenInterest = 5
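The defaults above can be pictured with a small stand-alone sketch. It is not the library's code: the helper resolve_column and the table DEFAULT_INDICES are hypothetical names, used only to illustrate "case-insensitive name first, OHLCVOi position otherwise".

```python
# Default positions following the OHLCVOi ordering (illustrative table)
DEFAULT_INDICES = {
    "open": 0, "high": 1, "low": 2, "close": 3,
    "volume": 4, "openinterest": 5,
}


def resolve_column(columns, input_name):
    """Return the position of the column to use for `input_name`."""
    wanted = input_name.lower()
    lowered = [str(c).lower() for c in columns]
    if wanted in lowered:              # 1. case-insensitive name match
        return lowered.index(wanted)
    return DEFAULT_INDICES[wanted]     # 2. otherwise, the default position


mixed_case = ["Open", "High", "Low", "cLoSe", "Volume", "OpenInterest"]
renamed = ["Open", "High", "Low", "NewClose", "Volume", "OpenInterest"]
reordered = ["Volume", "Close"]
```

Here resolve_column(mixed_case, "Close") finds the column by name despite the odd casing, resolve_column(renamed, "close") falls back to position 3, and resolve_column(reordered, "close") follows the name to position 1.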
Single Input Indicators
This type of indicator has close as the default input to look for, or column index 3 as explained above. Let's see the reference documentation for the archetype of such an indicator, the sma or SimpleMovingAverage.
Non-weighted average of the last n periods

Formula:
  movav = Sum(data, period) / period

See also:
  http://en.wikipedia.org/wiki/Moving_average#Simple_moving_average

Aliases: SMA, SimpleMovingAverage

Inputs: close

Outputs: sma

Params:
  period (default: 30)
    Period for the moving average calculation
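To make the formula concrete, here is a plain-Python sketch of a non-weighted moving average. It is an illustration only, not the library's implementation; padding the warm-up positions with NaN is an assumption made for this example.

```python
import math


def sma(data, period=30):
    """movav = Sum(data, period) / period, with NaN until `period` values exist."""
    out = []
    for i in range(len(data)):
        if i + 1 < period:
            out.append(math.nan)                    # not enough data yet
        else:
            window = data[i + 1 - period:i + 1]     # the last `period` values
            out.append(sum(window) / period)
    return out


closes = [1.0, 2.0, 3.0, 4.0, 5.0]
result = sma(closes, period=3)
```

With a period of 3, the first two positions carry no value and the first average is delivered at the third data point.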
Working with this indicator can be done in the following ways (the loading of the data into the dataframe df is assumed)
Default Column
# Let the magic of the library find the `Close` column
sma = btalib.sma(df)
Because the indicator sma defines its single input with the name close, the data input machinery will look into the dataframe for a column matching that name (case-insensitive comparison) ... and will find it.
# Be specific about which column to use by passing the column
sma = btalib.sma(df.High)
The indicator needs just a single input. Being specific and passing df.High (which is a Series) will perform the calculation directly on that data field.
# Reuse the sma object and pass it to another sma
sma = btalib.sma(df)
sma1 = btalib.sma(sma)
The sma also has a single output. It can therefore be used directly as the single input for other indicators ... like itself. No big deal (the delivery period of the first result in sma1 will obviously increase).
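The longer delivery period can be checked with a stand-alone sketch. The sma and first_valid helpers below are local illustrations in plain Python, not library calls, and the NaN warm-up padding is an assumption of the sketch.

```python
import math


def sma(data, period):
    # NaN until `period` values are available; NaN inputs propagate through sum()
    return [
        math.nan if i + 1 < period else sum(data[i + 1 - period:i + 1]) / period
        for i in range(len(data))
    ]


def first_valid(values):
    # Position of the first result that is not NaN
    return next(i for i, v in enumerate(values) if not math.isnan(v))


data = [float(x) for x in range(1, 11)]
sma_once = sma(data, period=3)
sma_twice = sma(sma_once, period=3)
```

With period 3, first_valid(sma_once) is 2 while first_valid(sma_twice) is 4: the second average has to wait until the first one has delivered three values of its own.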
# Let the magic of the library find the column by index
df.rename(columns={'Close': 'NewClose'}, inplace=True)
sma = btalib.sma(df)
Oops! By renaming Close to NewClose, the column can no longer be found by name, and no other column matches the name of the input sought by the sma. As explained above and following the standard OHLCVOi ordering, Close has an index of 3, and the column present at that index will be taken.
It can be the case that the dataframe has only two (2) columns (plus the index) and therefore only indices 0 and 1 are available. The machinery will then default to using the first of the columns, i.e.: column 0.
This is seen as a reasonable assumption and choice, because the indicator is expecting a single input, and a single input in the form of a dataframe is being provided. When name matching and default column index matching both fail, the first of the columns of the single-input dataframe is chosen.
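The three-step choice for a single-input indicator can be sketched as follows. The function pick_single_input is a hypothetical helper written for this illustration and only mirrors the order of the rules described above.

```python
def pick_single_input(columns, name="close", default_index=3):
    """Return the column position a single-input indicator would end up using."""
    lowered = [str(c).lower() for c in columns]
    if name in lowered:                   # 1. match by name (case-insensitive)
        return lowered.index(name)
    if default_index < len(columns):      # 2. default OHLCVOi position
        return default_index
    return 0                              # 3. last resort: the first column


full = ["Open", "High", "Low", "NewClose", "Volume", "OpenInterest"]
two_columns = ["Foo", "Bar"]
```

For full, the name is not found and position 3 is used; for two_columns, neither the name nor position 3 exists, so column 0 is taken.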
Multiple Input Indicators
The classic stochastic is a good choice to understand how things work. The relevant part of the documentation for it:
...
Aliases: 'stoch', 'Stochastic', 'STOCHASTIC', 'STOCH'

Inputs: high, low, close

Outputs: k, d

Params:
  period (default: 14)
    Period to consider
...
It expects the inputs high, low and close. Remember that the default for single-input indicators is to have close and look for it. In this case, the stochastic has overridden that by still defining close as an input but putting it last. This is so for two reasons:

It respects the OHLC ordering
It is the ordering of talib, and behaving like a well-known library is seen as a good thing (a library which probably also followed the OHLC convention in the first place)
Giving the indicator the three (3) expected inputs can be done in two generic ways

Provide three (3) individual inputs, which will be automatically matched to high, low and close internally
Provide a single (1) DataFrame input with several columns, which will be internally matched to the inputs (with column name matching, column index matching, ...)
Some examples (the sample data has already been loaded as a DataFrame and is available as df)
Multi-input examples
stochastic = btalib.stochastic(df.High, df.Low, df.Close)
Three (3) inputs are expected and three (3) are provided. In this case the right inputs are used, but nothing prevents a different user choice.
stochastic = btalib.stochastic(df.Close, df.Volume, df.Low)
The stochastic will not complain, because it will internally see Close, Volume and Low remapped to high, low and close respectively. The calculations in this case will make no sense whatsoever, but the input requirements have nonetheless been fulfilled.
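The positional matching described here amounts to pairing the declared input names with whatever is passed, in order. A minimal sketch, with a hypothetical bind_inputs helper and plain strings standing in for the Series objects:

```python
STOCHASTIC_INPUTS = ("high", "low", "close")   # declared order of the inputs


def bind_inputs(input_names, *series):
    """Pair each declared input with the argument in the same position."""
    if len(series) != len(input_names):
        raise ValueError(f"expected {len(input_names)} inputs, got {len(series)}")
    return dict(zip(input_names, series))


# Labels stand in for df.Close, df.Volume and df.Low
bound = bind_inputs(STOCHASTIC_INPUTS, "Close", "Volume", "Low")
```

No check is made on what the arguments contain: bound maps high to the Close data, low to Volume and close to Low, exactly as passed.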
A more advanced case with re-input
sma_low = btalib.sma(df.Low, period=10)
sma_high = btalib.sma(df.High, period=8)
stochastic = btalib.stochastic(sma_high, sma_low, df.Close)
Instead of passing the High and Low from the DataFrame directly into the stochastic, those fields are first transformed using sma indicators of periods 8 and 10 respectively. Both sma results are then used, together with the standard close.
Single-Input examples
stochastic = btalib.stochastic(df)
In this case, because the DataFrame has the columns High, Low and Close available, the stochastic will use their values for the calculations (reminder: name matching is case-insensitive, the column Close could also be named cLoSe)
Should the DataFrame have other naming conventions, the default column indices matching the OHLCVOi ordering will be used, i.e.: Open = 0, High = 1, Low = 2, Close = 3, Volume = 4, OpenInterest = 5
# Let the magic of the library find the column by index
df.rename(columns={'Low': 'NewLow'}, inplace=True)
stochastic = btalib.stochastic(df)
The column Low has been renamed to NewLow, which means that the second input sought by the stochastic indicator cannot be found by name. When this happens, the indicator will resort to applying the numeric index and will still use the real Low column.
It is possible for the user to really mess it up, like in this example.
# All inputs are still found by name, but each name now points to the wrong data
df.rename(columns={'Close': 'High', 'High': 'Low', 'Low': 'Close'}, inplace=True)
stochastic = btalib.stochastic(df)
All required inputs will be found by name, but the real columns applied as inputs will most likely produce useless results.
Remapping Names/Indices
The remapping of names directly in the DataFrame is shown above, and the default numeric indices from 0 to 5 follow the OHLCVOi convention. The library offers the possibility of remapping names and indices without having to touch the DataFrame or reorder columns.
This is done via the function set_input_indices(**kwargs). A first example remaps the location of the stochastic inputs. This is useful if the DataFrame has columns with names that have nothing to do with the usual Open-High-Low ... For the sake of the example, the names of the columns in the DataFrame will be mapped to alien names.
df.rename(columns={'High': 'Alf', 'Low': 'ET', 'Close': 'Alien'}, inplace=True)
# The names set above make no sense for the library
# Let's give it some indication
btalib.set_input_indices(high=3, low=0, close=1)
Now, because the names make no sense, the numeric indices will be used. Using the set_input_indices function, the library will use the indicated indices to select the columns, overriding the default OHLC ordering.
Column names can also be remapped to new names and not only to indices. Like in this example.
df.rename(columns={'High': 'Alf', 'Low': 'ET', 'Close': 'Alien'}, inplace=True)
# The names set above make no sense for the library
# Let's give it some indication
btalib.set_input_indices(high='Alf', low='ET', close='Alien')
Hallelujah! The alien names will now be used by the library to find the columns.
Using set_input_indices to remap numeric indices or names can be particularly useful when coupled with set_use_ohlc_indices_first(onoff=True). This forces the library to use the configuration rather than the standard Open-High-Low-Close ... naming, as in here.
btalib.set_use_ohlc_indices_first(True)
btalib.set_input_indices(high=3, low=0, close=1)
Regardless of the column names, the ordering high=3, low=0 and close=1 will be used. Column name matching has been effectively disabled.
As seen in this snippet, the need to remap the column names to alien names is gone. In a real scenario, the alien names would already be in the source DataFrame and the indices 3, 0 and 1 are the desired inputs to be processed.
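One way to picture how these two settings interact is the stand-alone sketch below. The InputConfig class is hypothetical and only borrows the two function names from the text; how the library stores and applies the configuration internally is not documented here, so the precedence rules are an assumption derived from the behaviour described above.

```python
DEFAULT_INDICES = {"open": 0, "high": 1, "low": 2, "close": 3,
                   "volume": 4, "openinterest": 5}


class InputConfig:
    """Illustrative stand-in for the module-level configuration."""

    def __init__(self):
        self.overrides = {}          # input name -> column position or column name
        self.indices_first = False

    def set_input_indices(self, **kwargs):
        self.overrides.update(kwargs)

    def set_use_ohlc_indices_first(self, onoff=True):
        self.indices_first = onoff

    def resolve(self, columns, input_name):
        target = self.overrides.get(input_name, DEFAULT_INDICES[input_name])
        lowered = [str(c).lower() for c in columns]
        if isinstance(target, str):                     # remapped to another name
            return lowered.index(target.lower())
        if not self.indices_first and input_name in lowered:
            return lowered.index(input_name)            # regular name match wins
        return target                                   # numeric position


standard = ["Open", "High", "Low", "Close", "Volume", "OpenInterest"]
alien = ["Open", "Alf", "ET", "Alien", "Volume", "OpenInterest"]

by_index = InputConfig()
by_index.set_input_indices(high=3, low=0, close=1)

by_name = InputConfig()
by_name.set_input_indices(high="Alf", low="ET", close="Alien")
```

With by_index and the standard names, high still resolves by name to position 1; after by_index.set_use_ohlc_indices_first(True) it resolves to position 3. With by_name, the alien columns are found at positions 1, 2 and 3.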
Other Indicators as Single-Input
The outcome of an indicator can actually be used as input for another. Putting together the stochastic and the sma:
stochastic = btalib.stochastic(df)
sma = btalib.sma(stochastic)
The outputs of the stochastic can be seen in the documentation
...
Aliases: 'stoch', 'Stochastic', 'STOCHASTIC', 'STOCH'

Inputs: high, low, close

Outputs: k, d

Params:
  period (default: 14)
    Period to consider
...
It actually has two (2) outputs, and the sma is expecting just one (1). The rule when using library indicators (as opposed to DataFrames) is to take the outputs in their natural ordering.
Hence, the sma needs one input and the first output is k, which is the one that will be processed by the sma. It is as if the following had actually been done
stochastic = btalib.stochastic(df)
# stochastic.k is a shorthand for "stochastic.outputs.k" or "stochastic.o.k"
sma = btalib.sma(stochastic.k)
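The "natural ordering" rule can be sketched without the library. IndicatorResult and as_single_input are hypothetical names used only for this illustration, and the numbers are placeholder values rather than real stochastic results.

```python
class IndicatorResult:
    """Hypothetical container keeping outputs in their declared order."""

    def __init__(self, **outputs):
        self.outputs = dict(outputs)   # dicts preserve insertion order


def as_single_input(obj):
    """Take the first output of an indicator result; pass anything else through."""
    if isinstance(obj, IndicatorResult):
        return next(iter(obj.outputs.values()))
    return obj


# Placeholder values, declared in the order k, d
stoch = IndicatorResult(k=[80.0, 75.0, 60.0], d=[70.0, 72.0, 71.0])
chosen = as_single_input(stoch)
```

Because k is declared first, chosen is the k line; getting d requires asking for it explicitly.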
In order to use output d from the stochastic, it is necessary to be specific about it, such as in this case.
stochastic = btalib.stochastic(df)
sma = btalib.sma(stochastic.d)
or even a lot more specific, if one suspects collision between the output name and some instance attribute defined in the indicator itself.
stochastic = btalib.stochastic(df)
sma = btalib.sma(stochastic.outputs.d)
Note
With the o shorthand notation for the outputs:
stochastic = btalib.stochastic(df)
sma = btalib.sma(stochastic.o.d)
Single-Input Errors
When a single input is provided to a multi-input indicator like the stochastic, the library can complain in the following situations:

The DataFrame has fewer columns than the number of inputs required
An input cannot be found by name and the numeric mapping cannot be matched to an existing column (e.g.: there are 3 columns and the index is 10)
In these cases a btalib.InputsError exception will be raised.
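Both failure modes can be reproduced with a stand-alone sketch. The InputsError class below is a local stand-in that only shares its name with the library's exception, and match_inputs and fails are hypothetical helpers; the sketch mirrors the two rules listed above rather than the library's actual checks.

```python
class InputsError(Exception):
    """Local stand-in for the library's exception of the same name."""


DEFAULT_INDICES = {"open": 0, "high": 1, "low": 2, "close": 3,
                   "volume": 4, "openinterest": 5}


def match_inputs(columns, input_names):
    """Map each input name to a column position, or raise InputsError."""
    if len(columns) < len(input_names):
        raise InputsError("fewer columns than required inputs")
    lowered = [str(c).lower() for c in columns]
    mapping = {}
    for name in input_names:
        if name in lowered:                        # found by name
            mapping[name] = lowered.index(name)
            continue
        index = DEFAULT_INDICES[name]              # fall back to the position
        if index >= len(columns):
            raise InputsError(f"no column at index {index} for input {name!r}")
        mapping[name] = index
    return mapping


def fails(columns, input_names=("high", "low", "close")):
    """True if matching the inputs against `columns` raises InputsError."""
    try:
        match_inputs(columns, input_names)
    except InputsError:
        return True
    return False


renamed_low = ["Open", "High", "NewLow", "Close", "Volume", "OpenInterest"]
```

A two-column frame fails the first rule, and a three-column frame with unrelated names fails the second one because close falls back to position 3, which does not exist. The renamed_low layout still resolves, with low taken from position 2.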