finlab.ml

finlab.ml.feature

combine

combine(features, resample=None, sample_filter=None, **kwargs)

The combine function takes a dictionary of features as input and combines them into a single pandas DataFrame.

PARAMETER	DESCRIPTION
`features`	A dictionary whose values are DataFrames (datetime index, instrument columns) or callables that return such DataFrames. TYPE: `Dict[str, DataFrame \| Callable]`
`resample`	Optional frequency used to resample the data in the features. TYPE: `str` DEFAULT: `None`
`sample_filter`	A boolean DataFrame (date index, instrument columns) that acts as the filter applied to the features. TYPE: `DataFrame` DEFAULT: `None`
`**kwargs`	Additional keyword arguments passed to the resampler function. DEFAULT: `{}`

RETURNS	DESCRIPTION
	A pandas DataFrame containing all the input features combined.

Examples:

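This example shows how to use the finlab.ml.feature and finlab.data modules to combine features such as RSI and the price-to-book ratio. f.combine performs the merge: each dictionary key is a feature name and each value is the corresponding data. The 'rsi' feature comes from data.indicator('RSI'), which computes the Relative Strength Index. The 'pb' feature comes from data.get('price_earning_ratio:股價淨值比'), which fetches the price-to-book ratio. The result is a single DataFrame containing the combined features.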

from finlab import data
import finlab.ml.feature as f
import finlab.ml.qlib as q

features = f.combine({

    # use data.get to load a dataset directly as a feature
    'pb': data.get('price_earning_ratio:股價淨值比'),

    # use data.indicator to generate technical indicator features
    'rsi': data.indicator('RSI'),

    # use f.ta to enumerate a large number of talib indicators
    'talib': f.ta(f.ta_names()),

    # use qlib Alpha158 to generate technical indicator features (run q.dump() and q.init() first)
    'qlib158': q.alpha('Alpha158')

    })

features.head()

datetime	instrument	rsi	pb
2020-01-01	1101	0	2
2020-01-02	1102	100	3
2020-01-03	1108	100	4
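
To make the output shape concrete, the following is a minimal sketch of the reshaping that the example output suggests: wide frames (datetime index, instrument columns) become one long frame keyed by (datetime, instrument), with one column per feature. The function `combine_sketch` and the toy values are hypothetical and only illustrate the idea; they are not the finlab implementation and they ignore `resample` and `sample_filter`.

```python
import pandas as pd

# Toy wide-format inputs: datetime index, one column per instrument.
dates = pd.DatetimeIndex(["2020-01-01", "2020-01-02"], name="datetime")
instruments = pd.Index(["1101", "1102"], name="instrument")
rsi = pd.DataFrame([[30.0, 70.0], [35.0, 65.0]], index=dates, columns=instruments)
pb = pd.DataFrame([[1.2, 2.5], [1.3, 2.4]], index=dates, columns=instruments)


def combine_sketch(features):
    """Hypothetical stand-in for f.combine: wide frames in, one long frame out."""
    columns = {}
    for name, value in features.items():
        # Values may be DataFrames or callables that return DataFrames.
        frame = value() if callable(value) else value
        # unstack() on a wide frame yields a Series indexed by (instrument, datetime);
        # swap the levels so rows are keyed by (datetime, instrument).
        columns[name] = frame.unstack().swaplevel().sort_index()
    # One column per feature name, aligned on the shared row index.
    return pd.concat(columns, axis=1)


combined = combine_sketch({"rsi": rsi, "pb": lambda: pb})
print(combined)
```

Passing a callable (here `lambda: pb`) defers loading the data until the feature is actually combined, which is one plausible reason the real function accepts callables.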

ta

ta(feature_names, factories=None, resample=None, start_time=None, end_time=None, adj=False, cpu=-1, **kwargs)

Calculate technical indicator values for a list of feature names.

PARAMETER	DESCRIPTION
`feature_names`	A list of technical indicator feature names. Defaults to None. TYPE: `Optional[List[str]]`
`factories`	A dictionary of factories to generate technical indicators. Defaults to {"talib": TalibIndicatorFactory()}. TYPE: `Optional[Dict[str, TalibIndicatorFactory]]` DEFAULT: `None`
`resample`	The frequency to resample the data to. Defaults to None. TYPE: `Optional[str]` DEFAULT: `None`
`start_time`	The start time of the data. Defaults to None. TYPE: `Optional[str]` DEFAULT: `None`
`end_time`	The end time of the data. Defaults to None. TYPE: `Optional[str]` DEFAULT: `None`
`**kwargs`	Additional keyword arguments to pass to the resampler function. DEFAULT: `{}`

RETURNS	DESCRIPTION
`DataFrame`	pd.DataFrame: technical indicator feature names and their corresponding values.

ta_names

ta_names(lb=1, ub=10, n=1, factory=None)

Generate a list of technical indicator feature names.

PARAMETER	DESCRIPTION
`lb`	The lower bound of the multiplier of the default parameter for the technical indicators. TYPE: `int` DEFAULT: `1`
`ub`	The upper bound of the multiplier of the default parameter for the technical indicators. TYPE: `int` DEFAULT: `10`
`n`	The number of random samples for each technical indicator. TYPE: `int` DEFAULT: `1`
`factory`	A factory object to generate technical indicators. Defaults to TalibIndicatorFactory. TYPE: `IndicatorFactory` DEFAULT: `None`

RETURNS	DESCRIPTION
`List[str]`	List[str]: A list of technical indicator feature names.
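
The `lb`, `ub` and `n` parameters can be read as a random search over indicator settings. The sketch below is one possible interpretation, not the library's code: each default parameter is scaled by an independent random multiplier in [lb, ub] and truncated to an integer, and this is repeated `n` times. The helper `sample_params` is hypothetical; the MACD defaults (12, 26, 9) are TA-Lib's standard values.

```python
import random

# TA-Lib's default MACD periods; the sampling rule below is an assumption.
MACD_DEFAULTS = {"fastperiod": 12, "slowperiod": 26, "signalperiod": 9}


def sample_params(defaults, lb=1, ub=10, n=1, seed=None):
    """Draw n parameter sets, scaling each default by a random multiplier in [lb, ub]."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        samples.append({
            name: int(value * rng.uniform(lb, ub))
            for name, value in defaults.items()
        })
    return samples


samples = sample_params(MACD_DEFAULTS, lb=1, ub=10, n=3, seed=0)
for params in samples:
    print(params)
```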

Examples:

import finlab.ml.feature as f


# method 1: generate each indicator with random parameters
features = f.ta()

# method 2: generate specific indicator
feature_names = ['talib.MACD__macdhist__fastperiod__52__slowperiod__212__signalperiod__75__']
features = f.ta(feature_names, resample='W')

# method 3: generate some indicator
feature_names = f.ta_names()
features = f.ta(feature_names)
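
The feature name used in method 2 appears to encode the factory, the indicator, the output series and the parameter values, separated by a dot and double underscores. The parser below is a sketch based only on that single example; `parse_feature_name` is a hypothetical helper and the naming scheme is inferred, not documented.

```python
def parse_feature_name(name):
    """Split a feature name into its parts (format inferred from one example)."""
    factory, _, rest = name.partition(".")
    # The trailing "__" leaves an empty string after splitting, so drop empties.
    parts = [part for part in rest.split("__") if part]
    indicator, output, *key_values = parts

    def to_number(text):
        # Parameters look like integers in the example; fall back to float otherwise.
        try:
            return int(text)
        except ValueError:
            return float(text)

    params = {
        key: to_number(value)
        for key, value in zip(key_values[0::2], key_values[1::2])
    }
    return {"factory": factory, "indicator": indicator, "output": output, "params": params}


parsed = parse_feature_name(
    "talib.MACD__macdhist__fastperiod__52__slowperiod__212__signalperiod__75__"
)
print(parsed)
```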

finlab.ml.label

daytrading_percentage

daytrading_percentage(index, **kwargs)

Calculate the percentage change of market prices over a given period.

PARAMETER	DESCRIPTION
`index`	A multi-level index of datetime and instrument. TYPE: `Index`
`resample`	The resample frequency for the output data. Defaults to None. TYPE: `Optional[str]`
`period`	The number of periods to calculate the percentage change over. Defaults to 1. TYPE: `int`
`trade_at_price`	The price for execution. Defaults to `close`. TYPE: `str`
`**kwargs`	Additional arguments to be passed to the resampler function. DEFAULT: `{}`

RETURNS	DESCRIPTION
`Series`	pd.Series: A pd.Series containing the percentage change of stock prices.

excess_over_mean

excess_over_mean(index, resample=None, period=1, trade_at_price='close', **kwargs)

Calculate the excess over mean of market prices over a given period.

PARAMETER	DESCRIPTION
`index`	A multi-level index of datetime and instrument. TYPE: `Index`
`resample`	The resample frequency for the output data. Defaults to None. TYPE: `Optional[str]` DEFAULT: `None`
`period`	The number of periods to calculate the percentage change over. Defaults to 1. TYPE: `int` DEFAULT: `1`
`trade_at_price`	The price for execution. Defaults to `close`. TYPE: `str` DEFAULT: `'close'`
`**kwargs`	Additional arguments to be passed to the resampler function. DEFAULT: `{}`

RETURNS	DESCRIPTION
`Series`	pd.Series: A pd.Series containing the percentage change of stock prices.

excess_over_median

excess_over_median(index, resample=None, period=1, trade_at_price='close', **kwargs)

Calculate the excess over median of market prices over a given period.

PARAMETER	DESCRIPTION
`index`	A multi-level index of datetime and instrument. TYPE: `Index`
`resample`	The resample frequency for the output data. Defaults to None. TYPE: `Optional[str]` DEFAULT: `None`
`period`	The number of periods to calculate the percentage change over. Defaults to 1. TYPE: `int` DEFAULT: `1`
`trade_at_price`	The price for execution. Defaults to `close`. TYPE: `str` DEFAULT: `'close'`
`**kwargs`	Additional arguments to be passed to the resampler function. DEFAULT: `{}`

RETURNS	DESCRIPTION
`Series`	pd.Series: A pd.Series containing the percentage change of stock prices.
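
The difference between the two excess labels is easiest to see on a single date. The sketch below assumes that "excess" means the instrument's return minus the cross-sectional mean (or median) of all instruments on the same date; that reading and the toy returns are assumptions, not taken from the library source.

```python
import pandas as pd

# Toy period returns for one date across four instruments; "9999" is an outlier.
returns = pd.DataFrame(
    {"1101": [0.01], "1102": [0.02], "1103": [0.03], "9999": [0.50]},
    index=pd.DatetimeIndex(["2020-01-01"], name="datetime"),
)

# Subtract the cross-sectional benchmark (one value per date) from every instrument.
excess_mean = returns.sub(returns.mean(axis=1), axis=0)
excess_median = returns.sub(returns.median(axis=1), axis=0)

print(excess_mean)
print(excess_median)
```

With the outlier present, only one instrument beats the mean while two beat the median, which is why the median version is often preferred when a few extreme returns dominate.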

maximum_adverse_excursion

maximum_adverse_excursion(index, period=1, trade_at_price='close')

Calculate the maximum adverse excursion of market prices over a given period.

PARAMETER	DESCRIPTION
`index`	A multi-level index of datetime and instrument. TYPE: `Index`
`resample`	The resample frequency for the output data. Defaults to None. TYPE: `Optional[str]`
`period`	The number of periods to calculate the percentage change over. Defaults to 1. TYPE: `int` DEFAULT: `1`
`trade_at_price`	The price for execution. Defaults to `close`. TYPE: `str` DEFAULT: `'close'`
`**kwargs`	Additional arguments to be passed to the resampler function.

RETURNS	DESCRIPTION
`Series`	pd.Series: A pd.Series containing the percentage change of stock prices.

maximum_favorable_excursion

maximum_favorable_excursion(index, period=1, trade_at_price='close')

Calculate the maximum favorable excursion of market prices over a given period.

PARAMETER	DESCRIPTION
`index`	A multi-level index of datetime and instrument. TYPE: `Index`
`resample`	The resample frequency for the output data. Defaults to None. TYPE: `Optional[str]`
`period`	The number of periods to calculate the percentage change over. Defaults to 1. TYPE: `int` DEFAULT: `1`
`trade_at_price`	The price for execution. Defaults to `close`. TYPE: `str` DEFAULT: `'close'`
`**kwargs`	Additional arguments to be passed to the resampler function.

RETURNS	DESCRIPTION
`Series`	pd.Series: A pd.Series containing the percentage change of stock prices.
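
As an illustration of the two excursion labels, the sketch below computes, for a long position in one instrument, the worst and best return reachable within the next `period` rows after entry. Whether the library includes the entry bar, or reports the adverse excursion as a negative number, is not stated here, so both choices in this sketch are assumptions. The helper `excursions` is hypothetical.

```python
import pandas as pd

close = pd.Series(
    [100.0, 90.0, 110.0, 105.0],
    index=pd.date_range("2020-01-01", periods=4, name="datetime"),
)


def excursions(price, period=1):
    """Worst and best return within the next `period` rows after entry (long side)."""
    # Column k-1 holds the price k rows ahead of each entry row.
    future = pd.concat([price.shift(-k) for k in range(1, period + 1)], axis=1)
    mae = future.min(axis=1) / price - 1  # maximum adverse excursion
    mfe = future.max(axis=1) / price - 1  # maximum favorable excursion
    return mae, mfe


mae, mfe = excursions(close, period=2)
print(pd.DataFrame({"mae": mae, "mfe": mfe}))
```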

return_percentage

return_percentage(index, resample=None, period=1, trade_at_price='close', bfill=False, **kwargs)

Calculate the percentage change of market prices over a given period.

PARAMETER	DESCRIPTION
`index`	A multi-level index of datetime and instrument. TYPE: `Index`
`resample`	The resample frequency for the output data. Defaults to None. TYPE: `Optional[str]` DEFAULT: `None`
`period`	The number of periods to calculate the percentage change over. Defaults to 1. TYPE: `int` DEFAULT: `1`
`trade_at_price`	The price for execution. Defaults to `close`. TYPE: `str` DEFAULT: `'close'`
`**kwargs`	Additional arguments to be passed to the resampler function. DEFAULT: `{}`

RETURNS	DESCRIPTION
`Series`	pd.Series: A pd.Series containing the percentage change of stock prices.
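
A label of this kind is typically a forward-looking return aligned to the feature index. The sketch below assumes the label at each date is the return from that date's price to the price `period` rows later, then aligns it to a (datetime, instrument) MultiIndex. The forward-looking convention and the helper `forward_return` are assumptions; `resample`, `trade_at_price` and `bfill` are not modelled.

```python
import pandas as pd

dates = pd.DatetimeIndex(["2020-01-01", "2020-01-02", "2020-01-03"], name="datetime")
close = pd.DataFrame(
    {"1101": [100.0, 110.0, 99.0], "1102": [50.0, 50.0, 55.0]}, index=dates
)
close.columns.name = "instrument"


def forward_return(price, period=1):
    """Return from each row's price to the price `period` rows later."""
    return price.shift(-period) / price - 1


wide = forward_return(close, period=1)

# Align with a (datetime, instrument) index such as the one a combined feature frame uses.
index = pd.MultiIndex.from_product(
    [dates, close.columns], names=["datetime", "instrument"]
)
label = wide.unstack().swaplevel().reindex(index)
print(label)
```

The last `period` rows of each instrument have no future price, so their labels are NaN and would normally be dropped before training.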

finlab.ml.qlib

DumpDataBase

DumpDataBase(csv_path, qlib_dir, backup_dir=None, freq='day', max_workers=16, date_field_name='date', file_suffix='.csv', symbol_field_name='symbol', exclude_fields='', include_fields='', limit_nums=None)

Base class for dumping data to Qlib format.

PARAMETER	DESCRIPTION
`csv_path`	The path to the CSV file or directory containing the CSV files. TYPE: `str`
`qlib_dir`	The directory where the Qlib data will be saved. TYPE: `str`
`backup_dir`	The directory where the backup of the Qlib data will be saved. Defaults to None. TYPE: `str` DEFAULT: `None`
`freq`	The frequency of the data. Defaults to "day". TYPE: `str` DEFAULT: `'day'`
`max_workers`	The maximum number of workers for parallel processing. Defaults to 16. TYPE: `int` DEFAULT: `16`
`date_field_name`	The name of the date field in the CSV file. Defaults to "date". TYPE: `str` DEFAULT: `'date'`
`file_suffix`	The suffix of the CSV file. Defaults to ".csv". TYPE: `str` DEFAULT: `'.csv'`
`symbol_field_name`	The name of the symbol field in the CSV file. Defaults to "symbol". TYPE: `str` DEFAULT: `'symbol'`
`exclude_fields`	The fields to exclude from the dumped data. Defaults to "". TYPE: `str` DEFAULT: `''`
`include_fields`	The fields to include in the dumped data. Defaults to "". TYPE: `str` DEFAULT: `''`
`limit_nums`	The maximum number of CSV files to process. Defaults to None. TYPE: `int` DEFAULT: `None`
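
The parameters above imply an input directory of CSV files that carry a date column and a symbol column. The sketch below writes such a directory with the standard library, assuming one file per symbol named `<symbol>.csv` and the default field names `date` and `symbol`. The one-file-per-symbol layout, the extra `open` and `close` columns and the toy prices are assumptions for illustration only.

```python
import csv
import tempfile
from pathlib import Path

# Toy price rows per symbol: (date, open, close). Values are made up.
rows = {
    "1101": [("2020-01-02", 43.5, 44.0), ("2020-01-03", 44.0, 43.8)],
    "1102": [("2020-01-02", 47.1, 47.3), ("2020-01-03", 47.3, 47.0)],
}

csv_dir = Path(tempfile.mkdtemp())  # would be passed as csv_path
for symbol, records in rows.items():
    with open(csv_dir / f"{symbol}.csv", "w", newline="") as handle:
        writer = csv.writer(handle)
        # Header uses the default date_field_name and symbol_field_name.
        writer.writerow(["date", "symbol", "open", "close"])
        for date, open_price, close_price in records:
            writer.writerow([date, symbol, open_price, close_price])

written = sorted(path.name for path in csv_dir.glob("*.csv"))
print(written)
```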

CatBoostModel

CatBoostModel()

CatBoostModel is a wrapper model for CatBoost model.

import finlab.ml.qlib as q

# build X_train, y_train, X_test

model = q.CatBoostModel()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
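
The "build X_train, y_train, X_test" step is left open in these examples. One common way to do it, sketched below with synthetic data, is to join features and labels on their shared (datetime, instrument) index, drop incomplete rows, and split by a date cutoff so the test set lies strictly after the training set. The cutoff date, the column names and the random data are all assumptions.

```python
import numpy as np
import pandas as pd

# Synthetic features and label on a (datetime, instrument) index.
dates = pd.date_range("2020-01-01", periods=6, name="datetime")
index = pd.MultiIndex.from_product(
    [dates, ["1101", "1102"]], names=["datetime", "instrument"]
)
rng = np.random.default_rng(0)
features = pd.DataFrame(
    rng.normal(size=(len(index), 2)), index=index, columns=["rsi", "pb"]
)
label = pd.Series(rng.normal(size=len(index)), index=index, name="label")

# Keep only rows where every feature and the label are present.
data = features.join(label).dropna()

# Split by time, not at random, to avoid training on information from the future.
cutoff = pd.Timestamp("2020-01-05")
is_train = data.index.get_level_values("datetime") < cutoff

X_train = data.loc[is_train, features.columns]
y_train = data.loc[is_train, "label"]
X_test = data.loc[~is_train, features.columns]
print(X_train.shape, y_train.shape, X_test.shape)
```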

DEnsmbleModel

DEnsmbleModel()

DEnsmbleModel is a wrapper model for Double Ensemble model.

import finlab.ml.qlib as q

# build X_train, y_train, X_test

model = q.DEnsmbleModel()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

DNNModel

DNNModel()

DNNModel is a wrapper model for Deep Neural Network model.

import finlab.ml.qlib as q

# build X_train, y_train, X_test

model = q.DNNModel()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

LGBModel

LGBModel()

LGBModel is a wrapper model for LightGBM model.

import finlab.ml.qlib as q

# build X_train, y_train, X_test

model = q.LGBModel()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

LinearModel

LinearModel()

LinearModel is a wrapper model for Linear model.

import finlab.ml.qlib as q

# build X_train, y_train, X_test

model = q.LinearModel()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

SFMModel

SFMModel()

SFMModel is a wrapper model for SFM.

import finlab.ml.qlib as q

# build X_train, y_train, X_test

model = q.SFMModel()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

TabnetModel

TabnetModel()

TabnetModel is a wrapper model for Tabnet model.

import finlab.ml.qlib as q

# build X_train, y_train, X_test

model = q.TabnetModel()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

XGBModel

XGBModel()

XGBModel is a wrapper model for XGBoost model.

import finlab.ml.qlib as q

# build X_train, y_train, X_test

model = q.XGBModel()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

alpha

alpha(handler='Alpha158', **kwargs)

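Generate Qlib features.

PARAMETER	DESCRIPTION
`handler`	The Qlib feature handler. Defaults to 'Alpha158'; 'Alpha360' can also be used. TYPE: `str` DEFAULT: `'Alpha158'`

Examples: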

import finlab.ml.qlib as q
features = q.alpha('Alpha158')

dump

dump(freq='day')

Generate the Qlib database for the Taiwan stock market.

Examples:

import qlib
import finlab.ml.qlib as q

q.dump() # generate the Taiwan stock database
q.init() # initialize Qlib for Taiwan stocks to run machine learning tasks (similar to qlib.init)

# qlib functions and operations can be used from here on

get_models

get_models()

Return the available models as a dictionary that maps model names to model classes.

Examples:

import finlab.ml.qlib as q

models = q.get_models()
print(models)

output:

{ 'LGBModel': LGBModel, 'XGBModel': XGBModel, 'DEnsmbleModel': DEnsmbleModel, 'CatBoostModel': CatBoostModel, 'LinearModel': LinearModel, 'TabnetModel': TabnetModel, 'DNNModel': DNNModel, 'SFMModel': SFMModel}
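
Because the result maps names to classes that share `fit` and `predict`, it can drive a simple loop over candidate models. The sketch below shows that pattern with two hypothetical stand-in classes, `MeanModel` and `LastValueModel`, so it runs without finlab installed; with the real library the dictionary would come from `q.get_models()` instead.

```python
class MeanModel:
    """Hypothetical stand-in: predicts the mean of the training labels."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class LastValueModel:
    """Hypothetical stand-in: predicts the last training label."""

    def fit(self, X, y):
        self.last_ = y[-1]
        return self

    def predict(self, X):
        return [self.last_ for _ in X]


# Same shape as the dictionary printed above: name -> class.
models = {"MeanModel": MeanModel, "LastValueModel": LastValueModel}

X_train, y_train, X_test = [[1], [2], [3]], [1.0, 2.0, 6.0], [[4], [5]]

predictions = {}
for name, model_cls in models.items():
    model = model_cls()  # instantiate with default settings
    model.fit(X_train, y_train)
    predictions[name] = model.predict(X_test)

print(predictions)
```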

init

init()

Initialize Qlib (similar to qlib.init(), but set up for Taiwan stocks and simpler to use).

Examples:

import qlib
import finlab.ml.qlib as q

q.dump() # generate the Taiwan stock database
q.init() # initialize Qlib for Taiwan stocks to run machine learning tasks (similar to qlib.init)

# qlib functions and operations can be used from here on