
finlab.ml

finlab.ml.feature

combine

combine(features, resample=None, sample_filter=None, **kwargs)

The combine function takes a dictionary of features as input and combines them into a single pandas DataFrame.

PARAMETER DESCRIPTION
features

A dictionary of feature DataFrames, each indexed by datetime with one column per instrument.

TYPE: Dict[str, DataFrame]

resample

Optional argument to resample the data in the features. Default is None.

TYPE: str DEFAULT: None

sample_filter

A boolean DataFrame, indexed by date with one column per instrument, that acts as the filter applied to the features.

TYPE: DataFrame DEFAULT: None

**kwargs

Additional keyword arguments to pass to the resampler function.

DEFAULT: {}

RETURNS DESCRIPTION

A pandas DataFrame containing all the input features combined.

Examples:

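This example shows how to use the finlab.ml.feature and finlab.data modules to combine two features: RSI and the price-to-book ratio. The f.combine function does the merging: each dictionary key is a feature name and the corresponding value is that feature's data. The 'rsi' feature comes from data.indicator('RSI'), which computes the Relative Strength Index. The 'pb' feature comes from data.get('price_earning_ratio:股價淨值比'), which fetches the price-to-book ratio. The result is a single DataFrame containing these features.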

from finlab import data
import finlab.ml.feature as f
import finlab.ml.qlib as q

features = f.combine({

    # use data.get to fetch a feature directly
    'pb': data.get('price_earning_ratio:股價淨值比'),

    # use data.indicator to generate a technical-indicator feature
    'rsi': data.indicator('RSI'),

    # use f.ta to enumerate a large number of talib indicators
    'talib': f.ta(f.ta_names()),

    # use qlib Alpha158 to generate technical-indicator features
    # (run q.dump() and q.init() first)
    'qlib158': q.alpha('Alpha158')

    })

features.head()
datetime    instrument  rsi  pb
2020-01-01  1101          0   2
2020-01-02  1102        100   3
2020-01-03  1108        100   4
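
To make the shape of the result concrete, here is a minimal pure-Python sketch of the kind of reshaping combine performs: each wide table (rows are dates, columns are instruments) is stacked into long rows keyed by (datetime, instrument), with one column per feature. The helper combine_sketch, the nested-dict layout, and the rule that unfiltered pairs are dropped are illustrative assumptions, not finlab's implementation; how the real function treats missing values is not stated here.

```python
from typing import Dict, Optional, Tuple

# A "wide" table: {date: {instrument: value}}. It stands in for a DataFrame
# whose index is datetime and whose columns are instruments.
Wide = Dict[str, Dict[str, float]]
Key = Tuple[str, str]  # (datetime, instrument)


def combine_sketch(
    features: Dict[str, Wide],
    sample_filter: Optional[Dict[str, Dict[str, bool]]] = None,
) -> Dict[Key, Dict[str, Optional[float]]]:
    """Stack wide feature tables into long rows, one column per feature."""
    # Collect every (date, instrument) pair seen in any feature.
    keys = sorted(
        {
            (day, inst)
            for table in features.values()
            for day, row in table.items()
            for inst in row
        }
    )
    combined = {}
    for day, inst in keys:
        # Assumption: a pair that the filter does not mark True is dropped.
        if sample_filter is not None and not sample_filter.get(day, {}).get(inst, False):
            continue
        combined[(day, inst)] = {
            name: table.get(day, {}).get(inst)  # None where a feature has no value
            for name, table in features.items()
        }
    return combined


rsi = {"2020-01-01": {"1101": 0.0, "1102": 55.0}, "2020-01-02": {"1101": 100.0}}
pb = {"2020-01-01": {"1101": 2.0, "1102": 3.0}, "2020-01-02": {"1101": 2.5}}
keep = {"2020-01-01": {"1101": True, "1102": False}, "2020-01-02": {"1101": True}}

combined = combine_sketch({"rsi": rsi, "pb": pb}, sample_filter=keep)
for key, row in combined.items():
    print(key, row)
```

Each printed row corresponds to one line of the features.head() output shown above: the key plays the role of the (datetime, instrument) index and the inner dictionary holds the feature columns.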

ta

ta(feature_names, factories=None, resample=None, start_time=None, end_time=None, adj=False, cpu=-1, **kwargs)

Calculate technical indicator values for a list of feature names.

PARAMETER DESCRIPTION
feature_names

A list of technical indicator feature names. Defaults to None.

TYPE: Optional[List[str]]

factories

A dictionary of factories to generate technical indicators. Defaults to {"talib": TalibIndicatorFactory()}.

TYPE: Optional[Dict[str, TalibIndicatorFactory]] DEFAULT: None

resample

The frequency to resample the data to. Defaults to None.

TYPE: Optional[str] DEFAULT: None

start_time

The start time of the data. Defaults to None.

TYPE: Optional[str] DEFAULT: None

end_time

The end time of the data. Defaults to None.

TYPE: Optional[str] DEFAULT: None

**kwargs

Additional keyword arguments to pass to the resampler function.

DEFAULT: {}

RETURNS DESCRIPTION
DataFrame

pd.DataFrame: technical indicator feature names and their corresponding values.

ta_names

ta_names(lb=1, ub=10, n=1, factory=None)

Generate a list of technical indicator feature names.

PARAMETER DESCRIPTION
lb

The lower bound of the multiplier of the default parameter for the technical indicators.

TYPE: int DEFAULT: 1

ub

The upper bound of the multiplier of the default parameter for the technical indicators.

TYPE: int DEFAULT: 10

n

The number of random samples for each technical indicator.

TYPE: int DEFAULT: 1

factory

A factory object to generate technical indicators. Defaults to TalibIndicatorFactory.

TYPE: IndicatorFactory DEFAULT: None

RETURNS DESCRIPTION
List[str]

List[str]: A list of technical indicator feature names.
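
One way to read lb, ub, and n: for each indicator, draw n parameter sets, scaling every default parameter by a random multiplier between lb and ub. The sketch below illustrates that reading in plain Python. The DEFAULTS and OUTPUTS tables, the helper sample_names, and the choice of an independent uniform multiplier per parameter are assumptions for illustration; the name layout copies the MACD feature name used in the examples of this section.

```python
import random
from typing import Dict, List

# Hypothetical default parameters and output names for two indicators.
# The tables are an assumption of this sketch, not read from finlab.
DEFAULTS: Dict[str, Dict[str, int]] = {
    "MACD": {"fastperiod": 12, "slowperiod": 26, "signalperiod": 9},
    "RSI": {"timeperiod": 14},
}
OUTPUTS: Dict[str, str] = {"MACD": "macdhist", "RSI": "real"}


def sample_names(lb: int = 1, ub: int = 10, n: int = 1, seed: int = 0) -> List[str]:
    """Draw n parameter sets per indicator with multipliers in [lb, ub]."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    names = []
    for indicator, params in DEFAULTS.items():
        for _ in range(n):
            parts = [f"talib.{indicator}", OUTPUTS[indicator]]
            for key, default in params.items():
                # Scale the default by a random multiplier, keep it an integer.
                scaled = int(default * rng.uniform(lb, ub))
                parts += [key, str(scaled)]
            names.append("__".join(parts) + "__")
    return names


names = sample_names(lb=1, ub=10, n=2)
for name in names:
    print(name)
```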

Examples:

import finlab.ml.feature as f


# method 1: generate each indicator with random parameters
features = f.ta()

# method 2: generate specific indicator
feature_names = ['talib.MACD__macdhist__fastperiod__52__slowperiod__212__signalperiod__75__']
features = f.ta(feature_names, resample='W')

# method 3: generate some indicator
feature_names = f.ta_names()
features = f.ta(feature_names)
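
Feature names such as the one in method 2 appear to pack the factory, indicator, output, and parameters into a single string separated by double underscores. Assuming that layout (inferred from the one example above, not from a specification) and integer-valued parameters, a name can be split back into its parts as sketched here; parse_feature_name and ParsedFeature are hypothetical helpers.

```python
from typing import Dict, NamedTuple


class ParsedFeature(NamedTuple):
    factory: str
    indicator: str
    output: str
    params: Dict[str, int]


def parse_feature_name(name: str) -> ParsedFeature:
    """Split 'factory.INDICATOR__output__key__value__...__' into parts."""
    # Drop the empty piece produced by the trailing double underscore.
    head, *rest = [part for part in name.split("__") if part]
    factory, indicator = head.split(".", 1)
    output, pairs = rest[0], rest[1:]
    if len(pairs) % 2:
        raise ValueError(f"unpaired parameter in {name!r}")
    # Parameters alternate key, value, key, value, ...
    params = {key: int(value) for key, value in zip(pairs[::2], pairs[1::2])}
    return ParsedFeature(factory, indicator, output, params)


parsed = parse_feature_name(
    "talib.MACD__macdhist__fastperiod__52__slowperiod__212__signalperiod__75__"
)
print(parsed.indicator, parsed.output, parsed.params)
```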

finlab.ml.label

daytrading_percentage

daytrading_percentage(index, **kwargs)

Calculate the percentage change of market prices over a given period.

PARAMETER DESCRIPTION
index

A multi-level index of datetime and instrument.

TYPE: Index

resample

The resample frequency for the output data. Defaults to None.

TYPE: Optional[str]

period

The number of periods to calculate the percentage change over. Defaults to 1.

TYPE: int

trade_at_price

The price for execution. Defaults to close.

TYPE: str

**kwargs

Additional arguments to be passed to the resampler function.

DEFAULT: {}

RETURNS DESCRIPTION

pd.Series: A pd.Series containing the percentage change of stock prices.

excess_over_mean

excess_over_mean(index, resample=None, period=1, trade_at_price='close', **kwargs)

Calculate the excess over mean of market prices over a given period.

PARAMETER DESCRIPTION
index

A multi-level index of datetime and instrument.

TYPE: Index

resample

The resample frequency for the output data. Defaults to None.

TYPE: Optional[str] DEFAULT: None

period

The number of periods to calculate the percentage change over. Defaults to 1.

TYPE: int DEFAULT: 1

trade_at_price

The price for execution. Defaults to close.

TYPE: str DEFAULT: 'close'

**kwargs

Additional arguments to be passed to the resampler function.

DEFAULT: {}

RETURNS DESCRIPTION

pd.Series: A pd.Series containing the excess over mean of market prices over the given period.
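
One common reading of "excess over mean" is a cross-sectional adjustment: on each date, the average return across all instruments is subtracted from every instrument's return, so the label measures performance relative to the market on that date. The sketch below illustrates that reading in plain Python. Whether finlab computes it exactly this way is an assumption, and excess_over_mean_sketch is a hypothetical helper.

```python
from statistics import mean
from typing import Dict

# returns[date][instrument] = period return of that instrument
Returns = Dict[str, Dict[str, float]]


def excess_over_mean_sketch(returns: Returns) -> Returns:
    """Subtract each date's cross-sectional mean from every return."""
    result = {}
    for day, row in returns.items():
        avg = mean(row.values())
        result[day] = {inst: r - avg for inst, r in row.items()}
    return result


returns = {
    "2020-01-02": {"1101": 0.03, "1102": -0.01, "1108": 0.01},
    "2020-01-03": {"1101": 0.00, "1102": 0.02, "1108": 0.04},
}
excess = excess_over_mean_sketch(returns)
print(excess["2020-01-02"])
```

By construction the adjusted values on each date sum to zero, which makes the label comparable across bullish and bearish days.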

excess_over_median

excess_over_median(index, resample=None, period=1, trade_at_price='close', **kwargs)

Calculate the excess over median of market prices over a given period.

PARAMETER DESCRIPTION
index

A multi-level index of datetime and instrument.

TYPE: Index

resample

The resample frequency for the output data. Defaults to None.

TYPE: Optional[str] DEFAULT: None

period

The number of periods to calculate the percentage change over. Defaults to 1.

TYPE: int DEFAULT: 1

trade_at_price

The price for execution. Defaults to close.

TYPE: str DEFAULT: 'close'

**kwargs

Additional arguments to be passed to the resampler function.

DEFAULT: {}

RETURNS DESCRIPTION

pd.Series: A pd.Series containing the excess over median of market prices over the given period.
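
Using the median instead of the mean as the cross-sectional reference makes the label less sensitive to a few extreme movers. The sketch below compares the two on a single date with one outlier. It assumes the label is each return minus that date's median across instruments, which this page does not confirm; excess_over is a hypothetical helper.

```python
from statistics import mean, median
from typing import Callable, Dict, Iterable


def excess_over(
    row: Dict[str, float], center: Callable[[Iterable[float]], float]
) -> Dict[str, float]:
    """Subtract a cross-sectional center (mean or median) from each return."""
    c = center(list(row.values()))
    return {inst: r - c for inst, r in row.items()}


# One date's returns; "2330" is a deliberate outlier.
row = {"1101": 0.01, "1102": 0.02, "1108": 0.03, "2330": 0.50}

vs_median = excess_over(row, median)
vs_mean = excess_over(row, mean)

print("above median:", [inst for inst, v in vs_median.items() if v > 0])
print("above mean:  ", [inst for inst, v in vs_mean.items() if v > 0])
```

With the median, half of the instruments end up above the reference regardless of the outlier; with the mean, the single large return pulls the reference up so that only the outlier itself is positive.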

maximum_adverse_excursion

maximum_adverse_excursion(index, period=1, trade_at_price='close')

Calculate the maximum adverse excursion of market prices over a given period.

PARAMETER DESCRIPTION
index

A multi-level index of datetime and instrument.

TYPE: Index

resample

The resample frequency for the output data. Defaults to None.

TYPE: Optional[str]

period

The number of periods to calculate the percentage change over. Defaults to 1.

TYPE: int DEFAULT: 1

trade_at_price

The price for execution. Defaults to close.

TYPE: str DEFAULT: 'close'

**kwargs

Additional arguments to be passed to the resampler function.

RETURNS DESCRIPTION

pd.Series: A pd.Series containing the maximum adverse excursion of stock prices.
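
Maximum adverse excursion is commonly defined as the worst unrealized loss seen while a position is open. Below is a minimal single-instrument sketch, assuming a long position entered at the price on step t and held for period steps. The helper mae_sketch and the convention of reporting the result as a non-positive fraction of the entry price are assumptions, not finlab's documented behavior.

```python
from typing import List, Optional


def mae_sketch(prices: List[float], period: int = 1) -> List[Optional[float]]:
    """Worst return reached within `period` steps after each entry."""
    out: List[Optional[float]] = []
    for t, entry in enumerate(prices):
        window = prices[t + 1 : t + 1 + period]
        if len(window) < period:
            out.append(None)  # not enough future data to label this row
            continue
        # Lowest future price relative to entry, capped at 0 (no loss).
        out.append(min(min(window) / entry - 1.0, 0.0))
    return out


prices = [100.0, 95.0, 105.0, 90.0, 110.0]
mae = mae_sketch(prices, period=2)
print(mae)
```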

maximum_favorable_excursion

maximum_favorable_excursion(index, period=1, trade_at_price='close')

Calculate the maximum favorable excursion of market prices over a given period.

PARAMETER DESCRIPTION
index

A multi-level index of datetime and instrument.

TYPE: Index

resample

The resample frequency for the output data. Defaults to None.

TYPE: Optional[str]

period

The number of periods to calculate the percentage change over. Defaults to 1.

TYPE: int DEFAULT: 1

trade_at_price

The price for execution. Defaults to close.

TYPE: str DEFAULT: 'close'

**kwargs

Additional arguments to be passed to the resampler function.

RETURNS DESCRIPTION

pd.Series: A pd.Series containing the maximum favorable excursion of stock prices.
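
Maximum favorable excursion is the mirror image of the adverse case: the best unrealized gain reached while a position is open. The sketch below assumes a long position entered at the price on step t and held for period steps, and reports the gain as a non-negative fraction of the entry price. Both the helper mfe_sketch and that convention are assumptions made for illustration.

```python
from typing import List, Optional


def mfe_sketch(prices: List[float], period: int = 1) -> List[Optional[float]]:
    """Best return reached within `period` steps after each entry."""
    out: List[Optional[float]] = []
    for t, entry in enumerate(prices):
        window = prices[t + 1 : t + 1 + period]
        if len(window) < period:
            out.append(None)  # not enough future data to label this row
            continue
        # Highest future price relative to entry, floored at 0 (no gain).
        out.append(max(max(window) / entry - 1.0, 0.0))
    return out


prices = [100.0, 95.0, 105.0, 90.0, 110.0]
mfe = mfe_sketch(prices, period=2)
print(mfe)
```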

return_percentage

return_percentage(index, resample=None, period=1, trade_at_price='close', bfill=False, **kwargs)

Calculate the percentage change of market prices over a given period.

PARAMETER DESCRIPTION
index

A multi-level index of datetime and instrument.

TYPE: Index

resample

The resample frequency for the output data. Defaults to None.

TYPE: Optional[str] DEFAULT: None

period

The number of periods to calculate the percentage change over. Defaults to 1.

TYPE: int DEFAULT: 1

trade_at_price

The price for execution. Defaults to close.

TYPE: str DEFAULT: 'close'

**kwargs

Additional arguments to be passed to the resampler function.

DEFAULT: {}

RETURNS DESCRIPTION

pd.Series: A pd.Series containing the percentage change of stock prices.
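
When used as a training label, a percentage change is typically forward-looking: the value at step t describes what happens over the next period steps. Below is a minimal single-instrument sketch under that assumption. The direction of the shift is not confirmed by the text above, and forward_return is a hypothetical helper.

```python
from typing import List, Optional


def forward_return(prices: List[float], period: int = 1) -> List[Optional[float]]:
    """Return prices[t + period] / prices[t] - 1 for each t."""
    out: List[Optional[float]] = []
    for t, price in enumerate(prices):
        if t + period < len(prices):
            out.append(prices[t + period] / price - 1.0)
        else:
            out.append(None)  # the future price is not known yet
    return out


close = [100.0, 102.0, 99.0, 108.0]
labels = forward_return(close, period=2)
print(labels)
```

The last period rows have no label because their future price lies beyond the end of the data.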

finlab.ml.qlib

DumpDataBase

DumpDataBase(csv_path, qlib_dir, backup_dir=None, freq='day', max_workers=16, date_field_name='date', file_suffix='.csv', symbol_field_name='symbol', exclude_fields='', include_fields='', limit_nums=None)

Base class for dumping data to Qlib format.

PARAMETER DESCRIPTION
csv_path

The path to the CSV file or directory containing the CSV files.

TYPE: str

qlib_dir

The directory where the Qlib data will be saved.

TYPE: str

backup_dir

The directory where the backup of the Qlib data will be saved. Defaults to None.

TYPE: str DEFAULT: None

freq

The frequency of the data. Defaults to "day".

TYPE: str DEFAULT: 'day'

max_workers

The maximum number of workers for parallel processing. Defaults to 16.

TYPE: int DEFAULT: 16

date_field_name

The name of the date field in the CSV file. Defaults to "date".

TYPE: str DEFAULT: 'date'

file_suffix

The suffix of the CSV file. Defaults to ".csv".

TYPE: str DEFAULT: '.csv'

symbol_field_name

The name of the symbol field in the CSV file. Defaults to "symbol".

TYPE: str DEFAULT: 'symbol'

exclude_fields

The fields to exclude from the dumped data. Defaults to "".

TYPE: str DEFAULT: ''

include_fields

The fields to include in the dumped data. Defaults to "".

TYPE: str DEFAULT: ''

limit_nums

The maximum number of CSV files to process. Defaults to None.

TYPE: int DEFAULT: None

CatBoostModel

CatBoostModel()

CatBoostModel is a wrapper for the CatBoost model.

import finlab.ml.qlib as q

# build X_train, y_train, X_test

model = q.CatBoostModel()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
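
The comment "# build X_train, y_train, X_test" leaves the data preparation open. For time-series data, one common choice is a chronological split, so that the model never trains on dates later than those it is tested on. The sketch below shows the idea on plain (datetime, instrument) rows. The helper split_by_date and the cutoff date are illustrative assumptions; in practice X and y would be the pandas objects produced by the feature and label functions.

```python
from datetime import date
from typing import Dict, List, Tuple

Key = Tuple[date, str]  # (datetime, instrument)


def split_by_date(rows: Dict[Key, List[float]], cutoff: date):
    """Rows dated before `cutoff` go to train, the rest to test."""
    train = {k: v for k, v in rows.items() if k[0] < cutoff}
    test = {k: v for k, v in rows.items() if k[0] >= cutoff}
    return train, test


# Toy feature rows and labels sharing the same (datetime, instrument) keys.
X = {
    (date(2020, 1, 1), "1101"): [0.0, 2.0],
    (date(2020, 1, 2), "1101"): [100.0, 3.0],
    (date(2020, 1, 3), "1101"): [100.0, 4.0],
    (date(2020, 1, 6), "1101"): [60.0, 4.1],
}
y = {key: float(i) for i, key in enumerate(X)}

X_train, X_test = split_by_date(X, cutoff=date(2020, 1, 3))
y_train = {key: y[key] for key in X_train}  # labels aligned to the training rows

print(len(X_train), "training rows,", len(X_test), "test rows")
```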

DEnsmbleModel

DEnsmbleModel()

DEnsmbleModel is a wrapper for the Double Ensemble model.

import finlab.ml.qlib as q

# build X_train, y_train, X_test

model = q.DEnsmbleModel()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

DNNModel

DNNModel()

DNNModel is a wrapper for a deep neural network model.

import finlab.ml.qlib as q

# build X_train, y_train, X_test

model = q.DNNModel()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

LGBModel

LGBModel()

LGBModel is a wrapper for the LightGBM model.

import finlab.ml.qlib as q

# build X_train, y_train, X_test

model = q.LGBModel()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

LinearModel

LinearModel()

LinearModel is a wrapper for a linear model.

import finlab.ml.qlib as q

# build X_train, y_train, X_test

model = q.LinearModel()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

SFMModel

SFMModel()

SFMModel is a wrapper for the SFM model.

import finlab.ml.qlib as q

# build X_train, y_train, X_test

model = q.SFMModel()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

TabnetModel

TabnetModel()

TabnetModel is a wrapper for the TabNet model.

import finlab.ml.qlib as q

# build X_train, y_train, X_test

model = q.TabnetModel()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

XGBModel

XGBModel()

XGBModel is a wrapper for the XGBoost model.

import finlab.ml.qlib as q

# build X_train, y_train, X_test

model = q.XGBModel()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

alpha

alpha(handler='Alpha158', **kwargs)

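Generate Qlib features.

PARAMETER DESCRIPTION
handler

The Qlib feature handler to use. Defaults to 'Alpha158'; it can also be set to 'Alpha360'.

TYPE: str DEFAULT: 'Alpha158'

Examples: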

import finlab.ml.qlib as q
features = q.alpha('Alpha158')

dump

dump(freq='day')

Generate the Qlib database for Taiwan stocks.

Examples:

import qlib
import finlab.ml.qlib as q

q.dump() # generate the Taiwan stock database
q.init() # initialize Qlib for Taiwan stocks to perform machine learning tasks (similar to qlib.init)

# qlib functions and operations

get_models

get_models()

Return a dictionary of available models, keyed by model name.

Examples:

import finlab.ml.qlib as q

models = q.get_models()
print(models)
output:

{ 'LGBModel': LGBModel, 'XGBModel': XGBModel, 'DEnsmbleModel': DEnsmbleModel, 'CatBoostModel': CatBoostModel, 'LinearModel': LinearModel, 'TabnetModel': TabnetModel, 'DNNModel': DNNModel, 'SFMModel': SFMModel}
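
Because every wrapper exposes the same fit and predict methods, a name-to-class dictionary like the one above lends itself to looping over candidate models. The sketch below shows the pattern with two toy stand-in classes so that it runs without finlab or Qlib installed. MeanModel, LastValueModel, and the registry are hypothetical; only the dictionary shape and the fit/predict calls mirror the documentation above.

```python
from typing import Dict, List, Sequence


class MeanModel:
    """Toy stand-in: always predicts the mean of the training labels."""

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[float]) -> None:
        self.value_ = sum(y) / len(y)

    def predict(self, X: Sequence[Sequence[float]]) -> List[float]:
        return [self.value_ for _ in X]


class LastValueModel:
    """Toy stand-in: always predicts the last training label."""

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[float]) -> None:
        self.value_ = y[-1]

    def predict(self, X: Sequence[Sequence[float]]) -> List[float]:
        return [self.value_ for _ in X]


# Same shape as the get_models() output: model name -> model class.
registry = {"MeanModel": MeanModel, "LastValueModel": LastValueModel}

X_train, y_train = [[0.0], [1.0], [2.0]], [1.0, 2.0, 3.0]
X_test = [[3.0], [4.0]]

predictions: Dict[str, List[float]] = {}
for name, model_cls in registry.items():
    model = model_cls()          # every wrapper is constructed without arguments
    model.fit(X_train, y_train)  # shared training interface
    predictions[name] = model.predict(X_test)

print(predictions)
```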

init

init()

Initialize Qlib (a Taiwan-stock counterpart to qlib.init() that is simpler to use).

Examples:

import qlib
import finlab.ml.qlib as q

q.dump() # generate the Taiwan stock database
q.init() # initialize Qlib for Taiwan stocks to perform machine learning tasks (similar to qlib.init)

# qlib functions and operations