MRMD3.0.md

MRMD3.0 | Spark Version | Chinese

MRMD3.0 uses multiple feature selection methods and integrates the PageRank, LeaderRank, Hits, and TrustRank algorithms.
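To illustrate how several feature rankings might be merged by a link-analysis algorithm, here is a minimal sketch using PageRank. It is not the MRMD3.0 source: the graph construction (each adjacent pair in a ranking adds an edge from the worse feature to the better one), the function names `build_vote_graph` and `pagerank`, and the feature names are assumptions made for illustration.

```python
from collections import defaultdict


def build_vote_graph(rankings):
    """Assumed construction: each ranking lists features best-first, and every
    adjacent pair adds one unit of weight to the edge worse -> better."""
    edges = defaultdict(float)
    for ranking in rankings:
        for better, worse in zip(ranking, ranking[1:]):
            edges[(worse, better)] += 1.0
    return edges


def pagerank(nodes, edges, damping=0.85, iterations=100):
    """Plain power-iteration PageRank on a weighted directed graph."""
    out_weight = {node: 0.0 for node in nodes}
    for (src, _dst), weight in edges.items():
        out_weight[src] += weight
    n = len(nodes)
    score = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Nodes without outgoing edges spread their score uniformly.
        dangling = sum(score[v] for v in nodes if out_weight[v] == 0.0)
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for (src, dst), weight in edges.items():
            new[dst] += damping * score[src] * weight / out_weight[src]
        score = new
    return score


# Three hypothetical rankings, e.g. produced by three different selectors.
rankings = [
    ["f1", "f2", "f3", "f4"],
    ["f1", "f3", "f2", "f4"],
    ["f1", "f2", "f4", "f3"],
]
nodes = sorted({feature for ranking in rankings for feature in ranking})
scores = pagerank(nodes, build_vote_graph(rankings))
final_order = sorted(nodes, key=scores.get, reverse=True)
print(final_order)  # ['f1', 'f2', 'f3', 'f4']
```

Features that many rankings place above others collect more incoming weight and end up with a higher score; the other rank methods (LeaderRank, Hits, TrustRank) would replace only the `pagerank` step in this sketch.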

1. Installation:

Code: github

Environment (if Anaconda is installed, you do not need to run the command below):

pip3 install -r requirements.txt --ignore-installed

2. Parameters:

| Parameter | Description |
|---|---|
| -s, --start | start index, default=1 |
| -i, --inputfile | input file (must be in arff, csv, or libsvm format) |
| -e, --end | end index, default=-1 |
| -l, --length | step length, default=1 |
| -n, --n_dim | top n features (mrmd2.0), default=-1 |
| -t, --type_metric | evaluation metric, default=f1 |
| -m, --metrics_file | name of the output metrics file |
| -o, --outfile | name of the output dimensionality-reduced file |
| -p, --picture | generate t-SNE scatter plots before and after dimensionality reduction, default=false |
| -r, --rank_method | feature ranking method, choices=["PageRank","Hits_a","Hits_h","LeaderRank","TrustRank"], default="PageRank" |
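As a reading aid for the options above, the following sketch mirrors them with `argparse`. It is a hypothetical reconstruction, not the actual MRMD3.0 code: the `build_parser` name, the integer types, treating `-i` as required, and treating `-p` as a boolean switch are all assumptions.

```python
import argparse

RANK_METHODS = ["PageRank", "Hits_a", "Hits_h", "LeaderRank", "TrustRank"]


def build_parser():
    """Hypothetical parser mirroring the documented options and defaults."""
    parser = argparse.ArgumentParser(description="MRMD3.0 options (sketch)")
    parser.add_argument("-s", "--start", type=int, default=1, help="start index")
    # Assumption: the input file is mandatory.
    parser.add_argument("-i", "--inputfile", required=True,
                        help="input file in arff, csv, or libsvm format")
    parser.add_argument("-e", "--end", type=int, default=-1, help="end index")
    parser.add_argument("-l", "--length", type=int, default=1, help="step length")
    parser.add_argument("-n", "--n_dim", type=int, default=-1, help="top n features")
    parser.add_argument("-t", "--type_metric", default="f1", help="evaluation metric")
    parser.add_argument("-m", "--metrics_file", help="output metrics file name")
    parser.add_argument("-o", "--outfile", help="output reduced-dimension file name")
    # Assumption: -p is a flag; the real script may expect an explicit value.
    parser.add_argument("-p", "--picture", action="store_true",
                        help="draw t-SNE scatter plots before and after reduction")
    parser.add_argument("-r", "--rank_method", choices=RANK_METHODS,
                        default="PageRank", help="feature ranking method")
    return parser


args = build_parser().parse_args(
    ["-i", "test.csv", "-o", "out.csv", "-r", "LeaderRank"]
)
print(args.rank_method, args.start, args.end, args.type_metric)
```

Unspecified options fall back to the defaults listed in the table, so only `-i`, `-o`, and `-r` need to be passed in a typical run.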

3. Usage examples

python3  mrmd3.0.py  -i test.csv -o out.csv -r PageRank
python3  mrmd3.0.py  -i test.csv -o out.csv -r LeaderRank
python3  mrmd3.0.py  -i test.csv -o out.csv -r TrustRank
python3  mrmd3.0.py  -i test.csv -o out.csv -r Hits_a
python3  mrmd3.0.py  -i test.csv -o out.csv -r Hits_h

4. Feature selection methods used in mrmd3.0:

| Method | Number of implementations | Implementations |
|---|---|---|
| anova | 1 | f_classif |
| chisquare | 1 | chi2 |
| F value | 1 | f_regression |
| linear model | 3 | Lasso, LogisticRegression, Ridge |
| mutual information | 3 | MI, NMI, MIC |
| mrmd | 3 | pearson + Euclidean/Tanimoto/Cosine |
| mrmr | 2 | miq |
| recursive feature elimination | 5 | LinearSVC, LogisticRegression, RandomForestClassifier, GradientBoostingClassifier, ComplementNB |
| tree_feature_importance | 3 | DecisionTreeClassifier, RandomForestClassifier, GradientBoostingClassifier |
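Several of the names in the table (`f_classif`, `chi2`) match scorers in `sklearn.feature_selection`. The sketch below shows how such scorers could each produce a feature ranking on a small synthetic dataset. It assumes scikit-learn and NumPy are installed, uses `mutual_info_classif` as a stand-in for the "MI" entry (an assumption), and does not reproduce how MRMD3.0 itself calls these functions.

```python
import numpy as np
from sklearn.feature_selection import chi2, f_classif, mutual_info_classif

# Synthetic data: feature 0 tracks the label, features 1 and 2 are noise.
rng = np.random.RandomState(0)
y = np.repeat([0, 1], 50)
X = np.column_stack([
    y + 0.1 * rng.randn(100),
    rng.rand(100),
    rng.rand(100),
])

# chi2 requires non-negative inputs, so shift every column to start at zero.
X_nonneg = X - X.min(axis=0)

scores = {
    "f_classif": f_classif(X, y)[0],          # ANOVA F statistic per feature
    "chi2": chi2(X_nonneg, y)[0],             # chi-square statistic per feature
    "mutual_info": mutual_info_classif(X, y, random_state=0),
}

# Turn each score vector into a ranking of feature indices, best first.
rankings = {name: list(np.argsort(-s)) for name, s in scores.items()}
for name, order in rankings.items():
    print(name, order)
```

Each scorer yields its own ordering of the features; rankings of this kind are the sort of input that a rank method such as PageRank could then merge into a single order.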

Contact: heshida@tju.edu.cn