MRMD3.0 uses multiple feature selection methods and integrates the PageRank, LeaderRank, HITS, and TrustRank algorithms.
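To illustrate how a link-analysis algorithm can merge several feature rankings into one, here is a minimal pure-Python sketch. It is an assumption about the general idea, not MRMD3.0's actual implementation. The graph construction (each feature "votes" for the feature ranked directly above it), the function names `build_graph` and `pagerank`, and the toy rankings are all hypothetical.

```python
from collections import defaultdict


def build_graph(rankings):
    """Build a weighted directed graph from several rankings (best feature first).

    Assumed construction: for each adjacent pair in a ranking, the
    lower-ranked feature gets an edge pointing to the higher-ranked one.
    """
    edges = defaultdict(lambda: defaultdict(float))
    nodes = set()
    for ranking in rankings:
        nodes.update(ranking)
        for better, worse in zip(ranking, ranking[1:]):
            edges[worse][better] += 1.0  # "worse" votes for "better"
    return sorted(nodes), edges


def pagerank(nodes, edges, damping=0.85, iterations=100):
    """Plain power-iteration PageRank over the weighted graph."""
    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new = {v: (1.0 - damping) / n for v in nodes}
        for src in nodes:
            out = edges.get(src, {})
            total = sum(out.values())
            if total == 0:
                # Dangling node: spread its score evenly over all nodes.
                for v in nodes:
                    new[v] += damping * score[src] / n
            else:
                for dst, weight in out.items():
                    new[dst] += damping * score[src] * weight / total
        score = new
    return score


# Toy rankings from three hypothetical feature selection methods.
rankings = [
    ["f1", "f2", "f3", "f4"],
    ["f2", "f1", "f3", "f4"],
    ["f1", "f3", "f2", "f4"],
]
nodes, edges = build_graph(rankings)
scores = pagerank(nodes, edges)
consensus = sorted(nodes, key=lambda v: scores[v], reverse=True)
print(consensus)
```

The consensus order depends heavily on how edges are built, so a different construction (for example, edges between all pairs rather than adjacent pairs) may yield a different ranking.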
Code: GitHub
Environment (if Anaconda is installed, the command below is not needed):
pip3 install -r requirements.txt --ignore-installed
| parameters | description |
|---|---|
| -s, --start | start index, default=1 |
| -i, --inputfile | input file (must be in arff, csv, or libsvm format) |
| -e, --end | end index, default=-1 |
| -l, --length | step length, default=1 |
| -n, --n_dim | top n mrmd2.0 features, default=-1 |
| -t, --type_metric | evaluation metric, default=f1 |
| -m, --metrics_file | name of the output metrics file |
| -o, --outfile | name of the output file after dimensionality reduction |
| -p, --picture | generate t-SNE scatter plots before and after dimensionality reduction, default=false |
| -r, --rank_method | ranking method for features, choices=["PageRank", "Hits_a", "Hits_h", "LeaderRank", "TrustRank"], default="PageRank" |
python3 mrmd3.0.py -i test.csv -o out.csv -r PageRank
python3 mrmd3.0.py -i test.csv -o out.csv -r LeaderRank
python3 mrmd3.0.py -i test.csv -o out.csv -r TrustRank
python3 mrmd3.0.py -i test.csv -o out.csv -r Hits_a
python3 mrmd3.0.py -i test.csv -o out.csv -r Hits_h
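The commands above differ only in the `-r` value, so they can be driven from a small script. The sketch below is one way to do that, assuming `mrmd3.0.py` is in the current directory. The helper names `build_command` and `run_all` and the `out_<method>.csv` naming scheme are hypothetical and not part of MRMD3.0.

```python
import subprocess
import sys

# Rank methods listed for the -r option.
RANK_METHODS = ["PageRank", "LeaderRank", "TrustRank", "Hits_a", "Hits_h"]


def build_command(rank_method, input_file="test.csv"):
    """Return the argument list for one run (output file name is an assumption)."""
    output_file = f"out_{rank_method}.csv"
    return [
        sys.executable, "mrmd3.0.py",
        "-i", input_file,
        "-o", output_file,
        "-r", rank_method,
    ]


def run_all(input_file="test.csv"):
    """Run the script once per rank method, stopping at the first failure."""
    for method in RANK_METHODS:
        subprocess.run(build_command(method, input_file), check=True)


# Call run_all("test.csv") from the directory that contains mrmd3.0.py.
```

Writing each result to its own file keeps the outputs of the five rank methods from overwriting one another, which makes them easier to compare.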
| method | number of implementations and their names |
|---|---|
| anova | *1 f_classif |
| chisquare | *1 chi2 |
| F value | *1 f_regression |
| linear model | *3 Lasso, LogisticRegression, Ridge |
| mutual information | *3 MI, NMI, MIC |
| mrmd | *3 pearson + Euclidean/Tanimoto/Cosine |
| mrmr | *2 miq |
| recursive feature elimination | *5 LinearSVC, LogisticRegression, RandomForestClassifier, GradientBoostingClassifier, ComplementNB |
| tree_feature_importance | *3 DecisionTreeClassifier, RandomForestClassifier, GradientBoostingClassifier |
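As a rough illustration of the "pearson + Euclidean" combination in the mrmd row, the sketch below scores each feature by its absolute Pearson correlation with the label (relevance) plus its mean Euclidean distance to the other features (distance). The unweighted sum, the function names, and the toy data are assumptions made for this example. The actual MRMD3.0 code may normalize or weight these terms differently.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y) if std_x and std_y else 0.0


def euclidean(x, y):
    """Euclidean distance between two feature columns."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def mrmd_scores(features, labels):
    """Score = |Pearson(feature, label)| + mean distance to the other features.

    `features` maps a feature name to its column of values.
    The plain sum of both terms is an assumption of this sketch.
    """
    names = list(features)
    scores = {}
    for name in names:
        relevance = abs(pearson(features[name], labels))
        others = [euclidean(features[name], features[o]) for o in names if o != name]
        distance = sum(others) / len(others) if others else 0.0
        scores[name] = relevance + distance
    return scores


# Toy data: three hypothetical features and a binary label.
features = {
    "a": [0.0, 0.1, 0.9, 1.0],
    "b": [0.0, 0.2, 0.8, 1.0],
    "c": [0.5, 0.4, 0.6, 0.5],
}
labels = [0, 0, 1, 1]
scores = mrmd_scores(features, labels)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Because the distance term is not rescaled here, a feature that is far from the others can outrank a feature that correlates more strongly with the label.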
Contact: heshida@tju.edu.cn