Data Preparation Component
CREX-D
General configurations
Basic configuration
Sample size - doc.
Distance metric
cosine
euclidean
Vectorize
Process
Preprocessed data
Process Doc2vec
Process TFIDF
Different clustering data
Advanced configuration
n_clustering_processes
n_evaluation_processes
location
indexing
Result folder
Raw data folder
Raw clustering data folder
Sample clusterings
Sampling fitness
RMSE
MinMax
Min samples per clust.
Max sample size
Max iterations
Evaluate
Clustering
K-means | DBSCAN | Agglomerative
Vectorizing
TFIDF | Doc2vec | PACT
Evaluation measure
Hom-Com-V | Silhouette | Co-oc
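The "Hom-Com-V" option presumably stands for homogeneity, completeness, and V-measure; that reading is an assumption, since the form does not spell it out. Under that assumption, the three scores can be sketched in plain Python as follows. The function name `hom_com_v` and its helpers are hypothetical and not part of CREX-D.

```python
import math
from collections import Counter


def _entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def _conditional_entropy(target, given):
    """H(target | given) computed from the joint label counts."""
    n = len(target)
    joint = Counter(zip(target, given))
    given_counts = Counter(given)
    return -sum(
        (c / n) * math.log(c / given_counts[g]) for (_, g), c in joint.items()
    )


def hom_com_v(classes, clusters):
    """Return (homogeneity, completeness, V-measure) for two labelings."""
    h_classes = _entropy(classes)
    h_clusters = _entropy(clusters)
    # Homogeneity: each cluster should contain members of a single class.
    hom = 1.0 if h_classes == 0 else 1.0 - _conditional_entropy(classes, clusters) / h_classes
    # Completeness: each class should end up in a single cluster.
    com = 1.0 if h_clusters == 0 else 1.0 - _conditional_entropy(clusters, classes) / h_clusters
    # V-measure: harmonic mean of the two scores.
    v = 0.0 if hom + com == 0 else 2 * hom * com / (hom + com)
    return hom, com, v
```

Splitting every document into its own cluster keeps homogeneity at 1 but lowers completeness, which is why the two scores are usually read together.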
Clustering algorithms
K-means
K-means configuration
Number of clusters
Use Minibatch
Batch size
More configurations
kmeans_init
kmeans_n_init
kmeans_n_job
kmeans_max_iter
kmeans_verbose
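The K-means fields above resemble the constructor parameters of scikit-learn's `KMeans` and `MiniBatchKMeans`. The sketch below shows one way such form values could be translated into keyword arguments. It assumes that correspondence, which the form itself does not confirm, and the form keys (`n_clusters`, `use_minibatch`, `batch_size`) are hypothetical names for the labelled fields. `kmeans_n_job` is left out on purpose, because recent scikit-learn releases no longer accept `n_jobs` on `KMeans`.

```python
# Hypothetical form keys on the left, assumed scikit-learn parameter names on the right.
KMEANS_FIELDS = {
    "n_clusters": "n_clusters",
    "kmeans_init": "init",
    "kmeans_n_init": "n_init",
    "kmeans_max_iter": "max_iter",
    "kmeans_verbose": "verbose",
}


def kmeans_kwargs(form):
    """Return (estimator name, keyword arguments) for the given form values."""
    # Copy only the fields the user actually filled in.
    kwargs = {
        param: form[field] for field, param in KMEANS_FIELDS.items() if field in form
    }
    if form.get("use_minibatch"):
        # Batch size only applies to the mini-batch variant.
        if "batch_size" in form:
            kwargs["batch_size"] = form["batch_size"]
        return "MiniBatchKMeans", kwargs
    return "KMeans", kwargs
```

Returning the estimator name separately keeps the sketch free of a scikit-learn import; a caller could look the class up in `sklearn.cluster` and pass the dictionary with `**kwargs`.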
DBSCAN
DBSCAN configuration
Minimum points
Epsilon
More configurations
dbscan_algorithm
dbscan_leaf_size
dbscan_p
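To illustrate how "Minimum points" and "Epsilon" interact, here is a minimal plain-Python sketch of the core-point test at the heart of DBSCAN: a point is a core point when at least the minimum number of points, itself included, lie within distance epsilon. The function name `core_points` is hypothetical, and Euclidean distance is assumed.

```python
import math


def core_points(points, eps, min_pts):
    """Return indices of points with at least min_pts neighbours within eps.

    The point itself counts as one of its own neighbours.
    """
    core = []
    for i, p in enumerate(points):
        neighbours = sum(1 for q in points if math.dist(p, q) <= eps)
        if neighbours >= min_pts:
            core.append(i)
    return core
```

Raising epsilon or lowering the minimum makes more points qualify as core points, which tends to merge clusters and leave fewer points labelled as noise.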
Agglomerative
Agglomerative configuration
Number of clusters
Linkage
ward
average
complete
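The linkage choice interacts with the distance metric selected in the general configuration: in scikit-learn's `AgglomerativeClustering`, ward linkage only works with Euclidean distance. The sketch below validates that combination before any clustering runs. It assumes CREX-D delegates to scikit-learn, which the form does not confirm, and the function name is hypothetical. The `metric` keyword follows recent scikit-learn releases; older ones call it `affinity`.

```python
ALLOWED_LINKAGES = {"ward", "average", "complete"}


def agglomerative_kwargs(n_clusters, linkage, metric):
    """Validate the form values and return keyword arguments for the estimator."""
    linkage = linkage.lower()
    if linkage not in ALLOWED_LINKAGES:
        raise ValueError(f"unsupported linkage: {linkage!r}")
    # Ward minimises within-cluster variance, which is defined for Euclidean distance only.
    if linkage == "ward" and metric != "euclidean":
        raise ValueError("ward linkage requires the euclidean metric")
    return {"n_clusters": n_clusters, "linkage": linkage, "metric": metric}
```

With the cosine metric, average or complete linkage remain valid choices.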
Vectorizing algorithms
Doc2vec
Doc2vec configuration
Vector size
Window size
More configurations
doc2vec_dm
doc2vec_alpha
doc2vec_min_alpha
doc2vec_min_count
doc2vec_iter
doc2vec_negative
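The `doc2vec_*` fields look like the parameters of gensim's `Doc2Vec` model. Assuming that correspondence, a small translation table is enough to turn form values into keyword arguments. The form keys `vector_size` and `window_size` are hypothetical names for the two labelled fields. Note that gensim 4 renamed `iter` to `epochs`, so the sketch maps `doc2vec_iter` accordingly.

```python
# Hypothetical form keys on the left, assumed gensim 4 parameter names on the right.
DOC2VEC_FIELDS = {
    "vector_size": "vector_size",
    "window_size": "window",
    "doc2vec_dm": "dm",
    "doc2vec_alpha": "alpha",
    "doc2vec_min_alpha": "min_alpha",
    "doc2vec_min_count": "min_count",
    "doc2vec_iter": "epochs",  # called `iter` in gensim 3.x
    "doc2vec_negative": "negative",
}


def doc2vec_kwargs(form):
    """Keep only recognised fields and rename them to model parameters."""
    return {
        DOC2VEC_FIELDS[field]: value
        for field, value in form.items()
        if field in DOC2VEC_FIELDS
    }
```

The resulting dictionary could be passed as `Doc2Vec(**doc2vec_kwargs(form))` once a tagged corpus is available.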
TFIDF
TFIDF configuration
Vector size
Use PCA
PCA Vector size
PACT
PACT configuration
Use P feature
Use A feature
Use C feature
Generated Configuration
Please select your parameters
Main config
More config
Browse to the CREXD folder and run: