Data Mining for Transportation (Southeast University), China University MOOC: chapter test answers, complete full-score edition

Week 1. Introduction to data mining Test 1

1. Which one is not a description of data mining?

Answer: Appropriate statistical analysis methods to analyze the data collected

2. Which one describes the correct process of knowledge discovery?

Answer: Selection-Preprocessing-Transformation-Data mining-Interpretation/Evaluation

3. Which one does not belong to the KDD process?

Answer: Data description

4. Which one is not a correct alternative name for data mining?

Answer: Data harvesting

5. Which one is not a nominal variable?

Answer: Age

6. Which one is wrong about classification and regression?

Answer: We can construct classification models (functions) without any training examples.

7. Which one is wrong about clustering and outliers?

Answer: Clustering belongs to supervised learning.

8. Which one is wrong about data processing?

Answer: When performing classification, we predict categorical labels, excluding unordered ones.

9. Outlier mining, such as the density-based method, belongs to supervised learning.

Answer: False

10. Support vector machines can be used for classification and regression.

Answer: True

Week 2. Data pre-processing Test 2

1. Which is not a reason we need to preprocess the data?

Answer: To make the results meet our hypothesis

2. Which is not one of the major tasks in data preprocessing?

Answer: Transition

3. How is a new feature space constructed by PCA?

Answer: The new feature space is constructed by eliminating the weak components to reduce the size of the data.
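
As an illustration of this answer, here is a minimal sketch (not taken from the course) that keeps only the strongest principal component of a small 2-D data set and drops the weak one. The function names `top_component` and `project` and the sample data are assumptions made for this example, and power iteration is used here only as a simple stand-in for a full eigendecomposition.

```python
import math

def top_component(rows, iters=200):
    """Return the column means and the leading principal direction."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v

def project(rows, mean, v):
    """Project centered rows onto direction v (one score per row)."""
    return [sum((r[j] - mean[j]) * v[j] for j in range(len(v))) for r in rows]

# Hypothetical data lying close to the line y = 2x.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
mean, v = top_component(data)
scores = project(data, mean, v)  # 2-D points reduced to 1-D scores
```

Each point is now described by one score instead of two coordinates, which is the size reduction the answer refers to.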

4. Which one is wrong about discretization methods?

Answer: Cluster analysis belongs only to top-down splitting.

5. Which one is wrong about equal-width (distance) partitioning and equal-depth (frequency) partitioning?

Answer: The intervals of the former are not equal.
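
To make the contrast concrete, the following sketch (an illustration, not course material) bins the same values both ways. The helper names and the sample values are assumptions, and the equal-width helper assumes the values are not all identical.

```python
def equal_width_bins(values, k):
    """Split the value range into k intervals of identical width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k  # assumes hi > lo
    bins = [[] for _ in range(k)]
    for v in values:
        # The maximum value falls on the upper edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), k - 1)
        bins[idx].append(v)
    return bins

def equal_depth_bins(values, k):
    """Split the sorted values into k bins holding (nearly) equal counts."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[i * n // k:(i + 1) * n // k] for i in range(k)]

values = [4, 8, 9, 15, 21, 21, 24, 25, 26, 28, 29, 34]
width_bins = equal_width_bins(values, 3)
depth_bins = equal_depth_bins(values, 3)
```

Equal-width binning gives intervals of the same length (here 10) but uneven counts, while equal-depth binning gives the same count per bin (here 4) but intervals of different lengths.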

6. Which one is a wrong way to normalize data?

Answer: Simple scaling

7. Which are the right ways to fill in missing values?

Answer: Smart mean;
Probable value;
Ignore

8. Which are the right ways to handle noisy data?

Answer: Regression;
Cluster;
WT (wavelet transform);
Manual

9. Which ones are right about wavelet transforms?

Answer: The DWT decomposes each segment of a time series via the successive use of low-pass and high-pass filtering at appropriate levels;
Wavelet transforms can be used for reducing data and smoothing data.
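
Both statements can be seen in a small sketch using the Haar wavelet. This is an illustrative assumption rather than the course's implementation: it uses unnormalized pairwise averages and differences instead of the orthonormal filters, it requires the signal length to be divisible by 2 raised to the number of levels, and all function names are hypothetical.

```python
def haar_step(signal):
    """One DWT level: pairwise averages (low-pass) and differences (high-pass)."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail

def haar_decompose(signal, levels):
    """Apply haar_step repeatedly to the low-pass output."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details

def haar_reconstruct(approx, details):
    """Invert haar_decompose, starting from the coarsest level."""
    for d in reversed(details):
        approx = [x for a, c in zip(approx, d) for x in (a + c, a - c)]
    return approx

def smooth(signal, levels, threshold):
    """Zero out small detail coefficients, then reconstruct."""
    approx, details = haar_decompose(signal, levels)
    kept = [[c if abs(c) > threshold else 0.0 for c in d] for d in details]
    return haar_reconstruct(approx, kept)

series = [2, 4, 6, 8, 10, 12, 14, 16]
coarse, detail_levels = haar_decompose(series, 2)
restored = haar_reconstruct(coarse, detail_levels)
smoothed = smooth(series, 2, 1.5)
```

Keeping only the coarse coefficients reduces the data, and discarding small detail coefficients before reconstruction smooths it.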

10. Which are the commonly used ways of sampling?

Answer: Simple random sample without replacement;
Simple random sample with replacement;
Stratified sample;
Cluster sample
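
The four schemes can be sketched with Python's standard `random` module. The helper names, the road-type records, and the zone clusters below are hypothetical and serve only to illustrate the differences.

```python
import random

def srswor(population, n, rng):
    """Simple random sample without replacement: no item is drawn twice."""
    return rng.sample(population, n)

def srswr(population, n, rng):
    """Simple random sample with replacement: an item may be drawn again."""
    return [rng.choice(population) for _ in range(n)]

def stratified_sample(population, key, fraction, rng):
    """Draw the same fraction from every stratum defined by key(item)."""
    strata = {}
    for item in population:
        strata.setdefault(key(item), []).append(item)
    sample = []
    for members in strata.values():
        take = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, take))
    return sample

def cluster_sample(clusters, m, rng):
    """Pick m whole clusters at random and keep all of their items."""
    chosen = rng.sample(sorted(clusters), m)
    return [item for name in chosen for item in clusters[name]]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
records = [(i, "highway" if i < 20 else "arterial") for i in range(100)]
clusters = {"zone_a": records[:30], "zone_b": records[30:60], "zone_c": records[60:]}

without_repl = srswor(records, 10, rng)
with_repl = srswr(records, 10, rng)
stratified = stratified_sample(records, lambda r: r[1], 0.1, rng)
clustered = cluster_sample(clusters, 2, rng)
```

Stratified sampling preserves the 20/80 split between the two road types, whereas a simple random sample of the same size may not.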

11. Discretization means dividing the range of a continuous attribute into intervals.

Answer: True

Week 3. Instance based learning Test 3

1. What is the difference between an eager learner and a lazy learner?

Answer: Eager learners generate a model for classification, while lazy learners do not.

2. How do we choose the optimal value for K?

Answer: Cross-validation can be used to determine a good value by using an independent dataset to validate the K values;
Low values for K (like k=1 or k=2) can be noisy and subject to the effect of outliers;
Historically, the optimal K for most datasets has been between 3 and 10.
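
The first two points can be illustrated with a small leave-one-out cross-validation sketch. Leave-one-out is used here as one convenient form of cross-validation, not necessarily the one the course has in mind. The data set is hypothetical and contains one deliberately mislabeled point so that k=1 suffers from the outlier; all function names are assumptions.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k training points closest to query."""
    ranked = sorted(train, key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def loo_accuracy(data, k):
    """Leave-one-out accuracy: predict each point from all the others."""
    hits = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, features, k) == label
    return hits / len(data)

def choose_k(data, candidates):
    """Return the first candidate with the best accuracy, plus all scores."""
    scores = {k: loo_accuracy(data, k) for k in candidates}
    best = max(candidates, key=lambda k: scores[k])
    return best, scores

data = [
    ((1.0, 1.0), "A"), ((1.1, 1.0), "A"), ((1.0, 1.2), "A"),
    ((0.8, 0.9), "A"), ((1.3, 1.1), "A"),
    ((1.05, 1.05), "B"),  # mislabeled outlier inside cluster A
    ((5.0, 5.0), "B"), ((5.1, 4.9), "B"), ((4.9, 5.2), "B"),
    ((5.2, 5.1), "B"), ((4.8, 4.8), "B"),
]
best_k, scores = choose_k(data, [1, 3, 5])
```

With k=1 the outlier drags down several of its neighbors, while k=3 outvotes it.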

3. What are the major components of KNN?

Answer: How to measure similarity?;
How to choose “k”?;
How are class labels assigned?
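
A minimal KNN sketch shows where each of the three components appears. The choices made here (Euclidean distance for similarity, a caller-supplied k, and an unweighted majority vote for label assignment) are common defaults assumed for illustration, and the traffic-state data is hypothetical.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Component 1: similarity measured as Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train, query, k, distance=euclidean):
    # Component 2: k is supplied by the caller.
    neighbors = sorted(train, key=lambda pair: distance(pair[0], query))[:k]
    # Component 3: the label is assigned by an unweighted majority vote.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

train = [
    ((1.0, 1.0), "free"), ((1.2, 0.8), "free"), ((0.9, 1.1), "free"),
    ((5.0, 5.0), "congested"), ((5.2, 4.9), "congested"), ((4.8, 5.1), "congested"),
]
near_free = knn_predict(train, (1.1, 1.0), 3)
near_congested = knn_predict(train, (5.0, 4.8), 3)
```

Nothing is fitted in advance: all the work happens when a query arrives, which is why KNN is called a lazy learner.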

4. Which of the following ways can be used to obtain attribute weights for attribute-weighted KNN?

Answer: Prior knowledge / experience;
PCA, FA (factor analysis);
Information gain;
Gradient descent, simplex methods, and genetic algorithms.
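
Whichever method produces the weights, they enter KNN through the distance function. The sketch below assumes a weighted Euclidean distance and weights set by prior knowledge; the data (mean speed plus an irrelevant detector ID) and the function names are hypothetical.

```python
import math
from collections import Counter

def weighted_distance(a, b, weights):
    """Euclidean distance with one non-negative weight per attribute."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def weighted_knn(train, query, k, weights):
    """Majority vote among the k nearest points under the weighted distance."""
    ranked = sorted(train, key=lambda pair: weighted_distance(pair[0], query, weights))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Features: (mean speed in km/h, detector ID). The ID carries no class information.
train = [
    ((30.0, 9000.0), "congested"), ((35.0, 1000.0), "congested"),
    ((32.0, 5000.0), "congested"), ((90.0, 1200.0), "free"),
    ((95.0, 8800.0), "free"), ((88.0, 4500.0), "free"),
]
query = (33.0, 4600.0)

unweighted = weighted_knn(train, query, 3, (1.0, 1.0))
weighted = weighted_knn(train, query, 3, (1.0, 0.0))  # prior knowledge: ignore the ID
```

With equal weights the large-valued but irrelevant ID dominates the distance and the slow query is voted "free". Setting its weight to zero lets speed decide, and the query is classified as "congested".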

5. At the learning stage, KNN finds the K closest neighbors and then assigns the class from the labels of those K neighbors.

Answer: False
