Advanced Data Mining - Quiz 3
BITS WILP - M.Tech Software Systems - 2017
Question 1
What are stop words in a text document?
Select one:
a. Less commonly used words that have high information
b. Punctuation and special symbols
c. Most commonly used words in a text that contribute to no information
d. Designated keywords
answer : Most commonly used words in a text that contribute to no information
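Note: a minimal sketch of stop-word removal, to make the idea concrete. The STOP_WORDS set and the remove_stop_words helper are hypothetical and only for illustration; real stop-word lists are much longer.

```python
# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"the", "is", "a", "of", "in", "and", "to"}

def remove_stop_words(text):
    """Lowercase the text, split on whitespace, and drop stop words."""
    return [tok for tok in text.lower().split() if tok not in STOP_WORDS]

print(remove_stop_words("The cat is in the garden"))  # ['cat', 'garden']
```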
Question 2
Consider tree mining in a large tree database. Which of the following kinds of support does NOT maintain the anti-monotone property?
Select one:
a. Hybrid
b. Occurrence-Based Correct
c. Transactional-Based
answer : Occurrence-Based
Question 3
Consider the Gibson, David, Kumar algorithm for determining dense subgraphs in a massive graph. Choose the most suitable statement about fingerprinting in this algorithm.
Select one:
a. Fingerprinting is parameter independent
b. Fingerprinting helps to compute Jaccard Coefficient
c. Fingerprinting can be avoided
d. Fingerprinting reduces comparison time Correct
answer : Fingerprinting reduces comparison time
Question 4
Consider the representation of text data using features. Which one of the following is typically expected to work best?
Select one:
a. tfidf Correct
b. term document index matrix
c. word wise sorted array
d. term document count matrix
answer : tfidf
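Note: a rough sketch of how tf-idf weights could be computed, assuming tf = count / document length and idf = log(N / df). Other variants (smoothed idf, log-scaled tf) are equally common; the tfidf function and the sample documents are illustrative, not taken from the course material.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumes tf = count / doc length and idf = log(N / df).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

docs = ["data mining is fun", "text mining is useful", "graph mining"]
w = tfidf(docs)
# "mining" occurs in every document, so its idf (and weight) is 0,
# while a term unique to one document gets a positive weight.
print(w[0]["mining"], w[0]["data"] > 0)
```

Unlike a raw count matrix, this weighting pushes terms that appear everywhere toward zero, which is the usual argument for preferring it.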
Question 5
Which of the following statements are true with respect to hyperlink-induced topic search (HITS)?
Select one or more:
a. Base set contains both the hub and authoritative pages. Correct
b. Authoritative pages cannot point to other pages in the network
c. HUB pages contain data that is searched by user Incorrect
d. Page-rank can only efficiently be defined recursively.
answer : Page-rank can only efficiently be defined recursively., Base set contains both the hub and authoritative pages.
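Note: a small sketch of the mutually recursive hub/authority update used in HITS, assuming the base set is given as a list of directed edges. The hits function, the node names, and the fixed iteration count are illustrative assumptions.

```python
def hits(edges, iterations=20):
    """Iteratively compute hub and authority scores for a directed edge list."""
    nodes = {n for edge in edges for n in edge}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority score: sum of hub scores of pages linking to the node.
        auth = {n: sum(hub[s] for s, d in edges if d == n) for n in nodes}
        norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        auth = {n: v / norm for n, v in auth.items()}
        # Hub score: sum of authority scores of pages the node links to.
        hub = {n: sum(auth[d] for s, d in edges if s == n) for n in nodes}
        norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        hub = {n: v / norm for n, v in hub.items()}
    return hub, auth

# Hypothetical base set: H1 and H2 link out, A1 and A2 are linked to.
edges = [("H1", "A1"), ("H1", "A2"), ("H2", "A1")]
hub, auth = hits(edges)
print(max(hub, key=hub.get), max(auth, key=auth.get))  # H1 A1
```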
Question 6
Which one of the following is the worst representation of a tree database?
Select one:
a. DFS
b. Linked List
c. Flat File (Text form) Correct
d. BFS
answer : Flat File (Text form)
Question 7
Consider a social network graph in which nodes represent people and a directed edge represents "following" (A is following B). Which of the statements below is true in general?
Select one:
a. Higher out-degree of a node represents influential person
b. Higher in-degree of a node represents influential person Correct
c. Both A and B
d. None of the above
answer : Higher in-degree of a node represents influential person
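Note: a quick sketch of counting in-degree (followers) and out-degree (accounts followed) from an edge list. The edges below are made up for illustration.

```python
from collections import Counter

# Hypothetical "follows" edges as (follower, followed) pairs.
edges = [("A", "B"), ("C", "B"), ("D", "B"), ("B", "A"), ("D", "A")]

in_degree = Counter(dst for _, dst in edges)   # number of followers
out_degree = Counter(src for src, _ in edges)  # number of accounts followed

most_followed = max(in_degree, key=in_degree.get)
print(most_followed, in_degree[most_followed])  # B 3
```

Here D follows the most accounts (out-degree 2) but has no followers, which is why out-degree alone says little about influence.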
Question 8
Consider the following graph.
Determine the similarity between nodes A and B using the Jaccard coefficient.
Select one:
a. 2/6
b. 2/7 Correct
c. 3/8
d. 2
answer : 2/7
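Note: the Jaccard coefficient of two nodes is the size of the intersection of their neighbour sets divided by the size of the union. The graph from the quiz is not reproduced here, so the neighbour sets below are hypothetical, chosen only so that they share 2 neighbours out of 7 distinct ones.

```python
from fractions import Fraction

def jaccard(a, b):
    """Jaccard coefficient |a & b| / |a | b| of two neighbour sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return Fraction(0)
    return Fraction(len(a & b), len(union))

# Hypothetical neighbour sets (NOT the graph from the quiz).
neighbours_a = {"C", "D", "E", "F"}
neighbours_b = {"C", "D", "G", "H", "I"}
print(jaccard(neighbours_a, neighbours_b))  # 2/7
```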
Question 9
Extensible Markup Language (XML) is a format that stores data in which of the following forms?
Select one:
a. Sequential
b. Relational
c. Unstructured
d. Hierarchical Correct
answer : Hierarchical
Question 10
Consider a graph representing a social network where nodes represent persons and edges represent friendships. Now consider a 2D matrix A of integers where A[i,j] represents the length of the path between nodes i and j. Let d be the maximum value in the matrix A.
Social networks are dynamic in nature. When the number of nodes in a social network graph increases, what is the expected effect on the value of d?
Select one:
a. Value of d is expected to increase
b. Value of d is expected to decrease Correct
c. Value of d is independent of this change
d. None of the above.
answer : Value of d is expected to decrease
Question 11
A system has denied 5 genuine authentication attempts (the right person wanting access) out of 20, and has allowed 5 impostor attempts (an attacker wanting access to the system) out of 15. Its accuracy is
Select one:
a. (20/35)*100 %
b. (10/35)*100 %
c. (15/35)*100 %
d. (25/35)*100 % Correct
answer : (25/35)*100 %
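Note: the arithmetic behind this answer, written out as a short sketch. Correct decisions are the genuine attempts that were accepted plus the impostor attempts that were rejected; the variable names are illustrative.

```python
genuine_attempts, falsely_rejected = 20, 5
impostor_attempts, falsely_accepted = 15, 5

# Correct decisions: genuine users accepted + impostors rejected.
correct = (genuine_attempts - falsely_rejected) + (impostor_attempts - falsely_accepted)
total = genuine_attempts + impostor_attempts
accuracy = correct / total

print(f"{correct}/{total} = {accuracy:.1%}")  # 25/35 = 71.4%
```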
Question 12
The PK-Means algorithm can sometimes produce non-optimal clustering
Select one or more:
a. Because of the dependence on initial choice of centroids Correct
b. Because of the dependence on processing done at combiner
c. Because of the dependence on distribution of data on map machines Incorrect
d. Because of the dependence on processing done at reducer
answer : Because of the dependence on initial choice of centroids
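Note: a toy sketch of the dependence on initial centroids. It uses plain sequential k-means on 1-D points, not the MapReduce-based PK-Means itself; the kmeans_1d helper, the data, and the two initialisations are assumptions made for illustration.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Plain Lloyd-style k-means on 1-D points; returns (centroids, SSE)."""
    centroids = list(centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its nearest centroid.
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Recompute centroids; an empty cluster keeps its old centroid.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    sse = sum(min((p - c) ** 2 for c in centroids) for p in points)
    return centroids, sse

points = [0, 1, 10, 11, 20, 21]
good_centroids, good_sse = kmeans_1d(points, [0, 10, 20])
bad_centroids, bad_sse = kmeans_1d(points, [0, 1, 15])
print(good_sse, bad_sse)  # the second initialisation gets stuck with a much larger SSE
```

Both runs use the same data and the same k; only the starting centroids differ, yet the second run converges to a clearly worse local optimum.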
Question 13
What is the support of the sequence <{1}{3,4}> in the following database?
D= <{2,3}{1,3}{3,4,5}>,<{2,5}{1}{3,4,5}>,<{1,5}{2,5}{3}{3,5}{4}>,<{1,5}{2,3,5}{3,4}{1,4,6}>
Select one:
a. 50%
b. 45%
c. 90%
d. 75% Correct
answer : 75%
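Note: a short sketch that checks this answer. A data sequence supports the pattern if each pattern itemset is contained in a distinct itemset of the sequence, in order. The contains helper is illustrative; D is the database from the question.

```python
def contains(sequence, pattern):
    """True if pattern is a subsequence of sequence (itemset containment, order kept)."""
    pos = 0
    for itemset in sequence:
        # Greedily match the next pattern itemset against the current itemset.
        if pos < len(pattern) and pattern[pos] <= itemset:
            pos += 1
    return pos == len(pattern)

D = [
    [{2, 3}, {1, 3}, {3, 4, 5}],
    [{2, 5}, {1}, {3, 4, 5}],
    [{1, 5}, {2, 5}, {3}, {3, 5}, {4}],
    [{1, 5}, {2, 3, 5}, {3, 4}, {1, 4, 6}],
]
pattern = [{1}, {3, 4}]

matches = [contains(seq, pattern) for seq in D]
support = sum(matches) / len(D)
print(matches, support)  # [True, True, False, True] 0.75
```

The third sequence fails because 3 and 4 never occur together in a single itemset after {1}.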
Question 14
Consider the following architecture of a parallel crawler.
Which part is responsible for implementing the freshness property?
Select one:
a. URL Frontier Correct
b. Parse
c. URL Filter
d. Host Splitter
answer : URL Frontier
Question 15
Handling of Big Data is challenging because of
Select one:
a. Large number of data points or Volume
b. Data may be from various sources and could have different formatting
c. Data may be continuously arriving that makes processing difficult
d. All of the above Correct
answer : All of the above