HADOOP-PR000007 Online Practice Questions and Answers

Questions 4

You need to perform statistical analysis in your MapReduce job and would like to call methods in the Apache Commons Math library, which is distributed as a 1.3 megabyte Java archive (JAR) file. Which is the best way to make this library available to your MapReduce job at runtime?

A. Have your system administrator copy the JAR to all nodes in the cluster and set its location in the HADOOP_CLASSPATH environment variable before you submit your job.

B. Have your system administrator place the JAR file on a Web server accessible to all cluster nodes and then set the HTTP_JAR_URL environment variable to its location.

C. When submitting the job on the command line, specify the -libjars option followed by the JAR file path.

D. Package your code and the Apache Commons Math library into a zip file named JobJar.zip.

Browse 108 Q&As

Questions 5

Given a directory of files with the following structure: line number, tab character, string.

Example:

1	abialkjfjkaoasdfjksdlkjhqweroij
2	kadfjhuwqounahagtnbvaswslmnbfgy
3	kjfteiomndscxeqalkzhtopedkfsikj

You want to send each line as one record to your Mapper. Which InputFormat should you use to complete the line: conf.setInputFormat(____.class); ?

A. SequenceFileAsTextInputFormat

B. SequenceFileInputFormat

C. KeyValueTextInputFormat

D. BDBInputFormat
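
For context on the record structure described in this question, the following is a minimal plain-Java sketch of what a key/value text input format does with each line: it splits at the first tab, treating the text before it as the key and the rest as the value. This is an illustration only, not Hadoop code; the class name TabSplitSketch and the method split are hypothetical, and the no-tab behavior (whole line as key, empty value) is modeled on Hadoop's KeyValueTextInputFormat.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

public class TabSplitSketch {
    // Splits one input line at the FIRST tab: text before it is the key,
    // text after it is the value. A line with no tab becomes a key with
    // an empty value.
    public static Map.Entry<String, String> split(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            return new SimpleEntry<>(line, "");
        }
        return new SimpleEntry<>(line.substring(0, tab), line.substring(tab + 1));
    }

    public static void main(String[] args) {
        String[] lines = {
            "1\tabialkjfjkaoasdfjksdlkjhqweroij",
            "2\tkadfjhuwqounahagtnbvaswslmnbfgy"
        };
        for (String line : lines) {
            Map.Entry<String, String> kv = split(line);
            System.out.println("key=" + kv.getKey() + " value=" + kv.getValue());
        }
    }
}
```

Each line still arrives as exactly one record; only the way the line is divided into key and value differs from a plain line-oriented text format.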

Questions 6

Which one of the following classes would a Pig command use to store data in a table defined in HCatalog?

A. org.apache.hcatalog.pig.HCatOutputFormat

B. org.apache.hcatalog.pig.HCatStorer

C. No special class is needed for a Pig script to store data in an HCatalog table

D. Pig scripts cannot use an HCatalog table

Questions 7

All keys used for intermediate output from mappers must:

A. Implement a splittable compression algorithm.

B. Be a subclass of FileInputFormat.

C. Implement WritableComparable.

D. Override isSplitable.

E. Implement a comparator for speedy sorting.
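
As background for this question, intermediate keys have to be both serialized (to move between map and reduce tasks) and ordered (for the sort phase). The sketch below shows those two capabilities on a small composite key in plain Java. The interface SerializableKey is a hypothetical stand-in that only mirrors the shape of Hadoop's WritableComparable (write, readFields, compareTo); the YearMonth key and the roundTrip helper are likewise assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class KeySketch {
    // Hypothetical stand-in for the shape of Hadoop's WritableComparable:
    // a key serializes itself, deserializes itself, and defines an ordering.
    public interface SerializableKey<T> extends Comparable<T> {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Example composite key, ordered by year and then by month.
    public static class YearMonth implements SerializableKey<YearMonth> {
        public int year;
        public int month;

        public YearMonth() { }  // no-arg constructor so an instance can be created before readFields
        public YearMonth(int year, int month) { this.year = year; this.month = month; }

        @Override public void write(DataOutput out) throws IOException {
            out.writeInt(year);
            out.writeInt(month);
        }

        @Override public void readFields(DataInput in) throws IOException {
            year = in.readInt();
            month = in.readInt();
        }

        @Override public int compareTo(YearMonth other) {
            int byYear = Integer.compare(year, other.year);
            return byYear != 0 ? byYear : Integer.compare(month, other.month);
        }
    }

    // Serializes a key to bytes and reads it back into a fresh instance.
    public static YearMonth roundTrip(YearMonth key) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            key.write(new DataOutputStream(bytes));
            YearMonth copy = new YearMonth();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        YearMonth copy = roundTrip(new YearMonth(2024, 5));
        System.out.println(copy.year + "-" + copy.month);
        System.out.println(new YearMonth(2023, 12).compareTo(copy) < 0);
    }
}
```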

Questions 8

Which TWO of the following statements are true regarding Hive? Choose 2 answers.

A. Useful for data analysts familiar with SQL who need to do ad-hoc queries

B. Offers real-time queries and row level updates

C. Allows you to define a structure for your unstructured Big Data

D. Is a relational database

Questions 9

Identify the utility that allows you to create and run MapReduce jobs with any executable or script as the mapper and/or the reducer.

A. Oozie

B. Sqoop

C. Flume

D. Hadoop Streaming

E. mapred
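
To make "any executable or script as the mapper" concrete, here is a rough sketch of the kind of program such a job could run as its mapper: it reads lines from standard input and writes tab-separated key/value lines to standard output. The class name StreamingWordMapper and the word-count logic are assumptions chosen for illustration; the sketch is written in Java only for consistency, and any language that reads stdin and writes stdout would serve.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

public class StreamingWordMapper {
    // Turns one input line into zero or more "word<TAB>1" output lines.
    public static List<String> mapLine(String line) {
        List<String> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(word + "\t1");
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // The framework feeds input records on stdin and collects
        // key/value lines from stdout.
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String line;
        while ((line = in.readLine()) != null) {
            for (String record : mapLine(line)) {
                System.out.println(record);
            }
        }
    }
}
```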

Questions 10

Which project gives you a distributed, scalable data store that allows you random, real-time read/write access to hundreds of terabytes of data?

A. HBase

B. Hue

C. Pig

D. Hive

E. Oozie

F. Flume

G. Sqoop

Questions 11

You want to populate an associative array in order to perform a map-side join. You've decided to put this information in a text file, place that file into the DistributedCache and read it in your Mapper before any records are processed.

Identify which method in the Mapper you should use to implement code for reading the file and populating the associative array.

A. combine

B. map

C. init

D. configure
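
The map-side join pattern in this question can be sketched in plain Java as two phases: a one-time step that loads the small data set into an in-memory map, and a per-record step that looks each record up in it. The sketch uses no Hadoop classes. The names MapSideJoinSketch, loadLookup, and map are hypothetical, and the "id, tab, name" and "id, tab, amount" layouts are assumptions made for the example. In a real job the one-time step would live in the Mapper's initialization hook and read the file from the DistributedCache.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

public class MapSideJoinSketch {
    private final Map<String, String> lookup = new HashMap<>();

    // One-time initialization: parse "id<TAB>name" lines into the lookup table.
    // Lines without a tab are skipped.
    public void loadLookup(BufferedReader cachedFile) {
        try {
            String line;
            while ((line = cachedFile.readLine()) != null) {
                int tab = line.indexOf('\t');
                if (tab > 0) {
                    lookup.put(line.substring(0, tab), line.substring(tab + 1));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Per-record work: join "id<TAB>amount" against the table.
    // Returns "id<TAB>name<TAB>amount", or null when the id has no match
    // (inner-join semantics).
    public String map(String record) {
        int tab = record.indexOf('\t');
        if (tab < 0) {
            return null;
        }
        String id = record.substring(0, tab);
        String name = lookup.get(id);
        return name == null ? null : id + "\t" + name + "\t" + record.substring(tab + 1);
    }

    public static void main(String[] args) {
        MapSideJoinSketch mapper = new MapSideJoinSketch();
        mapper.loadLookup(new BufferedReader(new StringReader("1\tAlice\n2\tBob\n")));
        System.out.println(mapper.map("2\t9.50"));
    }
}
```

Loading the table once, before any record is processed, is what keeps the per-record cost down to a single hash lookup.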

Questions 12

What is a SequenceFile?

A. A SequenceFile contains a binary encoding of an arbitrary number of homogeneous writable objects.

B. A SequenceFile contains a binary encoding of an arbitrary number of heterogeneous writable objects.

C. A SequenceFile contains a binary encoding of an arbitrary number of WritableComparable objects, in sorted order.

D. A SequenceFile contains a binary encoding of an arbitrary number of key-value pairs. Each key must be the same type. Each value must be the same type.

Questions 13

Which one of the following statements is true about a Hive-managed table?

A. Records can only be added to the table using the Hive INSERT command.

B. When the table is dropped, the underlying folder in HDFS is deleted.

C. Hive dynamically defines the schema of the table based on the FROM clause of a SELECT query.

D. Hive dynamically defines the schema of the table based on the format of the underlying data.

Exam Code: HADOOP-PR000007

Exam Name: Hortonworks Certified Apache Hadoop 2.0 Developer (Pig and Hive Developer)

Last Update: May 11, 2024

Questions: 108 Q&As
