Free CCA175 Questions for Cloudera CCA175 Exam as PDF & Practice Test Software

Page:    1 / 14   
Total 96 questions

Question 1

Problem Scenario 71 :

Write a Spark script in Python that does the following.

Read the file "Content.txt" (on HDFS), whose content is shown below.

Split each row into a (key, value) pair, where the key is the first word of the line and the value is the entire line.

Filter out the empty lines.

Save these key-value pairs to "problem86" as a sequence file (on HDFS).

Part 2 : Save the data as a sequence file where the key is null and the value is the entire line. Then read back the stored sequence files.

Content.txt

Hello this is ABCTECH.com

This is XYZTECH.com

Apache Spark Training

This is Spark Learning Session

Spark is faster than MapReduce



Answer : A
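
The steps in this scenario can be sketched in PySpark roughly as follows. This is a minimal sketch, not the official answer: it assumes an existing SparkContext is passed in as `sc`, the helper names (`is_non_empty`, `to_pair`, `to_null_keyed`, `run`) are hypothetical, and the Part 2 output directory "problem86_1" is an assumed name that does not appear in the question.

```python
def is_non_empty(line):
    """True for lines that contain at least one non-whitespace character."""
    return len(line.strip()) > 0


def to_pair(line):
    """Return (first word, entire line) for a non-empty line."""
    return (line.split()[0], line)


def to_null_keyed(line):
    """Return (None, entire line), used for Part 2."""
    return (None, line)


def run(sc, src="Content.txt", dest="problem86", dest_null="problem86_1"):
    # `sc` is an existing SparkContext; "problem86_1" is an assumed name.
    lines = sc.textFile(src).filter(is_non_empty)

    # Part 1: key is the first word, value is the whole line.
    lines.map(to_pair).saveAsSequenceFile(dest)

    # Part 2: null key, whole line as value.
    lines.map(to_null_keyed).saveAsSequenceFile(dest_null)

    # Read both sequence files back as lists of (key, value) pairs.
    return sc.sequenceFile(dest).collect(), sc.sequenceFile(dest_null).collect()
```

In the pyspark shell, where `sc` is already defined, this would be invoked as `run(sc)`. The empty-line filter is applied before the split so that `to_pair` never sees a line without a first word.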


Question 2

Problem Scenario 75 : You have been given a MySQL DB with the following details.

user=retail_dba

password=cloudera

database=retail_db

table=retail_db.orders

table=retail_db.order_items

jdbc URL = jdbc:mysql://quickstart:3306/retail_db

Please accomplish the following activities.

1. Copy the "retail_db.order_items" table to HDFS, into the directory p90_order_items.

2. Compute the total revenue in this table using pyspark.

3. Find the maximum and minimum revenue as well.

4. Calculate the average revenue.

Columns of order_items table : (order_item_id , order_item_order_id , order_item_product_id, order_item_quantity, order_item_subtotal, order_item_product_price)



Answer : B
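
Steps 2 to 4 might look like the sketch below. It assumes the table has already been imported into p90_order_items as comma-delimited text (for example with Sqoop), that `sc` is an existing SparkContext, and that "revenue" means the order_item_subtotal column (the fifth field). The function names are hypothetical.

```python
def subtotal(line):
    """Extract order_item_subtotal (fifth column) from one comma-delimited row."""
    return float(line.split(",")[4])


def revenue_stats(sc, path="p90_order_items"):
    # `sc` is an existing SparkContext; the path is the directory from step 1.
    revenues = sc.textFile(path).map(subtotal).cache()

    total = revenues.sum()
    count = revenues.count()
    return {
        "total": total,
        "max": revenues.max(),
        "min": revenues.min(),
        "avg": total / count,
    }
```

Caching the parsed RDD avoids re-reading and re-parsing the files for each of the four actions.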


Question 3

Problem Scenario 78 : You have been given a MySQL DB with the following details.

user=retail_dba

password=cloudera

database=retail_db

table=retail_db.orders

table=retail_db.order_items

jdbc URL = jdbc:mysql://quickstart:3306/retail_db

Columns of orders table : (order_id , order_date , order_customer_id, order_status)

Columns of order_items table : (order_item_id , order_item_order_id , order_item_product_id, order_item_quantity, order_item_subtotal, order_item_product_price)

Please accomplish the following activities.

1. Copy the "retail_db.orders" and "retail_db.order_items" tables to HDFS, into the directories p92_orders and p92_order_items respectively.

2. Join these data sets on order_id using Spark and Python.

3. Calculate the total revenue per day and per customer.

4. Find the customer with the maximum revenue.



Answer : B
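
A possible shape for steps 2 to 4 is sketched below. It assumes both tables are already in HDFS as comma-delimited text, that `sc` is an existing SparkContext, and that "maximum revenue customer" means the customer with the highest total order_item_subtotal across all orders; that reading is an assumption. All function names are hypothetical.

```python
from operator import add


def order_kv(line):
    """orders row -> (order_id, (order_date, order_customer_id))."""
    f = line.split(",")
    return (int(f[0]), (f[1], int(f[2])))


def item_kv(line):
    """order_items row -> (order_item_order_id, order_item_subtotal)."""
    f = line.split(",")
    return (int(f[1]), float(f[4]))


def day_customer_revenue(joined):
    """Joined record -> ((order_date, customer_id), subtotal)."""
    _, ((order_date, customer_id), amount) = joined
    return ((order_date, customer_id), amount)


def customer_revenue(joined):
    """Joined record -> (customer_id, subtotal)."""
    _, ((_, customer_id), amount) = joined
    return (customer_id, amount)


def run(sc, orders_path="p92_orders", items_path="p92_order_items"):
    orders = sc.textFile(orders_path).map(order_kv)
    items = sc.textFile(items_path).map(item_kv)

    # join() yields (order_id, ((order_date, customer_id), subtotal)).
    joined = orders.join(items)

    per_day_customer = joined.map(day_customer_revenue).reduceByKey(add)
    per_customer = joined.map(customer_revenue).reduceByKey(add)

    # Customer with the largest summed revenue, as (customer_id, total).
    top_customer = per_customer.max(key=lambda kv: kv[1])
    return per_day_customer, top_customer
```

Keying both RDDs by order_id before the join is what lets `join()` pair each order with its items; the two `reduceByKey` calls then re-key the same joined data for the two different aggregations.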


Question 4

Problem Scenario 79 : You have been given a MySQL DB with the following details.

user=retail_dba

password=cloudera

database=retail_db

table=retail_db.orders

table=retail_db.order_items

jdbc URL = jdbc:mysql://quickstart:3306/retail_db

Columns of products table : (product_id | product_category_id | product_name | product_description | product_price | product_image )

Please accomplish the following activities.

1. Copy the "retail_db.products" table to HDFS, into the directory p93_products.

2. Filter out all rows with an empty price.

3. Sort all the products by price, in both ascending and descending order.

4. Sort all the products by price and product_id, in descending order.

5. Use the functions below to order or rank the data and fetch the top 10 elements: top(), takeOrdered(), sortByKey()



Answer : A
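
Steps 2 to 5 could be sketched as below. The sketch assumes the products table is already in p93_products as comma-delimited text with no commas embedded inside any field (a naive `split(",")` breaks otherwise), that product_price is the fifth column, and that `sc` is an existing SparkContext. The function names are hypothetical.

```python
def has_price(line):
    """True when the product_price column (fifth field) is not empty."""
    return line.split(",")[4] != ""


def price_of(line):
    """product_price of one row, as a float."""
    return float(line.split(",")[4])


def price_key(line):
    """Row -> (price, row), for sorting by price."""
    return (price_of(line), line)


def price_id_key(line):
    """Row -> ((price, product_id), row), for sorting by price then id."""
    return ((price_of(line), int(line.split(",")[0])), line)


def run(sc, path="p93_products"):
    products = sc.textFile(path).filter(has_price)

    # Step 3: sort by price, ascending and descending.
    by_price = products.map(price_key)
    ascending = by_price.sortByKey()
    descending = by_price.sortByKey(ascending=False)

    # Step 4: sort by (price, product_id), descending.
    by_price_id_desc = products.map(price_id_key).sortByKey(ascending=False)

    # Step 5: ten most expensive via top(), ten cheapest via takeOrdered().
    most_expensive = products.top(10, key=price_of)
    cheapest = products.takeOrdered(10, key=price_of)
    return ascending, descending, by_price_id_desc, most_expensive, cheapest
```

`top()` and `takeOrdered()` return plain Python lists to the driver, while `sortByKey()` returns a sorted RDD, so the first two suit "fetch the top 10" and the last suits a full ordering.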


Question 5

Problem Scenario 92 : You have been given a Spark Scala application, which is bundled in a jar named hadoopexam.jar.

Your application class name is com.hadoopexam.MyTask

You want the driver to be launched on one of the cluster nodes when the application is submitted.

Please complete the following command to submit the application.

spark-submit XXX --master yarn \
YYY $SPARK_HOME/lib/hadoopexam.jar 10



Answer : B
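
One way to read the placeholders: XXX names the application class and YYY selects cluster deploy mode, since cluster mode is what places the driver on a cluster node. The sketch below builds that command as an argument list in Python. The function name `build_submit_command` and its `spark_home` parameter are hypothetical, and the mapping of XXX and YYY is an interpretation, not taken from the answer options.

```python
import os


def build_submit_command(spark_home):
    """Assemble the spark-submit argument list for the scenario above."""
    return [
        "spark-submit",
        "--class", "com.hadoopexam.MyTask",   # assumed value of XXX
        "--master", "yarn",
        "--deploy-mode", "cluster",           # assumed value of YYY
        os.path.join(spark_home, "lib", "hadoopexam.jar"),
        "10",                                 # application argument
    ]
```

The resulting list could be passed to `subprocess.run`, with `spark_home` taken from the SPARK_HOME environment variable.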

