PWC Interview Questions and Answers for Hadoop

Hola friends 🙂 I am back with another interview experience, this time for PWC. This post covers the PWC interview questions and answers for Hadoop. PWC is a Big 4 accounting firm and a very good company to work with. I met one of my friends this Sunday, and he shared his PWC interview experience with me, including the questions asked of Hadoop programmers.

 

PWC Interview Questions and Answers – Test Pattern

The PWC test paper for Hadoop contains multiple sections: MapReduce programming, Hadoop architecture, Unix scripting, SQL queries, Oozie and Sqoop. So, if you are a Hadoop programmer preparing for a Big 4 firm, read through the following questions and get yourself ready.

Let’s analyze the PWC Interview Questions and Answers for Hadoop section by section:

1. Hadoop Architecture

You must be very clear about the core concepts of Hadoop architecture to answer the questions in this section. The questions are based on how many input splits a file stored in HDFS produces, given the maximum and minimum split size.
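A split-count question of this kind can be worked through with a short sketch. It assumes a single splittable file and the rule used by Hadoop's `FileInputFormat`: the split size is the block size clamped between the minimum and maximum split size, and the last split may be up to 10% larger than the rest. The function names and the sample sizes are illustrative, not taken from the test paper.

```python
MB = 1024 * 1024

def compute_split_size(block_size, min_size, max_size):
    # Clamp the block size between the minimum and maximum split size.
    return max(min_size, min(max_size, block_size))

def count_splits(file_size, split_size, slop=1.1):
    # Carve off full splits while more than `slop` splits' worth of data remains;
    # whatever is left over (if anything) becomes the final split.
    splits = 0
    remaining = file_size
    while remaining / split_size > slop:
        splits += 1
        remaining -= split_size
    if remaining > 0:
        splits += 1
    return splits

# Hypothetical question: 300 MB file, 128 MB blocks, max split size 64 MB.
small_split = compute_split_size(128 * MB, 1, 64 * MB)
print(small_split // MB, count_splits(300 * MB, small_split))

# Same file with the split size left at the block size.
default_split = compute_split_size(128 * MB, 1, 1024 * MB)
print(default_split // MB, count_splits(300 * MB, default_split))
```

Raising the minimum split size above the block size gives fewer, larger splits; lowering the maximum below the block size gives more, smaller ones.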

This section also tests your understanding of the NameNode, for example what happens if the NameNode fails. You will also be asked to write out the whole process that takes place when a client submits a request to the Hadoop system.

2. Map-Reduce Programming Paradigm

This section contains questions such as how many map and reduce tasks a given SQL query will take, so practise questions of that kind. There will also be a programming question in which you are asked to write the code in any one of the following languages: Python, Java or Ruby.
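The post does not say which program was asked, so the sketch below uses word count, the customary MapReduce exercise, as an assumed example. It simulates the map, shuffle and reduce phases in plain Python on a small in-memory input; on a real cluster the framework performs the shuffle and runs the mapper and reducer as separate tasks.

```python
from collections import defaultdict

def map_phase(line):
    # Mapper: emit a (word, 1) pair for every word in the line.
    for word in line.lower().split():
        yield word, 1

def shuffle(pairs):
    # Shuffle: group all values by key, as the framework does between phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    # Reducer: sum the counts collected for one word.
    return key, sum(values)

def word_count(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in sorted(grouped.items()))

lines = ["hadoop stores data", "hadoop processes data"]
counts = word_count(lines)
print(counts)
```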

Another question asks how Hadoop allocates map tasks and reduce tasks.
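As a rough guide for that answer: the number of map tasks normally equals the number of input splits, while the number of reduce tasks is set by the job (for example through the `mapreduce.job.reduces` property). Each key is then routed to one reducer by a partitioner, and the default hash partitioner computes `(hash & Integer.MAX_VALUE) % numReduceTasks`. The sketch below imitates that rule in Python; `zlib.crc32` is only a stand-in for Java's `hashCode()`, so the actual reducer numbers will differ from a real cluster.

```python
import zlib

def partition(key, num_reducers):
    # Stand-in for HashPartitioner: non-negative hash modulo the reducer count.
    key_hash = zlib.crc32(key.encode("utf-8")) & 0x7FFFFFFF
    return key_hash % num_reducers

def assign_keys(keys, num_reducers):
    # Build a reducer -> keys mapping to show how map output is distributed.
    assignment = {reducer: [] for reducer in range(num_reducers)}
    for key in keys:
        assignment[partition(key, num_reducers)].append(key)
    return assignment

keys = ["hadoop", "hive", "sqoop", "oozie", "spark"]
for reducer, assigned in assign_keys(keys, 3).items():
    print(reducer, assigned)
```

The property worth stating in an interview is that the same key always lands on the same reducer, which is what makes per-key aggregation possible.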

3. SQL Query Questions

This section contains questions mainly on joins (inner join, outer join, left join etc.), such as writing the query for the desired join and stating how many rows it returns. There will also be questions where you are asked to write a query to find the nth highest or lowest salary of an employee from a given table.
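Both kinds of question can be practised with Python's built-in `sqlite3` module. The sketch below is one possible approach under assumed data: the `employee` and `dept` tables, their columns and their rows are made up for illustration. It compares the row counts of an inner join and a left join, then finds the nth highest salary with `DISTINCT`, `ORDER BY` and `LIMIT ... OFFSET`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dept (dept_id INTEGER, dept_name TEXT);
CREATE TABLE employee (name TEXT, salary INTEGER, dept_id INTEGER);
INSERT INTO dept VALUES (1, 'Data'), (2, 'Audit');
INSERT INTO employee VALUES
    ('Asha', 900, 1), ('Ravi', 700, 1), ('Meena', 700, 2), ('John', 500, NULL);
""")

# Inner join keeps only employees whose dept_id matches a department.
inner_rows = conn.execute(
    "SELECT e.name, d.dept_name FROM employee e "
    "INNER JOIN dept d ON e.dept_id = d.dept_id"
).fetchall()

# Left join keeps every employee; unmatched ones get NULL for dept_name.
left_rows = conn.execute(
    "SELECT e.name, d.dept_name FROM employee e "
    "LEFT JOIN dept d ON e.dept_id = d.dept_id"
).fetchall()

def nth_highest_salary(connection, n):
    # DISTINCT collapses ties, so n counts distinct salary values.
    row = connection.execute(
        "SELECT DISTINCT salary FROM employee "
        "ORDER BY salary DESC LIMIT 1 OFFSET ?",
        (n - 1,),
    ).fetchone()
    return row[0] if row else None

print(len(inner_rows), len(left_rows))
print(nth_highest_salary(conn, 2))
```

Switching `DESC` to `ASC` in the query gives the nth lowest salary instead.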

4. Hive

This section of the PWC question set contains questions such as how to write a command that sends a task to the background, how to bring that task back to the foreground, and later how to kill that particular task.

5. Unix Commands

This section asks questions such as how to print the nth line of a CSV file, and how to print the nth line where a condition = “some value” holds.
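In the shell, the first question is usually answered with something like `sed -n '3p' file.csv` or `awk 'NR==3' file.csv`. To keep the examples in this post in one language, the same logic is sketched below in Python. The sample data and the column name `city` are made up, and reading the second question as "the nth row among those matching the condition" is an assumption.

```python
import csv
from itertools import islice

def nth_line(lines, n):
    # Return the nth line (1-based), or None if there are fewer than n lines.
    return next(islice(lines, n - 1, n), None)

def nth_matching_row(lines, column, value, n):
    # Return the nth data row (1-based) whose `column` equals `value`.
    reader = csv.DictReader(lines)
    matches = (row for row in reader if row[column] == value)
    return next(islice(matches, n - 1, n), None)

sample = "id,city\n1,Pune\n2,Delhi\n3,Pune\n".splitlines()

print(nth_line(sample, 3))
print(nth_matching_row(sample, "city", "Pune", 2))
```

With a real file, pass the open file object instead of the list, for example `with open("file.csv", newline="") as f: nth_line(f, 3)`.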

6. Sqoop and Oozie

This section covers questions on the basic commands needed to transfer data between an RDBMS and a Hadoop system, and also on transferring data between Hadoop clusters.
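For revision, the two transfers are typically done with `sqoop import` (RDBMS to HDFS) and `hadoop distcp` (cluster to cluster). The sketch below only assembles the command lines in Python and prints them; it does not run anything. The JDBC URL, user name, table, paths and host names are all hypothetical placeholders.

```python
import shlex

def sqoop_import_cmd(jdbc_url, username, table, target_dir, mappers=4):
    # RDBMS -> HDFS import; -P makes Sqoop prompt for the password.
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "-P",
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(mappers),
    ]

def distcp_cmd(source_uri, destination_uri):
    # Cluster-to-cluster copy, which runs as a MapReduce job.
    return ["hadoop", "distcp", source_uri, destination_uri]

import_cmd = sqoop_import_cmd(
    "jdbc:mysql://dbhost/sales",   # hypothetical database
    "etl_user",                    # hypothetical user
    "orders",                      # hypothetical table
    "/data/raw/orders",            # hypothetical HDFS directory
)
copy_cmd = distcp_cmd("hdfs://nn1:8020/data/raw", "hdfs://nn2:8020/data/raw")

print(shlex.join(import_cmd))
print(shlex.join(copy_cmd))
```

The reverse direction, HDFS to RDBMS, uses `sqoop export` with `--export-dir` in place of `--target-dir`.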

The Oozie section contains two questions: one on Oozie workflow configuration and a second on the fork and join parameters.

Hope this article will help future aspirants 🙂 Also, check my next article on the Dunnhumby interview questions.

Please share your comments below.

9 thoughts on “PWC Interview Questions and Answers for Hadoop”

    1. Hello There,

      Allow me to show my gratitude bloggers. You guys are like unicorns. Never seen but always spreading magic
      Your content is yummy. So satisfied.

      I’m trying to write a for loop that stops on either of 2 conditions. For example:

          l1 = ['hello', 'bye', 'now', 'before', 'after']
          count = 3
          for word in l1:
              print(word)

      But I want to stop not only when l1 is finished, but also when count reaches 0, so I do this:

          l1 = ['hello', 'bye', 'now', 'before', 'after']
          count = 3
          for word in l1:
              count -= 1
              print(word)
              if count == 0:
                  break

      Thanks a lot. This was a perfect step-by-step guide. Don’t think it could have been done better.

      Kind Regards,
      Preethi

  1. Hey people, this is what they are asking in the written test. The questions are on the above topics only and are almost the same. Please prepare accordingly. Best of luck!

      1. Hi Ashu, if you are good with split concepts and can code a MapReduce program in Java, then the questions will be easy for you. The test paper comprises a set of questions from different sections to assess your basic and moderate skills. But for the MapReduce program, yes, you have to be good with it. Not sure why candidates are still asked to code a MapReduce program when everything is shifting to Spark now. Hope you find it useful. Good luck for your interview.

  2. Hey Brother,

    You make learning and reading addictive. All eyes fixed on you. Thank you for being such a good and trustworthy guide.
    I have a website that contains buy/sell information for multiple distributors. For each distributor I need to choose an invoice date range, say a period of one month, and download the data.

    I have a user ID and password for each distributor so that I can download each distributor's data separately. This is by design, and we use it as it is because it is a third-party website. The results are displayed in rows and columns, and there is an “Export to” button to download them as Excel/CSV.
    Is there a way to download the files as .csv/.xlsx with my information using Python?

    Can Python read the user ID and password from an XML file (or any prescribed file), log in to the website, navigate to the reports, filter them and then download?

    The reason for this requirement is to automate the process and minimize the manual work, as we have more than 1K distributors.
    Can Python help me?
    Are there any relevant links on doing this kind of thing?
    If so, can they be shared here?

    THANK YOU!! This saved my butt today, I’m immensely grateful.

    Merci Beaucoup,
    Ajeeth Kapoor

  3. Hi There,

    I love all the posts, I really enjoyed.
    I would like more information about this, because it is very nice., Thanks for sharing.

        NUMBEROFPRIMES = 100
        NUMBEROFPRIMESPERLINE = 10
        count = 0
        number = 2
        while count < NUMBEROFPRIMES:
            isPrime = True
            # Trial division: a proper factor of number is at most number // 2.
            for divisor in range(2, number // 2 + 1):
                if number % divisor == 0:
                    isPrime = False
                    break
            if isPrime:
                count = count + 1
                print(number, end=" ")
                if count % NUMBEROFPRIMESPERLINE == 0:
                    print()  # start a new line after every 10 primes
            number = number + 1

    By the way, do you have any YouTube videos? I would love to watch them. I would like to connect with you on LinkedIn; it would be great to have experts like you in my network (if you don’t have any issues with that).
    Please keep providing such valuable information.

    Merci Beaucoup,
    Pavani
