Big Data Engineer

Our Big Data capability team needs hands-on developers who can produce beautiful and functional code to solve complex analytics problems. If you are an exceptional developer with an aptitude for learning and applying new technologies, and you love pushing boundaries to solve complex business problems innovatively, we would like to talk with you.

Position expectations in detail:

  • Solve business problems and develop business solutions: use problem-solving methodologies to propose creative solutions to business problems.
  • Evaluate, develop, maintain and test big data solutions for advanced analytics projects.
  • Big data pre-processing & reporting workflows including collecting, parsing, managing, analyzing and visualizing large sets of data to turn information into business insights
  • Client relationship management: build deep client relationships, network, and be a thought partner. Anticipate business problems and deliver excellent results.
  • Testing various machine learning models on Big Data, and deploying learned models for ongoing scoring and prediction. An appreciation of the mechanics of complex machine learning algorithms would be a strong advantage.

QUALIFICATIONS & EXPERIENCE:

  • 5+ years of demonstrable experience designing technological solutions to complex data problems, developing & testing modular, reusable, efficient and scalable code to implement those solutions.
  • Overall experience of 5 to 7 years, with at least 3 to 4 years of hands-on experience running data analytics projects and managing an offshore-onsite business model
  • Excellent communication, problem-solving, project management and creative thinking skills
  • Experience with big data tools such as Hadoop and Python
  • Expertise in SQL/R
  • Expert-level proficiency in at least one of Java, C++ or Python (preferred). Scala knowledge is a strong advantage.
  • Strong understanding of and experience with distributed computing frameworks, particularly Apache Hadoop 2.0 (YARN, MapReduce and HDFS) and associated technologies: one or more of Hive, Sqoop, Avro, Flume, Oozie, ZooKeeper, etc.
  • Hands-on experience with Apache Spark and its components (Streaming, SQL, MLlib) is a strong advantage.
  • Operating knowledge of cloud computing platforms (AWS, especially EMR, EC2, S3, SWF services and the AWS CLI)
  • Experience working in a Linux environment and using command-line tools, including shell/Python scripting to automate common tasks
  • Ability to work in a team in an agile setting, familiarity with JIRA, and a clear understanding of how Git works
  • In addition, the ideal candidate would have great problem-solving skills, and the ability & confidence to hack their way out of tight corners.

MUST HAVE (HANDS ON) EXPERIENCE:

  • Java or Python or C++ expertise
  • Linux environment and shell scripting
  • Distributed computing frameworks (Hadoop or Spark)
  • Cloud computing platforms (AWS)

DESIRABLE (WOULD BE A PLUS):

  • Distributed and low latency (streaming) application architecture
  • Distributed row-store DBMSs such as Cassandra
  • Familiarity with API design

EDUCATION:

B.E./B.Tech. in Computer Science or a related technical field