Job Description
Batch : 2016 & before
—- What the Candidate Will Do —-
- Responsible for defining the Source of Truth (SOT) and dataset design for multiple Uber teams.
- Identify unified data models in collaboration with Data Science teams.
- Streamline data processing of the original event sources and consolidate them into source-of-truth event logs.
- Build and maintain real-time/batch data pipelines that can consolidate and clean up usage analytics
- Build systems that monitor data losses from the different sources and improve the data quality
- Own the data quality and reliability of Tier-1 & Tier-2 datasets, including maintaining their SLAs, TTLs, and consumption.
- Devise strategies to consolidate and compensate for data losses by correlating different sources.
- Solve challenging data problems with cutting-edge design and algorithms.
—- Basic Qualifications —-
- 7+ years of extensive data engineering experience working with large data volumes and diverse data sources.
- Strong data modeling skills, domain knowledge and domain mapping experience.
- Strong SQL experience, including writing complex queries.
- Experience with other programming languages such as Java, Scala, or Python.
- Good problem-solving and analytical skills.
- Good communication, mentoring, and collaboration skills.
—- Preferred Qualifications —-
- Extensive experience in data engineering and working with big data.
- Experience with ETL or streaming data and one or more of Kafka, HDFS, Apache Spark, Apache Flink, or Hadoop.
- Experience building backend services and familiarity with at least one cloud platform (AWS, Azure, or Google Cloud).