Principal Data Engineer
Work Location: Bangalore
Reports To: Head of Data Science and Engineering
Rapido is India’s largest bike-taxi player, focused on solving the first- and last-mile connectivity problem for India. Our primary focus is mobility, and we aim to change all facets of it across the country.
We believe that two-wheelers are the right mode of transport for developing countries like India and have much more scope than four-wheelers, as reflected in the fact that two-wheelers significantly outnumber four-wheelers.
We have operations in close to 100 cities and are the undisputed market leader in this space. Growing close to 500% year-on-year, we have ambitious targets set for ourselves in the future as well.
Role and Responsibilities:
- Architect complex data-driven systems involving multiple real-time and batch touchpoints
- Create complex data processing pipelines as part of diverse, high-energy teams
- Design scalable implementations of the models developed by our Data Scientists
- Deploy models in real-time applications, either as part of a microservice (HTTP or RPC) with a bounded context or as real-time pipelines producing events in response to user actions on the ground
- Hands-on programming based on TDD, usually in a pair-programming environment
- Deploy data pipelines in production based on Continuous Delivery practices
- Build and operate data pipelines and data storage, with familiarity in infrastructure definition and automation in this context, awareness of technologies adjacent to the ones you have worked on, and a good understanding of data modelling
- Be involved in building and deploying large-scale data processing pipelines in a production environment
- Experience building data pipelines and data-centric applications using distributed storage solutions (including but not limited to HDFS-like storage, Elasticsearch, Mongo, Kafka, Postgres/MySQL, etc.)
EXPERTISE AND QUALIFICATIONS
- Experience in architecting complex data processing systems, including but not limited to Data Platform, Experiment Platform, and Analytics/BI Platform
- Experience in HDFS, S3, NoSQL databases, and distributed platforms like Hadoop, Spark, Flink, Hive, Kafka, Oozie, Airflow, Elasticsearch, etc.
- Experience in any of MapR, Cloudera, or Hortonworks, and/or cloud-based Hadoop distributions (GCP preferred)
- Experience in architecting and building data-centric applications involving statistical methods and ML models
Functional / Behavioural Competencies:
- Excellent understanding of the technology landscape
- Is able to zoom in to details and zoom out to see the big picture
- Learning ability: Applies theoretical knowledge to practice
- Focus on excellence
- Mentoring teammates, including closely mentoring tech leads in data engineering
Education & Experience:
- B.Tech / M.Tech (Computer Science preferred)
- 7+ years of experience, with a minimum of 4 years as a tech lead or equivalent in the data engineering space
Interview Process:
- Round 1 – Technical Discussion 1
- Round 2 – Technical Discussion 2
- Round 3 – Leadership Discussion
- Round 4 – HR Round
WHY SHOULD YOU JOIN RAPIDO
We've scaled 10x within 1 year and are currently doing 3.5+ lakh orders per day. Our growth outshines our goals, and we want you to be a part of that growth, solving fundamental mobility problems for India. You can be part of the team that is helping daily commuters with economical and quicker rides. At Rapido, we take our work seriously and are proud of the associations we have built along the way. But then, we also know how to have fun. With a seamless communication structure and a “no cubicle culture”, the people here are extremely approachable. You will have several opportunities to exercise your potential, and you won't be disappointed. We break the regular office monotony and believe in a free-flowing work culture.