Senior Big Data Engineer
Islamabad, Federal Territory, Pakistan
Full Time
Data Engineering
Experienced
We are hiring a Senior Big Data Engineer with solid experience in data engineering and a passion for Big Data. The ideal candidate will have 4 to 6+ years of experience specifically in Data Engineering.
You will have the chance to join a global, multi-cultural team of Data Engineers, be part of its growth and development, participate in a wide variety of exciting and challenging projects, and work with our exceptional team of Technology professionals.
Technology Stack Used & Required Knowledge:
- 4 to 6+ years of experience as an ETL Data Engineer working with Python and Big Data technologies (Hadoop, Spark, Hive)
- Experience with distributed processing technologies and frameworks such as Hadoop, Spark, and Kafka, and with distributed storage systems (e.g., HDFS, S3)
- Excellent understanding of the ETL cycle.
- Experience creating and driving large-scale ETL pipelines in an AWS-based environment, including services such as EMR, Glue, Lambda, and Athena
- Experience integrating data from multiple data sources.
- Understanding of data processing pipelines, data management, and data engineering principles, including experience working with large-scale datasets and distributed computing frameworks (e.g., Spark).
- Good knowledge and understanding of Big Data engineering, ETL, Data Modeling, SQL Scripting, and Data Architecture.
- Strong software development and programming skills with a focus on data, using Python, PySpark, and/or Scala for data engineering.
- Experience and understanding of developing and maintaining complex stored procedures
- Strong proficiency in ETL schedulers such as AWS Step Functions or Apache Airflow
- Solid understanding of data warehousing concepts and hands-on experience with relational databases (e.g., PostgreSQL, MySQL) and columnar databases (e.g., Redshift, BigQuery, HBase, ClickHouse)
- Experience and understanding of core GCP services such as BigQuery, Cloud Composer, and Cloud Storage
- Knowledge of containerization technologies, CI/CD pipelines, and IaC tools such as Docker, Jenkins, and Terraform for automating the software delivery process
- Experience with AWS-based data management tools such as data lakes, Databricks, or Snowflake is a plus
- Understanding of descriptive and exploratory statistics, predictive modeling, evaluation metrics, decision trees, and machine learning algorithms is a plus
- Experience and understanding of AWS data migration services such as SCT, DataSync, and Data Extractor is a plus
- Should have experience with Jira and Agile development methodologies.
- Must be open to learning and have the ability to grasp new technologies quickly.
- Be motivated and passionate about technology.
- Team player with strong integrity and professionalism who can work well with others in the group.
- Self-starter with a proven ability to take initiative and strong problem-solving skills.
In brief, key responsibilities include:
- Design and develop solutions using Big Data technologies.
- Use cutting-edge Big Data and Cloud technologies to build critical, highly complex distributed systems from scratch.
- Design, develop and manage complex ETL jobs and pipelines.
- Act as a team member, consulting with teammates as your new systems transform the way the entire firm leverages data.
- Deploy data models into production environments. This entails providing the model with data stored in a warehouse or coming directly from sources, configuring data attributes, managing computing resources, setting up monitoring tools, etc.
- Support team members and help them learn and improve their skills, as well as innovate and iterate on best practices.
- Solve complex issues and provide guidance to team members when needed.
- Make improvement and process recommendations that impact the business.