Senior/Lead Data Engineer
GFT Technologies
- Ho Chi Minh City
- Permanent
- Full-time
- Architect, develop, and maintain scalable data infrastructure, including data lakes, pipelines, and metadata repositories, ensuring the timely and accurate delivery of data to stakeholders.
- Work closely with data scientists to build and support data models, integrate data sources, and support machine learning workflows and experimentation environments.
- Develop and optimize large-scale batch and real-time data processing systems to enhance operational efficiency and meet business objectives.
- Leverage Python, Apache Airflow, and AWS services to automate data workflows and processes, ensuring efficient scheduling and monitoring.
- Utilize AWS services such as S3, Glue, EC2, and Lambda to manage data storage and compute resources, ensuring high performance, scalability, and cost-efficiency.
- Implement robust testing and validation procedures to ensure the reliability, accuracy, and security of data processing workflows.
- Stay informed of industry best practices and emerging technologies in both data engineering and data science to propose optimizations and innovative solutions.
- Core Expertise: Proficiency in Python for data processing and scripting (pandas, PySpark) and workflow automation (Apache Airflow), plus experience with AWS services (Glue, S3, EC2, Lambda).
- Containerization & Orchestration: Experience working with Kubernetes and Docker for managing containerized environments in the cloud.
- Data Engineering Tools: Hands-on experience with columnar and big data databases (Athena, Redshift, Vertica, Hive/Hadoop), along with version control systems like Git.
- Cloud Services: Strong familiarity with AWS services for cloud-based data processing and management.
- CI/CD Pipeline: Experience with CI/CD tools such as Jenkins, CircleCI, or AWS CodePipeline for continuous integration and deployment.
- Data Engineering Focus (75%): Expertise in building and managing robust data architectures and pipelines for large-scale data operations.
- Data Science Support (25%): Ability to support data science workflows, including collaboration on data preparation, feature engineering, and enabling experimentation environments.
- LangChain Experience: Familiarity with LangChain for building data applications involving natural language processing or conversational AI frameworks.
- Advanced Data Science Tools: Experience with Amazon SageMaker or Databricks for enabling machine learning environments.
- Big Data & Analytics: Familiarity with both RDBMS (MySQL, PostgreSQL) and NoSQL (DynamoDB, Redis) databases.
- BI Tools: Experience with enterprise BI tools like Tableau, Looker, or Power BI.
- Messaging & Event Streaming: Familiarity with distributed messaging systems like Kafka or RabbitMQ for event streaming.
- Monitoring & Logging: Experience with monitoring and log management tools such as the ELK stack or Datadog.
- Data Privacy and Security: Knowledge of best practices for ensuring data privacy and security, particularly in large data infrastructures.
- Competitive salary
- 13th-month salary guarantee
- Performance bonus
- Professional English course for employees
- Premium health insurance
- Extensive annual leave
Feel it. We are #one team collaboratively working towards the same goal.