Support EST hours, at least through 2 PM EST (MUST)
We are seeking a skilled Data Engineer with a strong background in Python, PySpark, and cloud-based big data applications. The ideal candidate will develop in an Agile environment, contributing to the architecture, design, and implementation of a Big Data product. This role involves hands-on coding and collaboration with multi-disciplined teams to achieve project objectives.
Responsibilities:
- Collaborate with developers to meet product deliverables.
- Implement scalable solutions.
- Develop ETL pipelines that process large datasets drawn from a variety of sources.
- Work independently and collaboratively in an Agile development environment.
- Contribute to detailed design, architectural discussions, and customer requirements sessions.
- Design and develop clear, maintainable code with automated testing using Pytest, unittest, etc.
- Learn and integrate with various systems, APIs, and platforms.
- Interact with a multi-disciplined team to clarify, analyze, and assess requirements.
- Actively participate in the design, development, and testing of the big data product.
Required Skills and Experience:
- Hands-on experience with Python, PySpark, Jupyter notebooks, and Python environment/dependency managers (Poetry, Pipenv).
- Hands-on experience with Apache Spark.
- Proficiency with relational databases (Oracle, SQL Server, MySQL, Postgres, or similar) and NoSQL databases (MongoDB, Azure Table Storage, or similar).
- Familiarity with Databricks; experience with Azure Databricks is a plus.
- Expertise in data cleansing, transformation, and validation.
- Hands-on experience with at least one code versioning platform (GitHub, Azure DevOps, Bitbucket, etc.).
- Experience building CI/CD pipelines in GitHub Actions (or Azure DevOps, Jenkins, etc.).
- Experience documenting code with Markdown or with automated documentation tools (e.g., pydoc).
- Strong written and verbal communication skills.
- Self-motivated with the ability to work well in a team.
Additional Valuable Skills:
- Experience with data visualization tools (Power BI, Tableau).
- Familiarity with DevOps CI/CD tools and automation processes (Azure DevOps, GitHub, Bitbucket).
- Knowledge of containers and container environments (Docker, Podman, Docker Compose, Kubernetes, Minikube, kind, etc.).
- Experience with Azure Cloud Services and Azure Data Factory.
Education:
Bachelor of Science degree from an accredited university.