Senior Data Engineer
Remote Developer role with clear candidate location fit.
Requirements and responsibilities
About the role
We are looking for a Senior Data Engineer to design and build scalable data lakes, warehouses, and lakehouse architectures supporting a thematic research platform that processes large volumes of financial data daily. You will implement Python-based ETL/ELT pipelines, orchestrate workflows with Airflow, develop ingestion workflows from third-party APIs, and work with Snowflake, Spark, and AWS to deliver high-performance data infrastructure. The role combines hands-on engineering with technical consulting responsibilities, translating business goals into data architecture roadmaps.
What you will do
- Design and implement Python Data Engineering solutions;
- Design and build scalable Data Lakes, Data Warehouses, and Data Lakehouses;
- Design and implement robust ETL/ELT processes at scale using Python, incorporating modern pipeline orchestration tools like Airflow;
- Develop sophisticated ingestion workflows from diverse third-party APIs and data sources;
- Manage and optimize various file formats (Parquet, Avro, ORC) and columnar storage to ensure high-performance data retrieval;
- Work with AI development tools to support and accelerate ongoing development, machine learning initiatives and advanced analytics;
- Act as a technical consultant for stakeholders and leadership to gather requirements, understand business goals, and translate them into technical roadmaps;
- Work with Terraform and other tools to build AWS and on-prem infrastructure.
Must haves
- You must be authorized to work for ANY employer in the US (e.g., Green card holders, TN visa holders, GC EAD, H4 EAD, U4U with EAD), as we are unable to sponsor or take over employment visa sponsorship at this time;
- Bachelor’s degree in computer science/engineering or other technical field, or equivalent experience;
- 5+ years of experience with Python;
- 5+ years of experience with data processing, manipulation, and analytics libraries like Pandas, Polars, PySpark or DuckDB;
- 2+ years of experience with Big Data technologies (Spark, Snowflake);
- Expert-level knowledge of pipeline orchestration using Airflow or similar industry-standard tools;
- Deep understanding of Medallion Architecture, columnar file formats, and diverse database technologies (SQL, NoSQL, and Lakehouse architectures);
- Proven ability to work with third-party APIs for complex data ingestion tasks;
- Proficiency with modern cloud platforms (AWS, GCP, Snowflake) and advanced SQL optimization;
- Exceptional soft skills with a proven ability to gather requirements from leadership and collaborate effectively across cross-functional teams;
- Excellence in optimizing complex data pipelines and troubleshooting data latency or consistency issues in massive datasets;
- A self-starter mindset, regularly investigating more efficient data architectures and AI development tools to improve pipeline performance;
- Pride in data integrity and in the accuracy of the end-to-end pipelines and architectures you build;
- Strong communication skills for seamless global collaboration with stakeholders and distributed teams;
- Upper-intermediate English level.
Nice to haves
- Familiarity with the fintech industry, understanding of financial data, regulatory requirements, and business processes specific to the domain;
- Ability to document data pipelines, architecture designs, and best practices for knowledge sharing and future reference;
- OpenSearch, Elasticsearch;
- AWS SageMaker Studio, Jupyter for data analysis;
- Terraform;
- Scala.
Location eligibility
Candidates should apply only when their profile country is listed here.
Hiring flow
Applications are saved in WithMira for review and follow-up.