SugarCRM

Senior Data Engineer- Databricks

Remote Data Engineering role with a clear candidate location fit.

Published: July 4, 2026

Eligible countries: 1 country accepted

Seniority signal: Senior

Work model: Remote

Accepted candidate locations

United States

AWS Azure CI/CD PostgreSQL Python Spark SQL

Source freshness: July 4, 2026

Location fit: 1 country accepted

Stack match: AWS, Azure

Application path: Company website

Job summary

Requirements and responsibilities

Impact You Will Make in the Role:

Own Databricks production support for the Sugar Predict data platform, including monitoring, alerting, and incident response across all production data flows
Maintain and report on SLA performance metrics for data pipeline delivery, ensuring visibility into platform health and accountability across internal and external stakeholders
Identify and implement pipeline optimizations that reduce Databricks compute costs, improve throughput, and reduce processing windows while tracking impacts through measurable KPIs
Migrate legacy ETL/ELT pipelines to Databricks, building automation tooling to reduce manual intervention and ensure uninterrupted data delivery during transitions
Support new customer onboarding by provisioning, validating, and hardening tenant data pipelines that deliver reliable, isolated data from day one
Design and build high-performance Databricks pipelines that ingest, transform, and serve ERP and CRM data at scale across both Azure and AWS environments
Own the Delta Lake architecture including schema design, partitioning strategies, data quality enforcement, and incremental processing patterns
Enforce data security best practices across Databricks environments, including role-based access control, secrets management, and compliance requirements for enterprise CRM and ERP data
Implement data quality monitoring and observability across pipeline health and ML model inputs, ensuring data integrity that directly supports Sugar Predict prediction accuracy
Apply and enforce multi-tenant data isolation patterns ensuring reliable, secure data delivery across Sugar Predict enterprise customers
Partner with the Enterprise Architecture team to ensure Sugar Predict data pipelines integrate seamlessly with the broader SugarAI product ecosystem
Support a globally distributed operation through on-call rotation and after-hours incident response, meeting SLAs across multiple time zones
Maintain technical documentation, runbooks, and architectural decision records, contributing to team knowledge sharing and operational readiness across on-call and incident response scenarios
Apply CI/CD best practices to data pipeline development, including version control, automated testing, and deployment tooling to ensure reliable and repeatable pipeline delivery

What You Will Bring:

4+ years of data engineering experience
At least 2 years on Databricks or the Apache Spark ecosystem across Azure and/or AWS
Proficiency in PySpark, SQL, and Python with a strong track record building and operating production-grade pipelines under SLA constraints
Hands-on experience with Delta Lake including schema evolution, ACID transactions, optimize/vacuum lifecycle, and both incremental and streaming processing patterns
Hands-on experience with pipeline performance tuning and compute optimization in production Databricks environments
Solid working knowledge of PostgreSQL including query optimization, schema design, and use as a source or sink in production data pipelines
Experience supporting and maintaining legacy ETL tooling (SSIS, Informatica, custom Python/SQL pipelines, or similar) in production
Experience supporting large-scale multi-tenant architectures with a focus on tenant isolation, per-tenant performance, and data privacy, including navigating tools and platforms that default to single-tenant assumptions
Proven ability to work collaboratively across data science, product, and infrastructure teams, owning end-to-end delivery in a cross-functional environment
Strong understanding of data governance, security, and compliance principles, including access control, data privacy, and protection of sensitive enterprise data across multi-tenant environments

Preferred Qualifications/Experience:

Experience operating Databricks workspaces across both Azure and AWS, including cost governance, cluster management, and cross-cloud data access
Experience optimizing Databricks workloads in a Serverless environment, including compute cost governance and performance tuning for serverless compute
Experience with Microsoft SQL Server in a data engineering or ETL context
Exposure to ML feature engineering or feature stores (Databricks Feature Store, Feast, or similar) supporting predictive analytics
Experience with customer onboarding automation or IaC patterns for provisioning tenant data pipelines at scale
Databricks Certified Data Engineer Associate or Professional certification