Photon

Sr. Data Engineer, Python + Spark (Data Federation skill set, Data Lakehouse, e.g., Starburst) - New York

Posted 3 Days Ago
In-Office or Remote
Hiring Remotely in United States
40K-140K Annually
Senior level
Design and implement data federation and lakehouse architectures, build scalable ETL/ELT pipelines with Python and Spark, optimize performance across federated queries, manage Delta/Iceberg/Hudi tables, and enforce governance, security, and access controls for analytics and AI teams.
The summary above was generated by AI

Senior Data Engineer (Data Federation & Lakehouse)

As a Senior Data Engineer, you will be responsible for breaking down data silos. This role focuses on building a unified, high-performance data layer using Data Federation techniques. You won't just move data; you will architect a Data Lakehouse environment where disparate sources feel like a single, cohesive database for our analytics and AI teams.

### Core Responsibilities

  • Data Federation Architecture: Design and implement federated query layers (e.g., Starburst/Trino) to allow high-speed analytics across distributed data sources without unnecessary data movement.
  • ETL/ELT Pipeline Development: Build scalable, distributed data processing pipelines using Python and Apache Spark (PySpark).
  • Lakehouse Implementation: Manage and optimize modern table formats such as Delta Lake, Apache Iceberg, or Apache Hudi to bring ACID transactions to our data lake.
  • Performance Tuning: Optimize Spark jobs and SQL queries across the federation layer to minimize latency and manage compute costs.
  • Governance & Security: Implement fine-grained access control and data masking within the federation engine to ensure data privacy across all connected platforms.

### Technical Requirements

  • Python & Spark: 5+ years of experience with Python and deep expertise in Apache Spark tuning (partitioning, shuffling, caching).
  • Data Federation Tools: Hands-on experience with Starburst Enterprise, Trino (formerly PrestoSQL), or Dremio.
  • Lakehouse Ecosystem: Proven track record working with Delta Lake or Iceberg architectures.
  • Cloud Platforms: Extensive experience with AWS (EMR, S3, Glue), Azure (Databricks, ADLS), or GCP.
  • SQL Mastery: Expert-level SQL skills for complex analytical queries and query plan analysis.
  • Data Modeling: Proficiency in designing star and snowflake schemas, and an understanding of the Medallion Architecture (Bronze, Silver, Gold layers).

### Preferred "Bonus" Skills

  • Experience with Infrastructure as Code (IaC) tools such as Terraform or Pulumi.
  • Familiarity with dbt (data build tool) for modeling within the federation layer.
  • Knowledge of Kubernetes (K8s) for deploying and scaling Spark/Trino clusters.
  • Background in Data Mesh or Data Fabric methodologies.

### Compensation, Benefits and Duration

  • Minimum Compensation: USD 40,000
  • Maximum Compensation: USD 140,000
  • Compensation is based on the candidate's actual experience and qualifications. The range above is a reasonable, good-faith estimate for the role.
  • Medical, vision, and dental benefits, a 401(k) retirement plan, variable pay/incentives, paid time off, and paid holidays are available for full-time employees.
  • This position is not available to independent contractors.
  • No applications will be considered if received more than 120 days after the date of this post.
