Data Processing Engineer - I/O

Job facts

Field	Value
Company	Datapelago
Title	Data Processing Engineer - I/O
Normalized title	-
Department / team	-
Location	-
Work model	Remote / Remote
Employment type	Full Time
Salary	-
Status	active
ATS provider	BambooHR
Posted / first seen	2025-04-02 / 2026-05-30
Changed / last seen	2026-05-30 / 2026-06-06

Linked records

Company	Datapelago
Source	eaa6c882-d521-4b9b-a50a-5db329d4eb72
ATS provider	BambooHR

Description

Data Processing Engineer - I/O
Mountain View, CA / Hyderabad, IN / Remote

About DataPelago:
DataPelago is at the forefront of revolutionizing data processing for traditional analytics and cutting-edge GenAI preprocessing. We are building an innovative data processing engine that is transforming how Apache Spark, Apache Flink, Ray and others operate on diverse, large-scale data. Our team of engineers drives and adopts advances in hardware-accelerated computing, parallel processing of large-scale data, query optimization, distributed systems, compilers, machine learning, and cloud-native computing. We are looking for specialists to join our engineering team and shape the future of accelerated data processing.

The Opportunity:
As a Data Processing Engineer - I/O, you will be a key individual contributor in advancing the data read and write capabilities of DataPelago's data processing engine. You will enhance the functional breadth, performance, scale, and reliability of the DataPelago engine in reading and writing large-scale data of various data types from diverse data sources and data sinks. This is a unique opportunity to make a significant impact on a category-defining product and work with a talented team of engineers.

What You'll Do:
• Architect: Influence the architecture of how our data processing engine interfaces with data sources and sinks, catalogs, and data formats.
• Design: Lead the design of functional and performance enhancements to adapters/connectors, data representations, data filtering, caching and more in our data processing engine.
• Core Development: Individually design, implement, test, optimize, and maintain components of the data processing engine.
• Innovation and Differentiation: Analyze the technology roadmap of existing and emerging data formats and libraries, open table formats, catalog services, and more (e.g., Apache Arrow, Apache Parquet, Apache Iceberg) and identify opportunities for our engine to enhance technology and product leadership.
• Collaboration: Partner effectively with engineering and product management in defining and realizing the data I/O roadmap of our product.
• Continuous Improvement: Foster best practices in design and code reviews, testing, CI/CD, and issue resolution to maintain the highest product quality, security, efficiency, and productivity.

What You'll Bring:
• Bachelor's degree in Computer Science or a related field with 7+ years of relevant experience OR a Master's degree in Computer Science or a related field with 5+ years of relevant experience.
• 3+ years of deep technical experience in developing and optimizing data read and write interfaces for large-scale data processing, particularly related to Apache Parquet, Apache ORC, Apache Iceberg, Apache Spark, and similar technologies.
• Demonstrated experience in instrumenting, analyzing, and optimizing the performance of data processing engine components on benchmark and customer workloads.
• Demonstrated experience in the design, development, and successful release of high-performance data processing engine features for large production deployments.
• Good knowledge of the architecture of one or more of Apache Spark, Apache Flink, Presto/Trino.
• Exceptional programming skills in C and C++. Rust experience preferred.
• Extensive development experience in Linux environments.
• Strong analytical and problem-solving skills with a passion for performance optimization.

Location Considerations:
We value face-to-face collaboration, but recognize that talent can be found anywhere. Our engineering team works at our headquarters in Mountain View, CA, at our India office in Hyderabad, and at remote locations.

Why Join DataPelago?
• Technology Leadership: Shape the architecture and development of how our core engine works with advanced data store platforms.
• Cutting-Edge Innovation: Work on challenging problems at the forefront of accelerated computing and data processing.
• Significant Impact: Your contributions will directly impact the performance and scalability of our mission-critical platform.
• Growth: Expand your technical expertise and scope of responsibilities working with other talented engineers and with a growing product.
• Competitive compensation, stock options, comprehensive benefits package, leadership development opportunities.

Full job record

Job ID	8c2cb0e5a0f19c2aac77a9aaf809ca0fdb3da7a7
Org ID	fad8cd3e-2f04-4a77-a77c-aeb512439968
Source ID	eaa6c882-d521-4b9b-a50a-5db329d4eb72
Board ID	eaa6c882-d521-4b9b-a50a-5db329d4eb72
Provider	bamboohr
Provider Job Key	41
Title	Data Processing Engineer - I/O
Normalized Title	—
Status	active
Active	yes
Location Text	—
Department	—
Team	—
Employment Type	full_time
Workplace Type	remote
Remote Policy	remote
Country	—
Region	—
City	—
Salary Raw	—
Salary Min	—
Salary Max	—
Salary Currency	—
Salary Period	—
Source URL	https://datapelago.bamboohr.com/careers/41
Apply URL	https://datapelago.bamboohr.com/careers/41
First Seen At	2026-05-30 06:11:22Z
Last Seen At	2026-06-06 10:26:05Z
Last Checked At	2026-06-06 10:26:05Z
Last Changed At	2026-05-30 06:11:22Z
Inactive At	—
Source Posted At	2025-04-02 00:00:00Z
Source Updated At	—
Raw Payload Uri	s3://job-postings-prod-raw-590183727216/raw/provider=bamboohr/board=datapelago/date=2026-06-06/2026-06-06T10-26-04-697Z-7c727efe35d0e861fa42a8f6f2f986697f92d80acc804a198a3eef676d0d16da.json

Event Fields

{
  "content_hash": "7b1e546787fb53e3166248799e772f0368932ee432ee7e9c4f290c472eaa20e3",
  "source_hash": "75efabf96147170b17f36ca71b0aa87b67763eb30de199447283cef4cc3c2c95",
  "last_changed_at": "2026-05-30T06:11:22.133Z",
  "active_status": "active"
}

Parsed Structured

{
  "language": "en",
  "location": {
    "raw": null,
    "city": null,
    "region": null,
    "country": null,
    "is_remote": true,
    "confidence": null
  },
  "salary_max": null,
  "salary_min": null,
  "inferred_at": "2026-06-06T10:26:05.937Z",
  "launch_scope": {
    "reason": "bamboohr_production_catalog",
    "included": true,
    "location": {
      "raw": null,
      "city": null,
      "region": null,
      "country": null,
      "is_remote": true,
      "confidence": null
    },
    "countries": []
  },
  "remote_policy": "remote",
  "salary_period": null,
  "workplace_type": "remote",
  "salary_currency": null
}

Extensions

{}

Native Structured

{
  "list_job": {
    "id": "41",
    "isRemote": null,
    "location": {
      "city": null,
      "state": null
    },
    "atsLocation": {
      "city": null,
      "state": null,
      "country": null,
      "province": null
    },
    "departmentId": null,
    "locationType": "1",
    "jobOpeningName": "Data Processing Engineer - I/O",
    "departmentLabel": null,
    "employmentStatusLabel": "Full-Time"
  },
  "detail_errors": [],
  "detail_job_opening": {
    "location": {
      "city": null,
      "state": null,
      "postalCode": null,
      "addressCountry": null
    },
    "datePosted": "2025-04-02",
    "atsLocation": {
      "city": null,
      "state": null,
      "country": null,
      "countryId": null
    },
    "description": "<p><span style=\"font-weight: bold\">Data Processing Engineer - I/O</span><br>Mountain View, CA / Hyderabad, IN / Remote</p>\n<p><br><span style=\"font-weight: bold\">About DataPelago:</span></p>\n<p>DataPelago is at the forefront of revolutionizing data processing for traditional analytics and cutting-edge GenAI preprocessing. We are building an innovative data processing engine that is transforming how Apache Spark, Apache Flink, Ray and others operate on diverse, large-scale data. Our team of engineers drive and adopt advances in hardware-accelerated computing, parallel processing of large-scale data, query optimization, distributed systems, compilers, machine learning, and cloud-native computing. We are looking for specialists to join our engineering team and shape the future of accelerated data processing.</p>\n<p><br><span style=\"font-weight: bold\">The Opportunity:</span><br>As a Data Processing Engineer - I/O, you will be a key individual contributor in advancing data<br>read and write capabilities of DataPelago’s data processing engine. You will enhance functional<br>breadth, performance, scale, and reliability of the DataPelago engine in reading and writing large scale data of various data types from diverse data sources and data sinks. 
This is a unique opportunity to make a significant impact on a category-defining product and work with a talented team of engineers.</p>\n<p><br><span style=\"font-weight: bold\">What You'll Do:</span><br>• Architect: Influence the architecture of how our data processing engine interfaces with data<br>sources and sinks, catalogs, data formats.<br>• Design: Lead design of functional and performance enhancements to adapters/connectors,<br>data representations, data filtering, caching and more in our data processing engine.</p>\n<p>• Core Development: Individually design, implement, test, optimize, and maintain components of the data processing engine.</p>\n<p>• Innovation and Differentiation: Analyze technology roadmap of existing and emerging data<br>formats and libraries, open table formats, catalog services, and more (e.g., Apache Arrow,</p>\n<p>Apache Parquet, Apache Iceberg) and identify opportunities for our engine to enhance technology and product leadership.</p>\n<p>• Collaboration: Partner effectively with engineering and product management in defining and<br>realizing the data I/O roadmap of our product..<br>• Continuous Improvement: Foster best practices in design and code reviews, testing, CI/CD,<br>and issue resolution to maintain highest product quality, security, efficiency, &amp; productivity.</p>\n<p><br><span style=\"font-weight: bold\">What You'll Bring:</span></p>\n<p>• Bachelor's degree in Computer Science or a related field with 7+ years of relevant experience OR a Master's degree in Computer Science or a related field with 5+ years of relevant</p>\n<p>experience.</p>\n<p>• 3+ years of deep technical experience in developing and optimizing data read and write interfaces for large-scale data processing, particularly related to Apache Parquet, Apache</p>\n<p>ORC, Apache Iceberg, Apache Spark, and similar technologies.</p>\n<p>• Demonstrated experience in instrumenting, analyzing, and optimizing the performance of<br>data processing engine 
components on benchmark and customer workloads.</p>\n<p>• Demonstrated experience in the design, development, and successful release of high-performance data processing engine features for large production deployments.</p>\n<p>• Good knowledge of the architecture of one or more of Apache Spark, Apache Flink, Presto/<br>Trino.<br>• Exceptional programming skills in C, C++. Rust experience preferred.<br>• Extensive development experience in Linux environments.<br>• Strong analytical and problem-solving skills with a passion for performance optimization.</p>\n<p><br><span style=\"font-weight: bold\">Location Considerations:</span></p>\n<p>We value face-to-face collaboration, but recognize that talent can be found anywhere. Our engineering team works at our headquarters in Mountain View, CA, at our India office in Hyderabad, and at remote locations.</p>\n<p><br><span style=\"font-weight: bold\">Why Join DataPelago?</span><br>• Technology Leadership: Shape the architecture and development of how our core engine<br>works with advanced data store platforms.<br>• Cutting-Edge Innovation: Work on challenging problems at the forefront of accelerated<br>computing and data processing.<br>• Significant Impact: Your contributions will directly impact the performance and scalability<br>of our mission-critical platform.<br>• Growth: Expand your technical expertise and scope of responsibilities working with other<br>talented engineers and with a growing product.</p>\n<p>• Competitive compensation, stock options, comprehensive benefits package, leadership de-<br>velopment opportunities.</p>",
    "compensation": null,
    "departmentId": null,
    "locationType": "1",
    "seekPromoted": false,
    "jobCategoryId": null,
    "jobOpeningName": "Data Processing Engineer - I/O",
    "departmentLabel": "",
    "jobOpeningStatus": "Open",
    "minimumExperience": "Experienced",
    "jobOpeningShareUrl": "https://datapelago.bamboohr.com/careers/41",
    "employmentStatusLabel": "Full-Time"
  }
}

Get this page with API

Rendered from the bluedoor Job Postings API. Reproduce it:

GET https://api.bluedoor.sh/job-postings/v1/jobs/8c2cb0e5a0f19c2aac77a9aaf809ca0fdb3da7a7?include=description

GET https://api.bluedoor.sh/job-postings/v1/orgs/fad8cd3e-2f04-4a77-a77c-aeb512439968

GET https://api.bluedoor.sh/job-postings/v1/sources/eaa6c882-d521-4b9b-a50a-5db329d4eb72

GET https://api.bluedoor.sh/job-postings/v1/jobs/8c2cb0e5a0f19c2aac77a9aaf809ca0fdb3da7a7/events
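
The four requests above can be assembled from the Job ID, Org ID, and Source ID listed in the full job record. The following is a minimal Python sketch of that assembly, not the provider's client. The paths and the `include=description` parameter come from this page; the helper names `build_urls` and `build_request` are hypothetical, and the bearer-token `Authorization` header is an assumption, since this page does not say how the API key is passed.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Base path taken from the GET lines on this page.
BASE_URL = "https://api.bluedoor.sh/job-postings/v1"


def build_urls(job_id: str, org_id: str, source_id: str) -> dict:
    """Return the four endpoint URLs used to reproduce this page."""
    return {
        "job": f"{BASE_URL}/jobs/{job_id}?" + urlencode({"include": "description"}),
        "org": f"{BASE_URL}/orgs/{org_id}",
        "source": f"{BASE_URL}/sources/{source_id}",
        "events": f"{BASE_URL}/jobs/{job_id}/events",
    }


def build_request(url: str, api_key: str) -> Request:
    """Prepare a GET request without sending it.

    Assumption: the key is sent as a bearer token. Check the API docs
    for the real authentication scheme before relying on this.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",  # hypothetical auth header
        "Accept": "application/json",
    }
    return Request(url, headers=headers, method="GET")


# IDs copied from the "Full job record" section of this page.
urls = build_urls(
    job_id="8c2cb0e5a0f19c2aac77a9aaf809ca0fdb3da7a7",
    org_id="fad8cd3e-2f04-4a77-a77c-aeb512439968",
    source_id="eaa6c882-d521-4b9b-a50a-5db329d4eb72",
)
requests_to_send = [build_request(u, api_key="YOUR_API_KEY") for u in urls.values()]
```

Each prepared request could then be passed to `urllib.request.urlopen` and the response body decoded with `json.loads`. The sketch stops short of sending anything so that it runs without network access or a real key.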
