Sr. Data Engineer-Databricks SME (Remote)

Careers Tier1 Icims Com · Raleigh, NC, US · Remote · Active · iCIMS

Job facts

Field	Value
Company	Careers Tier1 Icims Com
Title	Sr. Data Engineer-Databricks SME (Remote)
Normalized title	-
Department / team	Information Technology
Location	Raleigh, NC, United States
Work model	Remote / Remote
Employment type	OTHER
Salary	-
Status	active
ATS provider	iCIMS
Posted / first seen	2024-06-18 / 2026-05-31
Changed / last seen	2026-06-18 / 2026-06-18

Linked records

Company	Careers Tier1 Icims Com
Source	8cb0cabd-a91d-4a7e-9738-d054fb4b2b53
ATS provider	iCIMS

Description

Overview

- Tier One Technologies is seeking a Data Engineer to support our US Government client with data ingestion, data deduplication and data tagging for migration of a large-scale data environment into Databricks.
- This remote contract-to-hire position will be originated in Raleigh, NC.
- SELECTED CANDIDATES WITHOUT REQUIRED CLEARANCE WILL BE SUBJECT TO A FEDERAL GOVERNMENT BACKGROUND INVESTIGATION TO RECEIVE IT.

Responsibilities

- Design, develop, and maintain scalable data ingestion pipelines to onboard structured, semi-structured, and unstructured data from batch and streaming sources (e.g., APIs, databases, flat files, message queues) into the Azure/Databricks environment.
- Implement de-duplication strategies across large-scale datasets using deterministic and probabilistic matching techniques to ensure data integrity and reduce redundancy within the Data Lake.
- Develop and enforce data tagging frameworks to classify, label, and annotate datasets with appropriate metadata (e.g., sensitivity, source, domain, lineage) to support data governance, discoverability, and compliance requirements.
- Assist with Operationalizing deployments and support of Cloud services for ETL Operations. This will include standardizing and automating processes and workflows, creating documentation/knowledge articles, and overall assisting Operations staff who have limited experience in Cloud.
- Written and oral presentations to high-level CIO management on status of current efforts.
- Possesses skills and experience related to business management, systems engineering, operations research, and management engineering. Typically has specialization in a particular technology or business application. Keeps abreast of technological developments and industry trends.
- Assist with deployment, configuration, and management of Azure Cloud environment.
- Assist with migration efforts of existing ETL jobs into Azure/Databricks cloud environment.
- Ability to share optimization and efficiencies with the larger team and management.
- Ability to automate solutions to repetitive problems/tasks.

Qualifications

- A degree from an accredited College/University in the applicable field of services is required. If the degree is not in the applicable field, then four additional years of related experience is required.
- 13+ years of overall IT experience.
- 5+ years demonstrated experience designing and implementing data ingestion pipelines using tools such as Azure Data Factory, Apache Kafka, Apache NiFi, Spark Structured Streaming, or equivalent technologies.
- 5+ years of experience applying de-duplication techniques at scale, including record linkage, fuzzy matching, and entity resolution across structured and unstructured datasets.
- 5+ years of hands-on experience with data tagging and metadata management, including the use of tagging schemas, data catalogs (e.g., Azure Purview, Apache Atlas), and automated classification tools to support data governance and lineage tracking.
- 5+ years of demonstrated experience working with unstructured data.
- 2+ years of experience in using Databricks or other Spark-based platforms.
- Fluency in at least one scripting language (Python, Perl, Ruby, or equivalent).
- Experience with one or more of the following products and technologies: SAS, C++, Hadoop, SQL Database/Coding, Teradata, Oracle, Amazon S3, Apache Spark, Machine Learning, Natural Language Processing, and visualization tools such as Tableau, Strategy and QLIK is a plus.
- Integration of Git in continuous deployment and experience with DevOps monitoring tools is a plus.
- Familiarity with Cloud Operations support in Azure is a plus.
- Excellent communication skills.
- Must be able to obtain a Position of Public Trust Clearance.
- Must be a US Citizen or have US Permanent Residence status (Green Card).
- Must have resided in the US for the last 5 years and not have traveled outside the US for a combined total of 6 months or more in last 5 years.

Full job record

Job ID	dab02dd1e3b517fce872cf772c66600b9d2a28ae
Org ID	c7db933e-11ae-4a1c-b379-bf11ff35535c
Source ID	8cb0cabd-a91d-4a7e-9738-d054fb4b2b53
Board ID	8cb0cabd-a91d-4a7e-9738-d054fb4b2b53
Provider	icims
Provider Job Key	22059
Title	Sr. Data Engineer-Databricks SME (Remote)
Normalized Title	—
Status	active
Active	yes
Location Text	Raleigh, NC, US
Department	Information Technology
Team	—
Employment Type	OTHER
Workplace Type	remote
Remote Policy	remote
Country	United States
Region	NC
City	Raleigh
Salary Raw	—
Salary Min	—
Salary Max	—
Salary Currency	—
Salary Period	—
Source URL	https://careers-tier1.icims.com/jobs/22059/sr.-data-engineer-databricks-sme-%28remote%29/job
Apply URL	https://careers-tier1.icims.com/jobs/22059/sr.-data-engineer-databricks-sme-%28remote%29/job
First Seen At	2026-05-31 18:43:13Z
Last Seen At	2026-06-18 08:31:30Z
Last Checked At	2026-06-18 08:31:30Z
Last Changed At	2026-06-18 08:31:30Z
Inactive At	—
Source Posted At	2024-06-18 08:31:30Z
Source Updated At	2026-05-07 15:05:12Z
Raw Payload Uri	s3://job-postings-prod-raw-590183727216/raw/provider=icims/board=careers-tier1.icims.com/date=2026-06-18/2026-06-18T08-31-29-438Z-ba79d5f66ad4584fd3bf8dabd22a8a465f2b840babd74fe8cf63ed24750d97f2.json

Event Fields

{
  "content_hash": "33d99e99e4e5abf2d2e59a5f80f9c5316de43aa36413784479803a6de32d5c2f",
  "source_hash": "b55fb596c7672763647bc4ed5ebb591e83f12b1c7bc30be23b92af7a52570a33",
  "last_changed_at": "2026-06-18T08:31:30.990Z",
  "active_status": "active"
}

Parsed Structured

{
  "language": "en",
  "location": {
    "raw": "Raleigh, NC, US",
    "city": "Raleigh",
    "region": "NC",
    "country": "United States",
    "is_remote": false,
    "confidence": 0.8
  },
  "salary_max": null,
  "salary_min": null,
  "inferred_at": "2026-06-18T08:31:30.983Z",
  "launch_scope": {
    "reason": "english_us_canada",
    "included": true,
    "language": "en",
    "location": {
      "raw": "Raleigh, NC, US",
      "city": "Raleigh",
      "region": "NC",
      "country": "United States",
      "is_remote": false,
      "confidence": 0.8
    },
    "countries": [
      "United States"
    ]
  },
  "remote_policy": "remote",
  "salary_period": null,
  "workplace_type": "remote",
  "salary_currency": null
}

Extensions

{}

Native Structured

{
  "json_ld": {
    "url": "https://careers-tier1.icims.com/jobs/22059/sr.-data-engineer-databricks-sme-%28remote%29/job",
    "@type": "JobPosting",
    "title": "Sr. Data Engineer-Databricks SME (Remote)",
    "@context": "http://schema.org",
    "datePosted": "2024-06-18T08:31:30.257Z",
    "description": "<h2>Overview</h2>\n<ul>\n <li>Tier One Technologies is seeking a  Data Engineer to support our US Government client with data ingestion, data deduplication and data tagging for migration of a large-scale data environment into Databricks. </li>\n <li>This remote contract-to-hire position will be originated in Raleigh, NC.</li>\n <li>SELECTED CANDIDATES WITHOUT REQUIRED CLEARANCE WILL BE SUBJECT TO A FEDERAL GOVERNMENT BACKGROUND INVESTIGATION TO RECEIVE IT.</li>\n</ul>\n<h2>Responsibilities</h2>\n<ul>\n <li>Design, develop, and maintain scalable data ingestion pipelines to onboard structured, semi-structured, and unstructured data from batch and streaming sources (e.g., APIs, databases, flat files, message queues) into the Azure/Databricks environment.</li>\n <li>Implement de-duplication strategies across large-scale datasets using deterministic and probabilistic matching techniques to ensure data integrity and reduce redundancy within the Data Lake.</li>\n <li>Develop and enforce data tagging frameworks to classify, label, and annotate datasets with appropriate metadata (e.g., sensitivity, source, domain, lineage) to support data governance, discoverability, and compliance requirements.</li>\n <li>Assist with Operationalizing deployments and support of Cloud services for ETL Operations. This will include standardizing and automating processes and workflows, creating documentation/knowledge articles, and overall assisting Operations staff who have limited experience in Cloud.</li>\n <li>Written and oral presentations to high-level CIO management on status of current efforts.</li>\n <li>Possesses skills and experience related to business management, systems engineering, operations research, and management engineering. Typically has specialization in a particular technology or business application. 
Keeps abreast of technological developments and industry trends.</li>\n <li>Assist with deployment, configuration, and management of Azure Cloud environment.</li>\n <li>Assist with migration efforts of existing ETL jobs into Azure/Databricks cloud environment.</li>\n <li>Ability to share optimization and efficiencies with the larger team and management. </li>\n <li>Ability to automate solutions to repetitive problems/tasks.</li>\n</ul>\n<h2>Qualifications</h2>\n<ul>\n <li>A degree from an accredited College/University in the applicable field of services is required. If the degree is not in the applicable field, then four additional years of related experience is required. </li>\n <li>13+ years of overall IT experience.</li>\n <li>5+ years demonstrated experience designing and implementing data ingestion pipelines using tools such as Azure Data Factory, Apache Kafka, Apache NiFi, Spark Structured Streaming, or equivalent technologies.</li>\n <li>5+ years of experience applying de-duplication techniques at scale, including record linkage, fuzzy matching, and entity resolution across structured and unstructured datasets.</li>\n <li>5+ years of hands-on experience with data tagging and metadata management, including the use of tagging schemas, data catalogs (e.g., Azure Purview, Apache Atlas), and automated classification tools to support data governance and lineage tracking.</li>\n <li>5+ years of demonstrated experience working with unstructured data.</li>\n <li>2+ years of experience in using Databricks or other Spark-based platforms.</li>\n <li>Fluency in at least one scripting language (Python, Perl, Ruby, or equivalent).</li>\n <li>Experience with one or more of the following products and technologies: SAS, C++, Hadoop, SQL Database/Coding, Teradata, Oracle, Amazon S3, Apache Spark, Machine Learning, Natural Language Processing, and visualization tools such as Tableau, Strategy and QLIK is a plus.</li>\n <li>Integration of Git in continuous deployment and 
experience with DevOps monitoring tools is a plus.</li>\n <li>Familiarity with Cloud Operations support in Azure is a plus.</li>\n <li>Excellent communication skills.</li>\n <li>Must be able to obtain a Position of Public Trust Clearance.</li>\n <li>Must be a US Citizen or have US Permanent Residence status (Green Card).</li>\n <li>Must have resided in the US for the last 5 years and not have traveled outside the US for a combined total of 6 months or more in last 5 years.</li>\n</ul>",
    "directApply": true,
    "jobLocation": [
      {
        "@type": "Place",
        "address": {
          "@type": "PostalAddress",
          "postalCode": "UNAVAILABLE",
          "addressRegion": "NC",
          "streetAddress": "UNAVAILABLE",
          "addressCountry": "US",
          "addressLocality": "Raleigh",
          "postOfficeBoxNumber": "UNAVAILABLE"
        }
      }
    ],
    "validThrough": "2027-06-18T08:31:30.257Z",
    "employmentType": "OTHER",
    "hiringOrganization": {
      "name": "A.C. Coy",
      "@type": "Organization",
      "sameAs": "www.accoy.com"
    },
    "occupationalCategory": "Information Technology"
  },
  "detail_meta": {
    "url": "https://careers-tier1.icims.com/jobs/22059/sr.-data-engineer-databricks-sme-%28remote%29/job?in_iframe=1",
    "http_status": 200,
    "content_type": "text/html;charset=UTF-8",
    "response_bytes": 36047,
    "compact_response_bytes": 5266,
    "original_response_bytes": 36047
  },
  "sitemap_job": {
    "id": "22059",
    "url": "https://careers-tier1.icims.com/jobs/22059/sr.-data-engineer-databricks-sme-%28remote%29/job",
    "slug": "sr.-data-engineer-databricks-sme-%28remote%29",
    "lastmod": "2026-05-07T11:05:12-04:00"
  },
  "detail_errors": []
}
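
The json_ld description above is stored as HTML, while the Description section is plain text. The sketch below shows one way such HTML could be reduced to text fragments using only the Python standard library. The class and function names are hypothetical, and nothing here is confirmed to be how the page itself derives its text.

```python
from html.parser import HTMLParser


class FragmentCollector(HTMLParser):
    """Collects the visible text between tags, one fragment per text run."""

    def __init__(self):
        super().__init__()
        self.fragments = []

    def handle_data(self, data):
        # Collapse internal whitespace and skip runs that are only whitespace.
        text = " ".join(data.split())
        if text:
            self.fragments.append(text)


def html_to_fragments(html):
    """Return the non-empty text fragments found in an HTML string."""
    parser = FragmentCollector()
    parser.feed(html)
    parser.close()
    return parser.fragments


# Short excerpt in the same shape as the json_ld description field.
sample = (
    "<h2>Overview</h2>\n<ul>\n <li>This remote contract-to-hire position "
    "will be originated in Raleigh, NC.</li>\n</ul>"
)
fragments = html_to_fragments(sample)
print("\n".join(fragments))
```

Joining the fragments with a space instead of a newline gives a single flattened string. Nested inline tags inside a list item would split that item into several fragments, so this is only a starting point.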

Get this page with API

Rendered from the bluedoor Job Postings API. Reproduce it:

GET https://api.bluedoor.sh/job-postings/v1/jobs/dab02dd1e3b517fce872cf772c66600b9d2a28ae?include=description

GET https://api.bluedoor.sh/job-postings/v1/orgs/c7db933e-11ae-4a1c-b379-bf11ff35535c

GET https://api.bluedoor.sh/job-postings/v1/sources/8cb0cabd-a91d-4a7e-9738-d054fb4b2b53

GET https://api.bluedoor.sh/job-postings/v1/jobs/dab02dd1e3b517fce872cf772c66600b9d2a28ae/events
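
The four requests above can be assembled from the Job ID, Org ID, and Source ID in the full job record. The following is a minimal sketch using only the Python standard library. The helper names are hypothetical, and the page does not say how requests are authenticated, so `fetch_json` simply forwards whatever headers the caller supplies and assumes the endpoints return JSON.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "https://api.bluedoor.sh/job-postings/v1"


def build_urls(job_id, org_id, source_id):
    """Build the four endpoint URLs listed on this page."""
    query = urlencode({"include": "description"})
    return {
        "job": f"{BASE_URL}/jobs/{job_id}?{query}",
        "org": f"{BASE_URL}/orgs/{org_id}",
        "source": f"{BASE_URL}/sources/{source_id}",
        "events": f"{BASE_URL}/jobs/{job_id}/events",
    }


def fetch_json(url, headers=None, timeout=10):
    """GET a URL and decode the body as JSON.

    Assumption: any API key goes in `headers`; the header name is not
    given on this page, so none is hard-coded here.
    """
    request = Request(url, headers=headers or {}, method="GET")
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Identifiers copied from the full job record on this page.
urls = build_urls(
    job_id="dab02dd1e3b517fce872cf772c66600b9d2a28ae",
    org_id="c7db933e-11ae-4a1c-b379-bf11ff35535c",
    source_id="8cb0cabd-a91d-4a7e-9738-d054fb4b2b53",
)
for name, url in urls.items():
    print(name, url)
```

Calling `fetch_json(urls["job"], headers=...)` would then perform the first request once the correct credentials are known. No network call is made by the block as written.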
