72 open positions
The Opportunity We are looking for a Stats Engineering leader to help us build two new Datadog products from scratch - Feature Flags and Experiments. Our goal with these products is to help developers and product teams ship features quickly, experiment as second nature, and make decisions with confidence. To do this, we need to build a world-class experimentation engine, backed by state-of-the-art statistical methods like sequential analysis, CUPED, and change point detection, which help solve the big problems in the experimentation world of early peeking, long experiment durations, and c...
The Opportunity We’re looking for an experienced engineer to join Datadog’s Workflow Engine Team, the group behind Atlas — our platform for building reliable, long-running workflows as code. Atlas is built on Temporal, the leading open-source workflow orchestration technology, and plays a central role in how Datadog builds complex distributed systems at scale. We work closely with the Temporal community and have contributed upstream improvements in areas like reliability, performance, and developer tooling, strengthening both Atlas and the broader ecosystem. As a Staff Engineer, you’ll help...
The ML Observability team builds cutting-edge tools to monitor, explain, and improve AI systems in production, particularly those leveraging Large Language Models (LLMs) and generative AI. We provide robust, scalable observability for AI workloads, including drift detection, model evaluation, and behavior tracing, enabling customers to ship AI with confidence. As a Staff Engineer, you’ll lead the development of new features and foundational capabilities within Datadog’s LLM Observability product. You will shape product direction, drive experimentation, and apply your deep understanding of...
About Datadog We're on a mission to build the best observability platform in the world for engineers to understand and scale their systems, applications, and teams. We operate at high scale—trillions of data points per day—providing always-on alerting, metrics visualization, logs, and application tracing for tens of thousands of companies. Our engineering culture values pragmatism, honesty, and simplicity to solve hard problems the right way. The Opportunity The Metrics Query team owns and operates the “read” side of Datadog’s metrics platform. Our systems provide the APIs and interfaces fo...
As the Staff Product Designer for Bits AI, you will own the end-to-end design of Datadog’s User Assistant Agent. The User Assistant Agent is a multi-modal assistant that allows the user to interact with Datadog in natural language, surface insights from a user’s environment, and correlate key data from across the Datadog platform through generative AI and large language models (LLMs). The feature meets users where they are across desktop, mobile, team communication tools, and the Datadog web platform, surfacing insights where and when the user needs them. Example workflows the assistant help...
We’re looking for a Staff Software Engineer with deep experience in GenAI/ML to join Datadog’s Application Performance Monitoring (APM) team. APM is a product which provides deep visibility into applications, enabling users to identify performance bottlenecks, troubleshoot issues, and optimize services. With distributed tracing, profiling, out-of-the-box dashboards, and seamless correlation with other telemetry data, Datadog APM provides some of the deepest and most structured visibility into the health and performance of applications. This context sets us up for an opportunity to be the wor...
We are looking for a staff applied scientist to join our Applied AI organization. Reporting directly to the director of engineering for applied science, your scope and impact will span the whole applied science organization. As a staff applied scientist, you will help design, implement, and scale new and existing Datadog features. You will strive to answer questions such as: “Is it feasible?”, “Can we do it more efficiently with another technique?” and “What’s the business opportunity of this new data science framework?”. Your role will cover the whole data science lifecycle (de...
We're on a mission to build the best platform for our engineers to deliver stateful services at high scale. We provide High Performance Transaction Systems to all Datadog developers and empower them with solutions that focus on simplicity, reliability, durability, and availability. We operate services and databases in a scalable and cost-efficient way, evolving in a high-volume, low-latency environment that is continuing to double in size. We value pragmatism, honesty, and simplicity to solve hard problems the right way. At Datadog, we place value in our office culture - the relations...
We’re looking for a Software Engineer to join the REDAPL (Referential Data Platform) team and help evolve Datadog’s core platform for tracking infrastructure resources and relationships. REDAPL powers several Datadog products - including Cloud Security Posture Management, Resource Catalog, Cloud Cost Management, and Service Catalog - by enabling customers to understand and analyze how their infrastructure components interact, perform, and scale. As more teams integrate with REDAPL, our ingestion and query volumes are growing rapidly, presenting exciting scaling and optimization challenges. I...
We’re looking for interns to join us to help collect, aggregate, visualize, and analyze high-scale metrics, logs, and application data. Software engineering at Datadog includes a variety of exciting opportunities across backend, frontend, infrastructure, libraries, data engineering, and data science. Interested in distributed systems or Kubernetes across tens of thousands of nodes? Looking to build out products like CI Visibility or Cloud Security Management or perhaps join one of our internal teams helping engineering teams ship apps quickly? Want to see your work actually impact and improv...
The Data & Analytics platform is responsible for batch and streaming data processing infrastructure at Datadog. Our mission is to provide a managed, integrated ecosystem to enable everyone at Datadog to easily interact with and derive value from data. We support products such as Cloud Cost Management, Metrics, Security, Data Science and Product Analytics, running 300k batch jobs per day and processing millions of points per second in Flink. We're looking for a Software Engineer to collaborate with the infrastructure and product organizations to build and improve the deployment, management, a...
At Datadog, we process millions of monitors every minute, alerting our customers when configured thresholds are crossed. Engineers across the world trust us to alert them when necessary, so our Alerting stack needs to be reliable and scalable. Data is at the heart of our decision-making. Our Data & Analytics Governance team is responsible for managing our Data Catalog, which tracks millions of data assets and their metadata, supports event-driven pipelines, and ensures compliance at scale. This platform empowers our engineers to build reliable pipelines and our analysts to make data-driven...
We’re looking for a Senior Staff Software Engineer with deep experience in GenAI/ML to join Datadog’s Application Performance Monitoring (APM) team. APM is a product which provides deep visibility into applications, enabling users to identify performance bottlenecks, troubleshoot issues, and optimize services. With distributed tracing, profiling, out-of-the-box dashboards, and seamless correlation with other telemetry data, Datadog APM provides some of the deepest and most structured visibility into the health and performance of applications. This context sets us up for an opportunity to be ...
We are looking for a Staff Engineer to join our Event Platform Storage team. Our Event Platform ingests, transforms, and stores events at a rate of ~15 million messages/second and serves that data to more than 30 Datadog products through a query API. With a focus on a high level of reliability, you will contribute to a platform using Java, Go, and Rust. Engineers with a background or interest in the challenges of optimizing distributed systems for durability, high availability, low latency, and scalability are encouraged to apply. Husky Blog Husky Deep Dive This is a unique opportunity to contri...
About Datadog Datadog is the essential monitoring and security platform for cloud applications. As we expand our footprint in AI and machine learning, we’re investing deeply in teams building large-scale, production-grade AI systems that power observability, automation, and intelligence across Datadog’s products. We’re a fast-moving, collaborative community of builders—engineers, researchers, and technical leaders—working to shape the next generation of AI capabilities at Datadog. About the Role We’re looking for a Senior Recruiting Sourcer to help grow Datadog’s AI Engineering, AI Research,...
The Cross-Product Queries (XPQ) team designs, builds and operates query languages, systems and services that join, aggregate, and transform the observability data that reside in Datadog’s petabyte-scale storage systems - such as metrics, logs and application traces. The platform serves live traffic of thousands of queries per second, powering key Datadog product features. At Datadog, we place value in our office culture - the relationships that it builds, the creativity it brings to the table, and the collaboration of being together. We operate as a hybrid workplace to ensure our employees c...
The Universal Service Monitoring team is building a zero-instrumentation observability solution that automatically discovers services on every host. We’re looking for a senior distributed systems engineer to design, implement, and run in production the foundational platforms powering this application. You’ll get to apply your strong systems-level thinking to create data pipelines that ingest, store, analyze, and query billions of events per second in real time from companies all over the globe. You’ll join a high-impact team tackling ambitious technical challenges, like decoding traffic acro...
At Datadog, we’re building the monitoring and security platform for the cloud age—used by developers, IT teams, and business stakeholders to understand and optimize their infrastructure. We're looking for a Senior Software Engineer to help advance REDAPL, our Referential Data Platform that tracks and contextualizes our customers’ infrastructure resources and their relationships. About REDAPL REDAPL powers multiple customer-facing products such as Cloud Security Posture Management, Cloud Cost Management, Resource Catalog, and more. It handles large-scale data ingestion and querying, providing...
We are looking for a Senior Software Engineer who brings strong backend and distributed systems experience to join our Cross-Product Query (XPQ) organization. As part of the Query Connectors team, you'll help build and maintain critical data connectivity infrastructure while constantly exploring and adapting to new technologies and data platforms. This role combines deep backend expertise with a passion for learning new technologies. While experience with databases and query systems is valuable, we're primarily looking for engineers who have built robust distributed systems and are excited a...