Principal Kafka Site Reliability Engineer DevOps

Company: Palo Alto Networks

Details of the offer

We are reshaping the cybersecurity market through our cloud-delivered security services, and our cloud infrastructure is growing rapidly across a global footprint. We're looking for great SREs, as well as software engineers interested in production engineering, to help us scale the largest enterprise security cloud infrastructure in the world.

Description

Palo Alto Networks reinvented the enterprise firewall, growing from a start-up to a multi-billion-dollar company. Our Application Framework, the latest offering in our cloud-delivered security services, ingests security events from hundreds of thousands of firewalls deployed across the globe to provide a massive data analytics platform for deep inspection, anomaly detection, and actionable security automation. Our cloud infrastructure is home to a series of massive and complicated distributed systems and virtualization software platforms which enable big data processing around security services, sandboxing and malware detection, URL categorization and malicious site/domain identification, and security research/response.

RESPONSIBILITIES:

You will be responsible for maintaining and scaling production Kafka clusters with very high ingestion rates, ZooKeeper clusters, and other big data pipeline systems such as HDFS.

You will improve scalability, service reliability, capacity, and performance.

You will write automation code for managing, monitoring, measuring, expanding, and healing clusters.

You are not an operator; you are an experienced software engineer focused on operations.

You will do Kafka tuning, capacity planning, and deep dive troubleshooting.

You will participate in the occasional on-call rotation supporting the infrastructure.

You will roll up your sleeves to troubleshoot incidents, formulate and test hypotheses, and narrow down possibilities to find the root cause.

QUALIFICATIONS:

Hands-on experience managing production Kafka clusters.

Strong development/automation skills. Must be very comfortable with reading and writing Python. Commits to Kafka source code would be a big plus.

In-depth understanding of the internals of Kafka cluster management, ZooKeeper, partitioning, topic replication, and mirroring.

Very good grasp of monitoring and metrics collection, performance tuning, and troubleshooting complicated situations with distributed systems.

Tools-first mindset. You build tools for yourself and others to increase efficiency and to make hard or repetitive tasks easy and quick.

Organized and focused on building, improving, resolving, and delivering. A good communicator within and across teams, a strong team player, and someone who takes ownership.

Learn more about Palo Alto Networks here and check out our fast facts.



Source: Grabsjobs_Co
