* Own SLOs/SLIs for availability (99.9%), latency, error rate, and quality of service across microservices.
* Design and operate end-to-end observability: metrics, logs, traces, synthetic checks, and real-user monitoring (RUM).
* Instrument services (Windows services, APIs, background jobs) with structured logs and trace context.
* Build health probes and SLA monitors for critical transactions and cross-service dependencies.
* Monitor system issues using metrics such as uptime, latency, error rate, throughput, and availability.
* Deploy and maintain monitoring and on-call tools such as Splunk On-Call, Prometheus, and Datadog.
* Lead incident response (triage, comms, coordination, real-time mitigation) and conduct blameless postmortems with actionable follow-ups.
* Maintain and continuously improve runbooks, escalation paths, on-call rotations, and paging policies.
* Implement MTTA/MTTR reduction programs.
* Stand up war room protocols and ensure stakeholder updates during incidents.
* Forecast compute, storage, and network needs; track headroom against growth and peak patterns.
* Conduct performance profiling and bottleneck analyses (CPU, memory, I/O, thread pools, connection pools).
* Optimize resource allocation on VMware (DRS, affinity rules, reservations) and Windows VM tuning (kernel, TCP stack, NICs).
* Validate scaling strategies (horizontal vs. vertical) and implement auto-scaling where supported.
* Standardize gold images, configuration baselines, and desired state for Windows Server (PowerShell DSC or equivalent).
* Manage patching (OS, middleware, runtime) with maintenance windows aligned to error budgets.
* Ensure backup, snapshot, and restore strategies meet RPO/RTO; regularly test restores.
* Maintain secure baselines (CIS benchmarks for Windows/VMware), vulnerability management, and patch cadence.
* Support compliance audits (PCI-CP, PCI-DSS, SOC 2/ISO 27001), produce evidence (configs, logs, access reviews), and remediate gaps.
* Automate provisioning (VM templates, DSC/Ansible for Windows, Terraform for VMware) and configuration drift detection/correction.
* Build runbooks to reduce toil (deploy, scale, rollback, etc.).
* Create reliability guardrails (pre-flight checks, change freeze rules, policy controls) as code.
* Continuously refactor scripts/runbooks into idempotent automation.
* Collaborate with development teams and other stakeholders to identify potential risks, such as security vulnerabilities, performance bottlenecks, deployment issues, or configuration errors.
* Implement risk mitigation strategies, such as patching, backup, redundancy, encryption, or testing.
* Collaborate with product teams and other teams to understand user needs, expectations, and satisfaction.
* Coach engineers on SRE principles, incident handling, and reliability-centric design.
* Lead knowledge sharing, runbook quality, and postmortem culture (blameless, action-oriented).
* Provide after-hours support for production issues on a rotational basis with other team members to ensure system availability 24/7/365.
* Bachelor’s degree in Computer Science, Software Engineering, or an equivalent combination of education and experience.
* 5+ years of related experience as a Software Engineer, DevOps Engineer, Site Reliability Engineer, or in a similar role.
* Extensive experience working with enterprise-level microservices applications, including deployment and maintenance of the applications in distributed environments.
* Demonstrated hands-on experience and expertise with DevOps tooling (Ansible, Terraform, Jenkins, Octopus Deploy, etc.), networks, and network security; high-level managerial skills.
* In-depth hands-on experience with on-prem and cloud compute, storage, and networking solutions (VMware, NetApp, Azure, AWS, etc.).

**Where You Will Be:** This role is **fully in-office**, requiring **five days a week onsite** at one of Entrust’s offices in **Minneapolis, Colorado, or Dallas**, as specified in the job description. Entrust operates with a distributed workforce, and this position is aligned with our in-office product development teams.

At Entrust, we don’t just offer jobs – we offer career journeys. Here is what you can expect when you join our team:

Flexibility: Life is all about balance. Whether you’re remote, hybrid, or on-site, we offer flexible options that fit your lifestyle.

Entrust Corporation