Lead, Site Reliability Engineering (SRE) Job at Intercontinental Exchange Holdings, Inc., Jacksonville, FL

  • Intercontinental Exchange Holdings, Inc.
  • Jacksonville, FL

Job Description

Overview:

Job Purpose

ICE Mortgage Technology is the leading provider for the mortgage finance industry. It provides best-in-class servicing solutions to help manage all aspects of loan servicing, from loan boarding to default. Transform your performance with automation and insights and enhance the customer experience. Our solutions support first mortgages as well as home equity loans, and help servicers lower costs, reduce risk, and operate more efficiently. We're looking for motivated, results-oriented people to join our team.

We are seeking a Lead Site Reliability Engineer: a service-oriented, delivery-focused individual who can build rapport with key members of the Operations and SRE teams while specifying and implementing automation changes, fixes, and improvement projects. The ideal candidate will combine excellent time and customer management skills with a broad range of technical skills and knowledge.

Responsibilities

The Lead SRE assists with day-to-day activities supporting Mortgage Servicing Application services, including production support, releases, and incident management. The role also builds actionable alerts and automation to prevent incidents, detect performance bottlenecks, and identify maintenance activities.

  • Build and maintain tools and solutions for our operations platform, ensuring that we meet our customer service standards and reduce errors
  • Lead complex projects such as data center migrations, major system upgrades, and tech stack changes
  • Update existing processes and design new processes as needed to optimize performance
  • Actively participate in or own continuous improvement projects driven by automation
  • Employ deep troubleshooting skills to improve the availability, performance, and security of IMT Services
  • Implement automated tests, automated deployments, and operational tools
  • Collaborate with Product and Support teams to plan and deploy product releases
  • Conduct root cause analysis and post-mortems for production incidents
  • Participate in on-call rotations and lead incident response efforts
  • Work with Engineering leadership to build shared services that meet the requirements and needs of the platform and application teams
  • Ensure services are designed with 24/7 availability and operational readiness and rigor
  • Implement proactive monitoring, alerting, trend analysis, and self-healing systems
  • Define non-functional requirements as part of the product lifecycle to influence the new designs, standards, and methods for scalable, highly available distributed systems
  • Identify, evaluate, and execute preventive measures to minimize or avoid impact on the customer experience, favoring proactive detection over customer-escalated issues
  • Resolve product/service defects and address design, infrastructure, or operational changes
  • Partner with other SREs and lead by example, acting as a contributor more than a delegator

Knowledge and Experience

  • 7+ years of experience in DevOps, SRE, or infrastructure engineering roles in 24x7 Production support services environments
  • BS in Computer Science, Computer Engineering, Math, or equivalent professional experience
  • Fluency in one or more current-generation scripting languages (Python/Shell/Perl/PHP/Ruby) and/or Java and .NET development
  • Excellent troubleshooting skills, utilizing a systematic problem-solving approach
  • Demonstrated experience in designing, analyzing, and diagnosing large-scale distributed systems, as well as Windows Server and Linux system internals (system libraries, file systems, client-server protocols)
  • Experience in Windows, Linux, OCP, and AWS
  • Experience with Continuous Integration and Continuous Delivery concepts
  • Hands-on experience with infrastructure-as-code tools such as Terraform and Spacelift and/or Chef, SaltStack, Ansible, and Puppet
  • Nice to have: experience with containerization technologies such as Kubernetes and Docker
  • Proven strength in SaaS services, experience in massive scale web operations
  • Experience with monitoring and alerting tools (Splunk, BigPanda, PagerDuty)
  • Experience with automation of business continuity/disaster recovery/application resiliency
  • Process-oriented with great documentation skills (Confluence)
  • Experience with data structures/formats such as XML, JSON, YAML, and HCL
  • Must be able to multitask in a fast-paced environment with focus on timeliness, documentation, and communications with peers and business users alike
  • Experience with deployment automation tools like UCD and Azure DevOps (ADO)

#LI-Onsite

Intercontinental Exchange, Inc. is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to legally protected characteristics.

Job Tags

Full time
