The Site Reliability Engineer (SRE) works independently and is responsible for the overall performance, stability, reliability and scalability of enterprise-wide, internet-facing systems. The SRE ensures that the company's complex, web-scale systems are healthy, monitored, automated, and designed to scale. The SRE directs and leads continuous improvement efforts and the incident response of cross-functional support teams to troubleshoot and address database, OS, application, network and any other issues.
PRIMARY DUTIES AND RESPONSIBILITIES
Essential Functions:
·            Serves as a subject matter expert in several technical domains, including infrastructure, storage design, operating systems, networking, and engineering.
·            Leads and drives continuous improvement initiatives in areas of domain expertise.
·            Uses technical expertise to review and make recommendations for engineering reliability into code, infrastructure, OS, network, and processes, to ensure the application is always fast, available, and scalable.
·            Oversees the monitoring of software performance, packet flow, hardware, and code interactions in support of managing services; predicts and prevents failures.
·            As a subject matter expert, makes recommendations to development teams to ensure the availability, speed, scalability and efficiency of services by engineering reliability into software and monitoring systems.
·            Responds to and resolves emergent service problems; builds custom tools to automate daily functions and prevent problem recurrence.
·            Works in close contact with Architecture, Development and Infrastructure teams on software and system performance analysis and tuning, service capacity planning and demand forecasting; coordinates the efforts of cross-functional teams to design and implement solutions.

Required Knowledge and Skills
Education:
·            10+ years engineering and/or administering a high-volume or critical production service environment running on a UNIX/Linux platform.
·            Strong working knowledge of C, C++ or Java, and of Shell, Perl or Python.
·            Hands-on experience with Apache, JBoss, Tomcat, load balancers (F5) and firewalls.
·            Understanding of IP networking, network devices and common topologies.
·            Proven technical troubleshooting and performance tuning experience.
·            Excellent analytical skills, coupled with a strong sense of ownership, urgency and drive.
·            Ability to troubleshoot and resolve customer problems as they arise with a high degree of independence, and to manage multiple task assignments.
·            Excellent written communication skills.

Preferred Knowledge and Skills
·            Working knowledge in the following areas:
o           PureData (DB2)
o           MongoDB
o           Redis
o           Qpid (MRG)
o           DataPower

Working Conditions

Decision Making, Autonomy, Budgeting Authority & Supervisory Responsibility
Scope of impact of decisions:
Result of decisions:
Problem Solving:
Knowledge Level:
Level of Autonomy:
Budget Responsibility:
Management:

Please call or email
Janelle Razzino
Razzino Associates, Inc.
Suite E-1
(O) 201-722-3111
(c) 201-925-6086
(f) 201-722-3113
http://www.razzinoassociates.com/
Posted by: John Rechenberg <jar1@optimum.net>