MS-SQL DBA Expert - Optimization & Tuning
Lahore, Punjab, Pakistan
Full Time
Data Engineering
Experienced
Location: Lahore/Islamabad/Karachi
Experience: 8+ years
NorthBay is looking for a hands-on Microsoft SQL Server expert who can own the performance, stability, and availability of multiple clustered SQL Server environments running on AWS Windows-based EC2. The role is focused on assessment/discovery, deep-dive/troubleshooting & performance optimization against high-volume production systems. You will be the go-to expert for identifying bottlenecks, tuning SQL & T‑SQL (Including SPs/Functions), Stabilizing Cluster/DR failovers, and designing/improving observability including proactive monitoring and alerting
Requirements:
- Overall 8+ years of experience with at least 5 years of Core SQL Server hands-on experience as a SQL Server DBA / Engineer, with a strong focus on performance and HA/DR in mission-critical environments
- Deep understanding of SQL Server internals (storage engine, query optimizer, plan cache, buffer pool, TempDB behavior etc.).
- Expert-level experience in diagnosing Wait stats, DMVs, Extended Events, Profiler/Tracing, Execution plan analysis, query rewrites, indexing strategy design
- Troubleshooting CPU spikes, I/O bottlenecks, memory pressure, TempDB contention, log bottlenecks (e.g. WRITELOG waits) etc.
- Expert in SQL & T‑SQL optimizations including SPs, Triggers, Views, Functions etc. along with in depth understanding of locking, blocking, deadlocks, and transaction-related performance issues
- Ability to identify & analyze resource-heavy queries/processes and systematically optimize them (indexing, hints, plan guides, code refactoring, re-architecture where needed)
- Hands-on expertise in designing, deploying, and troubleshooting Windows Failover Clustering (WSFC), Always On Availability Groups (AGs), Failover Cluster Instances (FCI), Log Shipping, Mirroring & Replication
- Experience with AWS Cloud, EC2, Instance Families/Sizes, EBS Volumes, IOPS, Throughput, Networking, and Multi‑AZ Designs in relation to MSSQL Server Performance, HA/DR & Monitoring & Observability
- Strong experience with database monitoring tools and enterprise level monitoring/observability expectations, including SQL Server dashboards (performance, waits, blocking, deadlocks, availability), Focus on Implementing actionable Alerting to reduce the noise
- Experience with T-SQL, PowerShell & SQL Server Agent etc.
- Having experience with Reporting Services (SSRS) performance and optimization will be preferred
- Strong communication skills with the ability to explain complex database issues to non-DBA stakeholders and mentor developers on SQL best practices.
Responsibilities:
- Optimize database architecture, configuration, indexing strategy, statistics, TempDB, log & data file layout in line with SQL Server best practices
- Review MS SQL Server Configurations, Analyze and resolve performance bottlenecks across CPU, memory, I/O, and network
- Conduct wait stats, DMVs, execution plans, and XEvents analysis to identify root causes
- Review and refine MAXDOP, cost threshold for parallelism, memory settings, TempDB configuration, and disk layout to improve performance and stability
- Review, tune, and refactor complex SQL and T‑SQL stored procedures, views, and functions
- Identify and resolve locking, blocking, deadlocks, transactional contentions, and long-running queries by optimizing indexing and/or query tuning and rewrites
- Assess and document existing HA/DR implementations across clusters, Troubleshoot and stabilize cluster switchovers/failovers, including issues seen during planned and unplanned events
- Define and implement HA/DR designs to meet RPO/RTO targets & Develop/Test failover/runbooks
- Improve the Proactive & Reactive Monitoring / Observability with recommending/implementing comprehensive monitoring & alerting across SQL Server health and performance, HADR/Cluster state, sync latency etc.
- Recommend and implement best-fit monitoring tools (e.g. advanced Datadog usage, plus tools such as SQL Sentry, Redgate SQL Monitor, SolarWinds DPA, native SQL tools, etc.).
- Incident Response & Root Cause Analysis as lead on critical incident investigations for production performance and availability issues
- Recommendations based on Architecture, Design Best Practices by reviewing existing SQL Server architecture, and configurations and bring them in line with current best practices
- Advise on capacity planning, scaling strategies, backup/restore strategy, indexing & maintenance plans
- Automate repetitive tasks e.g., index maintenance, stats updates, log cleanups, data archiving
- Contribute to documentation, standards, and knowledge sharing for DB usage across teams
What We Offer
- Competitive salary and benefits
- Fuel expense reimbursement
- Paid holidays and vacations
- Medical outpatient reimbursement & health insurance facility
- Career path to make a difference in a highly productive environment
Apply for this position
Required*