Performance Testing Plan#
Define Testing Objectives#
- Business Requirements: Identify performance metrics (response time, throughput, error rate, etc.) for key business scenarios (such as login, payment, query, etc.).
- Performance Metrics: Set baseline values (e.g., 95% of requests should respond in under 2 seconds), peak values (e.g., support 100,000 concurrent users), and fault tolerance.
- Non-functional Requirements: Stability, scalability, resource utilization (CPU, memory, disk I/O, network bandwidth), etc.
Testing Scope#
- System Scope: Boundaries of the system under test (SUT) (frontend, backend, database, third-party services, etc.).
- Testing Types: Load testing, stress testing, stability testing, capacity testing, etc.
Testing Environment#
- Environment Setup: Ensure consistency with the production environment as much as possible (hardware configuration, network topology, database size).
- Data Preparation: Use real or simulated data (must cover typical scenarios, avoid data skew).
Scenario Design#
- Single Scenario Testing: Performance testing for a single function (e.g., user login).
- Mixed Scenario Testing: Simulate real user behavior (e.g., logging in, placing orders, querying simultaneously).
- Peak Testing: Simulate burst traffic (e.g., flash sales).
- Stability Testing: Run for an extended period (e.g., 7×24 hours) to observe memory leaks or performance degradation.
Execution Strategy#
- Gradual Load Increase: Start from low load and gradually increase to peak load, observing performance inflection points.
- Multiple Iterations: Repeat testing after optimizations to validate improvements.
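The gradual ramp described above can be scripted instead of driven by hand. Below is a minimal sketch using Locust's LoadTestShape (Locust is covered under tools later); the step size, step duration, and peak user count are illustrative assumptions, not figures from this plan.

```python
from locust import HttpUser, task, constant, LoadTestShape


class QueryUser(HttpUser):
    """Virtual user that repeatedly calls a placeholder query endpoint."""
    wait_time = constant(1)

    @task
    def query(self):
        self.client.get("/api/search?q=phone")  # assumed endpoint


class StepLoadShape(LoadTestShape):
    """Increase load in steps so the performance inflection point becomes visible."""
    step_users = 100      # add 100 users per step (assumption)
    step_duration = 60    # hold each step for 60 seconds (assumption)
    max_users = 1000      # assumed peak load

    def tick(self):
        run_time = self.get_run_time()
        step = int(run_time // self.step_duration) + 1
        users = min(step * self.step_users, self.max_users)
        # Stop after holding the peak for one extra step.
        if run_time > (self.max_users / self.step_users + 1) * self.step_duration:
            return None
        return users, self.step_users  # (target user count, spawn rate per second)
```

Run headless with something like `locust -f step_load.py --host https://sut.example.com --headless` and watch where RT and error rate start to bend as each step is added.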
Result Analysis and Reporting#
- Performance Baseline: Record the current performance state as a baseline.
- Problem Localization: Combine logs and monitoring data to identify bottlenecks (e.g., slow database queries, code deadlocks).
- Optimization Suggestions: Propose optimization solutions at the code, configuration, or architecture level.
Performance Testing Focus Areas#
System Level#
- Utilization rates of CPU, memory, disk I/O, and network bandwidth.
- Operating system parameter configurations (e.g., Linux kernel parameters).
Application Level#
- Code execution efficiency (e.g., algorithm complexity, thread pool configuration).
- Database performance (slow queries, missing indexes, connection pool configuration).
- Middleware performance (e.g., Redis cache hit rate, message queue backlog).
Network Level#
- Latency, packet loss rate, bandwidth bottlenecks.
- Performance of CDN or load balancers.
Business Level#
- User concurrency, TPS (transactions per second), RT (response time).
- Business success rate (e.g., error rate not exceeding 0.1%).
Performance Testing Methods#
Benchmark Test#
- Purpose: Determine the performance baseline of the system under normal load.
- Method: Single user or low concurrency scenario testing.
Load Test#
- Purpose: Validate system performance under expected load.
- Method: Gradually increase the number of concurrent users, observing response time and resource consumption.
Stress Test#
- Purpose: Identify the system's performance limits and breaking points.
- Method: Continuously apply pressure until the system crashes, recording maximum capacity.
Concurrency Test#
- Purpose: Validate resource contention issues when multiple users operate simultaneously.
- Method: Simulate high concurrency scenarios (e.g., flash sales).
Endurance Test#
- Purpose: Detect memory leaks or performance degradation during prolonged operation.
- Method: Run continuously for over 12 hours, observing resource usage trends.
Capacity Test#
- Purpose: Determine the maximum processing capacity of the system, providing a basis for scaling.
- Method: Gradually increase data volume or user count until performance standards are not met.
Common Performance Testing Tools#
Load Generation Tools#
- JMeter (Open Source):
- Supports various protocols like HTTP, JDBC, FTP, etc.
- Highly extensible, supports BeanShell scripts and plugins.
- Suitable for web applications and API testing.
- LoadRunner (Commercial):
- Comprehensive functionality, supports complex scenario recording and IP spoofing.
- Suitable for enterprise-level large systems.
- Gatling (Open Source):
- Based on Scala, high performance, intuitive reporting.
- Suitable for high concurrency scenarios.
- Locust (Open Source):
- Based on Python, distributed load testing, supports custom scripts.
- Highly flexible.
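For a sense of the scripting model behind these tools, here is a minimal Locust sketch; the endpoint is a placeholder for an API of the system under test.

```python
from locust import HttpUser, task, between


class ApiUser(HttpUser):
    """Each simulated user repeatedly calls one endpoint with 1-3 s of think time."""
    wait_time = between(1, 3)

    @task
    def get_product_detail(self):
        # Placeholder path; replace with a real API of the system under test.
        self.client.get("/api/products/1001")
```

Started with `locust -f locustfile.py --host https://sut.example.com`, the web UI (default port 8089) lets you choose user count and spawn rate interactively.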
Monitoring and Analysis Tools#
- Prometheus + Grafana:
- Real-time monitoring of system resources, application metrics, and custom metrics.
- Visual representation of performance trends.
- APM Tools (e.g., New Relic, SkyWalking):
- Trace code-level performance issues (e.g., slow SQL, method execution time).
- JVM Monitoring Tools (e.g., JConsole, VisualVM):
- Analyze Java application memory, threads, and GC status.
- Database Tools (e.g., Percona Toolkit, MySQL slow query log):
- Identify SQL performance issues.
Cloud Testing Platforms#
- BlazeMeter: Cloud-based load testing service compatible with JMeter scripts.
- AWS Load Testing: Distributed load testing solution based on AWS.
Test Scenarios and Test Cases#
Based on Testing Focus Areas#
System Level#
1. CPU Utilization
- Scenario: High concurrent user requests lead to increased CPU load.
- Use Case: Simulate 1000 concurrent users executing query operations for 10 minutes.
- Steps:
- Configure a JMeter thread group with 1000 threads that loops over the query API.
- Monitor server CPU utilization (e.g., via Prometheus + Grafana).
- Expected Result: CPU utilization peak ≤ 80%, no sustained full load.
- Pass Criteria: CPU did not reach bottleneck, system did not crash.
2. Memory Leak
- Scenario: Memory not released after long system operation.
- Use Case: Continuously run the system for 48 hours, simulating user login, operations, and logout processes.
- Steps:
- Use Locust to simulate daily active user behavior (e.g., 500 users per hour).
- Monitor JVM heap memory (e.g., via VisualVM).
- Expected Result: Memory usage curve stable, no sustained growth trend.
- Pass Criteria: Memory utilization fluctuation within ±5%.
3. Disk I/O Performance
- Scenario: Large file uploads/downloads lead to disk read/write bottlenecks.
- Use Case: Simulate 100 users uploading a 1GB file simultaneously.
- Steps:
- Use JMeter's HTTP request to upload large files.
- Monitor disk IOPS and read/write latency (e.g., via the Linux `iostat` command).
- Expected Result: Disk utilization ≤ 90%, average latency ≤ 50ms.
- Pass Criteria: File uploads successful with no timeout errors.
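For the CPU utilization case (item 1 above), the 80% pass check can be automated by querying Prometheus directly rather than reading Grafana by eye. A sketch assuming a standard node_exporter setup; the Prometheus address is a placeholder, and the PromQL is the common node_exporter idiom for average CPU utilization.

```python
import requests

PROMETHEUS = "http://prometheus.example.com:9090"  # assumed address
# Average CPU utilization over the last 5 minutes (node_exporter idiom).
QUERY = '100 * (1 - avg(rate(node_cpu_seconds_total{mode="idle"}[5m])))'
THRESHOLD = 80.0  # pass criterion from the test case


def cpu_utilization_percent() -> float:
    resp = requests.get(f"{PROMETHEUS}/api/v1/query", params={"query": QUERY}, timeout=10)
    resp.raise_for_status()
    result = resp.json()["data"]["result"]
    return float(result[0]["value"][1]) if result else 0.0


if __name__ == "__main__":
    util = cpu_utilization_percent()
    print(f"CPU utilization: {util:.1f}% -> {'PASS' if util <= THRESHOLD else 'FAIL'}")
```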
Application Level#
1. Slow Database Queries
- Scenario: Complex SQL queries lead to long response times.
- Use Case: Execute multi-table join queries (100,000 records).
- Steps:
- Trigger query interface through the application, recording SQL execution time.
- Check MySQL slow query log for the recorded statement.
- Expected Result: Query time ≤ 2 seconds, no records in the slow query log.
- Pass Criteria: Query time decreased by 50% after optimizing indexes.
2. Redis Cache Hit Rate
- Scenario: Cache expiration leads to frequent database hits.
- Use Case: Simulate a read scenario with a 90% cache hit rate.
- Steps:
- Use Gatling to simulate users frequently reading hot data (e.g., product details).
- Monitor Redis's `keyspace_hits` and `keyspace_misses` metrics.
- Expected Result: Cache hit rate ≥ 90%.
- Pass Criteria: Significant reduction in database query counts (e.g., 80% decrease).
3. Thread Pool Configuration
- Scenario: A small thread pool leads to request queuing.
- Use Case: Simulate burst traffic (e.g., 500 concurrent requests, thread pool capacity 100).
- Steps:
- Use JMeter to send 500 concurrent requests to the backend service.
- Monitor active thread count and queue backlog (e.g., via Spring Boot Actuator).
- Expected Result: Queue wait time ≤ 1 second, no requests rejected.
- Pass Criteria: Throughput increased by 30% after adjusting the thread pool.
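For the Redis cache hit rate case (item 2 above), the hit rate follows directly from the two counters via redis-py; the host is a placeholder, and since the counters are cumulative you would typically reset them (CONFIG RESETSTAT) before the measurement window.

```python
import redis

# Connection details are placeholders for the environment under test.
r = redis.Redis(host="redis.example.com", port=6379, decode_responses=True)

stats = r.info("stats")  # the INFO "stats" section contains the keyspace counters
hits = stats["keyspace_hits"]
misses = stats["keyspace_misses"]
total = hits + misses

hit_rate = hits / total if total else 0.0
print(f"keyspace_hits={hits}, keyspace_misses={misses}, hit rate={hit_rate:.2%}")
print("PASS" if hit_rate >= 0.90 else "FAIL")  # >= 90% per the expected result
```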
Network Level#
1. High Latency Scenario
- Scenario: Cross-regional access leads to increased network latency.
- Use Case: Call APIs from different regions (e.g., East US, Europe).
- Steps:
- Use BlazeMeter to generate load from nodes in multiple geographic regions.
- Measure average response time (RT).
- Expected Result: RT ≤ 3 seconds (cross-border access).
- Pass Criteria: After enabling CDN, RT drops to ≤ 1 second.
2. Bandwidth Bottleneck
- Scenario: Video streaming services consume a large amount of bandwidth.
- Use Case: Simulate 1000 users watching 1080P live streams simultaneously.
- Steps:
- Use LoadRunner to simulate video stream requests.
- Monitor server outbound bandwidth (e.g., via `iftop`).
- Expected Result: Bandwidth utilization ≤ 80%, no stuttering.
- Pass Criteria: After enabling load balancing, bandwidth distribution is balanced.
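A quick sanity check for the bandwidth case (item 2 above): estimate whether the uplink can carry the load at all before running the test. The ~5 Mbps per 1080P stream and the 10 Gbps uplink below are illustrative assumptions, not measured values.

```python
# Back-of-envelope bandwidth check for the 1080P streaming scenario.
USERS = 1000
MBPS_PER_STREAM = 5.0          # assumed average 1080P bitrate
LINK_CAPACITY_GBPS = 10.0      # assumed server uplink
UTILIZATION_LIMIT = 0.80       # pass criterion from the test case

required_gbps = USERS * MBPS_PER_STREAM / 1000
utilization = required_gbps / LINK_CAPACITY_GBPS
print(f"Required: {required_gbps:.1f} Gbps, utilization: {utilization:.0%}")
print("within limit" if utilization <= UTILIZATION_LIMIT else "needs load balancing or more uplinks")
```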
Business Level#
1. High Concurrency Ordering (E-commerce Scenario)
- Scenario: Flash sales lead to a sudden surge in traffic.
- Use Case: Simulate 10,000 users simultaneously competing to purchase 100 units of stock.
- Steps:
- Configure 10,000 threads in JMeter to start within 1 second.
- Verify order success rate and inventory consistency.
- Expected Result: Successful order count = 100, no overselling or duplicate orders.
- Pass Criteria: No dirty reads after optimizing database transaction isolation levels.
2. Business Success Rate
- Scenario: Error rates of payment interfaces under high load.
- Use Case: Simulate 500 payment requests per second (for 5 minutes).
- Steps:
- Use Gatling to send payment requests, monitoring HTTP status codes.
- Calculate error rate (e.g., percentage of 5xx errors).
- Expected Result: Error rate ≤ 0.1%.
- Pass Criteria: Error rate drops to 0% after triggering the circuit breaker mechanism.
3. Mixed Scenario Performance
- Scenario: Users simultaneously perform login, browsing, and ordering operations.
- Use Case: Simulate 70% of users browsing, 20% adding to cart, and 10% making payments.
- Steps:
- Use Locust to proportionally allocate behavior models.
- Monitor overall TPS and RT (e.g., 95th percentile ≤ 2 seconds).
- Expected Result: TPS ≥ 500, no blocking at any step of the business flow.
- Pass Criteria: Response time fluctuations for each interface within ±10%.
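The 70/20/10 split in the mixed scenario above maps naturally onto Locust task weights. A sketch with placeholder endpoints and credentials:

```python
from locust import HttpUser, task, between


class ShopperUser(HttpUser):
    """Mixed scenario: task weights approximate the 70/20/10 behavior split."""
    wait_time = between(1, 5)

    def on_start(self):
        # Runs once per simulated user; credentials are placeholders.
        self.client.post("/api/login", json={"user": "demo", "password": "demo"})

    @task(7)
    def browse(self):
        self.client.get("/api/products?page=1")

    @task(2)
    def add_to_cart(self):
        self.client.post("/api/cart", json={"product_id": 1001, "qty": 1})

    @task(1)
    def pay(self):
        self.client.post("/api/orders/checkout", json={"cart_id": "demo"})
```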
Based on Testing Methods#
Benchmark Test#
Objective
Determine the performance baseline of the system under low load or single user scenarios, providing a comparison basis for subsequent tests.
Scenario
- Single User Operations: No concurrent pressure, testing the system's optimal performance.
- Typical Use Cases: User login, product detail page loading, simple queries.
Test Case
Use Case 1: Single User Login Response Time
- Steps:
- Configure 1 thread (user) in JMeter and loop the login API request 10 times.
- Record the response time (RT) for each request.
- Expected Result:
- Average RT ≤ 500ms, standard deviation ≤ 50ms.
- Pass Criteria:
- No HTTP errors, RT fluctuations within 10%.
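The same single-user baseline can be captured outside JMeter with a short script that times sequential login calls and reports the mean and standard deviation; the URL and payload are placeholders, and the thresholds come from the use case.

```python
import statistics
import time

import requests

LOGIN_URL = "https://sut.example.com/api/login"   # placeholder endpoint
PAYLOAD = {"user": "demo", "password": "demo"}    # placeholder credentials
ITERATIONS = 10

samples_ms = []
for _ in range(ITERATIONS):
    start = time.perf_counter()
    resp = requests.post(LOGIN_URL, json=PAYLOAD, timeout=5)
    resp.raise_for_status()
    samples_ms.append((time.perf_counter() - start) * 1000)

mean = statistics.mean(samples_ms)
stdev = statistics.stdev(samples_ms)
print(f"mean RT: {mean:.0f} ms, stdev: {stdev:.0f} ms")
print("PASS" if mean <= 500 and stdev <= 50 else "FAIL")  # thresholds from the use case
```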
Load Test#
Objective
Validate system performance under expected maximum load, such as whether response time and throughput meet requirements.
Scenario
- Expected Concurrent Users: Simulate daily peak traffic (e.g., 5000 concurrent users during e-commerce promotions).
- Typical Use Cases: Multiple users placing orders, querying inventory, making payments simultaneously.
Test Case
Use Case 2: 5000 Concurrent Users Product Search
- Steps:
- Use LoadRunner to simulate 5000 users simultaneously executing product keyword searches (e.g., "mobile").
- Gradually increase load: add 500 users every 30 seconds until reaching 5000 concurrent users.
- Monitor metrics: TPS (transactions per second), RT (95th percentile), error rate.
- Expected Result:
- TPS ≥ 1000, RT ≤ 2 seconds, error rate ≤ 0.5%.
- Pass Criteria:
- Throughput remains stable; no resource (CPU/memory) is sustained above 80% utilization.
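Regardless of the load tool, the pass check reduces to TPS, 95th-percentile RT, and error rate over the measurement window. A sketch that computes them from a simple per-request results file; the CSV column names are assumptions (JMeter's JTL format is similar but not identical).

```python
import csv
from statistics import quantiles

# Hypothetical results file: one row per request with epoch_ms, elapsed_ms, status.
with open("results.csv", newline="") as f:
    rows = [(int(r["epoch_ms"]), int(r["elapsed_ms"]), r["status"]) for r in csv.DictReader(f)]

duration_s = (max(t for t, _, _ in rows) - min(t for t, _, _ in rows)) / 1000 or 1
elapsed = [e for _, e, _ in rows]
errors = sum(1 for _, _, s in rows if not s.startswith("2"))

tps = len(rows) / duration_s
p95 = quantiles(elapsed, n=100)[94]  # 95th percentile of response time
error_rate = errors / len(rows)

print(f"TPS={tps:.0f}, p95 RT={p95:.0f} ms, error rate={error_rate:.2%}")
print("PASS" if tps >= 1000 and p95 <= 2000 and error_rate <= 0.005 else "FAIL")
```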
Stress Test#
Objective
Identify the system's performance limits and breaking points, validating fault tolerance under overload.
Scenario
- Exceeding Expected Load: Simulate burst traffic far exceeding design capacity (e.g., 10 times the daily peak).
- Typical Use Cases: Flash sale traffic surge, API being maliciously bombarded.
Test Case
Use Case 3: 100,000 Concurrent Users Flash Sale Stress Test
- Steps:
- Configure 100,000 virtual users in Gatling to hit the flash sale endpoint all at once.
- Continuously apply pressure until the system crashes (e.g., large amounts of 5xx errors or service unavailability).
- Record the concurrency level, error types, and recovery time at the point of crash.
- Expected Result:
- The system should handle at least 80,000 concurrent users before crashing, with automatic recovery within 10 minutes.
- Pass Criteria:
- Key services (e.g., order inventory) show no data inconsistencies.
Concurrency Test#
Objective
Validate resource contention and synchronization issues (e.g., deadlocks, dirty reads) when multiple users operate simultaneously.
Scenario
- High Contention Operations: The same resource is frequently modified (e.g., inventory deduction, account balance updates).
- Typical Use Cases: Multiple people modifying the same order, ticket grabbing scenarios.
Test Case
Use Case 4: 100 Users Concurrently Modifying the Same Product Inventory
- Steps:
- Configure 100 threads in JMeter to simultaneously send requests to "deduct inventory by 1".
- Initial inventory is 100; verify the final inventory is 0.
- Check the database transaction logs for deadlocks or timeouts.
- Expected Result:
- Final inventory = 0, no overselling or negative inventory.
- Pass Criteria:
- Concurrency control (e.g., optimistic locking or an appropriate transaction isolation level) prevents dirty reads and lost updates.
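One common way to satisfy this pass criterion is a conditional-update (optimistic) stock deduction, where the WHERE clause prevents the counter from ever going negative; a version-column compare-and-swap is the other usual variant. A sketch with pymysql; the connection details and the products/stock table and column names are assumptions.

```python
import pymysql

# Placeholder connection details for the test database.
conn = pymysql.connect(host="db.example.com", user="test", password="test",
                       database="shop", autocommit=False)


def deduct_stock(product_id: int) -> bool:
    """Atomically deduct one unit; returns False once stock is exhausted."""
    with conn.cursor() as cur:
        # `stock > 0` makes the update a no-op when stock hits zero, so with an
        # initial stock of 100 at most 100 requests succeed: no overselling,
        # no negative inventory, regardless of concurrency.
        affected = cur.execute(
            "UPDATE products SET stock = stock - 1 WHERE id = %s AND stock > 0",
            (product_id,),
        )
    conn.commit()
    return affected == 1
```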
Endurance Test#
Objective
Detect performance degradation (e.g., memory leaks, connection pool exhaustion) during prolonged system operation.
Scenario
- Continuous Load Operation: Simulate the system providing 7×24 hours of service, processing requests without interruption.
- Typical Use Cases: Financial system end-of-day batch processing, IoT devices continuously reporting data.
Test Case
Use Case 5: 72 Hours Continuous Order Processing
- Steps:
- Use Locust to simulate 1000 users placing orders every hour for 72 hours.
- Monitor JVM heap memory, database connection pool usage, and thread leaks.
- Restart the load testing tool every 12 hours to avoid resource leaks in the tool itself.
- Expected Result:
- Memory utilization fluctuation ≤ 10%, no OOM (out of memory) errors.
- Pass Criteria:
- Throughput (TPS) decline ≤ 5%.
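For the 72-hour run, the memory-fluctuation criterion is easiest to judge from a periodic sample written to a file. A sketch that polls the service's resident memory with psutil; the PID, interval, and output path are assumptions, and process RSS is a coarser signal than the JVM heap the plan inspects via VisualVM.

```python
import csv
import time

import psutil  # third-party; pip install psutil

PID = 12345            # assumed PID of the service under test
INTERVAL_S = 60        # sample once per minute
OUTPUT = "memory_trend.csv"

proc = psutil.Process(PID)
with open(OUTPUT, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["epoch", "rss_mb"])
    while proc.is_running():
        rss_mb = proc.memory_info().rss / (1024 * 1024)
        writer.writerow([int(time.time()), round(rss_mb, 1)])
        f.flush()
        time.sleep(INTERVAL_S)
```

Plotting the resulting CSV (or loading it into Grafana) makes a sustained upward trend, the classic leak signature, obvious at a glance.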
Capacity Test#
Objective
Determine the maximum processing capacity of the system (user count/data volume), providing a basis for scaling.
Scenario
- Data Volume Expansion: Grow a single database table from millions of rows to hundreds of millions.
- User Volume Expansion: Gradually increase registered users from 100,000 to 1,000,000.
Test Case
Use Case 6: Query Performance Under Hundreds of Millions of Data Volume
- Steps:
- Insert 100 million test records into the database, building composite indexes.
- Use JMeter to simulate 100 concurrent users executing paginated queries (page=10000, size=10).
- Monitor SQL execution time and disk I/O.
- Expected Result:
- Query time ≤ 1 second, index coverage rate reaches 100%.
- Pass Criteria:
- Query performance improved by 50% after sharding.
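The SQL execution time in the steps above can also be captured from the client side. A sketch with pymysql; the table, columns, and connection details are assumptions that mirror the use case's page=10000, size=10 pagination.

```python
import time

import pymysql

# Placeholder connection details for the test database.
conn = pymysql.connect(host="db.example.com", user="test", password="test", database="shop")

PAGE, SIZE = 10000, 10                 # deep page from the use case
OFFSET = (PAGE - 1) * SIZE

with conn.cursor() as cur:
    start = time.perf_counter()
    cur.execute(
        "SELECT id, name, price FROM products ORDER BY id LIMIT %s OFFSET %s",
        (SIZE, OFFSET),
    )
    rows = cur.fetchall()
    elapsed = time.perf_counter() - start

print(f"fetched {len(rows)} rows in {elapsed * 1000:.0f} ms")
print("PASS" if elapsed <= 1.0 else "FAIL")  # 1-second criterion from the use case
```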
Best Practices#
- Layered Design:
- Foundation Layer: Use benchmark tests to cover the single-user performance of core interfaces (Focus: response time).
- Business Layer: Use load/stress tests to validate complex scenarios (Focus: throughput, error rate).
- System Layer: Use stability tests to check for resource leaks (Focus: memory, connection pool).
- Matrix Coverage Method: Combine focus areas and methods into a matrix to ensure each focus area is covered by at least one method. For example:

| Focus Area | Load Test | Stress Test | Stability Test |
| --- | --- | --- | --- |
| CPU Utilization | ✅ | ✅ | |
| Memory Leak | | | ✅ |
| Database Performance | ✅ | ✅ | |