# Large-Scale Database Seeding Guide

This guide explains how to seed your Laravel application with millions of rows for performance testing and development.

## Overview

The large-scale seeding system populates your database with realistic test data in batches, which keeps memory usage bounded and insert throughput high.

## Features

- **Batch Processing**: Processes data in configurable batches to manage memory usage
- **Progress Tracking**: Real-time progress bars and performance metrics
- **Memory Optimization**: Automatic garbage collection and memory management
- **Database Optimization**: Automatic database configuration for bulk operations
- **Performance Monitoring**: Built-in performance testing and benchmarking

## Quick Start

### Basic Usage

```bash
# Seed with default settings (100K users, 500K orders, 300K bookings)
php artisan db:seed-large

# Seed with custom amounts
php artisan db:seed-large --users=1000000 --orders=2000000 --bookings=1000000

# Fresh migration before seeding
php artisan db:seed-large --fresh

# Custom batch size and memory limit
php artisan db:seed-large --batch-size=2000 --memory-limit=4G
```

### Available Options

| Option | Default | Description |
|--------|---------|-------------|
| `--users` | 100,000 | Number of users to create |
| `--orders` | 500,000 | Number of orders to create |
| `--bookings` | 300,000 | Number of bookings to create |
| `--batch-size` | 1,000 | Batch size for processing |
| `--memory-limit` | 2G | Memory limit for the process |
| `--fresh` | false | Run fresh migration before seeding |
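
The `--memory-limit` values shown above (`2G`, `4G`) follow PHP's `memory_limit` shorthand. As a minimal sketch, assuming the command needs the limit as a byte count for its own checks, the conversion could look like this. `memoryLimitToBytes()` is a hypothetical helper for illustration, not part of the seeder:

```php
<?php

/**
 * Convert a PHP shorthand size such as "512M" or "2G" to bytes.
 * Hypothetical helper for illustration; "-1" (no limit) is returned as -1.
 */
function memoryLimitToBytes(string $value): int
{
    $value = trim($value);
    if ($value === '-1') {
        return -1;
    }
    // Digits followed by an optional K, M or G suffix (case-insensitive).
    if (!preg_match('/^(\d+)\s*([KMG]?)$/i', $value, $m)) {
        throw new InvalidArgumentException("Unrecognised memory limit: {$value}");
    }
    $multipliers = ['' => 1, 'K' => 1024, 'M' => 1024 ** 2, 'G' => 1024 ** 3];

    return (int) $m[1] * $multipliers[strtoupper($m[2])];
}

echo memoryLimitToBytes('2G'), PHP_EOL;
```

Rejecting unknown suffixes up front avoids silently running with a much smaller limit than intended.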

## Performance Optimization

### Database Configuration

Before running large-scale seeding, optimize your database configuration:

#### MySQL/MariaDB

```sql
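-- Session settings: run these on the same connection that performs the seeding
SET FOREIGN_KEY_CHECKS = 0;
SET UNIQUE_CHECKS = 0;
SET AUTOCOMMIT = 0;  -- requires an explicit COMMIT, or the inserts are rolled back
SET SESSION sort_buffer_size = 256*1024*1024;
-- key_buffer_size is global-only and affects MyISAM indexes, not InnoDB
SET GLOBAL key_buffer_size = 512*1024*1024;
-- After seeding: COMMIT; SET FOREIGN_KEY_CHECKS = 1; SET UNIQUE_CHECKS = 1;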
```

#### PostgreSQL

```sql
-- Session-level settings: they apply only to the connection that runs them
SET work_mem = '256MB';
SET maintenance_work_mem = '1GB';
SET synchronous_commit = off;
```

#### SQLite

```sql
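-- Enable WAL mode
PRAGMA journal_mode = WAL;
-- Cache size in pages (a negative value would be read as KiB)
PRAGMA cache_size = 10000;
-- Faster, but the database can be corrupted if the machine loses power mid-write
PRAGMA synchronous = OFF;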
```

### PHP Configuration

Artisan runs under the PHP CLI, so update the CLI `php.ini` (locate it with `php --ini`) to lift the limits that can stop a long run:

```ini
memory_limit = 2G
max_execution_time = 0
max_input_time = -1
```

### Laravel Configuration

Add these to your `.env` file:

```env
# Seeder Configuration
SEEDER_BATCH_SIZE=1000
SEEDER_TOTAL_USERS=100000
SEEDER_TOTAL_ORDERS=500000
SEEDER_TOTAL_BOOKINGS=300000

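# Database Optimization (in Laravel's default config/database.php this variable
# toggles foreign key constraints for the SQLite connection only)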
DB_FOREIGN_KEYS=false
```
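
How the seeder reads these variables is not shown here. A minimal sketch, assuming each `SEEDER_*` value falls back to the defaults from the options table when it is missing or malformed; `seederSettings()` is a hypothetical helper, and in a Laravel app the same mapping would normally live in a config file that calls `env()`:

```php
<?php

/**
 * Resolve seeder settings from environment values, falling back to the
 * documented defaults. Hypothetical helper written as plain PHP so it runs
 * outside the framework.
 */
function seederSettings(array $env): array
{
    $defaults = [
        'SEEDER_BATCH_SIZE'     => 1000,
        'SEEDER_TOTAL_USERS'    => 100000,
        'SEEDER_TOTAL_ORDERS'   => 500000,
        'SEEDER_TOTAL_BOOKINGS' => 300000,
    ];

    $settings = [];
    foreach ($defaults as $key => $default) {
        $raw = $env[$key] ?? null;
        // Environment values arrive as strings; accept only plain non-negative integers.
        $valid = is_string($raw) && preg_match('/^\d+$/', $raw) === 1;
        $settings[$key] = $valid ? (int) $raw : $default;
    }

    if ($settings['SEEDER_BATCH_SIZE'] < 1) {
        throw new InvalidArgumentException('SEEDER_BATCH_SIZE must be at least 1');
    }

    return $settings;
}

// getenv() without arguments returns all environment variables as an array.
$settings = seederSettings(getenv());
```

Validating the batch size early means a typo in `.env` fails at startup rather than partway through a long run.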

## Performance Testing

Run performance tests to find optimal settings:

```bash
# Run performance tests
php artisan db:seed --class=PerformanceSeeder
```

This will test:
- Different batch sizes (100, 500, 1000, 2000, 5000)
- Different insert methods (Eloquent, Raw SQL)
- Memory usage patterns
- Database query performance
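
The internals of `PerformanceSeeder` are not shown in this guide. As a rough sketch of the batch-size comparison, the harness below times a callback at each size. `benchmarkBatchSizes()` and the row shape are hypothetical, and the callback is a stand-in for a real bulk insert:

```php
<?php

/**
 * Time a bulk-insert callback at several batch sizes.
 *
 * $insertBatch receives one array of rows per call; in a real seeder it
 * would wrap something like DB::table('users')->insert($rows).
 * Returns [batchSize => ['seconds' => float, 'rows_per_second' => float]].
 */
function benchmarkBatchSizes(callable $insertBatch, array $batchSizes, int $totalRows): array
{
    $results = [];
    foreach ($batchSizes as $batchSize) {
        $start = hrtime(true); // monotonic clock, in nanoseconds

        for ($offset = 0; $offset < $totalRows; $offset += $batchSize) {
            $count = min($batchSize, $totalRows - $offset);
            $rows = [];
            for ($i = 0; $i < $count; $i++) {
                $id = $offset + $i + 1;
                $rows[] = ['id' => $id, 'name' => 'user_' . $id];
            }
            $insertBatch($rows);
        }

        $seconds = (hrtime(true) - $start) / 1e9;
        $results[$batchSize] = [
            'seconds' => $seconds,
            'rows_per_second' => $seconds > 0 ? $totalRows / $seconds : INF,
        ];
    }

    return $results;
}

// Stand-in sink so the sketch runs without a database.
$inserted = 0;
$results = benchmarkBatchSizes(
    function (array $rows) use (&$inserted): void {
        $inserted += count($rows);
    },
    [100, 500, 1000, 2000, 5000],
    10000
);

foreach ($results as $batchSize => $stats) {
    printf("batch %5d: %.4fs\n", $batchSize, $stats['seconds']);
}
```

With a real database behind the callback, the timings usually improve with batch size up to a point and then flatten, which is the setting worth picking.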

## Memory Management

The seeder automatically manages memory by:

1. **Batch Processing**: Processing data in small batches
2. **Garbage Collection**: Running `gc_collect_cycles()` after each batch
3. **Memory Monitoring**: Tracking memory usage and providing warnings
4. **Unset Variables**: Clearing large arrays after use
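
The four steps above can be sketched in plain PHP as follows. This is an illustration under assumptions, not the seeder's actual code: `generateRows()`, `seedInBatches()`, the warning threshold and the callbacks are all hypothetical names:

```php
<?php

/** Yield rows lazily so only one batch is held in memory at a time. */
function generateRows(int $total): Generator
{
    for ($i = 1; $i <= $total; $i++) {
        yield ['id' => $i, 'email' => "user{$i}@example.test"];
    }
}

/**
 * Feed rows to $insertBatch in fixed-size batches, freeing memory after each.
 * $warnAtBytes and $onWarning are hypothetical hooks for this sketch.
 * Returns the number of batches written.
 */
function seedInBatches(
    iterable $rows,
    int $batchSize,
    callable $insertBatch,
    int $warnAtBytes = PHP_INT_MAX,
    ?callable $onWarning = null
): int {
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1');
    }

    $batch = [];
    $batches = 0;

    $flush = function () use (&$batch, &$batches, $insertBatch, $warnAtBytes, $onWarning): void {
        $insertBatch($batch);
        $batches++;
        $batch = [];          // clear the large array once it has been written
        gc_collect_cycles();  // collect any reference cycles left behind

        $used = memory_get_usage(true);
        if ($onWarning !== null && $used >= $warnAtBytes) {
            $onWarning($used); // memory monitoring hook
        }
    };

    foreach ($rows as $row) {
        $batch[] = $row;
        if (count($batch) >= $batchSize) {
            $flush();
        }
    }
    if ($batch !== []) {
        $flush(); // final partial batch
    }

    return $batches;
}

// Usage: 2,500 rows at 1,000 per batch are written as 1000 + 1000 + 500.
$written = 0;
$batches = seedInBatches(generateRows(2500), 1000, function (array $rows) use (&$written): void {
    $written += count($rows);
});
```

Using a generator as the row source matters as much as the batching: building the full row array first would defeat the memory savings.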

## Expected Performance

Based on testing with different configurations:

| Records | Batch Size | Memory Usage | Time (approx) |
|---------|------------|--------------|---------------|
| 100K users | 1,000 | 128MB | 2-3 minutes |
| 500K orders | 1,000 | 256MB | 5-8 minutes |
| 1M users | 2,000 | 512MB | 8-12 minutes |
| 5M records | 5,000 | 1GB | 20-30 minutes |

*Figures are approximate; performance varies with hardware, database engine, and configuration.*
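
To compare these figures with your own runs, it can help to convert them into batch counts and rows per second. A small worked sketch using the first and last table rows; `batchCount()` and `rowsPerSecond()` are hypothetical helpers:

```php
<?php

/** Number of insert batches needed for $rows rows at $batchSize rows each. */
function batchCount(int $rows, int $batchSize): int
{
    return intdiv($rows + $batchSize - 1, $batchSize); // integer ceiling
}

/** Average throughput in rows per second for a run lasting $minutes minutes. */
function rowsPerSecond(int $rows, float $minutes): float
{
    return $rows / ($minutes * 60);
}

// 100K users at batch size 1,000, taking 2-3 minutes.
printf(
    "100K users: %d batches, %.0f-%.0f rows/s\n",
    batchCount(100000, 1000),
    rowsPerSecond(100000, 3),
    rowsPerSecond(100000, 2)
);

// 5M records at batch size 5,000, taking 20-30 minutes.
printf(
    "5M records: %d batches, %.0f-%.0f rows/s\n",
    batchCount(5000000, 5000),
    rowsPerSecond(5000000, 30),
    rowsPerSecond(5000000, 20)
);
```

On these numbers the larger run works out several times faster per row than the small one, which is consistent with bigger batches amortizing per-statement overhead.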

## Troubleshooting

### Memory Issues

If you encounter memory issues:

1. **Reduce batch size**: `--batch-size=500`
2. **Increase memory limit**: `--memory-limit=4G`
3. **Process in smaller chunks**: Run the command several times with smaller record counts

### Timeout Issues

If you get timeout errors:

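1. **Check PHP execution time**: The CLI defaults to `max_execution_time = 0` (no limit); confirm nothing overrides it
2. **Use background processing**: Run with `nohup` or `screen` so a dropped terminal session does not kill the run
3. **Split into smaller jobs**: Run the command several times with smaller record counts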

### Database Lock Issues

If you get database lock errors:

1. **Disable foreign key checks**: Already handled automatically
2. **Use short transactions**: Wrap each batch in its own transaction instead of holding one transaction open for the whole run
3. **Optimize database settings**: Apply the SQL settings listed under Database Configuration

## Monitoring Progress

The seeder prints progress bars while it runs and a summary at the end. Sample output (the figures are illustrative):

```
Starting large-scale database seeding...
Seeding categories...
Seeding services...
Seeding beauty experts...
Seeding 100000 users...
████████████████████████████████████████ 100%
Seeding 500000 orders...
████████████████████████████████████████ 100%
✅ Large-scale seeding completed successfully!
⏱️  Execution time: 15.23 seconds
💾 Memory used: 256.45 MB
📈 Peak memory: 512.78 MB
```
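
How the seeder computes its summary is not shown here. A plausible sketch using PHP's built-in timing and memory functions; `formatMegabytes()` is a hypothetical helper, and the `range()` call only stands in for real seeding work:

```php
<?php

/** Format a byte count as megabytes with two decimals, e.g. "1.50 MB". */
function formatMegabytes(int $bytes): string
{
    return number_format($bytes / 1024 / 1024, 2, '.', '') . ' MB';
}

$startTime = microtime(true);
$startMemory = memory_get_usage();

// Stand-in for the seeding work: allocate, then release, a large array.
$buffer = range(1, 100000);
unset($buffer);

$elapsed = microtime(true) - $startTime;
// Usage can end below the starting point once memory is freed, so clamp at zero.
$memoryUsed = max(0, memory_get_usage() - $startMemory);

printf("Execution time: %.2f seconds\n", $elapsed);
printf("Memory used: %s\n", formatMegabytes($memoryUsed));
printf("Peak memory: %s\n", formatMegabytes(memory_get_peak_usage()));
```

Peak memory is the figure to compare against `--memory-limit`, since it captures the largest batch in flight rather than what is still allocated at the end.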

## Data Quality

The seeder generates realistic test data:

- **Users**: Random names, emails, phone numbers, dates of birth
- **Orders**: Realistic order amounts, statuses, timestamps
- **Bookings**: Future booking dates, realistic time slots
- **Payments**: Various payment methods, transaction IDs
- **Relationships**: Proper foreign key relationships maintained
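
As an illustration of generating user rows like those described above, here is a dependency-free sketch. The seeder itself may well use Faker or model factories instead; the name lists, the `example.test` domain, the phone format and the column names are all assumptions:

```php
<?php

/**
 * Build one batch of synthetic user rows.
 * Each email embeds the row number, so emails stay unique across batches.
 */
function makeUserBatch(int $startId, int $count): array
{
    $firstNames = ['Amina', 'Jonas', 'Priya', 'Mateo', 'Sofia', 'Kenji'];
    $lastNames  = ['Okafor', 'Lindqvist', 'Sharma', 'Garcia', 'Rossi', 'Tanaka'];

    $rows = [];
    for ($id = $startId; $id < $startId + $count; $id++) {
        $first = $firstNames[random_int(0, count($firstNames) - 1)];
        $last  = $lastNames[random_int(0, count($lastNames) - 1)];
        // Date of birth between roughly 18 and 80 years before today.
        $ageInDays = random_int(18 * 365, 80 * 365);

        $rows[] = [
            'name'          => "{$first} {$last}",
            'email'         => strtolower("{$first}.{$last}.{$id}@example.test"),
            'phone'         => sprintf('+1555%07d', random_int(0, 9999999)),
            'date_of_birth' => date('Y-m-d', strtotime("-{$ageInDays} days")),
        ];
    }

    return $rows;
}

// Usage: rows 1..1000 form the first batch, 1001..2000 the second, and so on.
$firstBatch = makeUserBatch(1, 1000);
```

Deriving the email from the row number avoids unique-constraint collisions without having to track previously generated values in memory.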

## Cleanup

To clean up test data:

```bash
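# Drop all tables and re-run every migration (this destroys ALL data, be careful!)
php artisan migrate:fresh

# Or delete specific test data (assumes seeded emails start with "test_").
# Note: "_" is a single-character LIKE wildcard, so this also matches e.g. "tester@..."
php artisan tinker
>>> User::where('email', 'like', 'test_%')->delete();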
```

## Best Practices

1. **Start Small**: Begin with smaller datasets to test your setup
2. **Monitor Resources**: Watch CPU, memory, and disk usage
3. **Back Up First**: Always back up your database before large operations
4. **Test Queries**: Verify that your application works with large datasets
5. **Index Optimization**: Ensure proper database indexes are in place
6. **Regular Cleanup**: Clean up test data regularly to maintain performance

## Advanced Usage

### Custom Seeders

Create custom seeders for specific needs:

```php
<?php

namespace Database\Seeders;

use Illuminate\Database\Seeder;

class CustomLargeSeeder extends Seeder
{
    public function run(): void
    {
        // Your custom large-scale seeding logic
    }
}
```

### Chunked Processing

For extremely large datasets, process in chunks:

```bash
# Process 1M users in 10 chunks of 100K each
for i in {1..10}; do
    php artisan db:seed-large --users=100000 --orders=0 --bookings=0
done
```

### Background Processing

Run seeding in the background:

```bash
# Run in background
nohup php artisan db:seed-large --users=1000000 > seeding.log 2>&1 &

# Monitor progress
tail -f seeding.log
```

## Support

If you encounter issues:

1. Check the logs in `storage/logs/laravel.log`
2. Verify database configuration
3. Test with smaller datasets first
4. Check available system resources

For performance optimization, run the performance seeder to find optimal settings for your environment.
