A Python script that processes CSV files to filter and extract email addresses based on login status and an optional exclusion list. It writes output files for account management and cleanup.
- Interactive File Selection: GUI dialogs for selecting input files
- Login Status Filtering: Filters users based on login activity
- Email Exclusion: Optional exclusion of specific emails from a text file (emails to keep)
- Domain-based Output: Automatically detects email domain and creates appropriately named output files
- Data Validation: Validates email formats and CSV structure
- Error Handling: Comprehensive error handling with user-friendly messages
- Python 3.6 or higher
- pandas library
- Install Python dependencies:
pip install -r requirements.txt
Or install pandas directly:
pip install pandas
- Verify installation:
python --version
python -c "import pandas; print('pandas version:', pandas.__version__)"
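The same check can be scripted. The following is a minimal sketch, not part of the project: the helper name `check_environment` is hypothetical, and it only tests the two requirements listed above (Python 3.6 or higher, and an importable `pandas`).

```python
import importlib.util
import sys

MIN_PYTHON = (3, 6)  # minimum version stated in the requirements


def check_environment(required_modules=("pandas",)):
    """Return a list of human-readable problems; an empty list means the setup looks fine."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        problems.append("Python %d.%d or higher is required" % MIN_PYTHON)
    for name in required_modules:
        # find_spec returns None when a top-level module cannot be located
        if importlib.util.find_spec(name) is None:
            problems.append("Missing module: %s (try: pip install %s)" % (name, name))
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["Environment looks OK"]:
        print(line)
```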
python csv_email_processor.py
For a user-friendly graphical interface:
python csv_email_processor_gui.py
GUI Features:
- ✅ Simple button interface (Select/Clear for files, Start Processing)
- ✅ Clear buttons to remove file selections with smart enable/disable
- ✅ Real-time status log with progress updates
- ✅ No console commands needed
- ✅ Intuitive file selection with filename display
- ✅ Multithreaded processing to prevent UI freezing
- ✅ Success/error dialogs with detailed results
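The multithreading point above is commonly handled with a worker thread that posts messages to a queue, which the GUI then polls. The sketch below illustrates that pattern under assumptions: `run_in_background` and `drain` are hypothetical names, and the actual GUI script may be organized differently.

```python
import queue
import threading


def run_in_background(task, status_queue):
    """Run task(report) on a worker thread; report(msg) posts a status line to the queue."""
    def report(message):
        status_queue.put(("status", message))

    def worker():
        try:
            result = task(report)
            status_queue.put(("done", result))
        except Exception as exc:  # hand errors to the UI thread instead of losing them
            status_queue.put(("error", str(exc)))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def drain(status_queue):
    """Collect all pending messages without blocking."""
    messages = []
    while True:
        try:
            messages.append(status_queue.get_nowait())
        except queue.Empty:
            return messages
```

In a tkinter application, a callback scheduled with `root.after(100, poll)` could call `drain` and append each message to the status log, so widgets are only touched from the main thread.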
For the smallest, optimized executable:
- Build the GUI executable:
python build_gui_exe.py
- Run the executable:
./dist/CSVEmailProcessorGUI.exe
Optimizations applied:
- 🎯 Minimal file size (~8-12 MB)
- 🚀 Excluded unnecessary modules
- ⚡ Bytecode optimization (--optimize=2)
- 📦 Optional UPX compression for additional size reduction
- 🖥️ Pure GUI (no console window)
- 🔒 Debug symbols stripped
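As a rough illustration of how these optimizations map to PyInstaller options, the sketch below assembles a command line without running it. It is an assumption about what a build script could do, not the contents of `build_gui_exe.py`: the function name is hypothetical, `--onefile` is assumed from the single `.exe` in `dist`, and `--optimize` requires a PyInstaller release that supports that flag.

```python
import sys

# Module names taken from the exclusion list documented in this README.
EXCLUDED_MODULES = [
    "matplotlib", "test", "unittest", "doctest",
    "pdb", "PIL", "setuptools", "pkg_resources",
]


def build_pyinstaller_args(script="csv_email_processor_gui.py",
                           name="CSVEmailProcessorGUI",
                           use_upx=True):
    """Assemble a PyInstaller command line; nothing is executed here."""
    args = [
        sys.executable, "-m", "PyInstaller",
        "--onefile",        # assumption: single-file bundle
        "--windowed",       # pure GUI, no console window
        "--strip",          # strip debug symbols
        "--optimize", "2",  # bytecode optimization (version-dependent flag)
        "--name", name,
    ]
    for module in EXCLUDED_MODULES:
        args += ["--exclude-module", module]
    if not use_upx:
        args.append("--noupx")  # UPX is otherwise used when PyInstaller can find it
    args.append(script)
    return args
```

Passing the returned list to `subprocess.run(args, check=True)` would start the build.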
- Launch: Run csv_email_processor_gui.py or the GUI executable
- Select CSV: Click "Select CSV" button to choose your input file
  - File must contain Email Address [Required] and Last Sign In [READ ONLY] columns
  - Use "Clear" button to remove selection if needed
- Select TXT (Optional): Click "Select TXT" to choose emails to exclude
  - Optional file with emails to exclude from deletion (one per line)
  - Use "Clear" button to remove selection if needed
- Start Processing: Click "Start Processing" button
- Monitor Progress: Watch the status log for real-time updates
- Complete: Success dialog shows output file location
- Select CSV File: Dialog opens asking for input CSV file
- Optional TXT File: Choose whether to select exclusion file
- Processing: Automatic processing with console output
- Output: Creates [domain]-to-delete.csv file
- ✅ Filters: Selects users who have logged in (any value other than "Never logged in") for deletion
- ✅ Excludes: Emails found in optional TXT file from deletion
- ✅ Validates: Email format and CSV structure
- ✅ Outputs: [domain]-to-delete.csv with primaryEmail column
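The four steps above can be sketched in a few lines. This is an illustration only: it uses the standard-library `csv` module instead of pandas to stay dependency-free, the function names are hypothetical, and the exact-match comparison against "Never logged in" and the "contains @" validity check are assumptions.

```python
import csv
import io

EMAIL_COL = "Email Address [Required]"
LOGIN_COL = "Last Sign In [READ ONLY]"
NEVER = "Never logged in"


def build_deletion_list(rows, exclusions=()):
    """Return emails of users who have logged in, minus excluded addresses."""
    excluded = {entry.strip().lower() for entry in exclusions}
    result = []
    for row in rows:
        email = (row.get(EMAIL_COL) or "").strip()
        login = (row.get(LOGIN_COL) or "").strip()
        if "@" not in email:           # minimal format check
            continue
        if login == NEVER:             # never-used accounts are not listed
            continue
        if email.lower() in excluded:  # case-insensitive exclusion
            continue
        result.append(email)
    return result


def write_deletion_csv(emails, handle):
    """Write a single-column CSV with the primaryEmail header."""
    writer = csv.writer(handle)
    writer.writerow(["primaryEmail"])
    writer.writerows([email] for email in emails)


SAMPLE = (
    "First Name [Required],Last Name [Required],Email Address [Required],"
    "Status [READ ONLY],Last Sign In [READ ONLY],Email Usage [READ ONLY]\n"
    "John,Doe,john.doe@company.com,Active,2024-01-15,1.2GB\n"
    "Jane,Smith,jane.smith@company.com,Active,Never logged in,0.0GB\n"
)
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(build_deletion_list(rows))
```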
Expected CSV structure:
First Name [Required],Last Name [Required],Email Address [Required],Status [READ ONLY],Last Sign In [READ ONLY],Email Usage [READ ONLY]
John,Doe,john.doe@company.com,Active,2024-01-15,1.2GB
Jane,Smith,jane.smith@company.com,Active,Never logged in,0.0GB
Generated file: company-to-delete.csv
primaryEmail
john.doe@company.com
TXT file with emails to exclude from deletion (one per line):
admin@company.com
support@company.com
noreply@company.com
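A file in this format could be read as follows. This is a minimal sketch with hypothetical helper names; skipping blank lines and assuming UTF-8 are choices made for the example, not documented behavior.

```python
def parse_exclusions(lines):
    """Normalize exclusion entries: trim whitespace, lowercase, drop blank lines."""
    return {line.strip().lower() for line in lines if line.strip()}


def load_exclusions(path, encoding="utf-8"):
    """Read one email per line from a text file into a set."""
    with open(path, encoding=encoding) as handle:
        return parse_exclusions(handle)
```

Lowercasing at load time makes later membership tests case-insensitive, provided the CSV addresses are lowercased before comparison too.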
- Login Status Filtering:
  - Includes users who have logged in (NOT "Never logged in") in the deletion list
  - Excludes users with "Never logged in" status from deletion
  - Rationale: Active accounts that have been used are identified for potential deletion or management
- Email Exclusion:
  - Removes any emails found in the optional TXT file from the deletion list
  - Case-insensitive matching
- Domain Detection:
  - Automatically extracts domain from the first email address
  - Warns if multiple domains are detected
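The domain detection rule could look roughly like the sketch below. The function names are hypothetical, and deriving the file name from the label before the first dot is an assumption inferred from the `company.com` to `company-to-delete.csv` example; the script may handle multi-level domains differently.

```python
import warnings


def detect_domain(emails):
    """Return the domain of the first valid email; warn if other domains appear."""
    domains = []
    for email in emails:
        local, sep, domain = email.strip().rpartition("@")
        if sep and local and domain:
            domains.append(domain.lower())
    if not domains:
        raise ValueError("No valid email addresses found")
    primary = domains[0]
    others = sorted(set(domains) - {primary})
    if others:
        warnings.warn(
            "Multiple domains detected; using %s (also saw: %s)"
            % (primary, ", ".join(others))
        )
    return primary


def output_filename(domain):
    """Assumed naming rule: first domain label plus the '-to-delete.csv' suffix."""
    return domain.split(".", 1)[0] + "-to-delete.csv"
```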
The script handles various error conditions:
- Missing Required Columns: Validates CSV structure
- Invalid Email Formats: Skips malformed emails with warnings
- File Access Errors: Handles permission and file not found errors
- Empty Results: Warns when no data remains after filtering
- Multiple Domains: Warns and uses the primary domain
- Input: CSV with mixed login statuses
- No exclusion file
- Output: All users who have logged in (active accounts)
- Input: CSV with user data
- Exclusion file: Admin and service accounts
- Output: Active users excluding protected accounts
- Process multiple CSV files from different domains
- Generate separate deletion lists per domain
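For the multi-domain case, one possible approach is to group addresses by domain before writing one list per group. The helper below is a hypothetical sketch of that grouping step; the README itself only describes running the script on one CSV at a time.

```python
from collections import defaultdict


def split_by_domain(emails):
    """Group email addresses by lowercase domain, skipping malformed entries."""
    groups = defaultdict(list)
    for email in emails:
        cleaned = email.strip()
        local, sep, domain = cleaned.rpartition("@")
        if not (sep and local and domain):
            continue
        groups[domain.lower()].append(cleaned)
    return dict(groups)
```

Each key of the returned dictionary can then be used to name a separate deletion file.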
- "Missing required columns" error:
  - Ensure CSV has Email Address [Required] and Last Sign In [READ ONLY] columns
  - Check for extra spaces or different capitalization
- "No valid email addresses found":
  - Verify email format in CSV (must contain @ symbol)
  - Check if all users have "Never logged in" (no active users)
- Permission denied when saving:
  - Ensure write permissions in the CSV file directory
  - Close the input CSV if it's open in Excel
- Empty output file:
  - All users may have "Never logged in" (no active accounts)
  - All emails may be in the exclusion list
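To track down a "Missing required columns" error, a small diagnostic along these lines can help. It is a sketch with a hypothetical function name: it compares the CSV header against the two required column names and reports near matches that differ only in spacing or capitalization.

```python
REQUIRED_COLUMNS = ("Email Address [Required]", "Last Sign In [READ ONLY]")


def diagnose_columns(header):
    """Return {required_name: hint} for each required column not present verbatim."""
    # Map a whitespace-collapsed, lowercased form of each header cell to the original
    normalized = {" ".join(col.split()).lower(): col for col in header}
    report = {}
    for required in REQUIRED_COLUMNS:
        if required in header:
            continue
        near = normalized.get(" ".join(required.split()).lower())
        if near is not None:
            report[required] = "found as %r (fix spacing/capitalization)" % near
        else:
            report[required] = "not found"
    return report
```

An empty result means both required columns are present exactly as expected.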
If you get ModuleNotFoundError: No module named 'pandas':
pip install pandas
If you get tkinter errors on Linux:
sudo apt-get install python3-tk
If you get "No module named 'secrets'" error:
- Update dependencies:
pip install --upgrade pandas numpy pyinstaller
General build troubleshooting:
- PyInstaller not found:
pip install pyinstaller
- Missing dependencies in executable:
  - Update pandas/numpy to latest versions
  - Ensure all required modules are installed
- Build process fails:
  - Check if antivirus is blocking PyInstaller
  - Try running as administrator (Windows)
  - Ensure all dependencies are up to date
- Executable file size:
  - Standard build: ~8-12 MB
  - With UPX compression: ~4-8 MB (additional optimization)
The script processes CSV files containing user data and creates a deletion list:
- Filters: Users who have logged in (excludes "Never logged in" users)
- Processes: Email validation and domain detection
- Outputs: Domain-specific CSV file with emails ready for account management
- Excludes: Any emails specified in the optional TXT exclusion file
The script identifies active users who have actually used their accounts for potential deletion or management.
- Framework: Pure Python with tkinter dialogs and pandas
- Interface: Console output with GUI file dialogs
- Threading: Single-threaded execution
- Framework: tkinter GUI with pandas for data processing
- Interface: Full graphical interface with Select/Clear buttons and real-time status log
- File Management: Smart file selection with clear buttons and filename display
- Threading: Multi-threaded processing to prevent UI freezing
- User Experience: Real-time status updates, success/error dialogs with detailed results
- Layout: Responsive grid layout (600x500 resizable window)
- Status Logging: Scrollable text area with automatic scrolling to latest updates
- Optimized build: ~8-12 MB (excluded unnecessary modules)
- UPX compressed: ~4-8 MB (additional compression if UPX available)
- Excluded modules: matplotlib, test, unittest, doctest, pdb, PIL, setuptools, pkg_resources
- Included optimizations: bytecode optimization, debug symbol stripping, windowed mode
- File Handling: Supports various CSV encodings and formats
- Memory Efficient: Processes large CSV files using pandas
- Cross-platform: Works on Windows, macOS, and Linux
- Error Handling: Comprehensive validation and user feedback
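Support for several CSV encodings is often implemented as a fallback loop. The sketch below shows the idea with the standard-library `csv` module; the function name and the list of encodings tried are assumptions, and the script itself reads files with pandas, where the same loop could pass `encoding=` to `pandas.read_csv`.

```python
import csv


def read_rows_with_fallback(path, encodings=("utf-8-sig", "cp1252", "latin-1")):
    """Try each encoding in order; return (rows, encoding_used) for the first that decodes."""
    last_error = None
    for encoding in encodings:
        try:
            with open(path, newline="", encoding=encoding) as handle:
                # Decoding happens while the reader is consumed, so keep it inside the try
                return list(csv.DictReader(handle)), encoding
        except UnicodeDecodeError as exc:
            last_error = exc
    raise ValueError("Could not decode %s with any of %r" % (path, encodings)) from last_error
```

Because `latin-1` accepts every byte sequence, placing it last guarantees the default list always yields a result, though possibly with wrong characters for files in other encodings.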
For issues or questions:
- Check the troubleshooting section above
- Verify your CSV file format matches the expected structure
- Ensure all dependencies are installed correctly