Instructions & Pro Tips
Master the art of AI-powered data transformation
Everything you need to know to transform data like a pro
Getting Started
Basic Workflow
- Upload your CSV file - Drag and drop or click to browse. Files up to 10MB are supported.
- Preview your data - Review the columns and sample data to understand what you're working with.
- Describe transformations - Tell the AI what you want to do in plain English.
- Review results - Check the transformed data preview to ensure it meets your needs.
- Visualize data - Inspect the transformed data visually to confirm everything is exactly as intended before you download.
- Download - Get your cleaned data with a custom filename.
Speed Up Future Processing: Once you've perfected a transformation, save it as a recipe. Applying the same transformation to new files is then much faster, which makes recipes ideal for recurring data processing tasks.
Pro Tips for Better Results
Writing Effective Instructions
✓ Good Example:
"Remove columns A, B, and C. Add a time column starting at 0 and incrementing by 0.001 seconds for each row. Rename 'Voltage_1' to 'U1' and 'Current_1' to 'I1'. Fill any missing values in electrical columns with 0."
✗ Poor Example:
"Clean the data and make it better."
Key Principles
- Be Specific: Mention exact column names, values, and operations you want
- Use Clear Language: "Remove column A" instead of "delete the first column"
- Specify Data Types: "Keep time as numbers" or "convert to text format"
- Mention Order: "Put time column first" or "arrange columns as: time, voltage, current"
- Handle Missing Data: Specify what to do with empty cells - fill with 0, remove rows, etc.
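To see why the Good Example is easy to act on, here is a minimal sketch of what each of its sentences could translate to. It is written in plain Python rather than the pandas code the tool generates, and the function name `apply_good_example`, the list-of-dicts row format, and the choice to treat only `U1` and `I1` as "electrical columns" are assumptions made for illustration.

```python
def apply_good_example(rows, dt=0.001):
    """Apply the Good Example instruction to rows shaped like csv.DictReader output.

    Hypothetical helper: the real tool generates its own pandas code.
    """
    drop = {"A", "B", "C"}                            # "Remove columns A, B, and C"
    rename = {"Voltage_1": "U1", "Current_1": "I1"}   # "Rename ..."
    electrical = {"U1", "I1"}                         # assumed meaning of "electrical columns"

    result = []
    for i, row in enumerate(rows):
        # "Add a time column starting at 0 and incrementing by 0.001 seconds"
        # (rounded so floating-point noise does not leak into the output)
        new_row = {"time": round(i * dt, 6)}
        for name, value in row.items():
            if name in drop:
                continue
            name = rename.get(name, name)
            # "Fill any missing values in electrical columns with 0"
            if name in electrical and (value is None or value.strip() == ""):
                value = "0"
            new_row[name] = value
        result.append(new_row)
    return result
```

Every line of the sketch maps to one sentence of the instruction. The Poor Example gives nothing comparable to map from, which is why it forces the AI to guess.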
Recipe Power User Tip: Create a library of recipes for different data sources. Name them clearly like "Yokogawa WT5000 Cleanup" or "Temperature Sensor Data Processing." This builds a personal toolkit of proven transformations.
Advanced Mode Benefits
- Custom Prompts: Fine-tune the AI's behavior with your own instructions
- Model Selection: Choose between GPT-4, GPT-4 Turbo, or GPT-3.5 based on your needs
- Temperature Control: Lower values (0.1) for consistent results, higher for creative solutions
- Token Limits: Adjust for simple (500 tokens) or complex (2000+ tokens) transformations
Electrical Data Warning: When working with electrical measurements, negative values are normal for AC systems. Always specify "preserve negative values" if your data contains voltage/current measurements.
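If you want to double-check that a transformation kept the sign of your measurements, a quick comparison of one column before and after is enough. The sketch below is an assumption-laden illustration in plain Python, not part of the tool: `count_negatives` and `negatives_preserved` are hypothetical helper names, and cells that are empty or non-numeric are simply skipped.

```python
def count_negatives(values):
    """Count cells that parse as a number below zero; skip empty or non-numeric cells."""
    count = 0
    for value in values:
        try:
            if float(value) < 0:
                count += 1
        except (TypeError, ValueError):
            continue
    return count


def negatives_preserved(before, after):
    """Return True when a column has as many negative readings after cleanup as before."""
    return count_negatives(before) == count_negatives(after)
```

A result of False suggests that values were clipped, replaced, or converted to absolute values somewhere in the transformation.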
Template Library
Here are proven transformation templates you can use as starting points:
Yokogawa WT5000
Clean up Yokogawa WT5000 electrical measurement data:
- Remove first 5 columns (keep only the electrical measurement columns)
- Add a time column starting at zero and increasing in increments of 0.000001 seconds (1 microsecond)
- Put the time column first
- If electrical columns U1, U2, U3, I1, I2, I3 don't exist, create them with zero values
- Final column order should be: time, U1, I1, U2, I2, U3, I3
- Keep time values as numbers (float), do not convert to strings
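As a rough illustration of what this template asks for, the steps could be sketched as follows. This is plain Python, not the code the tool produces; the function name `yokogawa_cleanup` and the header-plus-rows input format are assumptions, and treating an empty cell like a missing column (filling it with 0.0) is a choice made here for brevity.

```python
def yokogawa_cleanup(header, rows, dt=1e-6):
    """Sketch of the Yokogawa WT5000 template on a header list and rows of strings."""
    wanted = ["U1", "I1", "U2", "I2", "U3", "I3"]   # final order after the time column
    kept_names = header[5:]                          # remove the first 5 columns

    out_rows = []
    for i, row in enumerate(rows):
        values = dict(zip(kept_names, row[5:]))
        record = [i * dt]                            # time stays a float, never a string
        for name in wanted:
            raw = values.get(name, "")               # column absent -> treated as empty
            record.append(float(raw) if raw != "" else 0.0)
        out_rows.append(record)
    return ["time"] + wanted, out_rows
```

Note that negative readings pass through `float()` unchanged, which matters for AC measurements.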
Time Series Data
Clean up time series data:
- Add proper time column with sequential timestamps
- Remove any empty or null rows
- Interpolate missing values where appropriate
- Sort data by timestamp
- Ensure consistent time intervals
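"Interpolate missing values where appropriate" is the least obvious step in this template, so here is one possible reading of it. The sketch assumes numeric timestamps in seconds, uses linear interpolation between the nearest known neighbours, and drops gaps at the very start or end where no neighbour exists on one side. The name `clean_time_series` is hypothetical, and checking for consistent intervals is not covered.

```python
def clean_time_series(points):
    """points: list of (timestamp, value) pairs; either element may be None."""
    # Remove rows without a timestamp, then sort by timestamp.
    pts = sorted((p for p in points if p[0] is not None), key=lambda p: p[0])
    known = [(t, v) for t, v in pts if v is not None]

    cleaned = []
    for t, v in pts:
        if v is None:
            before = [(kt, kv) for kt, kv in known if kt < t]
            after = [(kt, kv) for kt, kv in known if kt > t]
            if not before or not after:
                continue  # edge gap: nothing to interpolate between, so drop the row
            (t0, v0), (t1, v1) = before[-1], after[0]
            v = v0 + (v1 - v0) * (t - t0) / (t1 - t0)  # linear interpolation
        cleaned.append((t, v))
    return cleaned
```

If linear interpolation is not appropriate for your signal, say so in your instructions, for example "leave gaps empty" or "carry the last value forward".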
Basic Cleanup
Basic data cleanup:
- Remove completely empty columns and rows
- Remove columns that are mostly empty (>90% missing)
- Standardize column names (remove spaces, special characters)
- Fill remaining missing values with appropriate defaults
- Remove duplicate rows
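The cleanup steps above can be sketched in a few lines. This is an illustrative plain-Python version under stated assumptions: rows are rectangular lists of strings, "special characters" means anything other than letters and digits (replaced by underscores), and filling the remaining missing values is left out because a sensible default depends on the column. The name `basic_cleanup` is hypothetical.

```python
import re


def basic_cleanup(header, rows, max_missing=0.9):
    """Drop empty rows, mostly-empty columns, and duplicate rows; tidy column names."""

    def is_empty(cell):
        return cell is None or cell.strip() == ""

    # Remove completely empty rows.
    rows = [r for r in rows if not all(is_empty(c) for c in r)]

    # Keep columns with at most max_missing missing cells (empty columns fall out too).
    keep = []
    for j in range(len(header)):
        missing = sum(1 for r in rows if is_empty(r[j]))
        if rows and missing / len(rows) > max_missing:
            continue
        keep.append(j)

    # Standardize names: runs of non-alphanumeric characters become one underscore.
    names = [re.sub(r"[^0-9A-Za-z]+", "_", header[j]).strip("_") for j in keep]

    # Remove duplicate rows while preserving the original order.
    seen, cleaned = set(), []
    for r in rows:
        key = tuple(r[j] for j in keep)
        if key not in seen:
            seen.add(key)
            cleaned.append(list(key))
    return names, cleaned
```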
Sensor Data
Process sensor readings:
- Convert timestamps to proper datetime format
- Add sensor ID column if missing
- Convert temperature readings from Celsius to Fahrenheit
- Flag outlier readings beyond normal range
- Group readings by hour and calculate averages
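For reference, the conversion, outlier flag, and hourly averaging in this template could look like the sketch below. The function names are hypothetical, timestamps are assumed to be ISO 8601 strings, the "normal range" of -40 to 85 °C is a placeholder you would replace with your sensor's limits, and excluding flagged outliers from the averages is an assumption rather than something the template states.

```python
from collections import defaultdict
from datetime import datetime


def process_sensor_readings(readings, low_c=-40.0, high_c=85.0):
    """readings: list of (iso_timestamp, temp_celsius). Returns (datetime, temp_f, is_outlier)."""
    processed = []
    for timestamp, temp_c in readings:
        when = datetime.fromisoformat(timestamp)      # proper datetime value
        temp_f = temp_c * 9 / 5 + 32                  # Celsius to Fahrenheit
        is_outlier = not (low_c <= temp_c <= high_c)  # outside the assumed normal range
        processed.append((when, temp_f, is_outlier))
    return processed


def hourly_averages(processed):
    """Average the Fahrenheit readings per hour, skipping flagged outliers."""
    buckets = defaultdict(list)
    for when, temp_f, is_outlier in processed:
        if not is_outlier:
            hour = when.replace(minute=0, second=0, microsecond=0)
            buckets[hour].append(temp_f)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}
```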
Template Customization: Use these templates as starting points, then modify them for your specific needs. Once you've perfected a variation, save it as your own recipe for instant reuse.
Batch Processing Mastery
Perfect Your Recipe First
- Use Single File Mode - Test your transformation on one representative file first
- Refine Until Perfect - Adjust instructions until you get exactly what you want
- Save as Recipe - Once perfect, save it with a descriptive name
- Switch to Batch Mode - Use your proven recipe to process hundreds of files
Batch Processing Secret: Recipes make batch processing lightning fast! The system remembers exactly how to transform your data, so subsequent files process almost instantly. This can turn a 30-minute job into a 30-second job.
Batch Best Practices
- File Naming: Keep original filenames consistent for easier output organization
- File Limits: Maximum 500 files per batch, 10MB per file
- Quality Control: Check the processing summary for any failed files
- Output Organization: Processed files are automatically renamed with a "(cleaned)" suffix
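Before uploading a large batch, it can save time to check the limits above locally. The sketch below is a hypothetical pre-flight check, not a feature of the tool: `check_batch` is an invented name, it takes a mapping of filename to size in bytes, and it assumes "10MB" means 10 × 1024 × 1024 bytes.

```python
MAX_FILES = 500                  # batch limit stated above
MAX_BYTES = 10 * 1024 * 1024     # assumed interpretation of "10MB"


def check_batch(sizes):
    """sizes: dict of filename -> size in bytes. Returns a list of problem descriptions."""
    problems = []
    if len(sizes) > MAX_FILES:
        problems.append(f"{len(sizes)} files exceeds the {MAX_FILES}-file batch limit")
    for name, size in sizes.items():
        if not name.lower().endswith(".csv"):
            problems.append(f"{name}: not a CSV file")
        elif size > MAX_BYTES:
            problems.append(f"{name}: larger than 10MB")
    return problems
```

An empty list means the batch fits within the documented limits. You could build the `sizes` mapping with `os.path.getsize` for each file in a folder.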
Frequently Asked Questions
Q: Why should I save recipes instead of just retyping instructions?
A: Recipes are much faster for repeated tasks because the transformation does not have to be regenerated each time, and they ensure every file is processed the same way. That makes them well suited to recurring data processing workflows.
Q: What file formats are supported?
A: Currently, only CSV files are supported. Each file can be up to 10MB, and batch mode accepts up to 500 files.
Q: Can I see the code that was generated?
A: Yes! After processing, click "View Applied Transformations" to see the exact pandas code that was generated. This is helpful for learning and verification.
Q: What happens if the AI makes a mistake?
A: You can always review the preview before downloading. If results aren't correct, refine your instructions and try again. More specific instructions typically yield better results.
Q: When should I use Advanced Mode?
A: Use Advanced Mode when you need specific AI model settings, want to write custom prompts, or need fine control over the transformation process. Most users find Simple Mode sufficient.
Q: Are my files stored or shared?
A: Files are processed in memory and not permanently stored. Your data remains private and secure throughout the transformation process.
Q: How do I handle very large datasets?
A: For files larger than 10MB, split them into smaller chunks first. The system is optimized for quick processing of reasonably-sized datasets.
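One way to do the splitting is sketched below. It works on CSV text already loaded into memory, repeats the header row at the top of every chunk, and assumes no quoted field contains a line break (if yours do, split with a real CSV parser instead). The name `split_csv_text` and the byte-based size check are illustrative assumptions.

```python
def split_csv_text(text, max_bytes=10 * 1024 * 1024):
    """Split CSV text into chunks of at most max_bytes, each starting with the header."""
    lines = text.splitlines(keepends=True)
    if not lines:
        return []
    header, body = lines[0], lines[1:]
    header_size = len(header.encode("utf-8"))

    chunks, current, size = [], [header], header_size
    for line in body:
        line_size = len(line.encode("utf-8"))
        # Start a new chunk when this row would push the current one over the limit.
        if size + line_size > max_bytes and len(current) > 1:
            chunks.append("".join(current))
            current, size = [header], header_size
        current.append(line)
        size += line_size
    if len(current) > 1:
        chunks.append("".join(current))
    return chunks
```

Because every chunk keeps the same header, a saved recipe can be applied to all of them in batch mode.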
Performance Optimization
Speed Up Your Workflow
- Use Recipes: Saved recipes process much faster than generating new transformations
- Clear Instructions: Specific instructions lead to faster, more accurate results
- Template Starting Points: Begin with templates and modify rather than starting from scratch
- Batch Processing: Process multiple similar files together for maximum efficiency
Workflow Optimization: Create a systematic approach - perfect one file, save the recipe, then batch process the rest. This methodology works for any recurring data transformation task.