Data engineering tools must be able to perform the following tasks:
- Data Acquisition: Sourcing the data from different systems
- Data Cleansing: Detecting and correcting errors in data
- Data Conversion: Converting data from one format to another
- Data Disambiguation: Interpreting data that has multiple meanings
- Data De-duplication: Removing duplicate copies of data
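
The tasks above can be sketched as a small pipeline. This is only a minimal illustration using Python's standard library, not how any particular tool implements them. The sample data, the `COUNTRY_ALIASES` mapping, and all function names are hypothetical, and disambiguation is treated loosely here as mapping variant labels to one agreed code.

```python
import csv
import io
import json

# Hypothetical raw export; in practice this would come from a file, API, or database.
RAW_CSV = """name,email,country
Alice , ALICE@example.com,US
Bob,bob@example.com,usa
Alice,alice@example.com,United States
Carol,,UK
"""

# Assumed mapping that resolves variant country labels to one code.
COUNTRY_ALIASES = {"us": "US", "usa": "US", "united states": "US", "uk": "GB"}


def acquire(text):
    """Data acquisition: read rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def cleanse(rows):
    """Data cleansing: trim whitespace, lowercase emails, drop rows without an email."""
    cleaned = []
    for row in rows:
        row = {key: (value or "").strip() for key, value in row.items()}
        row["email"] = row["email"].lower()
        if row["email"]:
            cleaned.append(row)
    return cleaned


def disambiguate(rows):
    """Data disambiguation: map variant country labels to a single code."""
    for row in rows:
        row["country"] = COUNTRY_ALIASES.get(row["country"].lower(), row["country"])
    return rows


def deduplicate(rows):
    """Data de-duplication: keep the first row seen for each email."""
    seen = set()
    unique = []
    for row in rows:
        if row["email"] not in seen:
            seen.add(row["email"])
            unique.append(row)
    return unique


def convert(rows):
    """Data conversion: serialise the CSV records as JSON."""
    return json.dumps(rows, indent=2)


records = deduplicate(disambiguate(cleanse(acquire(RAW_CSV))))
print(convert(records))
```

Cleansing runs before de-duplication so that records differing only in whitespace or letter case are recognised as the same entry.
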
Popular data engineering technologies and tools include:
- Talend
- Fivetran
- Informatica
- Apache Spark
- Infoveave
- SQL
- Python